After writing the script to create an image collage using Python and its OpenCV2 library, I got curious about using FFMPEG to achieve the same results. The Python script works, but it's kinda slow. FFMPEG is written in C, so it should be a lot faster. Digging into the topic, I found an FFMPEG "filter" called xstack, which operates on a concept that's simple to understand but somewhat onerous to type out. For example, to arrange 16 inputs in a 4x4 grid, you'd have to type out,
xstack=inputs=16:layout=0_0|0_h0|0_h0+h1|0_h0+h1+h2|w0_0|w0_h0|w0_h0+h1|w0_h0+h1+h2|w0+w4_0|w0+w4_h0|w0+w4_h0+h1|w0+w4_h0+h1+h2|w0+w4+w8_0|w0+w4+w8_h0|w0+w4+w8_h0+h1|w0+w4+w8_h0+h1+h2
which represents,
input1(0, 0) | input5(w0, 0) | input9(w0+w4, 0) | input13(w0+w4+w8, 0)
input2(0, h0) | input6(w0, h0) | input10(w0+w4, h0) | input14(w0+w4+w8, h0)
input3(0, h0+h1) | input7(w0, h0+h1) | input11(w0+w4, h0+h1) | input15(w0+w4+w8, h0+h1)
input4(0, h0+h1+h2)| input8(w0, h0+h1+h2)| input12(w0+w4, h0+h1+h2)| input16(w0+w4+w8, h0+h1+h2)
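Typing those layout strings by hand doesn't scale, so they're a natural thing to generate. Here's a small bash sketch (gen_layout is a hypothetical helper; it assumes every tile has the same dimensions as input 0, so it only ever sums w0 and h0, unlike the hand-written string above):

```shell
#!/usr/bin/env bash
# Hypothetical helper: emit an xstack layout string for a COLS x ROWS grid,
# assuming all tiles are the same size as input 0 (w0 x h0).
gen_layout() {
  local cols=$1 rows=$2 sep="" c r i x y
  for ((c = 0; c < cols; c++)); do
    for ((r = 0; r < rows; r++)); do
      # Build the x offset as c copies of w0, and the y offset as r copies of h0.
      x=0; for ((i = 0; i < c; i++)); do x="$x+w0"; done
      y=0; for ((i = 0; i < r; i++)); do y="$y+h0"; done
      printf '%s%s_%s' "$sep" "${x#0+}" "${y#0+}"
      sep="|"
    done
  done
  echo
}

gen_layout 2 2   # prints: 0_0|0_h0|w0_0|w0_h0
```

The inner loops just concatenate "+w0" and "+h0" strings; ffmpeg evaluates the sums itself when the filter runs.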
There is also a grid option, which is a little simpler to type out. I don't know which one is better, so let's start with xstack and go from there.
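For the curious, with a recent enough FFmpeg the grid option collapses the whole thing into something like the following (an untested sketch: the input names are hypothetical, and grid assumes all inputs are the same size):

```shell
# Same 4x4 arrangement, expressed with xstack's grid option instead of layout.
# in001.jpg ... in016.jpg are hypothetical, equally-sized inputs.
ffmpeg $(printf -- '-i in%03d.jpg ' $(seq 1 16)) \
  -filter_complex "xstack=inputs=16:grid=4x4" output.jpg
```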
Starting with a directory with a bunch of random images, it'd be nice to make them easy to refer to in the cli. We can enumerate all files ending with .jpg so they become 001.jpg, 002.jpg, etc. using,
find . -name "*.jpg" | cat -n | while read n f; do mv -n "$f" "$(printf "%03d.jpg" "$n")"; done
Note: mv's -n flag is used to avoid overwriting existing files.
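Since a mass rename is destructive, it can be worth previewing it first. A dry-run variant of the same pipeline (it just echoes the mv commands, with a sort added so the numbering is deterministic):

```shell
# Print the planned renames instead of performing them.
find . -maxdepth 1 -name "*.jpg" | sort | cat -n | while read -r n f; do
  echo mv -n "$f" "$(printf '%03d.jpg' "$n")"
done
```

Drop the echo once the output looks right.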
So after experimenting for a while, I ended up with the following xstack solution, which I think is pretty nifty.
ffmpeg -i 2.png -i 001.jpg -i 003.jpg -filter_complex "[0:v] scale=-1:800 [o1]; [1:v] scale=400:-1 [o2]; [2:v] scale=400:-1 [o3]; [o1][o2][o3] xstack=inputs=3:layout=0_0|w0_0|w0_h1:fill=black [o_final1]; [o_final1] scale=600:-1 [o_final2]" -map "[o_final2]" output.jpg
Breaking that down:

ffmpeg -i 2.png -i 001.jpg -i 003.jpg
feeds our three images to ffmpeg as inputs.

-filter_complex
sets up a "filtergraph" which allows us to do a bunch of cool stuff.

[0:v], [1:v], and [2:v]
are labels for our image inputs. ffmpeg uses square brackets as labels, and in this case, they are generated automatically.

[0:v] scale=-1:800 [o1]
takes the first image, 2.png, sets its height to 800, and maintains its aspect ratio by using -1 as the scale's width parameter. This scaled image is then labeled [o1]. The second and third images are scaled the same way, to a width of 400, and labeled [o2] and [o3].

[o1][o2][o3] xstack=inputs=3
feeds the three scaled images into the xstack filter.

layout=0_0|w0_0|w0_h1
means the first image's upper left corner gets placed at (0, 0), and the second and third images form a column beside the first image.

fill=black [o_final1]
tells ffmpeg the uncovered background should be black, and we label this collage [o_final1] for later use.

[o_final1] scale=600:-1 [o_final2]
applies one last scaling so the final image has a width of 600 pixels.

-map "[o_final2]" output.jpg
tells ffmpeg to take [o_final2] and save it as output.jpg.

Easy, right?
Here are the three images (not to scale).
Here is what the ffmpeg pipeline does with these three images (not to scale).
How about doing something similar with video?
ffmpeg -i 111.webm -i 222.webm -i 333.webm -filter_complex "[0:v] scale=500:-1 [o1]; [1:v] scale=500:-1 [o2]; [2:v] scale=500:-1 [o3]; [o1][o2][o3] xstack=inputs=3:layout=0_0|w0_250|0_h0:fill=black [o_final]" -map [o_final] -c:v libvpx-vp9 -b:v 500k -fs 3M output.webm
This command is pretty much the same as the image collage, but here we use,

-c:v libvpx-vp9
to encode the output video with the VP9 codec.

-b:v 500k
to set a target video bitrate of 500 kbit/s.

-fs 3M
to set a weakly-enforced file size limit on the output file.

Here is an example of the output.
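As a sanity check on those numbers: file size is roughly bitrate times duration, so you can estimate whether a -b:v value even fits under a -fs cap. A rough sketch (the 60-second duration is a made-up example, audio overhead is ignored, and M is taken as 10^6):

```shell
# Back-of-the-envelope: max average video bitrate (kbit/s) that fits a size cap.
duration=60                        # clip length in seconds (hypothetical)
cap_bytes=$((3 * 1000 * 1000))     # the -fs 3M cap, taking M as 10^6
kbps=$(( cap_bytes * 8 / 1000 / duration ))
echo "${kbps}k"                    # prints 400k, so -b:v 500k may overshoot the cap
```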