Dare I say this is somewhat unintuitive?
$ avconv -y -i input_video_A.mp4 -vf 'movie=input_video_B.mp4[inputB] ; [in]pad=1280:0,[inputB] overlay=640:0[out]' -c:v libx264 output_video.mp4
This places one 640×480 video next to another one of the same size. You can stick a setpts=PTS-STARTPTS in there (between pad=… and [inputB], separated by commas) to have the videos “begin in the same zero timestamp”, but what that means in practice I have yet to figure out.
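To make the numbers less magic, here is a rough sketch that builds the same filter string from the input width, with setpts dropped in where described above. The variable names are mine, and the 640 comes from the assumption that both inputs are 640×480. It only prints the command rather than running it.

```shell
# Assumed width of each input (from the post: both are 640x480).
W=640

# Canvas twice as wide as one input; a height of 0 keeps the input height.
PAD="pad=$((2 * W)):0"
# Start counting timestamps from zero.
SETPTS="setpts=PTS-STARTPTS"
# Draw video B to the right of video A.
OVERLAY="overlay=${W}:0"

FILTER="movie=input_video_B.mp4[inputB] ; [in]${PAD},${SETPTS},[inputB] ${OVERLAY}[out]"

# Print the command instead of running it, so it can be inspected first.
echo "avconv -y -i input_video_A.mp4 -vf '${FILTER}' -c:v libx264 output_video.mp4"
```

For inputs of a different width, changing W should be enough, as long as both videos share the same size.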
Note that this doesn’t do any audio mixing. AFAICT avconv currently in Precise can’t do mixing, but newer versions have the amix filter for it.
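For reference, on a newer avconv that does have amix, mixing two audio files might look roughly like the sketch below. I haven't run this on Precise, the file names are hypothetical, and it only prints the command.

```shell
# Hypothetical input and output names; replace with real files.
AUDIO_A=input_audio_A.wav
AUDIO_B=input_audio_B.wav
MIX=input_audio.wav

# amix=inputs=2 mixes two audio inputs into one stream.
CMD="avconv -y -i ${AUDIO_A} -i ${AUDIO_B} -filter_complex amix=inputs=2 ${MIX}"

# Print the command for inspection; run it only on an avconv build that has amix.
echo "$CMD"
```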
Edit: Once you’ve done mixing elsewhere, you can bring the mix in too:
$ avconv -y -i input_audio.wav -i input_video_A.mp4 -vf 'movie=input_video_B.mp4[inputB] ; [in]pad=1280:0,[inputB] overlay=640:0[out]' -c:v libx264 -c:a libfaac output_video.mp4
Scaling goes after overlay:
overlay=640:0,scale=640:240' -c:v libx264...
Edit: Something crazier still: place one video below the other and turn the whole thing sideways before scaling down (plus make it phone-compatible by using mpeg4 instead of libx264).
$ avconv -y -i input_audio.wav -i input_video_A.mp4 -vf 'movie=input_video_B.mp4[inputB] ; [in]pad=0:960,[inputB] overlay=0:480,transpose=2,scale=640:424' -c:v mpeg4 -c:a libfaac output_video.mp4
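The sizes in that last command are easier to follow when worked through step by step. This is just arithmetic under the assumption that both inputs are 640×480; the variable names are mine.

```shell
# Assumed input size, taken from the post: each video is 640x480.
W=640
H=480

# pad=0:960 keeps the width and doubles the height, giving a 640x960 stack.
STACK_W=$W
STACK_H=$((2 * H))

# transpose=2 rotates by 90 degrees counterclockwise, swapping width and height.
ROT_W=$STACK_H
ROT_H=$STACK_W

# Height that keeps the aspect ratio at a 640-pixel output width (integer division).
OUT_W=640
OUT_H=$((OUT_W * ROT_H / ROT_W))

echo "stacked: ${STACK_W}x${STACK_H}, rotated: ${ROT_W}x${ROT_H}, proportional scale: ${OUT_W}:${OUT_H}"
```

The command above scales to 640:424, which is slightly squatter than the strictly proportional height computed here.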