Computer Vision

In progress...

FFmpeg FOURCC values

In using OpenCV’s VideoWriter class, I spent a lot of time confused about the FOURCC values for the video codec that has to be supplied to the VideoWriter::Open method. It looks like OpenCV uses FFmpeg for compression. But what are the valid FOURCC values? The listing at helped and searching the web showed a number of standard FOURCC values.

I was curious as to the full list so I downloaded the FFmpeg source and checked through it. Looks like FFmpeg uses the MKBETAG and MKTAG macros to create the integer representation of the FOURCC value. Running the search:

grep -re MKBETAG -e MKTAG *

on the source directory generates a lot of entries (around 1500), many with comments describing the codec. A Python script to look for common patterns pulled out the FOURCC codes and hints as to the type of codec. The results, including the script, can be found in

The FOURCC just specifies the codec, not the container. The container format in VideoWriter seems to set by the file name extension, e.g. .avi generates an AVI container. Containers act as multiplexers that take a number of streams (e.g. video, audio, extra data) and time multiplexes them by breaking each into packets and interleaving these packets. Wikipedia has a nice table showing codecs and containers.

Video over remote connection (Ubuntu)

Access to /dev/video* is blocked by default when running from remote connection (e.g. X Windows tunneled over SSH). To allow this, add yourself to the video group:

usermod -a -G video user

and then re-login. The system should now allow access to the video devices.

Structure tensor

The structure tensor provides the orientation of details in the image. Calculating the tensor begins with taking the gradient across the image (e.g. using the Sobel operator) and then calculating a total gradient of (dI/dx)2, (dI/dy)2, and dI/dx*dI/dy across the ROI.

The eigenvectors of the resulting matrix provide the two orthogonal detail directions. The larger the eigenvalue, the larger the gradient in the direction of the associated eigenvector.

Structure tensor example

Here is a video showing the results of calculating the structure tensor on a sample set of images. The eigenvectors are shown in yellow with lengths set proportional to the associated eigenvalues. I have included noise in the source images to mimic noise in video images. The red box is the ROI used for the structure calculation (the gradient calculation might use a few pixels ouside this box). I used cv2::cornerEigenValsAndVecs to perform the calculations.