**Playing with temporal contrast and low-pass filtering for motion detection with OpenCV**

I have been recently looking at real-time motion detection and various ways of motion extraction and object tracking from a live video stream. Most motion detection algorithms are based on temporal contrast (a complex term maths and image processing guys use for frame subtraction) some sort of low-pass filtering and an ROI (Region Of Interest) and object detection algorithm.

I discovered the OpenCV library and decided to give it a try and do some frame subtraction tests and low-pass filtering. Here are some primitive test results and an engineer's explanation of frame subtraction and simplest possible image low-pass filtering.

**Absolute difference**

Our first stop is frame subtraction. Subtracted frame-to-frame time sets the potential processing (subtraction) load and motion detection sensitivity. In general, simply $\tau_{s} = \frac{1}{fps} [s]$ which would give us a linearly dependent processing load of $\eta = \frac{fps}{2} [op/sec]$.

On an image sensor one can do frame subtraction in both the analog and digital domains, but generally in both cases some form of single frame memory buffering is needed. You can see an image of an 8-bit grayscale absolute frame subtraction (absdiff) below.

**Applying threshold**

Normally ROI and object detection algorithms need a basic sensitivity tuning function, so a natural step is to add threshold to the image and extract event-based binary motion information. For this the OpenCV cvThreshold() function comes handy. Below is an image with added binary threshold of 10 steps (out of 255). Adding a small threshold also helps in noise removal. The single-bit binary event representation conversion helps reducing the processing load during further low-pass filtering steps.

Apart from having a static threshold, other truncation/moving threshold techniques exist.

**Dilation**

Two very basic low-pass filtering operations used in image processing are *dilation* and *erosion*. Both methods base on element-by-element comparison with a reference (structuring) element. A one-bit binary image map makes dilation and erosion easier to implement with simple NAND logic gates.

Here is a simple example of a binary morphological dilation. Let's imagine that we have the following binary image map: $$ Img = \begin{matrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 0 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 0 & 1 & 1 \\ 1 & 1 & 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{matrix} $$ In order to perform dilation, as like many digital filters we need to set particular filter coefficients. In the current case we can define a reference element, which defines the surrounding pixels of the pixel of interest. We can for example define: $$ RefEl = \begin{matrix} 1 & \fbox{1} & 1 \end{matrix} $$ The dilation function applies the corresponding reference element to the pixels surrounding the pixel of interest and assigns a value to the pixel of interest depending on the value of the neighbouring elements.

In the current case with the example of the binary image, the single 'zeros' would be substituted by 'ones' because the elements defined by the reference element and the neighbouring pixels in the image are ones. Binary dilation appears to be not so computationally expensive as it can be implemented with basic N/AND gates. The filtered image would result in: $$ FiltImg = \begin{matrix} 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & \fbox{1} & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & \fbox{1} & 1 & 1 \\ 1 & 1 & 0 & 0 & 1 & 1 & 1 \\ 1 & 1 & 1 & 1 & 1 & 1 & 1 \end{matrix} $$

In general the dilation and erosion filtering can be very non-linear depending on the shape and size of the reference element. Here is an example of a dilated image:

It is noticeable that the random single pixel noise in the image would not necessarily fully disappear with dilation.

**Erosion**

An eroded image results by subtracting all pixels covered by the reference element if the latter does not completely fit the binary image. An intuitive example:$$ $$ Img = \begin{matrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 1 & 1 & 1 & 1 & 1 & 0 \\ 0 & 1 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 1 & 1 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 & 1 & 0 \\ 0 & 1 & 0 & 1 & 1 & 1 & 0 \\ 1 & 0 & 0 & 0 & 0 & 0 & 1 \end{matrix} $$ $$ An erosion with a reference element of: $$ RefEl = \begin{matrix} 1 & 1 & 1 \\ 1 & \fbox{1} & 1 \\ 1 & 1 & 1 \end{matrix} $$ Leads to: $$ FiltImg = \begin{matrix} 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & \fbox{1} & \fbox{1} & 0 & 0 \\ 0 & 0 & 0 & 0 & \fbox{1} & 0 & 0 \\ 0 & 0 & 0 & 0 & \fbox{1} & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 & 0 \end{matrix} $$ It is noticeable that erosion shrinks the details in an image and also removes any random noise pixels scattered over the image. Here is a visual example:

OpenCV is a library for image processing, it was very suitable for my experiments, trying to gather more understanding about image processing in general. I have attached the code I used for my experiments here.

It would be very interesting to have a look and investigate if simple image filtering algorithms can actually be implemented on an image sensor directly in the analog domain, or even further, why not on a pixel level.

**Mechanical combination locks**

My new "home" has an idiotic mechanical combination lock at the entrance. One lucky evening I had to spend a few hours outside due to a pure mechanical failure of the lock shown below.

Such primitive and low-reliability mechanical locking systems seem to be used everywhere in England. This made me think how fast can one possibly crack such a lock, provided that he needs very strong fingers and about 5 seconds to type the code in.

For a long time I was thinking that the combinations of a standard mechanical 4-digit lock are not that many because no consecutive digits can be used due to a limitation of the mechanical mechanism. Writing out this trivial solution:

**1.** The first digit can be from 0 to 9 so we have 10 possibilities.

**2.** The latter digits until the fourth can be from 0 to 9 but should not repeat with the previous digit, so we have 9 possibilities for all the rest.

We therefore end up with all possible combinations being: $$10\times 9\times 9\times 9 = 7290$$ An intuitive engineering thinking this times did not fail dramatically as after calming down I was imagining about $\frac{2}{3}$ of all the possible combinations which are 10000. If one masters a good code punching technique then $$7290\times 5 \text{sec} = 607 \text{h} = 25 \text{days}$$

At the end of the day, maybe electronic locks are more vulnerable to elegant cracking techniques as compared to their mechanical siblings.

**A strategy for the RTTY receiver implementation**

I recently had some spare time to think about the RTTY receiver/decoder I would like to implement. My inital attempt for recovering the FSK bits from an input FSK sample stream was to perform a DFT with window's length roughly equal to the bit burst time of the transmitter. This approach however seems to be extremely inefficient, as the brute force DFT loop I had was too slow. In addition some sort of periodic synchronization is required, as otherwise a single sample mismatch between the input stream and the DFT window would lead to dis-synchronization and potentially wrongly recovered stream bits.

The FFT-based idea above also seems impractical for real applications where we have burst time mismatch and noise. Instead of going forward in the f-domain, one can easily try to filter out specific frequencies and then try to correlate and compare both for the bit recovery. Here is a picture of my receiver implementation thoughts. Apparently this has been a quite common approach in the old QPSK dial-up modem chips.

Now I have a plan for my spare time in the future... one bright and lucky day:

**1.** Implement a function with single pole LP and HP filters in series.

**2.** Implement a power/summation/LPF function with zero cross detection.

**3.** A self-syncing bit-recovery block function.

**4.** A look-up table translation function.

**5.** Test all above with a real FSK stream (recorded from a radio transmission /w noise)

**6.** Glue the TX and RX functions together in a more functional form.

Come to think of it, all this starts to more and more look like a Lab 0 task of a typical Signals and systems course. Well, bar the fact that I am trying to write this in C to avoid cathing rust.

**Inspiring 13-17 y.o. high school students to take the engineering path**

I was recently invited to say some "inspiring" words to high school students in Varna (the largest sea port town in Bulgaria). The whole idea of the event was to provoke some thinking from the kids' side and act somehow as a career guide.

The moment I accepted the invitation, I knew that drawing their attention on electronics/microchips would be a very hard task. Being able to give clear explanations, especially about high-level (general) electrical engineering topics is an art that not every engineer can master. I brought some PCBs, demonstrated an FM radio transmission, gave some explanations about time clocks and where the clocks usually get their 1Hz reference from, demonstrated some applications of microcontrollers and their connection with internet, but somehow I felt that the kids were not inspired enough. Here are my pre-demo slides, hopefully someone can give me some criticism. No matter how much I like Latex, beamer seems to be a horribly boring solution for the current purporse.

How can one explain best what electrical engineering is about?

**Reading raw PCM data files into an array in ANSI C**

I am working on the Rx part of the RTTY decoder. The first stage to start with is to find a way to read the encoded by the Tx in my previous post raw PCM file and dump it into an array. A natural act was to look online for already existing code for reading little endian 16-bit raw PCM files, to my surprise a lot about PCM is written out there, but strangely nothing in pure ANSI C which is small and selfcontained enough to be copied and used directly. An equivalent in function ANSI C code I am posting here probably exists out there online, but I will anyway print my implementation here, hopefully anyone would benefit from it. It is also a good way of storing it here for my own reference.

// Reads raw PCM file format and prints the samples in signed integer format on the standard output. // Initial version A, Deyan Levski, deyan.levski@gmail.com, 15.07.2014 // // #include stdio.h #include stdlib.h #include math.h #include string.h #define SAMPLE_RATE 44100 // Hz #define BITS_PER_SAMPLE 16 // bits #define N_SAMPLES 5000 // n int convertBitSize(unsigned int in, int bps) { const unsigned int max = (1 << (bps-1)) - 1; return in > max ? in - (max<<1) : in; } int readPCM(int *data, unsigned int samp_rate, unsigned int bits_per_samp, unsigned int num_samp) { FILE *fp; unsigned char buf; unsigned int i, j; fp = fopen("ttyburst.pcm", "r"); for (i=0; i < num_samp; ++i) { unsigned int tmp = 0; for (j=0; j != bits_per_samp; j+=8) { fread(&buf, 1, 1, fp); tmp += buf << j; } data[i] = convertBitSize(tmp, bits_per_samp); } } int main(void) { printf("SAMPLE RATE = [%d] BITS PER SAMPLE = [%d] N SAMPLES = [%d]\n", SAMPLE_RATE, BITS_PER_SAMPLE, N_SAMPLES); int *data = (int *) malloc(N_SAMPLES * sizeof(int)); readPCM(data, SAMPLE_RATE, BITS_PER_SAMPLE, N_SAMPLES); unsigned int i; for (i = 0; i < N_SAMPLES; ++i) { printf("%d\n", data[i]); } return 0; }

Compiles without errors on a standard gcc compiler with standard ANSI C option set. When run, the executable would look for ttyburst.pcm in the root folder and print all samples on the standard output. The raw PCM is assumed to be little endian, signed 16 bit, sampling rate and number of samples are set with the #define statements. Other formats can be supported with minor code modifications.