This was a pretty short and minor project I did for my "gymnasiearbete" (the Swedish upper-secondary diploma project) way back. The idea started as, "What if top-down reversing camera, but obstacles are highlighted in red?"

After politely begging the school to buy a stereoscopic camera for my project, I could get to work. Unfortunately I don't have many pictures from this project, so this will be based on the paper I wrote on it at the time.

A desk scene with open books, a bright red box labeled 'S217', a connector, and a stapler. Overlaid on the desk are four rectangular sections displaying delta values: Δ=54, Δ=49, Δ=20, and Δ=63. The text 'Most Likely Disparity' is prominently visible above these sections, connecting to the section with Δ=20.

I start by introducing how stereoscopic camera feeds can be used to determine a depth buffer. This was also a bit before computer vision got very machine-learning oriented, so it was just good old block matching: comparing the two pictures, trying different offsets to see where a small subsection of one image differed the least from the other, and then applying some filtering on the result.
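The core of that matching step can be sketched roughly like this. This is a minimal NumPy sketch, not the original code: the block size and disparity range are made-up values, and it assumes rectified images so matches only shift horizontally.

```python
import numpy as np

def block_match_disparity(left, right, y, x, block=7, max_disp=64):
    """Estimate disparity at one pixel: slide a block from the left image
    across the right image and keep the offset with the smallest sum of
    absolute differences (SAD)."""
    h = block // 2
    patch = left[y - h:y + h + 1, x - h:x + h + 1].astype(np.float32)
    best_d, best_cost = 0, np.inf
    # Try every candidate offset that keeps the block inside the image.
    for d in range(min(max_disp, x - h) + 1):
        cand = right[y - h:y + h + 1, x - h - d:x + h + 1 - d].astype(np.float32)
        cost = np.abs(patch - cand).sum()
        if cost < best_cost:
            best_cost, best_d = cost, d
    return best_d
```

In practice you'd run this per pixel (or use something like OpenCV's `StereoBM`, which does the same idea much faster) and then filter the resulting disparity map.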

A black and white projection of a scene with gravel for ground and 2 cars from above.

I projected the pixels into 3D space with the acquired depth to make projections from different angles, a fancier version of those car cameras that just warp the picture to make it look top-down. If I had done this today, I'd probably make it a mesh and then render that mesh to have no gaps in the point cloud. At the time I instead did a morphology-type filter to fill the empty gaps between the points in the point cloud.
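The back-projection step looks something like this, assuming a simple pinhole model. The intrinsics `fx`, `fy`, `cx`, `cy` are stand-ins here; in the real project they would come from the camera's calibration.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project a depth map into 3D camera-space points with a
    pinhole model: x = (u - cx) * z / fx, y = (v - cy) * z / fy."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1)  # one 3D point per pixel, shape (h, w, 3)
```

The resulting cloud can then be re-projected from any viewpoint, which is what makes the arbitrary-angle views possible.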

A histogram showing the frequency distribution of height measurements in millimeters. The x-axis is labeled 'height (mm)' with values ranging from approximately 1200 to 1300 mm, and the y-axis represents frequency. Green bars depict the frequency of each height interval, with a peak around 1336 mm. A black annotation with the text 'height = 1336 mm' highlights this specific value on the chart, with a red normal distribution curve fitted to the bars.

Then I get into how to determine the height of the ground, so that I can later classify things above and below ground level as obstacles. I start by stepping through the histogram from right to left, which in this graph means from below ground and upwards. Once I find the peak, I grab the samples close to it and fit a Gaussian distribution curve to those samples. The center of the curve is used as the ground height, with some margin for error.

A picture of a driveway with two cars and a house next to it. The ground is highlighted in green and the cars and house in red indicating they are obstacles.

I then classify each pixel as an obstacle or ground and apply a red or green hue based on the classification, giving an image like the one above. Conveniently, the camera I used had a built-in IMU that I used to determine orientation, which helped calculate the normal of the ground plane so the fitting would work in real time without manual adjustments.
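A toy version of the classification and tinting step could look like this. The 30 mm margin and the channel-boost numbers are placeholders, and it assumes the heights have already been rotated into a gravity-aligned frame using the IMU.

```python
import numpy as np

def tint_obstacles(image, heights, ground_h, margin=30):
    """Tint pixels red (obstacle) or green (ground). `heights` holds each
    pixel's height in mm in a gravity-aligned (IMU-corrected) frame;
    anything further than `margin` mm from the ground height is an
    obstacle. Margin and blend factors are placeholder values."""
    obstacle = np.abs(heights - ground_h) > margin
    out = image.astype(np.float32)
    out[obstacle, 0] = np.minimum(out[obstacle, 0] * 1.5 + 40, 255)    # push red channel
    out[~obstacle, 1] = np.minimum(out[~obstacle, 1] * 1.5 + 40, 255)  # push green channel
    return out.astype(np.uint8)
```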

The same driveway image from earlier, focusing on just the cars, but this time projected from above as a point cloud.

I could then project this image into a top-down view. Admittedly, the real-world testing I did seemed to indicate most people preferred the camera POV with red and green obstacle highlighting over the other options: top-down POV with obstacle highlighting, plain-color camera POV, and plain-color top-down POV.
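The top-down projection is essentially an orthographic splat of the point cloud. Here's a crude sketch, not the original code: the grid resolution and image size are made-up, and "largest y wins" per cell is just one possible tiebreak; the real choice depends on the camera's axis convention.

```python
import numpy as np

def splat_top_down(points, colors, res_mm=20, size=256):
    """Drop each 3D point onto an (x, z) grid seen from above, keeping
    the color of the point with the largest y per cell."""
    img = np.zeros((size, size, 3), dtype=np.uint8)
    best_y = np.full((size, size), -np.inf)
    pts = points.reshape(-1, 3)
    cols = colors.reshape(-1, 3)
    gx = (pts[:, 0] / res_mm + size // 2).astype(int)  # lateral position, centered
    gz = (pts[:, 2] / res_mm).astype(int)              # distance from camera
    ok = (gx >= 0) & (gx < size) & (gz >= 0) & (gz < size)
    for x, z, y, c in zip(gx[ok], gz[ok], pts[ok, 1], cols[ok]):
        if y > best_y[z, x]:
            best_y[z, x] = y
            img[z, x] = c
    return img
```

The splat leaves gaps wherever no point lands in a cell, which is what the morphology filter mentioned earlier was for.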

A black and white road projected from above with a point cloud, using a technique to fully fill the visual gaps between the points, effectively making the road look as though it were seen from above.

I'll also include this picture from some top-down tests I did. This version completely filled the gaps but couldn't run in real time, as it was written in Python (sorry) and wasn't hardware accelerated.
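The gap filling itself can be done with a morphological closing: dilate so nearby points bleed into the gaps, then erode to shrink things back. A NumPy sketch, not the original filter, with a made-up kernel size:

```python
import numpy as np

def close_gaps(img, k=3):
    """Morphological closing with a k*k square kernel: a max filter
    (dilation) followed by a min filter (erosion), implemented with
    padding and stacked shifted views."""
    def _filter(a, op, pad_val):
        p = k // 2
        padded = np.pad(a.astype(float), p, constant_values=pad_val)
        shifts = [padded[dy:dy + a.shape[0], dx:dx + a.shape[1]]
                  for dy in range(k) for dx in range(k)]
        return op(np.stack(shifts), axis=0)
    dilated = _filter(img, np.max, -np.inf)   # fill small holes
    return _filter(dilated, np.min, np.inf)   # shrink back to size
```

Done naively per pixel in pure Python this is slow, which matches my experience at the time; vectorized (or on a GPU) it's cheap.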


I remember this project being a lot of fun to work on. Being 18 at the time, I found it quite challenging and fun to have a crack at. Looking back with the skill set I have today, I would do things very differently, though I would probably not work on autonomous car tech in the first place today. If you think about it, it's really just emulating trains, but less efficiently. That is of course an incredibly boring answer to cool tech, but infrastructure is better when it is boring and reliable than when it is cool but experimental and potentially unreliable.