The project started with the idea of essentially making an improved AprilTag. I figured it wouldn't need as many unique IDs as AprilTag uses, and with fewer bits of data on each tag you could theoretically get away with a lower-resolution camera. So I started prototyping.

A desk setup featuring a blue mouse pad with a black keyboard (with visible keys) placed centrally, four black rectangular cards each displaying a white square pattern arranged in a row below the keyboard, and a computer mouse positioned to the right of the keyboard. The desk surface is light-colored wood.

The design relied on a 2x2 bit field for identification and a 1-bit field for orientation. While experimenting with this, I quickly realized that to figure out the distance from the camera (without stereo vision) I essentially needed to fit the entire square to the detected tag. The distance estimate therefore depended on the orientation estimate, which meant less depth accuracy. At least that's how I remember it; this was almost 6 years ago at the time of writing.

A close-up view of a light brown wooden floor with visible grain patterns, knots, and natural wood texture. A circular retro-reflector emits a warm glow at the center of the image.

I realized that by changing the design to circular retro-reflectors, distance no longer relied on orientation. The major axis of the observed ellipse would be the same at the same distance no matter the orientation, so I could just fit an ellipse to it and derive the distance from the major axis. Eventually I also ended up doing some fancy sub-pixel fitting and the like for added accuracy, but I won't go into that.
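As a rough illustration of the idea (not the original implementation, which used a proper ellipse fit with sub-pixel refinement), the sketch below estimates the major axis of a filled blob from its second-order moments and converts it to a distance with a simple pinhole-camera model. The function names, the moment-based estimate, and the parameters `marker_diameter_m` and `focal_length_px` are all assumptions made for this example.

```python
import math


def major_axis_from_blob(pixels):
    """Estimate the full major-axis length (in pixels) of a filled elliptical blob.

    Uses second-order moments: for a uniformly filled ellipse, the variance
    along the major axis equals (semi-major axis)^2 / 4.
    """
    n = len(pixels)
    if n < 2:
        raise ValueError("need at least two blob pixels")
    mean_x = sum(x for x, _ in pixels) / n
    mean_y = sum(y for _, y in pixels) / n
    var_x = sum((x - mean_x) ** 2 for x, _ in pixels) / n
    var_y = sum((y - mean_y) ** 2 for _, y in pixels) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pixels) / n
    # Largest eigenvalue of the 2x2 covariance matrix (closed form).
    half_trace = (var_x + var_y) / 2.0
    root = math.sqrt(((var_x - var_y) / 2.0) ** 2 + cov_xy ** 2)
    largest_eigenvalue = half_trace + root
    # Semi-major axis is 2 * sqrt(eigenvalue); the full axis is twice that.
    return 4.0 * math.sqrt(largest_eigenvalue)


def distance_from_major_axis(major_axis_px, marker_diameter_m, focal_length_px):
    """Pinhole-camera estimate: distance = focal length * real size / image size."""
    if major_axis_px <= 0:
        raise ValueError("major axis must be positive")
    return focal_length_px * marker_diameter_m / major_axis_px
```

For example, with an assumed focal length of 800 px and a 5 cm marker, an observed major axis of 80 px corresponds to a distance of 0.5 m. Because only the major axis enters the formula, tilting the marker (which shrinks the minor axis) leaves the estimate unchanged in this simplified model.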

A close-up of a small rectangular electronic device placed on a light-colored surface. The device has a black PCB, a central metallic component, and visible ports including a USB connector. The background is a plain, light surface with minimal details.

I was not quite happy with the accuracy of this. It required a lot of smoothing and denoising, especially on the depth axis, which caused a lot of latency. This is when I decided to switch to something else: a purely IMU-based system drawing inspiration from Xsens. It's also around this time that we became a team of people working on it.

A close-up view of a hand holding a small black tracker with a black wrench taped to it and glowing red LED lights, set against a blurred wooden floor background.

Where our tracker really managed to shine was the magnetometer calibration: we had on-device soft- and hard-iron calibration. The MCU was admittedly incredibly outdated, and we had to pack the entire fitting algorithm into less than 1 kB of RAM on a processor that is 16 times slower than an ESP32. It was an insane engineering challenge, but after a lot of hard work on my side optimizing and implementing very advanced fitting algorithms and assembly-based dynamic memory management, we were able to do it. We could fit 100 samples of magnetometer data to a 9-dimensional deform matrix in less than 5 s, something our competitors at the time needed more than 5 s for on a full desktop computer. Once calibrated, it ran in real time, correcting any drift produced by the gyroscope and accelerometer.
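To give a feel for what hard- and soft-iron calibration does, here is a deliberately simplified sketch. It is not the fitting algorithm described above: it uses the basic per-axis min/max method, which only recovers an offset (hard iron) and an axis-aligned scale (a diagonal approximation of soft iron), whereas a full fit also handles cross-axis distortion. The function names are hypothetical.

```python
import math


def fit_minmax_calibration(samples):
    """Return (offset, scale) from raw 3-axis samples using per-axis extremes.

    offset approximates the hard-iron bias; scale is a diagonal-only
    stand-in for the soft-iron correction.
    """
    mins = [min(s[i] for s in samples) for i in range(3)]
    maxs = [max(s[i] for s in samples) for i in range(3)]
    offset = [(hi + lo) / 2.0 for lo, hi in zip(mins, maxs)]
    radii = [(hi - lo) / 2.0 for lo, hi in zip(mins, maxs)]
    if any(r == 0 for r in radii):
        raise ValueError("samples must cover a range on every axis")
    mean_radius = sum(radii) / 3.0
    # Stretch each axis so that all three end up with the same radius.
    scale = [mean_radius / r for r in radii]
    return offset, scale


def apply_calibration(raw, offset, scale):
    """Remove the bias, then rescale each axis."""
    return [(raw[i] - offset[i]) * scale[i] for i in range(3)]
```

After calibration, readings taken while rotating the sensor should lie on a sphere (constant magnitude) instead of an offset ellipsoid. The min/max method only works well when the samples actually reach the extremes on every axis, which is one reason a proper least-squares ellipsoid fit is preferable in practice.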

A desk surface with 14 small black square trackers, each featuring glowing red lights, arranged in a grid pattern. A computer mouse is positioned at the bottom of the image.

Unfortunately, due to monetary reasons, the project failed to commercialize.

For more information, read this article.