Two different approaches allow mono vision to function like stereo vision.
One important factor associated with machine vision is depth perception. This refers to the ability to perceive objects in three dimensions and determine their size, distance, and how they are moving in relation to the sensing platform, which might be an autonomous vehicle or a robot, for example.
When asked what is required for depth perception, most people point to biological binocular vision. In the case of humans, with two eyes facing forward and slightly separated, each eye receives a slightly different image. Any positional differences, known as horizontal or binocular disparities, are processed in the visual cortex of the brain to yield depth perception.
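To make the geometry concrete, here's a back-of-the-envelope sketch of the classic pinhole stereo relationship, in which depth equals the focal length times the baseline (the separation between the two viewpoints) divided by the disparity. The numbers below are made-up example values, not measurements from any real system.

```python
# Illustrative sketch of how binocular (stereo) disparity maps to depth.
# The values used here are arbitrary example numbers.

def depth_from_disparity(focal_length_px: float,
                         baseline_m: float,
                         disparity_px: float) -> float:
    """Classic pinhole stereo relation: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive for a finite distance")
    return focal_length_px * baseline_m / disparity_px

# Example: a 1,000-pixel focal length, a 6.5 cm baseline (roughly the spacing
# of human eyes), and a 20-pixel disparity put the object about 3.25 m away.
print(depth_from_disparity(1000.0, 0.065, 20.0))
```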
Our brains also employ knowledge of the expected sizes of objects and any changes in the sizes of those objects to determine what’s happening along with any required response (“That’s a baseball… it’s getting bigger… time to DUCK!”).
We also take advantage of other visual cues, including perspective (parallel lines like railway tracks get closer together the further away they are) and motion parallax (things appear to move at different rates depending on their distance from a moving observer).
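Motion parallax can also be reduced to a simple relationship, at least in the idealized case of a stationary object viewed at right angles to the observer's direction of travel: the distance is roughly the observer's speed divided by the angular rate at which the object's image sweeps across the field of view. The sketch below uses arbitrary example numbers.

```python
# Toy illustration of motion parallax for a stationary object seen directly
# abeam (at 90 degrees to) a moving observer. Example values are arbitrary.
import math

def range_from_parallax(observer_speed_mps: float,
                        angular_rate_rad_per_s: float) -> float:
    """Approximate distance = observer speed / angular rate of the image."""
    return observer_speed_mps / angular_rate_rad_per_s

# A tree whose image sweeps past at about 2 degrees per second while we walk
# at 1.4 m/s is roughly 40 m away.
print(range_from_parallax(1.4, math.radians(2.0)))
```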
In the case of machine vision, depth perception is central to tasks such as understanding scenes and identifying, recognizing, and avoiding objects. One way the developers of machine vision systems can achieve depth perception is to use dual cameras to implement binocular vision. However, using two cameras means two lens assemblies and two CMOS sensors, thereby increasing the size, cost, and power consumption of the system. These issues are exacerbated in applications like autonomous vehicles that demand multiple systems (e.g., forward, side, and rear-facing).
What developers really want is to be able to extract depth data from a scene using a single sensor. Two companies—AIRY3D and Owl Autonomous Imaging—have come up with very different solutions to this problem. Owl is dedicated to reducing collisions between automobiles and pedestrians (or animals) at night by providing actionable object identification and location data to trigger automatic emergency braking systems.
On the hardware side of things, Owl is a fabless semiconductor company that creates thermal imagers. On the software side, Owl uses sophisticated artificial intelligence (AI) and machine learning (ML) to perform object detection, classification, and ranging (i.e., depth perception). Based on their knowledge of the typical sizes of things and how they behave in the real world, humans can estimate depth in a scene using just one eye coupled with a lot of processing in their brains. Owl can achieve the same results using a single thermal imager coupled with powerful AI/ML processing.
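To give a flavor of how knowing an object's typical size allows ranging with a single sensor, the snippet below uses the simple pinhole-camera relationship: distance equals focal length times real-world size divided by the size of the object's image on the sensor. This is a generic illustration with made-up numbers, not a description of Owl's proprietary AI/ML pipeline.

```python
# Simplified single-camera ranging from the known physical size of an object
# ("we know roughly how tall a pedestrian is"). Generic pinhole-camera sketch
# with made-up numbers; this is NOT Owl's proprietary AI/ML pipeline.

def range_from_known_size(focal_length_px: float,
                          real_height_m: float,
                          image_height_px: float) -> float:
    """Pinhole relation: distance = f * H_real / h_image."""
    return focal_length_px * real_height_m / image_height_px

# Example: a ~1.7 m pedestrian spanning 85 pixels, imaged with a 1,200-pixel
# focal length, is roughly 24 m away.
print(range_from_known_size(1200.0, 1.7, 85.0))
```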
AIRY3D has taken a very different approach, resulting in the ability to provide both a regular image and a corresponding depth map using a standard off-the-shelf CMOS sensor.
The only hardware modification is to apply a diffraction mask on top of the sensor. This mask modulates the light hitting the sensor. Everything else takes place in software. The raw data from the sensor is fed into a pre-filtering decoder, which extracts the modulated depth information, leaving standard image data to be passed through a traditional image signal processor (ISP). Meanwhile, the modulated depth data is passed through an image depth processor (IDP). The resulting image and depth map—correlated on a pixel-by-pixel basis—are available for use by higher-level applications.
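Purely as a way of visualizing the data flow described above, here's a schematic sketch of the two-path processing. Every function name and all the internal logic are hypothetical placeholders; AIRY3D's actual decoder, ISP, and IDP implementations are proprietary and not shown here.

```python
# Schematic of the two-path flow: raw data from the diffraction-masked sensor
# is split into standard image data (handled by an ISP) and modulated depth
# data (handled by a depth processor). All names are hypothetical placeholders.
import numpy as np

def prefilter_decode(raw: np.ndarray):
    """Pre-filtering decoder: separate image content from depth modulation.
    Placeholder logic only; simply passes the data through."""
    image_data = raw.astype(float)
    depth_modulation = np.zeros_like(raw, dtype=float)
    return image_data, depth_modulation

def run_isp(image_data: np.ndarray) -> np.ndarray:
    """Stand-in for a traditional image signal processor (demosaic, denoise, etc.)."""
    return image_data

def run_idp(depth_modulation: np.ndarray) -> np.ndarray:
    """Stand-in for the image depth processor that yields a per-pixel depth map."""
    return depth_modulation

raw_frame = np.zeros((1080, 1920), dtype=np.uint16)  # dummy sensor readout
image_path, depth_path = prefilter_decode(raw_frame)
final_image = run_isp(image_path)
depth_map = run_idp(depth_path)
assert final_image.shape == depth_map.shape  # image and depth map align pixel by pixel
```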
These two approaches to enabling mono vision to function like stereo vision, thereby providing machine vision with depth perception, address a wide range of applications and markets.