What is the technical principle behind 3D vision sensors? Does this technology not require large models for training? If not, how can we ensure that robots equipped with our company's 3D vision sensors can recognize all objects and people?

Occipital-UW: Hello! 3D vision sensing is based on several different principles, including structured light, binocular stereo vision, and time-of-flight (ToF); of these, binocular stereo vision is the closest to human binocular vision. Occipital has laid out the full field of 3D vision perception technology, covering six mainstream technical routes: structured light, iToF, binocular stereo, dToF, Lidar, and industrial 3D measurement.

At present, 3D vision perception is mainly used to acquire video streams such as RGB, IR, and Depth, and recognition is performed on top of these streams. The multimodal video-stream acquisition stage does not involve large-model training: 3D vision perception itself does not necessarily require a large model, but rather combines the three-dimensional data obtained by the sensor with the relevant comparison algorithms to recognize faces, human bodies, or objects. This pipeline is continually refined through the experience of our 3D vision sensing experts and incremental engineering innovation. We have also begun experimenting with model-based depth estimation in binocular stereo vision, and in the future models will give us opportunities to adapt to more complex environments and obtain more complete RGB-D data. Thank you for your attention and support of the company!
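As a rough illustration of the binocular stereo principle mentioned above (not Occipital's actual pipeline), the sketch below uses OpenCV's semi-global block matcher to compute a disparity map from a rectified left/right image pair and converts it to metric depth via triangulation, Z = f·B/d. The focal length, baseline, and file names are assumed placeholder values.

```python
# Minimal sketch of binocular stereo depth estimation, assuming a
# calibrated and rectified camera pair. All numeric values and file
# names are illustrative, not taken from any specific sensor.
import cv2
import numpy as np

FOCAL_LENGTH_PX = 700.0   # assumed focal length in pixels (from calibration)
BASELINE_M = 0.05         # assumed distance between the two cameras, in meters

# Load a rectified stereo pair (hypothetical file names).
left = cv2.imread("left.png", cv2.IMREAD_GRAYSCALE)
right = cv2.imread("right.png", cv2.IMREAD_GRAYSCALE)

# Semi-global block matching finds, for each pixel, the horizontal shift
# (disparity) between the two views.
matcher = cv2.StereoSGBM_create(
    minDisparity=0,
    numDisparities=64,    # search range; must be divisible by 16
    blockSize=5,
)
# SGBM returns disparities scaled by 16, so rescale to pixel units.
disparity = matcher.compute(left, right).astype(np.float32) / 16.0

# Triangulation: closer objects have larger disparity, Z = f * B / d.
# Mask out invalid (non-positive) disparities before dividing.
valid = disparity > 0
depth_m = np.zeros_like(disparity)
depth_m[valid] = FOCAL_LENGTH_PX * BASELINE_M / disparity[valid]
```

The resulting depth map, aligned with the RGB image, is the kind of RGB-D data on which downstream comparison or recognition algorithms would then operate.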