1. Solution Overview
The core value of an embodied intelligent machine lies in autonomous, real-time, precise interaction with the physical environment through a closed "perception-decision-execution" loop, and three-dimensional perception is the prerequisite for closing that loop. This solution is built around a self-developed binocular depth vision camera that integrates an AI-ISP, depth-vision application algorithms, and on-device AI compute, providing a low-cost, highly robust, lightweight 3D perception solution for all kinds of embodied intelligent machines (humanoid robots, collaborative manipulators, service robots, autonomous mobile platforms, etc.). It targets core pain points such as depth perception, target positioning, and human-machine collaboration in complex environments, enabling large-scale deployment in industrial, service, and consumer fields.
2. Industry Pain Points
1. Traditional monocular vision captures only two-dimensional images and cannot directly output object distance, size, or three-dimensional position. Intelligent machines therefore misjudge easily during navigation, obstacle avoidance, and grasping, and must rely on complex algorithms to estimate depth, with insufficient real-time performance and reliability.
2. With traditional depth vision cameras, a single sensor is prone to imaging distortion and feature-point loss in low-light, backlit, weakly textured (white walls, solid-color floors), rainy, or foggy scenes, causing depth computation to fail and making all-weather operation impossible.
3. Depth data has traditionally depended on a host computer for processing; transmission and computation latency is too high to match millisecond-level real-time interaction requirements such as robotic-arm grasping and humanoid gait adjustment, greatly reducing action-execution accuracy.
4. High-precision sensing devices such as lidar are bulky, power-hungry, and expensive, and cannot fit lightweight carriers such as small service robots and portable intelligent terminals, restricting large-scale application.
5. A single sensor has perception blind spots, while multi-sensor combinations must solve calibration, data synchronization, and coordinate-system alignment; algorithm-hardware integration costs are high and deployment efficiency is low.
3. Solution and Core Benefits
The binocular depth vision camera independently developed by Deep Shadow Intelligence draws on the core characteristics of "RGB-D synchronous perception + low cost + light weight + on-device compute integration" to precisely match the perception requirements of embodied intelligent machines and resolve the above pain points at the root. Compared with traditional perception schemes, it has significant advantages: it outputs absolute depth directly without relying on motion sequences and is more stable in static scenes; it provides both color and depth information, supporting integrated "semantic recognition + spatial positioning" perception; it is small and low-power, fitting all kinds of lightweight carriers; and its cost is only 1/5 to 1/10 that of lidar, favoring large-scale deployment.
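As an illustration of why a calibrated binocular setup can output absolute depth directly, for a rectified stereo pair depth follows from disparity via triangulation (Z = f · B / d). The sketch below uses hypothetical focal-length and baseline values, not this product's specifications:

```python
# Illustrative sketch: depth from disparity for a rectified stereo pair.
# Focal length and baseline values below are hypothetical, not product specs.

def depth_from_disparity(disparity_px: float, focal_px: float, baseline_m: float) -> float:
    """Triangulate depth Z = f * B / d (focal length in pixels, baseline in meters)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

# Example: 700 px focal length, 5 cm baseline, 35 px disparity -> 1.0 m
z = depth_from_disparity(35.0, 700.0, 0.05)
print(round(z, 3))  # 1.0
```

This also shows why a wider baseline or higher resolution improves depth precision at range: depth error grows with Z²/(f·B) for a fixed disparity error.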

1. Accurate and efficient three-dimensional perception: the device outputs dense point clouds and fused RGB-D data in real time on the end side, balancing depth accuracy and real-time performance to meet the precise-operation and rapid-response requirements of embodied intelligent machines.
2. Strengthened environmental adaptability: a built-in self-developed dark-light full-color algorithm engine, combined with an improved stereo matching algorithm, works stably in complex scenes such as low light, backlight, and weak texture, breaking environmental restrictions.
3. Lightweight and cost-effective: compared with multi-sensor fusion schemes, the binocular depth camera has a simple structure that greatly reduces hardware cost; its small size and low power consumption suit all kinds of lightweight embodied intelligent machines and favor large-scale deployment.
4. Low-threshold integration and high scalability: standardized interfaces, a complete SDK, and development documentation support rapid integration and multi-sensor expansion, adapting to the function-upgrade requirements of different scenarios.
5. Perception-execution co-optimization: on-device compute integration plus pixel-level RGB-D alignment greatly reduces data latency, enabling precise coordination of perception and execution and improving the action-execution accuracy of intelligent machines.
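Pixel-level RGB-D alignment means every depth pixel maps to a colored 3D point, which is what makes integrated "semantic recognition + spatial positioning" possible. A minimal sketch of back-projecting an aligned depth map into a colored point cloud, assuming a pinhole model (the intrinsics fx, fy, cx, cy below are illustrative values, not the camera's actual parameters):

```python
import numpy as np

# Minimal sketch: back-project an aligned RGB-D frame into a colored point cloud.
# Intrinsic parameters used here are illustrative, not the camera's calibration.

def rgbd_to_point_cloud(depth_m, rgb, fx, fy, cx, cy):
    """Return an (N, 6) array of [X, Y, Z, R, G, B] for pixels with valid depth."""
    h, w = depth_m.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))  # pixel coordinates
    z = depth_m
    valid = z > 0                                   # drop holes (zero depth)
    x = (u - cx) * z / fx                           # pinhole back-projection
    y = (v - cy) * z / fy
    xyz = np.stack([x[valid], y[valid], z[valid]], axis=1)
    colors = rgb[valid].astype(np.float32)          # per-point RGB from alignment
    return np.hstack([xyz, colors])

# Tiny example: a 2x2 depth map at 1 m with a dummy color image
depth = np.ones((2, 2), dtype=np.float32)
rgb = np.zeros((2, 2, 3), dtype=np.uint8)
cloud = rgbd_to_point_cloud(depth, rgb, fx=500.0, fy=500.0, cx=1.0, cy=1.0)
print(cloud.shape)  # (4, 6)
```

Running this conversion on-device, rather than on a host computer, is what removes the transmission and computation latency described in pain point 3.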
Based on its independently developed binocular depth vision camera, Deep Shadow Intelligence has constructed a full-link three-dimensional perception solution adapted to embodied intelligence. By precisely addressing core pain points such as the missing perception dimension, weak environmental adaptability, and insufficient real-time performance, it provides low-cost, high-performance, easy-to-integrate perception support for embodied intelligence. The solution balances technical sophistication with deployment feasibility, can be widely adapted to humanoid robots, collaborative robotic arms, service robots, and other products, helping customers accelerate the development and large-scale deployment of embodied intelligent machines and promoting the intelligent upgrading of the industry.