The burgeoning field of intelligent machine vision, fueled by the Internet of Things (IoT) and artificial intelligence (AI), demands efficient acquisition of complex, multidimensional visual information encompassing spatial, temporal, and spectral data. Conventional full-frame sensors struggle to handle this deluge of data, facing limitations in bandwidth, storage, and processing power. To address these challenges, researchers are exploring in-sensor computing paradigms that process data directly at the sensor level, minimizing the need for extensive data transfer. A particularly promising approach pairs compressive sensing (CS) with a two-dimensional (2D) optoelectronic sensor through CS's hardware implementation, snapshot compressive imaging (SCI), which captures multidimensional data in a compressed format within a single snapshot.
However, existing SCI systems often rely on bulky and complex external optical modulators, such as digital micromirror devices (DMDs) or liquid-crystal encoders, which are limited in speed, resolution, and ease of integration. Furthermore, the physical separation between the encoder and the sensor necessitates precise alignment and introduces processing latency. To overcome these hurdles, researchers are turning to 2D optoelectronic sensors built from van der Waals (vdW) heterostructures, whose unique properties allow sensing, memory, and computation to be integrated within a single, ultra-thin device.
This study presents a novel 2D optoelectronic sensor, a 2D vdW programmable photoinduced memory sensor (PPMS), designed for in-sensor compression of dynamic videos and three-dimensional spectral data. Leveraging the photosensitive properties of its constituent 2D materials, the PPMS simultaneously senses optical signals, encodes them through its programmable non-volatile conductance under electro-optical co-modulation, and compresses the information directly at the sensor level.
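Conceptually, this encoding is a per-pixel weighted accumulation: every incoming frame is multiplied by the conductance state programmed into each pixel and summed over the exposure, so a stack of frames collapses into a single 2D measurement. The NumPy sketch below illustrates that idea; the frame count, resolution, and multi-level weight values are assumptions for illustration, not parameters from the study.

```python
import numpy as np

# Conceptual sketch of in-sensor compressive encoding: each pixel weights
# every incoming frame by its programmed (non-volatile) conductance state
# and accumulates the result, so T frames collapse into one 2D snapshot.
rng = np.random.default_rng(42)
T, H, W = 8, 64, 64                       # assumed: 8 frames per snapshot

frames = rng.random((T, H, W))            # optical input (video frames or spectral bands)
conductance = rng.choice([0.0, 0.5, 1.0], size=(T, H, W))  # assumed multi-level states

snapshot = np.einsum('thw,thw->hw', conductance, frames)   # weighted accumulation

print(snapshot.shape)                     # (64, 64): one compressed measurement
print(frames.size / snapshot.size)        # 8.0 -> the nominal 8:1 compression ratio
```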
The researchers demonstrated the 2D optoelectronic sensor's capability to achieve an 8:1 compression ratio for both grayscale videos and hyperspectral images, effectively folding 3D data into a single compressed 2D snapshot. Remarkably, the quality of the original data reconstructed from these compressed 2D images, with a peak signal-to-noise ratio (PSNR) of 15.81 dB, is comparable to the 16.21 dB achieved through software-based compression and reconstruction. This parity highlights the sensor's potential to compress complex visual information in hardware without sacrificing fidelity relative to software approaches.
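For reference, PSNR is the standard fidelity metric behind these figures. The snippet below shows how it is computed for images scaled to [0, 1]; the test arrays and noise level are placeholders chosen only to land near the reported values, not the study's data.

```python
import numpy as np

def psnr(original, reconstructed, peak=1.0):
    """Peak signal-to-noise ratio in dB for arrays scaled to [0, peak]."""
    mse = np.mean((original - reconstructed) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

# Placeholder arrays just to exercise the formula; a PSNR near 16 dB
# corresponds to a fairly large mean-squared error (~0.025 for unit-range data).
rng = np.random.default_rng(1)
x = rng.random((64, 64))
x_hat = x + rng.normal(0.0, 0.16, size=x.shape)   # noise level chosen to land near 16 dB
print(round(psnr(x, x_hat), 2))
```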
Furthermore, the study showcases the intelligence of this in-sensor compression approach. Compressed action videos, represented as 2D images, retain the crucial semantic information of the scene, which allows actions to be classified directly with a convolutional neural network (CNN) without prior decompression. The classification accuracy achieved with the compressed data (93.18%) even exceeded that obtained with the uncompressed videos (83.43%), indicating that the compression process preserves, and can even accentuate, the features relevant to the task.
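Since each compressed snapshot is just a 2D image, it can be fed straight into a standard image classifier. The PyTorch sketch below shows what such a classifier could look like; the layer sizes, input resolution, and number of action classes are illustrative assumptions, not the network used in the study.

```python
import torch
import torch.nn as nn

# Minimal CNN that classifies actions directly from compressed 2D snapshots,
# with no decompression step. Sizes and class count are illustrative only.
class SnapshotClassifier(nn.Module):
    def __init__(self, num_classes=5):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 16 * 16, num_classes)  # assumes 64x64 input

    def forward(self, x):
        x = self.features(x)
        return self.classifier(x.flatten(start_dim=1))

# One compressed snapshot per sample: shape (batch, 1, 64, 64).
snapshots = torch.rand(4, 1, 64, 64)
logits = SnapshotClassifier()(snapshots)
print(logits.shape)   # torch.Size([4, 5])
```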
The development of this 2D optoelectronic sensor with integrated sensing, memory, and computation for in-sensor data compression represents a significant advancement towards efficient intelligent vision systems at the edge. By minimizing data transfer, reducing energy consumption, and simplifying system complexity, such sensors based on vdW heterostructures hold immense promise for enabling a new generation of compact and powerful machine vision applications in areas like industrial inspection, security monitoring, and autonomous driving.