US20260094633
2026-04-02
Physics
G11C7/1084
A computational memory device improves the performance of neural network models by pairing memory banks with a computational memory block. The memory banks store weight data in a weight memory block, and the computational memory block, stacked on top of the weight memory block, performs multiply-accumulate (MAC) operations. The computational memory block uses a bit cell array for bitwise operations, which lets it process neural network layers efficiently.
Key components include the weight memory block, the computational memory block, and a communication interface. The weight memory block provides weight data upon request, and the computational memory block performs MAC operations on this weight data and the input data. The communication interface carries data between the two blocks, using parallel-to-serial conversion and through-silicon vias (TSVs) for connectivity.
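The parallel-to-serial conversion mentioned above can be illustrated with a minimal Python sketch. The function names, the 8-bit default width, and the least-significant-bit-first ordering are assumptions made for illustration; the source does not specify the word width or the bit order used across the TSVs.

```python
from typing import List


def parallel_to_serial(word: int, width: int = 8) -> List[int]:
    """Serialize a parallel word into a bit stream, LSB first (assumed order)."""
    if not 0 <= word < (1 << width):
        raise ValueError("word does not fit in the given width")
    return [(word >> i) & 1 for i in range(width)]


def serial_to_parallel(bits: List[int]) -> int:
    """Reassemble an LSB-first bit stream into a parallel word."""
    word = 0
    for i, bit in enumerate(bits):
        word |= (bit & 1) << i
    return word
```

In this model a weight word leaves the sending side as a sequence of single bits and is reassembled on the receiving side, so `serial_to_parallel(parallel_to_serial(w))` returns `w` for any word that fits in the chosen width.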
The device processes a neural network layer by receiving input data, requesting the corresponding weight data, and executing MAC operations. The computational memory block handles multi-bit data by performing the operations sequentially, one bit at a time. This approach supports efficient handling of large datasets, which is crucial for applications like machine learning and neural network authentication.
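A bit-by-bit MAC of this kind can be sketched in Python as follows. This is an illustrative model, not the patented circuit: it assumes that the unsigned input values are the operand processed one bit at a time, that the weights are ordinary integers, and that partial sums are combined by shifting. The name `bit_serial_mac` and the parameter `input_bits` are hypothetical.

```python
from typing import Sequence


def bit_serial_mac(inputs: Sequence[int], weights: Sequence[int],
                   input_bits: int = 8) -> int:
    """Compute sum(x * w) by visiting one bit position of the inputs per step."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    if any(not 0 <= x < (1 << input_bits) for x in inputs):
        raise ValueError("inputs must be unsigned and fit in input_bits")

    acc = 0
    for b in range(input_bits):
        # Multiplying by a single input bit reduces to selecting the weight
        # (bit is 1) or contributing nothing (bit is 0).
        partial = sum(w for x, w in zip(inputs, weights) if (x >> b) & 1)
        # Weight the partial sum by the significance of the current bit.
        acc += partial << b
    return acc
```

Each pass of the outer loop needs only single-bit selections and additions, which is why a bit cell array that performs bitwise operations can be used in place of full multi-bit multipliers.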
The communication interface manages data flow through dual channels, enabling concurrent processing of different weight data sets. The memory banks combine non-volatile and volatile memory types, such as MRAM and SRAM respectively, to store and process data. The computational memory block includes adders and accumulators, which sum and accumulate the results of the multiplication operations.
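The interplay of two channels, adders, and accumulators can be modeled with the rough Python sketch below. Several details are assumptions not stated in the source: both channels are applied to the same input tiles, the adders are arranged as a pairwise tree, and each channel has its own accumulator. The loop visits the channels in lockstep only to model behavior; in hardware the two channels would operate concurrently.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Accumulator:
    """Running total of the partial sums produced for one channel."""
    total: int = 0

    def add(self, value: int) -> int:
        self.total += value
        return self.total


def adder_tree(values: Sequence[int]) -> int:
    """Sum values by pairwise reduction, as a tree of two-input adders would."""
    level: List[int] = list(values)
    if not level:
        return 0
    while len(level) > 1:
        if len(level) % 2:
            level.append(0)  # pad odd levels so every adder has two inputs
        level = [level[i] + level[i + 1] for i in range(0, len(level), 2)]
    return level[0]


def dual_channel_mac(input_tiles: Sequence[Sequence[int]],
                     weights_ch0: Sequence[Sequence[int]],
                     weights_ch1: Sequence[Sequence[int]]) -> Tuple[int, int]:
    """Accumulate MAC results for two weight sets delivered on two channels."""
    accs = (Accumulator(), Accumulator())
    for x_tile, w0_tile, w1_tile in zip(input_tiles, weights_ch0, weights_ch1):
        for acc, w_tile in zip(accs, (w0_tile, w1_tile)):
            products = [x * w for x, w in zip(x_tile, w_tile)]
            acc.add(adder_tree(products))
    return accs[0].total, accs[1].total
```

Keeping one accumulator per channel means the two weight sets never share intermediate state, so neither channel has to wait for the other to finish a tile.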
This technology is particularly beneficial for applications that require extensive vector-matrix multiplication, which is common in neural networks. By optimizing the MAC operations, the device improves the overall performance of neural network models, making it suitable for fields such as image processing and data analysis. The described architecture provides a scalable and efficient solution for complex computational tasks.
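To make the link between vector-matrix multiplication and MAC operations concrete, the following Python sketch expresses the product as repeated calls to a single MAC step. It is a plain software reference under the assumption of a row-major weight matrix stored as nested lists; the names `mac` and `vector_matrix_multiply` are illustrative and do not come from the source.

```python
from typing import List, Sequence


def mac(acc: int, x: int, w: int) -> int:
    """One multiply-accumulate step: add the product x * w to acc."""
    return acc + x * w


def vector_matrix_multiply(x: Sequence[int],
                           weights: Sequence[Sequence[int]]) -> List[int]:
    """Compute y[j] = sum over i of x[i] * weights[i][j] using only MAC steps."""
    if not weights or len(x) != len(weights):
        raise ValueError("weights needs one row per element of x")
    cols = len(weights[0])
    if any(len(row) != cols for row in weights):
        raise ValueError("all weight rows must have the same length")

    y = [0] * cols
    for j in range(cols):
        for i, x_i in enumerate(x):
            y[j] = mac(y[j], x_i, weights[i][j])
    return y
```

For an input of length n and a weight matrix with m columns, this performs n × m MAC steps, which is why speeding up the MAC operation itself translates directly into faster layer evaluation.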