Energy-Efficient In-Memory Computing using 3D Flash Memory with Sequential Multi-Block Activation and Current Control Cell (CC cell)

August 7, 2025

With the widespread adoption of AI and machine learning, high-performance computing is beginning to be used not only in specialized fields such as supercomputing but also in general-purpose applications. However, high-performance computing consumes a huge amount of energy and has a large environmental impact. A solution that improves both computing performance and energy efficiency is therefore strongly needed. One such solution is “in-memory computing,” which reduces the energy consumed by data transfers between processor and memory by providing computing functions in the memory itself.

In our previous work, we developed an in-package boost converter [1], a technology that significantly reduces the power consumption of the read operation. Extending that work, we explored a new energy-efficient in-memory computing approach using 3D flash memory.

In AI and machine learning, there is a process that searches for vector data similar to vectorized sentences and images. It is used when an AI recognizes the objects in an image and their context. This process is called approximate search [2].

In this work, we propose a novel approximate search method for in-memory computing that combines sequential multi-block activation with a current control cell (CC cell) (Figure 1). The key vector data is stored along the bit line, and the query vector data to be searched is applied to the selected word line. A CC cell is placed for each string, and its role is to limit the current of an on-string to a constant, lower value. The on-current of each bit line is therefore determined by the CC cell. The inner product of the vectors, which is obtained from the sum of the on-currents, indicates the similarity between the key and query vector data.

Fig.1 Schematic diagram of the approximate search method that uses sequential multi-block activation and the current control cell (CC cell) [3].
©2025 IEEE
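
The relationship between summed on-currents and the inner product can be illustrated with a minimal software sketch. It is a simplified model, not the device implementation: it assumes binary key and query vectors, assumes a string conducts only when the stored key bit and the applied query bit are both 1, and uses a hypothetical constant current I_CC in arbitrary units. The function names bitline_current and rank_keys are illustrative only.

```python
import random

I_CC = 1.0  # hypothetical: constant on-current per conducting string (arbitrary units)


def bitline_current(key, query, i_cc=I_CC):
    """Summed on-current of one bit line for a binary key and a binary query."""
    if len(key) != len(query):
        raise ValueError("key and query must have the same dimension")
    # Assumption: a string conducts only when the key bit and the query bit are both 1.
    on_strings = sum(k & q for k, q in zip(key, query))
    # Every conducting string is clamped to the same current,
    # so the sum is proportional to the inner product of the two vectors.
    return i_cc * on_strings


def rank_keys(keys, query):
    """Return key indices ordered from highest to lowest summed current."""
    currents = [bitline_current(key, query) for key in keys]
    return sorted(range(len(keys)), key=lambda i: currents[i], reverse=True)


DIM = 128  # vector dimension used in the demonstration described in the text
rng = random.Random(0)
keys = [[rng.randint(0, 1) for _ in range(DIM)] for _ in range(8)]
query = list(keys[3])  # a query identical to key 3

currents = [bitline_current(key, query) for key in keys]
ranking = rank_keys(keys, query)
print(currents)
print(ranking)
```

In this model, the key identical to the query always reaches the maximum summed current, because no other binary key can share more 1-bits with the query than the query has. Clamping every on-string to the same current is what makes the sum depend only on the number of matching bits.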

We demonstrated approximate search with 128-dimensional vector data. The results show that key vector data with high similarity can be distinguished effectively (Figure 2).

Fig.2 The experimental results of computing inner product for 128-dimensional vector data [3].
©2025 IEEE

Compared with the conventional read operation, energy efficiency is significantly improved in 128-dimensional in-memory computing: the memory access energy per bit is reduced from 30 pJ/bit to 0.17 pJ/bit (Figure 3). This technology is fully compatible with conventional 3D flash memory and is considered essential for achieving energy-efficient in-memory computing.

Fig.3 Energy consumption reduction effect with in-memory computing [3].
©2025 IEEE
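
To put the two reported figures in perspective, the short calculation below derives the reduction factor and the relative saving. Only the values 30 pJ/bit and 0.17 pJ/bit come from the text; the variable names are illustrative, and the derived numbers are simple arithmetic rather than additional measured results.

```python
CONVENTIONAL_PJ_PER_BIT = 30.0  # conventional read operation (from the text)
IN_MEMORY_PJ_PER_BIT = 0.17     # 128-dimensional in-memory computing (from the text)

# How many times less energy per bit the in-memory approach uses.
reduction_factor = CONVENTIONAL_PJ_PER_BIT / IN_MEMORY_PJ_PER_BIT

# Relative saving as a percentage of the conventional energy per bit.
saving_percent = (1.0 - IN_MEMORY_PJ_PER_BIT / CONVENTIONAL_PJ_PER_BIT) * 100.0

print(f"reduction factor: about {reduction_factor:.0f}x")  # about 176x
print(f"energy saving: about {saving_percent:.1f}%")       # about 99.4%
```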

This achievement was presented at the IEEE International Memory Workshop (IMW) 2025.

References
[1] Kazuma Hasegawa et al., "Low Power and Thermal Throttling-less SSD with In-Package Boost Converter for 1000-WL Layer 3D Flash Memory", IEEE International Memory Workshop (IMW), 2023.
[2] Shinichi Sasaki et al., "Mitigation of Accuracy Degradation in 3D Flash Memory Based Approximate Nearest Neighbor Search with Binary Tree Balanced Soft Clustering for Retrieval-Augmented AI", IEEE Interregional NEWCAS Conference, 2024, pp. 238-242.
[3] Kana Kudo et al., "Energy-Efficient In-Memory Computing using 3D Flash Memory with Sequential Multi-Block Activation and Current Control Cell (CC cell)", IEEE International Memory Workshop (IMW), 2025.