# Time-domain analog computing and VLSI systems toward ultimately high-efficient brain-like hardware

#### Takashi Morie

Kyushu Institute of Technology, Japan





# Outline

- Introduction
  - Our brain-like VLSI chips
  - My approach toward brain
- Time-domain analog computing and VLSI systems
  - Time-domain energy-efficient weighted sum calculation based on simple spiking neuron model
  - Chaotic Boltzmann machine circuit based on oscillator neuron model
- Conclusion

#### **Our brain-like VLSI chips**



#### **Different approaches to brain functions**




# **Digital and analog processor architectures**



## **DL Processors in ISSCC 2017**



#### **SESSION 14: Deep-Learning Processors**

Tuesday February 7th, 1:30 PM

Session Chair: Takashi Hashimoto, Panasonic, Kadoma, Japan

Associate Chair: Mahesh Mehendale, Texas Instruments, Bangalore, India

- 1:30 PM, 14.1: A 2.9TOPS/W Deep Convolutional Neural Network SoC in FD-SOI 28nm for Intelligent Embedded Systems
  - G. Desoli<sup>1</sup>, N. Chawla<sup>2</sup>, T. Boesch<sup>3</sup>, S-P. Singh<sup>2</sup>, E. Guidetti<sup>1</sup>, F. De Ambroggi<sup>4</sup>, T. Majo<sup>1</sup>, P. Zambotti<sup>4</sup>, M. Ayodhyawasi<sup>2</sup>, H. Singh<sup>2</sup>, N. Aggarwal<sup>2</sup>
  - <sup>1</sup>STMicroelectronics, Cornaredo, Italy; <sup>2</sup>STMicroelectronics, Greater Noida, India; <sup>3</sup>STMicroelectronics, Geneva, Switzerland; <sup>4</sup>STMicroelectronics, Agrate Brianza, Italy
- 2:00 PM, 14.2: DNPU: An 8.1TOPS/W Reconfigurable CNN-RNN Processor for General-Purpose Deep Neural Networks
  - D. Shin, J. Lee, J. Lee, H-J. Yoo, KAIST, Daejeon, Korea
- 2:30 PM, 14.3: A 28nm SoC with a 1.2GHz 568nJ/Prediction Sparse Deep-Neural-Network Engine with >0.1 Timing Error Rate Tolerance for IoT Applications
  - P. N. Whatmough, S. K. Lee, H. Lee, S. Rama, D. Brooks, G-Y. Wei, Harvard University, Cambridge, MA
- 14.4: A Scalable Speech Recognizer with Deep-Neural-Network Acoustic Models and Voice-Activated Power Gating
  - M. Price, J. Glass, A. Chandrakasan, Massachusetts Institute of Technology, Cambridge, MA
- 3:45 PM, 14.5: ENVISION: A 0.26-to-10TOPS/W Subword-Parallel Computational Accuracy-Voltage-Frequency-Scalable Convolutional Neural Network Processor in 28nm FDSOI
  - B. Moons, R. Uytterhoeven, W. Dehaene, M. Verhelst, KU Leuven, Leuven, Belgium

Latest digital DL processors: ~10TOPS/W

#### Measure of energy efficiency of processors

- FLOPS → OPS (fixed-point operations per second); e.g., a PC (Core i7) delivers ~500 GFLOPS
- Operation performance: TOPS/GOPS (Tera/Giga Operations Per Second)

#### **Energy efficiency: TOPS/W**

- TOPS/W = Tera ops. per sec. / Joule per sec. = Tera ops. / Joule

#### Energy consumption per op.: 1/(TOPS/W) [pJ/op], e.g. 1 TOPS/W corresponds to 1 pJ/op

#### Latest digital DL processors:

~10TOPS/W
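The conversion between efficiency and energy per operation can be sketched as below. This is a minimal illustration; the function names `pj_per_op` and `tops_per_watt` are hypothetical helpers, not part of the original slides.

```python
def pj_per_op(tops_per_watt_value: float) -> float:
    """Energy per operation in pJ for an efficiency given in TOPS/W.

    1 TOPS/W = 1e12 operations per joule, so one operation costs 1e-12 J = 1 pJ.
    """
    return 1.0 / tops_per_watt_value


def tops_per_watt(joules_per_op: float) -> float:
    """Efficiency in TOPS/W for an energy per operation given in joules."""
    return 1.0 / (joules_per_op * 1e12)


# Figure quoted on the slide: latest digital DL processors at ~10 TOPS/W.
digital_pj = pj_per_op(10.0)
print(f"10 TOPS/W is {digital_pj:.2f} pJ/op")
```

At ~10 TOPS/W a digital DL processor therefore spends about 0.1 pJ per operation.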

Synapse op. in **brain**: 0.1~1 fJ/op, i.e. 1,000~10,000 TOPS/W = 1~10 POPS/W

- Number of neurons: ~10<sup>11</sup>
- Number of synapses: ~10<sup>15</sup>
- Power cons.: 20 (~1) W
- Op. freq.: 10~100 Hz
- Activity: ~10 %
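One way to reproduce the 0.1~1 fJ/op range from the brain figures is sketched below. It rests on two assumptions that the slide does not state explicitly: that the "~1 W" in parentheses is the share of power attributed to synaptic operations, and that the operation rate is synapses × firing frequency × activity. The function name is hypothetical.

```python
def energy_per_synaptic_op(power_w: float, n_synapses: float,
                           rate_hz: float, activity: float) -> float:
    """Energy per synaptic operation in joules.

    Assumes every active synapse performs one operation per firing cycle.
    """
    ops_per_second = n_synapses * rate_hz * activity
    return power_w / ops_per_second


# Slide figures: ~1e15 synapses, 10~100 Hz, ~10 % activity, ~1 W (assumed share).
slow = energy_per_synaptic_op(1.0, 1e15, 10.0, 0.10)
fast = energy_per_synaptic_op(1.0, 1e15, 100.0, 0.10)
print(f"{fast * 1e15:.2f} fJ/op to {slow * 1e15:.2f} fJ/op")
```

Under these assumptions the estimate spans roughly 0.1 fJ/op (at 100 Hz) to 1 fJ/op (at 10 Hz).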

## Two types of neuron models



## **Energy consumption using resistive elements**

Voltage-domain circuits based on analog neuron model



Assuming $R = 100\,\text{M}\Omega$, $V_{in} = 0.1 \sim 1\,\text{V}$, $\tau = 1\,\mu\text{s}$:

$$E_w = \tau V_{in}^2 / R = 0.1 \sim 10\,\text{fJ}$$

**Op-amps consume much more energy** than resistors.
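The resistor energy estimate above can be checked numerically. This is a minimal sketch using only the slide's values; `resistor_energy` is a hypothetical helper name, and the estimate covers the resistive element only, not the op-amp.

```python
def resistor_energy(tau_s: float, v_in: float, r_ohm: float) -> float:
    """Energy dissipated in a resistor held at v_in for a time tau: tau * V^2 / R."""
    return tau_s * v_in ** 2 / r_ohm


R_OHM = 100e6   # 100 MΩ
TAU_S = 1e-6    # 1 µs

for v_in in (0.1, 1.0):
    e_w = resistor_energy(TAU_S, v_in, R_OHM)
    print(f"Vin = {v_in} V: {e_w * 1e15:.2f} fJ")
```

The two ends of the input range give about 0.1 fJ and 10 fJ, matching the range quoted on the slide.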

## **Functions of PSPs in spiking neurons**



#### Integrate & fire neuron



#### **Time-domain weighted-sum calculation model**



#### **Time-domain MAC calculation**
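The slide figure is not available in this text, so the following is only an illustrative sketch of one possible time-domain formulation, not necessarily the circuit model shown in the talk. It assumes each input value $x_i \in [0, 1]$ is encoded as a spike time $t_i = T(1 - x_i)$, each spike starts a post-synaptic potential that rises linearly with slope $w_i$, and the neuron fires when the summed potential reaches a threshold $\theta$. All names (`firing_time`, `weighted_sum_from_timing`, `t_window`) are hypothetical, and weights are restricted to positive values to keep the potential monotonic.

```python
def firing_time(weights, spike_times, threshold):
    """Time at which sum_i w_i * (t - t_i) reaches the threshold.

    Valid only if all weights are positive and the crossing happens after
    the last input spike has arrived.
    """
    if any(w <= 0 for w in weights):
        raise ValueError("this sketch assumes positive weights")
    total_slope = sum(weights)
    weighted_times = sum(w * t for w, t in zip(weights, spike_times))
    t_out = (threshold + weighted_times) / total_slope
    if t_out < max(spike_times):
        raise ValueError("threshold reached before all inputs arrived")
    return t_out


def weighted_sum_from_timing(weights, t_out, threshold, t_window):
    """Recover sum_i w_i * x_i from the output spike time.

    Uses x_i = (T - t_i) / T and sum_i w_i * t_i = t_out * sum(w) - threshold.
    """
    total_slope = sum(weights)
    return total_slope - (t_out * total_slope - threshold) / t_window


weights = [0.5, 0.3, 0.2]
inputs = [0.2, 0.5, 0.9]
t_window = 1.0
spike_times = [t_window * (1.0 - x) for x in inputs]

t_out = firing_time(weights, spike_times, threshold=1.5)
decoded = weighted_sum_from_timing(weights, t_out, 1.5, t_window)
direct = sum(w * x for w, x in zip(weights, inputs))
print(decoded, direct)
```

In this formulation the multiply-accumulate result is carried entirely by the output spike timing, so no explicit multiplier is needed.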



## **Energy consumption using resistive elements**



## Time-domain analog cross-bar circuit



# **Comparison among different approaches**

| Approach   | Implementation                   | # of neurons / synapses                            | Energy per synapse operation                          | Processing frequency                     | Power consumption  |
|------------|----------------------------------|----------------------------------------------------|-------------------------------------------------------|------------------------------------------|--------------------|
| Biological | Human brain                      | ~10<sup>11</sup> / ~10<sup>15</sup>                | 10<sup>-16</sup>~10<sup>-15</sup> J (**0.1~1 fJ**)    | 10~100 Hz (10% active)                   | 20 (~1) W          |
| Digital    | Supercomputer (Kei「京」) (\*1)   | 1.7x10<sup>9</sup> / 1.0x10<sup>13</sup>           | (6.5x10<sup>-4</sup> J) **(~1 mJ)**                   | Brain: 1 s = Kei: 2,400 s (4.4 fires/s)  | ~12 MW             |
|            | Digital chips (TrueNorth~)       | 1x10<sup>6</sup> / 256x10<sup>6</sup>              | ~10<sup>-13</sup> J (< 0.1 pJ)                        | 1 kHz                                    | 86 mW              |
| Analog     | Voltage/current-domain           | [1x10<sup>6</sup> / 256x10<sup>6</sup>] (\*2)      | >~10<sup>-15</sup> J **(~1 fJ) (\*3)**                | <~MHz                                    | [~300 mW] (\*4)    |
|            | Time-domain                      | [1x10<sup>6</sup> / 256x10<sup>6</sup>] (\*2)      | >~10<sup>-17</sup> J (~10 aJ) (\*3)                   | <~MHz                                    | [~3 mW] (\*4)      |

- (\*1) http://www.riken.jp/pr/topics/2013/20130802\_2/
- (\*2) Assuming the same values as TrueNorth
- (\*3) Estimate assuming ideal conditions
- (\*4) Only for weighted summation
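The bracketed power figures for the analog approaches appear consistent with a simple product of synapse count, energy per operation, and operating frequency. The sketch below checks this under two assumptions not stated on the slide: that every synapse performs one operation per cycle, and that the frequency is taken at the 1 MHz upper end of "<~MHz". The function name is hypothetical.

```python
def weighted_sum_power(n_synapses: float, energy_per_op_j: float,
                       rate_hz: float) -> float:
    """Power in watts if every synapse performs one operation per cycle."""
    return n_synapses * energy_per_op_j * rate_hz


N_SYNAPSES = 256e6   # same value as TrueNorth, per footnote (*2)
RATE_HZ = 1e6        # assumption: upper end of the "<~MHz" range

voltage_domain = weighted_sum_power(N_SYNAPSES, 1e-15, RATE_HZ)  # ~1 fJ/op
time_domain = weighted_sum_power(N_SYNAPSES, 1e-17, RATE_HZ)     # ~10 aJ/op

print(f"voltage/current-domain: {voltage_domain * 1e3:.1f} mW")
print(f"time-domain: {time_domain * 1e3:.2f} mW")
```

Under these assumptions the products come to about 256 mW and 2.56 mW, in line with the ~300 mW and ~3 mW entries in the table.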


# **Original and chaotic Boltzmann machines**

#### Boltzmann machines (BMs)<sup>[1]</sup>

- Stochastic operation of binary neurons
- Symmetrically connected networks
- Solving optimization problems using energy minimization
- Success of deep learning by restricted BMs
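The stochastic operation of the original BM can be sketched as follows. This is the textbook formulation (energy function with symmetric weights and a sigmoid firing probability), written as a minimal illustration rather than the implementation used in the talk; the names `energy`, `gibbs_step`, and the example weights are hypothetical.

```python
import math
import random


def energy(weights, biases, state):
    """E(s) = -1/2 * sum_{i != j} w_ij s_i s_j - sum_i b_i s_i, with s_i in {0, 1}."""
    n = len(state)
    pair = sum(weights[i][j] * state[i] * state[j]
               for i in range(n) for j in range(n) if i != j)
    return -0.5 * pair - sum(b * s for b, s in zip(biases, state))


def gibbs_step(weights, biases, state, temperature, rng):
    """Stochastically update one randomly chosen binary neuron in place."""
    n = len(state)
    i = rng.randrange(n)
    # Net input to neuron i from all other neurons (symmetric weights assumed).
    z = biases[i] + sum(weights[i][j] * state[j] for j in range(n) if j != i)
    p_on = 1.0 / (1.0 + math.exp(-z / temperature))
    state[i] = 1 if rng.random() < p_on else 0
    return state


# Hypothetical 3-neuron network with symmetric weights and zero diagonal.
W = [[0.0, 1.0, -0.5],
     [1.0, 0.0, 0.8],
     [-0.5, 0.8, 0.0]]
B = [0.1, -0.2, 0.0]

rng = random.Random(0)
s = [0, 1, 0]
for _ in range(100):
    gibbs_step(W, B, s, temperature=1.0, rng=rng)
print(s, energy(W, B, s))
```

Lowering `temperature` makes the updates more nearly deterministic, so the network settles toward low-energy states, which is how BMs are used for optimization.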

#### Chaotic Boltzmann machines (CBMs)<sup>[2]</sup>

- Deterministic operation
- Using chaotic dynamics instead of stochastic operation
- <u>Computing ability comparable to BMs</u>

[1] D. H. Ackley et al., Cog. Sci. 9, 1985.

[2] H. Suzuki et al., Sci. Rep., 2013.



## **Chaotic Boltzmann machines (CBMs)**



#### **A CMOS unit circuit for CBMs**



#### **Measurement results of CBM VLSI chip**



## Conclusions

- Time-domain analog VLSI implementation can achieve extremely energy-efficient operation, including nonlinear transforms, which is difficult for digital VLSI implementation.
- Weighted-sum calculation, a simple synaptic function, can be achieved with extremely low energy consumption using time-domain operation of a simple spiking neuron model, but high-resistance elements are required. Reducing parasitic interconnect capacitance is another challenge for energy-efficient operation.
- Nonlinear transforms performed while charging a capacitor were successfully applied to implementing nonlinear dynamical models such as chaotic Boltzmann machines.