Hyperdimensional Computing & Applications
Prof. Tajana S. Rosing, System Energy Efficiency (SEE) Lab, University of California San Diego
Computing Challenges
• Today’s computing platforms do not scale well
– Big data -> memory wall, limited interconnect BW, demand for real-time response
– High power density -> dark silicon, cooling
• We have new opportunities with new technology
– 3D stacking, nanoscale devices, emerging non-volatile memories
– But new technology has challenges: endurance, variability, yield, SNR
Henkel, Jörg, et al. "New trends in dark silicon." 2015 52nd ACM/EDAC/IEEE Design Automation Conference (DAC). IEEE, 2015.
Source: Intel Newsroom
Human brain does it way better!
• Learning ability: supervised & unsupervised
• Highly parallel, extremely compact
• Fault tolerance: noisy input, neurons may die
• Low power: ~20 W power consumption
State‐of‐the‐Art: Deep Neural Networks
Sources: Zeiler, Matthew D., and Rob Fergus. "Visualizing and understanding convolutional networks." European conference on computer vision. Springer, Cham, 2014.
• DNNs have been behind impressive results in many fields:
– Image processing, natural language processing, etc.
• Learn a hierarchy of representations
• Performance is heavily reliant on hyperparameter tuning:
– Learning rate, regularization, dropout, batch size
– Multi-epoch training for every hyperparameter combination
Sparse Hierarchical Processing in the Brain
“Sparse and expansive transformations entail a fundamental computational advantage for sensory processing” – [Babadi and Sompolinsky 2014]
[Figure: dense input signal (~1M) mapped to a high-dimensional sparse representation (~190M), which feeds the information processing hierarchy (~10M)]
Source: DiCarlo, James J., Davide Zoccolan, and Nicole C. Rust. "How does the brain solve visual object recognition?" Neuron 73.3 (2012): 415-434.
● Dense sensory input is hierarchically mapped to a high-dimensional sparse representation on which the brain operates
● How can we leverage the sparsity in the computation to address the challenges of big data and emerging technology?
Hyperdimensional Computing: Encoding, Training & Inference
[Figure: during training, each labeled input (e.g., dog, cat) is encoded into a ±1 hypervector (-1 +1 -1 -1 +1 … +1) and bundled into a per-class hypervector; during testing, the input is encoded into a query hypervector and classified via a similarity check against the class hypervectors]
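A minimal sketch of this encode-train-query pipeline in Python/NumPy. The random-projection encoder, the dimensionality, and the toy data below are illustrative assumptions, not the specific encoder used in these slides:

```python
import numpy as np

D = 10_000  # hypervector dimensionality (slides suggest >10,000)
rng = np.random.default_rng(0)

def encode(x, proj):
    """Map a feature vector to a bipolar {-1,+1} hypervector via a
    random projection (one of several possible HD encoders)."""
    return np.where(proj @ x >= 0, 1, -1).astype(np.int8)

def train(X, y, proj, n_classes):
    """Single-pass training: bundle (elementwise-add) the encoded
    hypervectors of each class into one class hypervector."""
    model = np.zeros((n_classes, D), dtype=np.int32)
    for x, label in zip(X, y):
        model[label] += encode(x, proj)
    return model

def classify(x, model, proj):
    """Inference: encode the query, then similarity-check it against
    every class hypervector and return the best match."""
    q = encode(x, proj)
    return int(np.argmax(model @ q))

# Toy usage with made-up two-class data
n_features = 64
proj = rng.standard_normal((D, n_features))
X = rng.standard_normal((200, n_features))
y = (X[:, 0] > 0).astype(int)
model = train(X, y, proj, n_classes=2)
print(classify(X[0], model, proj), y[0])
```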
HD Computing vs. DNNs
• Training HD models is much faster than training DNNs -> perfect for adaptive online learning
– E.g., HD is 100x faster than DNNs when training human activity recognition models
• HD computing is robust: accuracy is unaffected with up to 50% of bits corrupted!
– Example: Gaussian noise introduced, corrupting up to 50% of the bits in the hypervectors (see the toy check below)
[Figure: accuracy vs. % of bits corrupted, from 0% to 50%]
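The robustness claim is easy to probe with a toy experiment (an illustrative check, not the experiment behind the slide's plot): flip a growing fraction of bits in a query and see whether the true class hypervector still wins.

```python
import numpy as np

D = 10_000
rng = np.random.default_rng(1)
classes = rng.choice([-1, 1], size=(4, D))  # four random class hypervectors
query = classes[2].copy()                   # ground truth: class 2

for frac in (0.0, 0.1, 0.3, 0.45):
    noisy = query.copy()
    idx = rng.choice(D, size=int(frac * D), replace=False)
    noisy[idx] *= -1                        # corrupt a fraction of the bits
    print(f"{frac:.0%} corrupted -> class {np.argmax(classes @ noisy)}")
```

Because the expected similarity to the true class stays near D·(1 − 2·frac) while similarity to random classes hovers near zero, the prediction survives until corruption approaches 50%.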
Online Retraining/Model Adjustment
[Figure: a training example is encoded into hypervector H and compared against the binary model (Classes 1–4) via Hamming distance; on an incorrect match, H is subtracted from the mispredicted class (C2 = C2 [-] H) and accumulated into the correct class (C4 = C4 [+] H)]
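The update in the figure amounts to a perceptron-style adjustment. A hedged sketch, reusing `encode`, `np`, and the integer `model` from the earlier snippet:

```python
def retrain_step(model, x, label, proj):
    """One online adjustment: if the model mispredicts, subtract the
    encoded hypervector H from the wrong class (C_wrong = C_wrong [-] H)
    and accumulate it into the correct one (C_true = C_true [+] H)."""
    h = encode(x, proj)
    pred = int(np.argmax(model @ h))
    if pred != label:
        model[pred] -= h
        model[label] += h
    return model
```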
Many Applications Benefit from HD Computing
• Image classification
• Object recognition
• DNA sequencing
• Activity recognition
• Speech recognition
• Clustering
• Recommendation systems
• Regression
[DAC’19] M. Imani, J. Morris, J. Messerly, H. Shu, Y. Deng, T. Rosing, “BRIC: Locality-based Encoding for Energy-Efficient Brain-Inspired Hyperdimensional Computing”, IEEE/ACM Design Automation Conference (DAC), 2019.
[DAC’18] M. Imani, C. Huang, D. Kong, T. Rosing, “Hierarchical Hyperdimensional Computing for Energy Efficient Classification”, IEEE/ACM Design Automation Conference (DAC), 2018.
[DATE’19] M. Imani, J. Messerly, F. Wu, W. Pi, T. Rosing, “A Binary Learning Framework for Hyperdimensional Computing”, IEEE/ACM Design Automation and Test in Europe Conference (DATE), 2019.
HD computing benefits:
• Energy-efficient computation
• Data represented with bits encoded at large dimensionality (>10,000)
• Purely statistical -> robust
• Ultra-fast single-pass training
• Leverages a full algebra and works on a well-defined set of operations
• Supports real-time learning and reasoning
HD Apps: Classification
• Similarity check:
– Binarized model: Hamming distance similarity
– Non-binarized model: cosine similarity
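In code, the two similarity checks might look like this (a sketch assuming bipolar {-1,+1} entries for the binarized model):

```python
import numpy as np

def hamming_distance(a, b):
    """Binarized model: count mismatching positions (lower = more similar)."""
    return int(np.sum(a != b))

def cosine_similarity(a, b):
    """Non-binarized (integer) model: angle-based similarity (higher = more similar)."""
    return float(a @ b) / (np.linalg.norm(a) * np.linalg.norm(b))
```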
Encoding + Training on PIM
[Figure: processing-in-memory array with row drivers. Encoded data is (1) read, (2) combined by addition, and (3) written back, with full arithmetic support for training/learning. Example encoding: Lw + ρ(Lb), where ρ is a permutation]
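The example encoding Lw + ρ(Lb) can be sketched in NumPy, taking a cyclic shift for the permutation ρ, a common software choice even though the PIM hardware may implement ρ differently:

```python
import numpy as np

D = 10_000
rng = np.random.default_rng(2)

Lb = rng.choice([-1, 1], size=D)  # randomly generated item hypervectors
Lw = rng.choice([-1, 1], size=D)  # (names taken from the slide)

def rho(v, shift=1):
    """Permutation rho implemented as a cyclic shift."""
    return np.roll(v, shift)

encoded = Lw + rho(Lb)  # the slide's example encoding: Lw + rho(Lb)
```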
Digital‐based HD Acceleration
Digital hyperdimensional associative memory (D-HAM):
• XOR array: compares the input vector with all stored vectors
• Counter: counts the number of mismatches in each row
• Comparator: finds the row with minimum Hamming distance
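A software analogue of that three-stage pipeline, sketched with NumPy (the real D-HAM operates on hardware bit lines, not arrays):

```python
import numpy as np

def dham_search(stored, query):
    """stored: (n_classes, D) 0/1 bits; query: (D,) 0/1 bits.
    Emulates the XOR array, per-row mismatch counters, and the
    minimum-distance comparator of D-HAM."""
    mismatches = np.bitwise_xor(stored, query)  # XOR array
    distances = mismatches.sum(axis=1)          # counters: Hamming distance per row
    return int(np.argmin(distances))            # comparator: closest row
```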
The First HD Chip: Analog Associative Search
[Figure: analog associative search. Class hypervectors (Class 1–4, e.g., 0 1 0 0, 1 0 1 1, 1 0 0 1, 1 1 0 0) are stored as rows and compared against the query hypervector (1 0 1 0) for Hamming distance. Mismatched cells discharge the match line; matched cells do not. Current mirrors feed a current comparator built from Loser-Take-All (LTA) cells, which finds the row with minimum current, i.e., the nearest class]
PIM with analog HD associative memory has ~10^6x better energy-delay product vs. GPU
UCB & Stanford Collaboration [ISSCC’18]
M. Imani, A. Rahimi, D. Kong, T. Rosing, J. M. Rabaey, “Exploring Hyperdimensional Associative Memory”, IEEE International Symposium on High-Performance Computer Architecture (HPCA), 2017.
T. Wu, et al., “Brain-Inspired Computing Exploiting Carbon Nanotube FETs and Resistive RAM: Hyperdimensional Computing Case Study”, IEEE Intl. Solid-State Circuits Conference (ISSCC), 2018.
HD Apps: Clustering
• Encode all input data into the high-dimensional space
• Cluster using simple, memory-centric Hamming distance similarity
[DATE’19] M. Imani, Y. Kim, T. Worley, S. Gupta, T. Rosing, “HDCluster: An Accurate Clustering Using Brain-Inspired High-Dimensional Computing”, IEEE/ACM Design Automation and Test in Europe Conference (DATE), 2019.
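A hedged sketch of what Hamming-distance clustering over encoded hypervectors can look like, in the style of k-means (the actual HDCluster algorithm may differ in its initialization and update rules):

```python
import numpy as np

def hd_cluster(H, k, iters=20, seed=0):
    """H: (n, D) binary 0/1 encoded hypervectors. Returns cluster labels."""
    rng = np.random.default_rng(seed)
    centers = H[rng.choice(len(H), size=k, replace=False)].copy()
    for _ in range(iters):
        # Assignment: nearest center by Hamming distance
        dists = (H[:, None, :] != centers[None, :, :]).sum(axis=2)
        labels = dists.argmin(axis=1)
        # Update: per-bit majority vote within each cluster
        for c in range(k):
            members = H[labels == c]
            if len(members):
                centers[c] = (members.mean(axis=0) >= 0.5).astype(H.dtype)
    return labels
```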
Secure Distributed HD Learning
Centralized Learning (Cloud Computing)
✔ Send hypervectors during training
✔ High-speed learning with large data

Federated Learning (Edge Computing)
✔ Send pretrained hypervectors for a single data type
✔ Drastically reduced bandwidth

Hierarchical Learning (Hybrid Computing)
✔ Distributed learning using different data types
✔ Combine hypervectors to aggregate information

• Learning happens in the HD space by mapping data using randomly generated hypervectors -> more secure
[CLOUD’19] M. Imani, Y. Kim, S. Riazi, J. Messerly, P. Liu, F. Koushanfar, T. Rosing, “A Framework for Collaborative Learning in Secure High-Dimensional Space”, IEEE Cloud Computing (CLOUD), 2019.
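One plausible shape for the hypervector exchange, sketched in NumPy as an assumption-level illustration (the paper's actual aggregation protocol may differ): each device trains class hypervectors locally, and the aggregator bundles them by elementwise addition, so raw data never leaves the device.

```python
import numpy as np

def aggregate(local_models):
    """Combine per-device (n_classes, D) class-hypervector models by
    elementwise addition (bundling); only hypervectors are exchanged."""
    return np.sum(np.stack(local_models), axis=0)
```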
● HD computing is an attractive, promising solution for future computing systems
○ Inherently robust against noise and failures
○ Addresses the production challenges of advanced technology
○ Elegantly handles big data
● Next steps
○ Exploit emerging HW technology & computing paradigms: novel NVMs, stochastic computing with HD, etc.
○ Adaptive & scalable encoding for diverse data
○ Automated code mapping to architecture
○ Distributed and secure computing at scale
The future for HD is bright!