VAD, KWS, and ASR

Voice activity detection (VAD) separates the speech portions of a signal from the non-speech portions.

 

Keyword spotting (KWS) separates the keyword portions of a speech signal from the non-keyword portions.

 

Automatic speech recognition (ASR) enables a machine to recognize human speech.
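To make the VAD definition concrete, here is a minimal software sketch of a frame-energy detector. It is not the method used by any of the chips cited in this post (those rely on techniques such as analog feature extraction and neural-network classifiers); it only illustrates the speech/non-speech decision. The function names, the 16 kHz sample rate, the 10 ms frame length, and the threshold are all illustrative assumptions.

```python
import math


def frame_energies(samples, frame_len):
    """Split samples into non-overlapping frames; return the mean energy of each."""
    energies = []
    for start in range(0, len(samples) - frame_len + 1, frame_len):
        frame = samples[start:start + frame_len]
        energies.append(sum(x * x for x in frame) / frame_len)
    return energies


def energy_vad(samples, frame_len=160, threshold=0.01):
    """Return one boolean per frame: True if mean energy exceeds the threshold."""
    return [e > threshold for e in frame_energies(samples, frame_len)]


# Synthetic test signal (assumed values): silence, a 440 Hz tone, silence.
sample_rate = 16000  # Hz, assumption
frame_len = 160      # 10 ms at 16 kHz, assumption
silence = [0.0] * (3 * frame_len)
tone = [0.5 * math.sin(2 * math.pi * 440 * n / sample_rate)
        for n in range(3 * frame_len)]
signal = silence + tone + silence

decisions = energy_vad(signal, frame_len=frame_len, threshold=0.01)
print(decisions)
```

A fixed threshold like this breaks down in noisy environments, which is one reason the papers below move to learned classifiers while keeping the power budget in the microwatt range.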

 

Implementing VAD, KWS, and ASR directly as ultra-low-power ASIC circuits has become the mainstream trend. This post collects recent papers on these topics.

 

KWS

1. Conference: ISSCC

Shan, Weiwei, et al. "14.1 A 510nW 0.41 V Low-Memory Low-Computation Keyword-Spotting Chip Using Serial FFT-Based MFCC and Binarized Depthwise Separable Convolutional Neural Network in 28nm CMOS." 2020 IEEE International Solid-State Circuits Conference-(ISSCC). IEEE, 2020.

 

2. Journal: JSSC

Giraldo, Juan Sebastian P., et al. "Vocell: A 65-nm Speech-Triggered Wake-Up SoC for 10-μW Keyword Spotting and Speaker Verification." IEEE Journal of Solid-State Circuits 55.4 (2020): 868-878.

 

3. Conference: VLSI

Giraldo, JS P., et al. "18μW SoC for near-microphone Keyword Spotting and Speaker Verification." 2019 Symposium on VLSI Circuits. IEEE, 2019.

 

4. Conference: ESSCIRC

Giraldo, Juan SP, and Marian Verhelst. "Laika: A 5uW programmable LSTM accelerator for always-on keyword spotting in 65nm CMOS." ESSCIRC 2018-IEEE 44th European Solid State Circuits Conference (ESSCIRC). IEEE, 2018.

 

5. Conference: VLSI

Yin, Shouyi, et al. "A 141 μW, 2.46 pJ/Neuron Binarized Convolutional Neural Network Based Self-Learning Speech Recognition Processor in 28nm CMOS." 2018 IEEE Symposium on VLSI Circuits. IEEE, 2018.

 

6. Journal: JSPS

Shah, Mohit, et al. "A fixed-point neural network architecture for speech applications on resource constrained hardware." Journal of Signal Processing Systems 90.5 (2018): 727-741.

 

7. Journal: JSSC

Price, Michael, James Glass, and Anantha P. Chandrakasan. "A low-power speech recognizer and voice activity detector using deep neural networks." IEEE Journal of Solid-State Circuits 53.1 (2017): 66-75.

 

8. Conference: ISSCC

Price, Michael, James Glass, and Anantha P. Chandrakasan. "14.4 A scalable speech recognizer with deep-neural-network acoustic models and voice-activated power gating." 2017 IEEE International Solid-State Circuits Conference (ISSCC). IEEE, 2017.

 

9. Preprint: arXiv

Zhang, Yundong, et al. "Hello edge: Keyword spotting on microcontrollers." arXiv preprint arXiv:1711.07128 (2017).

 

VAD

1. Journal: JSSC

Yang, Minhao, et al. "Design of an Always-On Deep Neural Network-Based 1-μW Voice Activity Detector Aided With a Customized Software Model for Analog Feature Extraction." IEEE Journal of Solid-State Circuits 54.6 (2019): 1764-1777.

 

2. Conference: ISSCC

Yang, Minhao, et al. "A 1μW voice activity detector using analog feature extraction and digital deep neural network." 2018 IEEE International Solid-State Circuits Conference-(ISSCC). IEEE, 2018.

 

3. Journal: JSSC

Yang, Minhao, et al. "A 0.5 V 55 μW 64×2 Channel Binaural Silicon Cochlea for Event-Driven Stereo-Audio Sensing." IEEE Journal of Solid-State Circuits 51.11 (2016): 2554-2569.

 

4. Conference: ISSCC

Badami, Komail, et al. "24.2 Context-aware hierarchical information-sensing in a 6μW 90nm CMOS voice activity detector." 2015 IEEE International Solid-State Circuits Conference-(ISSCC) Digest of Technical Papers. IEEE, 2015.

 

ASR

1. Journal: IEEE Transactions on Computers

Tsai, Wei-Yu, et al. "Always-on speech recognition using TrueNorth, a reconfigurable, neurosynaptic processor." IEEE Transactions on Computers 66.6 (2016): 996-1007.

 

2. Journal: JSSC

Price, Michael, James Glass, and Anantha P. Chandrakasan. "A 6 mW, 5,000-word real-time speech recognizer using WFST models." IEEE Journal of Solid-State Circuits 50.1 (2014): 102-112.

 

 

Side note:

The two top conferences and the one top journal in the integrated-circuit field:

Conferences: IEEE International Solid-State Circuits Conference (ISSCC)

Symposia on VLSI Technology and Circuits (VLSI)

Journal: IEEE Journal of Solid-State Circuits (JSSC)

 

