## Cochlea, Mel-Scale, Filterbanks

From human speech perception to considerations for features for automatic speech recognition

Cochlea
Different places along the cochlea respond to different frequencies in the incoming sound.

Mel scale
The cochlea's frequency response is nonlinear on the hertz scale but approximately linear on the mel scale.
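To make the nonlinearity concrete, here is a minimal sketch of a hertz-to-mel conversion. The notes do not give a formula, so the constants below (2595 and 700) are an assumption: they come from one commonly used variant of the mel formula, and the function names are hypothetical.

```python
import math

def hz_to_mel(f_hz: float) -> float:
    """Convert hertz to mels using one common formula (an assumption, not from the notes)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

# The same 500 Hz step covers fewer mels at high frequency than at low frequency.
low_step = hz_to_mel(1000.0) - hz_to_mel(500.0)
high_step = hz_to_mel(4000.0) - hz_to_mel(3500.0)
print(f"500-1000 Hz spans {low_step:.1f} mel; 3500-4000 Hz spans {high_step:.1f} mel")
```

Equal steps in hertz shrink on the mel scale as frequency rises, which is the nonlinearity described above.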

Filter banks
A simplified model of the cochlea is a bank of bandpass filters, each with a lower and a higher frequency limit.

The filters get wider and wider at higher frequencies. Triangular filters are more appropriate than the rectangular ones shown in the figure above.
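A bank of triangular filters that widen with frequency can be sketched as follows. This is only an illustration under stated assumptions: the filter centres are spaced equally on the mel scale, the mel formula is the common 2595/700 variant, and `triangular_filterbank` with all its parameters is a hypothetical helper, not something defined in the course material.

```python
import math

def hz_to_mel(f_hz):
    # Common mel formula (assumed; the notes do not specify one).
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def triangular_filterbank(num_filters, num_bins, sample_rate, f_low, f_high):
    """Return (bank, edges_hz).

    bank[j][k] is the weight of filter j at DFT bin k, where bin k sits at
    k * (sample_rate / 2) / (num_bins - 1) Hz.
    edges_hz holds num_filters + 2 points equally spaced in mel: each filter
    rises from edges_hz[j-1], peaks at edges_hz[j], and falls to edges_hz[j+1].
    """
    m_low, m_high = hz_to_mel(f_low), hz_to_mel(f_high)
    step = (m_high - m_low) / (num_filters + 1)
    edges_hz = [mel_to_hz(m_low + i * step) for i in range(num_filters + 2)]
    bin_hz = [k * (sample_rate / 2.0) / (num_bins - 1) for k in range(num_bins)]

    bank = []
    for j in range(1, num_filters + 1):
        left, centre, right = edges_hz[j - 1], edges_hz[j], edges_hz[j + 1]
        weights = []
        for f in bin_hz:
            if left < f <= centre:
                weights.append((f - left) / (centre - left))    # rising slope
            elif centre < f < right:
                weights.append((right - f) / (right - centre))  # falling slope
            else:
                weights.append(0.0)
        bank.append(weights)
    return bank, edges_hz

bank, edges_hz = triangular_filterbank(
    num_filters=10, num_bins=129, sample_rate=16000, f_low=100.0, f_high=7000.0)
widths_hz = [edges_hz[j + 1] - edges_hz[j - 1] for j in range(1, 11)]
print([round(w) for w in widths_hz])  # bandwidths grow with filter index
```

Because the edges are equally spaced in mel, the bandwidth in hertz grows from one filter to the next, matching the "wider and wider" behaviour noted above.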

## Feature vectors, sequences, and sequences of feature vectors

Representing speech as a sequence of feature vectors

Features
Raw waveform samples are not useful as features. The magnitude spectrum (DFT) is better, and the spectral envelope is better still. To capture the spectral envelope, we use a filter bank. The feature vector stores the filter bank outputs, which encode the spectral envelope.
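The chain from one frame of waveform to one feature vector might look like the sketch below. It is deliberately simplified: the DFT is a naive O(N²) loop, the two-filter bank is a toy stand-in for a real filter bank, and taking the log of each filter output is a common choice that is assumed here rather than stated in the notes. All function names are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitudes for bins 0..N/2 (slow, but fine for a sketch)."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        total = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                    for t, x in enumerate(frame))
        mags.append(abs(total))
    return mags

def filterbank_features(mags, bank, floor=1e-10):
    """One value per filter: weighted sum of magnitudes, then log (floored to avoid log(0))."""
    return [math.log(max(sum(w * m for w, m in zip(weights, mags)), floor))
            for weights in bank]

# Toy frame: 64 samples of a sinusoid that falls exactly on DFT bin 8.
frame = [math.sin(2 * math.pi * 8 * t / 64) for t in range(64)]
mags = magnitude_spectrum(frame)

# Hypothetical two-filter bank over the 33 bins: low half and high half.
bank = [[1.0 if k < 16 else 0.0 for k in range(33)],
        [1.0 if k >= 16 else 0.0 for k in range(33)]]
feature_vector = filterbank_features(mags, bank)
print(feature_vector)
```

The feature vector has one entry per filter, so it is much shorter than the magnitude spectrum while still describing where the energy lies.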

Sequences are everywhere in language

Sequence of feature vectors
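One way to picture a sequence of feature vectors is to cut the waveform into overlapping frames and compute one vector per frame. The sketch below assumes a fixed frame length and hop, and uses mean squared amplitude as a placeholder one-dimensional feature; the helper names and the toy waveform are hypothetical.

```python
def split_into_frames(samples, frame_length, hop):
    """Return overlapping frames; trailing samples that cannot fill a frame are dropped."""
    return [samples[start:start + frame_length]
            for start in range(0, len(samples) - frame_length + 1, hop)]

def frame_energy_feature(frame):
    """Placeholder 1-dimensional feature vector: mean squared amplitude."""
    return [sum(x * x for x in frame) / len(frame)]

samples = [float(i % 10) for i in range(100)]  # toy "waveform"
frames = split_into_frames(samples, frame_length=20, hop=10)
feature_sequence = [frame_energy_feature(f) for f in frames]
print(len(frames), "frames ->", len(feature_sequence), "feature vectors")
```

In practice the placeholder feature would be replaced by filter bank outputs, but the shape of the result is the same: an ordered list with one feature vector per frame.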

## Exemplars and Distances

We start to look at the concepts of distance and alignment between sequences of speech data

Exemplar: a stored sequence of feature vectors for a word.
Distance (dissimilarity): measured between two sequences of feature vectors.
Creating an alignment between an exemplar and the unknown is the first step in calculating the global distance.
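The relationship between alignment and global distance can be sketched as follows. The notes do not name a local distance measure, so Euclidean distance is an assumption, and the simple linear (proportional) alignment is only a naive baseline for illustration. All names are hypothetical.

```python
import math

def local_distance(u, v):
    """Euclidean distance between two feature vectors of equal dimension (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def linear_alignment(len_a, len_b):
    """Pair each frame of A with a proportionally placed frame of B."""
    if len_a == 1:
        return [(0, 0)]
    return [(i, round(i * (len_b - 1) / (len_a - 1))) for i in range(len_a)]

def global_distance(seq_a, seq_b, alignment):
    """Sum of local distances over all aligned frame pairs."""
    return sum(local_distance(seq_a[i], seq_b[j]) for i, j in alignment)

exemplar = [[0.0], [1.0], [2.0]]
unknown = [[0.0], [0.0], [1.0], [1.0], [2.0]]  # same pattern, spoken more slowly
alignment = linear_alignment(len(exemplar), len(unknown))
print(alignment, global_distance(exemplar, unknown, alignment))
```

The global distance is only meaningful once the alignment says which frame of the exemplar is compared with which frame of the unknown, which is why the alignment comes first.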

## Pattern Matching, Alignment, Dynamic Time Warping

Search a grid with Dynamic Time Warping

Dynamic programming:

Dynamic time warping, pattern matching, aligning frames
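A minimal dynamic programming sketch of Dynamic Time Warping over the grid of frame pairs is given below. It assumes Euclidean local distance and three allowed moves (diagonal, vertical, horizontal) with no extra penalties; the course material may use different constraints, and the function names are hypothetical.

```python
import math

def local_distance(u, v):
    """Euclidean distance between two feature vectors (assumed measure)."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def dtw(seq_a, seq_b):
    """Return (global_distance, path), where path is a list of aligned (i, j) frame pairs."""
    n, m = len(seq_a), len(seq_b)
    cost = [[float("inf")] * m for _ in range(n)]  # best cumulative cost to reach (i, j)
    back = [[None] * m for _ in range(n)]          # predecessor cell on the best path
    for i in range(n):
        for j in range(m):
            d = local_distance(seq_a[i], seq_b[j])
            if i == 0 and j == 0:
                cost[i][j] = d
                continue
            candidates = []
            if i > 0 and j > 0:
                candidates.append((cost[i - 1][j - 1], (i - 1, j - 1)))  # diagonal
            if i > 0:
                candidates.append((cost[i - 1][j], (i - 1, j)))          # vertical
            if j > 0:
                candidates.append((cost[i][j - 1], (i, j - 1)))          # horizontal
            best_cost, best_prev = min(candidates)
            cost[i][j] = best_cost + d
            back[i][j] = best_prev
    # Trace back from the final cell to recover the alignment.
    path = [(n - 1, m - 1)]
    while back[path[-1][0]][path[-1][1]] is not None:
        path.append(back[path[-1][0]][path[-1][1]])
    path.reverse()
    return cost[n - 1][m - 1], path

exemplar = [[0.0], [1.0], [2.0]]
unknown = [[0.0], [0.0], [1.0], [1.0], [2.0]]
distance, path = dtw(exemplar, unknown)
print(distance, path)
```

Each grid cell is filled once from its already-computed neighbours, so the best alignment is found without enumerating every possible path. To recognise a word, the unknown would be compared against each stored exemplar and the one with the lowest global distance chosen.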

## More Dynamic Time Warping

Origin: Module 7 – Speech Recognition – Pattern matching
Translate + Edit: YangSier (Homepage)
