Krishnakant Saboo

Formant Tracking using Extended Kalman Filter


Guide: Prof. Navin Khaneja, IIT Bombay (visiting faculty)

Background

Formants are useful for phoneme detection, which is central to speech recognition. Although it is relatively easy to find formants for a single utterance, tracking them over time is more challenging. Because speech is correlated across time, Linear Predictive Codes (LPC) have been used for formant tracking. Here, we study the application of the Extended Kalman Filter (EKF) to formant tracking in comparison with other methods.

Approach

We studied methods based on Linear Predictive Codes (LPC) for formant tracking in speech signals. In these methods, the vocal tract filter is assumed to be an all-pole filter, and its coefficients are estimated from the signal samples. The performance of this approach degrades if the number of poles is not chosen properly. The Extended Kalman Filter (EKF) performs better than LPC if the matrices that capture the model parameters are estimated correctly. The performance of both methods was compared with that of the formant tracker in PRAAT, and the advantages and pitfalls of each method were highlighted. In particular, the EKF method tracks vowels, plosives and fricatives better than LPC. The EKF method also resumes tracking after silence much better than LPC.
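The LPC step described above can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes the autocorrelation method for estimating the all-pole coefficients, and the function names (`lpc_coefficients`, `estimate_formants`), the Hamming window, and the 90 Hz and 400 Hz thresholds are hypothetical choices made for illustration only.

```python
import numpy as np


def lpc_coefficients(frame, order):
    """Estimate all-pole filter coefficients [1, a1, ..., ap] for one frame.

    Uses the autocorrelation method (an assumption of this sketch).
    """
    windowed = frame * np.hamming(len(frame))
    # Autocorrelation of the windowed frame at lags 0..order
    full = np.correlate(windowed, windowed, mode="full")
    r = full[len(windowed) - 1 : len(windowed) + order]
    # Toeplitz normal equations: R a = -r[1:]
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1 : order + 1])
    return np.concatenate(([1.0], a))


def estimate_formants(frame, sample_rate, order, num_formants=3):
    """Return up to num_formants formant frequencies (Hz) for one frame."""
    a = lpc_coefficients(frame, order)
    poles = np.roots(a)
    # Keep one pole from each complex-conjugate pair
    poles = poles[np.imag(poles) > 0]
    freqs = np.angle(poles) * sample_rate / (2 * np.pi)
    bandwidths = -np.log(np.abs(poles)) * sample_rate / np.pi
    # Illustrative thresholds: discard very low or very broad resonances
    keep = (freqs > 90.0) & (bandwidths < 400.0)
    return np.sort(freqs[keep])[:num_formants]
```

Calling `estimate_formants` on successive frames yields an independent estimate per frame, with nothing linking one frame to the next. The `order` argument is the number of poles, so the sensitivity to that choice noted above enters through this single parameter.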

Fig1: Tracking of 3 formants by the Extended Kalman Filter for the sentence "Machali jal ki hi rani". The tracking is smoother than with LPC, and even low frequencies are captured effectively.

Fig2: Tracking of 3 formants for the same sentence using Linear Predictive Codes. The method is unable to track the first formant after 700 ms. The estimated values also change abruptly during silence.