23 march speech in urdu
6/8/2023

Besides sophisticated language resources for these languages, one of the optimization tasks in building a more robust automatic speech recognition (ASR) system has been the extraction of features that are robust against noise. Although Mel Frequency Cepstral Coefficients (MFCC) and Linear Predictive Coding (LPC) based features (Hachkar et al. 2006) have been widely used in speech recognition applications, the extraction of these features has always been based on the Short Time Fourier Transform (STFT). Feature extraction based on the STFT inherits the assumption that the audio signal remains stationary throughout the period of analysis. In practice, this assumption does not hold. Furthermore, to keep the signal approximately stationary within each window, a short window duration may be used, which gives high time resolution but poor frequency resolution. Similarly, increasing the window duration may improve the frequency resolution but will degrade the time resolution of the representation. Because the window size is fixed, the time-frequency resolution of the STFT is fixed as well. Thus, research has been directed towards the use of Wavelet Transforms for feature extraction (Tan et al. 1998). This has been a source of inspiration to develop a speech recognition framework for Urdu, based on new Discrete Wavelet Transform based features.
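The window trade-off described above can be made concrete with a small sketch. It is an illustration only, not the framework's actual feature extractor: the 16 kHz sample rate, the 440 Hz test tone, the Haar wavelet, and the helper names `stft_resolution` and `haar_dwt_levels` are all assumptions chosen for this example. The first function shows that a fixed STFT window ties time resolution to frequency resolution. The second shows how a discrete wavelet decomposition splits a frame into bands whose coefficients each cover a different time span.

```python
import math


def stft_resolution(window_samples, sample_rate):
    """Return (time resolution in seconds, frequency bin spacing in Hz)
    for an STFT with a fixed window of `window_samples` samples."""
    return window_samples / sample_rate, sample_rate / window_samples


def haar_dwt_levels(signal, levels):
    """Multi-level Haar DWT (hypothetical helper for illustration).
    Returns [detail_1, ..., detail_levels, approximation]."""
    approx = list(signal)
    bands = []
    for _ in range(levels):
        pairs = list(zip(approx[0::2], approx[1::2]))
        # Detail = scaled difference of neighbours (high-frequency part).
        detail = [(a - b) / math.sqrt(2) for a, b in pairs]
        # Approximation = scaled sum of neighbours (low-frequency part).
        approx = [(a + b) / math.sqrt(2) for a, b in pairs]
        bands.append(detail)
    bands.append(approx)
    return bands


sample_rate = 16000  # assumed sample rate, not taken from the source

# Shorter windows: finer time resolution, coarser frequency resolution.
for n in (128, 512, 2048):
    t_res, f_res = stft_resolution(n, sample_rate)
    print(f"window={n:5d}  time={t_res * 1000:7.2f} ms  freq={f_res:7.2f} Hz")

# A synthetic 440 Hz tone stands in for one speech frame.
frame = [math.sin(2 * math.pi * 440 * i / sample_rate) for i in range(512)]
bands = haar_dwt_levels(frame, 3)
band_lengths = [len(b) for b in bands]
band_energies = [sum(c * c for c in b) for b in bands]
print(band_lengths)
print(band_energies)
```

In the STFT case, the product of the two resolutions is the same for every window length, so improving one always costs the other. In the wavelet case, each deeper level halves the number of coefficients, so coarse bands describe long time spans while fine bands describe short ones. Per-band energies like `band_energies` are one simple way such coefficients could be turned into features, though the source does not say which features the framework uses.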