Thursday, April 1, 2010

Using neural networks for speech recognition


Our 'Sinhala Speech Recognition system' now shows elementary behavior, but it needs further improvement before its performance is satisfactory. To get there, we are applying optimization techniques on several fronts: improving and training the language model and the acoustic model, noise filtering, and machine learning approaches.
I’m going to use a neural network based approach to overcome the uncertainty caused by variations between users and by environmental noise. My intention is to use this blog entry to show, step by step, how to use a neural network in speech recognition.

Introduction to neural networks
A neural network is a collection of individual processing elements, called neurons, each connected to other elements of the same kind. The first artificial neural networks were designed in the late 1950s, and they are much simpler than any biological neural network. Biological neural networks are far more complex, and a lot of research is still under way to uncover how they work. Neural networks are useful in applications where regular computer programs run into limitations, such as image recognition, speech recognition and decision making.
A neural network differs from a regular program in that it must be trained before it can perform a task, whereas in regular programming the task itself is programmed explicitly.
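To make the difference between training and programming concrete, here is a minimal sketch in Python. It is not part of our speech recognition system. The logical AND task, the step activation, the perceptron update rule and every name in it (train_perceptron, predict, and_samples) are assumptions chosen only for illustration. The point is that the AND rule is never written into the code; the neuron arrives at it by adjusting its weights from examples.

```python
def step(x):
    """Step activation: fire (1) when the weighted sum is non-negative."""
    return 1 if x >= 0 else 0


def predict(weights, bias, x1, x2):
    """Output of a single two-input neuron."""
    return step(weights[0] * x1 + weights[1] * x2 + bias)


def train_perceptron(samples, epochs=20, learning_rate=1.0):
    """Adjust weights from examples using the classic perceptron rule.

    The epoch count and learning rate are illustrative assumptions.
    """
    weights = [0.0, 0.0]
    bias = 0.0
    for _ in range(epochs):
        for (x1, x2), target in samples:
            error = target - predict(weights, bias, x1, x2)
            # Nudge each weight in the direction that reduces the error.
            weights[0] += learning_rate * error * x1
            weights[1] += learning_rate * error * x2
            bias += learning_rate * error
    return weights, bias


# Training examples for logical AND: ((input1, input2), expected output).
and_samples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(and_samples)
predictions = [predict(weights, bias, x1, x2) for (x1, x2), _ in and_samples]
```

A regular program would simply return x1 and x2. Here the same behavior comes out of the training loop, which is the property we want to exploit when the rule is too complex to write by hand, as with speech.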

Structure of a neural network
Each neuron in a neural network has a set of inputs, a weight for each input (which multiplies that input) and an output. The output is calculated by applying an activation function to the sum of all the inputs, each multiplied by its weight.
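The calculation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sigmoid activation, the function names (sigmoid, neuron_output) and the sample input and weight values are my own choices here, not something fixed by our system.

```python
import math


def sigmoid(x):
    """Logistic activation: squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, activation=sigmoid):
    """Multiply each input by its weight, sum the products, apply the activation."""
    if len(inputs) != len(weights):
        raise ValueError("inputs and weights must have the same length")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)


# Illustrative values only (assumed, not taken from real speech data).
inputs = [1.0, 0.5, -1.0]
weights = [0.5, -1.0, 0.0]

# Weighted sum: 1.0*0.5 + 0.5*(-1.0) + (-1.0)*0.0 = 0.0, and sigmoid(0.0) = 0.5
output = neuron_output(inputs, weights)
```

Other activation functions, such as a step function or tanh, can be passed in through the activation parameter without changing the rest of the calculation.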
