Convolutional Neural Network (CNN) for Audio [closed]
We used deep convolutional networks on spectrograms for a spoken language identification task. We had around 95% accuracy on a dataset provided in this TopCoder contest. The details are here. Plain convolutional networks do not capture the temporal characteristics, so for example in this work the output of the convolutional network was fed to a … Read more