Homepage - Krishnakant Saboo

Music and Rhythms Perception using Spiking Neural Networks

Guide: Prof. Bipin Rajendran, IIT Bombay

Background

Humans have an inherent propensity for melody and rhythm and find them pleasant while noise is unpleasant. What is it about melodies that makes them sound pleasant? We hypothesize that it is the inherent periodicity in them which makes them predictable and hence we enjoy them. The aim was to show that there exists a network such that it learns notes of a tune, hence making the melody predictable.

Approach

We designed a 2-layered winner-take-all spiking neural network with plastic synpases. The synaptic weights had a spike timing dependent plasticity rule. AEF model was used for first layer neurons while LIF neurons were used in the second layer. Each neuron in the first layer was assigned to a particular note.

The first layer was given as input an audio sequence with one note or multiple notes playing concomitantly. In case a melody was presented, the weights modified themselves such that at the end of the song, each note in audio was learnt by at least one neuron in the output layer. In case noise was input into the network, then no learning happened. We also tested the case where a noisy melody was input into the network and observed partial learning for it.

We can differentiate between the noise and melody by looking at the final weight configuration (receptor fields). We also evaluted the capacity of the network (the number of notes that can be learnt) and the performance for different number of output layer neurons.

Fig1: The synaptic weight for each neuron in the output layer for all the input layer neurons. All weights were initialised to 4000. After input was presented, the weights modified as shown above. Each output neuron has learned 3 tones that were occuring simultaneously.

Fig2: The synaptic weights for the output layer neurons when noise was presented as input. As can be seen, the weights are very close to the initialisation weight (4000); much higher as compared to the case of melody. It can be said that no notes were learned by the network.