YouTube content identification technology?

Pedro Moreno and others at Google/YouTube work on it. They use finite-state transducers to recognize sequences of music phone units, which play a role similar to phonemes in automatic speech recognition.

Check out this article:

  • Eugene Weinstein, Pedro J. Moreno;
    Music Identification with Weighted
    Finite-State Transducers,
    Proceedings of the International
    Conference on Acoustics, Speech and
    Signal Processing (ICASSP), 2007.

If you change the speed or pitch of the whole song, I’m surprised that these algorithms still recognize it. Maybe they normalize pitch and speed (using the time between beats) so that they can recognize cover versions as well, not just the originals. It is not surprising that it ignores the beeps you added, though, since the rest of your audio stream is still similar enough to the original.
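To make the normalization idea concrete, here is a minimal sketch of how a uniform speed change and a transposition could be cancelled out. This is only my guess at the kind of thing that might happen, not what YouTube actually does. The function names, the beat timestamps, and the MIDI note numbers are all made up for illustration.

```python
from statistics import median


def normalize_tempo(beat_times):
    """Scale inter-beat intervals by their median so a uniform speed change cancels out."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    unit = median(intervals)
    # Rounding absorbs tiny floating-point differences between versions.
    return [round(i / unit, 3) for i in intervals]


def normalize_pitch(midi_notes):
    """Represent a melody by semitone steps between notes, which ignores transposition."""
    return [b - a for a, b in zip(midi_notes, midi_notes[1:])]


# Hypothetical data: beat timestamps in seconds and a melody as MIDI note numbers.
original_beats = [0.0, 0.5, 1.0, 1.5, 2.5]
original_notes = [60, 62, 64, 60]

# The same song played 25% faster and transposed up three semitones.
fast_beats = [t / 1.25 for t in original_beats]
high_notes = [n + 3 for n in original_notes]

same_rhythm = normalize_tempo(original_beats) == normalize_tempo(fast_beats)
same_melody = normalize_pitch(original_notes) == normalize_pitch(high_notes)
print(same_rhythm, same_melody)
```

After normalization both versions produce the same rhythm and interval sequences, so a matcher working on those sequences would not notice the speed or pitch change.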

(Actually, the finite-state-based algorithm would be awesome to apply to my iTunes library to tag the files correctly. Services like MusicBrainz rely on more or less exact hash matches between your audio and the database entry, whereas the transducer method seems to tolerate more differences when recognizing files.)
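The difference between exact matching and tolerant matching can be shown with a toy example. This is a rough sketch, not the transducer method from the paper and not MusicBrainz's real fingerprinting. A plain edit distance stands in for the tolerant matcher, and the unit labels (`u3`, `u7`, ...) and function names are invented.

```python
import hashlib


def exact_key(units):
    """Hash of the whole unit sequence: any single change produces a different key."""
    return hashlib.sha1(" ".join(units).encode("utf-8")).hexdigest()


def edit_distance(a, b):
    """Levenshtein distance between two sequences of unit labels."""
    previous = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        current = [i]
        for j, y in enumerate(b, start=1):
            current.append(min(
                previous[j] + 1,             # delete x
                current[j - 1] + 1,          # insert y
                previous[j - 1] + (x != y),  # substitute, or match at no cost
            ))
        previous = current
    return previous[-1]


# Hypothetical unit labels for a reference track, a noisy copy, and another song.
reference = ["u3", "u7", "u7", "u1", "u4", "u9"]
noisy_copy = ["u3", "u7", "u2", "u1", "u4", "u9"]  # one unit misrecognized
other_song = ["u5", "u5", "u8", "u2", "u6", "u0"]

hash_match = exact_key(reference) == exact_key(noisy_copy)
distance_to_copy = edit_distance(reference, noisy_copy)
distance_to_other = edit_distance(reference, other_song)
print(hash_match, distance_to_copy, distance_to_other)
```

One misrecognized unit is enough to break the hash match, while the edit distance to the noisy copy stays far smaller than the distance to an unrelated song, so a threshold on the distance would still tag the file.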
