Improving Music Search With Machine Learning
Popular music search and discovery systems such as Pandora and Last.fm rely primarily on human-entered annotations to classify songs for retrieval. Though effective, human-centric approaches to music classification are labor-intensive, and the recommendations they can generate are limited in scope. For instance, a person must know the name of a particular artist or track in order to receive a recommendation. This is not a problem for music fans and aficionados, but it limits the discovery possibilities for casual listeners who may not know a wide variety of artist and track names.
Researchers at the University of California San Diego Computer Audition Lab have developed a system that could address this problem by allowing people to find music using descriptive words rather than artist names and song titles. For instance, a person could enter the words “high energy guitars” or “romantic vocals” and then receive a list of tracks that match that description.
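To make the idea concrete, here is a minimal sketch, in Python, of how a descriptive-word query could be matched against a catalog of automatically tagged tracks. The function names, the per-tag score format, and the simple additive relevance measure are illustrative assumptions, not a description of the UCSD system itself.

```python
# Hypothetical sketch: ranking auto-tagged tracks against a text query.
# Assumes each track already carries a score (e.g., a probability) for each
# descriptive tag; names and the scoring rule are illustrative only.
def search(query_tags, track_tag_scores, top_n=10):
    """track_tag_scores: {track_id: {tag: score}}; returns the best-matching track ids."""
    def relevance(scores):
        # Sum the track's scores for the tags that appear in the query.
        return sum(scores.get(tag, 0.0) for tag in query_tags)

    ranked = sorted(track_tag_scores.items(), key=lambda kv: relevance(kv[1]), reverse=True)
    return [track_id for track_id, _ in ranked[:top_n]]

# Example usage: search(["high energy", "guitars"], catalog) -> list of track ids
```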
The UCSD system can ingest songs and tag them with descriptive annotations without human intervention. To provide accurate results, the system must first be taught to hear music and describe it using natural language. The training process uses digital signal processing and machine learning algorithms to expose the system to a broad array of music along with the words people use to describe it. For example, to accurately identify music described as “driving rock”, the system must analyze a large number of driving rock songs and identify the signal patterns that make that particular style of song distinctive.
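The researchers' paper uses dynamic texture models; as a much simpler illustration of the general recipe (extract audio features, then learn a classifier for each descriptive tag from human-labeled examples), the sketch below trains one binary classifier per tag. The feature choice (MFCC statistics), the tag vocabulary, and the logistic-regression classifier are assumptions made for the example, not the researchers' actual model.

```python
# Simplified sketch of content-based auto-tagging: one binary classifier per tag.
# Illustrative only; this is not the dynamic texture model described in the paper.
import numpy as np
import librosa
from sklearn.linear_model import LogisticRegression

TAGS = ["driving rock", "romantic vocals", "high energy guitars"]  # example vocabulary

def song_features(path):
    """Summarize a song as the mean and standard deviation of its MFCCs (a common baseline)."""
    audio, sr = librosa.load(path, sr=22050, mono=True)
    mfcc = librosa.feature.mfcc(y=audio, sr=sr, n_mfcc=13)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

def train_taggers(paths, tag_labels):
    """tag_labels[i] is the set of tags humans applied to paths[i].

    Assumes every tag has both positive and negative examples in the training set.
    """
    X = np.array([song_features(p) for p in paths])
    models = {}
    for tag in TAGS:
        y = np.array([1 if tag in labels else 0 for labels in tag_labels])
        models[tag] = LogisticRegression(max_iter=1000).fit(X, y)
    return models

def auto_tag(models, path):
    """Return the estimated probability that each tag applies to a new, unlabeled song."""
    x = song_features(path).reshape(1, -1)
    return {tag: float(m.predict_proba(x)[0, 1]) for tag, m in models.items()}
```

The per-tag probabilities produced this way are exactly the kind of scores the earlier search sketch assumes each track carries.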
The researchers have been gathering training data through crowdsourcing with an innovative Facebook game called “Herd-It”. In the game, players hear a song snippet and are asked to associate descriptive words and phrases with it. Players earn points based on how well their answers match those of previous players.
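As a rough illustration of that agreement-based scoring (the actual Herd-It rules are not described here), a player's points might be proportional to how many earlier players chose the same tags for the same clip. The function below is a hypothetical sketch of that idea.

```python
# Hypothetical agreement score: reward tags that earlier players also chose.
# This is not the actual Herd-It scoring rule, just an illustration of the concept.
from collections import Counter

def agreement_points(player_tags, previous_answers, points_per_match=10):
    """previous_answers: list of tag sets submitted by earlier players for this clip."""
    tag_counts = Counter(tag for answer in previous_answers for tag in answer)
    total_players = max(len(previous_answers), 1)
    # Each chosen tag earns points in proportion to how many earlier players agreed.
    return sum(points_per_match * tag_counts[tag] / total_players for tag in player_tags)
```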
The research group’s latest work in improving automatic music analysis was recently presented at the 2009 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP) in the paper “Dynamic Texture Models of Music,” by Luke Barrington, Antoni Chan, and Gert Lanckriet.
With the continuing decline of the radio DJ as tastemaker, web-based music search and discovery tools will become increasingly important. With further development, machine-learning-driven music search systems such as this one could give listeners an intuitive and compelling way to find music they will enjoy.