
Yves Raimond: With the increasing amount of musical audio available on the web, and the increasing amount of music you can have in a personal music collection, we see knowledge management problems arise. These problems are similar to the ones that happen in large libraries: an item which is not properly described is lost -- nobody will ever be able to find it.
The more information you have about a particular item, the easier it will be to locate it. You have a wide range of such information: editorial information (e.g. artist, album, track), cultural information (e.g. genre), and information about the actual content (e.g. acoustic similarities amongst items, structural segments). You need to handle all of that within a unified framework.
Moreover, what if you want to have more information about a particular artist, a particular work or a particular performance? You need to consider far more types of resources than just the audio items themselves.
The Music Ontology (MO) provides a Semantic Web vocabulary encompassing such information.
Yves Raimond: Most of them are based on two main approaches: collaborative filtering (e.g. last.fm) or content-based filtering (e.g. Pandora). The former recommends items to a user based on his taste and the tastes of other related users. The latter uses a musicological description of the actual audio items, which can either be manually asserted (as in Pandora and the Music Genome Project) or automatically extracted. I do not have any favourites, but I think interesting things will happen when “hybrid” recommenders, which combine these two approaches, emerge.
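To make the "hybrid" idea concrete, here is a minimal sketch of one way the two approaches could be blended. It does not describe how last.fm, Pandora or any other service works: the function names, the rating scale (0 to 1), the feature dictionaries and the mixing weight `alpha` are all assumptions made for illustration.

```python
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def collaborative_score(user, track, ratings):
    """Rating of `track` by other users, weighted by how similar their taste is."""
    num = den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or track not in other_ratings:
            continue
        weight = cosine(ratings[user], other_ratings)
        num += weight * other_ratings[track]
        den += weight
    return num / den if den else 0.0


def content_score(user, track, ratings, features):
    """Similarity of `track` to the user's rated tracks, weighted by those ratings."""
    rated = ratings[user]
    total = sum(rated.values())
    if not total:
        return 0.0
    return sum(r * cosine(features[t], features[track]) for t, r in rated.items()) / total


def hybrid_score(user, track, ratings, features, alpha=0.5):
    """Blend both scores; alpha=1 is purely collaborative, alpha=0 purely content-based."""
    return (alpha * collaborative_score(user, track, ratings)
            + (1 - alpha) * content_score(user, track, ratings, features))
```

A linear blend is only the simplest possible combination; it is used here because it makes the contribution of each approach easy to see.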
Yves Raimond: Oscar Celma's FOAFing-the-music project is really interesting from a technological point-of-view. It uses a user's profile, available anywhere on the web, and derives recommendations from this profile and a lot of aggregated feeds and content-based data. These days, I use last.fm a lot, although it really insists on making me listen to Dutch hip-hop (which I don't particularly like).
I think there is something quite frustrating about music recommender systems at the moment though. First, they do not explain how a particular recommendation was derived. I would really like them to tell me "I recommended this track because the harmonies are similar to other tracks you liked according to such and such criteria". I think I would place more trust in a recommender system that actually explains recommendations, like a friend would do.
Another frustration is that we now have a really huge music-related web of data, created within the scope of the Linking Open Data project, which is not used at all by current recommender systems.
We started some work with Alexandre Passant, driven by these two frustrations. Using all these interlinked data for recommendation purposes allows us to break free from the traditional 'information barriers', and use all sorts of data as a basis for a musical recommendation.
For example, using the datasets currently available and interlinked on the web, you can already provide recommendations such as “You're interested in intentional living and the Beastie Boys? Did you know that B.B. King is a vegetarian, as is Adam Yauch, who is a member of the Beastie Boys?”
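A recommendation of this kind, together with its explanation, can be sketched as a search for facts shared between a member of a liked band and other people in the data. The triples, the predicate names (`memberOf`, `diet`, `genre`) and the function below are hypothetical and only mirror the example above; real interlinked datasets would use full URIs and far richer vocabularies.

```python
from collections import defaultdict

# Illustrative statements only, standing in for data aggregated from the web.
TRIPLES = [
    ("Adam Yauch", "memberOf", "Beastie Boys"),
    ("Adam Yauch", "diet", "vegetarian"),
    ("B.B. King", "diet", "vegetarian"),
    ("B.B. King", "genre", "blues"),
]


def recommend_with_reasons(band, triples):
    """Return (artist, explanation) pairs for people sharing a fact with a band member."""
    members = {s for s, p, o in triples if p == "memberOf" and o == band}

    # Index every non-membership fact by (predicate, object).
    facts = defaultdict(set)
    for s, p, o in triples:
        if p != "memberOf":
            facts[(p, o)].add(s)

    results = []
    for (p, o), subjects in sorted(facts.items()):
        for member in sorted(subjects & members):
            for other in sorted(subjects - members):
                reason = f"{other} and {member} (member of {band}) share {p}: {o}"
                results.append((other, reason))
    return results
```

Because each result carries the fact that produced it, the explanation comes for free, which is the property missing from the recommenders discussed above.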
Yves Raimond: An example of such a linkage is the one between Musicbrainz and DBpedia. Musicbrainz is highly structured with high-quality and moderated data, whereas DBpedia has a wider scope but is messier. However, artists in Musicbrainz and artists in DBpedia are interlinked.
In my opinion, this is not really a problem. When you get some RDF data, you get it from somewhere. You can always filter out some statements depending on their origin. For example, for editorial data, I trust Musicbrainz more than DBpedia.
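Filtering by origin can be sketched by keeping the source next to every statement. The `Statement` type, the trust values and both functions below are assumptions for illustration; they are not part of Musicbrainz, DBpedia or any RDF library.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    subject: str
    predicate: str
    obj: str
    source: str  # where the statement was fetched from


# Hypothetical trust levels for editorial data, one per source.
TRUST = {"musicbrainz": 0.9, "dbpedia": 0.6}


def filter_by_origin(statements, trust, threshold):
    """Keep only statements whose source is trusted at least `threshold`."""
    return [s for s in statements if trust.get(s.source, 0.0) >= threshold]


def prefer_trusted(statements, trust):
    """For each (subject, predicate), keep the value from the most trusted source."""
    best = {}
    for s in statements:
        key = (s.subject, s.predicate)
        current = best.get(key)
        if current is None or trust.get(s.source, 0.0) > trust.get(current.source, 0.0):
            best[key] = s
    return list(best.values())
```

Note that `prefer_trusted` keeps a single value per subject and predicate, so it only suits properties that are expected to have one value, such as a track title.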
Yves Raimond: Apart from the recommendation systems mentioned above, interlinked data is really useful for managing audio collections. Usually, media players just use the ID3 tags embedded in audio files (e.g. artist name, track title, album title) for searching and browsing purposes.
Using a tool such as GNAT (available in the Music Ontology tools Sourceforge project), you can get from an audio file to a corresponding web identifier, which gives an entry point into the web of data. Then, your media player can act as an aggregator. It crawls the web, and aggregates structured data about the different audio items in your collection.
You end up with a 'tailored' database, describing a particular audio collection. Then, you can navigate your collection in interesting ways, as illustrated in the DBtuneFacet Demo. You can plot your collection on a map, based on the location of the different artists involved in it. You can plot it on a time-line, based on the birth dates of the artists, on the recording dates, etc. You can search using aggregated tags, browse from a particular performance to the corresponding musical work and look for other performances of that work, discover other albums made by the artists in your collection, etc.
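The "media player as an aggregator" idea can be sketched as a small crawl that starts from the web identifiers of the tracks in a collection, follows links, and stores every statement it finds. This is not how GNAT or the DBtune demo are implemented: the `fetch` callable, the `example.org` identifiers and the predicate names are placeholders, and a real player would dereference URIs over HTTP and parse RDF.

```python
from collections import defaultdict


def aggregate(seed_uris, fetch, max_depth=1):
    """Crawl from the seed identifiers, following links up to `max_depth` hops."""
    store, seen = set(), set()
    frontier = list(seed_uris)
    for _ in range(max_depth + 1):
        next_frontier = []
        for uri in frontier:
            if uri in seen:
                continue
            seen.add(uri)
            for s, p, o in fetch(uri):
                store.add((s, p, o))
                if o.startswith("http://"):  # the object is itself a web identifier
                    next_frontier.append(o)
        frontier = next_frontier
    return store


def timeline(store, date_predicate):
    """Group subjects by the year found in an ISO date value."""
    by_year = defaultdict(list)
    for s, p, o in store:
        if p == date_predicate:
            by_year[int(o[:4])].append(s)
    return {year: sorted(subjects) for year, subjects in sorted(by_year.items())}


# A stand-in for the web: identifier -> statements returned when it is looked up.
WEB = {
    "http://example.org/track/1": [
        ("http://example.org/track/1", "maker", "http://example.org/artist/a"),
    ],
    "http://example.org/artist/a": [
        ("http://example.org/artist/a", "birthDate", "1950-01-01"),
    ],
}

collection = aggregate(["http://example.org/track/1"], lambda uri: WEB.get(uri, []))
```

Calling `timeline(collection, "birthDate")` then groups the artists in the collection by birth year; the map and tag views mentioned above would be other queries over the same store.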
Of course, these are the applications I can think of right now, but the real value of Linked Open Data is its unexpected re-use! For example, my colleague Kurt Jacobson analyses the community structure of artists on MySpace to see whether this structure is related to the actual audio content they put online.
Yves Raimond: I think it is crucial for music labels to open up their catalogue, and publish as much structured data as they can on the web. This allows users to find their content through multiple platforms, in a multitude of ways. For example, the aggregator mentioned above could give me new albums made by my most-played artists, new performances of my favourite works, etc.
Creating new data silos, where content is accessible in only one way on only one platform, will just worsen the current situation.
Yves Raimond: This is another really interesting point. Machine-readable licensing information for musical items is becoming more and more of a reality. Creative Commons already did a lot of work in this domain. An exciting application of such data would be to integrate it within the production process.
For example, when using a bunch of samples coming from different tracks, my audio editing application could use machine-readable licensing information to handle the publishing of my creation. It could let me choose amongst the different licenses that are compatible with the audio materials I used. It could derive which of my samples come from tracks with an attribution license and publish the corresponding attributions.
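The two steps described here, choosing a compatible license and deriving attributions, can be sketched with a deliberately simplified model. The license names, the three flags and the compatibility rules below are toy assumptions; they do not reproduce the actual terms of Creative Commons or any other license.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class License:
    name: str
    attribution: bool = False        # credit to the original creator is required
    share_alike: bool = False        # derived works must use the same license
    allows_derivatives: bool = True  # sampling and remixing are permitted


def compatible_licenses(sample_licenses, candidates):
    """Candidate licenses for a new work, under the toy rules in the comments."""
    # Rule 1: a single sample that forbids derivatives blocks publication.
    if not all(lic.allows_derivatives for lic in sample_licenses):
        return []
    result = []
    for candidate in candidates:
        # Rule 2: a share-alike sample forces the same license on the new work.
        if any(lic.share_alike and lic != candidate for lic in sample_licenses):
            continue
        # Rule 3: if any sample requires attribution, so must the new work.
        if any(lic.attribution for lic in sample_licenses) and not candidate.attribution:
            continue
        result.append(candidate)
    return result


def required_attributions(samples):
    """Credits to publish; `samples` holds (track title, creator, License) tuples."""
    return [f"{title} by {creator}" for title, creator, lic in samples if lic.attribution]
```

An audio editor could call `compatible_licenses` when a project is exported and present the result as the list of licenses to choose from, with `required_attributions` filling in the credits.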
Yves Raimond: I would really like to have a small RDF cache on the iPhone which would hold an aggregation of structured web data describing the music collection, to drive similar applications as above, but in a mobile context. We also worked a bit with Christian Becker to integrate last.fm recommended events in DBpedia mobile, and it works quite well! You can display these events alongside nearby sights, all that on your mobile device.
Yves Raimond: I am really looking forward to the event in general, and especially to the Semantic Desktop and the Multimedia tracks. I am especially interested in learning more about Nepomuk, and would like to see if the “media player as an aggregator” idea could be integrated in the Nepomuk framework.
Yves Raimond is a researcher and PhD student at the Electronic Engineering Dept., Queen Mary University of London, where he is associated with the Centre for Digital Music. His research interests include Semantic Web technologies and automated music analysis for enhanced access to music-related information. He is one of the main contributors to the Music Ontology community project and is also involved in the Linking Open Data on the Semantic Web community project of the Semantic Web Education and Outreach interest group, where he deals with publishing and interlinking music-related structured data on the web.
Yves Raimond is a keynote speaker at the Web of Data Practitioners Days, taking place in Vienna, Austria, 22-23 October 2008.
The Web of Data Practitioners Days is a new application-oriented event for Semantic Web practitioners and interested newcomers. It aims to communicate the results of recent years' semantic systems activities to a broader audience, especially practitioners from industry and academia. In a cooperative effort, four major Austrian institutions, which have been actively conducting research in this area in recent years, will set the stage for this event.
Participants will have the opportunity to see how semantic technologies may improve and enhance existing Web-based software systems and how the Web of Data will provide a completely new paradigm for managing globally interlinked information. As a result, attendees will have a better understanding of the practical benefits of semantic solutions, and researchers will obtain valuable feedback on further research directions aimed at making the existing technology productive and applicable to real-world use cases.
