One of the things that frustrates me with having a local music library is the tedium of tagging. While there are tools (like beets or MusicBrainz Picard) to make it easier, I feel there are fundamental issues with the design of tags for music, and the way we apply tags. This doesn’t just make exploring music harder, it leaves a lot of possibilities on the table that will be very hard to implement without significant changes to our approach.
Schema design: Who needs normal form?
Good database schema design teaches us to avoid duplication. Having duplicate data isn’t just wasteful, it’s actually (perhaps counter-intuitively) fragile. This is one of the big tenets of relational database design – the notion of normal form. Why store album information in the song table, when you can store it in an album table, and do a join instead?
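As a minimal sketch of what that join looks like, here is a toy schema using Python's built-in sqlite3 module. The table names, column names, and sample rows are all made up for illustration; no real tagging format or player stores data this way.

```python
import sqlite3

# In-memory database; the schema and data below are purely illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE album (
    id     INTEGER PRIMARY KEY,
    title  TEXT NOT NULL,
    artist TEXT NOT NULL,
    year   INTEGER
);
CREATE TABLE track (
    id       INTEGER PRIMARY KEY,
    album_id INTEGER NOT NULL REFERENCES album(id),
    position INTEGER NOT NULL,
    title    TEXT NOT NULL
);
""")

conn.execute(
    "INSERT INTO album (id, title, artist, year) VALUES (?, ?, ?, ?)",
    (1, "Example Album", "Example Artist", 1999),
)
conn.executemany(
    "INSERT INTO track (album_id, position, title) VALUES (?, ?, ?)",
    [(1, 1, "First Song"), (1, 2, "Second Song")],
)

# Album facts live in exactly one row, so a correction is a single UPDATE
# rather than a rewrite of every file on the album.
conn.execute("UPDATE album SET year = 1998 WHERE id = 1")

rows = conn.execute("""
    SELECT track.position, track.title, album.title, album.year
    FROM track
    JOIN album ON album.id = track.album_id
    ORDER BY track.position
""").fetchall()
print(rows)
```

Because the year is stored once, the tracks cannot disagree with each other about it.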
Unfortunately, tagging formats never really got this lesson. This is likely because of two factors – there’s no standardized way (but plenty of proprietary ones) to have metadata in an attendant database, and many people only downloaded the songs they cared about. This meant that the metadata about not just the song, but the album holding it, went into the song’s file – a violation of normal form. It’s not just conceptually problematic, but actually wasteful when you have album art and potentially other parts of album packaging in the file’s metadata – megabytes of images, per file. Of course, people usually make the images separate files, but it’s still not applied consistently beyond heuristics for what the front cover is called.
Incorrect or incongruous metadata is just a mild annoyance if you only have loose MP3s you collected from Napster and dropped into Winamp. If you’re trying to manage a music library, it becomes incredibly annoying. Individual tracks have no relation to each other, and you can only use heuristics within the tag, or maybe their place in the filesystem hierarchy, to determine if they’re logically grouped together. If the user’s data is subtly wrong on some tracks but not others, they’ll appear split up within the library. This requires re-tagging the tracks, on the file level, or perhaps more likely, in your software’s database representation of the library, which is more efficient than consulting the separate files. It’s a lot of effort to spend, and mistakes make a catalog less useful.
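To make the failure mode concrete, here is a small sketch of the kind of grouping a library might do. The tag dictionaries and both functions are hypothetical and not taken from any real player; actual software applies its own (often undocumented) heuristics.

```python
from collections import defaultdict

# Three files from what a human would call the same album. The tag values
# are invented: one has a trailing space, one differs in capitalization.
tracks = [
    {"file": "01.flac", "albumartist": "Example Artist", "album": "Example Album"},
    {"file": "02.flac", "albumartist": "Example Artist", "album": "Example Album "},
    {"file": "03.flac", "albumartist": "Example Artist", "album": "Example album"},
]

def group_exact(tracks):
    """Group files by the literal (album artist, album) strings in the tags."""
    groups = defaultdict(list)
    for t in tracks:
        groups[(t["albumartist"], t["album"])].append(t["file"])
    return dict(groups)

def group_normalized(tracks):
    """Same grouping, after a heuristic cleanup of whitespace and case."""
    groups = defaultdict(list)
    for t in tracks:
        key = (t["albumartist"].strip().casefold(), t["album"].strip().casefold())
        groups[key].append(t["file"])
    return dict(groups)

print(len(group_exact(tracks)))       # one "album" per distinct string
print(len(group_normalized(tracks)))  # the heuristic merges them again
```

The normalized version papers over this particular mistake, but it is still a guess: the same cleanup would also merge two genuinely different albums that happen to share a name and artist.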
While not fatal, the rigid tag scheme is also annoying. It usually means further information (such as a remaster, performers, featured artists, remix including variant and remixing artist, etc.) gets squeezed into album or song titles. Who wants to see a mouthful like “Track (feat. Rapper) [2019 Remaster] (Live at Concert Hall) (EDM Remix)” – especially without any consistency? Or have to worry about if the year on an album should be the date of original issue, or the date of its reissue? Or how to handle the most prolific musician of them all, Various Artists? What about different transliterations or translations – is “Molchat Doma” the same as “Молчат Дома”? Tag formats occasionally offer extended metadata as a better place to put these, but they’re rarely used, or rarely used in a consistent manner, if even present.
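Software that wants this information back has to dig it out of the title again. The following is a rough sketch of such a heuristic; the function name and regular expression are my own illustration, not how any particular tagger does it.

```python
import re

# Matches one parenthesized or bracketed chunk, plus any whitespace before it.
QUALIFIER = re.compile(r"\s*[\(\[]([^\)\]]*)[\)\]]")

def split_title(raw):
    """Split a title into a base title and a list of bracketed qualifiers."""
    qualifiers = [q.strip() for q in QUALIFIER.findall(raw)]
    base = QUALIFIER.sub("", raw).strip()
    return base, qualifiers

print(split_title(
    "Track (feat. Rapper) [2019 Remaster] (Live at Concert Hall) (EDM Remix)"
))
print(split_title("(I Can't Get No) Satisfaction"))
```

The first call recovers “Track” and four qualifiers, but nothing says which one is the remixer and which is the venue. The second call shows how fragile the guess is: a parenthetical that is really part of the title gets stripped out as if it were a qualifier.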
There’s also user data like play counts and ratings. These can be put into the tag, but that would pointlessly modify the original file, with the integrity and sharing issues that entails. Not to mention the common case of holding files on a read-only medium like a file share.
None of these issues are theoretical either. I’ve dealt with inconsistent metadata before (one such example), and it only became a problem because some software applied different heuristics over some kinds of metadata. Efficiently correcting it was a pain. It’s not good that we’re trying to construct a relational database out of several disparate little files, with inconsistent amounts of rigour put into them. It feels like an inversion of how a database should be built; when we have to manipulate the file instead of running simple queries, it’s a flashback to the bad old days.
Sidebar: Julien Voisin has a blog post where he covers many ugly edge cases that tag schemas don’t handle well. The post is more about bizarre names that fit into existing schemas, but it’s worth reading to see how poor some software’s assumptions can be.
Schema design: more structure, more assumptions
Audio tagging is designed around the assumption that it’s music that has tracks in albums, made by usually singular (as in a single person or group) artists. There are some variations around this model, but systems are generally based around this core assumption. While some formats are more freeform name-value internally, the way most software surfaces this to a user is the fixed artist-album-track hierarchy. Arguably, tagging is a bit of a misleading term, considering people associate that more with freeform tags that are less intrinsic to the work itself. (Hillel Wayne covers tag systems in that context in this article, if you want a refresher.)
Our current tag systems were designed for music, but people have audio in other forms too – audiobooks and podcasts being notable examples. The two have been wedged into this hierarchy, but it can be tricky to adapt in a consistent or logical manner. Is the “artist” the narrator of the book, or the author? Is a season of a podcast the “album”, or the podcast itself? These might not be music, but people use normal music players to listen to them, so their tags are also worth considering. There are also other kinds of audio that don’t map well at all to current tagging – field recordings, loose audio snippets, etc. – I can’t begin to think of how to tag these properly.
The way we tag music is also a poor fit beyond “modern” popular music of the Western tradition. For example, Western Classical music has less emphasis on the “performer” (what we consider the artist tag; some formats internally call it performer though), and more on the composer (which is off to the side and rarely filled in by most people when tagging). Anastasia Tsioulcas of NPR and Charles Petzold have articles covering this in more detail. That said, Apple also bought the classical music streaming service Primephonic a while back, so perhaps the state of classical music tagging might improve for Apple Music users. I’m not familiar enough with other kinds of musical traditions, but I have to imagine this could be a problem for them too.
Versions and cataloguing: WEMI and friends
The library cataloguing world has been dealing with the issues of metadata for decades. While mostly focused on the issues of books (in physical and, nowadays, digital form), they’ve had to deal with many other kinds of media, and many of the issues they face have equivalents for dealing with a personal music catalog. For a very in-depth overview of the metadata and schema issues dealt with there (and the conflicts between them), I strongly recommend you read Karen Coyle’s FRBR: Before and After. I’ll just be scratching the surface.
One of the most interesting notions is some kind of separation – the FRBR model (and its many confusing descendants) uses the work, expression, manifestation, item scheme, or WEMI for short. The idea is to represent something in its most abstract (the artistic idea, the work), then relate it to more and more concrete forms (such as the English-language version of a text as the expression, the paperback book published in 1996 as the manifestation, all the way to your copy of it as the item). This makes it easier to group related things together – different publishers’ versions of the same book, different language versions, etc. Without some way of expressing this, it would be very easy to get bogged down by several duplicate or irrelevant results when looking up a book.
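As a very rough sketch of that chain, here is one way to write the four levels down in Python. The class and field names are my own simplification, and the real FRBR model (and its descendants) defines far more attributes and relationships than a single parent link per level.

```python
from dataclasses import dataclass

@dataclass
class Work:
    title: str  # the abstract artistic idea

@dataclass
class Expression:
    work: Work
    description: str  # e.g. a particular language version of the text

@dataclass
class Manifestation:
    expression: Expression
    description: str  # e.g. a specific published edition

@dataclass
class Item:
    manifestation: Manifestation
    owner: str  # one concrete copy

def work_of(item):
    """Walk from a concrete copy back up to the abstract work."""
    return item.manifestation.expression.work

def items_for(work, items):
    """All copies that ultimately embody the given work."""
    return [i for i in items if work_of(i) is work]

# Hypothetical catalogue: one book in two languages, plus an unrelated book.
book = Work("Example Novel")
english = Expression(book, "English-language text")
french = Expression(book, "French translation")
paperback = Manifestation(english, "1996 paperback")
poche = Manifestation(french, "French pocket edition")
other = Manifestation(Expression(Work("Another Novel"), "English-language text"), "hardcover")

shelf = [Item(paperback, "me"), Item(poche, "a friend"), Item(other, "me")]
print([i.manifestation.description for i in items_for(book, shelf)])
```

Looking up the work returns both editions and skips the unrelated book, which is the grouping that flat per-item records can’t express.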
Of course, it’s not all that simple – there are other views on how to relate different levels of abstraction (especially in how it affects user experience for finding something), what changes require a new WEMI-level item (what are the criteria for, e.g., a new manifestation? is a compilation a work?), the relations that result with translation and annotation (making it not quite a simple linear relation), and what properties belong to what level of abstraction (is the key a song is composed in a property of the work and/or the expression?). In particular, there are many competing views on how the relations between the abstract and concrete should be laid out. Explaining the differences between them would take a lot of time, so I defer to Part 1 of FRBR: Before and After, where you can check out various library scientists’ perceptions if you wish to learn more.
Sidebar: For an example of how all of this is applied for music in actual libraries, take a look at Best Practices for Music Cataloging. It’s heavy on practice (in RDA/MARC) for librarians (who are also dealing with the physical items, whereas you may not be) rather than theory for the layman, but it doesn’t assume too much prior library science knowledge. While your personal tags might not need such rigour, it demonstrates how WEMI is applied for things like popular music on a large scale. And if you’re pulling metadata from an authority that has a lot more stuff and knows more about it, you’d better hope it’s rigorous.
While the various kinds of separations are up for debate in the cataloguing community (as is schema design in general), having some form of them is useful. I myself am not too picky. There may be albums, but there are various versions issued for different countries in their physical form, even if the contents are identical. The variations in physical form might not be too relevant for modern digital music, but it would affect packaging like album art and booklets – the former of which is usually included in your file’s metadata, the latter occasionally as an attendant file beside it. Sometimes the contents aren’t identical – albums have changed their contents for various regions. For example, Depeche Mode’s Speak & Spell has a different track list for the US and UK versions of the album. Not to mention if an album is reissued with additional tracks or a remaster of the original tracks – pretty much any sufficiently old album will acquire these kinds of variations, and there’s a chance you might want multiple versions of the album. Why shouldn’t that be cleanly representable?
There are also more complex cases, and they’re surprisingly common. Kanye West’s The Life of Pablo had the track list changed soon after release. David Byrne and Brian Eno’s My Life in the Bush of Ghosts had a controversial track omitted on its re-release. Metric’s Grow Up and Blow Away was distributed amongst the underground scene for years, but had its track list changed for its official release. Kraftwerk’s albums were changed for the US versus German markets. Numerous albums keep getting reissued in box sets (sometimes with other albums combined – significantly complicating relationships in a schema), with extended track lengths, or simply with different artwork. What if you want either version?
The scheme could also be extended to other things as well. As a very simplified WEMI-ish model, imagine the “essence” (for lack of a better term) of the song (which could be decomposed further – arrangement, lyrics, etc.), and “renditions” by various artists, including covers, remixes, censorship, and mastering variations. Or relating an individual artist or composer to the bands they were in. Unfortunately, this would require metadata reaching outside of the traditional per-file approach.
What could we do if the state of things were better? Right now, we look at music libraries as a collection of songs, and only an ill-defined, ad-hoc hierarchy of tags binding them. If we flipped the script and made metadata the focus, I think we could have nicer things.
Even within songs, we could have richer information. Imagine being able to find all the covers of a song, or alternative versions. Or the opposite – have all the duplicate versions of a song’s rendition across albums, reference all the other albums, and de-duplicate the versions on disk. Artist information, album packaging like booklets, and other such data that’s de-prioritized in the systems we have now could become first-class concepts that you can browse, instead of being figured out from metadata. These might not exist in the form of files, but they’re important information.
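A sketch of how queries like these could look if songs, their renditions, and album track lists were separate records. The schema below is hypothetical (using Python's built-in sqlite3 module) and glosses over almost everything a real catalogue would need, but it shows both directions: finding covers, and finding every album that reuses the same rendition.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE song (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE rendition (
    id      INTEGER PRIMARY KEY,
    song_id INTEGER NOT NULL REFERENCES song(id),
    artist  TEXT NOT NULL,
    kind    TEXT NOT NULL          -- e.g. 'original', 'cover', 'remix'
);
CREATE TABLE album (id INTEGER PRIMARY KEY, title TEXT NOT NULL);
CREATE TABLE album_track (
    album_id     INTEGER NOT NULL REFERENCES album(id),
    position     INTEGER NOT NULL,
    rendition_id INTEGER NOT NULL REFERENCES rendition(id),
    PRIMARY KEY (album_id, position)
);
INSERT INTO song VALUES (1, 'Example Song');
INSERT INTO rendition VALUES (1, 1, 'Artist A', 'original');
INSERT INTO rendition VALUES (2, 1, 'Artist B', 'cover');
INSERT INTO album VALUES (1, 'Studio Album'), (2, 'Greatest Hits'), (3, 'Covers Album');
-- The same recording appears on two albums but is stored as one rendition.
INSERT INTO album_track VALUES (1, 1, 1), (2, 1, 1), (3, 1, 2);
""")

# All covers of a song.
covers = conn.execute(
    "SELECT artist FROM rendition WHERE song_id = ? AND kind = 'cover'", (1,)
).fetchall()

# Every album that carries the original rendition; on disk it only needs
# to exist once.
albums_with_original = conn.execute("""
    SELECT album.title
    FROM album_track
    JOIN album ON album.id = album_track.album_id
    WHERE album_track.rendition_id = ?
    ORDER BY album.id
""", (1,)).fetchall()

print(covers, albums_with_original)
```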
The possibilities here suggest integration with a larger system providing authoritative data. For example, a lot of the information you would want could be derived from sources like Allmusic or Discogs. The semantic web (brought up a lot by library science people) also holds a lot of promise for even personal music libraries, which could take the silo nature out of your sources and provide a scheme for sharing facts about things. It’s unfortunate about its stillborn growth as the real Web3. Or even something smaller scale – imagine playlist portability between services, if they point to the same identities for each track.
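The playlist case is simple enough to sketch. Everything here is invented for illustration: the two services, their local track IDs, and the "shared:" identifiers standing in for whatever common identity scheme they would agree on.

```python
# Each hypothetical service maps its own track IDs to a shared identifier.
service_a = {"a-101": "shared:0001", "a-102": "shared:0002"}
service_b = {"b-9": "shared:0002", "b-7": "shared:0001"}

def port_playlist(playlist, source, target):
    """Translate a playlist of source-local IDs into target-local IDs.

    Returns (ported, missing): the translated IDs in playlist order, and
    the source IDs that have no counterpart in the target catalogue.
    """
    shared_to_target = {shared: local for local, shared in target.items()}
    ported, missing = [], []
    for local_id in playlist:
        shared = source.get(local_id)
        target_id = shared_to_target.get(shared)
        if target_id is None:
            missing.append(local_id)
        else:
            ported.append(target_id)
    return ported, missing

print(port_playlist(["a-102", "a-101", "a-999"], service_a, service_b))
```

The translation itself is trivial; the hard part is the assumption baked into the two dictionaries, namely that both services agree on what counts as the same track.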
Ultimately, I find it sad that the local FLAC collection enthusiasts are stuck in their ways when it comes to not wanting music libraries. It also makes me wonder if we’ve been focusing too much on raw files in general, beyond music. Are they the wrong level of abstraction for human-meaningful data? Could we have better user experiences by some layering on top? Or using some kind of database instead of files? It’s an interesting thought I’ve been grappling with, but that’s a topic for another day…
Appendix: The state of streaming services: a very small case study
It’s interesting to note some streaming services can make somewhat of a break from things; I’ve subscribed to Apple Music to give it a try (I have too much obscure music locally to not have a collection, but streaming is useful for discovery). They can offer a lot more related media (like time-synced lyrics, music videos, and curation via playlists and reviews) than the frozen state of affairs with local music libraries. However, due to compatibility, they still have to work within existing tagging concepts, and this can lead to confusing incongruities between tags the user sees.
For example, the New Order album Movement is unfortunately labelled with additional text on the album and track names to indicate that this is a remaster with additional bonus tracks (a case of an unfortunate lack of proper tags to indicate versions), in addition to an editorial:
But you can also see, there’s a crude WEMI-like separation, because we can pick the original version of the album (as well as see things like music videos related to the album):
But it’s not all roses. Of these versions of Pere Ubu’s The Modern Dance, which one is the authoritative version? Is there really such a thing as an authoritative version? As far as I can tell, these only differ by track length; it would be useful to have more metadata like the issue number from the label to determine which is which. And how did it even decide on this version being the default anyways?
Other albums can be even denser with options, though which version you have may not be obvious either. Halsey’s If I Can’t Have Love, I Want Power has five different versions available, though one can tell from the title what expanded version they are, and from the explicit indicator on the album for a non-censored version (and for this album, down to the album art). Other times the alternative mix isn’t obvious. For example, the Atmos mix of Talking Heads’ Fear of Music sounds different enough that which version one prefers is probably subjective. One can only tell what they’re looking at by looking at the mastering notices at the top (e.g. digital master, lossless, Atmos, etc.).
For just annoying-if-you-think-of-it-too-hard, here’s Marcus Mumford’s new not self-titled album, (self-titled), with inconsistent titling. Or Caetano Veloso and David Byrne’s live album with inconsistent live markings?
For a slightly more complex case of how metadata can be confusing, let’s look at Current 93’s The Inmost Light, a “box set” of two EPs and an LP from 1995-1996 that was issued in 2007. The actual albums inside are presented as discs on a single album. Despite the issue date being 2007 (and Apple Music displaying it as such), if you look at the songs’ metadata, it displays the original issue dates of each of the constituent albums (which are made unclear), and an inaccurate “n1 of n2” count.
Fixing all this looks like a hard job – it’s hard to add proper metadata at such a scale.