I listen to a lot of music, some of it good.

I listen to MF DOOM. I listen to Paul Simon. I listen to Kanye. I listen to Stars, and The Hold Steady, and Fall Out Boy. I listen to The Weepies and Katy Perry and Nujabes and Chance the Rapper. I listen to pop and Sub Pop and rap and ‘indie’. There is nothing analytic about my taste in music: it would take nobody very long to find something in my iTunes library 1 at which to recoil in disgust.

Despite music being incredibly industrial, there’s nothing inherently numerical about it. But you can gather a thousand people in a room and none of them can convince me that the way a song hits you, the way a song throws you to the ground, the way a song grabs your limbs and forces you to dance – none of them can convince me that that feeling isn’t immortal and immeasurable.

The above paragraphs are an ambling prelude to this: I have acquired a large file filled with data for thirty thousand pop songs. Let’s have some fun.

Let’s pretend, for a fleeting moment, that you are a data-driven songwriter.

How long should your song be?

This is an easy question to answer: the best songs in the world are the ones which appear on Billboard, and the best songs in the world must have optimal timing. We can figure this out by crafting a histogram of song lengths, as seen below:

Interestingly, there are two peaks in song lengths: you can be a three minute song or a four minute song. There are many good three-and-a-half minute songs, but why struggle in that nadir? Either cull that unnecessary bridge or add an extra verse.
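If you want to try this at home, here is a minimal sketch of the binning behind a histogram like that one. The sample lengths and the 30-second bin width are invented for illustration; they are not the real data, and the actual notebook may well do this differently.

```python
from collections import Counter

# Hypothetical sample: track lengths in seconds (made up, not the real data).
lengths_sec = [172, 185, 190, 178, 215, 243, 251, 238, 246, 181]


def length_histogram(lengths, bin_width=30):
    """Count songs per bin; each key is the bin's lower edge in seconds."""
    return Counter((length // bin_width) * bin_width for length in lengths)


hist = length_histogram(lengths_sec)

# Print a crude text histogram, one row per bin, shortest songs first.
for start in sorted(hist):
    minutes, seconds = divmod(start, 60)
    print(f"{minutes}:{seconds:02d}  {'#' * hist[start]}")
```

Narrower bins make the two peaks easier to see, at the cost of a noisier picture.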

This is a somewhat flawed picture, though, because the will of the populace in the 1970’s may have clamored for a different optimal song length than that of today. Graphing average song lengths over time reveals an interesting trend:

Note that we hit a valley of expediency 2 in the years leading up to the 1960’s, with a steady climb in length for the three decades following. With the mad rush of the 90’s came the appreciated death of the five-minute RnB slow jam, and we’re slowly approaching a length of – three and a half minutes? Really? Shit.
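The trend over time boils down to grouping tracks by year and averaging each group. A rough sketch, assuming the data arrives as (year, length in seconds) pairs; the rows below are invented, and the names are mine rather than anything from the notebook.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (year, length in seconds) rows, invented for illustration.
tracks = [
    (1958, 140), (1958, 150),
    (1975, 210), (1975, 230),
    (1992, 260), (1992, 250),
]


def average_length_by_year(rows):
    """Group lengths by year and return {year: mean length}, ordered by year."""
    by_year = defaultdict(list)
    for year, length in rows:
        by_year[year].append(length)
    return {year: mean(values) for year, values in sorted(by_year.items())}


averages = average_length_by_year(tracks)
for year, avg in averages.items():
    minutes, seconds = divmod(round(avg), 60)
    print(f"{year}: {minutes}:{seconds:02d}")
```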

What should you name your song?

We can grab each word in each track title, take out the boring words like “The” and “and” and “me” and “you” 3, and count them all up, getting something like this:

Counter({'love': 2866, "don't": 899, "i'm": 735, 'little': 714, 'heart': 598, 'baby': 549, 'one': 536, 'time': 535, "it's": 514, 'get': 487, 'girl': 480, 'like': 474, 'got': 458, 'man': 450, 'night': 447, 'song': 426, 'go': 425, "can't": 421, 'back': 385, "you're": 376, 'good': 372, 'way': 364, 'want': 359, 'old': 349, 'come': 347, 'home': 341, 'never': 334, 'let': 329, "i'll": 328, 'blue': 319, 'know': 317, 'world': 298, 'day': 296, 'take': 288, 'blues': 284, 'sweet': 275, 'away': 267, 'make': 265, 'life': 257, 'say': 243, 'eyes': 235... # omitted for length.
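For the curious, a tally like that can be produced with Python's `collections.Counter`. This is a sketch under assumptions: the stopword list here is a tiny hand-rolled one and the titles are just a few examples, so neither matches what actually produced the numbers above.

```python
from collections import Counter

# A deliberately tiny, hand-picked stopword list (an assumption; real
# lists are much longer).
STOPWORDS = {"the", "and", "me", "you", "my", "a", "in", "of", "to", "i"}

# A few example titles, standing in for the real dataset.
titles = ["Love Me Do", "The Power of Love", "Baby Love", "I Want You Back"]


def title_word_counts(titles, stopwords=STOPWORDS):
    """Lowercase each title, split on whitespace, drop stopwords, count the rest."""
    counts = Counter()
    for title in titles:
        for word in title.lower().split():
            # Trim surrounding punctuation but keep apostrophes ("don't").
            word = word.strip('.,!?()"')
            if word and word not in stopwords:
                counts[word] += 1
    return counts


counts = title_word_counts(titles)
print(counts.most_common(3))
```

Apostrophes are left alone on purpose, which is why contractions like "don't" and "i'm" survive as single words in a count like the one above.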

To make this more visually appealing, we can put it in a word cloud, like this:

The conclusion here is painfully obvious: if you are a songwriter and you want to become famous, name your song One Love Don’t Hurt My Little Heart, Baby. Or, alternatively, My Baby, My One Little Loving Heart. Or, alternatively, Get My Baby In the Night, Love Her Heart.

How many hit songs will your band produce?

First, I will give you the most honest answer possible: judging by the past hundred or so years, you will produce 4.42 hit songs over the course of your lifetime. Four of them will be glorious: one will be 42% as glorious as the rest.

We can take a look at this broken down annually, but there is no escaping the inexorable truth: you will probably not produce more than a dozen good songs in your lifetime 4.
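One plausible way to arrive at a figure like 4.42 is to count chart entries per artist and take the mean of those counts. That is my assumption about the method, not something the notebook confirms, and the rows below are placeholders.

```python
from collections import Counter
from statistics import mean

# Hypothetical chart entries as (artist, title) pairs; placeholders only.
hits = [
    ("Artist A", "Song 1"), ("Artist A", "Song 2"), ("Artist A", "Song 3"),
    ("Artist B", "Song 4"),
    ("Artist C", "Song 5"), ("Artist C", "Song 6"),
]

# Number of charting songs per artist.
hits_per_artist = Counter(artist for artist, _title in hits)

# Mean number of hits across all artists who charted at least once.
average_hits = mean(hits_per_artist.values())
print(f"{average_hits:.2f} hits per artist")
```

Note that this only averages over artists who charted at all, which flatters everyone involved.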

Still, out of reverence to the greats, it might be pleasant to contact the most prolific of your peers:

Artist                              Count
Bing Crosby                         293
Guy Lombardo & His Royal Canadians  218
Glee Cast 5                         208
Paul Whiteman & His Orchestra       188
Elvis Presley                       159
Frank Sinatra                       159
Tommy Dorsey & His Orchestra        137
Perry Como                          135
Glenn Miller & His Orchestra        127
Billy Murray                        122
Ted Lewis & His Band                103
Ben Selvin & His Orchestra          102

Please note that history has deemed the cast of Glee more noteworthy than Frank Sinatra.
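A leaderboard like the one above is a short step from a per-artist count: `Counter.most_common` does the sorting. A small sketch, assuming one artist name per charting song; the list below is a toy sample, not the real chart data.

```python
from collections import Counter

# Toy sample: one artist name per charting song (not the real data).
chart_artists = [
    "Bing Crosby", "Glee Cast", "Bing Crosby",
    "Frank Sinatra", "Glee Cast", "Bing Crosby",
]


def most_prolific(artists, n=3):
    """Return the n artists with the most chart entries, highest count first."""
    return Counter(artists).most_common(n)


leaders = most_prolific(chart_artists, n=2)
for artist, count in leaders:
    print(f"{artist:<15}{count:>4}")
```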

Coda

Feeble attempts at humor aside, I hope you gleaned some modicum of knowledge or joy from the above! I honestly do love the nerdier side of music consumption – one of these days I’ll take a pickaxe to my scribbles – and I think this kind of stuff is immensely interesting, even if it doesn’t necessarily apply to the core act of listening. You can download the IPython Notebook I used for all of the above here.


  1. Or, more recently, my LastFM feed.
  2. Less than two and a half minutes! My stars and garters!
  3. These are called stopwords. This knowledge will never be useful.
  4. Though your chances increase considerably if you invent a time machine and travel back to the 1940’s. Who would have guessed that America had a thing for institutions?
  5. Yep, the cast of Glee.