"Look out honey, 'cause I'm using technology..."

2010-02-25

Generating Band Names and Song Titles

Yesterday my distinguished colleague Stuart Langridge said he could use some fake band names for functional tests in the Ubuntu One Music Store he's working on. Since I'd done a similar thing once before (for realistic-sounding Dutch city and village names), I figured I could hash out an algorithm pretty quickly, and so I did, last night.

I started with a simple Markov chain class that can analyse a body of data and then generate text that is like the source data, but does not actually occur in it. At first I used the same algorithm for artist names and song titles, but I soon decided that generating song titles word by word and artist names character by character made more sense, since proper names (and a lot of band names) don't adhere to spelling rules anyway.
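To make the word-by-word versus character-by-character difference concrete, here's a minimal sketch of the idea on toy data. It is an illustration only, not the script itself: the function names `build_table` and `generate` and the sample lines are made up for this example.

```python
import random

def build_table(lines, order, by_word):
    """Map each run of `order` preceding tokens to the tokens seen after it."""
    table = {}
    for line in lines:
        # Tokens are whole words for titles, single characters for names.
        tokens = line.split() if by_word else list(line)
        tokens.append('\n')  # end-of-line marker
        state = (None,) * order  # padding that stands for "start of line"
        for token in tokens:
            table.setdefault(state, []).append(token)
            state = state[1:] + (token,)
    return table

def generate(table, order, by_word):
    """Walk the table from the start state until the end-of-line marker."""
    state = (None,) * order
    out = []
    while True:
        token = random.choice(table[state])
        if token == '\n':
            break
        out.append(token)
        state = state[1:] + (token,)
    return (' ' if by_word else '').join(out)

# Hypothetical sample data, just to show the two modes side by side.
names = build_table(['Williams', 'Elliott'], order=3, by_word=False)
titles = build_table(['Long Dark Blues', 'Long Dark Road Home'], order=2, by_word=True)
```

With these two names, the character table can jump from "Will" into "iott" because both contain "lli", which is exactly the kind of splice that shows up in the generated artist names.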

The song titles came out pretty great (as in, every batch I generated had at least a few funny ones), but the artist names remained problematic, so the next thing I tried was splitting the artists into groups and people. This seems to be generating even better results, but splitting the list of artists (which I generated with a throwaway plugin for quodlibet, see below) turns out to be a lot of work. I did a few hundred manually, and the results below are quite cool, but there's a lot of partial duplication. Perhaps I shall finally download the musicbrainz data set, and see if I can easily generate separate lists of people and groups/bands from that. (HINT: if someone has such lists lying about, I would be immensely grateful if you could mail them to thisfred at gmail. That would be me.)

Anyway, here's a single unedited run of the script:

Jack Planter - First true love will die
The Islandry The Mountals - And We Wake Up (live acoustic)
Eef Bunyan - 08-Welcome to Rock Remix
Luna - The Deacon (Duke Dumont Remix)
Jolie Newsom - What A Christmas Duel
Dj Funky Banhardo Villah Priest - King Of The Enemy
The Ian Mouse - Carol Of The Goober Woobers
Robby Brokend - The Same Machine
Princeformeroon) - Long Dark Blues
Joha - What The Fuck Out
Cartripes Young Lips - How Blue You Can Live Without You
Rude Stooges - I Called Out Your Window?
El Perrown - And a She Wolf (Moto Blanco Radio Edit)
Jana Talmann - When My Broken Shield
Flip Kowlie Newsom - Lookin' For A Propulsion Device Based On Heim's Quantum Theory)
Kelle Shock - Used to Hate Us)
Billiams - Forever On The Verge
Iggy Polly - Extraball ft. Amanda Blank
Marton - We Are Decided
...And thers of Leon & Kypski - One Of These Days (Clifton Chenier cover)
Misse Dayton - Home On The Edge
Whisperdrag - Son of Rio Mix (Single Version)
Raftwer - Song A Day Another Day
Stard, Run Run - Begin to See What I Meant to Be Glad
Asobius Pip - Music Sounds Better with You
Ill Girl Seed - Never an Easy Way to the Center of the South
The Weeper Girls - Mind How You Feel It?
Flip Kowlie Newsom - Sondre Lerche - To Plant A Seed.mp3 [Unknown]
Alexand - Far Cry (live in the Jungle
Rocker - Veins To The City
Sean Lionhearthan - put it on the Dancefloor (John B Remix)
Trail Riot - Islands In The Gale/Josephine
Williott - Make It Home
Death - Section 7 (Hanging Around the Christmas Tree On Fire (Holy Ghost! Remix)
C-Monobotix - Never Make a Noise
MC Ricard Coxon - Hell - Part Four
Williams Jebeniana Nastarr - Drop Some Silver In The Dead
Williott - Le Le
Alexandt - Pop song for our City
Beth Lakemann - A Friend (That I've Never Understood
Corns of Happies - The Only Healer - Featuring Caroline Schutz Of Folksongs For The Winter
The Naturday - Awoken By a Horse
The Plaza Cent - Like a Mama
Emma Pop - Mogwai and Summer Walks
Ra Ricord Citi 80 - I.C. Y'All (feat. Busta Rhymes, Raekwon & Lil Wayne)
Wints Sected The Tegenwoording Cooks - To Save You
Madow - I Sold My Hands Are Made
Read - 94b Christmas in July
Franco et and - Can't Turn You Into the Pit
The Do Roots - Remember When I Was a Lover
Killalobos - How We Do Is Wrong
David Krauss - Remedy (A1 Bassline Remix) - L2
Eringfielson - Rock The Beach (Neil Young Cover Live 8/15/2003)
Mitch Harcourtis Pilar - Bathroom Gurgle (Duke Dumont Ode To Todd Mix)
Shape - Lost in the game (pt 1)
Tweakes - If You've Got Hopes
Steve Elliams - Dis policeman keeps on kicking me to the Mardi Gras In New Orleans
Billaloner - Motown Never Sounded So Good They Named it Thrice
Van Lidbo - Take me Down
Elastin Trainfully Bessy Bean Moby Grape. - How's The World Can Stop Me Worryin' Bout That Girl
Jural - Long Live The Fallen Aristocracy.mp3 [Unknown]
The Machiefs - Hunt Like the Real Thing
Case - Back In Your Window
Bennie Hollalobos - Shoes (A Bang Gang Remixxx)
Foung Afteras - It Aint Me Babe.mp3 [Unknown]
Dar Willalobotnicks - You Still Believe In Christmas
Anthony Robinson - The Pink Wig To My World Fell Down (Single Version)
erlights - Fall From Your Bed
Eddie Kennor - Last Kiss (Originally recorded by J. Bryson & 1st draft by Zaki Ibrahim)
Chrissy Elliams - Sunday Kind of Chill
Erlendrick Ense - Get Up I Feel For You
Mic Spareck Plan - Walking With a Mixx
Jay Bird - I'm In Your Area
Neko Catra - More Like It
Сергей Шнургей Шнургей Шнургей Шнургей Шнуров - Back of the Dead
Doctors - What Once Was Will Be Free
Emmy Cliff - We Are Golden (Jokers of the seasons
Sean Lionheart - CrowdedHouse - Something Special
Lesbian Cobra - If I Got 5 On It (Clean Edit)
Del Maar - ...Has A Way
Brooks Stra - The Hazards of Love
Chiness Candy - Brahms: Studies, Anh 1A/1 - Presto - Allegro con spirito
Hermanna Nadle - The Other Version (ft. Kid Cudi)
Tokyo Police - Got It and Grab It
A Silents - Your Ex-Lover Is Dead (Remaster)
Tigerince - Now I'm Here You're There (Mexicans with Guns Remix)
Page Fays - I Won't Be That Way
J Dieneman - Got To Make You Strong
The Walkmena Vistener - Someone to Love You Until My Veins Again
Digable Strung - Standing At the speed of life
Willalobos - Sinatra - It Was
Wayne Staalendricks - Steven McCauley for President (Exclusive)
Bonna - Devil Made Us Do It Again
Palaxy - small town (live)
Territsen - We Got The Money I've Got It Bad and Young Jeezy: National Anthem
Shears - A Lonely Construction Worker
Garvie - Walking On A Cloud Of Smoke and Sassafras
Broobinski - Lords of The World, Jonah
Williams - Lake Shore Drive (Todd Terje Edit)
Two Bassibles - Madmen's Discotheque (Disconet Casey Jones (On The Road

As you can see, there are a lot of names that are too close to real names to be interesting, and quite a few common patterns. There are also broken parentheses, because (my implementation of) Markov chains is only mildly context sensitive. (See how I used that term in an actual sentence? Totally worth it, that education.) Also, I can't guarantee that none of these titles or artist names are actually real: the script only rejects exact matches against its source lines, so a trivial difference in case, white space or punctuation slips through, and a real artist who isn't in my source data set can't be detected at all.
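Both problems could probably be reduced with a filter applied to each generated line. The following is a rough sketch of that idea, not something the script does: `balanced` and `normalize` are hypothetical helpers, and the normalization rule (keep only letters and digits) is just one assumption about what counts as a trivial difference.

```python
import re

def balanced(text):
    """Return True if every '(' has a matching ')' in the right order."""
    depth = 0
    for ch in text:
        if ch == '(':
            depth += 1
        elif ch == ')':
            depth -= 1
            if depth < 0:  # a ')' appeared before its '('
                return False
    return depth == 0

def normalize(text):
    """Lower-case the text and drop everything except letters and digits."""
    return re.sub(r'[\W_]+', '', text.lower())

def acceptable(line, known):
    """Accept a line with balanced parentheses that is not in `known`.

    `known` is assumed to be a set of already normalized source lines.
    """
    return balanced(line) and normalize(line) not in known
```

A caller would build `known` once, for example with `set(normalize(l) for l in source_lines)`, and keep generating until `acceptable` returns True. It still can't catch real names that are missing from the source data.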

And here's the code that generated it:


import random

class Markov(object):

    def __init__(self, words=False):
        self.db = {}
        self.lines = set([''])
        self.words = words
        if words:
            self.prevs = 2
        else:
            self.prevs = 3

    def process_file(self, filename):
        with open(filename, 'r') as file:
            for line in file:
                self.process_line(line)

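    def process_line(self, line):
        # Strip surrounding whitespace, including the file's newline.
        line = line.strip()
        self.lines.add(line)
        # Start-of-line state: self.prevs placeholders.
        prevs = [None] * self.prevs
        # Tokens are words or single characters. '\n' marks the end of the
        # line, even when the file's last line has no trailing newline.
        if self.words:
            tokens = line.split()
        else:
            tokens = list(line)
        tokens.append('\n')
        for token in tokens:
            # Record which token followed the current state.
            self.db.setdefault(tuple(prevs), []).append(token)
            prevs.append(token)
            prevs = prevs[1:]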

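    def generate_line(self):
        # '' is always in self.lines, so the outer loop runs at least once.
        line = ''
        tries = 0
        # Retry while the result is a verbatim source line; after 100 tries,
        # give up and return the last attempt anyway.
        while line.strip() in self.lines and tries < 100:
            tries += 1
            # Begin at the start-of-line state.
            prevs = [None] * self.prevs
            line = ''
            while True:
                # Pick a successor of the current state; duplicates in the
                # list make frequent successors more likely.
                char = random.choice(self.db[tuple(prevs)])
                if char == '\n':
                    break
                prevs.append(char)
                prevs = prevs[1:]
                line += char
                if self.words:
                    line += ' '
        return line.strip()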

n = Markov()
n.process_file('names.txt')

g = Markov()
g.process_file('groups.txt')

t = Markov(words=True)
t.process_file('titles.txt')

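for i in range(100):
    # Pick the people chain or the groups chain at random for the artist.
    x = random.choice([n, g])
    # Parenthesised so it runs under both Python 2 and Python 3.
    print(x.generate_line() + ' - ' + t.generate_line())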

[Edit]: removed a redundancy left by earlier refactoring.

And here's the dead simple quodlibet plugin, just to show how cool quodlibet is. Note that quodlibet, unlike for instance the also quite nice Rhythmbox, would allow you to do this (or much more interesting things) with *any* id3 tag, including ones you make up yourself:

import os
import const
from plugins.songsmenu import SongsMenuPlugin

class AddToListPlugin(SongsMenuPlugin):
    PLUGIN_ID = "Export artist list"
    PLUGIN_NAME = _("Export artist list")
    PLUGIN_DESC = _("Add artist name to artists.txt.")
    PLUGIN_ICON = "gtk-find-and-replace"
    PLUGIN_VERSION = "0.1"

    def player_get_userdir(self):
        """get the application user directory to store files"""
        try:
            return const.USERDIR
        except AttributeError:
            return const.DIR

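    def plugin_songs(self, songs):
        path = os.path.join(self.player_get_userdir(), "artists.txt")
        artists = set()
        # Append, so repeated runs accumulate; the with block closes the file.
        with open(path, 'a') as f:
            for song in songs:
                artist = song("artist")
                # Skip artists already written during this run.
                if artist in artists:
                    continue
                artists.add(artist)
                f.write('%s\n' % artist)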