Microsoft Research

Data-Driven Exploration of
Musical Chord Sequences


Eric Nichols

Indiana University

Dan Morris

Microsoft Research

Sumit Basu

Microsoft Research

We present data-driven methods for supporting musical creativity by capturing the statistics of a musical database. Specifically, we introduce a system that supports users in exploring the high-dimensional space of musical chord sequences by parameterizing the variation among chord sequences in popular music. We provide a novel user interface that exposes these learned parameters as control axes, and we propose two automatic approaches for defining these axes. One approach is based on a novel clustering procedure, the other on principal components analysis. A user study compares our approaches for defining control axes both to each other and to an approach based on manually-assigned genre labels. Results show that our automatic methods for defining control axes provide a subjectively better user experience than axes based on manual genre labeling.

Our IUI 2009 paper: [ pdf ]

Nichols E, Morris D, Basu S. Data-Driven Exploration of Musical Chord Sequences. Proceedings of Intelligent User Interfaces (IUI) 2009, February 2009.

Our Chord Transition Matrices

Our paper discusses the automatic and semi-automatic generation of chord transition matrices from databases of popular music. We have made those transition matrices available for readers here. The full set of transition matrices, along with an explanation of their format, is available in a single zip file:

nichols et al transitions.zip

The remainder of this page presents the format of these matrices; this description is identical to the one included in the zip file.

Songsmith Compatibility

To enable experimentation with the data presented on this page, note that these data files (except for the PCA-based files) are in the same format as those used by the Microsoft Research Songsmith automatic accompaniment system, which implements the HMM-based chord-generation system described in Simon et al., 2008. Songsmith uses a separate set of transition matrices, which represent aggregate data from several genres, hand-tuned for the product. For convenience and comparison, we also include the Songsmith transition matrices here (they are included with the Songsmith product as well):

Songsmith major-key transition matrix
Songsmith minor-key transition matrix



FILES


The four directories contained here correspond to the four experimental conditions described in our paper. Each directory contains the corresponding data for the four control axes used in that condition, in ASCII text files whose format is described below:

Genre (transition matrices derived from labeled genre data)

Genre+AbsDiff (transition matrices derived from our AbsDiff clustering routine using genre labels as cluster seeds)

PCA (principal components analysis of all chord transitions in our database)

Random+AbsDiff (transition matrices derived from our AbsDiff clustering routine using random seeds)




FILE FORMATS


All of the files contained here describe transition probabilities among chords. All songs were transposed into the key of C before analysis. Our dictionary of chords includes five triad types (major, minor, diminished, augmented, suspended), so we define 60 chords, one for each triad type on each of the 12 possible roots, numbered as follows:

1 = C Major
2 = C Minor
3 = C Dim
4 = C Aug
5 = C Suspended
6 = C# Major
7 = C# Minor
...
59 = B Aug
60 = B Suspended
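
The numbering above can be sketched as a small lookup in Python. This is only an illustration: the page lists the C, C# and B roots explicitly, so the sharp spellings of the intermediate roots (D#, F#, and so on) are an assumption, and the function names `chord_name` and `chord_index` are hypothetical.

```python
# Root spellings between C# and B are assumed (chromatic order, sharps only);
# the page itself elides them with "...".
ROOTS = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]
TRIADS = ["Major", "Minor", "Dim", "Aug", "Suspended"]


def chord_name(index):
    """Return the chord name for a 1-based chord index (1..60)."""
    if not 1 <= index <= len(ROOTS) * len(TRIADS):
        raise ValueError(f"chord index out of range: {index}")
    # Five consecutive indices share a root; the remainder picks the triad.
    root, triad = divmod(index - 1, len(TRIADS))
    return f"{ROOTS[root]} {TRIADS[triad]}"


def chord_index(root, triad):
    """Return the 1-based chord index for a root and triad type."""
    return ROOTS.index(root) * len(TRIADS) + TRIADS.index(triad) + 1
```

For example, `chord_name(7)` gives "C# Minor" and `chord_index("B", "Aug")` gives 59, matching the list above.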

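Additionally, we define two special chords that represent "start of song" and "end of song", so the probability of each chord starting and ending a song can be reflected in our data. These two are not numbered among the 60 chords; they appear in the files as separate start and end vectors.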

Except in the PCA files described below, all probabilities are represented as _log_-probability values, so no value is greater than zero.
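
One practical consequence of storing log-probabilities is that the score of a whole chord sequence is a sum rather than a product. The sketch below assumes the matrices are read as a first-order Markov chain over chords, with the start and end vectors standing in for the two special chords. The names `sequence_log_prob`, `start`, `trans` and `end` are hypothetical and not part of the data files.

```python
def sequence_log_prob(chords, start, trans, end):
    """Sum the log-probabilities along a chord sequence.

    chords: non-empty list of 1-based chord indices
    start:  start[i] = logP(chord i+1 appearing at the start of a song)
    trans:  trans[i][j] = logP(chord i+1 -> chord j+1)
    end:    end[i] = logP(chord i+1 appearing at the end of a song)
    """
    if not chords:
        raise ValueError("chord sequence must not be empty")
    total = start[chords[0] - 1]
    # Add one transition term per adjacent pair of chords.
    for current, following in zip(chords, chords[1:]):
        total += trans[current - 1][following - 1]
    total += end[chords[-1] - 1]
    return total
```

The result is in whatever log base the files use, since the function only adds the stored values.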

For all conditions other than the PCA condition, each text file contains one transition matrix, in the following format:

n (# chords, always 60)
logP(chord 1 appearing at the start of a song)
logP(chord 2 appearing at the start of a song)
....
logP(chord 60 appearing at the start of a song)
n n (matrix size, always 60 60)
logP(chord 1 -> chord 1)
logP(chord 1 -> chord 2)
logP(chord 1 -> chord 3)
....
logP(chord 2 -> chord 1)
logP(chord 2 -> chord 2)
...
logP(chord 60 -> chord 59)
logP(chord 60 -> chord 60)
n (# chords, always 60)
logP(chord 1 appearing at the end of a song)
logP(chord 2 appearing at the end of a song)
...
logP(chord 60 appearing at the end of a song)
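
A minimal reader for this layout might look like the following sketch. It assumes the files contain only whitespace-separated numbers, that is, that the parenthesized annotations shown above are explanatory and do not appear in the data itself. The function name `parse_transition_model` is hypothetical.

```python
def parse_transition_model(text):
    """Parse one non-PCA transition file into (start, trans, end).

    Assumes `text` holds only whitespace-separated numbers in the order
    described above: n, n start values, "n n", n*n transition values in
    row-major order, n, n end values.
    """
    tokens = iter(text.split())

    def read_int():
        return int(next(tokens))

    def read_floats(count):
        return [float(next(tokens)) for _ in range(count)]

    n = read_int()
    start = read_floats(n)
    rows, cols = read_int(), read_int()
    if (rows, cols) != (n, n):
        raise ValueError(f"expected a {n} x {n} matrix, got {rows} x {cols}")
    # Row i holds the transitions out of chord i+1.
    trans = [read_floats(cols) for _ in range(rows)]
    n_end = read_int()
    if n_end != n:
        raise ValueError(f"expected {n} end values, got {n_end}")
    end = read_floats(n_end)
    return start, trans, end
```

To load a file, pass its contents as a string, for example `parse_transition_model(open(path).read())`, where `path` points at one of the text files in the non-PCA directories.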

The files in the 'pca' directory, which correspond to the PCA-based axes presented in our paper, are in the following format:

variance
n (# chords, always 60)
P(chord 1 appearing at the start of a song)
P(chord 2 appearing at the start of a song)
....
P(chord 60 appearing at the start of a song)
n n (matrix size, always 60 60)
P(chord 1 -> chord 1)
P(chord 1 -> chord 2)
P(chord 1 -> chord 3)
....
P(chord 2 -> chord 1)
P(chord 2 -> chord 2)
...
P(chord 60 -> chord 59)
P(chord 60 -> chord 60)
n (# chords, always 60)
P(chord 1 appearing at the end of a song)
P(chord 2 appearing at the end of a song)
...
P(chord 60 appearing at the end of a song)

Note that, unlike the files for the other conditions, these files contain transition probabilities, not log-probabilities, so they can be combined directly without exponentiating. Each file also begins with an additional line specifying the variance for that component. We provide four principal components in the files pca_transmodel_[1-4].txt, and the mean (to which scaled principal component values are added to produce a transition matrix) in the file pca_transmodel_mean.txt.
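
As a rough illustration of "scaled principal component values are added to the mean", the sketch below combines already-parsed matrices element by element. The function name `combine_pca` and the `weights` argument are hypothetical. This page does not specify how the scale factors are chosen or how they relate to the variance line, so the weights are simply treated as caller-supplied numbers.

```python
def combine_pca(mean, components, weights):
    """Return mean + sum_k weights[k] * components[k], element by element.

    mean:       matrix as a list of rows (e.g. 60 x 60)
    components: list of matrices with the same shape as `mean`
    weights:    one scale factor per component (caller-supplied assumption)
    """
    if len(components) != len(weights):
        raise ValueError("need exactly one weight per component")
    # Copy the mean so the caller's matrix is left untouched.
    result = [row[:] for row in mean]
    for component, weight in zip(components, weights):
        for i, row in enumerate(component):
            for j, value in enumerate(row):
                result[i][j] += weight * value
    return result
```

Because this is a plain linear combination, the result is not guaranteed to stay within [0, 1] for large weights. The same function can be applied to the start and end vectors by wrapping each one as a single-row matrix.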

