
Tuesday, June 16, 2015

Digging Into Questions About Entropy

Here's a graph that represents two things. First, a lot of work completed. Second, a lot of work that needs to be done!

These three curves are the bits of entropy per sliding window location in a work by Buxtehude (Prelude and Fugue in G Minor). The width of the window is set by the Kemeny constant of the MIDI track. There are three tracks: Swell, Great, and Pedal. 
On a pipe organ, the main manual (keyboard) is called the Great. It is usually the bottom manual on two-manual instruments, or the middle manual on three-manual instruments. The upper manual is called the Swell. The Pedal usually carries the very lowest notes in the piece and is played with the feet on the pedalboard. 
Each of these tracks represents the music that would be played on the corresponding part of the instrument. Each part can be voiced on a completely separate rank of pipes, creating a layered sound. 
In the MIDI version, each of the tracks is examined mathematically. First, a Markov chain is derived. This is a table that shows how likely each particular note is to follow any other. Starting with the first note and going all the way to the end, every note-to-note transition is recorded, and then the probabilities are calculated.
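The counting step can be sketched in a few lines of Python. This is an illustration, not the project's actual code, and the function name and toy note list are made up:

```python
from collections import defaultdict

def transition_table(notes):
    """Count note-to-note transitions, then normalize each row
    into probabilities (a first-order Markov chain)."""
    counts = defaultdict(lambda: defaultdict(int))
    for a, b in zip(notes, notes[1:]):
        counts[a][b] += 1
    table = {}
    for a, followers in counts.items():
        total = sum(followers.values())
        table[a] = {b: n / total for b, n in followers.items()}
    return table

# Toy example: MIDI note numbers for a short phrase.
notes = [60, 62, 64, 62, 60, 62, 64, 64]
print(transition_table(notes))
```

Each row of the resulting table answers the question "given this note, what is the probability of each possible next note?"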
Next, the Kemeny constant is found. This is the expected number of steps from a starting note to a note drawn at random from the Markov chain's stationary distribution. No matter which starting note is selected, it takes about the same number of steps to reach that randomly selected note. This number of steps is the width of our sliding window. The window slides over the track, and at each position the entropy of the windowed sample is measured.
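Both quantities can be sketched with numpy. This is a minimal illustration, not the project's code: the Kemeny constant is computed here from the eigenvalues of the transition matrix (one standard formulation, with the convention that the "passage time" to the current state is zero), and the window entropy is plain Shannon entropy in bits:

```python
import numpy as np

def kemeny_constant(P):
    """Kemeny constant of transition matrix P: the expected number of
    steps to reach a state drawn from the stationary distribution,
    regardless of the starting state. Equal to the sum of 1/(1 - lambda)
    over the non-unit eigenvalues of P."""
    eig = np.linalg.eigvals(P)
    # Drop the eigenvalue closest to 1 (the Perron eigenvalue).
    eig = np.delete(eig, np.argmin(np.abs(eig - 1.0)))
    return float(np.sum(1.0 / (1.0 - eig)).real)

def window_entropy(notes, width):
    """Shannon entropy (bits) of each sliding-window sample."""
    out = []
    for i in range(len(notes) - width + 1):
        window = notes[i:i + width]
        _, counts = np.unique(window, return_counts=True)
        p = counts / counts.sum()
        out.append(float(-(p * np.log2(p)).sum()))
    return out
```

For a fair two-state coin chain the Kemeny constant comes out to 1.0, which matches the hand calculation.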
What we're looking for is places where the entropy changes dramatically. This would potentially indicate a local change in the entropy of the piece, which may indicate a compositional change or transition in the work. Identifying macro-phrases like this may be helpful in constructing algorithmic compositions that better emulate human composition. 
As you can tell from the graph, the tracks do not line up. The Pedal track is much shorter than the Great, which is shorter than the Swell. Therefore, the number of windows evaluated is not the same across the three tracks, and the samples are not aligned in time if they are simply listed along the horizontal axis. The samples need to be normalized for observed time. This (using the timestamps in the MIDI file to align the samples) is the next task in the design of this part of the software. 

Monday, November 3, 2014

Modeling Rests in Composed Music

There are at least two types of rests in music. The first are the ones the composer wrote into the score. The second are the ones that naturally occur during playing. Musicians pause, extend, chop, attack, cheat, and move notes around within the measure in order to emote, interpret, or express.

There are many more of the second type of rests in human-performed music than the first. The first type are represented on the score, but the second type makes an enormous difference in how the music is perceived stylistically. Recognizing, categorizing, and modeling both types of these rests is a goal of Organ Donor, with the expectation that introducing proper amounts of "space" into algorithmically produced music will create music that sounds more like a human is playing it. Being able to create different models of resting based on desired style would be a very powerful and useful result.

Another area of investigation is the minimum return distance from root notes, or mean first passage times (MFPT). I suspect these statistics might have some utility for creating believable phrasing - or for uncovering patterns that reveal other hidden structures in composed music. Examining the minimum return distance for both types of rests, as well as for notes, will help improve our understanding of the role rests play in composition and style.
This area of math (MFPT) is used in a wide variety of fields to answer pragmatic questions about physical systems. There's no guarantee that it will provide repeatable or useful results in music, but we have the tools (thanks to the python library pykov) to start the process of finding out.
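While the project uses pykov for this, the underlying calculation can be sketched directly with numpy via the classic fundamental-matrix construction (Kemeny and Snell). The function name and example matrix below are illustrative only:

```python
import numpy as np

def mean_first_passage_times(P):
    """Mean first passage times M[i, j] for an ergodic chain with
    transition matrix P, via the fundamental matrix Z:
    M[i, j] = (Z[j, j] - Z[i, j]) / pi[j], with M[i, i] = 0 here."""
    n = P.shape[0]
    # Stationary distribution: left eigenvector of P for eigenvalue 1.
    vals, vecs = np.linalg.eig(P.T)
    pi = np.real(vecs[:, np.argmin(np.abs(vals - 1.0))])
    pi = pi / pi.sum()
    W = np.tile(pi, (n, 1))                # every row is pi
    Z = np.linalg.inv(np.eye(n) - P + W)   # fundamental matrix
    return (np.diag(Z)[None, :] - Z) / pi[None, :]

# Toy chain: state 0 always moves to 1; state 1 is a fair coin.
P = np.array([[0.0, 1.0],
              [0.5, 0.5]])
print(mean_first_passage_times(P))
```

For the toy chain, the expected time from state 0 to state 1 is 1 step, and from state 1 back to state 0 is 2 steps, which the matrix reproduces.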
Here are some results from parsing MIDI files of performed music. The rests captured don't exist in the score. Some notes are staccato, and you can see the additional "distance" the musician inserted to create the desired effect, as well as the extra distance punctuating the end of the phrase. This is the second violin part from the second movement of Beethoven's 7th.
Note 64 had tick duration 409
a rest had tick duration 66
Note 64 had tick duration 110 (staccato note)
a rest had tick duration 138 (results in longer space between notes)
Note 64 had tick duration 116 (staccato note)
a rest had tick duration 121 (results in longer space between notes)
Note 64 had tick duration 410
a rest had tick duration 67
Note 64 had tick duration 392 (ended a bit early, end of a phrase)
a rest had tick duration 91
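The bookkeeping behind a log like this can be sketched as follows. In the real project the events would come from a MIDI parsing library; here the track is just a list of (kind, delta_ticks, note) tuples, and the function assumes a monophonic part (one note at a time, like a single violin line), so this is an illustration rather than the actual parser:

```python
def note_and_rest_durations(events):
    """events: (kind, delta_ticks, note) tuples in track order, where
    kind is 'on' or 'off' and delta_ticks is the time since the
    previous event. Returns note and rest durations in ticks."""
    out = []
    now = 0
    start = None
    last_end = None
    for kind, delta, note in events:
        now += delta
        if kind == 'on':
            if last_end is not None and now > last_end:
                out.append(('rest', now - last_end))  # gap not in the score
            start = now
        elif kind == 'off' and start is not None:
            out.append(('note', note, now - start))
            last_end = now
            start = None
    return out

# A staccato pair like the log above: a 110-tick note, a 138-tick gap.
events = [('on', 0, 64), ('off', 110, 64), ('on', 138, 64), ('off', 116, 64)]
print(note_and_rest_durations(events))
# [('note', 64, 110), ('rest', 138), ('note', 64, 116)]
```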
Modeling music requires obedience to aesthetics, and this is where the difficulty - possibly the impossibility - lies. However, I cannot think of anything more worthy of analysis, modeling, machine learning, and algorithmic design. More soon!

Monday, October 20, 2014

Independent Parallel Tracks and Hidden Markov Models

OK, so now we have code that analyzes each track of a multi-track MIDI file and creates transition tables. These transition tables are used to generate new music that has some influence from the analyzed track. 

However, each new track is generated independently. While the behavior of each new track is based completely on the statistics of the analyzed track, the tracks will not sound like they were "composed together" when they are recombined. 

Organ Donor's Frank Brickle says, "To keep them together you actually need to model the interaction in some way. Looking ahead a bit, you can probably see why a hidden Markov model of all the activity is one of the best ways of coordinating the subordinate parts."

OK, so what does this mean? 

A hidden Markov model is one in which the observation and the state are separated. The simplest example is a coin. Usually you see the coin and can read whether it came up heads or tails. In a hidden Markov model the coin is hidden, as if behind a screen, and the observations (heads or tails) are read out to an audience (or user, or participant, or contestant). 

One of the jobs of the contestant is to figure out how many states are required to best explain the observations. For example, for five minutes the coin flipping produced about half heads and half tails. Then it suddenly changed, and the observations were mostly tails for four minutes. Then mostly heads for three minutes. Then it went back to a fair distribution for the rest of the session.

One way to explain this is with three coins: a fair coin, a heads-heavy coin, and a tails-heavy coin. The person behind the screen switched from one coin to another and read off the resulting observations. The number of states in this hidden Markov model would be three. Each coin is a state, each state has an alphabet of two possible values, and each state's alphabet has its own probability distribution.
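The three-coin process can be simulated in a few lines. All of the probabilities below are made up for illustration; the point is only that the state sequence is never emitted, just the H/T observations:

```python
import random

# Emission probabilities: chance of heads for each hidden coin.
P_HEADS = {'fair': 0.5, 'heads-heavy': 0.9, 'tails-heavy': 0.1}

# Transition probabilities between hidden states (each coin mostly stays).
TRANS = {
    'fair':        [('fair', 0.90), ('heads-heavy', 0.05), ('tails-heavy', 0.05)],
    'heads-heavy': [('heads-heavy', 0.90), ('fair', 0.10)],
    'tails-heavy': [('tails-heavy', 0.90), ('fair', 0.10)],
}

def sample(n, state='fair', seed=0):
    """Generate n coin flips from the hidden process. Only the H/T
    string is visible to the contestant; the state sequence is hidden."""
    rng = random.Random(seed)
    obs = []
    for _ in range(n):
        obs.append('H' if rng.random() < P_HEADS[state] else 'T')
        r, acc = rng.random(), 0.0
        for nxt, p in TRANS[state]:
            acc += p
            if r < acc:
                state = nxt
                break
    return ''.join(obs)

print(sample(60))
```

The contestant's inference problem runs the other way: given only the output string, recover how many coins there were and when they switched.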

I believe our job is to figure out how to keep the tracks working together when new music is created. Analyzing each track separately stays in the toolbox, but analyzing the entire piece, and using that analysis to coordinate the production of new tracks must be done as well. 

Tuesday, October 7, 2014

First New Music from Bach Violin Solo - Quick Sample

Here's the first simple example of the sort of files we're trying to produce for Organ Donor.

This file was created by taking the statistics of a Bach violin solo (the Gigue from Partita No. 2) and analyzing them in the following way. Each note was examined (with a software program) to find out which note followed it. After all the notes were counted up, we calculated the probability of each note following any particular note that had occurred in the piece.

We then picked a random note and "rolled the loaded dice" to see which note to go with next. Once we saw which note came up, we did it again, until we had collected a 100-note-long sample.
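The dice-rolling step looks roughly like this. The function name and the toy transition table are hypothetical, and the table is assumed to be in the form {note: {next_note: probability}}:

```python
import random

def generate(table, length, seed=None):
    """'Roll the loaded dice': start on a random note, then repeatedly
    draw the next note according to its transition probabilities."""
    rng = random.Random(seed)
    note = rng.choice(list(table))
    out = [note]
    while len(out) < length:
        followers = table.get(note)
        if not followers:                  # a note nothing ever followed
            note = rng.choice(list(table))
        else:
            notes, weights = zip(*followers.items())
            note = rng.choices(notes, weights=weights)[0]
        out.append(note)
    return out

# Toy table: C walks to D, D walks to C or E, E walks back to D.
table = {60: {62: 1.0}, 62: {60: 0.5, 64: 0.5}, 64: {62: 1.0}}
print(generate(table, 100))
```

The output wanders through the same local note-to-note moves as the source, which is why it sounds related to the original without ever repeating it.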

For the original piece that was analyzed, here's Hilary Hahn playing it:
https://www.youtube.com/watch?v=7eXzlg2Xcgg

For our very simplified example song:
http://www.delmarnorth.com/audio/bach_nmo.mid

