5 Preludes on TonicNet Synthetic Chorales #3

These preludes are based on four of the synthetic chorales that I manufactured using the TonicNet GRU deep neural network model. They are in A minor, C major, A minor, and C major, and the first is repeated at the end in A minor. Each uses a different arpeggiation matrix. The tempos are based on how many notes are in the synthetic chorale: with more notes, the tempo is faster; with fewer notes, it’s slower. I made 500 chorales, and then selected some that were of a moderate length. I looked for those in complementary major and minor keys, settling on A minor and C major. Different selection criteria would have produced different results. This is referred to in the deep learning literature as “cherry picking”. But I’m sure Bach would approve.

Score Excerpt

Fantasia on some TonicNet Chorales #1

This one uses five synthetic chorales manufactured by the TonicNet model, all in the key of F# minor. It’s scored for solo finger piano. There’s a short pause between each chorale. The tuning is Kellner’s Well Temperament. I took the idea from Bach’s Well Tempered Clavier Prelude #1, where he moved through a set of chords one measure per chord. That’s what I do with the synthetic chorale. First I extend it to 4 times its normal length, then arpeggiate it with a matrix of 24 1/16th notes, with a zeros:ones ratio of 9:4. This ensures that there are more zeros than ones, so more notes are set to zero and therefore missing. This results in some interesting arpeggiations. I also extend some of the octaves up and down a bit. It’s kind of like what you would get if Philip Glass used Bach chord progressions instead of his own unique ones.

Fantasia on some chorales made by TonicNet #14

Today’s submission is based on a chorale synthesized by TonicNet, created by Omar Peracha. His paper, “Improving Polyphonic Music Models with Feature-Rich Encoding” (26 Nov 2019), uses a type of deep neural network called the Gated Recurrent Unit to generate very nice Bach chorales. Here is the Paper and Code.

He used that network to create a database of 500 synthetic chorales in his next paper, “JS Fake Chorales: a Synthetic Dataset of Polyphonic Music with Human Annotation” (31 Mar 2022). Here is that paper, code, and a web page that generates a fake chorale while you wait.

TonicNet Architecture

I used some Python code to analyze the resulting 500 MIDI files to find those that met certain criteria:

  • Rather short, around 8 measures
  • Rather high pitch entropy, that is, they frequently contain pitches that are not in the key of the chorale

The top scores went to chorales number 35, 268, 107, and 121, in the keys of G minor, F# major, E minor, and F# major respectively.
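The key of each of the 500 chorales isn't known in advance, so one key-agnostic way to put a number on "pitch entropy" is the Shannon entropy of the pitch-class histogram. This is only a sketch of that idea under stated assumptions, not necessarily the measure used for the ranking above; the function name `pitch_class_entropy` is hypothetical, and zeros are assumed to mean rests.

```python
import numpy as np

def pitch_class_entropy(notes):
    """Shannon entropy (in bits) of the pitch-class histogram.

    `notes` is any integer array of MIDI note numbers; zeros are rests.
    """
    notes = np.asarray(notes).ravel()
    sounding = notes[notes > 0]
    if sounding.size == 0:
        return 0.0
    counts = np.bincount(sounding % 12, minlength=12)
    probs = counts[counts > 0] / sounding.size
    return float(-(probs * np.log2(probs)).sum())

# A strictly diatonic fragment versus a fully chromatic one.
diatonic = np.array([60, 62, 64, 65, 67, 69, 71, 72])
chromatic = np.arange(60, 72)
print(pitch_class_entropy(diatonic))   # 2.75
print(pitch_class_entropy(chromatic))  # log2(12), about 3.585
```

A chorale that wanders outside its key spreads its notes over more pitch classes, which pushes this number up.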

I put them through some of my Python programs that lengthen, repeat, and transform sections based on the pitch entropy of the time steps. This enables me to linger over suspensions and harmonic transitions using different manipulations of the notes.

The piece is scored for a large string orchestra of about 256 string instruments: violins, violas, cellos, and double basses. I include samples of each instrument playing without vibrato, with vibrato, martele, and pizzicato. The piece starts out with everyone playing at the same time, then moves to sections that are only one type of sample. They come back together after several sections to all play at once.

I used a tuning developed by Herbert Anton Kellner, which the Scala repository lists as kellner.scl with the description “Herbert Anton Kellner’s Bach tuning. 5 1/5 Pyth. comma and 7 pure fifths”. Since I didn’t know which keys would end up chosen, I wanted to pick a tuning that could handle almost any key.

There is a separation between each of the four chorales. Just a brief pause.


——————-

Look Down for Finger Piano, Strings, Balloon Drums, and Springs #65

This is another version of the piece I’ve been working on for a while, based on a synthetic chorale generated by the Coconet deep neural network. The original chorale was Ach Gott, vom Himmel sieh darein (Look Down from Heaven; BWV 2.6, K 7, R 262), but it’s gone through the model many times, so it has changed significantly. Some of it remains the same, and shows itself at times.

This version is scored for three quartets of instruments:

  1. First quartet:
    1. Finger Piano
    2. Ernie Ball Super Slinky Guitar
    3. High Balloon Drum
    4. Spring
  2. Second quartet:
    1. Finger Piano
    2. Medium Balloon Drum
    3. Baritone Guitar
    4. Bass Finger Piano
  3. Third quartet:
    1. Low Balloon Drum
    2. Finger Piano
    3. Very long string
    4. Bass Finger Piano

At certain times, all the instruments play together, but more of the time is spent with each quartet playing by itself. Each instrument plays many notes, so the music gets fairly dense at times, and sparse at other times.

How The Star Spangled Banner became “Not the Star Spangled Banner”

This post traces the path that our national anthem took as it went through a neural network and probabilistic algorithms on its journey to becoming some music. It started out as a MIDI file downloaded from the internet. Here are the first few measures as played on a quartet of bassoons with a staccato envelope, realized in Csound.

I then turn it from a nice 3:4 waltz into a 4:4 march by stretching out the first four 1/16th note time steps of each measure to 8 time steps.
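The waltz-to-march step can be sketched in a few lines of NumPy. This assumes the music is already an array of shape (voices, time_steps) with one column per 1/16th note, that each 3:4 measure is 12 columns, and that the stretch is applied to the first beat of every measure. The function name `waltz_to_march` is made up for illustration.

```python
import numpy as np

def waltz_to_march(steps):
    """Turn 3:4 measures (12 sixteenth steps) into 4:4 measures (16 steps).

    `steps` has shape (voices, n_steps), with n_steps a multiple of 12.
    The first beat (4 steps) of each measure is doubled to 8 steps.
    """
    voices, n_steps = steps.shape
    measures = steps.reshape(voices, n_steps // 12, 12)
    first_beat = np.repeat(measures[:, :, :4], 2, axis=2)  # 4 steps -> 8
    rest = measures[:, :, 4:]                               # 8 steps, unchanged
    return np.concatenate((first_beat, rest), axis=2).reshape(voices, -1)

one_measure = np.arange(1, 13).reshape(1, 12)  # stand-in data, one voice
print(waltz_to_march(one_measure))
# [[ 1  1  2  2  3  3  4  4  5  6  7  8  9 10 11 12]]
```

Repeating each column twice doubles the duration of whatever is sounding there, because a value that stays the same from one step to the next is read as a held note.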

The MIDI files are transformed into a set of time steps, each 1/16th note in length. Each time step has four voices, soprano, alto, tenor, and bass, with a MIDI note number for every voice sounding at that time. Zeros represent a silence. The transformation into time steps has some imperfections. In the time step data structure, the assumption is that if the time step contains a non-zero value, it represents a MIDI note sounding at that time. If the MIDI number changes from one step to the next, that voice is played. If it is not different from its predecessor, it holds the note for another 1/16th note. In the end we have an array of 32 time steps with 4 MIDI note numbers in each, but some are held, some are played, and others are silent.
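The played, held, and silent rule can be written down directly. This is a minimal sketch assuming a (voices, time_steps) integer array; the name `note_events` and the three labels are hypothetical.

```python
import numpy as np

def note_events(steps):
    """Label each cell of a (voices, time_steps) array.

    Returns an array of the same shape holding 'rest', 'play' or 'hold'.
    """
    steps = np.asarray(steps)
    # Shift every voice right by one step; the step before the start is silence.
    previous = np.concatenate((np.zeros_like(steps[:, :1]), steps[:, :-1]), axis=1)
    events = np.full(steps.shape, "hold", dtype="<U4")
    events[steps != previous] = "play"   # pitch changed: a new note starts
    events[steps == 0] = "rest"          # zero always means silence
    return events

soprano = np.array([[60, 60, 62, 0, 62, 62]])
print(note_events(soprano)[0].tolist())
# ['play', 'hold', 'play', 'rest', 'play', 'hold']
```

Under this rule, two consecutive notes of the same pitch can't be told apart from one held note, which may be one of the imperfections mentioned above.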

The next step is to take that march, and chop it up into seventeen separate 32 1/16th-note segments. This is necessary because the Coconet deep neural network is expecting four measure chorale segments in 4:4 time. The result is a numpy array of dimension (17 segments, 4 voices, 32 time-steps of 1/16th note each). I then take that structure, and feed each of the seventeen segments into Coconet, masking one voice to zeros, and telling the network to recreate that voice as Bach would have done. I repeat that with another voice, and continue to repeat it until I have four separate chorales that are 100% synthesized. Here is one such 4 part chorale, the first of four.
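Here is a sketch of the chopping and masking, using only NumPy. It assumes a (voices, n_steps) input and pads the tail with silence so the last segment is full. That padding is my assumption, as are the function names `split_into_segments` and `mask_voice`. The call into Coconet itself is left out, since its interface isn't described here.

```python
import numpy as np

SEGMENT_STEPS = 32  # four 4:4 measures of sixteenth notes

def split_into_segments(steps, segment_steps=SEGMENT_STEPS):
    """Cut a (voices, n_steps) array into (segments, voices, segment_steps)."""
    voices, n_steps = steps.shape
    n_segments = -(-n_steps // segment_steps)  # ceiling division
    padded = np.zeros((voices, n_segments * segment_steps), dtype=steps.dtype)
    padded[:, :n_steps] = steps                # zeros at the end are silence
    return padded.reshape(voices, n_segments, segment_steps).transpose(1, 0, 2)

def mask_voice(segments, voice):
    """Return a copy with one voice zeroed out in every segment."""
    masked = segments.copy()
    masked[:, voice, :] = 0
    return masked

march = np.arange(1, 4 * 530 + 1).reshape(4, 530)  # stand-in data, not the anthem
segments = split_into_segments(march)
print(segments.shape)                 # (17, 4, 32)
print(mask_voice(segments, 0)[0, 0])  # the first voice is now all zeros
```

Each masked array would then go to the model, and the regenerated voice would take the place of the zeros before the next voice is masked.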

And another, the last of the four, and so the most different from the original piece.

I create about 100 of these arrays of 16 voice chorales (four 4-voice synthetic chorales), and pick the ones that have the highest pitch entropy. That is, the ones with the most notes not in the original key of C major. It takes about 200 hours of compute time to accomplish this step, but I have a spare server in my office that I can task with this process. It took this one about a week to finish all 100.
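Ranking the candidates by out-of-key notes could look like the sketch below. It assumes each candidate is an integer array of MIDI note numbers with zeros for rests, and it scores by the fraction of sounding notes outside C major. The names `out_of_key_fraction` and `pick_most_chromatic` are hypothetical.

```python
import numpy as np

C_MAJOR = np.array([0, 2, 4, 5, 7, 9, 11])  # pitch classes of the C major scale

def out_of_key_fraction(chorale, scale=C_MAJOR):
    """Fraction of sounding notes whose pitch class is outside `scale`."""
    notes = np.asarray(chorale).ravel()
    sounding = notes[notes > 0]
    if sounding.size == 0:
        return 0.0
    return float(np.mean(~np.isin(sounding % 12, scale)))

def pick_most_chromatic(chorales, keep=3):
    """Indices of the `keep` candidates with the most out-of-key notes."""
    scores = np.array([out_of_key_fraction(c) for c in chorales])
    return np.argsort(scores)[::-1][:keep]

plain = np.array([[60, 64, 67, 72]])   # all in C major
spicy = np.array([[60, 61, 66, 0]])    # C sharp and F sharp are foreign
print(out_of_key_fraction(plain), out_of_key_fraction(spicy))
```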

Then I take one of those result sets, and put it through a set of algorithms that accomplish several transformations. The first is to multiply the time step voice arrays by a matrix of zeros and ones, randomly shuffled. I have control of the percentages of ones and zeros, so I can make some sections more dense than others.
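The shuffled mask can be sketched like this, assuming a (voices, time_steps) array and an element-wise multiply. The function name `arpeggiation_mask` and the `ones_fraction` parameter are made up for illustration.

```python
import numpy as np

def arpeggiation_mask(shape, ones_fraction, rng):
    """Shuffled 0/1 matrix with a fixed share of ones."""
    size = int(np.prod(shape))
    n_ones = int(round(size * ones_fraction))
    flat = np.zeros(size, dtype=int)
    flat[:n_ones] = 1
    rng.shuffle(flat)          # random placement, exact proportion
    return flat.reshape(shape)

rng = np.random.default_rng(seed=0)
# Four voices, each holding one pitch for 16 sixteenth-note steps.
chorale = np.tile(np.array([[60], [55], [52], [48]]), (1, 16))
mask = arpeggiation_mask(chorale.shape, ones_fraction=0.4, rng=rng)
arpeggiated = chorale * mask   # wherever the mask is 0, the note becomes a rest
print(arpeggiated)
```

Raising `ones_fraction` for one section and lowering it for another is one way to make some sections denser than others.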

Imagine a note that is held for 4 1/16th notes. After this process, it might be changed into 2 eighth notes, or a dotted eighth and a sixteenth note. Or a 1/16th note rest, followed by a 1/16th note, followed by a 1/8th note rest. This creates arpeggiations. Here is a short example of that transformation.

That’s kind of sparse, so I can easily double the density with a single line of Python code:

one_chorale = np.concatenate((one_chorale, one_chorale),axis = 0) # stack them on top of each other to double the density

I could have doubled the length instead by using axis = 1.
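The difference between the two axes is easy to see from the shapes, assuming `one_chorale` is laid out as (voices, time_steps). The all-zero array is only a stand-in.

```python
import numpy as np

one_chorale = np.zeros((4, 32), dtype=int)  # assumed layout: (voices, time_steps)

denser = np.concatenate((one_chorale, one_chorale), axis=0)
longer = np.concatenate((one_chorale, one_chorale), axis=1)

print(denser.shape)  # (8, 32): twice the voices, same duration
print(longer.shape)  # (4, 64): same voices, twice the duration
```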

That’s getting more interesting. It still has a slight amount of Star Spangled Banner in it. The next step will fix that.

The second transformation is to extend sections of the piece that contain time steps with MIDI notes not in the root key of the piece, C major. I comb through the arrays checking the notes, and store the time steps not in C. Then I apply many different techniques to extend the amount of time spent on those time steps. For example, suppose the time steps from 16 to 24 all contain MIDI notes that are not in the key of C. I transform steps 16 through 24 by tiling each step a certain number of times to repeat it. Or I might make each time step 5 to 10 times longer. Or I might repeat some of them backwards. Or combine different transformations. There is a lot of indeterminacy in this, but the NumPy library for Python provides ways to ensure a certain amount of probabilistic control. I can ensure that each alternative is chosen a certain percentage of the time.
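Finding the out-of-key time steps and lengthening them might look like the sketch below. It shows only the "make each time step longer" variant, assumes a (voices, time_steps) array with zeros as rests, and uses hypothetical names (`out_of_key_steps`, `stretch_out_of_key`).

```python
import numpy as np

C_MAJOR = np.array([0, 2, 4, 5, 7, 9, 11])  # pitch classes of the C major scale

def out_of_key_steps(chorale, scale=C_MAJOR):
    """Boolean vector: True where any sounding voice is outside `scale`."""
    sounding = chorale > 0
    foreign = ~np.isin(chorale % 12, scale)
    return (sounding & foreign).any(axis=0)

def stretch_out_of_key(chorale, factor, scale=C_MAJOR):
    """Make every out-of-key time step `factor` times longer."""
    repeats = np.where(out_of_key_steps(chorale, scale), factor, 1)
    return np.repeat(chorale, repeats, axis=1)

chorale = np.array([[60, 61, 64],
                    [48, 48, 0]])
print(out_of_key_steps(chorale))       # [False  True False]
print(stretch_out_of_key(chorale, 3))
# [[60 61 61 61 64]
#  [48 48 48 48  0]]
```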

Here is a section that shows some of the repeated sections. I also select from different envelopes for the notes, with 95% of them a very short staccato, and the other 5% some alternative that holds for as long as the note doesn’t change. The held notes really stick out compared to the preponderance of staccato notes. I also add additional woodwind and brass instruments. Listen for the repetitions that don’t repeat very accurately.

There are a lot of other variations that are part of the process.

Here’s a complete rendition:

Not the Star Spangled Banner #15

I used to play in a woodwind quintet in college, and it was a lot of fun. Sometimes a professor would sit in if someone wasn’t available, and we could really get cooking then. My instrument was the clarinet at the time. I wrote some music for the group, but it wasn’t very good. Some of the ugliest music I’d ever heard. It made sense on paper, but when it was played, you could tell that that was the first time it had actually been heard. That’s one reason I really like playing with samples and a laptop. I can instantly hear how terrible my music is sounding at the time I think of it.

Original Score

Today’s music started out as a MIDI file of the Star Spangled Banner, scored for four voices. The music for what is now the U.S. national anthem was written by John Stafford Smith, who wrote it for a musical social group of which he was a member. Later, the brother-in-law of Francis Scott Key mentioned that the poem Key had just finished, originally titled “Defence of Fort M’Henry”, matched the rhythm of the Smith tune. Amateur musician meets amateur poet, and the rest is history. “The Star-Spangled Banner” is very challenging to sing, but that hasn’t stopped many people from making the effort regardless of the challenge.

I pulled a MIDI file of the song, and quickly discovered that it was written in 3:4 time. All the inputs to the deep neural network Coconet must be in 4:4 time, and 32 1/16th notes long. So I set about doubling the duration of the first beat of each measure. There’s some precedent for this. Whitney Houston performed it in 4:4 at the 1991 Super Bowl. It’s charming, in a very relaxed way. I had to do this to continue my technique of feeding existing music into Coconet, and then having the deep learning model generate its own harmonizations.

After obtaining around 100 synthetic Banners, I then selected a few to go through an algorithm that extends the durations of time steps that include notes not in the root key of the song. This process stretches out the interesting parts and rushes through to conventional cadences. Unless they cadence in a chord whose notes are not in the key of C major. All these alterations create something quite unlike the original tune.

I scored it for nine instruments: flute, oboe, clarinet, French horn, bassoon, piccolo, English horn, Bach trumpet, and contrabassoon.

Sacred Head #115

I’ve been trying out lots of modifications to the Sacred Head Fantasia on a Synthetic Chorale. Today’s post is number 115. It’s more dense than before, starting with 24 voices, and then selectively trimming some voices in each of nine sections.

Oh God, Look Down from Heaven #38

I increased the potential number of voices, and added Balloon Drums, Long Strings, and a few more finger pianos. Now it sounds like an orchestra of zithers. Big ones, and giant bass kalimbas. These are all samples from instruments I’ve built over the years. There are times that remind me of Hawaiian slack-key guitars. I adjusted the tuning to Victorian Rational Well Temperament on A♭, since that has a nice B♭ major, and this piece is in D minor, until the final chord with a Picardy third.

What I really like is that this version sounds less like Bach than any of the others.

Oh God Look Down

Oh God, Look Down from Heaven (BWV 2.6, K 7, R 262) Ach Gott, vom Himmel sieh darein #15

This is based on the Coconet model transforming another Bach chorale, Ach Gott, vom Himmel sieh darein (Oh God, Look Down From Heaven). This chorale uses a lot of notes outside the primary key of D minor. Coconet did its best to harmonize it. I scored it for some samples that I made myself, so they don’t have any licensing issues. The instruments are two different finger pianos, one full, and another just for bass notes, plus some Ernie Ball guitar strings, and other assorted string sounds.

I used the same basic manipulation techniques on this one: stretch out the interesting parts, and repeat them in different ways. This version includes code to randomly choose among several different manipulations:

# Each weight lines up with one technique name below; both lists sum to 1.
if final_length > 10: probability = ([0.1, 0.05, 0.05, 0.1, 0.1, 0.05, 0.15, 0.05, 0.15, 0.2])
else: probability = ([0.2, 0.1, 0.06, 0.1, 0.1, 0.15, 0.04, 0.05, 0.1, 0.1])

# 'any' means: pick one of the ten techniques at random, using those weights.
if type == 'any':
    type = rng.choice(['tile', 'repeat', 'reverse', 'tile_and_reverse', 'reverse_and_repeat',
                       'shuffle_and_tile', 'tile_and_roll', 'tile_and_repeat', 'repeat_and_tile',
                       'tile_reverse_odd'], p = probability)
print(f'time_steps: {clseg.shape[1] = }, {factor = }, {type = }')

So I control the likelihood of picking different techniques for longer repetitions, favoring ‘tile_and_roll’ and ‘repeat_and_tile’. The former tiles the section, basically repeating it note for note, but each time it rolls the notes, so it starts at a different point in the array. Repeat and tile takes half the voices and tiles them, and just makes the other half longer. It all works out in the end.
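The two favored techniques can be sketched with `np.roll`, `np.tile`, and `np.repeat`. This is a reading of the description above, not the original code: the roll amount, which half of the voices gets tiled, and the function signatures are all assumptions.

```python
import numpy as np

def tile_and_roll(segment, factor, shift=1):
    """Repeat a (voices, steps) segment `factor` times, rotating each copy.

    Copy k is rolled k * shift steps along the time axis, so every
    repetition starts at a different point in the segment.
    """
    copies = [np.roll(segment, -k * shift, axis=1) for k in range(factor)]
    return np.concatenate(copies, axis=1)

def repeat_and_tile(segment, factor):
    """Tile the first half of the voices and lengthen the second half."""
    half = segment.shape[0] // 2
    tiled = np.tile(segment[:half], (1, factor))         # abc abc ...
    longer = np.repeat(segment[half:], factor, axis=1)   # aa bb cc ...
    return np.concatenate((tiled, longer), axis=0)

segment = np.array([[1, 2, 3],
                    [4, 5, 6]])
print(tile_and_roll(segment, 2))
# [[1 2 3 2 3 1]
#  [4 5 6 5 6 4]]
print(repeat_and_tile(segment, 2))
# [[1 2 3 1 2 3]
#  [4 4 5 5 6 6]]
```

Both results have the same length, so the tiled and lengthened voices still line up at the end of the section.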

Oh God Look Down