During the summer of 2015, I had an amazing experience as a research intern in the Media Technology Lab at Gracenote. I developed a method for estimating the time of the downbeat in music. When you count the beats in 4/4 music as "1 2 3 4 1 2 3 4 ..." or 3/4 music as "1 2 3 1 2 3 ...", the downbeat is the "1". My task was to find the time at which the "1" occurs in different audio signals. I used a machine learning approach for this task rather than a strictly DSP one. I trained my models only on the Ballroom Dataset and tested them on songs that were not in the dataset. This work could not have been done without the help of my supervisor at Gracenote, Greg Tronel.
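To make the task concrete, here is a toy sketch (not the ML system itself) of what "finding the 1" means: given a grid of beat times and an estimated downbeat time, label each beat with its count in the bar. The beat times and downbeat estimate below are made up purely for illustration.

```python
# Toy sketch of downbeat labeling: beat times and the downbeat estimate
# here are invented for illustration; the real system estimated the
# downbeat from learned audio features.

def label_beats(beat_times, downbeat_time, beats_per_bar=4):
    """Return (time, count) pairs; count == 1 marks a downbeat."""
    # Index of the beat closest to the estimated downbeat time.
    phase = min(range(len(beat_times)),
                key=lambda i: abs(beat_times[i] - downbeat_time))
    return [(t, (i - phase) % beats_per_bar + 1)
            for i, t in enumerate(beat_times)]

beats = [0.5 * k for k in range(8)]          # a steady 120 BPM beat grid
labels = label_beats(beats, downbeat_time=1.0)
print(labels[:4])  # the "1" lands on the beat nearest 1.0 s
```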
Here are some examples of the audio output. You will hear the test audio with a metronome; the high-pitched bleep marks the estimated downbeat. I am only posting good examples here, so if you would like to hear some bad ones, feel free to contact me:
This is a collaborative project between Cheng-i Wang, Shlomo Dubnov, and myself. Using a Factor Oracle-like data structure, the Variable Markov Oracle, we search for repeated patterns in polyphonic music. This method is capable of discovering patterns in both audio and symbolic representations.
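To give a flavor of the underlying machinery, here is a sketch of the classic symbolic factor oracle that the Variable Markov Oracle generalizes (the VMO itself operates on continuous feature frames with a similarity threshold). The suffix links are what make repeated patterns discoverable:

```python
# Classic online factor-oracle construction (the symbolic ancestor of the
# VMO). Suffix links (sfx) connect states that share a common suffix.

def factor_oracle(seq):
    n = len(seq)
    trans = [dict() for _ in range(n + 1)]  # forward transitions
    sfx = [-1] * (n + 1)                    # suffix links; state 0 has none
    for i in range(1, n + 1):
        sym = seq[i - 1]
        trans[i - 1][sym] = i
        k = sfx[i - 1]
        while k > -1 and sym not in trans[k]:
            trans[k][sym] = i
            k = sfx[k]
        sfx[i] = 0 if k == -1 else trans[k][sym]
    return trans, sfx

trans, sfx = factor_oracle("abcabc")
# The suffix link from state 6 points back to state 3: the oracle has
# recognized that the pattern "abc" repeats.
print(sfx)
```

Following chains of suffix links from the end of the sequence is one way to recover candidate repeated patterns; the VMO paper adds the machinery for picking a good similarity threshold on audio features.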
This project is for integrating a physical model of the voice with a live saxophone player. I implemented a physical model of the vocal tract and vocal folds that can morph between vowel sounds. I also developed a technique to specify the sounding frequency of the voice synthesis. Frequency trajectories extracted from the acoustic saxophone signal are used to control the pitch of the voice. Future work involves mapping other sonic aspects of the saxophone sound to voice model control parameters.
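For intuition, here is a heavily simplified source-filter sketch of vowel morphing; this is not the physical tract/fold model I implemented, and the formant values for /a/ and /i/ are rough assumed numbers for illustration only:

```python
import numpy as np

# Source-filter sketch of vowel morphing (NOT the physical model from the
# project). An impulse-train "glottal" source excites two resonators whose
# center frequencies interpolate between assumed formants for /a/ and /i/.

SR = 16000
F_A = [700.0, 1200.0]   # rough first two formants of /a/ (assumed)
F_I = [300.0, 2300.0]   # rough first two formants of /i/ (assumed)

def resonator(x, freq, sr, r=0.98):
    """Two-pole resonator: y[n] = x[n] + 2r*cos(w)*y[n-1] - r^2*y[n-2]."""
    w = 2 * np.pi * freq / sr
    y = np.zeros_like(x)
    for n in range(len(x)):
        y[n] = x[n]
        if n >= 1:
            y[n] += 2 * r * np.cos(w) * y[n - 1]
        if n >= 2:
            y[n] -= r * r * y[n - 2]
    return y

def vowel_morph(f0=110.0, dur=0.5, alpha=0.5):
    """alpha=0 -> /a/, alpha=1 -> /i/; f0 sets the voice pitch in Hz."""
    n = int(SR * dur)
    src = np.zeros(n)
    src[::int(SR / f0)] = 1.0            # impulse-train glottal source
    out = src
    for fa, fi in zip(F_A, F_I):
        out = resonator(out, (1 - alpha) * fa + alpha * fi, SR)
    return out / np.max(np.abs(out))
```

In the actual project the pitch control (f0 here) is driven by frequency trajectories extracted from the saxophone signal rather than set by hand.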
This is a multiband delay VST plugin. MBDelay allows you to split a sound into three frequency bands; the amplitude, feedback, and delay time for each band can be controlled separately. I am working on getting this to run on Mac OS X 10.10, but I have been experiencing issues with Apple's vecLib FFT functions since updating my OS. If you have any suggestions, please send them my way.
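For the curious, here is a rough offline Python sketch of the signal path; the plugin itself is a VST, and the crossover frequencies, gains, and delay times below are illustrative defaults, not the plugin's:

```python
import numpy as np
from scipy.signal import butter, lfilter

# Sketch of the MBDelay signal path: split the input into low/mid/high
# bands, then run each band through its own feedback comb delay.
# All parameter values here are illustrative, not the plugin's.

def split_bands(x, sr, lo=200.0, hi=2000.0):
    bl, al = butter(4, lo / (sr / 2), btype="low")
    bm, am = butter(4, [lo / (sr / 2), hi / (sr / 2)], btype="band")
    bh, ah = butter(4, hi / (sr / 2), btype="high")
    return [lfilter(bl, al, x), lfilter(bm, am, x), lfilter(bh, ah, x)]

def comb_delay(x, delay_samples, feedback, amp=1.0):
    # Feedback comb filter: y[n] = x[n] + feedback * y[n - D].
    a = np.zeros(delay_samples + 1)
    a[0], a[-1] = 1.0, -feedback
    return amp * lfilter([1.0], a, x)

def mbdelay(x, sr, delays=(0.25, 0.125, 0.0625), fbs=(0.5, 0.4, 0.3),
            amps=(1.0, 1.0, 1.0)):
    bands = split_bands(x, sr)
    return sum(comb_delay(b, int(d * sr), f, a)
               for b, d, f, a in zip(bands, delays, fbs, amps))
```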
This is a granular synthesis VST plugin. With GrannyGrains, you can chunk up your audio into grains (large or small) and play them either sequentially or in a shuffled manner. You can control the binaural spread of the grains, the feedback, and the percentage of reversed grains. There is also an option to quantize the timing of the grains to the beat of your DAW.
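Here is a toy offline sketch of the core idea: slice audio into grains, optionally shuffle them, reverse a fraction, envelope each grain, and concatenate. The parameter names are illustrative, not the plugin's actual controls:

```python
import numpy as np

# Toy offline granulator in the spirit of GrannyGrains. Parameter names
# are illustrative only; the real plugin runs in real time inside a DAW.

def granulate(x, grain_len, shuffle=True, reverse_pct=0.25, seed=0):
    rng = np.random.default_rng(seed)
    grains = [x[i:i + grain_len]
              for i in range(0, len(x) - grain_len + 1, grain_len)]
    if shuffle:
        rng.shuffle(grains)
    env = np.hanning(grain_len)          # fade in/out to avoid clicks
    out = []
    for g in grains:
        if rng.random() < reverse_pct:
            g = g[::-1]                  # play this grain backwards
        out.append(g * env)
    return np.concatenate(out)
```

The Hann envelope on each grain is there to avoid clicks at grain boundaries; beat quantization would snap each grain's start time to the DAW's beat grid instead of butting grains end to end.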
Here's an example of the audio output. The first few seconds are of a clean music box signal; I manipulate parameters on GrannyGrains for the rest of the sound file.
This was an experiment to see if you can create interesting music using a new technique of concatenative synthesis. A user sorts small 10-second snippets of audio (the 'source') into groups; the groups are purely user-defined. The system uses decision trees to learn the structure of the grouping. The user then inputs a 'target' audio clip describing what they want the final output to sound like. The system takes the source snippets that sound the most 'similar' to each chunk of the target audio, concatenates them together, and outputs the sound.
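Here is a toy sketch of the matching-and-concatenation stage. For simplicity it swaps the decision-tree grouping for direct nearest-neighbor matching on a crude two-number feature (spectral centroid and RMS); everything here is illustrative, not the system's actual feature set:

```python
import numpy as np

# Toy concatenative synthesis: match each target chunk to the source
# snippet with the nearest feature vector, then concatenate the matches.
# Nearest-neighbor matching stands in for the decision-tree stage here.

def features(chunk, sr=44100):
    spec = np.abs(np.fft.rfft(chunk))
    freqs = np.fft.rfftfreq(len(chunk), 1 / sr)
    centroid = (freqs * spec).sum() / (spec.sum() + 1e-12)
    rms = np.sqrt(np.mean(chunk ** 2))
    return np.array([centroid, rms])

def concat_synthesis(sources, target, chunk_len):
    src_feats = np.array([features(s[:chunk_len]) for s in sources])
    out = []
    for i in range(0, len(target) - chunk_len + 1, chunk_len):
        f = features(target[i:i + chunk_len])
        best = np.argmin(np.linalg.norm(src_feats - f, axis=1))
        out.append(sources[best][:chunk_len])
    return np.concatenate(out)
```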
Here are some examples of the audio output:
Contact me at firstname.lastname@example.org