Kaido

Converting MP3 to MIDI


Isn't it possible to distinguish instruments with Fourier analysis? I remember that different instruments have different spectral densities. *tries to sound smart but fails miserably*

7 minutes ago, rodster said:

Isn't it possible to distinguish instruments with Fourier analysis? I remember that different instruments have different spectral densities. *tries to sound smart but fails miserably*

At a very fundamental level, yes: each instrument does have a specific spectral signature. However, you can't just use a bunch of Fourier coefficients unless you're dealing with continuous, unchanging sounds. Real instruments also have an ADSR (attack, decay, sustain, release) envelope, noise and other imperfections, and virtually infinite variations. So you need a more complex "signature", one that also takes time into account (e.g. wavelets) and is robust to scaling or dropped harmonics. In essence, you're doing a kind of "inverse synthesis".
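To make the "takes time into account" part concrete, here is a minimal sketch in plain Python. It is not wavelets; it swaps in a much cruder short-time Fourier approach, measuring the strength of one frequency frame by frame. The sample rate, frame length, decay rate and the `bin_magnitude` helper are all assumptions picked for illustration, and the "plucked note" is just a synthetic decaying sine.

```python
import cmath
import math

def bin_magnitude(frame, k):
    """Approximate amplitude of the sinusoid sitting in DFT bin k of one frame."""
    n = len(frame)
    acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
    return 2 * abs(acc) / n

SAMPLE_RATE = 8000  # Hz (assumed)
FRAME = 400         # 50 ms per frame, so bins are 20 Hz apart
FREQ = 440          # Hz, lands exactly on bin 22

# Synthetic "plucked" note: a 440 Hz sine with an exponential decay, 0.5 s long.
signal = [
    math.exp(-3.0 * t / SAMPLE_RATE) * math.sin(2 * math.pi * FREQ * t / SAMPLE_RATE)
    for t in range(SAMPLE_RATE // 2)
]

k = FREQ * FRAME // SAMPLE_RATE  # bin index of the fundamental
envelope = [
    bin_magnitude(signal[start:start + FRAME], k)
    for start in range(0, len(signal) - FRAME + 1, FRAME)
]

# One value per 50 ms frame; the values shrink as the note decays.
print([round(v, 3) for v in envelope])
```

A single transform over the whole half second would collapse all of that into one number per frequency, which is why a static set of coefficients can't tell a pluck from a sustained tone at the same pitch.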

2 hours ago, rodster said:

Isn't it possible to distinguish instruments with Fourier analysis? I remember that different instruments have different spectral densities. *tries to sound smart but fails miserably*

The problem is that each instrument produces its own complex sound wave (or multiple waves), yet when you hear the final recording, all of those waves are combined into one very complex wave (two for stereo). Higher frequencies "ride on top" of the lower frequencies, and each wave either adds to or subtracts from the whole.

 

Using techniques like the Fourier transform, you can, over time, identify the individual frequencies that make up that complex final waveform, as long as they haven't cancelled each other out completely (e.g. two notes at the same frequency but out of phase, which is a subtractive process). But this cannot be done instantly: it requires sampling the waveform over a period of time.
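A small sketch of both halves of that paragraph, in plain Python with a naive (slow) discrete Fourier transform. The sample rate, window length, the two test frequencies and the helper names `dft_magnitudes` and `sine` are assumptions chosen so the numbers come out clean; real audio will not line up with the bins this nicely.

```python
import cmath
import math

def dft_magnitudes(samples):
    """Magnitudes of DFT bins 0..N/2, computed the naive O(N^2) way."""
    n = len(samples)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(samples))
        mags.append(abs(acc) / n)
    return mags

def sine(freq_hz, sample_rate, n, phase=0.0):
    """n samples of a unit-amplitude sine wave."""
    return [math.sin(2 * math.pi * freq_hz * t / sample_rate + phase) for t in range(n)]

SAMPLE_RATE = 8000  # Hz (assumed)
N = 400             # a 50 ms window; bins are 8000 / 400 = 20 Hz apart

# Two "notes" summed into one waveform, as in a mixed recording.
a = sine(440, SAMPLE_RATE, N)
b = sine(660, SAMPLE_RATE, N)
mix = [x + y for x, y in zip(a, b)]

# A unit sine that sits exactly on a bin shows up with magnitude 0.5.
peaks = [k * SAMPLE_RATE / N for k, m in enumerate(dft_magnitudes(mix)) if m > 0.25]
print(peaks)  # [440.0, 660.0]

# Same frequency, opposite phase: the two waves cancel and nothing is left to find.
inverted = sine(440, SAMPLE_RATE, N, phase=math.pi)
cancelled = [x + y for x, y in zip(a, inverted)]
print(max(dft_magnitudes(cancelled)) < 1e-9)  # True
```

Note that the transform needs the whole 50 ms window before it can say anything, and a shorter window would give coarser frequency bins. That is the "cannot be done instantly" part.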

 

Likewise, each source voice gets its own timbre from a complex interaction of the properties of the vibrating physical material. This is especially noticeable with string instruments: you get the base frequency, but you also get harmonics at multiples of the base frequency, as a natural property of the way the string carries waves that buck across its length, as well as effects that occur when joining dissimilar metals.

 

The point is that the instrument's timbre is not detectable immediately. You can recognize specific repeating patterns in the waveform being generated by an instrument, and it's those patterns that define what the instrument sounds like. But to see a pattern, you need a sample of nonzero length, and the required length differs for each voice.

 

But to the Fourier transform, this multiple-wave signature looks like multiple voices with different frequencies! And this is further complicated by instruments that employ vibrato, slides, etc.

 

I suppose you could follow a Fourier transform with a broader pass that expects to find the harmonic patterns. A third process could then look for these signatures and try to identify them by voice, using a database of known voice signatures.
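A rough sketch of what those second and third passes might look like, assuming the Fourier step has already produced a list of (frequency, magnitude) peaks. Everything here is hypothetical: the peak values, the greedy grouping rule, the 1% tolerance and especially the two-entry `SIGNATURES` "database" are made up for illustration, not taken from any real tool.

```python
import math

MAX_HARMONIC = 4

# Hypothetical signature database: relative strength of harmonics 1..4.
SIGNATURES = {
    "flute-like": [1.0, 0.2, 0.05, 0.0],
    "string-like": [1.0, 0.7, 0.5, 0.3],
}

def group_harmonics(peaks, tolerance=0.01):
    """Second pass: greedily treat the lowest unassigned peak as a fundamental
    and claim any peaks near its integer multiples as that voice's harmonics."""
    remaining = sorted(peaks)
    voices = []
    while remaining:
        f0, m0 = remaining.pop(0)
        profile = [m0] + [0.0] * (MAX_HARMONIC - 1)
        leftover = []
        for f, m in remaining:
            ratio = f / f0
            h = round(ratio)
            if 2 <= h <= MAX_HARMONIC and abs(ratio - h) <= tolerance * h and profile[h - 1] == 0.0:
                profile[h - 1] = m
            else:
                leftover.append((f, m))
        remaining = leftover
        voices.append((f0, profile))
    return voices

def cosine(u, v):
    """Cosine similarity between two harmonic profiles."""
    dot = sum(x * y for x, y in zip(u, v))
    return dot / (math.sqrt(sum(x * x for x in u)) * math.sqrt(sum(y * y for y in v)))

def identify(peaks):
    """Third pass: label each grouped voice with the closest known signature."""
    return [
        (f0, max(SIGNATURES, key=lambda name: cosine(profile, SIGNATURES[name])))
        for f0, profile in group_harmonics(peaks)
    ]

# Made-up peaks from two overlapping voices, one at 200 Hz and one at 290 Hz.
peaks = [(200.0, 0.9), (290.0, 0.8), (400.0, 0.2), (580.0, 0.6), (600.0, 0.04), (870.0, 0.35)]
print(identify(peaks))  # [(200.0, 'flute-like'), (290.0, 'string-like')]
```

The test frequencies were picked so that no harmonic of one voice lands near a harmonic of the other. Two notes a fifth or an octave apart share harmonics, and this greedy grouping would then hand the shared peak to the wrong voice or invent a voice that isn't there.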

 

Hopefully I have conveyed that this is massively difficult, prone to detecting false positives, and only theoretical at this point. It's a bit like trying to un-cook an egg.

 


You can try installing Audacity (v1.3).
Audacity is really great open-source software. It is handy if you want to record via your laptop/desktop to MP3, and it has a lot of effects.



The whole concept reminds me of people who ask a photographer if they can "remove a person from a photo in order to see what's behind them", or who put a mirror in their scanner and are surprised when the printout is not reflective :)


I attempted to do that and added some Shadow the Hedgehog music to doom2.wad using Slade 3, but the music sounds like a high-frequency beat when it's in the game.

On 8/17/2017 at 9:16 PM, kb1 said:

This is actually massively impressive. I can actually hear the emulation of some nuances of the phonemes being sung, such as the difference between a sung "A" vowel vs. an "E" vowel. Wow. Of course it sounds shitty, because the piano notes have a specific signature timbre, with a percussive attack and linear, predictable decay. And, of course, consonants can only be simulated with cymbals and other drums. If this could have been run at a much higher tempo, with a much softer voice, the results would be even more impressive.

Well, actually, you can hear it even better: just select the ocarina instrument and you can hear the words clearly. The software I use is FL Studio and SynthFont.

On 8/14/2017 at 2:19 AM, Ahmed said:

When I play maps like Sunlust, the music is of course originally taken from somewhere else. I tried converting an MP3 that I downloaded recently into a MIDI file; however, the music and tone are very different from the original. How do map makers convert the original MP3 into a MIDI without drastic changes to the music itself?

Try using WIDI. It converts WAV, MP3, and some other audio files to MIDI.

This topic is now closed to further replies.