Ahmed

Converting MP3 to MIDI


When I play maps like Sunlust, the music is of course originally taken from somewhere else. I tried converting an MP3 that I downloaded recently into a MIDI file, but the music and tone are very different from the original. How do map makers convert the original MP3 into a MIDI without drastically changing the music?


Because they all use MIDI as the original source?

Just search Google for the song you want in MIDI format...

Also check midimelody.ru (I think that's the name).

For laughs, you can always use that online MP3-to-MIDI converter (I can't remember its name right now) that turns your MP3 track into a "hell of a one-track piano MIDI".


It's more likely that the tracks you hear were painstakingly recreated in MIDI as opposed to simply converted: MP3 converts to MIDI badly and generally isn't worth it. If you want to give manual conversion a go then try downloading a program like Sekaiju and learning the ropes.


While MP3 and MIDI may both qualify as "audio formats", they are fundamentally different in how they store the audio data and how that data is processed. The MP3 audio format is a compressed waveform, effectively a digitized reproduction of the sound vibrations that we've come to understand as audio: when a program is asked to play the MP3, it simply handles the waveform as-is. The MIDI audio format is more like a notation of how audio is meant to be played using quantities such as pitch, duration, velocity, et cetera: for playback, the program processes the instructions in real-time to produce a song.
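
To make the difference concrete, here is a rough Python sketch. The dictionary layout and the `render` function are made up for illustration (they are not the real MIDI byte format or any real synthesizer's code), and a plain sine wave stands in for an instrument. The point is only the size gap between the instruction and the sound it produces.

```python
import math

SAMPLE_RATE = 44100  # samples per second, the usual CD-quality rate

# MIDI-style description: one note is just a handful of numbers.
# (Illustrative layout only, not the actual MIDI message format.)
note = {"pitch": 69, "velocity": 100, "duration_s": 0.5}  # A4, fairly loud, half a second

def render(note):
    """Turn the instruction into a waveform, roughly as a synth does at playback."""
    freq = 440.0 * 2 ** ((note["pitch"] - 69) / 12)  # MIDI pitch number -> Hz
    amp = note["velocity"] / 127                     # MIDI velocity runs 0..127
    count = int(SAMPLE_RATE * note["duration_s"])
    return [amp * math.sin(2 * math.pi * freq * n / SAMPLE_RATE) for n in range(count)]

samples = render(note)
print(len(note), "numbers describe the note;", len(samples), "samples describe its sound")
```

Going from the three numbers to the samples is easy. Going back from the samples to the three numbers is the hard part, and a real recording has many notes and instruments summed into the same stream of samples.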

 

Trying to parse a waveform into various instruments and notes, automatically, is an incredibly complex task: it would be like trying to use a speech-to-text program to record the multitude of conversations occurring simultaneously in a sports stadium. Most MIDI songs that came from some other format are far more likely to have been rewritten from scratch by listening to the song over and over again, as Obsidian stated.

 


As far as I know it's not possible yet to convert MP3 files into MIDI. You'd have to recreate your desired MP3 track from scratch in MIDI format. This made me understand how much work MIDI authors actually have to put in. It's amazing!

21 minutes ago, rodster said:

As far as I know it's not possible yet to convert MP3 files into MIDI.

This worked for me, though of course the result wasn't really great.

2 hours ago, CapnClever said:

While MP3 and MIDI may both qualify as "audio formats", they are fundamentally different in how they store the audio data and how that data is processed. The MP3 audio format is a compressed waveform, effectively a digitized reproduction of the sound vibrations that we've come to understand as audio: when a program is asked to play the MP3, it simply handles the waveform as-is. The MIDI audio format is more like a notation of how audio is meant to be played using quantities such as pitch, duration, velocity, et cetera: for playback, the program processes the instructions in real-time to produce a song.

 

Trying to parse a waveform into various instruments and notes, automatically, is an incredibly complex task: it would be like trying to use a speech-to-text program to record the multitude of conversations occurring simultaneously in a sports stadium. Most MIDI songs that came from some other format are far more likely to have been rewritten from scratch by listening to the song over and over again, as Obsidian stated.

 

If what you are saying is true, then how do map authors get the MIDI music for their maps? Do they just download it from some source on the internet, or do they go through the trouble of making a new MIDI from scratch, as you and Obsidian described?

Or they could ask someone to make the music for them... I don't know.


They typically download MIDIs from MIDI database websites, rip them from other WADs, or (less often) compose them themselves or have someone from around here compose them.


I tried searching "video game midi" and this was the first website:

 

VGMusic - 31,113 Game Music MIDI files

 

and I'm sure there are more. Just gotta spend some time lookin'! If you have the song in mind, you could try searching the name of said song and add "midi" at the end, see if there's anything floating around.

 

This guy (on this very forum) does commission work and specifically mentions MIDI remakes. Some people just do it as a hobby, too, and use their skills to complement mapping. If you ask around here, it's possible you'll get some takers.

 

You can also use an MP3 directly in the map and recommend that the map only be played in ports that support MP3s. This, of course, increases the filesize of the map: it's not uncommon for the MP3 to be far larger than the map itself! Sometimes an audio file simply doesn't convert well into MIDI, what with the limited instrumentation available, and this becomes an unfortunate necessity in the eyes of the mapper.


VGMusic is my go-to place for videogame MIDIs composed by other people. Jimmy, Viscra Maelstrom, Bucket, Eris Falling, Alfonzo, Monster Iestyn, etc. all make/made pretty cool MIDIs. I especially like Monster Iestyn's Sonic MIDIs.


But what does that have to do with converting MP3 to MID? Didn't YukiRaven post a technique using Fluidsynth on Linux to create an MP3 file from a MIDI? That's just as cool.

But an MP3 to MID converter sounds like it would need some serious recognition code to make it possible. The software would have to be able to recognise specific sounds and deduce what instrument they might be. Sometimes there are background instruments that get drowned out by another instrument, which would mean that the exported MID would probably not have those instruments.

It's just too fucking complicated! Even if there's a team out there working their asses off to make such a converter, I would expect them to take more than a decade, at least, to get a proper release.


midi composition is a skilled trade :D

 

Luckily the format is old and the number of competent sequencers is pretty high, so over the years the internet en masse has managed to make tens of thousands of these things available to you. VGMusic (linked by CapnClever, above) is an invaluable resource, as are other MIDI databases, e.g. http://midkar.com/, and of course there are plenty of composers around these parts who have a plethora of music available.


as others have mentioned here already, MIDIs of music from other sources in Doom wads were recreated by others in MIDI sequencers. MIDI isn't a type of sound or genre per se, simply a set of generalized instructions introduced so that different pieces of sound hardware and software can easily communicate. specifically, MIDI files use General MIDI, which defines an instrument list that is identical across devices, so e.g. a track written for guitar or piano will be played back with the matching instrument on any of them.
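
as a small illustration of that shared instrument list, here is a Python sketch that builds the two-byte Program Change message a MIDI file uses to pick an instrument on a channel. the table holds only a handful of General MIDI entries, and the helper name `program_change` is made up for this example.

```python
# A few General MIDI program numbers (0-based, as stored in the message;
# printed GM instrument lists are usually numbered from 1).
GM_PROGRAMS = {
    0: "Acoustic Grand Piano",
    24: "Acoustic Guitar (nylon)",
    40: "Violin",
    56: "Trumpet",
    73: "Flute",
}

def program_change(channel, program):
    """Build the 2-byte MIDI Program Change message for a channel (0-15)."""
    if not 0 <= channel <= 15 or not 0 <= program <= 127:
        raise ValueError("channel must be 0-15 and program 0-127")
    return bytes([0xC0 | channel, program])  # status byte 0xCn, then the program number

msg = program_change(channel=0, program=24)
print(msg.hex(), "->", GM_PROGRAMS[msg[1]])
```

the file only says "program 24"; what that guitar actually sounds like is up to whatever synth or soundfont plays the file back.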

 

MP3 itself is a digital audio codec, which can contain whatever was encoded into the file. it doesn't use instructions like MIDI, since the music in an MP3 file is already "done" and ready to be played back. since it is not restricted to a set number of instruments and just a few drum tracks, its content is inherently a lot more complex than a relatively simple MIDI file. MP3 to MIDI converters are therefore not really that useful, since they don't recognize what is what in an audio file, making everything come out as a big mess.

 

the only way to make it into a proper MIDI file, therefore, is to listen to the original file and recreate it to the best of your ability. this takes time and resources, trying to find which MIDI instruments best mimic the original song, but sometimes it can be easier depending on what you're listening to. i've made a few covers of songs to MIDI in the past, one in particular i'm fond of is this one, which is based on this song.

 

 

for MIDI conversions of most well-known songs, sites like VGMusic exist, and their files are usually of pretty good quality. without editing, though, they can sound weird in-game, because they often have some silence at the beginning of the track, fade out at the end, or just plain fail to loop properly in-game.

On 8/14/2017 at 4:21 AM, Roofi said:

This is what the MP3 to MIDI conversion sounds like.

 

 

This is actually massively impressive. I can actually hear the emulation of some nuances of the phonemes being sung, such as the difference between a sung "A" vowel vs. an "E" vowel. Wow. Of course it sounds shitty, because the piano notes have a specific signature timbre, with a percussive attack and linear, predictable decay. And, of course, consonants can only be simulated with cymbals and other drums. If this could have been run at a much higher tempo, with a much softer voice, the results would be even more impressive.


I swear I remember seeing something about how our brain is so attuned to human language that you can listen to a weird MIDI-ized version of dialogue, hear absolutely nothing intelligible, then hear the original sound file of the dialogue, and then when you listen to the MIDI-ized version again you can't NOT hear and understand what it is saying.

3 hours ago, Linguica said:

I swear I remember seeing something about how our brain is so attuned to human language that you can listen to a weird MIDI-ized version of dialogue, hear absolutely nothing intelligible, then hear the original sound file of the dialogue, and then when you listen to the MIDI-ized version again you can't NOT hear and understand what it is saying.

Yeah, ok, maybe. I'll give you that one :) I guess it's the same thing as listening to songs backwards, to hear the evil message. No one can hear it, until they are told what and where it is, then everyone hears it.

 

Counter-point: Vowels actually make very rudimentary waveforms. If you try, you can repeat vowel sounds indefinitely: "Ah...."   "O....". On an oscilloscope, these make simple waveforms, like 3 wide sine waves, followed by one thin sine wave, over and over. This could be emulated by the sheer number of notes and the tempo they are played in that video.

 

Ah, fuck it: Yep, I hear the song in my head, and it won't go away :) Still amazing, though.

8 hours ago, Linguica said:

I swear I remember seeing something about how our brain is so attuned to human language that you can listen to a weird MIDI-ized version of dialogue, hear absolutely nothing intelligible, then hear the original sound file of the dialogue, and then when you listen to the MIDI-ized version again you can't NOT hear and understand what it is saying.

I could definitely hear the words at times in that Smash Mouth MIDI song. Surprisingly, a piano can mimic the human voice in a partially recognizable way, provided it is programmed to play the exact pitches and rhythms of the speech (which is what I imagine an audio -> MIDI converter would do: straight up convert the pitch and rhythm into a single instrument, which in the Smash Mouth case is a piano).
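
For what it's worth, the pitch half of that is a one-line formula, since MIDI notes are semitone steps anchored at A4 = 440 Hz = note 69. Here's a quick Python sketch; the pitch values in `pitch_track` are invented for the example, and I have no idea whether any particular converter works exactly like this.

```python
import math

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

def freq_to_midi(freq_hz):
    """Nearest MIDI note number for a frequency, using A4 = 440 Hz = note 69."""
    return round(69 + 12 * math.log2(freq_hz / 440.0))

def note_name(note):
    """Note name with octave, in the convention where note 60 is C4."""
    return f"{NOTE_NAMES[note % 12]}{note // 12 - 1}"

# Hypothetical pitch estimates (Hz) for a spoken phrase, one per time slice.
pitch_track = [196.0, 220.0, 233.1, 220.0, 174.6]
notes = [freq_to_midi(f) for f in pitch_track]
print([note_name(n) for n in notes])
```

Everything gets snapped to the nearest semitone, which is part of why the result only sort of resembles the voice: speech glides between pitches, and a piano can't.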

 

I first saw this technique in this clip a few years ago:

 

Also for MIDIs in my projects I tend to either get them from MIDI repositories and/or other games or write them myself. I've never used an audio-MIDI converter and I would never want to when purpose written/transcribed MIDIs exist.

10 hours ago, Linguica said:

I swear I remember seeing something about how our brain is so attuned to human language that you can listen to a weird MIDI-ized version of dialogue, hear absolutely nothing intelligible, then hear the original sound file of the dialogue, and then when you listen to the MIDI-ized version again you can't NOT hear and understand what it is saying.

♫ You gotta eat your vegetables ♫

 

On Monday, August 14, 2017 at 9:19 AM, A7MAD said:

When I play maps like Sunlust, the music is of course originally taken from somewhere else. I tried converting an MP3 that I downloaded recently into a MIDI file, but the music and tone are very different from the original. How do map makers convert the original MP3 into a MIDI without drastically changing the music?

C0muertjn9 erorn MP3 to M|D1 is |lke c0nvertjng frcrn 8MP to T><T,

 

The recognition algorithm makes mistakes that an actual human brain wouldn't make. You get much better results when someone recreates the MIDI by hand (and ear) instead of using an automated converter, which will botch things up. If you can't decipher my first sentence, I've written it based on the mistakes that I remember getting from OCR software. For music it's the same kind of problem, except a lot more complex because the sounds overlay each other.


Think about it: in order to have any chance at all, any "MP3 to MIDI" converter would need to be able to "unmix" the audio back into separate tracks first, and THEN make out the different instruments and such. Even that first task is not trivial, and you're unlikely to find it on some random guy's website or in a freeware converter. Maybe in multi-million $$$ forensic software or something, but not for shits and giggles.

 

Once you have split tracks with precisely ONE kind of instrument in each, then it's a much easier process. That's why such "converters" may work well with a single-channel/instrument track or a guy whistling a tune (that's how, among others, Shazam works), but not anything like, you know, actual music mixdowns. Then there's stereo panning to take into account, vocals, noise etc.

 

Don't be fooled by YouTube "knowing" what song is played: this kind of audio recognition is done using actual audio signatures, a sort of simplified fingerprint of a piece of audio, and it's only good enough to say whether what you used in your video MIGHT be part of a copyrighted soundtrack in their database. There's no wonderful & magic track breakdown being done there, sorry.

5 minutes ago, Maes said:

Think about it: in order to have any chance at all, any "MP3 to MIDI" converter would need to be able to "unmix" the audio back into separate tracks first, and THEN make out the different instruments and such.

Why couldn't it recognize instruments first and THEN isolate each recognized one's sound and put them into separate tracks?

6 minutes ago, scifista42 said:

Why couldn't it recognize instruments first and THEN isolate each recognized one's sound and put them into separate tracks?

This would only be possible if the instruments had a very characteristic "signature" and non-correlated, non-overlapping spectra to begin with. E.g. imagine a piano and an oboe playing at the same time but at very different pitches, and playing completely different notes (no harmony). However, any sort of overlap in pitch, note, or harmony, makes it harder or impossible to tell whether one, two, or more instruments are playing in "lockstep" in any given instant.
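
Here's a toy Python sketch of both cases. It is not how any real converter works: the "instruments" are just sums of sine waves, the sample rate and window size are arbitrary choices to keep it fast, and `active_notes` simply checks how strongly the signal correlates with a sinusoid at each MIDI note's pitch.

```python
import math

SAMPLE_RATE = 8000  # samples per second (kept low so the demo runs quickly)
N = 2048            # analysis window length in samples

def note_freq(note):
    """Frequency in Hz of a MIDI note number (A4 = note 69 = 440 Hz)."""
    return 440.0 * 2 ** ((note - 69) / 12)

def tone(note, harmonics=(1.0,)):
    """Samples of one note; harmonics[k] is the amplitude of partial k+1."""
    f = note_freq(note)
    return [sum(a * math.sin(2 * math.pi * f * (k + 1) * n / SAMPLE_RATE)
                for k, a in enumerate(harmonics))
            for n in range(N)]

def mix(*signals):
    """Sum several signals sample by sample, like a mixdown."""
    return [sum(vals) for vals in zip(*signals)]

def strength(samples, note):
    """Magnitude of the correlation with a sinusoid at the note's pitch."""
    w = 2 * math.pi * note_freq(note) / SAMPLE_RATE
    re = sum(s * math.cos(w * n) for n, s in enumerate(samples))
    im = sum(s * math.sin(w * n) for n, s in enumerate(samples))
    return math.hypot(re, im)

def active_notes(samples, notes=range(48, 85), threshold=0.5):
    """Notes at least `threshold` times as strong as the strongest one."""
    mags = {note: strength(samples, note) for note in notes}
    peak = max(mags.values())
    return sorted(note for note, m in mags.items() if m >= threshold * peak)

# Well-separated case: two pure tones, A4 (69) and E5 (76).
separate = active_notes(mix(tone(69), tone(76)))

# Overlapping case: ONE instrument on A3 (57) with a strong second harmonic,
# versus TWO instruments playing pure A3 and A4 in lockstep.
one_rich = active_notes(tone(57, harmonics=(1.0, 1.0)))
two_pure = active_notes(mix(tone(57), tone(69)))

print(separate, one_rich, two_pure)
```

The first case comes apart cleanly. In the second case the two signals are the same waveform, so the analysis reports the same notes for both, and nothing in the audio says whether one instrument or two produced it.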

 

There's an entire science behind this process:

 

https://en.wikipedia.org/wiki/Principal_component_analysis

 

Even if that worked out OK, it would only give a starting point for a POSSIBLE division into tracks, with no guarantee that it would be unique or correct in the more general case. Other criteria would be identifying e.g. parts with a steady rhythm (e.g. percussion or basslines) or a very irregular one (e.g. speech).

 

Ah, the human brain. There's so much we take for granted, which however is a Digital Signal Processing engineer's nightmare :-)


Melodyne is, as far as I know, the only software able to separate chords and audio in general into individual pitches and let you edit them. Google it. Phase tricks and stereo manipulation, as well as multiband gates with advanced transient detection, could be used to separate the rest, but even then it would probably sound awful and full of unwanted artifacts, so you're probably better off asking a musician to do it for you. Any AI would have to be terribly advanced to be able to decode all genres. Maaaaybe you could make one for generic Nashville country. I probably wouldn't need more than 1-3 hours tops to transcribe something to MIDI, depending on the song, not that I'm offering to do it or anything!

15 minutes ago, Maes said:

any sort of overlap in pitch, note, or harmony, makes it harder or impossible to tell whether one, two, or more instruments are playing in "lockstep" in any given instant. [...] it would only give a starting point for a POSSIBLE division into tracks, with no guarantee that it would be unique or correct in the more general case.

I'd argue that it doesn't matter whether the instrument identification is actually correct, as long as the alternatives are indistinguishable by humans. Therefore, it'd be sufficient if the pitch/note/harmony analysis was as good at recognizing instruments as humans are, to make a conversion sufficiently good for humans to listen to. At least if it was converting specifically to a single MIDI soundfont. Converting simultaneously to multiple soundfonts could also improve the precision of the identification and help make a multi-soundfont compatible conversion.

1 hour ago, scifista42 said:

I'd argue that it doesn't matter whether the instrument identification is actually correct, as long as the alternatives are indistinguishable by humans.

This reminded me of some MOD/tracker files that actually used "composite instruments": samples which contained more than one sound but were forced to be always played together at the same pitch and also at the same relative time sequence (e.g. the "Bell + sustained strings" sample in R-Type).

 

 

This helped save on a channel. So, for the purpose of "identifying a unique instrument", being able to identify such a composite sound as a single "instrument" would work. For a traditional sheet music transcription or a MIDI...probably not.

 

For recognizing generic/unknown instruments some pretty clever signature/pattern algorithm would be required, also able to handle variants such as different volumes, different harmonic signatures at varying pitches, different ADSR envelopes, acoustic filtering effects etc.
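
For anyone who hasn't met the term, an ADSR envelope is the attack/decay/sustain/release volume curve of a note. Here's a minimal Python sketch with straight-line segments. The default timings are arbitrary example values, and real instruments don't follow curves this tidy.

```python
def adsr(t, note_len, attack=0.02, decay=0.1, sustain=0.6, release=0.3):
    """Envelope gain (0..1) at time t seconds for a note held note_len seconds.

    Assumes linear segments and note_len >= attack + decay.
    """
    if t < 0:
        return 0.0
    if t < attack:                      # ramp up to full level
        return t / attack
    if t < attack + decay:              # fall to the sustain level
        return 1.0 - (1.0 - sustain) * (t - attack) / decay
    if t < note_len:                    # hold while the key is down
        return sustain
    if t < note_len + release:          # fade out after the key is released
        return sustain * (1.0 - (t - note_len) / release)
    return 0.0

# Gain at a few moments of a one-second note.
print([round(adsr(t, 1.0), 2) for t in (0.01, 0.07, 0.5, 1.15, 2.0)])
```

The same instrument recorded with a different envelope, volume, or filter has a different waveform, so a recognizer would have to match all of those variants to one "instrument".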

 

Probably a good subject for someone's PhD, but a bit too much work just to appease some Joe P. Random guy on the Internet who "just" wants to convert MP3 (or any sound) to MIDI ;-)

On 8/14/2017 at 2:19 AM, A7MAD said:

When I play maps like Sunlust, the music is of course originally taken from somewhere else. I tried converting an MP3 that I downloaded recently into a MIDI file, but the music and tone are very different from the original. How do map makers convert the original MP3 into a MIDI without drastically changing the music?

They do not "convert" recorded music into MIDI; a faithful automatic conversion is effectively impossible. An MP3 is a sound recording, while a MIDI is a set of instructions given to a synthesizer, like a high-tech player piano roll. To do a MIDI based on a sound recording, you have to get the original score of the music, whether by purchasing sheet music, looking up guitar tabs, or learning the music by ear, and rewrite it in a MIDI composer. You will need a basic grasp of music theory and composition to do this. There is no easy way to do this--you have to know how to write music, and you have to sit down and transcribe the entire piece yourself.

 

Any "converter" you see will basically analyze the recordings and use heuristics to guess what the music must have been like. Their output will require massive clean-up (by someone who knows how to compose) at best and will more likely be garbage.

9 minutes ago, Woolie Wool said:

There is no easy way to do this--you have to know how to write music, and you have to sit down and transcribe the entire piece yourself.

Sadly, in this day and age that's precisely what people don't want to hear (instant gratification and all that), and, hey, there HAS to be an "app" that does it, right? That's why (within the boundaries of our own Doom community) we regularly keep getting questions about e.g. easy ways to automatically make 3D models from sprites or "convert" sprites to "Hi Res" or true color (or both) etc. This "MP3 to MIDI" thing is just another variant on the theme.


This is a clear example of how MP3 converts to MIDI (automatically, not manually).

 

 


^ Sounds like some pretty avant-garde stuff :-)

 

Jokes aside, even if someone intends to use such a converter only to get a quick and dirty transcription and then polish it manually to smooth out any (well, many) rough edges, the work involved would probably equal or exceed the time needed to do an honest-to-God old-fashioned transcription. Now, where have I seen this pattern again....oh right, sprites.

 

FWIW, I remember back in the days of the C64 there was an audio sampler kit that came with a bunch of software, and one of them was actually able to "transcribe" whistled music or humming to notes on the user's screen. As noted back then, whistling a slow, simple tune worked wonders, but anything more merry quickly got hashed. Not a lot of progress, considering it's been 30 years since ;-)


Not surprising, because transcribing a recording to a MIDI is creating a new work of art, not manipulating an old one. This is the sort of problem only strong AI could fully solve, and contrary to what Kurzweil believes, we're decades away from that at best and it would probably require an entirely new form of computer using totally different technologies from the binary transistor-based machines we use today.

This topic is now closed to further replies.