Piero V.

A journey into GameCube music

One evening, a couple of weeks ago, I was pretty bored, and I did not want to do anything too demanding. Somehow I remembered Burnout’s soundtrack, and I wondered: is it possible to extract it from the original disk instead of relying on YouTube, or something similar, to listen to it?

Loading GCMs on Linux

I dumped my Burnout disk with some Wii homebrew 8 years ago or so. At that time, I named it with the .iso extension, but GC disks do not follow the ISO 9660 standard. They have a proprietary format that the community dubbed GCM.

Even the file utility can detect them:

$ file burnout.iso
burnout.iso: Nintendo GameCube disc image: "Burnout" (GBOP51, Rev.00)

So, I downloaded a tool called simply gcm-tool. Man, many years have passed since the last time I downloaded something from Google Code!

And I love that people provided an executable and its source code, in plain portable C or C++, without any GUI or additional dependencies. So a gcc *.c gets the work done, and the tool is immediately ready to use.

gcm can show the image’s file system with the -fs flag. For Burnout, its output starts like this:

GCM v0.4 (c)2007 by dsbomb
==============================
Filename: burnout.iso
Game Code: GBOP, Region: PAL
Maker: 51, Disk ID: 00, Version: 00
Audio streaming: No
Game Name: Burnout

Bootfile offset: 0x0001DA00
FST offset: 0x001F6000
FST size: 10196
FST max size: 10196

    1: /audio, Parent: 0, Next: 42
    2: /audio/music, Parent: 1, Next: 35
    3: /audio/music/DCredits.ulw, size: 11970560, offset: 00200000-00D6A7FF
    4: /audio/music/DMenu.ulw, size: 3641430, offset: 00D6A800-010E3855
    5: /audio/music/Dolby1.ulw, size: 8720384, offset: 010E3858-01934857
...

As you can see, each line has the following format:

file-number: path, size: NNNN, offset: XXXX-XXXX

It is super easy to parse with regular expressions, and it gives all the information needed to extract the files. Assuming the listing has been saved to files.txt and the image is named game.iso, the following Python script does exactly that:

#!/usr/bin/env python3
from pathlib import Path
import re

r = re.compile(r'/([^,]+), size: ([0-9]+), offset: ([0-9A-Z]+)-')

files = []
with open('files.txt') as f:
    for li in f:
        m = r.findall(li)
        if not m:
            continue
        p = Path('extract') / Path(m[0][0])
        files.append((p, int(m[0][1]), int(m[0][2], 16)))

with open('game.iso', 'rb') as f:
    for p, size, offset in files:
        f.seek(offset)
        if p.parents:
            p.parent.mkdir(parents=True, exist_ok=True)
        with open(p, 'wb') as out:
            out.write(f.read(size))

Please notice how I excluded the first slash from the capturing group to transform the absolute path to a relative one. I am really proud of this little hack 🤓️😎️.
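To see what the regular expression actually captures, here is a small check against one line of the listing above. The pattern is the one from the script, extended with a fourth group for the end offset; that extension is only for this illustration and is not needed for the extraction.

```python
import re

# Pattern from the script, plus an extra group for the end offset
# (the extra group is an addition made only for this check).
pattern = re.compile(
    r'/([^,]+), size: ([0-9]+), offset: ([0-9A-Z]+)-([0-9A-Z]+)')

line = ('    3: /audio/music/DCredits.ulw, size: 11970560, '
        'offset: 00200000-00D6A7FF')

path, size, start, end = pattern.search(line).groups()
size, start, end = int(size), int(start, 16), int(end, 16)

print(path)                     # audio/music/DCredits.ulw (no leading slash)
print(start + size - 1 == end)  # True: the range is inclusive

# Directory entries have no "size:" field, so they are skipped.
directory = '    1: /audio, Parent: 0, Next: 42'
print(pattern.search(directory))  # None
```

The second check also confirms that the end offset printed by gcm is inclusive, so the start offset and the size are all the script needs.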

Burnout

If you have another look at the sample output I provided before, you will see that the first directory is /audio, and then /audio/music. How lucky!

Files have the extension .ulw, which stands for µ-law. Usually, by PCM, we refer to linear PCM, which has uniform quantization levels. With µ-law, a logarithmic quantization is used instead, so that quieter signal levels get smaller quantization intervals and louder ones get larger intervals. In other words, you trade precision at loud levels for file size.
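How the quantization step varies with the signal level can be checked with a tiny decoder. This is a minimal sketch, assuming the G.711 flavour of µ-law with 8-bit codes expanded to a 16-bit scale; ulaw_decode is a name I made up, and the code is not taken from any of the tools mentioned in this post.

```python
def ulaw_decode(code: int) -> int:
    """Expand one 8-bit G.711 mu-law code to a signed linear sample."""
    code = ~code & 0xFF            # codes are stored bit-inverted
    sign = code & 0x80
    exponent = (code >> 4) & 0x07  # selects one of 8 segments
    mantissa = code & 0x0F         # 16 steps inside each segment
    magnitude = (((mantissa << 3) + 0x84) << exponent) - 0x84
    return -magnitude if sign else magnitude


# Distance between two adjacent codes near silence and near full scale.
quiet_step = ulaw_decode(0xFE) - ulaw_decode(0xFF)
loud_step = ulaw_decode(0x80) - ulaw_decode(0x81)
print(quiet_step, loud_step)  # 8 1024
```

Each segment doubles the step of the previous one, which is where the logarithmic behaviour comes from.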

In particular, Burnout’s files are encoded in G.711 and can be converted either with FFmpeg or with a utility specific to videogames: vgmstream.

ffmpeg -f mulaw -ar 32000 -ac 2 -i file.ulw -compression_level 12 file.flac

The sampling rate of 32kHz was chosen because it is GameCube’s maximum! Files beginning with D are Dolby Surround, but have 2 channels anyway, like files starting with S (Stereo). Finally, the first letter can be M, which stands for Mono, in which case files have only one channel.

I had never encountered non-linear PCM before. Apart from this, everything seemed standard and similar to what I expected. So I felt confident, and I decided to go ahead with Luigi’s Mansion and Pokémon Colosseum.
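The naming rule can be turned into a small helper that picks the channel count from the first letter. This is only a sketch: ffmpeg_command and the CHANNELS table are names of my own, and the FFmpeg options are the ones from the command above.

```python
from pathlib import Path

# First letter of the file name -> channel count, as described above.
CHANNELS = {'D': 2, 'S': 2, 'M': 1}


def ffmpeg_command(src: Path) -> list[str]:
    """Build the FFmpeg invocation for one .ulw file (hypothetical helper)."""
    channels = CHANNELS[src.name[0]]  # KeyError on an unexpected prefix
    return [
        'ffmpeg', '-f', 'mulaw', '-ar', '32000', '-ac', str(channels),
        '-i', str(src), '-compression_level', '12',
        str(src.with_suffix('.flac')),
    ]


print(' '.join(ffmpeg_command(Path('DMenu.ulw'))))
# ffmpeg -f mulaw -ar 32000 -ac 2 -i DMenu.ulw -compression_level 12 DMenu.flac
```

The returned list can be passed as is to subprocess.run to convert every file found under extract/audio/music.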

Luigi’s Mansion

So, some days later, I started having a look at Luigi’s Mansion… And my enthusiasm immediately faded. Sure, there is a /AudioRes directory, but it has only a few files, and its size is less than 20MB, so I could not believe all the audio was there.

But being 20 years late helps, and many secrets of this game have already been unveiled.

So, yes, that is the correct directory, and yes, all the audio is there. The trick is that Luigi’s Mansion music is sequenced.

Man, I could not believe this. In my mind, MIDIs produced unimpressive and unrealistic results, whose nature can be spotted immediately.

I wonder about the reason behind Nintendo’s choice. GameCube is powerful enough to rely on PCM, and it can even stream it from the drive. It is not even a question of space, as Luigi’s Mansion takes less than 300MB of a 1.46GB medium. Maybe it is just because developers were used to doing that on the Nintendo 64.

Anyway, on GitHub, you can find gctools, a set of programs to deal with video game files. In particular, a tool called smssynth can play Luigi’s Mansion music or convert it to PCM. The result is very accurate; only a few passages sounded a bit off to my memory.

The player also shows which notes are played, and some music passages have a spectacular graphic representation.

But at this point, I do not know whether it is better to create new renderings in this way or to dump sounds while playing them in Dolphin. Of course, recording from the original console is heresy (DAC followed by ADC 🤮️) and should always be avoided!

Pokémon Colosseum

I have already written about my memories of this game. Also, I have always thought its soundtrack is remarkable.

MusicBrainz has a Pokémon Colosseum release, and Google lists the same track order, but I cannot find their source. So, I decided to try to get the original files also in this case.

My only clue was the /sound/snd_music.samp file. It led me to this GitHub issue, which was revealing.

First, by looking at the project name, you can understand that Colosseum uses something called MusyX. If you look for other information, you can find an enthusiastic article from IGN, dated July 30, 1999. The GameCube name was still a secret, and they were discussing Dolphin in the future tense! How cute!

Secondly, @Nisto suggested extracting the files from common.fsys.

Another piece came from this thread on projectpokemon.org: the files from bgm_archive.fsys are needed, too.

In both cases, the .fdat extension of the archived files is bogus: it should be removed, and the last underscore should be replaced with a dot.

As for the extraction itself, QuickBMS and the scripts downloadable from this other thread have you covered. I tried to compile QuickBMS, but it is troublesome (32 bit compulsory in 2021 and subprojects with broken makefiles 😵️); luckily, I later noticed that the site provides Linux binaries.

Sadly, the same cannot be said of amuse, an unofficial project to deal with MusyX content, which is also mentioned in the previous GitHub issue.
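The renaming rule is easy to script. Here is a minimal sketch: fix_name is a name I made up, and snd_music_samp.fdat is only my guess at what an archived name looks like, based on the snd_music.samp file mentioned earlier.

```python
def fix_name(archived: str) -> str:
    """Drop the bogus .fdat suffix and turn the last underscore into a dot."""
    stem = archived.removesuffix('.fdat')  # needs Python 3.9 or later
    base, sep, ext = stem.rpartition('_')
    # Leave names without any underscore untouched.
    return f'{base}.{ext}' if sep else stem


print(fix_name('snd_music_samp.fdat'))  # snd_music.samp
```

Only the last underscore is touched, so names that contain other underscores keep them.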

Basically, it can only be compiled together with its dependencies, which have lots of dependencies, in turn. If you use Debian or a derivative, these should be enough (unless you also want the Qt GUI):

apt install libclang-dev zlib1g-dev libx11-dev libglew-dev libdbus-1-dev libxi-dev libasound2-dev libxrandr-dev libudev-dev libpulse-dev libx11-xcb-dev

Be advised: libclang-dev alone can take hundreds of MB!

After that, you can clone the Git repo, change the shebang of standalone_bootstrap.sh to bash and run it. Then you have to run the usual CMake stuff, build, and that’s it: you should have the amuserender command.

As suggested by its name, it takes a MusyX project (any one of .proj/.pool/.sdir/.samp; it will look for the other 3 in the same directory) and a .song, and renders it to a WAVE… Sorta.

My PAL copy of Colosseum contains 78 songs, but one is null_bgm.song, so it can probably be skipped. For each file, amuse asks you to choose among 77 MIDI configurations, possibly one per file… which is a lot. In my first attempt, I rendered all the files with the zeroth configuration. A few had wrong sequences; the rest had the correct tunes but lacked equalization and possibly other effects.

So, again, the original files were sequenced, and the WAVE reconstruction is not what I really wanted.

Conclusions

I was a bit disappointed that I could not reach my objective.

But I am happy that a lazy evening turned into an occasion to learn so much. I could never have imagined that a system released in those years employed these clever tricks.