Reverse Engineering
of
FF7 PSX sound effects system

First release, 2018-04-28

INTRO

The PSX version of Final Fantasy VII uses sound effects synthesized in real time, unlike the PC version, which uses plain samples. Generating them on the fly requires only the instructions telling the sound engine how to build them. The game manages to pack 700+ sounds into 50KB worth of data, which is a nice accomplishment. Unfortunately the format seemed quite obscure, and nobody in the modding community had bothered reverse-engineering it. So I gave it some time and wrote this document. It should give you a good base, but keep in mind that nothing here is 100% accurate: things were figured out through intensive testing, listening and comparison. Anyway, being the developer of a sound synthesis program (FM Composer) helped a lot!


FF7 PSXSFX

This tool is based on my findings. It can play the sounds (more or less accurately) and show the commands used, with the read cursor shown in real time.

Download v0.2 binary
Download v0.1 binary



Discussion on Qhimm forums

How it works

Sounds are generated in real time using sampled waveforms found in INSTR.ALL, another file of the game. It contains simple waveforms like sine, triangle, and the instruments used for the songs. Waveforms are played like a musical instrument, with notes and such, except that the playback speed is usually fast, with a lot of pitch/volume effects.

Overall, the FF7 SFX system is quite close to a classic synthesizer: notes, envelopes, LFOs. The most remarkable thing is the multi-level (nested) loops the format provides, which allow complex sounds to be created with a few commands.


Looping hell

The instructions telling the system how to generate the sounds are found in the EFFECT.ALL file, located in the SOUND folder on your FF7 CD-Rom.

The EFFECT.ALL file

The file is made of two parts, the header and the body.

Main structure :

Offset   Size         Description
0x00     4 bytes      FF FF FF FF signature
0x04     4092 bytes   Fixed-size offset table for the sounds. Positions are given relative to the beginning of the sound data chunk, which starts at 0x1000. Each offset is 2 bytes long, unsigned, little endian. Some offsets are empty (0xFFFF) or point to sounds that contain no actual note commands. I don't know the reason for this.
0x1000   * bytes      The actual sound data. Sounds can be extracted from here using the relative positions given by the offset table.

The offset table is currently using only 2924 bytes of 4092, the rest being padded with 0xFF, thus it *may* be possible to add new sounds.
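As an illustration of the layout above, here is a minimal sketch that looks up a sound in the offset table. It assumes the whole EFFECT.ALL file has already been loaded into memory; the function and macro names are hypothetical, not taken from the game.

```c
#include <stddef.h>
#include <stdint.h>

#define EFFECT_TABLE_START 0x04    /* offset table begins right after the signature */
#define EFFECT_DATA_START  0x1000  /* beginning of the sound data chunk */
#define EFFECT_MAX_SOUNDS  ((EFFECT_DATA_START - EFFECT_TABLE_START) / 2)
#define EFFECT_EMPTY_SLOT  0xFFFF

/* Returns the absolute file position of sound `index`, or -1 when the
   signature is wrong, the index is out of range, the slot is empty
   or the position falls outside the buffer.
   `file` is assumed to hold the whole EFFECT.ALL file (hypothetical helper). */
long effect_sound_position(const uint8_t *file, size_t size, unsigned index)
{
	size_t entry;
	size_t pos;
	unsigned rel;

	if (size < EFFECT_DATA_START || index >= EFFECT_MAX_SOUNDS)
		return -1;
	if (file[0] != 0xFF || file[1] != 0xFF || file[2] != 0xFF || file[3] != 0xFF)
		return -1;

	/* Each entry is a 2-byte unsigned little endian value */
	entry = EFFECT_TABLE_START + (size_t)index * 2;
	rel = file[entry] | ((unsigned)file[entry + 1] << 8);
	if (rel == EFFECT_EMPTY_SLOT)
		return -1;

	/* Offsets are relative to the sound data chunk */
	pos = (size_t)EFFECT_DATA_START + rel;
	return pos < size ? (long)pos : -1;
}
```

Reading must then continue sequentially from the returned position.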

Instructions can be of three different types : commands, parameters, and notes. After seeking to a sound with the help of the offset table, data must be read sequentially. All commands and parameters are read instantly, except the notes : they will cause the system to wait for the note duration, before reading the next byte.

Commands

The first thing to do is to check whether the read byte is >= 0xA0, in which case it's a command. A command has a fixed number of parameters after it. When the read byte isn't a command or a parameter, it's a note.

Commands are always called before playing a note: no command can alter a note after it has started playing.
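The check can be sketched as follows, using the ranges given in this document (commands are >= 0xA0, notes are <= 0x83). The enum and function names are hypothetical, and the function must only be called on a byte read at an instruction position, never on a command parameter.

```c
typedef enum { BYTE_NOTE, BYTE_COMMAND, BYTE_OTHER } ByteKind;

/* Classifies a byte read at an instruction position (not a parameter) */
ByteKind classify_byte(unsigned char b)
{
	if (b >= 0xA0)
		return BYTE_COMMAND;
	if (b <= 0x83)
		return BYTE_NOTE;
	return BYTE_OTHER;  /* 0x84 to 0x9F: neither a note nor a command */
}
```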

Code   Params   Description
0xA0   0        End of message. All sounds finish with this command, or with 0xCA when they are looping sounds.
0xA1   1        Set the waveform to use. If notes are played without calling this command first, waveform #05 is used.
0xA2   1        Set the next note duration with great precision. Overrides the duration included in the next note byte.
0xA3   1        Set the note volume. Default is 0x80. Range is from 0x00 to 0xFF. The value stays in effect for all the following notes.
0xA4   2        Pitch slide for the next note. First parameter is the slide speed (0-127), second is the destination pitch.
0xA5   1        Set the current octave (2 is the default). The octave affects the note frequency.
0xA6   0        Octave +1
0xA7   0        Octave -1
0xA8   1        Set the global volume (0-127). Default seems to be 80 if the command isn't called.
0xA9   2        Global volume fade. First parameter is the speed (0-255, high value = slow), second is the destination volume (0-127).
0xAA   1        ?
0xAB   2        ?
0xAC   1        Set the frequency of the noise generator. Values less than 0x40 set the frequency as an absolute value, from high (0x00) to low (0x3F). Values greater than 0x3F are added to the current frequency, making it lower.
0xAD   1        Set attack rate (0x00-0x7F), low value = fast, high value = slow.
0xB0   2        Set decay + sustain. 1st parameter is the decay rate, 2nd parameter is the sustain level.
0xB1   1        Set decay.
0xB4   3        Set frequency LFO. 1st parameter is the depth, 2nd parameter is the speed, 3rd parameter sets the LFO shape.
0xB6   0        Stop frequency LFO.
0xB8   3        Set amplitude (volume) LFO.
0xBA   0        Stop volume LFO.
0xC0   1        Absolute transposition. A parameter <= 0x7F gives a positive transposition, a value >= 0x80 gives a negative transposition, counting from 0xFF down to 0x80 (reversed). The transposition stays active for all notes played after.
0xC1   1        Relative transposition (adds to the previous transposition), each step is a semitone. A value <= 0x7F gives a positive transposition, a value >= 0x80 gives a negative transposition, counting from 0xFF down to 0x80 (reversed). The transposition stays active for all notes played after.
0xC2   0        Play the following notes on a channel with reverb, if reverb is enabled.
0xC3   0        Play the following notes on a channel without reverb (default).
0xC4   0        Tell the engine to use the noise generator instead of sampled waveforms. Stays active until the C5 command is found.
0xC5   0        Stop the noise generator and use the waveforms instead. The waveform number is restored to what it was before the C4 command occurred.
0xC8   0        Loop start marker.
0xC9   1        Jump to the corresponding C8 marker, repeating the number of times specified by the parameter. C8/C9 markers can be nested, so the outer C9 must point to the outer C8, and likewise for the inner ones. When a nested block has finished looping, its loop counter must be reset to zero, since the block can be played again by an enclosing loop.
0xCA   0        Infinite loop to the matching loop start point (C8).
0xCC   0        Enable legato. Notes played after this command will be slurred.
0xCD   0        Disable legato.
0xD8   1        Global fine tuning. This value is used to multiply the frequency of the notes played, allowing precise tuning. If the parameter value is lower than 0x7F, the pitch is made higher. Otherwise it's made lower, counting from 0xFF down to 0x80 (reversed).
0xD9   1        Global relative fine tuning. Same as D8, except that it adds the value to the existing global tuning.
0xDC   1        Set the duration for all the upcoming notes (same as A2, except that it doesn't apply only to the next note).
0xDD   2        Create a depth fade for the frequency LFO. First parameter is the fade speed, second parameter is the destination depth.
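To show how the parameter counts and the C8/C9 markers fit together, here is a rough sketch that walks through one sound and counts the notes it triggers. It is not the game's code. Several things are assumptions: the C9 parameter is taken as the total number of times the block is played, bytes from 0x84 to 0x9F are treated as taking no parameter, the nesting depth is capped at 8, and all names are hypothetical.

```c
#include <stddef.h>

#define MAX_LOOP_DEPTH 8  /* assumed limit, the real one is unknown */

/* Number of parameter bytes following a command, -1 if the command is not documented */
int command_param_count(unsigned char cmd)
{
	switch (cmd) {
	case 0xA0: case 0xA6: case 0xA7: case 0xB6: case 0xBA:
	case 0xC2: case 0xC3: case 0xC4: case 0xC5: case 0xC8:
	case 0xCA: case 0xCC: case 0xCD:
		return 0;
	case 0xA1: case 0xA2: case 0xA3: case 0xA5: case 0xA8:
	case 0xAA: case 0xAC: case 0xAD: case 0xB1: case 0xC0:
	case 0xC1: case 0xC9: case 0xD8: case 0xD9: case 0xDC:
		return 1;
	case 0xA4: case 0xA9: case 0xAB: case 0xB0: case 0xDD:
		return 2;
	case 0xB4: case 0xB8:
		return 3;
	default:
		return -1;
	}
}

/* Counts the notes triggered by one sound, stopping at A0 or CA.
   Returns -1 on malformed data or on an undocumented command. */
long count_notes(const unsigned char *data, size_t size)
{
	size_t loop_start[MAX_LOOP_DEPTH];   /* position right after each open C8 */
	unsigned loop_count[MAX_LOOP_DEPTH]; /* passes already done for each open C8 */
	int depth = 0;
	size_t pos = 0;
	long notes = 0;
	unsigned repeats;
	int n;

	while (pos < size) {
		unsigned char b = data[pos++];
		if (b < 0xA0) {
			if (b <= 0x83)
				notes++;          /* 0x84-0x9F assumed to take no parameter */
		} else if (b == 0xA0 || b == 0xCA) {
			break;                /* end of message, or infinite loop */
		} else if (b == 0xC8) {
			if (depth == MAX_LOOP_DEPTH)
				return -1;
			loop_start[depth] = pos;
			loop_count[depth] = 0;    /* counter reset each time the block is entered */
			depth++;
		} else if (b == 0xC9) {
			if (depth == 0 || pos >= size)
				return -1;
			repeats = data[pos++];
			if (++loop_count[depth - 1] < repeats)
				pos = loop_start[depth - 1];  /* play the block again */
			else
				depth--;                      /* innermost loop is finished */
		} else {
			n = command_param_count(b);
			if (n < 0)
				return -1;
			pos += (size_t)n;     /* skip the parameters */
		}
	}
	return notes;
}
```

Because an inner C8 is read again on every pass of the outer loop, its counter restarts from zero each time, which is the reset behaviour described for C9.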

Notes

Notes are bytes <= 0x83. A single byte tells which note to play and how long it lasts. You find the note number by doing value/11, and the note duration index by doing value%11.

To get the final MIDI note number from your value/11, you must add the current octave number multiplied by 12, add 14, and apply the global tuning. Which leads to:

Final note = ((note byte)/11 + octave*12 + 14 + transposition)*(1+tuning)

transposition is 0 by default, unless modified by the C0/C1 commands.
tuning is 0 by default, unless modified by D8/D9 commands
octave is 2 by default, unless modified by A5/A6/A7 commands
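The decoding above can be sketched like this. The helper names are hypothetical. The "reversed" negative range of the transposition parameters is read here as an ordinary signed byte (0xFF = -1, 0x80 = -128), which is an assumption, and the tuning factor is left out because its scale isn't established.

```c
#include <stdint.h>

/* Converts a transposition parameter to a signed value:
   0x00..0x7F stay positive, 0xFF..0x80 become -1..-128 (assumed reading) */
int signed_param(uint8_t p)
{
	return p <= 0x7F ? (int)p : (int)p - 256;
}

/* Note number inside the octave (0..11) */
int note_index(uint8_t note_byte)
{
	return note_byte / 11;
}

/* Index into the duration table (0..10) */
int duration_index(uint8_t note_byte)
{
	return note_byte % 11;
}

/* MIDI note number before the global tuning is applied */
int midi_note(uint8_t note_byte, int octave, int transposition)
{
	return note_index(note_byte) + octave * 12 + 14 + transposition;
}
```

With the default octave (2) and no transposition, note byte 0x00 gives 0 + 24 + 14 = 38.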

Durations are a bit strange, since the 11 possible values (from 0 to 0x0A) don't produce a very logical pattern of durations.

int durations[11] = {80000, 40000, 20000, 10000, 5000, 2500, 1250, 13000, 7000, 4000, 3000};

The duration clearly halves at each step for the first values, but the pattern breaks at some point. Anyway, those values were purely guessed by ear, so they aren't exact. They are expressed in samples, assuming an audio output rate of 44100Hz.
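For reference, a small sketch converting a note byte to a duration in seconds, taking the table values as sample counts at a 44100 Hz output rate. The function name is hypothetical and the table is the ear-guessed one given above, so results are approximate.

```c
#define OUTPUT_RATE 44100

/* Ear-guessed durations from this document, in samples at 44100 Hz */
static const int note_durations[11] = {
	80000, 40000, 20000, 10000, 5000, 2500, 1250, 13000, 7000, 4000, 3000
};

/* Duration in seconds of the note encoded in `note_byte` (hypothetical helper) */
double note_duration_seconds(unsigned char note_byte)
{
	return (double)note_durations[note_byte % 11] / OUTPUT_RATE;
}
```

The longest value (index 0) lasts about 1.8 seconds.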

When a note is read, the system must wait for the note duration before reading the next bytes. This is easy to do, as in the following pseudocode:

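/* Both counters are expressed in output samples */
unsigned long timer = 0;   /* 0 = sound stopped, otherwise running sample counter */
unsigned long trigger = 0; /* value of timer at which the next bytes must be read */

void play_sound(){
	timer=1;
	trigger=0;
}

void stop_sound(){
	timer=0;
}

/* Assumed to be called once per output sample */
void audio_callback(){
	if(timer>0){
		if(trigger <= timer)
		{
			/* Commands and parameters are handled instantly, a note suspends the reading */
			while(read data...)
			{
				if(note)
				{
					trigger += note_duration;
					break;
				}
			}
		}

		timer++;
		output_sound...;
	}
}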

Stopping and waiting

Since notes are <= 0x83 and commands are >= 0xA0, this leaves a range of values which are mostly unused except two of them (and maybe more I haven't discovered) :

Waveforms

The sound system uses waveforms to build the sounds. They are found in INSTR.ALL, which also contains the instrument samples used for the music. The sound engine can access any waveform, including the instruments. To read this file, INSTR.DAT provides the offsets and parameters for each sound. INSTR.DAT is made of 64-byte blocks, each block describing one sound, in the same order as the sounds appear in INSTR.ALL:

INSTR.DAT block structure :

Offset   Size      Type           Description
0x00     4 bytes   unsigned int   Offset of the sound + 0x1000
0x04     4 bytes   unsigned int   Loop start offset + 0x1000 (loop starts are given as absolute file offsets, it's not the offset from the start of the sound!)
0x10     4 bytes   unsigned int   Tuning of the sound. The sound playback frequency will be multiplied by this value.

/!\ Subtract 0x1000 from all offsets before using them for seeking in INSTR.ALL

The rest of this 64-byte block is unknown. All fields are 4 bytes long.
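A minimal sketch of reading one block, assuming INSTR.DAT is loaded in memory and that the fields are little endian like the offsets in EFFECT.ALL (an assumption, this document doesn't state it for INSTR.DAT). The struct and function names are hypothetical, and the tuning is returned as a raw value since its scale isn't described.

```c
#include <stddef.h>
#include <stdint.h>

#define INSTR_BLOCK_SIZE  64
#define INSTR_OFFSET_BIAS 0x1000  /* must be subtracted before seeking in INSTR.ALL */

typedef struct {
	uint32_t sound_offset;  /* seek position of the sound in INSTR.ALL */
	uint32_t loop_offset;   /* absolute seek position of the loop start */
	uint32_t tuning;        /* raw tuning value */
} InstrInfo;

/* Reads a 4-byte unsigned value, assumed little endian */
static uint32_t read_u32_le(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Fills `out` with the known fields of block `index`.
   Returns 1 on success, 0 if the block is out of range or an offset is below 0x1000. */
int instr_read_block(const uint8_t *dat, size_t size, unsigned index, InstrInfo *out)
{
	const uint8_t *block;
	uint32_t sound, loop;

	if (((size_t)index + 1) * INSTR_BLOCK_SIZE > size)
		return 0;
	block = dat + (size_t)index * INSTR_BLOCK_SIZE;

	sound = read_u32_le(block + 0x00);
	loop = read_u32_le(block + 0x04);
	if (sound < INSTR_OFFSET_BIAS || loop < INSTR_OFFSET_BIAS)
		return 0;

	out->sound_offset = sound - INSTR_OFFSET_BIAS;
	out->loop_offset = loop - INSTR_OFFSET_BIAS;
	out->tuning = read_u32_le(block + 0x10);
	return 1;
}
```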

Sound compression

Data found in INSTR.ALL is mono and ADPCM-compressed. It's a custom ADPCM made by Sony, which can be converted to PCM using this nice library (adpcm.h / adpcm.cpp).

ADPCM expands at a linear 1:1.75 ratio (each 16-byte ADPCM block decodes to 28 PCM samples). Therefore all sound offsets and loop points must be multiplied by 1.75 to get sample positions if you work with the decoded PCM instead of decoding ADPCM on the fly.
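The conversion can be done with integers only, applying the 16-to-28 ratio given above. This is just a sketch: the function name is hypothetical, and offsets are assumed to fall on 16-byte block boundaries (anything else is rounded down to the start of its block).

```c
#include <stdint.h>

#define ADPCM_BLOCK_BYTES 16  /* size of one compressed block */
#define ADPCM_BLOCK_UNITS 28  /* decoded PCM values produced by one block */

/* Converts a position in the ADPCM data to the matching position
   in the decoded PCM data (hypothetical helper) */
uint32_t adpcm_to_pcm_position(uint32_t adpcm_offset)
{
	return adpcm_offset / ADPCM_BLOCK_BYTES * ADPCM_BLOCK_UNITS;
}
```

For example, an ADPCM offset of 0x1000 (4096) maps to PCM position 7168, which is 4096 x 1.75.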

Noise generator

In addition to sampled waveforms, the FF7 sound system can use the PSX SPU's ability to create noise. It's 4-bit noise, which means it has only 16 different amplitude steps. The frequency controls the rate at which a new random amplitude is generated. This creates a very distinctive, lo-fi noise.

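#include <stdlib.h>  /* rand() */

int noise_generator_freq;   /* reverse-frequency: number of updates between two random values */
int noise_generator_value;  /* current output amplitude of the noise generator */

/* A timer will let us know when to generate the next random value, depending on its reverse-frequency */
int noise_generator_timer = 0;

void update_noise(){
	noise_generator_timer--;
	if (noise_generator_timer <= 0)
	{
		/* The 3000 value is the volume, it needs to be matched with the waveforms volume. On FF7, it seems a bit louder. */

		/* 16 amplitude steps, centered around zero */
		noise_generator_value = (rand()%16-8)*3000;

		noise_generator_timer = noise_generator_freq;
	}
}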

In the game, noise seems heavily low-pass filtered depending on its frequency.

Low Frequency Oscillator (LFO)

Several LFO shapes are available through the third parameter of the LFO commands. They can be reproduced using tables with the right values. I have only reverse engineered the first 11, and not very accurately. LFO speed and depth values are VERY annoying to reproduce, since they don't seem to match any linear or basic multiplicative scale. I haven't found any correct formula, so I brute forced it by finding values by ear for some points (like 1 - 10 - 20 - 30 etc.) and interpolating between them. It would be better to know which table/formula they used...
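The interpolation approach can be sketched as below. Linear interpolation is assumed here, and the type and function names are hypothetical. The control points themselves have to come from your own listening tests, none are provided in this document.

```c
#include <stddef.h>

/* One point found by ear: a parameter value and the matching speed or depth */
typedef struct {
	int param;
	double value;
} LfoPoint;

/* Linearly interpolates between the points surrounding `param`.
   `pts` must be sorted by strictly ascending `param`.
   Values outside the covered range are clamped to the first or last point. */
double lfo_interpolate(const LfoPoint *pts, size_t count, int param)
{
	size_t i;
	double t;

	if (count == 0)
		return 0.0;
	if (param <= pts[0].param)
		return pts[0].value;

	for (i = 1; i < count; i++) {
		if (param <= pts[i].param) {
			/* Position of param between the two neighbours, from 0 to 1 */
			t = (double)(param - pts[i - 1].param) /
			    (double)(pts[i].param - pts[i - 1].param);
			return pts[i - 1].value + t * (pts[i].value - pts[i - 1].value);
		}
	}
	return pts[count - 1].value;
}
```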

MIDI note numbers and frequencies

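A frequency table in Hz can be generated in C/C++ using the following code:

#include <math.h>

float noteFreq[128];

/* 13.75 Hz is MIDI note 9, so this is the same as 440 * 2^((x - 69) / 12) */
void init_note_freq(void)
{
	for (unsigned x = 0; x < 128; ++x)
		noteFreq[x] = (float)(13.75 * pow(2.0, (x - 9.0) / 12.0));
}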

This table has nothing to do with the actual implementation in the game (which may use pre-calculated increments instead of frequencies in Hz). I'm providing it only as a convenience.

This isn't finished work

It's only an attempt at understanding how this format works. I'd be glad if it helps someone create their own set of sound effects for the game, or gives ideas for a new engine designed from scratch. If you want to talk with me about this subject, or have any questions, please contact me at stephane.damo@gmail.com