Noisetracker/Soundtracker/Protracker Module Format

4th Revision

Credits: Lars Hamre, Norman Lin, Kurt Kennett, Mark Cox, Peter Hanning,
         Steinar Midtskogen, Marc Espie, and Thomas Meyer

(All numbers below are given in decimal)

Module Format:

# Bytes   Description
-------   -----------
20        The module's title, padded with null (\0) bytes. Original
          Protracker wrote letters only in uppercase.

(Data repeated for each sample 1-15 or 1-31)

22        Sample's name, padded with null bytes. If a name begins with a
          '#', it is assumed not to be an instrument name, and is
          probably a message.
2         Sample length in words (1 word = 2 bytes). The first word of
          the sample is overwritten by the tracker, so a length of 1
          still means an empty sample. See below for sample format.
1         Lowest four bits represent a signed nibble (-8..7) which is
          the finetune value for the sample. Each finetune step changes
          the note 1/8th of a semitone. Implemented by switching to a
          different table of period-values for each finetune value.
1         Volume of sample. Legal values are 0..64. Volume is the linear
          difference between sound intensities. 64 is full volume, and
          the change in decibels can be calculated with 20*log10(Vol/64)
2         Start of sample repeat offset in words. Once the sample has
          been played all of the way through, it will loop if the repeat
          length is greater than one. It repeats by jumping to this
          position in the sample and playing for the repeat length, then
          jumping back to this position, and playing for the repeat
          length, etc.
2         Length of sample repeat in words. Only loop if greater than 1.

(End of this sample's data.. each sample uses the same format and they
 are stored sequentially)

N.B. All 2 byte lengths are stored with the Hi-byte first, as is usual
     on the Amiga (big-endian format).

1         Number of song positions (ie. number of patterns played
          throughout the song). Legal values are 1..128.
1         Historically set to 127, but can be safely ignored.
          Noisetracker uses this byte to indicate restart position -
          this has been made redundant by the 'Position Jump' effect.
128       Pattern table: patterns to play in each song position
          (0..127). Each byte has a legal value of 0..63 (note the
          Protracker exception below). The highest value in this table
          is the highest pattern stored, no patterns above this value
          are stored.
(4)       The four letters "M.K." These are the initials of
          Unknown/D.O.C. who changed the format so it could handle 31
          samples (sorry.. they were not inserted by Mahoney & Kaktus).
          Startrekker puts "FLT4" or "FLT8" here to indicate the # of
          channels. If there are more than 64 patterns, Protracker will
          put "M!K!" here. You might also find: "4CHN", "6CHN" or "8CHN"
          which indicates 4, 6 or 8 channels respectively. If no letters
          are here, then this is the start of the pattern data, and only
          15 samples were present.

(Data repeated for each pattern:)

1024      Pattern data for each pattern (starting at 0).

(Each pattern has same format and is stored in numerical order.
 See below for pattern format)

(Data repeated for each sample:)

xxxxxx    The maximum size of a sample is 65535 words. Each sample is
          stored as a collection of bytes (length of a sample was given
          previously in the module). Each byte is a signed value
          (-128..127) which is the channel data. When a sample is played
          at a pitch of C2 (see below for pitches), about 8287 bytes of
          sample data are sent to the channel per second. Multiply the
          rate by the twelfth root of 2 (=1.0595) for each semitone
          increase in pitch eg. moving the pitch up 1 octave doubles the
          rate. The data is stored in the order it is played (eg. first
          byte is first byte played). The first word of the sample data
          is used to hold repeat information, and will overwrite any
          sample data that is there (but it is probably safer to set it
          to 0).

          The rate given above (8287) conveys an inaccurate picture of
          the module-format - in reality it is different for different
          Amigas.
          As the routines for playing were written to run off certain
          interrupts, for different Amiga computers the rate to send
          data to the channel will be different. For PAL machines the
          clock rate is 7093789.2 Hz and for NTSC machines it is
          7159090.5 Hz. When the clock rate is divided by twice the
          period number for the pitch it will give the rate to send the
          data to the channel, eg. for a PAL machine sending a note at
          C2 (period 428), the rate is 7093789.2/856 ~= 8287.1369

(Each sample is stored sequentially)

Pattern Format:

Each pattern is divided into 64 divisions. By allocating different
tempos for each pattern and spacing the notes across different amounts
of divisions, different bar sizes can be accommodated.

Each division contains the data for each channel (1..4) stored after
each other. Channels 1 and 4 are on the left, and channels 2 and 3 are
on the right. In the case of more channels: channels 5 and 8 are on the
left, and channels 6 and 7 are on the right, etc. Each channel's data
in the division has an identical format which consists of 2 words
(4 bytes).

Divisions are numbered 0..63. Each division may be divided into a
number of ticks (see 'set speed' effect below).

Channel Data:

(the four bytes of channel data in a pattern division)

 7654-3210 7654-3210 7654-3210 7654-3210
 wwww xxxxxxxxxxxxxx yyyy zzzzzzzzzzzzzz

    wwwwyyyy (8 bits)  is the sample for this channel/division
xxxxxxxxxxxx (12 bits) is the sample's period (or effect parameter)
zzzzzzzzzzzz (12 bits) is the effect for this channel/division

If there is to be no new sample to be played at this division on this
channel, then the old sample on this channel will continue, or at least
be "remembered" for any effects. If the sample is 0, then the previous
sample on that channel is used. Only one sample may play on a channel
at a time, so playing a new sample will cancel an old one - even if
there has been no data supplied for the new sample. Though, if you are
using a "silence" sample (ie.
no data, only used to turn off other samples) it is polite to set its
default volume to 0.

To determine what pitch the sample is to be played on, look up the
period in a table, such as the one below (for finetune 0). If the
period is 0, then the previous period on that channel is used.
Unfortunately, some modules do not use these exact values. It is best
to do a binary-search (unless you use the period as the offset of an
array of notes.. expensive), especially if you plan to use periods
outside the "standard" range. Periods are the internal representation
of the pitch, so effects that alter pitch (eg. sliding) alter the
period value (see "effects" below).

          C    C#   D    D#   E    F    F#   G    G#   A    A#   B
Octave 1: 856, 808, 762, 720, 678, 640, 604, 570, 538, 508, 480, 453
Octave 2: 428, 404, 381, 360, 339, 320, 302, 285, 269, 254, 240, 226
Octave 3: 214, 202, 190, 180, 170, 160, 151, 143, 135, 127, 120, 113

Octave 0:1712,1616,1525,1440,1357,1281,1209,1141,1077,1017, 961, 907
Octave 4: 107, 101,  95,  90,  85,  80,  76,  71,  67,  64,  60,  57

Octaves 0 and 4 are NOT standard, so don't rely on every tracker being
able to play them, or even not crashing if being given them - it's just
nice that if you can code it, to allow them to be read.

Effects:

Effects are written as groups of 4 bits, eg. 1871 = 7 * 256 + 4 * 16 +
15 = [7][4][15]. The high nibble (4 bits) usually determines the
effect, but if it is [14], then the second nibble is used as well.

[0]: Arpeggio
     Where [0][x][y] means "play note, note+x semitones, note+y
     semitones, then return to original note". The fluctuations are
     carried out evenly spaced in one pattern division. They are usually
     used to simulate chords, but this doesn't work too well. They are
     also used to produce heavy vibrato. A major chord is when x=4, y=7.
     A minor chord is when x=3, y=7.

[1]: Slide up
     Where [1][x][y] means "smoothly decrease the period of current
     sample by x*16+y after each tick in the division".
     The ticks/division are set with the 'set speed' effect (see below).
     If the period of the note being played is z, then the final period
     will be z - (x*16 + y)*(ticks - 1). As the slide rate depends on
     the speed, changing the speed will change the slide. You cannot
     slide beyond the note B3 (period 113).

[2]: Slide down
     Where [2][x][y] means "smoothly increase the period of current
     sample by x*16+y after each tick in the division". Similar to [1],
     but lowers the pitch. You cannot slide beyond the note C1
     (period 856).

[3]: Slide to note
     Where [3][x][y] means "smoothly change the period of current sample
     by x*16+y after each tick in the division, never sliding beyond
     current period". Any note in this channel's division is not played,
     but changes the "remembered" note - it can be thought of as a
     parameter to this effect. Sliding to a note is similar to effects
     [1] and [2], but the slide will not go beyond the given period, and
     the direction is implied by that period. If x and y are both 0,
     then the old slide will continue.

[4]: Vibrato
     Where [4][x][y] means "oscillate the sample pitch using a
     particular waveform with amplitude y/16 semitones, such that
     (x * ticks)/64 cycles occur in the division". The waveform is set
     using effect [14][4]. By placing vibrato effects on consecutive
     divisions, the vibrato effect can be maintained. If either x or y
     is 0, then the old vibrato values will be used.

[5]: Continue 'Slide to note', but also do Volume slide
     Where [5][x][y] means "either slide the volume up x*(ticks - 1) or
     slide the volume down y*(ticks - 1), at the same time as continuing
     the last 'Slide to note'". It is illegal for both x and y to be
     non-zero. You cannot slide outside the volume range 0..64. The
     period-length in this channel's division is a parameter to this
     effect, and hence is not played.
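
The period slides of effects [1], [2] and [3] can be sketched roughly as
below. This is only an illustration of the arithmetic given above, not
code from any tracker: the function names are made up here, and applying
the B3/C1 limits by clamping after every step is an assumption about how
"you cannot slide beyond" is enforced.

```python
MIN_PERIOD = 113  # B3, the limit given for 'Slide up'
MAX_PERIOD = 856  # C1, the limit given for 'Slide down'


def slide_up(period, x, y, ticks):
    """Effect [1]: lower the period by x*16+y after each tick."""
    step = x * 16 + y
    for _ in range(ticks - 1):  # one step after each tick but the last
        period = max(MIN_PERIOD, period - step)  # assumed clamping
    return period


def slide_down(period, x, y, ticks):
    """Effect [2]: raise the period by x*16+y after each tick."""
    step = x * 16 + y
    for _ in range(ticks - 1):
        period = min(MAX_PERIOD, period + step)  # assumed clamping
    return period


def slide_to_note(period, target, x, y, ticks):
    """Effect [3]: move towards 'target' without overshooting it."""
    step = x * 16 + y
    for _ in range(ticks - 1):
        if period > target:
            period = max(target, period - step)
        elif period < target:
            period = min(target, period + step)
    return period
```

With the default 6 ticks/division, slide_up(428, 0, 2, 6) performs five
steps of 2, which agrees with the formula z - (x*16 + y)*(ticks - 1).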

[6]: Continue 'Vibrato', but also do Volume slide
     Where [6][x][y] means "either slide the volume up x*(ticks - 1) or
     slide the volume down y*(ticks - 1), at the same time as continuing
     the last 'Vibrato'". It is illegal for both x and y to be non-zero.
     You cannot slide outside the volume range 0..64.

[7]: Tremolo
     Where [7][x][y] means "oscillate the sample volume using a
     particular waveform with amplitude y*(ticks - 1), such that
     (x * ticks)/64 cycles occur in the division". If either x or y is
     0, then the old tremolo values will be used. The waveform is set
     using effect [14][7]. Similar to [4].

[8]: (Set panning position)
     This command is unused by the vast majority of trackers, but one
     tracker for the PC (called DMP) uses this for setting the panning
     state of the channel. As this is very useful, I am documenting it
     here. The effect [8][x][y] means "set channel to panning position
     x*16 + y". Position 0 is left, 64 is centre, 128 is right.
     Interestingly, position 164 is defined as "surround".

[9]: Set sample offset
     Where [9][x][y] means "play the sample from offset x*4096 + y*256".
     The offset is measured in words. If no sample is given, yet one is
     still playing on this channel, it should be retriggered to the new
     offset using the current volume.

[10]: Volume slide
     Where [10][x][y] means "either slide the volume up x*(ticks - 1) or
     slide the volume down y*(ticks - 1)". If both x and y are non-zero,
     then the y value is ignored (assumed to be 0). You cannot slide
     outside the volume range 0..64.

[11]: Position Jump
     Where [11][x][y] means "stop the pattern after this division, and
     continue the song at song-position x*16+y". This shifts the
     'pattern-cursor' in the pattern table (see above). Legal values for
     x*16+y are from 0 to 127.

[12]: Set volume
     Where [12][x][y] means "set current sample's volume to x*16+y".
     Legal volumes are 0..64.
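
As a rough illustration of how an effect value breaks into nibbles and
how the volume effects [10] and [12] might then be applied, consider the
sketch below. The function names are invented for this example, and
clamping an out-of-range 'Set volume' to 64 is an assumption (the text
only says legal volumes are 0..64).

```python
def split_effect(effect):
    """Split a 12-bit effect value into its nibbles [e][x][y]."""
    return (effect >> 8) & 0xF, (effect >> 4) & 0xF, effect & 0xF


def clamp_volume(volume):
    """Keep a volume inside the legal range 0..64."""
    return max(0, min(64, volume))


def apply_volume_effect(volume, effect, ticks):
    """Return the channel volume at the end of one division."""
    e, x, y = split_effect(effect)
    if e == 10:  # [10] Volume slide
        # If x is non-zero, y is ignored (treated as 0).
        delta = x if x else -y
        for _ in range(ticks - 1):
            volume = clamp_volume(volume + delta)
        return volume
    if e == 12:  # [12] Set volume
        return clamp_volume(x * 16 + y)  # clamping is an assumption
    return volume  # other effects leave the volume alone in this sketch
```

For instance, split_effect(1871) gives the nibbles (7, 4, 15) used in
the example at the start of the Effects section.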

[13]: Pattern Break
     Where [13][x][y] means "stop the pattern after this division, and
     continue the song at the next pattern at division x*10+y" (the 10
     is not a typo). Legal divisions are from 0 to 63.

[14][0]: Set filter on/off
     Where [14][0][x] means "set sound filter ON if x is 0, and OFF if x
     is 1". This is a hardware command for some Amigas, so if you don't
     understand it, it is better not to use it.

[14][1]: Fineslide up
     Where [14][1][x] means "decrement the period of the current sample
     by x". The decrementing takes place at the beginning of the
     division, and hence there is no actual sliding. This type of
     sliding cannot be continued with effect [5]. You cannot slide
     beyond the note B3 (period 113).

[14][2]: Fineslide down
     Where [14][2][x] means "increment the period of the current sample
     by x". Similar to [14][1] but shifts the pitch down. You cannot
     slide beyond the note C1 (period 856).

[14][3]: Set glissando on/off
     Where [14][3][x] means "set glissando ON if x is 1, OFF if x is 0".
     Used in conjunction with [3] ('Slide to note'). If glissando is on,
     then 'Slide to note' will slide in semitones, otherwise will
     perform the default smooth slide.

[14][4]: Set vibrato waveform
     Where [14][4][x] means "set the waveform of succeeding 'vibrato'
     effects to wave #x". [4] is the 'vibrato' effect. Possible values
     for x are:

          0 - sine (default)        /\  /\     (2 cycles shown)
          4  (without retrigger)      \/  \/

          1 - ramp down            | \   | \
          5  (without retrigger)       \ |   \ |

          2 - square               ,--,  ,--,
          6  (without retrigger)      '--'  '--'

          3 - random: a random choice of one of the above.
          7  (without retrigger)

     If the waveform is selected "without retrigger", then it will not
     be retriggered from the beginning at the start of each new note.

[14][5]: Set finetune value
     Where [14][5][x] means "sets the finetune value of the current
     sample to the signed nibble x". x has legal values of 0..15,
     corresponding to signed nibbles 0..7,-8..-1 (see start of text for
     more info on finetune values).
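
The small conversions used by effects [13], [14][4] and [14][5] can be
sketched as below. The function names are invented for illustration,
and treating the waveform parameter as a 2-bit shape plus a "without
retrigger" bit is just one way to read the table above; values of x
above 7 are not covered by the text.

```python
def finetune_from_nibble(x):
    """Effect [14][5]: map 0..15 to the signed nibble 0..7,-8..-1."""
    x &= 0xF
    return x - 16 if x >= 8 else x


def decode_waveform(x):
    """Effects [14][4]/[14][7]: return (shape, retrigger) for x in 0..7.

    shape is 0 (sine), 1 (ramp down), 2 (square) or 3 (random).
    """
    return x & 3, (x & 4) == 0


def pattern_break_division(x, y):
    """Effect [13]: the target division is x*10+y, not x*16+y."""
    return x * 10 + y
```

For example, finetune_from_nibble(15) is -1, and the nibbles [3][2] of a
'Pattern Break' select division 32 rather than 50.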

[14][6]: Loop pattern
     Where [14][6][x] means "set the start of a loop to this division if
     x is 0, otherwise after this division, jump back to the start of a
     loop and play it another x times before continuing". If the start
     of the loop was not set, it will default to the start of the
     current pattern. Hence 'loop pattern' cannot be performed across
     multiple patterns. Note that loops do not support nesting, and you
     may generate an infinite loop if you try to nest 'loop pattern's.

[14][7]: Set tremolo waveform
     Where [14][7][x] means "set the waveform of succeeding 'tremolo'
     effects to wave #x". Similar to [14][4], but alters effect [7] -
     the 'tremolo' effect.

[14][8]: -- Unused --

[14][9]: Retrigger sample
     Where [14][9][x] means "trigger current sample every x ticks in
     this division". If x is 0, then no retriggering is done (acts as if
     no effect was chosen), otherwise the retriggering begins on the
     first tick and then x ticks after that, etc.

[14][10]: Fine volume slide up
     Where [14][10][x] means "increment the volume of the current sample
     by x". The incrementing takes place at the beginning of the
     division, and hence there is no sliding. You cannot slide beyond
     volume 64.

[14][11]: Fine volume slide down
     Where [14][11][x] means "decrement the volume of the current sample
     by x". Similar to [14][10] but lowers volume. You cannot slide
     beyond volume 0.

[14][12]: Cut sample
     Where [14][12][x] means "after the current sample has been played
     for x ticks in this division, its volume will be set to 0". This
     implies that if x is 0, then you will not hear any of the sample.
     If you wish to insert "silence" in a pattern, it is better to use a
     "silence"-sample (see above) due to the lack of proper support for
     this effect.

[14][13]: Delay sample
     Where [14][13][x] means "do not start this division's sample for
     the first x ticks in this division, play the sample after this".
     This implies that if x is 0, then you will hear no delay, but
     actually there will be a VERY small delay.
     Note that this effect only influences a sample if it was started in
     this division.

[14][14]: Delay pattern
     Where [14][14][x] means "after this division there will be a delay
     equivalent to the time taken to play x divisions after which the
     pattern will be resumed". The delay only relates to the
     interpreting of new divisions, and all effects and previous notes
     continue during delay.

[14][15]: Invert loop
     Where [14][15][x] means "if x is greater than 0, then play the
     current sample's loop upside down at speed x". Each byte in the
     sample's loop will have its sign changed (negated). It will only
     work if the sample's loop (defined previously) is not too big. The
     speed is based on an internal table.

[15]: Set speed
     Where [15][x][y] means "set speed to x*16+y". Though it is nowhere
     near that simple. Let z = x*16+y. Depending on what values z takes,
     different units of speed are set, there being two: ticks/division
     and beats/minute (though this one is only a label and not strictly
     true). If z=0, then what should technically happen is that the
     module stops, but in practice it is treated as if z=1, because
     there is already a method for stopping the module (running out of
     patterns). If z<=32, then it means "set ticks/division to z"
     otherwise it means "set beats/minute to z" (convention says that
     this should read "If z<32.." but there are some composers out there
     that defy conventions). Default values are 6 ticks/division, and
     125 beats/minute (4 divisions = 1 beat). The beats/minute tag is
     only meaningful for 6 ticks/division. To get a more accurate view
     of how things work, use the following formula:

                             24 * beats/minute
          divisions/minute = -----------------
                              ticks/division

     Hence divisions/minute range from 24.75 to 6120, eg. to get a value
     of 2000 divisions/minute use 3 ticks/division and 250 beats/minute.
     If multiple "set speed" effects are performed in a single division,
     the ones on higher-numbered channels take precedence over the ones
     on lower-numbered channels.
     This effect has a large number of different implementations, but
     the one described here has the widest usage.

N.B. This document should be fairly accurate now, but as the module
     format is more of an observation than a standard, a couple of
     effects cannot be relied upon to act exactly the same from tracker
     to tracker (especially if the tracker is not for the Amiga). It is
     probably better to use this document as a guide rather than as a
     hard-and-fast definition of the module format.

Oh.. and yes, I would normally give bytes as hex values, but it is
easier to understand a consistent notation.

Andrew Scott, Author of MIDIMOD (MOD to MIDI converter),
              PTMID (MIDI to MOD converter)
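
Worked example for effect [15] ('Set speed'): the sketch below follows
the interpretation described in that section (z=0 treated as z=1,
z<=32 sets ticks/division, anything larger sets beats/minute) together
with the divisions/minute formula. The function names and the way the
current state is passed in are invented for this example; other
trackers may well behave differently.

```python
def apply_set_speed(z, ticks_per_division=6, beats_per_minute=125):
    """Return the (ticks/division, beats/minute) pair after [15] with z."""
    if z == 0:
        z = 1  # treated as z=1 in practice, as described in the text
    if z <= 32:
        ticks_per_division = z
    else:
        beats_per_minute = z
    return ticks_per_division, beats_per_minute


def divisions_per_minute(ticks_per_division, beats_per_minute):
    """divisions/minute = 24 * beats/minute / ticks/division."""
    return 24 * beats_per_minute / ticks_per_division
```

With the defaults (6 ticks/division, 125 beats/minute) this gives 500
divisions/minute, ie. 125 beats of 4 divisions each.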