Friday, April 1, 2022

llvm-mos SID player for Commodore 64

In my previous post I described llvm-mos, a new LLVM backend that can generate code for the 6502 CPU, and I presented a simple program for an 8-bit Atari. Today I want to show you how to use assembly code in your C program with a more complicated example: a music player for the Commodore 64.

The most popular C64 chiptune format is SID. A SID file is basically 6502 machine code with two main routines: an init routine, which sets up the sound chip, and a play routine, which must be called once per screen frame. The biggest collection of SID tunes is the High Voltage SID Collection. Many players can play SID music files on modern devices, for example ZXTune or VLC.

Preparing a SID file

In order to play a SID tune on a real Commodore 64 we must determine where it should be located in memory, and what the addresses of the init and play routines are. This information can be found in the SID file header. Bytes 8 and 9 hold the load address; if they are zero, the data are in the original C64 binary file format, i.e. the first two bytes of the data contain the little-endian load address (low byte, high byte). Bytes 10 and 11 hold the init address, and bytes 12 and 13 contain the play address. Each tune can use different addresses, so it's necessary to check them carefully. For this project I'll be using one of my favourite C64 tunes, "Ginger" by Kristian Røstøen. When you analyze the header of this particular tune you'll notice that the init address is $1000, the play address is $1003, and the binary data, which begin at offset 124, already contain the load address $1000. We can use this information to extract the machine code from the SID file:
dd if=ginger.sid of=ginger.prg bs=1 skip=124
The resulting file ginger.prg can be put on a floppy disk and loaded on a Commodore 64 with the following command:
LOAD "GINGER",8,1
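If you prefer not to read the header by hand, the fields described above can be pulled out with a few lines of C on a modern machine. This is only a minimal sketch: the struct and function names are my own invention, and it assumes the header fields are stored big-endian (high byte first) with the data offset in bytes 6 and 7, as the SID file format description states.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper type holding the header fields discussed above. */
struct sid_header {
    uint16_t data_offset;  /* bytes 6-7: where the C64 data start */
    uint16_t load_addr;    /* bytes 8-9: 0 means "stored in the data" */
    uint16_t init_addr;    /* bytes 10-11 */
    uint16_t play_addr;    /* bytes 12-13 */
};

/* Header fields are big-endian: high byte first. */
static uint16_t read_be16(const unsigned char *p) {
    return (uint16_t)((p[0] << 8) | p[1]);
}

/* Returns 0 on success, -1 if the buffer does not look like a SID file. */
int sid_parse_header(const unsigned char *buf, size_t len,
                     struct sid_header *out) {
    if (len < 14) return -1;
    if (memcmp(buf, "PSID", 4) != 0 && memcmp(buf, "RSID", 4) != 0) return -1;

    out->data_offset = read_be16(buf + 6);
    out->load_addr   = read_be16(buf + 8);
    out->init_addr   = read_be16(buf + 10);
    out->play_addr   = read_be16(buf + 12);

    if (out->load_addr == 0) {
        /* Original C64 format: the first two data bytes hold the
           load address, little-endian (low byte, high byte). */
        if (len < (size_t)out->data_offset + 2) return -1;
        out->load_addr = (uint16_t)(buf[out->data_offset] |
                                    (buf[out->data_offset + 1] << 8));
    }
    return 0;
}
```

For a tune like the one used here, the parsed data offset is the value to pass to dd as skip.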

Loading a SID file

Instead of loading the file into memory by hand, we can write some code to do it for us. Unfortunately, llvm-mos does not have any library functions for this at the moment, but fortunately the Commodore 64 ROM does. The easiest way to use the ROM functions is through assembly, so let's try to write some inline assembly in llvm-mos:
char *fname = "GINGER";
unsigned char fnamelo = (unsigned int)fname & 0xFF;
unsigned char fnamehi = (unsigned int)fname >> 8;
unsigned char fnamelen = strlen(fname);

asm("JSR $FFBD" :: "a"(fnamelen), "x"(fnamelo), "y"(fnamehi));
The C part should be pretty much self-explanatory. We create variables which contain the low and high bytes of the file name pointer, and the file name length. You may think that using the strlen() function here is overkill. Well, in cc65 it probably is, but the llvm-mos optimizer can translate the whole expression into a single CPU instruction:
LDA #$06
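Splitting a 16-bit address into two bytes is easy to check on a modern machine. Here is a minimal sketch; the helper names are hypothetical and not part of llvm-mos or any library:

```c
#include <stdint.h>

/* Low byte of a 16-bit address: keep only the bottom 8 bits. */
uint8_t lo_byte(uint16_t addr) {
    return (uint8_t)(addr & 0xFF);
}

/* High byte of a 16-bit address: shift the top 8 bits down. */
uint8_t hi_byte(uint16_t addr) {
    return (uint8_t)(addr >> 8);
}

/* Reassemble the address the way the 6502 sees it (low byte first). */
uint16_t make_addr(uint8_t lo, uint8_t hi) {
    return (uint16_t)(lo | (hi << 8));
}
```

For the SETNAM address $FFBD, for instance, the low byte is $BD and the high byte is $FF.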
The assembly part of the code needs more explanation. Here we call the SETNAM function located in ROM at address $FFBD. Before we call it, we need to put the address of the string containing the file name in two 8-bit CPU registers, X (low byte) and Y (high byte), and the string length in register A. Pure assembly code would look like this:
LDA fnamelen
LDX fnamelo
LDY fnamehi
JSR $FFBD
However, in llvm-mos we cannot use C expressions in assembly code directly. Instead, we can use so-called input and output operands to transfer C variables to CPU registers and vice versa. In the gcc syntax for inline assembly, which llvm also uses, operands are separated from the assembly instructions by colons. You can read more about using inline assembly and operands here.

After setting up the file name we need to provide some information about the device from which the data should be loaded, and whether it should be loaded to the address given in the file header:
asm("LDA #$01\n\t"     // logical file number
    "LDX #$08\n\t"     // device number
    "LDY #$01\n\t"     // load to address found in file header
    "JSR $FFBA\n\t");  // SETLFS
Remember the LOAD "GINGER",8,1 command? This is the ",8,1" part, but written in assembly.

Now all that's left is to load the file into memory:
asm ("LDA #$00\n\t"     // load to memory
     "JSR $FFD5\n\t");  // LOAD

Playing a SID tune

It's time to play the music, then. Let's start with the init routine:
asm ("LDA #$00\n\t"
     "TAX\n\t"
     "TAY\n\t"
     "JSR $1000");
Now we have two ways of calling the play routine. The first one is fairly simple. We wait in a loop until the raster reaches a certain line on the screen, and when it does, we call the play routine. The line number can be any number from 0 to 262 (for NTSC machines) or 311 (for PAL machines), but the raster line register $D012 is 8-bit, so it can only hold values from 0 to 255. As a result, values from 0 to 6 (NTSC) and 0 to 55 (PAL) appear in $D012 twice per frame. A solution to this problem is to check bit 7 (the highest bit) of register $D011, which is 0 for lines from 0 to 255 and 1 for lines > 255, or to use values bigger than 55, which never appear in register $D012 twice during one frame. In my example, I'm using 255:
#define rasterline (*((volatile unsigned char*)0xD012))

while (1) {
    if (rasterline == 255) {
        asm ("JSR $1003");
    }
}
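To see which values of $D012 are safe to wait for, you can count how often each value shows up during one frame. The function below is a small host-side sketch, not C64 code; its name is made up, and it assumes a frame of 312 lines on PAL and 263 on NTSC, numbered from 0 (some NTSC chip revisions differ by one line).

```c
/* How many times per frame does $D012 show `value`, given a frame of
   `total_lines` raster lines numbered 0 .. total_lines - 1?
   $D012 only holds the low 8 bits of the line number. */
unsigned d012_occurrences(unsigned value, unsigned total_lines) {
    unsigned count = 0;
    for (unsigned line = 0; line < total_lines; line++) {
        if ((line & 0xFF) == value) {
            count++;
        }
    }
    return count;
}
```

Any value for which this returns 1 can be used in the loop without checking $D011.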
Putting all the code snippets together, a complete program looks like this:
#include <string.h>

#define rasterline (*((volatile unsigned char*)0xD012))

int main() {
    char *fname = "GINGER";
    unsigned char fnamelo = (unsigned int)fname & 0xFF;
    unsigned char fnamehi = (unsigned int)fname >> 8;
    unsigned char fnamelen = strlen(fname);

    // Load SID file into memory
    asm("JSR $FFBD\n\t"  // SETNAM (A = fname length, XY = fname address)
        "LDA #$01\n\t"   // logical file number
        "LDX #$08\n\t"   // device number
        "LDY #$01\n\t"   // load to address found in file header
        "JSR $FFBA\n\t"  // SETLFS
        "LDA #$00\n\t"   // load to memory
        "JSR $FFD5\n\t"  // LOAD
        :: "a"(fnamelen), "x"(fnamelo), "y"(fnamehi));

    // Init SID player routine
    asm("LDA #$00\n\t"
        "TAX\n\t"
        "TAY\n\t"
        "JSR $1000");

    // Call SID refresh routine
    while (1) {
        if (rasterline == 255) {
            asm ("JSR $1003");
        }
    }

    return 0;
}

Playing a SID tune using interrupts

The second way of calling the play routine is to use raster interrupts. The principle is the same: we call the play routine every frame, but this time we don't have to wait in a loop for a given raster line. We only need to write a short function which will be called when an interrupt occurs, and tell the computer to generate the interrupt at a certain raster line. The addresses of the interrupt routines are stored near the beginning of memory in a structure called a vector table. We need to modify this table so that the raster interrupt vector points to our function instead of the original one.

Let's write the function first:
void play() {
    asm ("ASL $D019\n\t"  // acknowledge interrupt (reset bit 0)
         "JSR $1003\n\t"  // call SID refresh routine
         "JMP $EA31");    // jump to default interrupt handler routine
}
When writing a raster interrupt function, we need to do two additional things. First, we need to let the computer know that the interrupt was handled. This is done by writing a 1 to bit 0 of register $D019, which clears the pending raster interrupt flag. The easiest and fastest way to do it is the Arithmetic Shift Left (ASL) instruction: as a read-modify-write instruction, it writes the value it has just read (with bit 0 still set) back to the register before writing the shifted result. Second, because we hijacked the original interrupt vector, we want to jump to the original ROM routine when we finish executing our own code.
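The acknowledge trick is easier to follow with a toy model. The sketch below runs on a modern machine and is heavily simplified: it models only the four interrupt flags in the low bits of $D019, all names are hypothetical, and it assumes the usual behaviour that writing a 1 to a flag clears it and that the 6502 writes the unmodified value back before the shifted one.

```c
#include <stdint.h>

/* Toy model of the interrupt flags at $D019 (bits 0-3 only;
   bit 0 is the raster interrupt flag). */
typedef struct {
    uint8_t latch;
} vic_irq_model;

/* Reading returns the pending flags. The upper bits of the real
   register are ignored in this sketch. */
uint8_t vic_irq_read(const vic_irq_model *v) {
    return v->latch & 0x0F;
}

/* Writing a 1 to a bit clears that flag; writing a 0 leaves it alone. */
void vic_irq_write(vic_irq_model *v, uint8_t value) {
    v->latch = (uint8_t)(v->latch & ~value);
}

/* ASL on a memory location: read, write the old value back,
   then write the shifted value. */
void asl_d019(vic_irq_model *v) {
    uint8_t old = vic_irq_read(v);
    vic_irq_write(v, old);                  /* first write clears pending flags */
    vic_irq_write(v, (uint8_t)(old << 1));  /* second write: the shifted result */
}
```

In this model it is the first of the two writes that clears the raster flag, which is why a single ASL is enough.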

Now it's time to modify the system interrupts:
#define intctrl (*((volatile unsigned char*)0xDC0D))
#define screenctrl (*((volatile unsigned char*)0xD011))
#define rasterline (*((volatile unsigned char*)0xD012))
#define rasterintctrl (*((volatile unsigned char*)0xD01A))
#define rasterintlo (*((volatile unsigned char*)0x314))
#define rasterinthi (*((volatile unsigned char*)0x315))

void (* fun)(void) = &play;
unsigned char funlo = (unsigned int)fun & 0xFF;
unsigned char funhi = (unsigned int)fun >> 8;

asm("SEI");           // switch off interrupts
intctrl = 0x7F;       // disable CIA interrupts
rasterintctrl = 1;    // enable raster interrupts
rasterintlo = funlo;  // low byte of raster interrupt routine
rasterinthi = funhi;  // high byte of raster interrupt routine
rasterline = 127;     // trigger interrupt at raster line 127
screenctrl = screenctrl & 0x7F;
asm("CLI");           // switch on interrupts
First, we need to disable maskable interrupts with the CPU instruction SEI. We have to do it because an interrupt could otherwise occur after we have modified some registers but before we have modified others, which would cause the CPU to jump to a wrong place in memory and crash. Next, we disable the CIA timer interrupts by writing $7F to $DC0D (with the highest bit cleared, every other bit set to 1 switches the corresponding interrupt source off), and enable screen raster interrupts instead. Then we change addresses $314 and $315 in the vector table to point to the play() function. Finally, we set the raster line which will trigger the interrupt. After we finish modifying the system interrupts we can re-enable them with CLI.
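The way $DC0D interprets a write is a little unusual, so here is a small host-side sketch of it. The function name is made up, and the model assumes the standard CIA behaviour: bit 7 of the written value selects whether the other set bits enable or disable their interrupt sources.

```c
#include <stdint.h>

/* Returns the new CIA interrupt mask after writing `value` to $DC0D.
   `mask` holds one bit per interrupt source (bits 0-4). */
uint8_t cia_icr_write(uint8_t mask, uint8_t value) {
    uint8_t sources = value & 0x1F;           /* the five interrupt sources */
    if (value & 0x80) {
        return (uint8_t)(mask | sources);     /* bit 7 = 1: enable selected */
    }
    return (uint8_t)(mask & ~sources);        /* bit 7 = 0: disable selected */
}
```

Writing $7F therefore switches every source off, while writing $81 would switch the timer A interrupt back on.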

Remember that the rasterline register can hold only 8 bits, so you also need to set the highest bit of the screenctrl register accordingly. For example, to generate an interrupt when line number 260 is drawn on the screen, you need to do the following:
screenctrl = screenctrl | 0x80;  // bit 7 equals 1 for lines > 255
rasterline = 4;                  // 260 - 256 = 4
You can think of the raster line register as being 9 bits wide, with address $D012 holding the lowest 8 bits, and $D011 holding the highest bit in its own highest bit.
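The split between the two registers can be written as a pair of small functions. This is a sketch to run on a modern machine, with hypothetical names; it works on plain values instead of the real hardware registers.

```c
#include <stdint.h>

/* Compute the values to store in $D011 and $D012 so that the raster
   compare is set to `line`. All other bits of $D011 are preserved. */
void set_raster_compare(unsigned line, uint8_t d011_old,
                        uint8_t *d011_new, uint8_t *d012_new) {
    *d012_new = (uint8_t)(line & 0xFF);          /* lowest 8 bits */
    if (line > 255) {
        *d011_new = (uint8_t)(d011_old | 0x80);  /* 9th bit set */
    } else {
        *d011_new = (uint8_t)(d011_old & 0x7F);  /* 9th bit cleared */
    }
}

/* The reverse: rebuild the 9-bit line number from both registers. */
unsigned raster_line(uint8_t d011, uint8_t d012) {
    return ((unsigned)(d011 & 0x80) << 1) | d012;
}
```

For line 260 this gives 4 for $D012 and sets the highest bit of $D011, matching the two assignments above.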

Now let's get back to our example. A complete program now looks like this:
#include <string.h>

#define intctrl (*((volatile unsigned char*)0xDC0D))
#define screenctrl (*((volatile unsigned char*)0xD011))
#define rasterline (*((volatile unsigned char*)0xD012))
#define rasterintctrl (*((volatile unsigned char*)0xD01A))
#define rasterintlo (*((volatile unsigned char*)0x314))
#define rasterinthi (*((volatile unsigned char*)0x315))

// SID refresh function
void play() {
    asm("ASL $D019\n\t"  // acknowledge interrupt (reset bit 0)
        "JSR $1003\n\t"  // call SID refresh routine
        "JMP $EA31");    // jump to default interrupt handler routine
}

int main() {
    char *fname = "GINGER";
    unsigned char fnamelo = (unsigned int)fname & 0xFF;
    unsigned char fnamehi = (unsigned int)fname >> 8;
    unsigned char fnamelen = strlen(fname);
    void (* fun)(void) = &play;
    unsigned char funlo = (unsigned int)fun & 0xFF;
    unsigned char funhi = (unsigned int)fun >> 8;

    // Load SID file into memory
    asm("JSR $FFBD\n\t"  // SETNAM (A = fname length, XY = fname address)
        "LDA #$01\n\t"   // logical file number
        "LDX #$08\n\t"   // device number
        "LDY #$01\n\t"   // load to address found in file header
        "JSR $FFBA\n\t"  // SETLFS
        "LDA #$00\n\t"   // load to memory
        "JSR $FFD5\n\t"  // LOAD
        :: "a"(fnamelen), "x"(fnamelo), "y"(fnamehi));

    // Init SID player routine
    asm("LDA #$00\n\t"
        "TAX\n\t"
        "TAY\n\t"
        "JSR $1000");

    // Set up raster interrupt
    asm("SEI");           // switch off interrupts
    intctrl = 0x7F;       // disable CIA interrupts
    rasterintctrl = 1;    // enable raster interrupts
    rasterintlo = funlo;  // low byte of raster interrupt routine
    rasterinthi = funhi;  // high byte of raster interrupt routine
    rasterline = 127;     // trigger interrupt at raster line 127
    screenctrl = screenctrl & 0x7F;
    asm("CLI");           // switch on interrupts

    return 0;
}
It's a bit longer than the first version, but a side effect of using interrupts instead of a loop is that we can still type on the screen and execute simple BASIC commands while the music plays in the background.
If you want to try the program yourself, you can find the complete code on GitHub.
