Sunday, March 27, 2022

llvm-mos compiler for 8-bit 6502 machines

A while ago I posted an article about programming for an 8-bit Atari. I wrote a simple program in C, compiled it with cc65, and then optimized the assembly output. The best result I could get with built-in compiler optimizations was 630 bytes, which I then reduced to 29 bytes using manual machine code refactoring and an alternative assembler. It proved that you can program an 8-bit machine in C, but the result is far from perfect and hand-written assembly is still better.

A few months ago everything changed. The llvm-mos project was started, and it has already achieved some really impressive results. LLVM is a highly optimizing compiler toolchain that serves as a shared back end for front ends of many modern programming languages. Clang, an LLVM front end for C, C++, and Objective-C, is the default compiler on Mac OS X, where it replaced gcc in 2011. LLVM can also produce machine code for any CPU architecture for which a suitable backend exists. llvm-mos provides such a backend for the 6502 CPU, which means that you can write code for the Commodore 64 or Atari 400/800/XL/XE even in Rust!

I decided to compile my original program with llvm-mos and compare the result with the latest cc65. The nicest thing was that I didn't have to touch the source code at all. I only had to provide a custom atari.h header, which I copied from the cc65 include directory with some minor changes (more on this later). The code looks like this:
#include "atari.h"

#define SDMCTL (*((unsigned char*)0x22F))

int main() {
    unsigned char line, counter = 0;

    SDMCTL = 0; // disable display in BASIC screen area
    while (1) {
        line = ANTIC.vcount;  // read current screen line
        if (!line) counter++; // increase colour shift on new frame
        ANTIC.wsync = line;   // block CPU until horizontal sync
        GTIA_WRITE.colbk = line + counter; // change background colour
    }
    return 0;
}
It displays a moving rainbow animation on the screen.
When I compiled my program with the latest cc65 version, I got a 557-byte executable. llvm-mos produced 58 bytes of code. A hand-written assembly version is 29 bytes long. I didn't do any speed comparisons myself, but the results I have seen show that llvm-mos creates machine code which is faster than the code produced by cc65 by a long shot.

Now a word about compatibility between cc65 and llvm-mos. I haven't had many problems so far, and I could even use the cc65 include files, but with two caveats. First, you have to get rid of cc65-specific extensions like __fastcall__ or __asm__. Second, you need to remember to use the volatile keyword every time you access a variable which is mapped to a hardware register or an input/output port. For example, you have to change this:
struct __antic {
    unsigned char   dmactl; /* (W) direct memory access control */
    unsigned char   chactl; /* (W) character mode control */
    unsigned char   dlistl; /* display list pointer low-byte */
    unsigned char   dlisth; /* display list pointer high-byte */
    unsigned char   hscrol; /* (W) horizontal scroll enable */
    unsigned char   vscrol; /* (W) vertical scroll enable */
    unsigned char   unuse0; /* unused */
    unsigned char   pmbase; /* (W) msb of p/m base address */
    unsigned char   unuse1; /* unused */
    unsigned char   chbase; /* (W) msb of character set base address */
    unsigned char   wsync;  /* (W) wait for horizontal synchronization */
    unsigned char   vcount; /* (R) vertical line counter */
    unsigned char   penh;   /* (R) light pen horizontal position */
    unsigned char   penv;   /* (R) light pen vertical position */
    unsigned char   nmien;  /* (W) non-maskable interrupt enable */
};
into this:
struct __antic {
    volatile unsigned char   dmactl; /* (W) direct memory access control */
    volatile unsigned char   chactl; /* (W) character mode control */
    volatile unsigned char   dlistl; /* display list pointer low-byte */
    volatile unsigned char   dlisth; /* display list pointer high-byte */
    volatile unsigned char   hscrol; /* (W) horizontal scroll enable */
    volatile unsigned char   vscrol; /* (W) vertical scroll enable */
    volatile unsigned char   unuse0; /* unused */
    volatile unsigned char   pmbase; /* (W) msb of p/m base address */
    volatile unsigned char   unuse1; /* unused */
    volatile unsigned char   chbase; /* (W) msb of character set base address */
    volatile unsigned char   wsync;  /* (W) wait for horizontal synchronization */
    volatile unsigned char   vcount; /* (R) vertical line counter */
    volatile unsigned char   penh;   /* (R) light pen horizontal position */
    volatile unsigned char   penv;   /* (R) light pen vertical position */
    volatile unsigned char   nmien;  /* (W) non-maskable interrupt enable */
};
otherwise the LLVM optimizer may decide that some parts of your program do nothing meaningful and remove the code that operates on those variables.
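
If adding volatile to every field feels repetitive, the qualifier can also be applied to the whole register block at once. The following is only a sketch under a few assumptions: the struct is a collapsed version of __antic that keeps the same offsets for wsync and vcount, and the names antic_regs, antic_regs_t and rainbow_step are hypothetical, introduced here for illustration and not part of cc65 or llvm-mos. The helper takes pointers instead of fixed addresses, so it can also be tried with a desktop compiler, with ordinary variables standing in for the hardware.

```c
/* Collapsed copy of struct __antic: only the two registers used by the
   rainbow program are named; the first ten are kept as padding so that
   the offsets stay the same. */
struct antic_regs {
    unsigned char other[10]; /* dmactl .. chbase */
    unsigned char wsync;     /* (W) wait for horizontal synchronization */
    unsigned char vcount;    /* (R) vertical line counter */
};

/* Qualifying the struct type makes every access through it volatile. */
typedef volatile struct antic_regs antic_regs_t;

/* Hypothetical helper: one pass of the main loop from the program above.
   Returns the updated colour shift. */
static unsigned char rainbow_step(antic_regs_t *antic,
                                  volatile unsigned char *colbk,
                                  unsigned char counter)
{
    unsigned char line = antic->vcount;        /* volatile read */
    if (!line) counter++;                      /* new frame started */
    antic->wsync = line;                       /* volatile write */
    *colbk = (unsigned char)(line + counter);  /* volatile write */
    return counter;
}
```

On the Atari itself the pointers would come from casting the hardware addresses, for example (antic_regs_t *)0xD400 for ANTIC and (volatile unsigned char *)0xD01A for the background colour register, assuming the standard Atari memory map. Whether this is nicer than marking each field is a matter of taste; the effect on the optimizer should be the same.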

llvm-mos is a huge step forward in 8-bit development. It provides a 6502 backend for many programming languages and produces highly optimized machine code, comparable to hand-written assembly. If you always wanted to write a game or demo for an Atari or Commodore computer, but never felt like doing all the work in assembly, then now is your chance!

If you're interested in the source code used in this article, you can download it here.