Tuesday, October 18, 2016

Go Forth with Arduino

Forth is an unusual programming language. To learn it, "you must unlearn what you have learned", as Master Yoda would say. There are many indications that Forth is a programming language of Jedi: it uses postfix notation for expressions (so "a + b" becomes "a b +", which means "to sum receive, a and b you must add" using Yoda's words), it is extremely minimalistic (in most Forths, the language core is written in assembly, and the rest of the language constructs, including conditional, branching and loop instructions, is written in the Forth itself), and it requires long study to understand (even people using Forth in real life to build commercial software sometimes admit, that they are yet to understand it). Also, although Forth has an official ANSI standard, it is so flexible, that most Forth masters tend to build their own light sabres (I mean Forth systems) using the base Forth only as a foundation. Such thing as MANX Musical Forth (an extension of the regular Forth designed specifically to work with MIDI) or J1 Forth (an FPGA implementation of a stack-based CPU) are nothing unusual in the world of Forth. Forth is also complete: it's an operating system, a runtime environment, and an interactive compiler - all that in just a few kilobytes (not megabytes!) of code.

Forth had its best days during the early computers era, since it is extremely well suited for machines with very limited resources. Some of the 8-bit home computers, like British Jupiter ACE or French Hector HRX, used Forth as their operating system. Recently, it has been used successfully in XO-1 laptop's firmware.

Saying all that, no wonder that there is a Forth for Arduino. No wonder that there is more than one. No wonder that most of Forths for Arduino get rid of the bootloader and take full control of the hardware. I decided to try some of them and describe my very short and very subjective experience here.

First, you need an Arduino Uno, or Arduino Nano with Atmega328 CPU, a programmer (I recommend USBasp since it's inexpensive and easy to use) and software that allows programming Arduino board directly. If you have Ubuntu Linux, you can install it with the following command:
sudo apt-get install gcc-avr binutils-avr avr-libc gdb-avr avrdude
For Mac OS X use Homebrew:
brew tap osx-cross/avr
brew install avr-libc
brew install avrdude --with-usb
On Windows just download and install Atmel Studio.

Connect USBasp to the ISP pins of Arduino and put it into the USB port in your computer. You can now upload software straight to the Atmel chip using avrdude command line tool. It is important that you understand what this tool does, before you start fiddling with command line switches, because you can brick the board if you misuse them (this applies primarily to hfuse). I recommend reading Martin Currey's Arduino / ATmega 328P fuse settings if you want to use settings other than provided in this article.

AmForth 6.3

Let's start with AmForth. Download the AmForth distribution archive, and extract appl/arduino/uno.hex and appl/arduino/uno.eep.hex files. They contain compiled binaries in the form of human readable Intel HEX format. Connect the programmer and run the following command (the whole command should be a single line):
avrdude -p m328p -c usbasp -U flash:w:uno.hex -U eeprom:w:uno.eep.hex -U efuse:w:0xfd:m -U hfuse:w:0xd9:m -U lfuse:w:0xff:m -v
After a while, AmForth will be uploaded to Arduino, and the board will reset. Now, connect the Arduino with your computer using mini USB cable and open a terminal program with the following parameters: baud rate 38400, 8 bits, no parity, and 1 stop bit. For Windows, you can use Putty, just select connection type "serial" and a suitable COM port. In Mac OS X and Linux you can use screen, but the connection port depends on the chip your board uses for serial communication. If you have original Arduino, or a more expensive clone, which uses FTDI chip, the command will look similar to:
screen /dev/ACM0 38400
for Linux, and
screen /dev/tty.usbmodem1421 38400
for Mac OS X. In your case device port can be different, depending on which port Arduino is connected to, but it should follow the general pattern of /dev/ACM... for Linux and /dev/ttyusbmodem... for Mac.
With cheaper Arduino clones, using CH340 chip for serial communication, the command will look like:
screen /dev/ttyUSB0 38400
for Linux, and something similar to:
screen /dev/cu.wch\ ch341\ USB\=\>RS232\ 1420 38400
on Mac OS X. To disconnect from screen, use the following key combination: ctrl + a, ctrl + backslash, enter.

If everything goes well, you can start using Forth on your Arduino. For example, you can enter your first program, which calculates the greatest common divisor of two numbers:
: gcd ( a b -- gcd )
  begin
    dup
    while
      swap over mod
  repeat
  drop ;
Forth instructions are called words and are stored in a dictionary. The first line defines a word gcd (colon is the beginning of a word definition), and contains a comment (in brackets) which says that the word expects two values as input (a and b) and produces one value as output (gcd). The names used in comment can be anything, since all values in Forth are put on, and taken from, an anonymous data stack.
Words begin and repeat denote a loop. Within the loop, the current value located on the top of the stack is duplicated. The word "while" takes a value from the top of the stack and checks whether it is false (zero) or true (any value other than zero). If it's true, the loop continues. Because "while" consumes the value it takes from the stack, we need word "dup" before it. Otherwise, the word "while" would eat up all the values from the stack.
Next we swap the top two values on the stack and replicate the lower one to the top. Basically, we take "a b" series and create "b a b" from it. The word "mod" takes two values from the stack, divides them, and puts back the remainder. Now the stack looks like this: "remainder b". The loop is repeated, so the remainder is duplicated and evaluated. If it's not zero, the loop continues, otherwise it is removed from the stack (by the word "drop") and the value remaining on the stack is the final result. Semicolon ends the word definition.

To test how it works, input the following command:
15 25 gcd .
It puts 15 and 25 on the stack, executes "gcd" word and prints (dot means "print on the screen") the top value from the stack, which happens to be our result.

I wanted to know how fast AmForth is, so I wrote a simple benchmark, which calculates the greatest common divisor for all combinations of numbers from 0 to n:
: bench ( n -- )
  dup 0 do
    dup 0 do
      j i gcd
      drop
    loop
  loop
  drop ;
In Forth "do/loop" is a loop which needs two values on the stack - the starting value, and the ending value. With "10 0 do ... loop" you repeat the loop from 10 down to 0. To execute the loop n times (as you see in the comment, "n" is expected to be on the stack when you run "bench") you need to duplicate it with "dup", then put 0 on the stack, and then call "do" which will consume those values. But because "n" was duplicated, it is still on the stack, and can be used in the inner loop. Finally, we calculate the greatest common divisor on the current counters of the inner and the outer loop ("i" and "j"), but since we don't need the result and don't want it to remain on the stack and affect the loops, we need to "drop" it.

The following test
200 bench
takes about 8 seconds on AmForth to complete. It is a very good result comparing to other Forths I tested.

Flash Forth 5

Installing Flash Forth also requires a programmer. You also need avr/hex/ff_uno.hex file, which you can upload to the board using USBasp with:
avrdude -p m328p -c usbasp -e -U flash:w:ff_uno.hex -U efuse:w:0xfd:m -U hfuse:w:0xda:m -U lfuse:w:0xff:m -v
Again, the whole command should be a single line.

You can communicate with Flash Forth the same way as with AmForth, but using different baud rate:
screen /dev/ACM0 9600
Flash Forth comes with separate math library, so to be able to define "gcd" word in Flash Forth, you need to download it from http://flashforth.com/math.txt and rewrite or upload it via the terminal. You need to be careful with uploading, though. If you just copy and paste the whole file in the terminal you will overrun the input buffer and Flash Forth will start returning errors. The same applies to uploading code to other Forths, too. It's best to copy the code definition by definition or to use special software, which slows down the transmission (for example, iTerm has a special paste option "Paste Slowly").

Flash Forth does not support "do/loop", and uses "for/next" instead. So the word "bench" has a slightly different definition:
: bench ( n -- )
  dup for
    r@
    dup for
      dup r@ gcd drop
    next
    drop
  next
  drop ;
Word "for" takes only one value from the stack, and always counts down to zero. Also, because there is no "do/loop" in Flash Forth, there is no "i" and "j" either, and you need to copy the current loop counter from the return stack (which keeps track of the program execution) to the data stack with "r@". Except from the syntax, the loop construct remains the same as with AmForth.
However, Flash Forth turns out to be much faster. Running "200 bench" takes about 4 seconds to complete, which is twice as fast as with AmForth.

328eForth 2.20

It's another Forth for Arduino, which is a direct descendant of renowned eForth. Its main advantage is simplicity - the whole source code fits in one file of Atmel assembly, and the compiled hex is only 14 kilobytes long. Unfortunately, the original project page is no longer avaiable, but you can download the source code of version 2.20 from this Github repository.

The repository does not provide a compiled binary, though, so you need to make it yourself. Fortunately, it's quite easy - all you need is Atmel Assembler, which you can find on Sourceforge. It's a Windows executable, but it works pretty well with Wine, so you can compile 328eForth on Linux and Mac with no problem. Put my_forth.asm file in the avr8/Atmel directory and run the following command:
avrasm2.exe -fI -I Appnotes2/ my_forth.asm
You should now have the my_forth.hex file, which you can upload to Arduino with:
avrdude -p m328p -c usbasp -e -U flash:w:my_forth.hex -U efuse:w:0xfd:m -U hfuse:w:0xd8:m -U lfuse:w:0xff:m -v
To communicate with 328eForth, connect via terminal using baud rate 19200.

I love this Forth implementation for its simplicity, but unfortunately it is quite slow and buggy. The "gcd" word does not work properly, because the word "mod" is broken and returns wrong results. Also, the benchmark executes in 27 seconds with 328eForth, comparing to 8 with AmForth and 4 with Flash Forth. To make things worse, the project page and documentation are missing and can be reached only partially, through the Wayback Machine.

Yaffa Forth 0.6.1

This Forth is different. Yaffa Forth is written in C and can be uploaded to the Arduino board like any regular sketch, with Arduino IDE. It's a good option for people who don't have a programmer or only want to give Forth a short try. It also provides standard Arduino interface with words such as "pinMode", "digitalRead/digitalWrite", "analogRead/analogWrite", etc. AmForth, Flash Forth and 328eForth don't use Arduino libraries, so you have to talk to the pins directly via I/O ports. Also, because Yaffa's source code is very clean and well documented, you can easily extended it with new words which can use existing Arduino libraries written in C.

Yaffa Forth has some deficiencies, though. Because it works on top of virtual machine written in C, which itself also needs memory, it has less space available for the stack, which can be especially painful on Arduino Uno or Nano (they both only have 2kB RAM). Also, it is much slower than previous Forths - the benchmark code identical to AmForth's takes 70 seconds to run, which makes it about ten times slower than AmForth and almost twenty times slower than Flash Forth.

There is also one more important difference between aforemetioned Forths and Yaffa Forth. AmForth, Flash Forth and 328eForth store all user-defined words in flash memory, together with the main dictionary (in 328eForth a new word is defined in RAM, but must be copied to flash with word "flush" before it can be used). This means that if you turn off the power, your definitions remain in Arduino's memory. With Yaffa Forth, all new words are stored in RAM and disappear once you turn the board off or press the reset button. If you want to store your definitions for future use, you must write all of them in EEPROM with eeLoad -> code -> ctrl + z). It's because Yaffa Forth relies on Arduino bootloader, which prevents user applications from writing directly to flash. On the other hand, EEPROM memory can handle almost ten times as many write cycles as flash before it wears off, so in this respect it may be more hardware friendly to use Yaffa Forth than its counterparts.