Monday, May 28, 2012

The power of Software Design Patterns

The first day of GeeCON 2012 conference featured a presentation held by dr Ivar Jacobson. His speech called "Liberating the Essence from the Burden of the Whole: A Renaissance in Lean Thinking" made a huge impression on the audience, as it summarized in under an hour his enormous live-long experience in IT industry. It has been a great lesson to anybody who was lucky to attend it.

There was one point, however, I couldn't agree with. Dr Jacobson stated that despite years of its existance software design still lacks universal, solid foundations. Working as a coding architect I have seen a lot of really bad designed and ugly code, including my own. What's more, the abundance of different programming languages and coding styles seems to make things even worse.

There is no single solution that addresses all possible software design problems, but there are some that help solving a great number of those issues. Those universal solutions are called Software Design Patterns. The power of design patterns comes from their pragmatic nature - they are not another product of a software company, university or technical committee, but they are a collection of best practices used by the most skilled programmers to solve the most common programming challenges. The greatest value of software design patterns is that they are universal. It's a common mistake, made even by experienced programmers, to think that design patterns are related only to object oriented programming. They can be applied to any programming language and paradigm. Let me show you how it works.

One of the simplest and yet most powerful design patterns is the state pattern. It allows to easily implement even a very complicated state machine. In this pattern, each state is a separate entity (can be an object), and it is the current state which decides which state comes as the next step of the program. The main loop just holds a reference to the current state and requests it to take certain actions.

State pattern can be used for example to implement a simple ping-pong machine. In this scenario there are two states: ping and pong. Both implement only two functionalities - displaying "ping" or "pong" on the terminal and moving to the next step: in case of ping the next step is pong, and vice versa.

Below is a C++ implementation of a ping-pong state machine. Ping and pong are simple objects that implement two functions: show() and next(). The main program starts with the ping object and than starts to call both methods in a loop. The most crucial element is that the algorithm uses only one reference pointing to the current state. When a transition to the next step takes place, the reference to the previous state is replaced by the new one. This way the main program doesn't need to care about the logic of the state machine, the states take care about it themselves:
#include <iostream>

class State {
public:
    virtual void show() = 0;
    virtual State *next() = 0;
};

class StatePing: public State {
public:
    void show();
    State *next();
};

class StatePong: public State {
public:
    void show();
    State *next();
};

void StatePing::show() {
    std::cout << "ping" << std::endl;
}

void StatePong::show() {
    std::cout << "pong" << std::endl;
}

State *StatePing::next() {
    return new StatePong();
}

State *StatePong::next() {
    return new StatePing();
}

int main(int argc, char *argv[]) {
    State *st = new StatePing();
    for (int i = 0; i < 10; i++) {
        st->show();
        st = st->next();
    }
    return 0;
}
Now you may wonder how can you make use of the state pattern using a functional programming language like, let's say, Erlang? Well, as I sad before, design patterns are universal, so you can use actors instead of objects. Here is Erlang implementation of the state pattern:
-module(states).
-export([state/1, ping/0, pong/0, start/0]).

state(Pid) ->
    receive
        show ->
            Pid ! show,
            state(Pid);
        next ->
            Pid ! {next, self()},
            state(Pid);
        {new, From} ->
            state(From)
    end.

ping() ->
    receive
        show ->
            io:format("ping~n"),
            ping();
        {next, From} ->
            Pid = spawn(states, pong, []),
            From ! {new, Pid}
    end.

pong() ->
    receive
        show ->
            io:format("pong~n"),
            pong();
        {next, From} ->
            Pid = spawn(states, ping, []),
            From ! {new, Pid}
    end.

start() ->
    S1 = spawn(states, ping, []),
    spawn(states, state, [S1]).
"Ping", "pong" and "state" are actors. "State" actor acts as a current state reference - it holds the pid (process id) of the current actor. When you send message "next" to "state" it passes it to the current actor, which spawns a new actor and sends its pid back to "state". This way "state" always holds a pid of the current state represented by actor.
You can load this code into your erlang shell and use observe how it works:
S = states:start().
S ! show.
S ! next.
S ! show.
How about languages that don't support object oriented programming or actors? Have a look at this Scheme example:
(define ping
  (lambda (arg)
    (cond ((equal? arg "show") (display "ping\n") ping)
          ((equal? arg "next") pong))))

(define pong
  (lambda (arg)
    (cond ((equal? arg "show") (display "pong\n") pong)
          ((equal? arg "next") ping))))

(define state
  (let ((current-state ping))
    (lambda (arg)
      (set! current-state (current-state arg)))))

(define repeat
  (lambda (n)
    (let loop ((i 0))
      (cond ((< i n)
             (state "show")
             (state "next")
             (loop (+ i 1)))))))

(repeat 10)
In this example "ping" and "pong" are first class functions which can be passed as arguments to other expressions. Reference to ping or pong is stored in the "state" function, which acts as a closure. It means that the value of the current-state variable is being retained between subsequent calls of the "state" function.

But what if you are a real hardcore hacker who despises high level programming languages and codes only in assembly? Or maybe there are some cases when you may have to? Well, than I have somehing for you too - here is a short state machine written for 64-bit Linux in Intel x86 assembly using The Netwide Assembler:
global  _start

section .data
    ping_msg:  db  'ping', 0x0A
    ping_len:  equ $-ping_msg
    pong_msg:  db  'pong', 0x0A
    pong_len:  equ $-pong_msg

section .text

_start:
    ; init first state
    mov     rsi, _ping_show
    mov     rdi, _ping_next
    ; repeat 10 times
    mov     rcx, 10
_loop:
    push    rcx
    call    rsi            ; show
    call    rdi            ; next
    pop     rcx
    loop    _loop
    ; finish
    mov     rax, 60         ; sys_exit
    mov     rdi, 0          ; exit code
    syscall

_ping_show:
    mov     rax, 1          ; sys_write
    mov     rdi, 1          ; stdout
    mov     rsi, ping_msg
    mov     rdx, ping_len
    syscall
    mov     rsi, _ping_show
    mov     rdi, _ping_next
    ret

_ping_next:
    mov     rsi, _pong_show
    mov     rdi, _pong_next
    ret

_pong_show:
    mov     rax, 1          ; sys_write
    mov     rdi, 1          ; stdout
    mov     rsi, pong_msg
    mov     rdx, pong_len
    syscall
    mov     rsi, _pong_show
    mov     rdi, _pong_next
    ret

_pong_next:
    mov     rsi, _ping_show
    mov     rdi, _ping_next
    ret
To compile the assembly example in Linux use the following commands:
nasm -f elf64 -o states.o states.asm
ld -o states states.o
In this case index registers rsi and rdi are used to keep references to the current state. The main loop just calls "_show" and "_next" subsequently and the "ping" and "pong" states take responsibility for passing valid pointers to the CPU registers.

As you can see, you can use design patterns to properly implement soultions to some universal software desing problems. You only need to be aware of the strength and possibilities of your current programming language and use them wisely.

Sunday, May 27, 2012

GPars - Groovy Parallel Systems

One of the most interesting lectures on this year's GeeCON University Day was Dierk König's presentation on GPars, which is a Groovy library that allows easy handling of concurrent and parallel processing using multiple approaches: from classic actors to Kanban dataflow. Dataflow concurrency provided by GPars eliminates a few problems typical for actor-based design. One is message queue overflow, which takes place when a message producer creates more requests than a consumer can handle - in a long run such situation leads to memory exhaustion and, consequently, system instability or crash. The second one is random deadlocks - GPars, by design, guarantees that if you happen to introduce a deadlock in your dependencies, the deadlock will occur each time you run the code, so it's easy to get spotted by unit tests.