Sunday, January 3, 2010

Multi-Core Ant Colony Optimization for TSP in Go

Go is a new statically typed, garbage collected, concurrent programming language. It is native compiled and offers near-C performance. Like Erlang it uses a CSP model of lightweight processes ("goroutines") communicating via message passing. Messages can be synchronous (unbuffered) or asynchronous. Go supports multi-core processors.

On my usual ACO TSP test case Go is the best performing language so far. On a single core it was 2.8x slower than C, beating the previous best Shedskin Python. Go's speedup from 1 to 2 cores was 1.9, matching previous best Erlang. With 4 cores Go exceeds C's single core performance -- the first tested language to achieve this goal.

Go's lightweight processes are scheduled onto OS threads by the Go runtime system. This allows the system to support many more processes. I was able to run ACO TSP with 100,000 ant processes. The runtime system scheduler appears to treat these processes somewhat like futures, lazily scheduling them when another process is blocking on receipt of a message from a shared channel. This is inefficient for the type of parallel processing used in ACO (many seconds of parallel computation ending with a single message send), as only a single process (and thus a single core) is active at a time. This is a known issue, and the simple solution suggested on the golang-nuts mailing list is to add runtime.Gosched() calls to yield the processor. For my case this was insufficient, and adding additional heartbeat messages to each process helped force maximal multi-core usage.

See here for code and more details.

No comments: