Benchmarks

The question of speed often crops up, so I’ve listed some comparisons of programming languages and MCUs below.

Programming Language Benchmark

Each entry computes the same 100-term polynomial 500,000 times. Times are in seconds; smaller is faster.
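The original benchmark programs are on the page linked below and are not reproduced here. As a rough sketch of what this kind of workload looks like, the following Python evaluates a 100-term polynomial repeatedly and times the loop. It assumes Horner's rule and made-up coefficients; the names `horner` and `run_benchmark`, the coefficient values and the evaluation point are illustrative choices, not taken from the source.

```python
import time


def horner(coeffs, x):
    """Evaluate a polynomial at x with Horner's rule.

    coeffs lists the coefficients from the highest power down to the constant.
    """
    acc = 0.0
    for c in coeffs:
        acc = acc * x + c
    return acc


def run_benchmark(repeats=500_000, terms=100, x=0.5):
    """Evaluate the same polynomial `repeats` times; return (checksum, seconds)."""
    # Hypothetical coefficients, chosen only so the sums stay well behaved.
    coeffs = [1.0 / (k + 1) for k in range(terms)]
    total = 0.0
    start = time.perf_counter()
    for _ in range(repeats):
        total += horner(coeffs, x)
    elapsed = time.perf_counter() - start
    return total, elapsed


if __name__ == "__main__":
    # A reduced repeat count keeps the demo quick in an interpreter.
    total, elapsed = run_benchmark(repeats=10_000)
    print(f"checksum {total:.6f}, elapsed {elapsed:.3f} s")
```

Returning a checksum alongside the elapsed time is a common precaution: it gives the loop a visible result, so an optimizing implementation cannot simply skip the work being measured.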

I’ve borrowed the table below from http://dan.corlan.net/bench.html, which you will need to read for the full story.

Note

These measurements were made on a 300 MHz Pentium running Debian GNU/Linux, so they are fairly old.

| Language | Single body (s) | With call (s) |
|---|---|---|
| FORTRAN, g77 V2.95.4 | 2.73 | 2.73 |
| Ada 95, gnat V3.13p | 2.73 | 2.74 |
| C, hand optimized, gcc V2.95.4 | 2.73 | |
| Java, gcj V3.0 | 3.03 | 15.53 |
| D, gcc V4.0.3+ | 1)3.43 | 1)3.98 |
| C, gcc V2.95.4 | 3.61 | 3.57 |
| R translated to lisp using R2cl v0.1 and compiled with cmucl | 3.69 | |
| Lisp, CMU Common Lisp V3.0.8 18c+, build 3030 | 4.69 | 10.69 |
| Java, jikes V1.15 (bytecompiled) | 8.23 | 13.54 |
| FORTH, hand optimized Gforth 0.6.1 | 1)18.21 | |
| FORTH,** Gforth 0.6.1 | 1)27.26 | |
| Python** +psyco (interpreted) | 1)168.50 | |
| Perl, more optimized$ V5.6.1 (natively compiled) | 209.20 | |
| Perl, more optimized$ V5.6.1 (interpreted) | 258.64 | |
| Perl, hand optimized*** V5.6.1 (bytecompiled) | 306.18 | |
| Perl* V5.6.1 (natively compiled) | 367.23 | |
| Python** V2.1.2 (interpreted) | 505.50 | |
| Perl* V5.6.1 (bytecompiled) | 515.04 | |
| RUBY*** (interpreted) | 1074.52 | |
| R V1.5.1 (interpreted) | 5662.64 | |

Power Of A Language Or Performance Of An Implementation?

This is the big question for me:

  • What if the project itself doesn’t need the maximum processing speed of the target MCU, but instead needs the minimum development time? I prefer the interactivity of Forth because it lets me quickly determine hardware characteristics when beginning a new project. This saves me a lot of time correcting invalid assumptions in my code later on.

Benchmarking Different Forths/MCUs

Calculate the greatest common divisor for 0 to 200.

Download: GCD

Inspired by: http://weblambdazero.blogspot.com.au/search/label/forth
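The Forth source used for the measurements is the download mentioned above and is not reproduced here. As a rough illustration of the workload, this Python sketch assumes the benchmark takes the greatest common divisor of every pair of numbers in the range using Euclid's algorithm. The pairing, the half-open range 0..199 and the names `gcd` and `gcd_bench` are assumptions made for the example, not details confirmed by the source.

```python
import time


def gcd(a, b):
    """Greatest common divisor by Euclid's algorithm (gcd(0, 0) is 0)."""
    while b:
        a, b = b, a % b
    return a


def gcd_bench(limit=200):
    """Sum gcd(i, j) over all pairs with 0 <= i, j < limit.

    The sum is returned only as a checksum so the work cannot be skipped.
    """
    checksum = 0
    for i in range(limit):
        for j in range(limit):
            checksum += gcd(i, j)
    return checksum


if __name__ == "__main__":
    start = time.perf_counter()
    result = gcd_bench()
    elapsed = time.perf_counter() - start
    print(f"checksum {result}, elapsed {elapsed:.3f} s")
```

On a desktop this finishes almost instantly, so the times in the table are only meaningful relative to each other, on the listed hardware and Forth versions.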

Benchmark: Calculate The Greatest Common Divisor For 0 to 200

Less time is best.

| Hardware | Clock (MHz) | Time (sec) | Comments |
|---|---|---|---|
| STM32F0 Discovery Board with STM32F051 MCU | 96 (overclocked) | 0.3 | Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051 |
| STM32F0 Discovery Board with STM32F051 MCU | 48 | 0.6 | Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051 |
| STM32F0 Discovery Board with STM32F051 MCU | 8 | 2.59 | Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051 |
| Arduino Uno, or Arduino Nano with ATmega328 MCU | 16 | 4 | Flash Forth 5: 4 seconds, AmForth 6.3: 8 seconds, Yaffa Forth 0.6.1: 70 seconds |
| STM8EF on an STM8S003F3P6 (optimized for size, not speed) | 16 | 6.4 | 200 bench from RAM takes 6.9 s, 200 bench from Flash takes 6.4 s |
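Because the boards run at different clock speeds, it can help to convert each time into an approximate cycle count (clock in MHz multiplied by time in seconds gives millions of cycles). The sketch below does that arithmetic for the figures in the table; the row labels and the names `RESULTS` and `megacycles` are illustrative, and the result is only a rough guide since it ignores differences in architecture and Forth implementation.

```python
# (label, clock in MHz, time in seconds), copied from the table above.
RESULTS = [
    ("STM32F051 @ 96 MHz", 96, 0.3),
    ("STM32F051 @ 48 MHz", 48, 0.6),
    ("STM32F051 @ 8 MHz", 8, 2.59),
    ("ATmega328 @ 16 MHz (Flash Forth 5)", 16, 4.0),
    ("STM8S003F3P6 @ 16 MHz (from Flash)", 16, 6.4),
]


def megacycles(clock_mhz, seconds):
    """Clock cycles used, in millions: MHz * s = 10^6 cycles."""
    return clock_mhz * seconds


if __name__ == "__main__":
    for label, mhz, secs in RESULTS:
        print(f"{label:38s} {megacycles(mhz, secs):7.1f} Mcycles")
```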