The question of speed often crops up, so I’ve listed some comparisons of programming languages and MCUs below.

Programming Language Benchmark

Computing the same 100-term polynomial 500,000 times, smaller is faster.

I’ve borrowed the table below from its original source, which you will need to read for the full story.

| Language | Single body (s) | With call (s) |
| --- | --- | --- |
| FORTRAN, g77 V2.95.4 | 2.73 | 2.73 |
| Ada 95, gnat V3.13p | 2.73 | 2.74 |
| C, hand optimized, gcc V2.95.4 | 2.73 | |
| Java, gcj V3.0 | 3.03 | 15.53 |
| D, gcc V4.0.3+ | 1) 3.43 | 1) 3.98 |
| C, gcc V2.95.4 | 3.61 | 3.57 |
| R translated to lisp using R2cl v0.1 and compiled with cmucl | 3.69 | |
| Lisp, CMU Common Lisp V3.0.8 18c+, build 3030 | 4.69 | 10.69 |
| Java, jikes V1.15 (bytecompiled) | 8.23 | 13.54 |
| FORTH, hand optimized Gforth 0.6.1 | 1) 18.21 | |
| FORTH,** Gforth 0.6.1 | 1) 27.26 | |
| Python** +psyco (interpreted) | 1) 168.50 | |
| Perl, more optimized$ V5.6.1 (natively compiled) | 209.20 | |
| Perl, more optimized$ V5.6.1 (interpreted) | 258.64 | |
| Perl, hand optimized*** V5.6.1 (bytecompiled) | 306.18 | |
| Perl* V5.6.1 (natively compiled) | 367.23 | |
| Python** V2.1.2 (interpreted) | 505.50 | |
| Perl* V5.6.1 (bytecompiled) | 515.04 | |
| RUBY*** (interpreted) | 1074.52 | |
| R V1.5.1 (interpreted) | 5662.64 | |
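To make the two timing columns concrete, here is a minimal Python sketch of this kind of benchmark. It is not the original benchmark code. I am assuming that "single body" means the polynomial is evaluated inline inside the timing loop and "with call" means the evaluation sits in a separate function. The coefficients, the value of x, and all function names are made up for illustration.

```python
import time

N_TERMS = 100        # terms in the polynomial, as in the benchmark description
N_REPEATS = 500_000  # repetitions used for the table; lower this for a quick try


def make_coefficients(n):
    # Hypothetical coefficients: the original benchmark's values are not given here.
    return [1.0 / (k + 1) for k in range(n)]


def evaluate(coeffs, x):
    """Evaluate sum(coeffs[k] * x**k) with Horner's rule."""
    result = 0.0
    for c in reversed(coeffs):
        result = result * x + c
    return result


def bench_with_call(coeffs, x, repeats):
    """'With call' variant (assumed meaning): one function call per evaluation."""
    total = 0.0
    for _ in range(repeats):
        total += evaluate(coeffs, x)
    return total


def bench_single_body(coeffs, x, repeats):
    """'Single body' variant (assumed meaning): the evaluation is inlined."""
    total = 0.0
    reversed_coeffs = coeffs[::-1]
    for _ in range(repeats):
        result = 0.0
        for c in reversed_coeffs:
            result = result * x + c
        total += result
    return total


def timed(fn, *args):
    """Return (value, elapsed seconds) for one call of fn."""
    start = time.perf_counter()
    value = fn(*args)
    return value, time.perf_counter() - start


if __name__ == "__main__":
    coeffs = make_coefficients(N_TERMS)
    # A small repeat count keeps the demo quick; use N_REPEATS for the full run.
    for bench in (bench_single_body, bench_with_call):
        _, seconds = timed(bench, coeffs, 0.5, 1000)
        print(f"{bench.__name__}: {seconds:.4f} s")
```

Both variants do the same arithmetic, so any difference in time comes from the call overhead of the implementation, which is what the second column of the table is meant to expose.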

Power Of A Language Or Performance Of An Implementation?

This is the big question to me:

  • What if the project itself doesn’t need the maximum processing speed of the target MCU, but rather the minimum development time? I prefer the interactivity of Forth, so I can quickly determine hardware characteristics when beginning a new project. This saves me a lot of time correcting invalid assumptions in my code later on.

Benchmarking Different Forths/Mcus

A benchmark I wrote up in July 2017: calculate the greatest common divisor for 0 to 200.

Inspired by :

Benchmark: Calculate The Greatest Common Divisor For 0 to 200

Less time is best.

| Hardware | Clock (MHz) | Time (sec) | Comments |
| --- | --- | --- | --- |
| STM32F0 Discovery Board with STM32F051 MCU | 96 (overclocked) | 0.3 | Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051 |
| STM32F0 Discovery Board with STM32F051 MCU | 48 | 0.6 | Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051 |
| STM32F0 Discovery Board with STM32F051 MCU | 8 | 2.59 | Mecrisp-Stellaris RA 2.3.7 with M0 core for STM32F051 |
| Arduino Uno, or Arduino Nano with Atmega328 MCU | 16 | 4 | Flash Forth 5: 4 seconds, AmForth 6.3: 8 seconds, Yaffa Forth 0.6.1: 70 seconds |
| STM8EF on a STM8S003F3P6 (optimized for size not speed) | 16 | 6.4 | 200 bench from RAM takes 6.9s, 200 bench from Flash takes 6.4s |
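For readers who don't speak Forth, here is a rough Python sketch of what a GCD benchmark of this shape looks like. It is not the Forth code that was timed. I am assuming the benchmark runs Euclid's algorithm over every pair (a, b) with both values from 0 to 199, and the helper megacycles, which multiplies clock speed by run time to compare boards at different clocks, is my own addition.

```python
import time


def gcd(a, b):
    """Greatest common divisor by Euclid's algorithm (assumed to match the benchmark)."""
    while b:
        a, b = b, a % b
    return a


def gcd_bench(limit=200):
    """Run gcd over every pair in [0, limit) and return a checksum of the results."""
    checksum = 0
    for a in range(limit):
        for b in range(limit):
            checksum += gcd(a, b)
    return checksum


def megacycles(clock_mhz, seconds):
    """Approximate millions of clock cycles used: MHz times seconds."""
    return clock_mhz * seconds


if __name__ == "__main__":
    start = time.perf_counter()
    checksum = gcd_bench(200)
    elapsed = time.perf_counter() - start
    print(f"checksum {checksum}, {elapsed:.3f} s")

    # Clock and time pairs taken from the STM32F051 rows of the table above.
    for mhz, secs in ((96, 0.3), (48, 0.6), (8, 2.59)):
        print(f"{mhz} MHz: about {megacycles(mhz, secs):.2f} million cycles")
```

Multiplying clock by time gives about 28.8 million cycles for both the 96 MHz and 48 MHz STM32F051 runs, and about 20.7 million for the 8 MHz run, so time does not scale exactly with clock speed across the whole range.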