Saturday, January 24, 2015

Compilers and costs of abstraction

While experimenting with embedded development recently, I wondered how much an abstraction comparable to a virtual function table would cost. So I quickly wrote a small test app to check:

#ifdef __AVR__
#include <avr/io.h>
void led_on() { PORTB |= 0x04; }
void led_off() { PORTB &= (unsigned char)~0x04; }
#else // __AVR__
#include <mcs51/8051.h>
void led_on() { P0 |= 0x04; }
void led_off() { P0 &= (unsigned char)~0x04; }
#endif // __AVR__

// Stand-in for a virtual table: constant pointers to the functions above.
typedef void (*func_t)(void);
const func_t a = led_on;
const func_t b = led_off;

void main()
{
    while(1)
    {
        a();
        b();
    }
}

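Compiling it with SDCC 3.4.0 for the 8051 target and checking the asm output, I got exactly what I expected: explicit reads from the code segment and indirect calls. Each pointer is fetched byte by byte with movc, loaded into DPTR and called through the __sdcc_call_dptr helper. Compared to simple port bit manipulation, this looks pretty overcomplicated: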

_led_on:
        orl     _P0,#0x04
        ret
_led_off:
        anl     _P0,#0xFB
        ret
_main:
00102$:
        mov     dptr,#_a
        clr     a
        movc    a,@a+dptr
        mov     r0,a
        mov     a,#0x01
        movc    a,@a+dptr
        mov     dph,a
        mov     dpl,r0
        lcall   __sdcc_call_dptr
        mov     dptr,#_b
        clr     a
        movc    a,@a+dptr
        mov     r0,a
        mov     a,#0x01
        movc    a,@a+dptr
        mov     dph,a
        mov     dpl,r0
        lcall   __sdcc_call_dptr
        sjmp    00102$

Now it's GCC time. Compiling the source with "avr-gcc -mmcu=atmega328 -O2 -flto main.c" and disassembling the result gives the following output:

00000080 <main>:
  80: 2a 9a       sbi 0x05, 2 ; 5
  82: 2a 98       cbi 0x05, 2 ; 5
  84: fd cf       rjmp .-6       ; 0x80 <main>

Uhm, applause to the GCC developers :) The compiler is smart enough to see that the function pointers are constant in this case, so it can call the functions directly. And since the functions are really tiny, it decides to inline them, leaving a single sbi and a single cbi instruction in the loop.
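To make that transformation a bit more concrete, here is a rough host-side sketch of one loop iteration before and after optimization. It is only an illustration: fake_port is a hypothetical stand-in for PORTB so that the snippet builds with any desktop C compiler, and the names blink_indirect and blink_direct are mine, not part of the test app.

```c
/* Hypothetical stand-in for PORTB so this builds on a desktop compiler. */
volatile unsigned char fake_port;

void led_on(void)  { fake_port |= 0x04; }
void led_off(void) { fake_port &= (unsigned char)~0x04; }

typedef void (*func_t)(void);
const func_t a = led_on;
const func_t b = led_off;

/* What the source says: two calls through constant pointers. */
void blink_indirect(void)
{
    a();
    b();
}

/* Roughly what GCC reduces it to: the pointer targets are known at
   compile time, so the calls become direct and are then inlined. */
void blink_direct(void)
{
    fake_port |= 0x04;
    fake_port &= (unsigned char)~0x04;
}
```

Both functions leave fake_port in the same state. On the AVR, each of the two statements in blink_direct maps to one of the sbi/cbi instructions in the listing above.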

Conclusion? If you have a smart enough compiler, constant function pointer tables are just as efficient as calling the functions directly, with possible automatic inlining. And if you really need dynamic polymorphism, you don't need to rewrite anything: just use non-constant pointers.
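The non-constant case could look something like the sketch below. Again, this is a host-side illustration under my own assumptions: fake_port is a hypothetical stand-in for the port register, and action, toggle_action and step are made-up names. Because action can change at run time, the compiler generally has to keep the indirect call here.

```c
/* Hypothetical stand-in for the port register (PORTB or P0). */
volatile unsigned char fake_port;

void led_on(void)  { fake_port |= 0x04; }
void led_off(void) { fake_port &= (unsigned char)~0x04; }

typedef void (*func_t)(void);

/* Not const: the target may be reassigned while the program runs. */
func_t action = led_on;

/* Swap the target between led_on and led_off. */
void toggle_action(void)
{
    action = (action == led_on) ? led_off : led_on;
}

/* Call whatever action currently points to, then flip it. */
void step(void)
{
    action();
    toggle_action();
}
```

Calling step() repeatedly alternates between switching the LED bit on and off, and the calling code stays the same as in the constant-pointer version.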

And, surely - GCC rocks :)