- Code: Select all

```cpp
#include <array>

long f1(int num) {
    static std::array<long, 10> mult {
        0, 100000000, 10000000, 1000000, 100000, 10000, 1000, 100, 10, 1};
    return mult[num];
}

long f2(int num) {
    static long mult[] = {
        0, 100000000, 10000000, 1000000, 100000, 10000, 1000, 100, 10, 1};
    return mult[num];
}
```

Compiler output (GCC 8.3 with -O3):

- Code: Select all

```
f1(int):
        movsx   rdi, edi
        mov     rax, QWORD PTR f1(int)::mult[0+rdi*8]
        ret
f2(int):
        movsx   rdi, edi
        mov     rax, QWORD PTR f2(int)::mult[0+rdi*8]
        ret
f2(int)::mult:
        .quad   0
        .quad   100000000
        .quad   10000000
        .quad   1000000
        .quad   100000
        .quad   10000
        .quad   1000
        .quad   100
        .quad   10
        .quad   1
f1(int)::mult:
        .quad   0
        .quad   100000000
        .quad   10000000
        .quad   1000000
        .quad   100000
        .quad   10000
        .quad   1000
        .quad   100
        .quad   10
        .quad   1
```

Both implementations compile to identical code! One uses the modern C++ `std::array` class, and the other (apart from the `static` keyword) looks like it could have been written 30 years ago. Just because an abstraction provides a safer interface does not mean the result is slower or bigger.

When someone asks, "Why can't I use macros, pointers, hand-rolled data structures, etc.?", the answer is: you can (I'll come to that in a moment), but really only as a last resort or with a really good reason. Frankly, the world is full of buggy code. Please don't contribute to it. Be safe first, measure second, and use the sharp knives only when necessary.

Now, back to using the sharp knives... Yes, I am guilty of playing with knives, but I try to bring them out only when necessary. If you look at some of my code on GitHub, you will see that I have used macros, for example, but only because there was no alternative. See https://github.com/tonywalker1/stuff/blob/master/include/stuff/core/exception.h.

If you are curious about the code above: I was using a library function that was slower than I needed, so I rolled my own and shaved a factor of 10 off the run-time. One of the tricks in my toolbox is to use lookup tables in special circumstances. A table lookup is constant time and often faster than other techniques (assuming a small table, a big enough cache, etc.). That is what the code above prototypes. By the way, CPUs used to (and I believe still do) speed up division and multiplication via lookup tables. Also, check out FPGAs for that technique on steroids.
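To make the lookup-table trick concrete, here is a minimal sketch of the general pattern: replacing a repeated computation with a constant-time table index. The names `pow10_loop` and `pow10_table` are my own illustrative choices, not from the code above, and the table here holds powers of ten in ascending order just for demonstration.

```cpp
#include <array>
#include <cassert>

// Naive version: O(n) multiply loop, recomputed on every call.
long pow10_loop(int n) {
    long result = 1;
    for (int i = 0; i < n; ++i) {
        result *= 10;
    }
    return result;
}

// Table version: O(1), a single indexed load (as the assembly above shows).
// The caller must pass an index in [0, 9]; std::array::at() would add a
// bounds check at a small runtime cost if you want the safer interface.
long pow10_table(int n) {
    static const std::array<long, 10> table {
        1, 10, 100, 1000, 10000,
        100000, 1000000, 10000000, 100000000, 1000000000};
    return table[n];
}
```

The trade-off is the usual one: a few dozen bytes of static data buys you a branch-free, loop-free lookup that stays hot in cache when the table is small.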

I hope this helps...