The Fastcode Project is mostly about optimizing for architecture. I was looking for similar project, but sadly i couldn’t find any. Maybe I wasn’t usign the right phrases in google.
A few days ago I remembered the LibTom Library for C I downloaded a while ago. The LibTomMath is a multiple precision integer library. I was reading the code of mp_mul function. Then I looked up the referenced algorithms and more…
Comba multiplication
Karatsuba multiplication
Toom-Cook multiplication
Strassen algorithm for polynomial multiplication
Multiplication using the Fast Fourier Transform
Martin Fürer FFT algorithm
the time complexity of conventional long multiplication is O(n^2)
Karatsuba Multiplication: O(n^lg2(3)) or O(n^1.585)
Toom-Cook 3-way: O(n^1.464)
FFT: O(n log n)
A promising library: GNU Multiple Precision Arithmetic Library
And here is some instruction timings for optimizing for architectures I forgot to put up last time: http://swox.com/doc/x86-timing.pdf