Of course, it helps if you tell gcc to generate a 64 bit executable
... when you're using 64 bit operations. I'd assumed it would default to 64 bit as it does on Linux, but no. That also explains why I was confused about sizeof(size_t) being less than sizeof(long long) here.

tercel-2:$ time bin32/tbray4 datasets/hundred-o10k.ap
tercel-2:$ time bin32/tbray6 datasets/hundred-o10k.ap
tercel-2:$ time bin/tbray4 datasets/hundred-o10k.ap
tercel-2:$ time bin/tbray6 datasets/hundred-o10k.ap
It's still around 8 clock cycles per byte. The 5% gain is not enough to alter yesterday's estimates, which is a little encouraging actually: I wasn't sure whether 32 bit SPARC instruction generators, such as GNU lightning (used in Rubinius) or whatever Erlang's HiPE uses, would cause a significant slowdown.