Your suggestion would add the overhead of a function call and return on every iteration to save a pointer deref.
Looks like a bad trade to me.
I meant moving the dereference out of the loop entirely: give each algo its own iteration loop, so the loop body doesn't go through a pointer dereference at all.
Note: all of this is speculation. I haven't yet measured exactly where the slowdown is or why it happens. I'm just reporting that, for some reason, the non-AES versions of these algos are slower in cpuminer-opt than in cpuminer-multi. This needs further investigation.
On the same CPU, these algos are slower in cpuminer-opt than in cpuminer-multi:
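A minimal C sketch of that idea, under stated assumptions: the toy hash functions, the `scan_*` names and the `enum algo` selector below are all hypothetical stand-ins and are not taken from cpuminer-opt or cpuminer-multi. The first variant makes an indirect call through a function pointer on every iteration; the second resolves the pointer once, before the loop, and each per-algo loop then calls a function that is known at compile time.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical toy "hash" functions standing in for real algos. */
uint32_t toy_hash_a(uint32_t nonce) { return nonce * 2654435761u; }
uint32_t toy_hash_b(uint32_t nonce) { return (nonce ^ 0x9e3779b9u) + 1u; }

typedef uint32_t (*hash_fn)(uint32_t);
typedef uint32_t (*scan_fn)(uint32_t, uint32_t);

/* Variant 1: one generic loop, indirect call through fn on every iteration. */
uint32_t scan_indirect(hash_fn fn, uint32_t first, uint32_t count)
{
    assert(fn != 0);
    uint32_t acc = 0;
    for (uint32_t i = 0; i < count; i++)
        acc ^= fn(first + i);
    return acc;
}

/* Variant 2: one loop per algo. The callee is a direct call, so the
 * compiler is free to inline it into the loop body. */
uint32_t scan_a(uint32_t first, uint32_t count)
{
    uint32_t acc = 0;
    for (uint32_t i = 0; i < count; i++)
        acc ^= toy_hash_a(first + i);
    return acc;
}

uint32_t scan_b(uint32_t first, uint32_t count)
{
    uint32_t acc = 0;
    for (uint32_t i = 0; i < count; i++)
        acc ^= toy_hash_b(first + i);
    return acc;
}

/* The pointer is resolved once here, outside any hot loop. */
enum algo { ALGO_A, ALGO_B };

scan_fn select_scan(enum algo a)
{
    return a == ALGO_A ? scan_a : scan_b;
}
```

Both variants compute the same result; whether the second is measurably faster depends on the compiler, the CPU and how expensive the hash itself is, so it would still have to be benchmarked.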
"groestl" => 1109819 1000,/ cpuminer-opt
"groestl" => 1125917 1000,/ cpuminer-nicehash
"keccak" => 6964234 1000,/ cpuminer-opt
"keccak" => 8332952 1000,/ cpuminer-nicehash
"luffa" => 2728931 1000,/ cpuminer-opt
"luffa" => 3177996 1000,/ cpuminer-nicehash
"lyra2" => 716945 1000,/ cpuminer-opt
"lyra2" => 921109 1000,/ cpuminer-nicehash
"neoscrypt" => 27583 1000,/ cpuminer-opt
"neoscrypt" => 28891 1000,/ cpuminer-nicehash
"pentablake" => 3479320 1000,/ cpuminer-opt
"pentablake" => 3609862 1000,/ cpuminer-nicehash
"pluck" => 1722 1000,/ cpuminer-opt
"pluck" => 1818 1000,/ cpuminer-nicehash
"s3" => 1086149 1000,/ cpuminer-opt
"s3" => 1201897 1000,/ cpuminer-nicehash
"scrypt" => 91557 1000,/ cpuminer-opt
"scrypt" => 99702 1000,/ cpuminer-nicehash
"sha256d" => 53122339 1000,/ cpuminer-opt
"sha256d" => 54669375 1000,/ cpuminer-nicehash
"shavite3" => 2232258 1000,/ cpuminer-opt
"shavite3" => 2343704 1000,/ cpuminer-nicehash
"skein" => 6405675 1000,/ cpuminer-opt
"skein" => 6586806 1000,/ cpuminer-nicehash
"skein2" => 7985012 1000,/ cpuminer-opt
"skein2" => 8167405 1000,/ cpuminer-nicehash
I'm using this version of cpuminer-multi:
https://github.com/nicehash/cpuminer-multi
Well, your pseudo code had the call/ret inside the loop.
Most of the algos in your list are of little interest, except neoscrypt. That is one algo I'd like to improve.
In relative terms it underperforms the GPU version by a lot.
Another thing to consider is that local hashrate reporting by the miner isn't very reliable, and your data
is well within a 2% margin of error. I was seeing greater variation just between different sessions of the same code.
I thought I was making incremental improvements with some changes and regressions with others, when all along
it was just noise.
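One way to separate a real difference from session-to-session noise is to record several sessions per build and compare the gap between the builds against the spread within each build. The sketch below is only an illustration of that check: the function names are hypothetical, and the "spread" measure (max minus min, divided by the mean) is a simple assumption, not a statistical test used by any miner.

```c
#include <assert.h>
#include <stddef.h>

/* Mean of n hashrate samples (n must be > 0). */
double hr_mean(const double *x, size_t n)
{
    assert(n > 0);
    double sum = 0.0;
    for (size_t i = 0; i < n; i++)
        sum += x[i];
    return sum / (double)n;
}

/* Relative spread of the samples: (max - min) / mean. */
double hr_relative_spread(const double *x, size_t n)
{
    assert(n > 0);
    double lo = x[0], hi = x[0];
    for (size_t i = 1; i < n; i++) {
        if (x[i] < lo) lo = x[i];
        if (x[i] > hi) hi = x[i];
    }
    return (hi - lo) / hr_mean(x, n);
}

/* Returns 1 if the relative difference between the two means is larger
 * than the larger of the two within-build spreads, 0 otherwise. */
int hr_difference_exceeds_noise(const double *a, size_t na,
                                const double *b, size_t nb)
{
    double ma = hr_mean(a, na);
    double mb = hr_mean(b, nb);
    double diff = ma > mb ? ma - mb : mb - ma;
    double rel_diff = diff / ((ma + mb) / 2.0);

    double sa = hr_relative_spread(a, na);
    double sb = hr_relative_spread(b, nb);
    double noise = sa > sb ? sa : sb;

    return rel_diff > noise;
}
```

Feed it the hashrates from several separate sessions of each build. If it returns 0, the difference between the builds is no bigger than the variation between sessions of the same code.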
I like intellectual challenges, but you need to do a better job. You don't provide the full picture initially and only
give more info after I poke holes in your initial presentation. This seems to be a pattern with your "suggestions".
You obviously have some knowledge, maybe not as much as me, but knowledge in areas where I am weak, C++,
for example. I'm also weak in GUI apps and web programming but I'm strong in OS fundamentals and CPU architecture,
though not specifically Linux and x86. One of my biggest challenges has been applying my knowledge and experience
to an unfamiliar environment. I tend to make a lot of mistakes as a result.
I have given you the benefit of the doubt and tried to probe you for more info in areas where I didn't have the confidence
to call you out. But so far it's come up empty. When you challenge me on one of my strengths you'd better be well
prepared.