    Topic: Assessing the impact of TLB thrashing on memory-hard algorithms  (Read 7734 times)
    Genoil (OP)
    November 28, 2015, 10:24:19 AM
     #1

    During the development of the CUDA miner for Ethereum, I ran into an issue where the hashrate on GTX750Ti drops dramatically when the size of the memory buffer the miner operates on exceeds a certain threshold (1GB on Win7/Linux, 512MB on Win8/10). After a long discussion on the CUDA forums, one of the designers of CUDA weighed in and identified the issue as TLB thrashing. I'm currently doing a bit of research on the subject and have created a simple test program that measures these effects. It simulates the 'dagger' part of the Ethereum algorithm at different memory buffer (DAG) sizes and writes the results to a CSV file. So far, I have concluded that this is not an Nvidia-only issue; it manifests on AMD hardware as well. Apparently it is not an ETH-only issue either: I've received some reports from scrypt-jane miners too.
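    To illustrate why random reads over a growing buffer can fall off a cliff, here is a minimal toy model in Python. It is not the test program and not how any real GPU works: it assumes a fully associative LRU TLB, and the page size (2 MB) and entry count (512) are hypothetical values picked only so that the toy TLB covers 1 GB. The function names are made up for this sketch.

```python
import csv
import io
import random
from collections import OrderedDict


def tlb_hit_rate(buffer_bytes, page_bytes, tlb_entries, accesses=20000, seed=0):
    """Hit rate of a toy fully associative LRU TLB under uniformly random reads."""
    rng = random.Random(seed)
    tlb = OrderedDict()  # page number -> None, ordered from oldest to newest use
    hits = 0
    for _ in range(accesses):
        page = rng.randrange(buffer_bytes) // page_bytes
        if page in tlb:
            hits += 1
            tlb.move_to_end(page)  # mark as most recently used
        else:
            tlb[page] = None
            if len(tlb) > tlb_entries:
                tlb.popitem(last=False)  # evict the least recently used page
    return hits / accesses


def sweep_to_csv(sizes_mb, page_bytes, tlb_entries):
    """Run the toy model for each buffer size and return the results as CSV text."""
    out = io.StringIO()
    writer = csv.writer(out)
    writer.writerow(["buffer_mb", "tlb_hit_rate"])
    for mb in sizes_mb:
        rate = tlb_hit_rate(mb * 1024 * 1024, page_bytes, tlb_entries)
        writer.writerow([mb, f"{rate:.3f}"])
    return out.getvalue()


# Hypothetical TLB: 512 entries of 2 MB pages (1 GB of reach), not measured values.
print(sweep_to_csv([256, 512, 1024, 2048, 4096], 2 * 1024 * 1024, 512))
```

    In this model the hit rate stays close to 1 while the buffer fits within the TLB's reach and drops towards (reach / buffer size) once it no longer fits, since every miss evicts a page that is likely to be needed again soon.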

    I'm currently looking for results from as many hardware/OS combinations as possible, so I can come to a recommendation for miners as well as designers of new algos. Below is an example of ETH hashrate on GTX780 on Windows with increasing buffer size (in MB):

    [Image: ETH hashrate on GTX780 vs. buffer size in MB]

    The test program can be downloaded from https://github.com/Genoil/dagSimCL. Win-64 binaries are in the x64/Release folder. You can also build it yourself, but I only provide MSVC project files targeted at Nvidia OpenCL. On AMD hardware you may want to run

    Code:
    set GPU_MAX_ALLOC_PERCENT=100

    first. By default, the program tries to use all of your GPU's RAM, up to 4096MB. If you have less system RAM, you can add a command-line parameter to test up to a lower maximum:

    Code:
    dagsimCL.exe 2048

    If you have multiple GPUs, you need to add a second parameter to select the device:

    Code:
    dagsimCL.exe 4096 1

    If you have multiple OpenCL platforms installed, add a third parameter to select the platform:

    Code:
    dagsimCL.exe 4096 0 1
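
    To summarize the three positional parameters, here is a small Python sketch of how they could be interpreted. It is not taken from the dagSimCL source: the function name is made up, the order (max size in MB, device index, platform index) is inferred from the examples above, and the defaults of 0 for device and platform are assumptions. Only the 4096MB default comes from the description.

```python
def parse_dagsim_args(argv):
    """Interpret up to three positional parameters (hypothetical helper).

    argv holds the parameters after the program name, e.g. ["4096", "0", "1"].
    """
    # 4096 MB is the documented default; device 0 and platform 0 are assumed.
    defaults = [4096, 0, 0]
    if len(argv) > len(defaults):
        raise ValueError("expected at most three parameters")
    values = [int(a) for a in argv] + defaults[len(argv):]
    max_mb, device, platform = values
    if max_mb <= 0 or device < 0 or platform < 0:
        raise ValueError("parameters must be positive sizes and non-negative indices")
    return {"max_mb": max_mb, "device": device, "platform": platform}


print(parse_dagsim_args([]))                  # all defaults
print(parse_dagsim_args(["2048"]))            # lower maximum buffer size
print(parse_dagsim_args(["4096", "0", "1"]))  # first device on the second platform
```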

    I would be very grateful if you could participate in this bit of research and possibly discuss any workarounds. Thanks!

    P.S. Note that the hashrates achieved with the test program can be significantly higher than what you actually get with ethminer. This is because it only simulates the Dagger stages, not the Keccak stages.
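
    As a rough guide to what a Dagger-only number means, the memory-bound ceiling can be estimated with a few lines of Python. This is a back-of-the-envelope sketch, assuming the Ethash parameters of 64 DAG reads of 128 bytes per hash and ignoring Keccak, caching and TLB effects entirely. The 100 GB/s figure is just an example value, not a measurement of any card.

```python
ACCESSES_PER_HASH = 64   # DAG reads per hash (Ethash parameter, assumed here)
BYTES_PER_ACCESS = 128   # bytes fetched per DAG read (Ethash parameter, assumed here)


def memory_bound_hashrate(bandwidth_bytes_per_s):
    """Upper bound on hashes per second if memory bandwidth were the only limit."""
    return bandwidth_bytes_per_s / (ACCESSES_PER_HASH * BYTES_PER_ACCESS)


# Example value only: a card with 100 GB/s of memory bandwidth.
print(f"{memory_bound_hashrate(100e9) / 1e6:.1f} MH/s")
```

    A real miner should land below this ceiling, and the test program can sit closer to it than ethminer does because it skips the Keccak work.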



    ETH: 0xeb9310b185455f863f526dab3d245809f6854b4d
    BTC: 1Nu2fMCEBjmnLzqb8qUJpKgq5RoEWFhNcW