On Windows, cudaMiner applies a 12% memory backoff to avoid out-of-memory crashes. If your card has no monitor attached, you can get away with lowering that value to 1% or 2%, which increases performance because cudaMiner then uses more VRAM. That is, if autotune doesn't crash.
To be more specific, the 12% backoff is a "better safe than sorry" feature to avoid evil crashes during autotune with the NVIDIA WDDM drivers. I've had weird stuff happen, like single-digit hash rates and total system lockups, when taking too much memory. So I chose a pretty high default backoff percentage on Windows to avoid any kind of trouble.
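
To make the effect of the backoff concrete, here is a rough illustration of the arithmetic. It is not cudaMiner's actual code: the function name `usable_bytes` and the 2 GiB figure are made up for the example, and it assumes the backoff is simply a percentage held back from the free VRAM.

```python
def usable_bytes(free_bytes: int, backoff_percent: int) -> int:
    """Bytes left to allocate after holding back a safety margin.

    Hypothetical helper: assumes the backoff is a plain percentage
    of the free VRAM reported for the card.
    """
    if not 0 <= backoff_percent < 100:
        raise ValueError("backoff_percent must be in [0, 100)")
    # Integer arithmetic avoids float rounding on large byte counts.
    return free_bytes * (100 - backoff_percent) // 100


free = 2 * 1024**3  # assumed example: a card reporting 2 GiB free
for pct in (12, 2, 1):
    print(f"{pct}% backoff -> {usable_bytes(free, pct) // 2**20} MiB usable")
```

Dropping from 12% to 1% or 2% frees up roughly a tenth of the card's memory for the miner, which is where the extra performance comes from, and also why there is less room for error.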
Christian
I've had this happen in YACMiner as well. What I've found is that the amount of memory being allocated overflows the dedicated memory on the card and spills over into dynamic memory, i.e. system memory (think of swapping memory to disk). On one of my systems, when this happens, system RAM doesn't have enough available either, so Windows swaps it out to disk. Think about how slow hashing YAC on your hard drive would be! It's easier to pull the power cord than to recover from that.
The only way I've been able to see this is by monitoring memory with GPU-Z while launching. If you have any thoughts, I would love to try to detect when that happens in code and throttle the settings back.
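
One possible approach, sketched under assumptions: if the miner can obtain the free dedicated memory before allocating, it can clamp its settings so the request never reaches the spill-over point. On the CUDA side `cudaMemGetInfo` reports free and total device memory; whether a comparable query is available to YACMiner is not assumed here, so the sketch takes the number as a parameter. The names `fit_to_dedicated`, `requested_units` and `bytes_per_unit` are hypothetical, not YACMiner or cudaMiner settings.

```python
def fit_to_dedicated(requested_units: int, bytes_per_unit: int,
                     free_dedicated_bytes: int,
                     backoff_percent: int = 12) -> int:
    """Clamp a requested work size so it fits in dedicated VRAM.

    Hypothetical helper: 'units' stands for whatever setting scales
    memory use linearly (assumed), each costing bytes_per_unit.
    """
    if bytes_per_unit <= 0:
        raise ValueError("bytes_per_unit must be positive")
    # Hold back a safety margin so the driver keeps some headroom.
    budget = free_dedicated_bytes * (100 - backoff_percent) // 100
    # Largest unit count whose total size stays inside the budget.
    max_units = budget // bytes_per_unit
    return max(0, min(requested_units, max_units))


# Assumed example: 1 GiB free, 4 MiB per unit, 300 units requested.
units = fit_to_dedicated(300, 4 * 2**20, 1 * 2**30)
if units < 300:
    print(f"throttled to {units} units to stay in dedicated memory")
```

This only prevents the overflow up front; it does not detect a spill that has already happened, and it relies on the reported free figure being trustworthy.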