So you could lower the threshold (I am just throwing it out there) or, better yet, figure out why one GPU is not performing like the others. What are your GPUs, and are they all identical?
They are all GTX 1060s, although from different manufacturers and in different versions: 2 with dual fans and 3 with a single fan.
The errors come up randomly on any of the cards. Not sure whether that is normal or not.
That doesn't sound normal. Unfortunately, I don't have any 1060s, so I have no idea what the OC settings should be. In the meantime, just to keep mining with it as it is, you may want to disable the watchdog [in 1bash], which keeps restarting your miner whenever it detects low GPU utilization. Hopefully somebody else can chime in and help you with the OC settings.
OC settings for my 13-GPU rig of 1060 3GB cards (this runs on CentOS 7):
sudo nvidia-smi -pl 95
sudo nvidia-settings -a GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a [gpu:8]/GPUMemoryTransferRateOffset[3]=1400
and I don't touch the Graphics Clock here.
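For what it's worth, per-card memory offsets like the ones above could be scripted roughly as follows. This is only a sketch: the function names `offset_for_gpu` and `mem_offset_commands` are made up for illustration, GPU indices are assumed to be zero-based (0 to 12 for 13 cards), and it only prints the commands instead of running them, so you can check them first.

```shell
#!/bin/sh
# Sketch: print one nvidia-settings command per GPU (dry run, nothing is applied).
# Assumption: GPU 8 needs a lower memory offset (1400); all others use 1500.

offset_for_gpu() {
  case "$1" in
    8) echo 1400 ;;   # the one card that gets the lower offset
    *) echo 1500 ;;   # default for every other card
  esac
}

mem_offset_commands() {
  # $1 = number of GPUs in the rig
  i=0
  while [ "$i" -lt "$1" ]; do
    echo "sudo nvidia-settings -a '[gpu:$i]/GPUMemoryTransferRateOffset[3]=$(offset_for_gpu "$i")'"
    i=$((i + 1))
  done
}

mem_offset_commands 13
```

If one card turns out to be unstable, only the `case` table needs another entry for its index.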
My rig: 3x 1060 + 2x 970. The 1060 settings:
sudo nvidia-smi -pl 75
sudo nvidia-settings -a GPUMemoryTransferRateOffset[3]=1500
sudo nvidia-settings -a GPUGraphicsClockOffset[3]=-100
In 1bash that would be:
__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500
__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500
__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300
for GPUs 1, 2 and 3.
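To see how the 1bash values relate to the nvidia-settings calls, here is a rough sketch that prints the equivalent command for each of those variables. It assumes that 1bash variable N refers to `[gpu:N]`, which may not hold on every rig, and `print_oc_commands` is a made-up helper name, not something that exists in 1bash. It only prints the commands and does not apply anything.

```shell
#!/bin/sh
# Values copied from the 1bash settings listed above.
__CORE_OVERCLOCK_1=-100
MEMORY_OVERCLOCK_1=1500
__CORE_OVERCLOCK_2=-100
MEMORY_OVERCLOCK_2=1500
__CORE_OVERCLOCK_3=-100
MEMORY_OVERCLOCK_3=1300

# Hypothetical helper: print the nvidia-settings commands matching the
# 1bash variables for each GPU number passed as an argument.
# Assumption: variable suffix N corresponds to [gpu:N].
print_oc_commands() {
  for n in "$@"; do
    # Look up the variables whose names end in this GPU number.
    eval "core=\$__CORE_OVERCLOCK_$n"
    eval "mem=\$MEMORY_OVERCLOCK_$n"
    echo "sudo nvidia-settings -a '[gpu:$n]/GPUGraphicsClockOffset[3]=$core'"
    echo "sudo nvidia-settings -a '[gpu:$n]/GPUMemoryTransferRateOffset[3]=$mem'"
  done
}

print_oc_commands 1 2 3
```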
Mining ETC at around 24 MH/s (24,000 kH/s) per GPU. Temperatures run around 58-65 °C.
Running time: 500 - 1000 hours
Hope this helps as a reference ;-)