[PVE-User] excessive I/O latency during CEPH rebuild

Adam Thompson athompso at athompso.net
Tue Oct 28 16:05:05 CET 2014

On 14-10-28 10:03 AM, Adam Thompson wrote:
> I'm seeing ridiculous I/O latency after out'ing and re-in'ing a disk 
> in the CEPH array; the OSD monitor tab shows two OSDs (i.e. disks) 
> having latency above 10msec - they're both in the 200ms range - but 
> reading a single uncached sector from a virtual disk takes >10sec.
> It's bad enough that all my virtualized DNS servers are timing out and 
> this, of course, directly impacts service.
> During normal (non-rebuild, non-rebalance) operations, CEPH is not 
> terribly fast to write, but delivers acceptable read speeds.
> Where do I start looking for problems?  Are there any knobs I should 
> be tweaking for CEPH?

A related question: to proactively replace a disk, I'm doing 
Stop -> Out -> Remove on the old OSD, swapping the disk, then Create 
OSD on the new one.  Is that a viable procedure?  Other than the 
rebuild I/O starving regular reads, it seems to be working.
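
For anyone who wants to see that sequence spelled out, here is a rough 
sketch of what it might look like on the command line.  It only 
builds and prints the commands (a dry run); nothing is executed.  The 
function name, the OSD id 3 and the device /dev/sdX are hypothetical 
placeholders, and the mapping from the GUI buttons to these commands 
is my assumption, not something I have verified.

```python
from typing import List


def replacement_plan(osd_id: int, device: str) -> List[str]:
    """Build (but do not run) the commands for one disk replacement.

    Assumption: the GUI's Stop / Out / Remove / Create OSD buttons
    correspond roughly to the commands below on a sysvinit-based node.
    """
    osd = f"osd.{osd_id}"
    return [
        f"service ceph stop {osd}",        # Stop: halt the OSD daemon
        f"ceph osd out {osd_id}",          # Out: stop mapping data to it
        f"ceph osd crush remove {osd}",    # Remove: drop it from the CRUSH map,
        f"ceph auth del {osd}",            #   delete its auth key,
        f"ceph osd rm {osd_id}",           #   and remove the OSD entry
        f"# physically swap the disk behind {device} here",
        f"pveceph createosd {device}",     # Create OSD on the new disk
    ]


if __name__ == "__main__":
    # Hypothetical OSD id and device name, for illustration only.
    for command in replacement_plan(3, "/dev/sdX"):
        print(command)
```

Printing the plan first makes it easy to check the OSD id and device 
before typing anything on a live node.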

-Adam Thompson
  athompso at athompso.net
