[PVE-User] excessive I/O latency during CEPH rebuild
Adam Thompson
athompso at athompso.net
Tue Oct 28 16:05:05 CET 2014
On 14-10-28 10:03 AM, Adam Thompson wrote:
> I'm seeing ridiculous I/O latency after out'ing and re-in'ing a disk
> in the CEPH array; the OSD monitor tab shows two OSDs (i.e. disks)
> having latency above 10 ms - they're both in the 200 ms range - but
> reading a single uncached sector from a virtual disk takes >10 s.
>
> It's bad enough that all my virtualized DNS servers are timing out and
> this, of course, directly impacts service.
>
> During normal (non-rebuild, non-rebalance) operations, CEPH is not
> terribly fast to write, but delivers acceptable read speeds.
>
> Where do I start looking for problems? Are there any knobs I should
> be tweaking for CEPH?
>
A related question: to proactively replace a disk, I'm doing
Stop->Out->Remove / swap disk / Create OSD. Is that a viable
procedure? Other than the rebuild I/O starving regular reads, it seems
to be working...
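
For reference, here is a rough sketch of what I understand the CLI
equivalent of the Stop->Out->Remove steps to be. It assumes the stock
`ceph` command-line tool and a sysvinit-style `service ceph stop osd.N`
to stop the daemon (an assumption; other init systems will differ). The
function name and the OSD id 3 are made up for illustration, and the
script only prints the commands rather than running anything.

```python
def osd_retire_commands(osd_id, stop_cmd="service ceph stop osd.{id}"):
    """Return, in order, the shell commands for retiring one OSD before a disk swap.

    stop_cmd is an assumption about how the OSD daemon is managed on the
    node; pass a different template if the host uses another init system.
    """
    name = "osd.{}".format(osd_id)
    return [
        stop_cmd.format(id=osd_id),                 # Stop: halt the OSD daemon
        "ceph osd out {}".format(osd_id),           # Out: mark the OSD out of the cluster
        "ceph osd crush remove {}".format(name),    # Remove: drop it from the CRUSH map
        "ceph auth del {}".format(name),            # Remove: delete its auth key
        "ceph osd rm {}".format(osd_id),            # Remove: delete the OSD entry itself
    ]


if __name__ == "__main__":
    # Hypothetical OSD id, for illustration only.
    for cmd in osd_retire_commands(3):
        print(cmd)
```

After the physical swap, the new OSD would still have to be created on
the replacement disk, which is not covered by the sketch.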
--
-Adam Thompson
athompso at athompso.net