[PVE-User] excessive I/O latency during CEPH rebuild

Eneko Lacunza elacunza at binovo.es
Tue Oct 28 16:45:51 CET 2014

Hi Adam,

On 28/10/14 16:31, Adam Thompson wrote:
> 4 nodes.
> 2 OSDs per node.
> Journal on the same drive as the OSD, unfortunately... the nodes only 
> have 3 drive bays each.
> Each node has 4 x 1Gb network in LACP bond, using OpenVSwitch, VLANs 
> on top of that.  Dedicated VLAN for CEPH and Proxmox management. Total 
> network bandwidth in use from each node during rebuild is only 
> ~1.5Gbps, with no single LACP member ever bursting higher than 
> ~600Mbps.  I believe it's unlikely to be a network problem, I've 
> stress-tested OVS at much higher data rates than this.
Yes I agree.
> You mention setting 'noout'; is there a way to do that inside the GUI, 
> or should I just do that at the CEPH CLI with "ceph osd set noout"?  I 
> can see that this would skip one rebalancing step, but I still have to 
> rebalance after I replace each disk, don't I?

Yes. I don't know of a way to set/unset noout from the GUI; I don't 
think there is one.
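
If you want to script it, here is a minimal sketch of wrapping a
maintenance step in set/unset noout. The only real commands are
"ceph osd set noout" and "ceph osd unset noout"; the helper names
(run_ceph, noout) are made up for illustration, and it assumes the
ceph CLI is on the PATH of a node with admin access.

```python
import subprocess
from contextlib import contextmanager


def run_ceph(args):
    """Run a ceph CLI command; raises CalledProcessError if it fails."""
    subprocess.run(["ceph", *args], check=True)


@contextmanager
def noout(run=run_ceph):
    """Hypothetical helper: keep the noout flag set while the block runs."""
    run(["osd", "set", "noout"])
    try:
        yield
    finally:
        # Always clear the flag, even if the maintenance step failed,
        # so the cluster does not stay in noout by accident.
        run(["osd", "unset", "noout"])


if __name__ == "__main__":
    with noout():
        input("Swap the disk, then press Enter to unset noout... ")
```

Passing a different "run" callable makes it possible to dry-run the
sequence without touching the cluster.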
> FWIW, I'm replacing 8x 250GB disks with 8x 500GB disks that became 
> available from another storage cluster.  I'm almost done at this 
> point... just want to know how to avoid the massive performance hit 
> next time.
> Oh, and on the node with the new disk, I see IOWAIT times of ~15%. 
> Which makes sense IMHO, I'm writing a ton of data to the new disk.
Sure it does; when you add a new disk, Ceph changes the CRUSH map to 
take the new disk into account. That changes the location of data on 
the disks, and all the data now belonging to the new disk must be 
written. I think the new disk being a different size could make data on 
the other disks move around too; this should show in the relative 
weights of the OSDs in the GUI.
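
A rough back-of-the-envelope sketch of the weight effect, assuming each
OSD's CRUSH weight is proportional to its disk size and that data ends
up spread in proportion to weight. It ignores the host level of the
CRUSH hierarchy and replica placement, so treat the numbers as
approximate; expected_share is a made-up helper name.

```python
def expected_share(sizes_gb):
    """Fraction of the data each OSD would hold if placement were
    exactly proportional to disk size (a simplifying assumption)."""
    total = sum(sizes_gb)
    return [size / total for size in sizes_gb]


# Eight 250GB OSDs, as in the original cluster.
before = expected_share([250] * 8)

# One disk replaced by a 500GB one, the other seven unchanged.
after = expected_share([500] + [250] * 7)

print(f"old disks: {before[1]:.1%} -> {after[1]:.1%}")
print(f"new disk:  {before[0]:.1%} -> {after[0]:.1%}")
```

Under those assumptions each remaining 250GB disk drops from 12.5% to
about 11.1% of the data while the 500GB disk takes about 22.2%, so
every other OSD has to give up some data, not just the replaced one.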

In our case we have an SSD for every 2 OSD disks (in each node). Maybe 
we don't see such a high impact because of this; we recently RMA'd a 
disk and the change was quite smooth. Your disks must be seeking like 
it's the end of the world... ;)


