[PVE-User] excessive I/O latency during CEPH rebuild
Adam Thompson
athompso at athompso.net
Tue Oct 28 16:31:13 CET 2014
On 14-10-28 10:11 AM, Eneko Lacunza wrote:
> Hi Adam,
>
> You only have 3 osd in ceph cluster?
>
> What about journals? Are they inline or in a separate (ssd?) disk?
>
> What about network? Do you have a physically independent network for
> proxmox/vms and ceph?
>
> We currently have a 6-osd, 3-node ceph cluster; doing an out/in of an
> osd doesn't create a very high impact. If you put in a new osd (replace a
> disk) the impact is noticeable, but our ~30 VMs were still usable. We
> do have physically separate networks for proxmox/VMs and ceph (1gbit).
4 nodes.
2 OSDs per node.
Journal on the same drive as the OSD, unfortunately... the nodes only
have 3 drive bays each.
Each node has 4 x 1Gb network ports in an LACP bond, using OpenVSwitch, with
VLANs on top of that and a dedicated VLAN for CEPH and Proxmox management.
Total network bandwidth in use from each node during the rebuild is only
~1.5Gbps, with no single LACP member ever bursting higher than ~600Mbps. I
believe it's unlikely to be a network problem; I've stress-tested OVS at
much higher data rates than this.
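If anyone wants to sanity-check their own bond the same way, something along
these lines should work; the bond and interface names here are just from my
setup and will need adjusting:

  ovs-appctl bond/show bond0        # confirm all four members are enabled in the LACP bond
  ovs-appctl lacp/show bond0        # per-member LACP negotiation state
  ifstat -i eth0,eth1,eth2,eth3 5   # per-NIC throughput, sampled every 5 seconds during rebuild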
You mention setting 'noout'; is there a way to do that inside the GUI,
or should I just do that at the CEPH CLI with "ceph osd set noout"? I
can see that this would skip one rebalancing step, but I still have to
rebalance after I replace each disk, don't I?
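For reference, the rough CLI sequence I have in mind is sketched below; the
OSD id is just a placeholder, and the middle step is whatever the usual
stop/replace/recreate procedure is for your setup:

  ceph osd set noout      # keep CRUSH from marking the down OSD "out"
  # stop osd.N, swap the physical disk, recreate the OSD on the new drive
  ceph osd unset noout    # restore normal out/rebalance behaviour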
FWIW, I'm replacing 8x 250GB disks with 8x 500GB disks that became
available from another storage cluster. I'm almost done at this
point... just want to know how to avoid the massive performance hit next
time.
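The one knob I'm planning to try next time (hedging here, since I haven't
tested it on this cluster yet) is throttling backfill/recovery so client I/O
keeps priority, something like:

  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'

and then putting the values back to their defaults once the rebuild finishes.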
Oh, and on the node with the new disk, I see IOWAIT times of ~15%, which
makes sense IMHO, since I'm writing a ton of data to the new disk.
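In case anyone wants to reproduce that measurement, something like the
following (iostat is part of the sysstat package) shows both the CPU-level
and per-disk view:

  iostat -x 5    # %iowait in the avg-cpu line, per-disk %util further down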
--
-Adam Thompson
athompso at athompso.net