[PVE-User] excessive I/O latency during CEPH rebuild
Adam Thompson
athompso at athompso.net
Tue Oct 28 16:31:13 CET 2014
On 14-10-28 10:11 AM, Eneko Lacunza wrote:
> Hi Adam,
>
> You only have 3 osd in ceph cluster?
>
> What about journals? Are they inline or in a separate (ssd?) disk?
>
> What about network? Do you have a physically independent network for
> proxmox/vms and ceph?
>
> We currently have a 6-osd, 3-node ceph cluster; doing an out/in of an
> osd doesn't create a very high impact. If you put in a new osd (replace a
> disk) the impact is noticeable, but our ~30 VMs were still usable. We
> do have physically separate networks for proxmox/VMs and ceph (1gbit).
4 nodes.
2 OSDs per node.
Journal on the same drive as the OSD, unfortunately... the nodes only
have 3 drive bays each.
Each node has 4 x 1Gb network ports in an LACP bond, using OpenVSwitch, with
VLANs on top of that and a dedicated VLAN for CEPH and Proxmox management.
Total network bandwidth in use from each node during the rebuild is only
~1.5Gbps, with no single LACP member ever bursting higher than ~600Mbps. I
believe it's unlikely to be a network problem; I've stress-tested OVS at
much higher data rates than this.
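If anyone wants to sanity-check their own bond the same way, something along
these lines should work; the bond and interface names here are just from my
setup and will need adjusting:

  ovs-appctl bond/show bond0        # confirm all four members are enabled in the LACP bond
  ovs-appctl lacp/show bond0        # per-member LACP negotiation state
  ifstat -i eth0,eth1,eth2,eth3 5   # per-NIC throughput, sampled every 5 seconds during rebuild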
You mention setting 'noout'; is there a way to do that inside the GUI,
or should I just do that at the CEPH CLI with "ceph osd set noout"? I
can see that this would skip one rebalancing step, but I still have to
rebalance after I replace each disk, don't I?
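For reference, the rough CLI sequence I have in mind is sketched below; the
OSD id is just a placeholder, and the middle step is whatever the usual
stop/replace/recreate procedure is for your setup:

  ceph osd set noout      # keep CRUSH from marking the down OSD "out"
  # stop osd.N, swap the physical disk, recreate the OSD on the new drive
  ceph osd unset noout    # restore normal out/rebalance behaviour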
FWIW, I'm replacing 8x 250GB disks with 8x 500GB disks that became
available from another storage cluster. I'm almost done at this
point... just want to know how to avoid the massive performance hit next
time.
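The one knob I'm planning to try next time (hedging here, since I haven't
tested it on this cluster yet) is throttling backfill/recovery so client I/O
keeps priority, something like:

  ceph tell osd.* injectargs '--osd-max-backfills 1 --osd-recovery-max-active 1 --osd-recovery-op-priority 1'

and then putting the values back to their defaults once the rebuild finishes.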
Oh, and on the node with the new disk, I see IOWAIT times of ~15%, which
makes sense IMHO, since I'm writing a ton of data to the new disk.
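In case anyone wants to reproduce that measurement, something like the
following (iostat is part of the sysstat package) shows both the CPU-level
and per-disk view:

  iostat -x 5    # %iowait in the avg-cpu line, per-disk %util further down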
--
-Adam Thompson
athompso at athompso.net