[pve-devel] [PATCH docs] Add section Ceph CRUSH and device classes for pool assignment

Alwin Antreich a.antreich at proxmox.com
Fri Nov 17 14:29:18 CET 2017


Signed-off-by: Alwin Antreich <a.antreich at proxmox.com>
---
 pveceph.adoc | 71 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 71 insertions(+)

diff --git a/pveceph.adoc b/pveceph.adoc
index c5eec4f..f152052 100644
--- a/pveceph.adoc
+++ b/pveceph.adoc
@@ -284,6 +284,77 @@ operation footnote:[Ceph pool operation
 http://docs.ceph.com/docs/luminous/rados/operations/pools/]
 manual.
 
+Ceph CRUSH & device classes
+---------------------------
+The foundation of Ceph is its algorithm, **C**ontrolled **R**eplication
+**U**nder **S**calable **H**ashing
+(CRUSH footnote:[CRUSH https://ceph.com/wp-content/uploads/2016/08/weil-crush-sc06.pdf]).
+
+CRUSH calculates where to store and retrieve data from; this has the advantage
+that no central index service is needed. CRUSH works with a map of OSDs, buckets
+(device locations) and rulesets (data replication) for pools.
+
+NOTE: Further information can be found in the Ceph documentation, under the
+section CRUSH map footnote:[CRUSH map http://docs.ceph.com/docs/luminous/rados/operations/crush-map/].
+
+This map can be altered to reflect different replication hierarchies. The object
+replicas can be separated (e.g. across failure domains), while maintaining the
+desired distribution.
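+
+If you want to inspect the map itself, it can be dumped and decompiled into a
+plain text file, for example (the file names are arbitrary):
+
+[source, bash]
+----
+# dump the compiled CRUSH map and decompile it into a readable text file
+ceph osd getcrushmap -o crushmap.bin
+crushtool -d crushmap.bin -o crushmap.txt
+----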
+
+A common use case is to use different classes of disks for different Ceph pools.
+For this reason, Ceph introduced device classes with Luminous, to accommodate
+the need for easy ruleset generation.
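+
+For example, the device classes known to the cluster and the OSDs belonging to
+a class can be listed like this (the 'nvme' class is only an example):
+
+[source, bash]
+----
+# list all device classes currently in use
+ceph osd crush class ls
+# list the OSDs assigned to the example class 'nvme'
+ceph osd crush class ls-osd nvme
+----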
+
+The device classes can be seen in the 'ceph osd tree' output. Each class gets
+its own shadow root bucket, which can be seen with the command below.
+
+[source, bash]
+----
+ceph osd crush tree --show-shadow
+----
+
+Example output of the above command:
+
+[source, bash]
+----
+ID  CLASS WEIGHT  TYPE NAME
+-16  nvme 2.18307 root default~nvme
+-13  nvme 0.72769     host sumi1~nvme
+ 12  nvme 0.72769         osd.12
+-14  nvme 0.72769     host sumi2~nvme
+ 13  nvme 0.72769         osd.13
+-15  nvme 0.72769     host sumi3~nvme
+ 14  nvme 0.72769         osd.14
+ -1       7.70544 root default
+ -3       2.56848     host sumi1
+ 12  nvme 0.72769         osd.12
+ -5       2.56848     host sumi2
+ 13  nvme 0.72769         osd.13
+ -7       2.56848     host sumi3
+ 14  nvme 0.72769         osd.14
+----
+
+To let a pool distribute its objects only on a specific device class, you first
+need to create a ruleset for that class.
+
+[source, bash]
+----
+ceph osd crush rule create-replicated <rule-name> <root> <failure-domain> <class>
+----
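+
+For example, a rule that keeps replicas on OSDs of the 'nvme' class only, with
+'host' as failure domain, could look like this (the rule name is just an
+example):
+
+[source, bash]
+----
+# hypothetical rule 'nvme_rule': replicate across hosts, only on 'nvme' OSDs
+ceph osd crush rule create-replicated nvme_rule default host nvme
+----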
+
+Once the rule is in the CRUSH map, you can tell a pool to use the ruleset.
+
+[source, bash]
+----
+ceph osd pool set <pool-name> crush_rule <rule-name>
+----
+
+TIP: If the pool already contains objects, all of them have to be moved
+accordingly. Depending on your setup, this may introduce a big performance hit
+on your cluster. As an alternative, you can create a new pool and move disks
+separately.
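+
+Such a new pool can be created with the desired rule right away, for example
+(pool name, placement group numbers and rule name are only examples):
+
+[source, bash]
+----
+# create a replicated pool with 128 PGs using the example rule 'nvme_rule'
+ceph osd pool create nvme-pool 128 128 replicated nvme_rule
+----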
+
+
 Ceph Client
 -----------
 
-- 
2.11.0
