[pve-devel] [PATCH v2 storage] fix #1816: rbd: add support for erasure coded ec pools

Aaron Lauterer a.lauterer at proxmox.com
Fri Jan 28 12:22:41 CET 2022


The first step is to allocate rbd images correctly.

The metadata objects still need to be stored in a replicated pool, but
by providing the --data-pool parameter on image creation, we can place
the data objects on the erasure coded (EC) pool.

Signed-off-by: Aaron Lauterer <a.lauterer at proxmox.com>
---
changes: add data-pool parameter in clone_image() if present


Right now this only affects disk image creation and cloning.
The EC pool needs to be created manually to test this.
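
As a rough illustration of what changes for image creation, the sketch
below mirrors the argument handling of alloc_image() in plain shell. It
only prints the arguments, it does not call rbd. The helper name
rbd_create_args is made up for this example, the connection options
added by the plugin are left out, and it assumes the size is passed in
KiB, as the division by 1024 in the plugin suggests.

```shell
#!/bin/sh
# Hypothetical helper mirroring alloc_image(): print the rbd arguments.
# Arguments: <size in KiB> <image name> [data pool]
rbd_create_args() {
    size_kib=$1
    name=$2
    data_pool=${3:-}
    # Round up to whole MiB, like int(($size+1023)/1024) in the plugin.
    size_mib=$(( (size_kib + 1023) / 1024 ))
    args="create --image-format 2 --size $size_mib"
    # Only add --data-pool when one is configured for the storage.
    if [ -n "$data_pool" ]; then
        args="$args --data-pool $data_pool"
    fi
    # The image name always comes last.
    echo "$args $name"
}
```

Without a third argument the output matches the old command line, so
storages without a data pool should behave as before.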

The Ceph blog about EC with RBD + CephFS gives a nice introduction and
the necessary steps to set up such a pool [0].

The steps needed are:

- create EC profile (a k=2, m=1 profile is only useful for testing
     purposes in a 3 node cluster, not something that should be
     considered for production use!)
# ceph osd erasure-code-profile set ec-21-profile k=2 m=1 crush-failure-domain=host

- create a new pool with that profile
# ceph osd pool create ec21pool erasure ec-21-profile

- allow overwrite
# ceph osd pool set ec21pool allow_ec_overwrites true

- enable application rbd on the pool (the command in the blog seems to
    have gotten the order of parameters a bit wrong here)
# ceph osd pool application enable ec21pool rbd

- add storage configuration
# pvesm add rbd ectest --pool <replicated pool> --data-pool ec21pool

For the replicated pool, either create a new one without adding the PVE
storage config or use a namespace to separate it from the existing pool.

To create a namespace:
# rbd namespace create <pool>/<namespace>

Then add the '--namespace' parameter to the pvesm add command.
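
The steps above can be collected in a small dry-run helper. This is
only a sketch: it prints the commands instead of running them, the
function name print_ec_setup and the example names in the usage line
are made up, and k=2/m=1 is hard-coded for a test setup.

```shell
#!/bin/sh
# Print (do not run) the setup commands for an EC-backed RBD storage.
# Arguments: <profile> <ec pool> <replicated pool> <storage id> [namespace]
print_ec_setup() {
    profile=$1
    ec_pool=$2
    rep_pool=$3
    storeid=$4
    namespace=${5:-}
    echo "ceph osd erasure-code-profile set $profile k=2 m=1 crush-failure-domain=host"
    echo "ceph osd pool create $ec_pool erasure $profile"
    echo "ceph osd pool set $ec_pool allow_ec_overwrites true"
    echo "ceph osd pool application enable $ec_pool rbd"
    if [ -n "$namespace" ]; then
        # Optional: separate the images from others in the replicated pool.
        echo "rbd namespace create $rep_pool/$namespace"
        echo "pvesm add rbd $storeid --pool $rep_pool --data-pool $ec_pool --namespace $namespace"
    else
        echo "pvesm add rbd $storeid --pool $rep_pool --data-pool $ec_pool"
    fi
}
```

For example, `print_ec_setup ec-21-profile ec21pool rbd ectest` prints
the five commands for a setup without a namespace, so they can be
reviewed before running them by hand.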

To check if the objects are stored correctly, you can run rados:

# rados -p <pool> ls

This should only show metadata objects.

# rados -p <ec pool> ls

This should then show only `rbd_data.xxx` objects.
If you configured a namespace, you also need to add the `--namespace`
parameter to the rados command.
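
The two checks can also be scripted. The following is a minimal sketch
that reads the output of `rados ls` on stdin. The function names
only_metadata and only_data are made up for this example, and anything
that does not start with `rbd_data.` is simply treated as metadata.

```shell
#!/bin/sh
# Succeeds if stdin contains no rbd_data.* object names.
# Intended for the listing of the replicated pool.
only_metadata() {
    ! grep -q '^rbd_data\.'
}

# Succeeds if stdin is non-empty and every line is an rbd_data.* name.
# Intended for the listing of the EC pool.
only_data() {
    awk '
        { n++ }
        !/^rbd_data\./ { bad++ }
        END { exit (n > 0 && bad == 0) ? 0 : 1 }
    '
}
```

Possible usage, with the same placeholders as above:
`rados -p <pool> ls | only_metadata && echo ok` and
`rados -p <ec pool> ls | only_data && echo ok`.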


[0] https://ceph.io/en/news/blog/2017/new-luminous-erasure-coding-rbd-cephfs/


 PVE/Storage/RBDPlugin.pm | 23 +++++++++++++----------
 1 file changed, 13 insertions(+), 10 deletions(-)

diff --git a/PVE/Storage/RBDPlugin.pm b/PVE/Storage/RBDPlugin.pm
index 2607d25..efb2187 100644
--- a/PVE/Storage/RBDPlugin.pm
+++ b/PVE/Storage/RBDPlugin.pm
@@ -289,6 +289,10 @@ sub properties {
 	    description => "Pool.",
 	    type => 'string',
 	},
+	'data-pool' => {
+	    description => "Data Pool (for erasure coding only)",
+	    type => 'string',
+	},
 	namespace => {
 	    description => "RBD Namespace.",
 	    type => 'string',
@@ -318,6 +322,7 @@ sub options {
 	disable => { optional => 1 },
 	monhost => { optional => 1},
 	pool => { optional => 1 },
+	'data-pool' => { optional => 1 },
 	namespace => { optional => 1 },
 	username => { optional => 1 },
 	content => { optional => 1 },
@@ -492,15 +497,10 @@ sub clone_image {
     my $newvol = "$basename/$name";
     $newvol = $name if length($snapname);
 
-    my $cmd = $rbd_cmd->(
-	$scfg,
-	$storeid,
-	'clone',
-	get_rbd_path($scfg, $basename),
-	'--snap',
-	$snap,
-	get_rbd_path($scfg, $name),
-    );
+    my @options = ('clone', get_rbd_path($scfg, $basename), '--snap', $snap);
+    push @options, ('--data-pool', $scfg->{'data-pool'}) if $scfg->{'data-pool'};
+    push @options, get_rbd_path($scfg, $name);
+    my $cmd = $rbd_cmd->($scfg, $storeid, @options);
 
     run_rbd_command($cmd, errmsg => "rbd clone '$basename' error");
 
@@ -516,7 +516,10 @@ sub alloc_image {
 
     $name = $class->find_free_diskname($storeid, $scfg, $vmid) if !$name;
 
-    my $cmd = $rbd_cmd->($scfg, $storeid, 'create', '--image-format' , 2, '--size', int(($size+1023)/1024), $name);
+    my @options = ('create', '--image-format' , 2, '--size', int(($size+1023)/1024));
+    push @options, ('--data-pool', $scfg->{'data-pool'}) if $scfg->{'data-pool'};
+    push @options, $name;
+    my $cmd = $rbd_cmd->($scfg, $storeid, @options);
     run_rbd_command($cmd, errmsg => "rbd create '$name' error");
 
     return $name;
-- 
2.30.2





