[PVE-User] zfs raidz2 expansion

Aaron Lauterer a.lauterer at proxmox.com
Mon May 26 09:38:30 CEST 2025



On 2025-05-23 20:17, Randy Bush wrote:
>> There is no option to replicate a full ZFS pool to another.
> 
> not exactly what i want to do.  my bad in saying "full replication."
> what i meant was all vms are replicated. on other nodes.
> 
> i was thinking that each node could have one pool for primary vm images
> and a second to receive replication from other nodes.

Ah okay. No, the way it works is that you have ZFS pools with the same 
name on both nodes. When replication is configured for a guest, its disk 
images are replicated to the pool with the same name on the other node.
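As a sketch of that setup: replication is configured per guest, not per pool. Assuming a hypothetical VM 100 and a second node named `pve2` (both placeholders), it would look something like:

```shell
# Replicate VM 100's disks to node "pve2" every 15 minutes.
# Job IDs follow the <vmid>-<number> pattern; "pve2" and VM 100
# are placeholder names. Both nodes need a ZFS pool with the same
# name backing the VM's disk images.
pvesr create-local-job 100-0 pve2 --schedule '*/15'

# Show the state of all replication jobs on this node.
pvesr status
```

The same can be done in the web UI under the guest's Replication panel.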

> 
>> So, you have a current pool with one raidz2 VDEV made up of 4x 2TB
>> disks.
> 
> yup
> 
> 
>> Because if you have another set of 4x 2TB disks, you can just expand
>> the pool with another raidz2 VDEV, without expanding the current VDEV
>> you have.
> 
> yup.  what are the performance implications?

As usual with ZFS, the change will only affect newly written data. New 
writes will be spread over both VDEVs, likely with a bias toward the 
newer, much emptier VDEV. So if you are already happy with the current 
performance, you should see similar or better performance, depending on 
whether one or both VDEVs are involved in a given read or write.
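You can watch how the I/O and the allocation actually spread across the VDEVs after the expansion; a sketch, assuming the pool is simply called `pool`:

```shell
# Per-VDEV I/O statistics, refreshed every 5 seconds.
# -v breaks the numbers down by VDEV (raidz2-0 vs raidz2-1),
# so the allocation bias toward the emptier VDEV is visible.
zpool iostat -v pool 5

# Per-VDEV capacity; the ALLOC/FREE columns show how unevenly
# full the two VDEVs are.
zpool list -v pool
```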

> 
>> If you add another VDEV, the pool could have the following layout:
>>
>> pool
>>    raidz2-0
>>      disk0
>>      disk1
>>      disk2
>>      disk3
>>    raidz2-1
>>      disk4
>>      disk5
>>      disk6
>>      disk7
> 
> yup
> 
>> If you want to create a new pool, then things will be a bit more
>> complicated, as you would need to create a new storage config for it
>> as well and Move-Disk all the disks over to it. If you have a cluster
>> and use the VM replication feature, that new pool must be present on
>> the other nodes as well, and you will have to remove the replication
>> jobs before you move the disks to the new pool, then re-create them
>> once all VM disks are on the new pool.
> 
> we would keep the nodes all symmetric, so that would not be an issue.
> and it's just a few hours of ops pain to de-repl and re-repl.  but what
> i do not see is how to tell `/etc/pve/storage.cfg` that pool0 is for
> images and pool1 is for incoming replication.  maybe i am just trying to
> do something too weird.

Yeah, see my reply at the beginning. I think you have a more complicated 
picture of the replication in mind than it actually is.

If you are okay with the current performance, I would just add the 
second VDEV to the pool with
`zpool add {pool} raidz2 /dev/disk/by-id/nvme-… /dev/disk/by-id/nvme-…`

Before you do it on a production system, you can test the procedure in a 
(virtual) test machine to make sure you get the CLI command correct.
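One way to do that rehearsal without spare hardware is with file-backed vdevs. A sketch (test machine only, never production; pool and file names are made up, and file vdevs are for practice, not redundancy):

```shell
# Create 8 sparse 1 GiB files to stand in for disks.
for i in $(seq 0 7); do truncate -s 1G /tmp/disk$i; done

# Recreate the current layout: one raidz2 VDEV of 4 "disks".
zpool create testpool raidz2 /tmp/disk0 /tmp/disk1 /tmp/disk2 /tmp/disk3

# Rehearse the expansion: add a second raidz2 VDEV.
zpool add testpool raidz2 /tmp/disk4 /tmp/disk5 /tmp/disk6 /tmp/disk7

# Verify the layout matches the expected raidz2-0 / raidz2-1 structure.
zpool status testpool

# Clean up.
zpool destroy testpool
rm /tmp/disk[0-7]
```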

By extending the pool, you don't need to change anything in the storage 
config or replication settings.
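For reference, if you did go the separate-pool route anyway, each pool would need its own entry in `/etc/pve/storage.cfg`. A sketch with the hypothetical pool names `pool0` and `pool1` from your mail:

```
zfspool: pool0
        pool pool0
        content images,rootdir
        sparse 1

zfspool: pool1
        pool pool1
        content images,rootdir
        sparse 1
```

But note there is no "incoming replication only" role here; both entries are ordinary storages, and replication always targets the same-named pool on the other node. Moving a disk between them would be `qm move-disk <vmid> <disk> <storage>` (or Move Disk in the UI).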

> 
> randy
> 




