From gaio at lilliput.linux.it Fri May 2 17:21:05 2025
From: gaio at lilliput.linux.it (Marco Gaiarin)
Date: Fri, 2 May 2025 17:21:05 +0200
Subject: [PVE-User] Dell PowerEdge T140, megaraid, BIOS Boot...
Message-ID: 

I've just upgraded to PVE8 a little cluster built with a PowerEdge T340
(boot in UEFI mode) and a T140 (boot in BIOS mode).

The T340 works like a charm; the T140 panicked after rebooting because it
could not find the root filesystem. Looking carefully at the kernel log
revealed that the disks were not recognized, and going deeper led me to:

gaio at leia:~$ grep megaraid T140_boot_log.txt
[ 31.952470] megaraid_sas 0000:01:00.0: FW now in Ready state
[ 31.952472] megaraid_sas 0000:01:00.0: 63 bit DMA mask and 32 bit consistent mask
[ 31.952624] megaraid_sas 0000:01:00.0: firmware supports msix : (96)
[ 31.952750] megaraid_sas 0000:01:00.0: requested/available msix 5/5 poll_queue 0
[ 31.952752] megaraid_sas 0000:01:00.0: current msix/online cpus : (5/4)
[ 31.952753] megaraid_sas 0000:01:00.0: RDPQ mode : (disabled)
[ 31.952758] megaraid_sas 0000:01:00.0: Current firmware supports maximum commands: 928 LDIO threshold: 237
[ 31.954076] megaraid_sas 0000:01:00.0: Performance mode :Latency (latency index = 1)
[ 31.954078] megaraid_sas 0000:01:00.0: FW supports sync cache : No
[ 31.954080] megaraid_sas 0000:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009
[ 32.362294] megaraid_sas 0000:01:00.0: Ignore DCMD timeout: megasas_get_ctrl_info 5382
[ 32.665126] megaraid_sas 0000:01:00.0: Could not get controller info. Fail from megasas_init_adapter_fusion 1907
[ 32.908139] megaraid_sas 0000:01:00.0: Failed from megasas_init_fw 6539

Googling a bit led me to some forums and pages; after some tests I added
this to the kernel boot parameters:

	iommu=pt

and now the server boots flawlessly.


The two servers have different hardware but share the same BIOS, so this
led me to the hypothesis that the new megaraid_sas kernel driver depends on
UEFI initialization to work, or something like that.

Also, I understand that 'pt' means 'passthrough', but it is not clear to me
what overall consequences (performance, stability, ...) this option has.


I'm seeking feedback. Thanks.

--

From devzero at web.de Fri May 2 19:03:01 2025
From: devzero at web.de (RolandK)
Date: Fri, 2 May 2025 19:03:01 +0200
Subject: [PVE-User] Dell PowerEdge T140, megaraid, BIOS Boot...
In-Reply-To: 
References: 
Message-ID: 

hello,

there is a note at https://pve.proxmox.com/wiki/Roadmap#Proxmox_VE_8.2

Kernel: intel_iommu now defaults to on

The intel_iommu parameter defaults to on in the kernel 6.8 series.
Enabling IOMMU can cause problems with older hardware, or systems with a
BIOS that is not up to date, due to bugs in the BIOS.

The issue can be fixed by explicitly disabling intel_iommu on the kernel
command line (intel_iommu=off), following the reference documentation.

does intel_iommu=off also help?

regards
Roland

Am 02.05.25 um 17:21 schrieb Marco Gaiarin:
> I've just upgraded to PVE8 a little cluster built with an PowerEdge T340
> (boot in UEFI mode) and an T140 (boot in BIOS mode).
> > T340 works like a charme; T140 after rebooting panicked because cannot find > root filesystem; looking carefully at kernel log revealed that disks get not > recognized, and going deeper lead me to: > > gaio at leia:~$ grep megaraid T140_boot_log.txt > [ 31.952470] megaraid_sas 0000:01:00.0: FW now in Ready state > [ 31.952472] megaraid_sas 0000:01:00.0: 63 bit DMA mask and 32 bit consistent mask > [ 31.952624] megaraid_sas 0000:01:00.0: firmware supports msix : (96) > [ 31.952750] megaraid_sas 0000:01:00.0: requested/available msix 5/5 poll_queue 0 > [ 31.952752] megaraid_sas 0000:01:00.0: current msix/online cpus : (5/4) > [ 31.952753] megaraid_sas 0000:01:00.0: RDPQ mode : (disabled) > [ 31.952758] megaraid_sas 0000:01:00.0: Current firmware supports maximum commands: 928 LDIO threshold: 237 > [ 31.954076] megaraid_sas 0000:01:00.0: Performance mode :Latency (latency index = 1) > [ 31.954078] megaraid_sas 0000:01:00.0: FW supports sync cache : No > [ 31.954080] megaraid_sas 0000:01:00.0: megasas_disable_intr_fusion is called outbound_intr_mask:0x40000009 > [ 32.362294] megaraid_sas 0000:01:00.0: Ignore DCMD timeout: megasas_get_ctrl_info 5382 > [ 32.665126] megaraid_sas 0000:01:00.0: Could not get controller info. Fail from megasas_init_adapter_fusion 1907 > [ 32.908139] megaraid_sas 0000:01:00.0: Failed from megasas_init_fw 6539 > > googling a bit lead me to some forum and pages; after some test i've added > to kernel boot parameters: > > iommu=pt > > and now server boot flawlessy. > > > The two servers have different hardware but share the same BIOS, so this > lead to me to the hypotesis that the new megaraid_sas kernel driver depend > on UEFI initialization to work, or something like this. > > Also, i've understood that 'pt' mean 'PassThrought', but i've not clear what > overral consequences (performance, stability, ...) have this options. > > > I'm seeking feedback. Thanks. > From gaio at lilliput.linux.it Wed May 14 22:38:47 2025 From: gaio at lilliput.linux.it (Marco Gaiarin) Date: Wed, 14 May 2025 22:38:47 +0200 Subject: [PVE-User] Dell PowerEdge T140, megaraid, BIOS Boot... In-Reply-To: References: Message-ID: Mandi! RolandK In chel di` si favelave... > does intel_iommu=off also help ? Yes, also this fix the trouble. But is really strange... most of our T140 (ALL boot in BIOS mode) work without 'intel_iommu=off', but three server need it to boot. All get upgaded to the latest BIOS revision. Also, our T340 start to suffer ZFS trouble, and 'intel_iommu=off' seems to cure also that. We have added 'intel_iommu=off' to all our server, as a safety measure... -- From leesteken+proxmox at pm.me Wed May 14 22:46:36 2025 From: leesteken+proxmox at pm.me (Arjen) Date: Wed, 14 May 2025 20:46:36 +0000 Subject: [PVE-User] Dell PowerEdge T140, megaraid, BIOS Boot... In-Reply-To: References: Message-ID: On Wednesday, 14 May 2025 at 22:40, Marco Gaiarin wrote: > But is really strange... most of our T140 (ALL boot in BIOS mode) work > without 'intel_iommu=off', but three server need it to boot. All get upgaded > to the latest BIOS revision. Maybe those three have VT-d (or IOMMU) enabled in the motherboard BIOS? intel_iommu=on became the default in kernel version 6.8. Unless VT-d is enabled in the motherboard BIOS settings, intel_iommu=off does nothing (as it is disabled in the motherboard BIOS). 
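If you do want to set it explicitly, a minimal sketch of how the parameter
is usually added (assuming a node that boots via GRUB; systems booted via
systemd-boot keep their command line in /etc/kernel/cmdline and need
'proxmox-boot-tool refresh' instead of update-grub):

    # /etc/default/grub -- append the option to the existing default line
    GRUB_CMDLINE_LINUX_DEFAULT="quiet intel_iommu=off"

    # then regenerate the boot configuration and reboot
    update-grub

The same mechanism applies to the iommu=pt workaround mentioned earlier in
the thread.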
kind regards From gaio at lilliput.linux.it Fri May 16 15:43:38 2025 From: gaio at lilliput.linux.it (Marco Gaiarin) Date: Fri, 16 May 2025 15:43:38 +0200 Subject: [PVE-User] Dell PowerEdge T140, megaraid, BIOS Boot... In-Reply-To: ; from SmartGate on Sat, May 17, 2025 at 09:06:01AM +0200 References: Message-ID: <81sjfl-sa02.ln1@leia.lilliput.linux.it> Mandi! Arjen via pve-user In chel di` si favelave... > Maybe those three have VT-d (or IOMMU) enabled in the motherboard BIOS? intel_iommu=on became the default in kernel version 6.8. > Unless VT-d is enabled in the motherboard BIOS settings, intel_iommu=off does nothing (as it is disabled in the motherboard BIOS). I've downloaded BIOS settings in XML format from iDRAC, and: gaio at leia:~$ diff -ud Scaricati/sdpve2-bios.xml.xml Scaricati/tvpve2-bios.xml --- Scaricati/sdpve2-bios.xml.xml 2025-05-16 15:33:39.309180290 +0200 +++ Scaricati/tvpve2-bios.xml 2025-05-16 15:40:10.974747104 +0200 @@ -1,4 +1,4 @@ - + exactly the same... -- From randy at psg.com Tue May 20 03:24:52 2025 From: randy at psg.com (Randy Bush) Date: Mon, 19 May 2025 18:24:52 -0700 Subject: [PVE-User] zfs raidz2 expansion Message-ID: running 8.4.1 with images on 4x2tb raidz2 ssds (zfs-2.2.7-pve2). we want to double the space. the net of a million lies says all sorts of things about expanding raidz2. as we do full replication, we could just create a new pool for replocation and call it a day. but i am curious if we can just expand the current pool. my reading says doubtful, but am not deeply clued. randy --- https://forum.proxmox.com/threads/raidz-expansion.135413/ https://github.com/openzfs/zfs/pull/15022 From a.lauterer at proxmox.com Tue May 20 09:05:53 2025 From: a.lauterer at proxmox.com (Aaron Lauterer) Date: Tue, 20 May 2025 09:05:53 +0200 Subject: [PVE-User] zfs raidz2 expansion In-Reply-To: References: Message-ID: <8fa284b2-11b6-4136-82df-cc81a1976213@proxmox.com> You can always add a new VDEV to the pool (man zpool-add). Expanding a raidz2 VDEV itself is only possible with ZFS 2.3 AFAIU and we do not ship that yet [0]. But if this is for VM storage, you might want to consider a ZFS pool of multiple mirrored VDEVs. [1] has more details. While performance might not be much of an issue anymore with recent NVMEs that provide a ton of IOPS, the additional space usage might still be something to think about in a raidz pool used for VMs (ZVOL). [0] https://forum.proxmox.com/threads/proxmox-ve-8-4-released.164821/#post-762367 [1] https://pve.proxmox.com/pve-docs/pve-admin-guide.html#sysadmin_zfs_raid_considerations On 2025-05-20 03:24, Randy Bush wrote: > running 8.4.1 with images on 4x2tb raidz2 ssds (zfs-2.2.7-pve2). we > want to double the space. the net of a million lies says all sorts of > things about expanding raidz2. as we do full replication, we could just > create a new pool for replocation and call it a day. but i am curious > if we can just expand the current pool. my reading says doubtful, but > am not deeply clued. 
> > randy > > --- > > https://forum.proxmox.com/threads/raidz-expansion.135413/ > https://github.com/openzfs/zfs/pull/15022 > > _______________________________________________ > pve-user mailing list > pve-user at lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user > > From randy at psg.com Tue May 20 19:21:45 2025 From: randy at psg.com (Randy Bush) Date: Tue, 20 May 2025 10:21:45 -0700 Subject: [PVE-User] zfs raidz2 expansion In-Reply-To: <8fa284b2-11b6-4136-82df-cc81a1976213@proxmox.com> References: <8fa284b2-11b6-4136-82df-cc81a1976213@proxmox.com> Message-ID: ok, makes sense. thanks. as this is production, we probably would not want to be early adopters of zfs 2.3 anyway. so, can i specify a second pool to be replication target, as i suggested in >> as we do full replication, we could just create a new pool for >> replocation and call it a day. i do not see an attribute for relocation pool in storage.cfg randy From gaio at lilliput.linux.it Wed May 21 14:28:31 2025 From: gaio at lilliput.linux.it (Marco Gaiarin) Date: Wed, 21 May 2025 14:28:31 +0200 Subject: [PVE-User] Interface not renamed... Message-ID: I've upgraded a server from PVE7 to PV8, and (as expected) some interfaces get renamed. Really, i've upgraded TWO identical server (Dell T440), and one went flawlessy, the other... and interface get NOT renamed, remain 'eth1': root at svpve2:~# ip link show eth1 3: eth1: mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000 link/ether 5c:6f:69:0f:99:79 brd ff:ff:ff:ff:ff:ff altname enp1s0f1 I've tried: root at svpve2:~# ip -force link set dev eth1 name ens1f1 RTNETLINK answers: File exists and: root at svpve2:~# ip link property add dev eth1 altname ens1f1 RTNETLINK answers: File exists but nothing work; there's something i can do whithout rebooting the server? Thanks. PS: port is on a dual-port pcie card, the other port on the same card works as expected: root at svpve2:~# ip link show ens1f0 2: ens1f0: mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000 link/ether f4:ee:08:24:3a:ff brd ff:ff:ff:ff:ff:ff permaddr 5c:6f:69:0f:99:78 altname enp1s0f0 -- From a.lauterer at proxmox.com Thu May 22 15:23:25 2025 From: a.lauterer at proxmox.com (Aaron Lauterer) Date: Thu, 22 May 2025 15:23:25 +0200 Subject: [PVE-User] zfs raidz2 expansion In-Reply-To: References: <8fa284b2-11b6-4136-82df-cc81a1976213@proxmox.com> Message-ID: <9aed076a-0e2e-46b9-9ee3-2e6d9abdf651@proxmox.com> On 2025-05-20 19:21, Randy Bush wrote: > ok, makes sense. thanks. as this is production, we probably would not > want to be early adopters of zfs 2.3 anyway. > > so, can i specify a second pool to be replication target, as i suggested > in > >>> as we do full replication, we could just create a new pool for >>> replocation and call it a day. > > i do not see an attribute for relocation pool in storage.cfg Where does the quote here come from? Did I miss an email in this thread? > > randy > From randy at psg.com Thu May 22 23:03:27 2025 From: randy at psg.com (Randy Bush) Date: Thu, 22 May 2025 14:03:27 -0700 Subject: [PVE-User] zfs raidz2 expansion In-Reply-To: <9aed076a-0e2e-46b9-9ee3-2e6d9abdf651@proxmox.com> References: <8fa284b2-11b6-4136-82df-cc81a1976213@proxmox.com> <9aed076a-0e2e-46b9-9ee3-2e6d9abdf651@proxmox.com> Message-ID: >> so, can i specify a second pool to be replication target, as i >> suggested in >>>> as we do full replication, we could just create a new pool for >>>> replocation and call it a day. 
>> >> i do not see an attribute for relocation pool in storage.cfg > Where does the quote here come from? Did I miss an email in this thread? all the above are from my keyboard on this thread randy From a.lauterer at proxmox.com Fri May 23 10:50:24 2025 From: a.lauterer at proxmox.com (Aaron Lauterer) Date: Fri, 23 May 2025 10:50:24 +0200 Subject: [PVE-User] zfs raidz2 expansion In-Reply-To: References: <8fa284b2-11b6-4136-82df-cc81a1976213@proxmox.com> <9aed076a-0e2e-46b9-9ee3-2e6d9abdf651@proxmox.com> Message-ID: On 2025-05-22 23:03, Randy Bush wrote: >>> so, can i specify a second pool to be replication target, as i >>> suggested in >>>>> as we do full replication, we could just create a new pool for >>>>> replocation and call it a day. >>> >>> i do not see an attribute for relocation pool in storage.cfg > >> Where does the quote here come from? Did I miss an email in this thread? > > all the above are from my keyboard on this thread ah sorry. I was too blind yesterday to see that in your first email. There is no option to replicate a full ZFS pool to another. So, you have a current pool with one raidz2 VDEV made up of 4x 2TB disks. What kind of pool layout/sitation do you want to have in the end? Because if you have another set of 4x 2TB disks, you can just expand the pool with another raidz2 VDEV, without expanding the current VDEV you have. Right now, `zpool status` will show you something like: pool raidz2-0 disk0 disk1 disk2 disk3 If you add another VDEV, the pool could have the following layout: pool raidz2-0 disk0 disk1 disk2 disk3 raidz2-1 disk4 disk5 disk6 disk7 If you want to create a new pool, then things will be a bit more complicated, as you would need to create a new storage config for it as well, Move-Disk all the disks over to it. If you have a cluster and use the VM replication feature, that new pool must be present on the other nodes as well and you will have to remove the replication jobs before you move the disks to the new pool and then re-create them once all VM disks are on the new pool. From falko.trojahn at gmail.com Fri May 23 12:16:09 2025 From: falko.trojahn at gmail.com (Falko Trojahn) Date: Fri, 23 May 2025 12:16:09 +0200 Subject: [PVE-User] Interface not renamed... In-Reply-To: References: Message-ID: Marco Gaiarin schrieb am 21.05.25 um 14:28: > > I've upgraded a server from PVE7 to PV8, and (as expected) some interfaces > get renamed. > > Really, i've upgraded TWO identical server (Dell T440), and one went > flawlessy, the other... and interface get NOT renamed, remain 'eth1': Hi Marco, you surely don't have any udev rules for this interface e.g. left over from former system? Cheers Falko From devzero at web.de Fri May 23 12:23:40 2025 From: devzero at web.de (Roland) Date: Fri, 23 May 2025 12:23:40 +0200 Subject: [PVE-User] Interface not renamed... In-Reply-To: References: Message-ID: hello, iirc, that can be a matter of the bios version installed. https://linux.dell.com/files/whitepapers/consistent_network_device_naming_in_linux.pdf roland Am 23.05.25 um 12:16 schrieb Falko Trojahn: > Marco Gaiarin schrieb am 21.05.25 um 14:28: > > > > I've upgraded a server from PVE7 to PV8, and (as expected) some > interfaces > > get renamed. > > > > Really, i've upgraded TWO identical server (Dell T440), and one went > > flawlessy, the other... and interface get NOT renamed, remain 'eth1': > Hi Marco, > > you surely don't have any udev rules for this interface > e.g. left over from former system? 
> > Cheers > Falko > > _______________________________________________ > pve-user mailing list > pve-user at lists.proxmox.com > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user > From gaio at lilliput.linux.it Fri May 23 16:41:25 2025 From: gaio at lilliput.linux.it (Marco Gaiarin) Date: Fri, 23 May 2025 16:41:25 +0200 Subject: [PVE-User] Interface not renamed... In-Reply-To: Message-ID: Mandi! Falko Trojahn In chel di` si favelave... > you surely don't have any udev rules for this interface > e.g. left over from former system? No, i've not altered in any way udev rules... Mandi! Roland In chel di` si favelave... > iirc, that can be a matter of the bios version installed. > https://linux.dell.com/files/whitepapers/consistent_network_device_naming_in_linux.pdf We have upgarded 8 identical servers, upgrading to identical bios version; this is the only one that have this trouble. Anyway, i've found some more info: root at svpve2:~# ip link show | egrep '[0-9]+: ' 1: lo: mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT group default qlen 1000 2: ens1f0: mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000 3: eth1: mtu 1500 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000 4: ens5f0np0: mtu 9000 qdisc mq master bond2 state UP mode DEFAULT group default qlen 1000 5: eno1: mtu 1500 qdisc mq master bond0 state UP mode DEFAULT group default qlen 1000 6: eno2: mtu 1500 qdisc mq master bond1 state UP mode DEFAULT group default qlen 1000 7: ens5f1np1: mtu 9000 qdisc mq master bond2 state UP mode DEFAULT group default qlen 1000 8: idrac: mtu 1500 qdisc pfifo_fast state UNKNOWN mode DEFAULT group default qlen 1000 9: bond2: mtu 9000 qdisc noqueue state UP mode DEFAULT group default qlen 1000 10: bond0: mtu 1500 qdisc noqueue master vmbr0 state UP mode DEFAULT group default qlen 1000 11: vmbr0: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 12: bond1: mtu 1500 qdisc noqueue master vmbr1 state UP mode DEFAULT group default qlen 1000 13: vmbr1: mtu 1500 qdisc noqueue state UP mode DEFAULT group default qlen 1000 [...] interface 'ens1f1' does not desappear, simply get not 'renamed' on boot, and remain 'eth1'. I've tried a manual: ip link set eth1 master bond1 and interface start working as expected. If logs can be useful, if it is better to fire up a bug, say me. Just i'm here, i've found also: root at svpve2:~# ip addr show ens1f0 2: ens1f0: mtu 1500 qdisc mq master bond0 state UP group default qlen 1000 link/ether f4:ee:08:24:3a:ff brd ff:ff:ff:ff:ff:ff permaddr 5c:6f:69:0f:99:78 altname enp1s0f0 and 'enp1s0f0' in 'altname' was the old interface name, that was in /etc/network/interfaces; but at boot bond does not bind it, i was forced to use the 'new' name. 'altname' work only insome aspect? Or simply, does not work for bonds? Thanks. -- From randy at psg.com Fri May 23 20:17:38 2025 From: randy at psg.com (Randy Bush) Date: Fri, 23 May 2025 11:17:38 -0700 Subject: [PVE-User] zfs raidz2 expansion In-Reply-To: References: <8fa284b2-11b6-4136-82df-cc81a1976213@proxmox.com> <9aed076a-0e2e-46b9-9ee3-2e6d9abdf651@proxmox.com> Message-ID: > There is no option to replicate a full ZFS pool to another. not exactly what i want to do. my bad in saying "full replication." what i meant was all vms are replicated. on other nodes. i was thinking that each node could have one pool for primary vm images and a second to receive replication from other nodes. > So, you have a current pool with one raidz2 VDEV made up of 4x 2TB > disks. 
yup > Because if you have another set of 4x 2TB disks, you can just expand > the pool with another raidz2 VDEV, without expanding the current VDEV > you have. yup. what are the performance implications? > If you add another VDEV, the pool could have the following layout: > > pool > raidz2-0 > disk0 > disk1 > disk2 > disk3 > raidz2-1 > disk4 > disk5 > disk6 > disk7 yup > If you want to create a new pool, then things will be a bit more > complicated, as you would need to create a new storage config for it > as well, Move-Disk all the disks over to it. If you have a cluster and > use the VM replication feature, that new pool must be present on the > other nodes as well and you will have to remove the replication jobs > before you move the disks to the new pool and then re-create them once > all VM disks are on the new pool. we would keep the nodes all symmetric, so that would not be an issue. and it's just a few hours of ops pain to de-repl and re-repl. but what i do not see is how to tell `/etc/pve/storage.cfg` that pool0 is for images and pool1 is for incoming replication. maybe i am just trying to do something too weird. randy From randy at psg.com Mon May 26 05:40:39 2025 From: randy at psg.com (Randy Bush) Date: Sun, 25 May 2025 20:40:39 -0700 Subject: [PVE-User] replication failures Message-ID: three node debian-12 8.4.1 zfs raidz2 ssd cluster, maybe 20vms, all vms replicate /15 to the next node to the right. on one and only of a couple of similar clusters, and on only one particular node, we're getting replication failuers of the nature of 2025-05-26T00:16:17.643854+00:00 vm21 pvescheduler[2641364]: command 'zfs destroy images/vm-107-disk-0 at __replicate_107-0_1748217943__' failed: got timeout 2025-05-26T00:16:37.218095+00:00 vm21 pvescheduler[2641364]: 107-0: got unexpected replication job error - command 'zfs snapshot images/vm-107-disk-0 at __replicate_107-0_1748218563__' failed: got timeout five to 15 times a day. zfs load? flaky disk (smartmon reports nothing)? weak ether? moon in klutz? how do folk diagnose? randy From a.lauterer at proxmox.com Mon May 26 09:38:30 2025 From: a.lauterer at proxmox.com (Aaron Lauterer) Date: Mon, 26 May 2025 09:38:30 +0200 Subject: [PVE-User] zfs raidz2 expansion In-Reply-To: References: <8fa284b2-11b6-4136-82df-cc81a1976213@proxmox.com> <9aed076a-0e2e-46b9-9ee3-2e6d9abdf651@proxmox.com> Message-ID: <8dab9bf4-0fcf-46c7-9dce-0760735af44e@proxmox.com> On 2025-05-23 20:17, Randy Bush wrote: >> There is no option to replicate a full ZFS pool to another. > > not exactly what i want to do. my bad in saying "full replication." > what i meant was all vms are replicated. on other nodes. > > i was thinking that each node could have one pool for primary vm images > and a second to receive replication from other nodes. Ah okay. No, the way it works is that you have ZFS pools in both nodes with the same name. Then when the replication is configured for a guest, its disk images are replicated to the other node to the pool with the same name. > >> So, you have a current pool with one raidz2 VDEV made up of 4x 2TB >> disks. > > yup > > >> Because if you have another set of 4x 2TB disks, you can just expand >> the pool with another raidz2 VDEV, without expanding the current VDEV >> you have. > > yup. what are the performance implications? As usually with ZFS, the change will only affect newly written data. It will be spread over both VDEVs, with likely a bias to the newer, much emptier VDEV. 
So if you are already happy with the current performance, you should see
similar or better performance, depending on whether one or both VDEVs are
used to read/write data.

>
>> If you add another VDEV, the pool could have the following layout:
>>
>> pool
>>   raidz2-0
>>     disk0
>>     disk1
>>     disk2
>>     disk3
>>   raidz2-1
>>     disk4
>>     disk5
>>     disk6
>>     disk7
>
> yup
>
>> If you want to create a new pool, then things will be a bit more
>> complicated, as you would need to create a new storage config for it
>> as well, Move-Disk all the disks over to it. If you have a cluster and
>> use the VM replication feature, that new pool must be present on the
>> other nodes as well and you will have to remove the replication jobs
>> before you move the disks to the new pool and then re-create them once
>> all VM disks are on the new pool.
>
> we would keep the nodes all symmetric, so that would not be an issue.
> and it's just a few hours of ops pain to de-repl and re-repl. but what
> i do not see is how to tell `/etc/pve/storage.cfg` that pool0 is for
> images and pool1 is for incoming replication. maybe i am just trying to
> do something too weird.

Yeah, see my reply at the beginning. I think you have a more complicated
view of the replication than it actually is.

If you are okay with the current performance, I would just add the second
VDEV to the pool with
`zpool add {pool} raidz2 /dev/disk/by-id/nvme-? /dev/disk/by-id/nvme-?`

Before you do it on a production system, you can test the procedure in a
(virtual) test machine to make sure you get the CLI command correct.

By extending the pool, you don't need to change anything in the storage
config or replication settings.

>
> randy
>

From alexandre.derumier at groupe-cyllene.com Mon May 26 17:11:00 2025
From: alexandre.derumier at groupe-cyllene.com (DERUMIER, Alexandre)
Date: Mon, 26 May 2025 15:11:00 +0000
Subject: [PVE-User] replication failures
In-Reply-To: 
References: 
Message-ID: <76289cc38898b0b5c2fb04784b5d4b139de89b8f.camel@groupe-cyllene.com>

How much time does it take if you do the delete command manually?
(zfs destroy images/vm-107-disk-0 at __replicate_107-0_1748217943__)

(maybe the timeout in the code is too short?)

-------- Original message --------
From: Randy Bush
Reply-To: Proxmox VE user list
To: ProxMox Users
Subject: [PVE-User] replication failures
Date: 26/05/2025 05:40:39

three node debian-12 8.4.1 zfs raidz2 ssd cluster, maybe 20vms, all vms
replicate /15 to the next node to the right.  on one and only of a
couple of similar clusters, and on only one particular node, we're
getting replication failuers of the nature of

    2025-05-26T00:16:17.643854+00:00 vm21 pvescheduler[2641364]:
command 'zfs destroy images/vm-107-disk-0 at __replicate_107-
0_1748217943__' failed: got timeout
    2025-05-26T00:16:37.218095+00:00 vm21 pvescheduler[2641364]: 107-0:
got unexpected replication job error - command 'zfs snapshot images/vm-
107-disk-0 at __replicate_107-0_1748218563__' failed: got timeout

five to 15 times a day.  zfs load?  flaky disk (smartmon reports
nothing)?  weak ether?  moon in klutz?

how do folk diagnose?
randy
_______________________________________________
pve-user mailing list
pve-user at lists.proxmox.com
https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

From randy at psg.com Mon May 26 22:02:26 2025
From: randy at psg.com (Randy Bush)
Date: Mon, 26 May 2025 13:02:26 -0700
Subject: [PVE-User] replication failures
In-Reply-To: <76289cc38898b0b5c2fb04784b5d4b139de89b8f.camel@groupe-cyllene.com>
References: <76289cc38898b0b5c2fb04784b5d4b139de89b8f.camel@groupe-cyllene.com>
Message-ID: 

> How much time does it take if you do the delete command manually?
> (zfs destroy images/vm-107-disk-0 at __replicate_107-0_1748217943__)

picked the latest

    # zfs destroy images/vm-107-disk-0 at __replicate_107-0_1748284321__
    could not find any snapshots to destroy; check snapshot names.

> (maybe the timeout in the code is too short?)

as it seems to have actually happened, perhaps this is the case.  though
only this node on this cluster.  hmmmm.

randy

From f.ebner at proxmox.com Tue May 27 10:12:56 2025
From: f.ebner at proxmox.com (Fiona Ebner)
Date: Tue, 27 May 2025 10:12:56 +0200
Subject: [PVE-User] Interface not renamed...
In-Reply-To: 
References: 
Message-ID: <9af42a93-1eaa-4b19-b3de-db371fba6e50@proxmox.com>

Hi,

see also:
https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#_naming_conventions

The recommended way to avoid such renaming issues is to pin the name
based on MAC address:
https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#network_override_device_names

Best Regards,
Fiona

From alexandre.derumier at groupe-cyllene.com Tue May 27 11:07:16 2025
From: alexandre.derumier at groupe-cyllene.com (DERUMIER, Alexandre)
Date: Tue, 27 May 2025 09:07:16 +0000
Subject: [PVE-User] replication failures
In-Reply-To: 
References: <76289cc38898b0b5c2fb04784b5d4b139de89b8f.camel@groupe-cyllene.com>
Message-ID: <81aa195662f55ebab074c2873f57ec0ebd9c7405.camel@groupe-cyllene.com>

>> picked the latest
>>
>>     # zfs destroy images/vm-107-disk-0 at __replicate_107-0_1748284321__
>>     could not find any snapshots to destroy; check snapshot names.

> (maybe the timeout in the code is too short?)

>> as it seems to have actually happened, perhaps this is the case.

yes, it could be the delete task taking too much time (and correctly
finishing in the background after the timeout on the PVE side).

I really don't know how long the timeout in the code is.

From gaio at lilliput.linux.it Tue May 27 22:46:20 2025
From: gaio at lilliput.linux.it (Marco Gaiarin)
Date: Tue, 27 May 2025 22:46:20 +0200
Subject: [PVE-User] Interface not renamed...
In-Reply-To: <9af42a93-1eaa-4b19-b3de-db371fba6e50@proxmox.com>
References: <9af42a93-1eaa-4b19-b3de-db371fba6e50@proxmox.com>
Message-ID: 

Mandi! Fiona Ebner
  In chel di` si favelave...

> see also:
> https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#_naming_conventions
>
> The recommended way to avoid such renaming issues is to pin the name
> based on MAC address:
> https://pve.proxmox.com/pve-docs/chapter-sysadmin.html#network_override_device_names

Thanks, Fiona; but it is not the renaming per se that scares me: it is
written in the documentation, so I was prepared for it.

But I have probably found the culprit.
The systems were installed with some PVE 6.x version, and have a direct
bond between the two nodes to manage replication and migration; in PVE 6 we
had some trouble setting up jumbo frames for the bond, because sometimes
the server booted with MTU=9000 on the bond but not on the member link.
We solved it by adding 'allow-bond2' to the member stanzas in
/etc/network/interfaces.

It seems to me that:

1) PVE 7 effectively renames interfaces; as just stated, this is expected,
no trouble.

2) PVE 8 adds 'ifupdown2', which is incompatible with the 'allow-bond2'
option AND in some way mangles the whole ifupdown -> ifupdown2 migration:
the server boots with NO interface up, and some other things also seem not
to work very well.

Because I still have some servers to migrate, I removed the 'allow-bond2'
options BEFORE initiating the PVE 7 -> 8 upgrade, and the migration went
smoothly; precisely:

- interfaces start at the next boot

- all interfaces come up, even though they get renamed, because 'altname'
gets used correctly.

I hope I was clear. And useful. ;-)

--

From dorsyka at yahoo.com Thu May 29 08:09:10 2025
From: dorsyka at yahoo.com (dorsy)
Date: Thu, 29 May 2025 08:09:10 +0200
Subject: [PVE-User] replication failures
In-Reply-To: <81aa195662f55ebab074c2873f57ec0ebd9c7405.camel@groupe-cyllene.com>
References: <76289cc38898b0b5c2fb04784b5d4b139de89b8f.camel@groupe-cyllene.com>
 <81aa195662f55ebab074c2873f57ec0ebd9c7405.camel@groupe-cyllene.com>
Message-ID: <76c48531-292d-458d-b756-e82a36a94339@yahoo.com>

A 'zpool history' could be a source of info about what actually happened
regarding the snapshots, and when (a rough example is sketched below).

On 5/27/2025 11:07 AM, DERUMIER, Alexandre wrote:
>>> picked the latest
>>>
>>>     # zfs destroy images/vm-107-disk-0 at __replicate_107-0_1748284321__
>>>     could not find any snapshots to destroy; check snapshot names.
>> (maybe the timeout in the code is too short?)
>>> as it seems to have actually happened, perhaps this is the case.
> yes, it could be the delete task taking too much time (and correctly
> finishing in the background after the timeout on the PVE side).
>
> I really don't know how long the timeout in the code is.
> _______________________________________________
> pve-user mailing list
> pve-user at lists.proxmox.com
> https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-user

--
Best regards,

Dorotovics László
system administrator

IKRON Fejlesztő és Szolgáltató Kft.
Registered office: 6721 Szeged, Szilágyi utca 5-1.
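For anyone who wants to follow that suggestion, a minimal sketch of what
such a check could look like (the pool name 'images' is taken from the log
lines earlier in the thread, and the grep pattern is only an illustration):

    # command history of the pool in long format (timestamps, user, host)
    zpool history -l images

    # also include internally logged events, filtered for the
    # replication snapshots of VM 107
    zpool history -il images | grep replicate_107

Comparing those timestamps with the pvescheduler timeout messages in the
syslog should show whether the snapshot/destroy actually completed, and how
long after it was issued.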