[pve-devel] [PATCH pve-storage 1/1] storage/plugin: implement ref-counting for disknames in get_next_vm_diskname
Hannes Duerr
h.duerr at proxmox.com
Wed Apr 3 17:07:12 CEST 2024
As Fabian has already mentioned here[0], there can be a race between two
parallel imports. More specifically, if both imports have --allow-rename
set and the desired name already exists, then it can happen that both
imports get the same name. The reason for this is that we currently only
check which names have already been assigned in the vm config and then
use the next free one, and do not check whether a running process has
already been assigned the same name. However, while writing and testing
the patch, I found that this is often not a problem, as
- in most cases the source VM config is locked and therefore only one
process can generate a new name (migrate, clone, move disk, update
config)
- a name must be specified and therefore no new name is generated (pvesm
alloc)
- the timeframe for the race is very short.
At the same time, it is possible that I have not considered all edge
cases and that there are other cases where the race can occur,
especially with regard to remote_migrations. You can provoke the problem
with two parallel imports on a host where local-lvm:vm-100-disk-0
already exists:
pvesm import local-lvm:vm-100-disk-0 raw+size <exported_volume> --allow-rename 1
Now that I've looked into the problem a bit, I'm not sure this patch is
even necessary as it adds more complexity. So I wanted to ask for your
opinion, wether you think it makes sense to add this change or not.
The patch introduces a tmp file which stores the newly assigned disk
names and the pid of the process which requested the disk name. If a
second process is assigned the same name, it will see from the file that
the name has already been assigned to another process, and will take the
next available name. Reading and writing to the tmp file requires a lock
to prevent races.
[0] https://lists.proxmox.com/pipermail/pve-devel/2024-January/061526.html
Signed-off-by: Hannes Duerr <h.duerr at proxmox.com>
---
src/PVE/Storage/Plugin.pm | 99 ++++++++++++++++++++++++++++++++++-----
1 file changed, 86 insertions(+), 13 deletions(-)
diff --git a/src/PVE/Storage/Plugin.pm b/src/PVE/Storage/Plugin.pm
index 7456c8e..f76550a 100644
--- a/src/PVE/Storage/Plugin.pm
+++ b/src/PVE/Storage/Plugin.pm
@@ -10,8 +10,10 @@ use File::chdir;
use File::Path;
use File::Basename;
use File::stat qw();
+use File::Copy;
use PVE::Tools qw(run_command);
+use PVE::ProcFSTools;
use PVE::JSONSchema qw(get_standard_option register_standard_option);
use PVE::Cluster qw(cfs_register_file);
@@ -779,23 +781,94 @@ my $get_vm_disk_number = sub {
sub get_next_vm_diskname {
my ($disk_list, $storeid, $vmid, $fmt, $scfg, $add_fmt_suffix) = @_;
- $fmt //= '';
- my $prefix = ($fmt eq 'subvol') ? 'subvol' : 'vm';
- my $suffix = $add_fmt_suffix ? ".$fmt" : '';
+ my $code = sub {
+ my $reserved_names_file = "/var/tmp/pve-reserved-volnames";
+ my $tmp_file = "/var/tmp/pve-reserved-volnames.tmp";
- my $disk_ids = {};
- foreach my $disk (@$disk_list) {
- my $disknum = $get_vm_disk_number->($disk, $scfg, $vmid, $suffix);
- $disk_ids->{$disknum} = 1 if defined($disknum);
- }
+ $fmt //= '';
+ my $prefix = ($fmt eq 'subvol') ? 'subvol' : 'vm';
+ my $suffix = $add_fmt_suffix ? ".$fmt" : '';
+ my $disk_id;
+ my $disk_ids = {};
- for (my $i = 0; $i < $MAX_VOLUMES_PER_GUEST; $i++) {
- if (!$disk_ids->{$i}) {
- return "$prefix-$vmid-disk-$i$suffix";
+ foreach my $disk (@$disk_list) {
+ my $disknum = $get_vm_disk_number->($disk, $scfg, $vmid, $suffix);
+ $disk_ids->{$disknum} = 1 if defined($disknum);
}
- }
- die "unable to allocate an image name for VM $vmid in storage '$storeid'\n"
+ for (my $i = 0; $i < $MAX_VOLUMES_PER_GUEST; $i++) {
+ if (!$disk_ids->{$i}) {
+ $disk_id = $i;
+ last;
+ }
+ }
+
+ if (! -e $reserved_names_file) {
+ my $create_h = IO::File->new($reserved_names_file, "w") ||
+ die "can't open or create'$reserved_names_file' - $!\n";
+ print $create_h "$storeid $vmid $disk_id $$";
+ $create_h->close;
+
+ return "$prefix-$vmid-disk-$disk_id$suffix";
+ } else {
+ my $collision;
+ my $pid;
+
+ my $in_h = IO::File->new($reserved_names_file, "r") ||
+ die "can't open or create'$reserved_names_file' - $!\n";
+ my $out_h = IO::File->new($tmp_file, "w") ||
+ die "can't open or create'$tmp_file' - $!\n";
+
+ # remove entries when the process does not exist anymore
+ while (my $line = <$in_h>) {
+ ($pid) = $line =~ /$PVE::JSONSchema::PVE_ID \d+ \d+ (\d+)/;
+ next if (!defined $pid || !defined PVE::ProcFSTools::check_process_running($pid));
+ print $out_h $line;
+ }
+
+ $in_h->close;
+ $out_h->close;
+ move($tmp_file, $reserved_names_file);
+
+ for (my $y = $disk_id; $y < $disk_id + 5; $y++) {
+ undef $collision;
+
+ my $input_h = IO::File->new($reserved_names_file, "r") ||
+ die "can't open or create'$reserved_names_file' - $!\n";
+
+ while (my $line = <$input_h>) {
+ if ($disk_ids->{$y} || $line =~ /$storeid $vmid $y \d+/) {
+ $collision=1;
+ last;
+ }
+ }
+
+ # Found new free file name so we add it to the reserved file
+ if (!defined $collision) {
+ my $append_h = IO::File->new($reserved_names_file, "a") ||
+ die "can't open or create'$reserved_names_file' - $!\n";
+
+ print $append_h "$storeid $vmid $disk_id $$";
+ $append_h->close;
+
+ return "$prefix-$vmid-disk-$disk_id$suffix";
+ }
+
+ $input_h->close;
+ }
+
+ return;
+ }
+
+ };
+
+ my $new_disk_name = PVE::Tools::lock_file('/var/lock/pve-disk-names.lck', 10, $code);
+ die $@ if $@;
+
+ if (!$new_disk_name) {
+ die "unable to allocate an image name for VM $vmid in storage '$storeid'\n";
+ }
+ return $new_disk_name;
}
sub find_free_diskname {
--
2.39.2
More information about the pve-devel
mailing list