[pmg-devel] [PATCH pmg-api] fix #5189: cluster: do not synchronize statistics that already exist

Friedrich Weber f.weber at proxmox.com
Wed Jan 24 14:56:26 CET 2024


After restoring a backup from a cluster on a fresh node with
statistics, and then creating a cluster, the following can happen
(node 1 being master and node 2 being a node): `ClusterInfo` on node
1 has no record about the last-synchronized `CStatistic` row id of
node 2. Thus, pmgmirror on node 1 initializes the record with -1 and
tries to synchronize *all* `CStatistic` rows with cid 2 from node 2.
But (some of) these rows may already exist on cid 1, because they
were part of the backup, so pmgmirror on node 1 triggers a Postgres
unique constraint violation, statistics synchronization on node 1
fails, and node 1 remains in the "synchronizing" state.

Fix this as follows: If `ClusterInfo` does not have a record on the
last-synchronized `CStatistic` row id of some node cid, initialize it
with the maximum row id that exists in the local `CStatistic` for that
node cid, or with -1 if the local `CStatistic` has no row for that
node cid. For this, use a SQL query that is already used in
DBTools::update_client_clusterinfo. As a result, pmgmirror will only
synchronize rows with row ids larger than the maximum row id from the
backup, and avoid running into a unique constraint violation.

Signed-off-by: Friedrich Weber <f.weber at proxmox.com>
---

Notes:
    Tested that the steps from #5189 [1] do not produce a constraint
    violation anymore, and both cluster nodes eventually become "active".
    Also tested that this patch doesn't break anything in the "restore
    without statistics" scenario.
    
    However, not 100% sure I haven't overlooked any edge cases, so
    additional testing is very welcome!
    
    [1] https://bugzilla.proxmox.com/show_bug.cgi?id=5189

 src/PMG/Cluster.pm |  3 +--
 src/PMG/DBTools.pm | 15 +++++++++++++++
 2 files changed, 16 insertions(+), 2 deletions(-)

diff --git a/src/PMG/Cluster.pm b/src/PMG/Cluster.pm
index 015e66a..3bf4c41 100644
--- a/src/PMG/Cluster.pm
+++ b/src/PMG/Cluster.pm
@@ -698,8 +698,7 @@ sub sync_statistic_db {
 
     my $count;
 
-    PMG::DBTools::create_clusterinfo_default(
-	$ldb, $rcid, 'lastid_CStatistic', -1, undef);
+    PMG::DBTools::create_clusterinfo_statistic_default($ldb, $rcid);
 
     do { # get new values
 
diff --git a/src/PMG/DBTools.pm b/src/PMG/DBTools.pm
index 9e133bc..6e682f8 100644
--- a/src/PMG/DBTools.pm
+++ b/src/PMG/DBTools.pm
@@ -1139,6 +1139,21 @@ sub create_clusterinfo_default {
     $sth->finish();
 }
 
+sub create_clusterinfo_statistic_default {
+    my ($dbh, $rcid) = @_;
+
+    my $sth = $dbh->prepare("SELECT * FROM ClusterInfo WHERE " .
+	"CID = ? AND Name = 'lastid_CStatistic'");
+    $sth->execute($rcid);
+    if (!$sth->fetchrow_hashref()) {
+	# initialize to maximum CStatistic rid locally present for $rcid (if
+	# any), might already exist if the node was restored from backup
+	$dbh->do("INSERT INTO ClusterInfo (cid, name, ivalue) SELECT $rcid, 'lastid_CStatistic', " .
+		  "COALESCE (max (rid), -1) FROM CStatistic WHERE cid = $rcid");
+    }
+    $sth->finish();
+}
+
 sub read_int_clusterinfo {
     my ($dbh, $rcid, $name) = @_;
 
-- 
2.39.2





More information about the pmg-devel mailing list