[pve-devel] [RFC pve-cluster] pmxcfs: increase max filesize from 128k to 512k

Thomas Lamprecht t.lamprecht at proxmox.com
Mon Sep 12 17:50:54 CEST 2016


This fixes bug 1014 and also fixes a few other problems where users
ran into the file size limitation. I did not find the bug entries
for them, but they covered:
1) the HA manager could manage at most about 1500 services, as
   beyond that the manager_status file got too big

2) firewall rules may also reach this limit on a bigger setup

I tested this with concurrently started reads/writes of random data
files from and into RAM (tmpfs mounts). As long as we do not flush
often and read everything at once (i.e. write/read with a big block
size), the performance stays good.

The limiting factor in speed is not corosync's CPG but sqlite. This
can be seen when comparing worst case scenarios between local pmxcfs
and clustered pmxcfs instances, and with simple debug logging.

We already optimize our sqlite usage quite heavily. As far as I've
seen, relevant additional speed gains cannot be made without losing
reliability.

I only ran into problems when I read/wrote in small blocks with a
few hundred big writes started at once, e.g.:
for i in {1..100}
do
    dd if=/tmp/random512k.data of="/etc/pve/data$i" bs=1k &
done

In the worst case above, each block gets written as a single
transaction to the database, and each transaction has to be locked
and synced to disk for reliability.
Packing all changes (i.e. the whole file) into one DB transaction,
in contrast, does not produce much overhead for 512k files compared
to 128k files.
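
For comparison, a minimal sketch of the friendly counterpart to the
loop above: the same amount of data written with one big block per
file, plus the rough write-call arithmetic. TARGET_DIR and SRC are
hypothetical names. The default target is a scratch directory, not
/etc/pve. That one dd write call ends up as one DB transaction is an
assumption taken from the description above, not something this
script measures.

```shell
#!/bin/sh
# Hypothetical scratch locations; set TARGET_DIR=/etc/pve to try this
# against a real pmxcfs mount.
TARGET_DIR="${TARGET_DIR:-$(mktemp -d)}"
SRC="${SRC:-$(mktemp)}"

# Generate 512k of random test data unless SRC already has content
[ -s "$SRC" ] || dd if=/dev/urandom of="$SRC" bs=1k count=512 2>/dev/null

# Start 100 concurrent writers, each handing over its file as one block
i=1
while [ "$i" -le 100 ]; do
    dd if="$SRC" of="$TARGET_DIR/data$i" bs=512k 2>/dev/null &
    i=$((i + 1))
done
wait    # let all background writers finish

# Rough number of dd write calls for 100 files of 512k each
FILES=100
FILE_KIB=512
SMALL_BLOCK_WRITES=$((FILES * FILE_KIB / 1))    # bs=1k
BIG_BLOCK_WRITES=$((FILES * FILE_KIB / 512))    # bs=512k
echo "bs=1k:   $SMALL_BLOCK_WRITES write calls"
echo "bs=512k: $BIG_BLOCK_WRITES write calls"
```

Under that assumption the small-block case issues 51200 write calls
where the big-block case issues 100, which is where the difference in
locking and syncing comes from.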

As data written through the PVE framework is written and read this
way (whole files at once), we can increase the limit without seeing
much of a performance impact.

It should also be noted that just because files can now get bigger,
not many will. Rather, there may be just one to three files bigger
than 128k on some setups.
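
To check how many files on a given setup actually exceed the old
limit, something like the following sketch could be used.
list_big_files and PMXCFS_DIR are hypothetical names, not part of
any PVE tooling. The only assumption about pmxcfs is that it is
mounted at /etc/pve.

```shell
#!/bin/sh
# List regular files larger than 128 KiB (131072 bytes) below a directory.
# list_big_files is a hypothetical helper for illustration only.
list_big_files() {
    # -size +131072c matches files strictly larger than 131072 bytes
    find "$1" -type f -size +131072c
}

# Scan the pmxcfs mount point (override with PMXCFS_DIR) if it exists
PMXCFS_DIR="${PMXCFS_DIR:-/etc/pve}"
if [ -d "$PMXCFS_DIR" ]; then
    list_big_files "$PMXCFS_DIR"
fi
```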

Signed-off-by: Thomas Lamprecht <t.lamprecht at proxmox.com>
---

While I did test this quite a bit, some additional testing of worst
case scenarios would, as always, be nice.

 data/src/memdb.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/data/src/memdb.h b/data/src/memdb.h
index 4f4623b..89d8d32 100644
--- a/data/src/memdb.h
+++ b/data/src/memdb.h
@@ -28,7 +28,7 @@
 #include <glib.h>
 #include <sys/statvfs.h>
 
-#define MEMDB_MAX_FILE_SIZE (128*1024)
+#define MEMDB_MAX_FILE_SIZE (512*1024)
 #define MEMDB_MAX_FSSIZE (30*1024*1024)
 #define MEMDB_MAX_INODES 10000
 
-- 
2.1.4




