[pve-devel] Speed up PVE Backup

Wed Jul 20 12:37:43 CEST 2016

Hi again,

I've been looking through the backup/restore code a bit. I'm focused on 
speeding up restore on Ceph RBD right now.

Sorry if I've got something wrong; I have never developed for Proxmox/QEMU.

At line 563 of the file
https://git.proxmox.com/?p=pve-qemu-kvm.git;a=blob;f=debian/patches/pve/0011-introduce-new-vma-archive-format.patch;h=1c26209648c210f3b18576abc2c5a23768fd7c7b;hb=HEAD
I see the function restore_write_data. It calls full_write (for restoring 
directly to a file) and bdrv_write (which I suppose is QEMU's block 
device abstraction).

This is called from restore_extents, where a comment says precisely "try 
to write whole clusters to speedup restore". So we're writing 
64KB-8Byte chunks, which gives Ceph RBD a hard time because it means 
lots of small (~64KB) write operations.

So I suggest the following solution for your consideration:
- Create a write buffer on startup (let's assume 4MB, for example, a 
size Ceph RBD would like much more than 64KB). This could even be 
configurable, skipping the buffer altogether if buffer_size=cluster_size.
- Wrap the current "restore_write_data" with a 
"restore_write_data_with_buffer" that copies the data into the 4MB 
buffer and only calls "restore_write_data" when the buffer is full.
- Create a new "flush_restore_write_data_buffer" to flush the write 
buffer when reading of the device's restore data is complete.
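
To make the idea concrete, here is a minimal sketch of such a buffer in 
plain C. It is not the actual vma/QEMU code: the names WriteBuffer, 
wb_init, wb_write, wb_flush and wb_free are hypothetical, and the "sink" 
callback only stands in for restore_write_data, whose real signature I 
have not reproduced here. The sketch also assumes the buffer has to be 
flushed early whenever the next chunk is not contiguous with the 
buffered data (for example if some clusters are skipped), which I have 
not verified against restore_extents.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for restore_write_data(): writes len bytes at offset. */
typedef int (*write_fn)(void *opaque, uint64_t offset,
                        const uint8_t *buf, size_t len);

typedef struct {
    uint8_t *data;     /* staging area, e.g. 4MB */
    size_t capacity;   /* size of data in bytes */
    size_t used;       /* bytes currently buffered */
    uint64_t start;    /* device offset corresponding to data[0] */
    write_fn sink;     /* where full buffers are written */
    void *opaque;      /* passed through to sink */
} WriteBuffer;

int wb_init(WriteBuffer *wb, size_t capacity, write_fn sink, void *opaque)
{
    assert(capacity > 0);
    wb->data = malloc(capacity);
    if (!wb->data) {
        return -1;
    }
    wb->capacity = capacity;
    wb->used = 0;
    wb->start = 0;
    wb->sink = sink;
    wb->opaque = opaque;
    return 0;
}

/* Write out whatever is buffered as one large request. */
int wb_flush(WriteBuffer *wb)
{
    if (wb->used == 0) {
        return 0;
    }
    int ret = wb->sink(wb->opaque, wb->start, wb->data, wb->used);
    wb->used = 0;   /* buffer is dropped even if the sink reported an error */
    return ret;
}

/* Buffer one chunk; flush when full or when the chunk is not contiguous. */
int wb_write(WriteBuffer *wb, uint64_t offset, const uint8_t *buf, size_t len)
{
    int ret;

    if (wb->used > 0 && offset != wb->start + wb->used) {
        ret = wb_flush(wb);
        if (ret < 0) {
            return ret;
        }
    }
    while (len > 0) {
        if (wb->used == 0) {
            wb->start = offset;
        }
        size_t room = wb->capacity - wb->used;
        size_t n = len < room ? len : room;
        memcpy(wb->data + wb->used, buf, n);
        wb->used += n;
        buf += n;
        offset += n;
        len -= n;
        if (wb->used == wb->capacity) {
            ret = wb_flush(wb);
            if (ret < 0) {
                return ret;
            }
        }
    }
    return 0;
}

void wb_free(WriteBuffer *wb)
{
    free(wb->data);
    wb->data = NULL;
    wb->used = 0;
}

/* Demo sink (illustration only): counts requests instead of writing. */
typedef struct {
    unsigned calls;
    uint64_t bytes;
    uint64_t last_offset;
} SinkStats;

int counting_sink(void *opaque, uint64_t offset, const uint8_t *buf, size_t len)
{
    SinkStats *st = opaque;
    (void)buf;
    st->calls++;
    st->bytes += len;
    st->last_offset = offset;
    return 0;
}
```

With a 4MB buffer and contiguous 64KB chunks, 64 calls to wb_write turn 
into a single call to the sink. The caller still has to call wb_flush 
once at the end of each device, otherwise the last partial buffer would 
be lost.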

Do you think this is a good idea? If so I will find time to implement 
and test this to check whether restore time improves.

Thanks a lot
Eneko

On 20/07/16 at 08:24, Eneko Lacunza wrote:
> On 16/02/16 at 15:52, Stefan Priebe - Profihost AG wrote:
>> On 16.02.2016 at 15:50, Dmitry Petuhov wrote:
>>> 16.02.2016 13:20, Dietmar Maurer wrote:
>>>>> Storage Backend is ceph using 2x 10Gbit/s and i'm able to read 
>>>>> from it
>>>>> with 500-1500MB/s. See below for an example.
>>>> The backup process reads 64KB blocks, and it seems this slows down 
>>>> ceph.
>>>> This is a known behavior, but I found no solution to speed it up.
>>> I just wrote a script to speed up my backups from ceph. It simply does
>>> (actually a little more):
>>> rbd snap create $SNAP
>>> rbd export $SNAP $DUMPDIR/$POOL-$VOLUME-$DATE.raw
>>> rbd snap rm $SNAP
>>> for every image in selected pools.
>>>
>>> When exporting to file, it's faster than my temporary HDD can write
>>> (about 120MB/s). But exporting to STDOUT ('-' instead of filename, with
>>> compression or without it) noticeably decreases speed to qemu's levels
>>> (20-30MB/s). That's a little strange.
>>>
>>> This method is incompatible with PVE's backup-restore tools, but good
>>> enough for manual disaster recovery from CLI.
>> Right, that's working for me too, but only at night and not when a
>> single user wants a backup incl. config RIGHT now.
> Do we have any improvement related to this in the pipeline? Yesterday 
> our 9-OSD, 3-node cluster restored a backup at 6MB/s... it was very 
> boring, painful and expensive to wait for it :) (I decided to buy a 
> new server to replace our 7.5-year-old IBM while waiting ;) )
>
> Our backups are slow too, but we do those during weekend... but 
> usually we want to restore fast... :)
>
> Dietmar, I haven't looked at the backup/restore code, but do you 
> think we could do something to read/write to storage in larger chunks 
> than the current 64KB? I'm coming out of a period of high workload 
> and could maybe look at this issue this summer.
>
> Thanks
> Eneko
>

-- 
Zuzendari Teknikoa / Director Técnico
Binovo IT Human Project, S.L.
Telf. 943493611
       943324914
Astigarraga bidea 2, planta 6 dcha., ofi. 3-2; 20180 Oiartzun (Gipuzkoa)
www.binovo.es