[pve-devel] [RFC qemu/storage/qemu-server/container/manager 00/23] backup provider API

Jonathan Nicklin jnicklin at blockbridge.com
Sun Jul 28 15:54:48 CEST 2024


In hyper-converged deployments, the node performing the backup sources ((nodes-1)/nodes)*bytes of backup data from the other nodes (i.e., ingress traffic), assuming the data is spread evenly across the cluster, and then sends 1*bytes to PBS (i.e., egress traffic). If PBS were to pull the data from the nodes directly, the maximum load on any one host would be (1/nodes)*bytes of egress traffic only... that's a considerable improvement!

Further, nodes that don't host OSDs would be completely quiet. So, in the case of non-converged Ceph, the hypervisor nodes would not need to participate in the backup flow at all.
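
To make the arithmetic above concrete, here is a rough sketch. It assumes the data is spread evenly across the nodes and ignores replication and protocol overhead; the function names are purely illustrative and are not part of any PBS or Ceph API.

```python
def push_model_load(nodes: int, total_bytes: float) -> dict:
    """Traffic on the one node running the backup client (today's model).

    Assumes data is evenly distributed, so 1/nodes of it is already local.
    """
    ingress = (nodes - 1) / nodes * total_bytes  # fetched from the other nodes
    egress = total_bytes                         # everything is sent on to PBS
    return {"ingress": ingress, "egress": egress}


def pull_model_load(nodes: int, total_bytes: float) -> dict:
    """Worst-case traffic on any one storage node if PBS pulled directly."""
    return {"ingress": 0.0, "egress": total_bytes / nodes}


if __name__ == "__main__":
    # Hypothetical example: 4 nodes, 1000 bytes of backup data.
    print(push_model_load(4, 1000))
    print(pull_model_load(4, 1000))
```

With 4 nodes, the node running the backup handles 750 bytes in and 1000 bytes out in the push model, versus at most 250 bytes out per node in the pull model.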

> On Jul 28, 2024, at 2:46 AM, Dietmar Maurer <dietmar at proxmox.com> wrote:
> 
>> Today, I believe the client is reading the data and pushing it to
>> PBS. In the case of CEPH, wouldn't this involve sourcing data from
>> multiple nodes and then sending it to PBS? Wouldn't it be more
>> efficient for PBS to read it directly from storage? In the case of
>> centralized storage, we'd like to eliminate the client load
>> completely, having PBS ingest increment differences directly from
>> storage without passing through the client.
> 
> But Ceph is not a central storage. Instead, data is distributed among the nodes, so you always need to send some data over the network.
> There is no way to "read it directly from storage".
> 



