[pve-devel] opensource vm scheduler : btrplace

Thomas Lamprecht t.lamprecht at proxmox.com
Wed May 29 10:34:01 CEST 2019


On 5/29/19 10:00 AM, Alexandre DERUMIER wrote:
> and the algorithm computes the whole placement (which is very difficult to implement efficiently, as the number of combinations to compute can be really huge),
> and gives the whole migration order. (benchmarks show a few seconds to compute 10000 VMs on 10000 nodes)

I mean, it's effectively the "knapsack problem".
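
To make the packing nature of the problem concrete, here is a minimal
sketch of a greedy first-fit-decreasing heuristic. This is not what
btrplace does, only an illustration; the VM ids, node names and memory
figures are made up, and only memory is considered.

```python
def first_fit_decreasing(vm_mem, node_capacity):
    """Place each VM on the first node that still has enough free memory.

    vm_mem: hypothetical mapping of VM id -> memory demand (GiB)
    node_capacity: hypothetical mapping of node name -> memory capacity (GiB)
    Returns a mapping of VM id -> node name.
    Raises ValueError if some VM fits on no node.
    """
    free = dict(node_capacity)  # remaining capacity per node
    placement = {}
    # Biggest VMs first, so they are not blocked by many small ones.
    for vm, mem in sorted(vm_mem.items(), key=lambda kv: kv[1], reverse=True):
        for node in free:
            if free[node] >= mem:
                free[node] -= mem
                placement[vm] = node
                break
        else:
            raise ValueError(f"no node has room for VM {vm}")
    return placement


# Made-up example data: four VMs, two nodes with 10 GiB each.
vms = {100: 8, 101: 4, 102: 2, 103: 6}
nodes = {"node1": 10, "node2": 10}
placement = first_fit_decreasing(vms, nodes)
```

A heuristic like this is fast but gives no optimality guarantee and
knows nothing about migration order, which is exactly the hard part.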

> It also takes into account the estimated migration time (based on network bandwidth and also the number of dirty page changes in qemu),
> and does parallel migrations.


> There is a small interactive demo here:
> http://www.btrplace.org/play/
> 
> (source code of the demo frontend: https://github.com/btrplace/play backend: https://github.com/btrplace/playd)

Looks interesting.

> 
> 
> Some presentations (a lot in French, as it's a research project of a French university, but it seems to be used by Nutanix in production):

"Fabien Hermenier", the main contact on the website, works for Nutanix.

> https://webcast.in2p3.fr/video/a_flexible_virtual_machine_placement_algorithm_for_iaas_clouds_to_fit_evolving_user_requirements
> https://fhermeni.github.io/pubs/hermenier-rescom17.pdf
> 
> some academic papers:
> http://www.btrplace.org/pubs/hermenier-socc17.pdf
> http://www.btrplace.org/pubs/kherbache-tcc17.pdf

After a quick skim over those it seems that they are only related work,
e.g., showing how one could prove certain things in such an environment,
but not the (full) idea behind the scheduler itself.
The "real deal" which actually describes what they do is sadly behind
a paywall: https://ieeexplore.ieee.org/abstract/document/6409358

It seems that the interesting code lives here:
https://github.com/btrplace/scheduler/tree/master/api/src/main/java/org/btrplace

> 
> Now, the main problem is that it's Java. (it seems that scientists like it; Red Hat RHEV/oVirt have also implemented a scheduling algorithm model in Java).
> I don't know if it could be implemented in Proxmox? (or at least with a daemon like the daemon, and REST API calls from Perl to Java? Importing Java classes in Perl ???)
> 

That's not good, we really do not want a Java runtime for anything
in Proxmox, so not completely the holy grail..
But currently there is some work going on to see if Rust could
be used for things needing to be fast, so maybe we could take the ideas
and do so? I mean, if this should be used in PVE you need integration
anyway; one needs to keep in mind that backups, replications, ... can
happen and one cannot just do a migration on the QEMU level.

It also needs to be integrated so that a VM which is currently locked
(e.g., for backup or snapshot) is marked as temporarily non-migratable;
only if such information can be passed to the scheduler, and it can use
our (API) methods, could this be of use...
Sorry, I did not want to dampen your enthusiasm about finally finding
a really good solution for this, just thinking about a realistic
integration.. Also, the Java part really won't fly, not only from me, but
also Dietmar et al. won't like it.
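
As a rough sketch of the locking point above: before anything is handed
to a scheduler, locked guests would have to be filtered out. The dict
layout below is made up for illustration (a "lock" key per VM config is
an assumption of this sketch, and so are the lock values shown).

```python
def migratable_vms(vm_configs):
    """Return the sorted ids of VMs that carry no lock.

    vm_configs: hypothetical mapping of VM id -> config dict; a VM counts
    as locked when its config has a non-empty "lock" entry.
    """
    return sorted(
        vmid for vmid, conf in vm_configs.items() if not conf.get("lock")
    )


# Made-up example: 101 is being backed up, 102 is being snapshotted.
configs = {
    100: {"memory": 4096},
    101: {"memory": 2048, "lock": "backup"},
    102: {"memory": 8192, "lock": "snapshot"},
    103: {"memory": 1024},
}
candidates = migratable_vms(configs)
```

The lock state can change between this check and the actual migration,
so a real integration would still have to go through our API methods,
which do the locking properly.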

Do you think you can find out about the real algorithms they use?
I guess that porting this over to something without a runtime (Java or
else) should not be too problematic (I hope I'm not too naïve here ^^)
and more people/projects could benefit from it..




