[pve-devel] opensource vm scheduler : btrplace
Alexandre DERUMIER
aderumier at odiso.com
Wed May 29 16:39:03 CEST 2019
>>Here is the academic paper on the OpenNebula scheduler:
>>https://is.muni.cz/th/o8t7a/thesis.pdf
Damn, sorry, this is not the current scheduler implementation of OpenNebula; it is another, newer and
improved version, but it is Java too :/
----- Original Message -----
From: "aderumier" <aderumier at odiso.com>
To: "pve-devel" <pve-devel at pve.proxmox.com>
Cc: "Thomas Lamprecht" <t.lamprecht at proxmox.com>
Sent: Wednesday, 29 May 2019 16:30:57
Subject: Re: [pve-devel] opensource vm scheduler : btrplace
>>Also, in my research, the OpenNebula scheduler is more basic, but should be implementable in Perl without too much difficulty:
>>https://github.com/OpenNebula/one/blob/441cf1f7f9e726cb5f200d661d50e92a4042fff7/src/scheduler/src/sched/Scheduler.cc
>>
>>(It migrates 1 VM, recomputes, migrates 1 VM, recomputes, ...)
>>So it's best effort, but could work for basic scheduling (CPU/RAM, HA group, affinity, anti-affinity).
Here is the academic paper on the OpenNebula scheduler:
https://is.muni.cz/th/o8t7a/thesis.pdf
----- Original Message -----
From: "aderumier" <aderumier at odiso.com>
To: "Thomas Lamprecht" <t.lamprecht at proxmox.com>
Cc: "pve-devel" <pve-devel at pve.proxmox.com>
Sent: Wednesday, 29 May 2019 15:44:29
Subject: Re: [pve-devel] opensource vm scheduler : btrplace
>>It also needs to be taken into account that a VM which is currently locked
>>(e.g., for backup or snapshot) must be marked as temporarily non-migratable;
>>only if such information can be passed to the scheduler, and it can use
>>our (API) methods, could this be of use...
I think it is possible to add new states to the model:
https://github.com/btrplace/scheduler/wiki/VMs-and-nodes-life-cycle
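To make the "temporarily non-migratable" requirement concrete, here is a minimal sketch of how lock state could be separated out before a placement problem is handed to any scheduler. Everything in it is an assumption made for illustration: the `VM` record, its `lock` field and the helper names are hypothetical, and this is neither the btrplace model nor the PVE API.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class VM:
    """Hypothetical VM record; field names are illustrative only."""
    vmid: int
    node: str
    lock: Optional[str] = None  # e.g. "backup" or "snapshot"; None when unlocked


def migratable(vms: List[VM]) -> List[VM]:
    """VMs the scheduler is allowed to move right now (no lock held)."""
    return [vm for vm in vms if vm.lock is None]


def pinned(vms: List[VM]) -> Dict[int, str]:
    """Locked VMs, mapped to the node they must stay on for now."""
    return {vm.vmid: vm.node for vm in vms if vm.lock is not None}
```

The pinned VMs still consume resources on their node, so a scheduler would need them as fixed load rather than dropping them from the model entirely.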
>>After a quick skim over those, it seems that they are only related,
>>e.g., showing how one could prove certain things in such an environment,
>>but not the (full) idea behind the scheduler itself.
>>The "real deal" which actually describes what they do is sadly behind
>>a paywall: https://ieeexplore.ieee.org/abstract/document/6409358
Here is the full version (note that there have been improvements since 2013):
https://pages.lip6.fr/Julia.Lawall/btrplace-tdsc2013.pdf
>>Sorry, I did not want to dampen your enthusiasm about finally finding
>>a really good solution for this, just thinking about a realistic
>>integration.. Also the Java part really won't fly, not only from me, but
>>also Dietmar et al. won't like it.
I don't like Java either :p (nor its garbage collector)
>>Do you think you can find out about the real algorithms they use?
>>I guess that porting this over to something without a runtime (Java or
>>else) should not be too problematic (I hope I'm not too naïve here ^^)
>>and more people/projects could benefit from it..
They use
http://www.choco-solver.org/
(Java too :/)
I don't know if there exists some kind of magic converter from Java to another language (Rust, ...)?
BTW, their demo app is very nice for simulation; it could make a great web version of the HA simulator.
Also, in my research, the OpenNebula scheduler is more basic, but should be implementable in Perl without too much difficulty:
https://github.com/OpenNebula/one/blob/441cf1f7f9e726cb5f200d661d50e92a4042fff7/src/scheduler/src/sched/Scheduler.cc
(It migrates 1 VM, recomputes, migrates 1 VM, recomputes, ...)
So it's best effort, but could work for basic scheduling (CPU/RAM, HA group, affinity, anti-affinity).
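The "migrate 1 VM, recompute, migrate 1 VM, recompute" idea can be sketched as a greedy loop. This is only an illustration under stated assumptions, not the OpenNebula code: it balances a single resource (RAM), ignores HA groups and (anti-)affinity, and the function name `rebalance` and its data layout are hypothetical.

```python
from typing import Dict, List, Tuple


def rebalance(
    capacity: Dict[str, int],   # node -> RAM capacity
    vms: Dict[str, int],        # vm -> RAM usage (assumed positive)
    placement: Dict[str, str],  # vm -> current node
    max_steps: int = 100,
) -> Tuple[List[Tuple[str, str, str]], Dict[str, str]]:
    """Greedy best-effort balancing: move one VM, recompute, repeat.

    Returns the ordered migration plan [(vm, src, dst), ...] and the
    resulting placement. The input placement is not modified.
    """
    placement = dict(placement)
    plan = []
    for _ in range(max_steps):
        # Recompute the load of every node from scratch.
        usage = {node: 0 for node in capacity}
        for vm, node in placement.items():
            usage[node] += vms[vm]
        ratio = {node: usage[node] / capacity[node] for node in capacity}
        src = max(ratio, key=ratio.get)  # most loaded node
        dst = min(ratio, key=ratio.get)  # least loaded node

        # Pick the single migration src -> dst that lowers the peak the most.
        best = None
        for vm, node in placement.items():
            if node != src:
                continue
            size = vms[vm]
            if usage[dst] + size > capacity[dst]:
                continue  # does not fit on the target
            new_peak = max(
                (usage[src] - size) / capacity[src],
                (usage[dst] + size) / capacity[dst],
            )
            if new_peak < ratio[src] and (best is None or new_peak < best[0]):
                best = (new_peak, vm)
        if best is None:
            break  # no single migration improves the situation
        placement[best[1]] = dst
        plan.append((best[1], src, dst))
    return plan, placement
```

For example, with two 16 GB nodes and VMs of 8, 4 and 2 GB all on the first node, the loop moves the 8 GB VM and then stops, because no further single move lowers the peak load. Such a loop never looks more than one move ahead, which is why it is best effort rather than a global placement.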
----- Original Message -----
From: "Thomas Lamprecht" <t.lamprecht at proxmox.com>
To: "pve-devel" <pve-devel at pve.proxmox.com>, "aderumier" <aderumier at odiso.com>
Sent: Wednesday, 29 May 2019 10:34:01
Subject: Re: [pve-devel] opensource vm scheduler : btrplace
On 5/29/19 10:00 AM, Alexandre DERUMIER wrote:
> and the algorithm computes the whole placement (which is very difficult to implement efficiently, as the number of combinations to compute can be really huge),
> and gives the whole migration order (benchmarks show a few seconds to compute 10000 VMs on 10000 nodes).
I mean, it's effectively the "knapsack problem".
> It also takes into account the estimated migration time (based on network bandwidth and also the number of dirty page changes in QEMU),
> and does parallel migrations.
> There is a small interactive demo here:
> http://www.btrplace.org/play/
>
> (source code of the demo frontend: https://github.com/btrplace/play backend: https://github.com/btrplace/playd)
Looks interesting.
>
>
> Some presentations (many in French, as it's a research project of a French university, but it seems to be used by Nutanix in production):
"Fabien Hermenier", the main contact on the website, works for Nutanix.
> https://webcast.in2p3.fr/video/a_flexible_virtual_machine_placement_algorithm_for_iaas_clouds_to_fit_evolving_user_requirements
> https://fhermeni.github.io/pubs/hermenier-rescom17.pdf
>
> Some academic papers:
> http://www.btrplace.org/pubs/hermenier-socc17.pdf
> http://www.btrplace.org/pubs/kherbache-tcc17.pdf
>>After a quick skim over those, it seems that they are only related,
>>e.g., showing how one could prove certain things in such an environment,
>>but not the (full) idea behind the scheduler itself.
>>The "real deal" which actually describes what they do is sadly behind
>>a paywall: https://ieeexplore.ieee.org/abstract/document/6409358
>>It seems that the interesting code lives here:
>>https://github.com/btrplace/scheduler/tree/master/api/src/main/java/org/btrplace
>
> Now, the main problem is that it's Java (it seems that scientists like it; Red Hat RHEV/oVirt have also implemented a scheduling algorithm model in Java).
> I don't know if it could be implemented in Proxmox? (or at least with a daemon like the daemon, and REST API calls from Perl to Java? Importing Java classes in Perl ???)
>
That's not good, we really do not want a Java runtime for anything
in Proxmox, so it's not completely the holy grail..
But currently there is some work going on to see if Rust could
be used for things needing to be fast, so maybe we could take the ideas
and do so? I mean, if this should be used in PVE you need integration
anyway; one needs to keep in mind that backups, replications, ... can
happen and one cannot just do a migration on the QEMU level.
It also needs to be taken into account that a VM which is currently locked
(e.g., for backup or snapshot) must be marked as temporarily non-migratable;
only if such information can be passed to the scheduler, and it can use
our (API) methods, could this be of use...
Sorry, I did not want to dampen your enthusiasm about finally finding
a really good solution for this, just thinking about a realistic
integration.. Also the Java part really won't fly, not only from me, but
also Dietmar et al. won't like it.
Do you think you can find out about the real algorithms they use?
I guess that porting this over to something without a runtime (Java or
else) should not be too problematic (I hope I'm not too naïve here ^^)
and more people/projects could benefit from it..
_______________________________________________
pve-devel mailing list
pve-devel at pve.proxmox.com
https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel