[pbs-devel] [PATCH proxmox-backup 16/22] file-restore-daemon: add watchdog module

Wolfgang Bumiller w.bumiller at proxmox.com
Wed Feb 17 12:29:03 CET 2021


On Wed, Feb 17, 2021 at 12:14:39PM +0100, Stefan Reiter wrote:
> On 17/02/2021 11:52, Wolfgang Bumiller wrote:
> > On Tue, Feb 16, 2021 at 06:07:04PM +0100, Stefan Reiter wrote:
> > > Add a watchdog that will automatically shut down the VM after 10
> > > minutes, if no API call is received.
> > > 
> > > This is handled using the unix 'alarm' syscall.
> > > 
> > > Signed-off-by: Stefan Reiter <s.reiter at proxmox.com>
> > > ---
> > >   src/api2/types/file_restore.rs             |  3 ++
> > >   src/bin/proxmox-restore-daemon.rs          |  5 ++
> > >   src/bin/proxmox_restore_daemon/api.rs      | 22 ++++++--
> > >   src/bin/proxmox_restore_daemon/mod.rs      |  3 ++
> > >   src/bin/proxmox_restore_daemon/watchdog.rs | 63 ++++++++++++++++++++++
> > >   5 files changed, 91 insertions(+), 5 deletions(-)
> > >   create mode 100644 src/bin/proxmox_restore_daemon/watchdog.rs
> > > 
> > > diff --git a/src/api2/types/file_restore.rs b/src/api2/types/file_restore.rs
> > > index cd8df16a..710c6d83 100644
> > > --- a/src/api2/types/file_restore.rs
> > > +++ b/src/api2/types/file_restore.rs
> > > @@ -8,5 +8,8 @@ use proxmox::api::api;
> > >   pub struct RestoreDaemonStatus {
> > >       /// VM uptime in seconds
> > >       pub uptime: i64,
> > > +    /// time left until auto-shutdown, keep in mind that this is inaccurate when 'keep-timeout' is
> > > +    /// not set, as then after the status call the timer will have reset
> > > +    pub timeout: i64,
> > >   }
> > > diff --git a/src/bin/proxmox-restore-daemon.rs b/src/bin/proxmox-restore-daemon.rs
> > > index 1ec90794..d30da563 100644
> > > --- a/src/bin/proxmox-restore-daemon.rs
> > > +++ b/src/bin/proxmox-restore-daemon.rs
> > > @@ -40,6 +40,9 @@ fn main() -> Result<(), Error> {
> > >           .write_style(env_logger::WriteStyle::Never)
> > >           .init();
> > > +    // start watchdog, failure is a critical error as it leads to a scenario where we never exit
> > > +    watchdog_init()?;
> > > +
> > >       proxmox_backup::tools::runtime::main(run())
> > >   }
> > > @@ -77,6 +80,8 @@ fn accept_vsock_connections(
> > >                   Ok(stream) => {
> > >                       if sender.send(Ok(stream)).await.is_err() {
> > >                           error!("connection accept channel was closed");
> > > +                    } else {
> > > +                        watchdog_ping();
> > 
> > Should the ping not also happen at every API call, in case connections
> > get reused?
> > 
> 
> I wanted to keep as much watchdog code as possible out of API calls, lest
> some new code forgets to call a ping(), but yes, I didn't think of
> connection reuse (it doesn't currently happen anywhere, but still good to
> be safe).

So maybe the API handler should just take some kind of callback that it
triggers before each API call.
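
A rough sketch of what I mean, not tied to the actual code in this series:
`Watchdog`, `ApiDispatcher`, `before_call` and `demo` are hypothetical names
made up for illustration, and the timer is modelled with a plain timestamp
instead of the 'alarm' syscall from the patch. The point is only that the
ping lives in one place, so a newly added API call cannot forget it, and
reused connections are covered as well.

```rust
use std::sync::atomic::{AtomicU64, Ordering};
use std::sync::Arc;
use std::time::{Duration, Instant};

/// Hypothetical watchdog state: remembers when the last ping happened.
/// (The patch uses alarm(2) instead; this is only a stand-in.)
pub struct Watchdog {
    start: Instant,
    last_ping_ms: AtomicU64, // milliseconds since `start`
    timeout: Duration,
}

impl Watchdog {
    pub fn new(timeout: Duration) -> Self {
        Watchdog { start: Instant::now(), last_ping_ms: AtomicU64::new(0), timeout }
    }

    /// Reset the idle timer.
    pub fn ping(&self) {
        let now = self.start.elapsed().as_millis() as u64;
        self.last_ping_ms.store(now, Ordering::SeqCst);
    }

    /// Time left until the watchdog would fire (zero once expired).
    pub fn remaining(&self) -> Duration {
        let last = Duration::from_millis(self.last_ping_ms.load(Ordering::SeqCst));
        let idle = self.start.elapsed().saturating_sub(last);
        self.timeout.saturating_sub(idle)
    }
}

/// Hypothetical API dispatcher that runs a callback before every call.
pub struct ApiDispatcher {
    before_call: Box<dyn Fn() + Send + Sync>,
}

impl ApiDispatcher {
    pub fn new(before_call: impl Fn() + Send + Sync + 'static) -> Self {
        ApiDispatcher { before_call: Box::new(before_call) }
    }

    /// Trigger the callback, then run the actual handler.
    pub fn call<T>(&self, handler: impl FnOnce() -> T) -> T {
        (self.before_call)();
        handler()
    }
}

/// Example wiring: every dispatched call pings the watchdog first.
pub fn demo() -> (u32, Duration) {
    let wd = Arc::new(Watchdog::new(Duration::from_secs(600)));
    let hook = Arc::clone(&wd);
    let api = ApiDispatcher::new(move || hook.ping());
    let out = api.call(|| 42);
    (out, wd.remaining())
}
```

A status call that should not reset the timer (the 'keep-timeout' case)
would then need a way to bypass the callback, e.g. a second entry point
that skips it.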




