[pdm-devel] [PATCH datacenter-manager 3/7] server: api: collect failed remotes list while getting status
Christian Ebner
c.ebner at proxmox.com
Tue Oct 14 15:10:26 CEST 2025
On 10/14/25 2:53 PM, Dominik Csapak wrote:
>
>
> On 10/14/25 2:41 PM, Christian Ebner wrote:
>> On 10/14/25 10:37 AM, Dominik Csapak wrote:
>>> comments inline
>>>
>>> On 10/13/25 10:56 AM, Christian Ebner wrote:
>>>> Include name, remote type and error message for failed remotes when
>>>> gathering status information, in order to be able to discriminate
>>>> errors by remote type for the dashboard. To get the per-remote-type
>>>> resources, utilize the now previously exposed remote type filtering
>>>> for resources.
>>>>
>>>> Signed-off-by: Christian Ebner <c.ebner at proxmox.com>
>>>> ---
>>>> server/src/api/resources.rs | 105 +++++++++++++++++++
>>>> +----------------
>>>> 1 file changed, 58 insertions(+), 47 deletions(-)
>>>>
>>>> diff --git a/server/src/api/resources.rs b/server/src/api/resources.rs
>>>> index 029106f..33da5c2 100644
>>>> --- a/server/src/api/resources.rs
>>>> +++ b/server/src/api/resources.rs
>>>> @@ -9,9 +9,9 @@ use futures::FutureExt;
>>>> use pbs_api_types::{DataStoreStatusListItem, NodeStatus};
>>>> use pdm_api_types::remotes::{Remote, RemoteType};
>>>> use pdm_api_types::resource::{
>>>> - PbsDatastoreResource, PbsNodeResource, PveLxcResource,
>>>> PveNodeResource, PveQemuResource,
>>>> - PveSdnResource, PveStorageResource, RemoteResources, Resource,
>>>> ResourceType, ResourcesStatus,
>>>> - SdnStatus, SdnZoneResource, TopEntities,
>>>> + FailedRemote, PbsDatastoreResource, PbsNodeResource,
>>>> PveLxcResource, PveNodeResource,
>>>> + PveQemuResource, PveSdnResource, PveStorageResource,
>>>> RemoteResources, Resource, ResourceType,
>>>> + ResourcesStatus, SdnStatus, SdnZoneResource, TopEntities,
>>>> };
>>>> use pdm_api_types::subscription::{
>>>> NodeSubscriptionInfo, RemoteSubscriptionState,
>>>> RemoteSubscriptions, SubscriptionLevel,
>>>> @@ -373,55 +373,66 @@ pub async fn get_status(
>>>> max_age: u64,
>>>> rpcenv: &mut dyn RpcEnvironment,
>>>> ) -> Result<ResourcesStatus, Error> {
>>>> - let remotes = get_resources(max_age, None, None, rpcenv).await?;
>>>> let mut counts = ResourcesStatus::default();
>>>> - for remote in remotes {
>>>> - if remote.error.is_some() {
>>>> - counts.failed_remotes += 1;
>>>> - } else {
>>>> - counts.remotes += 1;
>>>> - }
>>>> - for resource in remote.resources {
>>>> - match resource {
>>>> - Resource::PveStorage(r) => match r.status.as_str() {
>>>> - "available" => counts.storages.available += 1,
>>>> - _ => counts.storages.unknown += 1,
>>>> - },
>>>> - Resource::PveQemu(r) => match r.status.as_str() {
>>>> - _ if r.template => counts.qemu.template += 1,
>>>> - "running" => counts.qemu.running += 1,
>>>> - "stopped" => counts.qemu.stopped += 1,
>>>> - _ => counts.qemu.unknown += 1,
>>>> - },
>>>> - Resource::PveLxc(r) => match r.status.as_str() {
>>>> - _ if r.template => counts.lxc.template += 1,
>>>> - "running" => counts.lxc.running += 1,
>>>> - "stopped" => counts.lxc.stopped += 1,
>>>> - _ => counts.lxc.unknown += 1,
>>>> - },
>>>> - Resource::PveNode(r) => match r.status.as_str() {
>>>> - "online" => counts.pve_nodes.online += 1,
>>>> - "offline" => counts.pve_nodes.offline += 1,
>>>> - _ => counts.pve_nodes.unknown += 1,
>>>> - },
>>>> - Resource::PveSdn(r) => {
>>>> - if let PveSdnResource::Zone(_) = &r {
>>>> - match r.status() {
>>>> - SdnStatus::Available => {
>>>> - counts.sdn_zones.available += 1;
>>>> - }
>>>> - SdnStatus::Error => {
>>>> - counts.sdn_zones.error += 1;
>>>> - }
>>>> - SdnStatus::Unknown => {
>>>> - counts.sdn_zones.unknown += 1;
>>>> + for remote_type in [RemoteType::Pve, RemoteType::Pbs] {
>>>> + let remote_type_search =
>>>> + SearchTerm::new(remote_type.to_string()).category(Some("remote-
>>>> type"));
>>>> + let remote_type_search = remote_type_search.to_string();
>>>> + let remotes =
>>>> + get_resources_impl(max_age, Some(remote_type_search),
>>>> None, Some(rpcenv)).await?;
>>>
>>>
>>> if the `RemoteResources` struct had the complete (or at least more than
>>> the name) struct of the remote, we could keep the current structure
>>> of just looping over that result instead of manually crafting a search
>>> for each type
>>>
>>> which would also make the patch diff itself a lot smaller i think
>>
>> That will not work though? The whole point of looping over the remote
>> types here is to have it in case of an error state, allowing to add
>> the failed remote including it's type.
>>
>> So adding that additional information to the RemoteResouces does not
>> solve that issue, and I would rather opt to keep the iteration here.
>
> i meant that in `get_resources_impl` we already iterate over all remotes
> to collect the info.
>
> instead of just saving the name per remote we could add more info
> (like the type) so
>
> instead of
> ---8<---
> RemoteResources {
> remote: remote_name,
> resources,
> error,
> }
> --->8---
>
> we could do
>
> ---8<---
> RemoteResources {
> remote,
> resources,
> error,
> }
> --->8---
>
> where `remote` is some struct that also holds the remote-type info?
>
> then we can simply loop over these and extract the type directly from
> that
>
>
> or am I missing something here?
Ah, I see what you mean now! However adding the full remote is not
really an option as this would leak secrets from the remote config. I
could of course extend the RemoteResource by the remote type only, but
that would pollute this API response type with unneeded additional data,
which I would like to avoid unless there is another usecase for this?
I think it is probably best to use an internal return type or tuple for
the get_resources_impl only, dropping the non RemoteResource part when
returned by api calls.
More information about the pdm-devel
mailing list