[pve-devel] [PATCH pve-manager] metrics add OpenTelemetry support

Thomas Lamprecht t.lamprecht at proxmox.com
Tue Jul 15 08:52:50 CEST 2025


Hello,

On 15.07.25 at 05:55, nansen.su wrote:
>   This patch adds OpenTelemetry metrics collection to the PVE manager 
>   to improve observability and monitoring capabilities.
> 
>   The implementation includes:
>   - OTLP/HTTP JSON protocol support for OpenTelemetry Collector
>   - Comprehensive metrics collection for nodes, VMs, containers, and storage
>   - Batching with configurable size limits and compression
>   - Full compliance with OpenTelemetry v1 specification
> 
>   Technical features:
>   - Server/port configuration with HTTP/HTTPS protocol support
>   - Gzip compression with configurable body size limits (default 10MB)
>   - Custom HTTP headers (Bearer tokens, API keys)
>   - Resource attributes support (Unicode support)
>   - Timeout control and SSL certificate verification options
>   - Recursive metrics conversion supporting all PVE data types
> 
>   Signed-off-by: Nansen Su <nansen.su at sianit.com>

Thanks for your contribution, can you please send a signed CLA [0] to
office at, this is a requirement for us to take in such a patch.

[0]: https://pve.proxmox.com/wiki/Developer_Documentation#Software_License_and_Copyright

I tried to give the code a good look, but ran a bit short on time towards the
end. All in all it doesn't look bad at all, nice!

> 
> ---
>  PVE/ExtMetric.pm                    |   2 +
>  PVE/Status/Makefile                 |   1 +
>  PVE/Status/OpenTelemetry.pm         | 628 ++++++++++++++++++++++++++++
>  www/manager6/dc/MetricServerView.js | 259 ++++++++++++
>  4 files changed, 890 insertions(+)
>  create mode 100644 PVE/Status/OpenTelemetry.pm
> 
> diff --git a/PVE/ExtMetric.pm b/PVE/ExtMetric.pm
> index 02e7c327..ebc2817b 100644
> --- a/PVE/ExtMetric.pm
> +++ b/PVE/ExtMetric.pm
> @@ -6,9 +6,11 @@ use warnings;
>  use PVE::Status::Plugin;
>  use PVE::Status::Graphite;
>  use PVE::Status::InfluxDB;
> +use PVE::Status::OpenTelemetry;
>  
>  PVE::Status::Graphite->register();
>  PVE::Status::InfluxDB->register();
> +PVE::Status::OpenTelemetry->register();
>  PVE::Status::Plugin->init();
>  
>  sub foreach_plug($&) {
> diff --git a/PVE/Status/Makefile b/PVE/Status/Makefile
> index c2f2edbc..eebce6b7 100644
> --- a/PVE/Status/Makefile
> +++ b/PVE/Status/Makefile
> @@ -3,6 +3,7 @@ include ../../defines.mk
>  PERLSOURCE = 			\
>  	Graphite.pm		\
>  	InfluxDB.pm		\
> +	OpenTelemetry.pm	\
>  	Plugin.pm
>  
>  all:
> diff --git a/PVE/Status/OpenTelemetry.pm b/PVE/Status/OpenTelemetry.pm
> new file mode 100644
> index 00000000..3de6ac51
> --- /dev/null
> +++ b/PVE/Status/OpenTelemetry.pm
> @@ -0,0 +1,628 @@
> +package PVE::Status::OpenTelemetry;
> +
> +use strict;
> +use warnings;
> +
> +use PVE::Status::Plugin;
> +use base qw(PVE::Status::Plugin);
> +
> +use JSON;
> +use LWP::UserAgent;
> +use HTTP::Request;

This comes from libhttp-message-perl, which we already depend on in pve-common
and pve-http-server. Still, for the sake of completeness, we should now also
add libhttp-message-perl as a dependency in the debian/control file of the
pve-manager package.

> +use IO::Compress::Gzip qw(gzip $GzipError);

IIRC this uses the slower perl based implementation, in our pve-http-server
repo we use the `Compress::Zlib::memGzip` method, which is also shipped by the
core perl modules but uses the system zlib. This here is probably not really
performance critical, but might make sense to use the same thing as the
pve-http-server.

See https://perldoc.perl.org/5.40.1/Compress::Zlib
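
For illustration, a minimal sketch of what the compression helper could look
like with `memGzip`. The helper name `compress_json` is hypothetical, and
JSON::PP is used here only to keep the snippet self-contained (the patch
itself uses JSON):

```perl
use strict;
use warnings;

use Compress::Zlib qw(memGzip memGunzip);
use JSON::PP;

# Hypothetical helper: serialize a data structure to JSON and gzip it in memory.
sub compress_json {
    my ($data) = @_;

    my $json_str = JSON::PP->new->utf8->canonical->encode($data);

    # memGzip returns undef on failure
    my $compressed = memGzip($json_str);
    die "gzip compression failed\n" if !defined($compressed);

    return $compressed;
}

# Round trip: compress an (empty) OTLP-style payload and decompress it again.
my $gz = compress_json({ resourceMetrics => [] });
my $roundtrip = memGunzip($gz);
print "$roundtrip\n";
```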


> +use PVE::Tools qw(extract_param);

nit: please group imports for Proxmox dependencies separately and sort
imports alphabetically in each group.

> +use Encode;
> +use MIME::Base64;

FYI: this imports decode_base64 by default, so you can either use that or
avoid the default import by adding an empty list like qw() at the end:

    use MIME::Base64 qw(); # no default imports

    # or explicitly import decode_base64 for import hygiene and clarity
    use MIME::Base64 qw(decode_base64);

> +
> +sub type {
> +    return 'opentelemetry';
> +}
> +
> +sub properties {
> +    return {
> +        'otel-protocol' => {

I'm fine with the plugin specific otel prefix, but @Dominik: these here might be
a good fit for the property separation? It's not many plugins and each of them has
not that many properties. Or do you know anything that would speak against this?

Anyhow, nothing that needs to block this for real and nothing you @Nansen Su need
to worry about.

> +            type => 'string',
> +            enum => ['http', 'https'],
> +            description => 'HTTP protocol',
> +            default => 'https',
> +        },
> +        'otel-path' => {
> +            type => 'string',
> +            description => 'OTLP endpoint path',
> +            default => '/v1/metrics',
> +            optional => 1,
> +        },
> +        'otel-timeout' => {
> +            type => 'integer',
> +            description => 'HTTP request timeout in seconds',
> +            default => 30,

That's a rather high default timeout given that pvestatd produces stats every
10s, can we lower this to 5s or less?

> +            minimum => 1,
> +            maximum => 300,
> +        },
> +        'otel-headers' => {
> +            type => 'string',
> +            description => 'Custom HTTP headers (JSON format, base64 encoded)',
> +            optional => 1,
> +        },
> +        'otel-verify-ssl' => {
> +            type => 'boolean',
> +            description => 'Verify SSL certificates',
> +            default => 1,
> +        },
> +        'otel-max-body-size' => {
> +            type => 'integer',
> +            description => 'Maximum request body size in bytes',
> +            default => 10_000_000,
> +            minimum => 1024,
> +        },
> +        'otel-resource-attributes' => {
> +            type => 'string',
> +            description => 'Additional resource attributes as JSON, base64 encoded',

Can you provide an example of what one might put in here?

Mostly asking to derive a reasonable maximum length, as we would like to
always have some limit for such free-form strings, especially as pmxcfs,
the FUSE filesystem backing /etc/pve, imposes some relatively low max
file size, so one entry being able to use up all of that is not really
ideal.

Something between 1 and 10 KiB is often a good starting limit. We can
increase this relatively easily if anybody runs into it, but lowering it
in the future is hard. E.g.:

    maxLength => 1024,
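
To get a feeling for the sizes involved, here is a rough sketch that encodes
a small, purely hypothetical attribute set the way the UI would (JSON, then
base64) and checks it against such a limit. The attribute keys and values
are made up for illustration and do not come from the patch:

```perl
use strict;
use warnings;

use JSON::PP;
use MIME::Base64 qw(encode_base64);

# Hypothetical attribute set; keys and values are invented for this example.
my $attributes = {
    'deployment.environment' => 'production',
    'service.namespace' => 'virtualization',
    'datacenter' => 'dc-east-1',
};

my $json = JSON::PP->new->utf8->canonical->encode($attributes);

# The empty second argument suppresses the line breaks that encode_base64
# would otherwise insert.
my $encoded = encode_base64($json, '');

my $max_length = 1024; # the limit suggested above
printf "json: %d bytes, base64: %d bytes, fits: %s\n",
    length($json), length($encoded),
    length($encoded) <= $max_length ? 'yes' : 'no';
```

Base64 grows the input by about a third, so a 1 KiB limit on the stored
value leaves roughly 768 bytes for the JSON itself.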

> +            optional => 1,
> +        },
> +        'otel-compression' => {
> +            type => 'boolean',
> +            description => 'Enable gzip compression for requests',

The property is already named quite generic, maybe make this an enum here
to more easily allow adding other compression algorithms in the future?
Something like:

    type => 'string',
    enum => ['none', 'gzip'],
    default => 'none',
    optional => 1,
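
With an enum, the sending side could dispatch on the configured value. A
minimal sketch, assuming a hypothetical `encode_body` helper and dispatch
table (neither exists in the patch):

```perl
use strict;
use warnings;

use Compress::Zlib qw(memGzip);

# Hypothetical dispatch table keyed by the enum value. Each entry returns the
# (possibly compressed) body and the Content-Encoding header value, if any.
my $compressors = {
    none => sub {
        my ($body) = @_;
        return ($body, undef);
    },
    gzip => sub {
        my ($body) = @_;
        my $out = memGzip($body);
        die "gzip compression failed\n" if !defined($out);
        return ($out, 'gzip');
    },
};

sub encode_body {
    my ($body, $algorithm) = @_;
    $algorithm //= 'none';

    my $compress = $compressors->{$algorithm}
        or die "unsupported compression '$algorithm'\n";

    return $compress->($body);
}

my ($data, $content_encoding) = encode_body('{"resourceMetrics":[]}', 'gzip');
```

Adding another algorithm later would then only need a new enum value and a
new entry in the table.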

> +            default => 1,
> +            optional => 1,
> +        },
> +    };
> +}
> +
> +sub options {
> +    return {
> +        server => { optional => 0 },
> +        port => { optional => 1 },
> +        disable => { optional => 1 },
> +        'otel-protocol' => { optional => 1 },
> +        'otel-path' => { optional => 1 },
> +        'otel-timeout' => { optional => 1 },
> +        'otel-headers' => { optional => 1 },
> +        'otel-verify-ssl' => { optional => 1 },
> +        'otel-max-body-size' => { optional => 1 },
> +        'otel-resource-attributes' => { optional => 1 },
> +        'otel-compression' => { optional => 1 },
> +    };
> +}
> +
> +sub _connect {
> +    my ($class, $cfg, $id) = @_;
> +
> +    my $connection = {
> +        id => $id,
> +        cfg => $cfg,
> +        metrics => [],
> +        retry_count => 0,
> +        last_flush => time(),

Above seems unused?

> +        stats => {
> +            total_metrics => 0,
> +            successful_batches => 0,
> +            failed_batches => 0,
> +        }
> +    };
> +
> +    return $connection;
> +}
> +
> +sub _disconnect {
> +    my ($class, $connection) = @_;
> +    # No persistent connection to cleanup
> +}
> +
> +sub _get_otlp_url {
> +    my ($class, $cfg) = @_;
> +    my $proto = $cfg->{'otel-protocol'} || 'https';
> +    my $port = $cfg->{port} || ($proto eq 'https' ? 4318 : 4317);
> +    my $path = $cfg->{'otel-path'} || '/v1/metrics';
> +    
> +    return "${proto}://$cfg->{server}:${port}${path}";
> +}
> +
> +sub _decode_base64_json {
> +    my ($class, $encoded_str) = @_;
> +    return $encoded_str unless defined $encoded_str && $encoded_str ne '';
> +    
> +    # Always attempt base64 decode, fallback to original on any issue
> +    my $decoded_str = MIME::Base64::decode_base64($encoded_str);
> +    
> +    # If decode result is empty or doesn't look right, use original

That seems a bit odd and such encoding "downgrade" things can be prone to bugs,
if we always expect it to be base64 I'd rather enforce that here.
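
A strict variant could look roughly like the following. This is only a
sketch; the helper name `decode_base64_strict` and the exact validation
rule (standard alphabet, padded to a multiple of four) are assumptions:

```perl
use strict;
use warnings;

use MIME::Base64 qw(decode_base64);

# Hypothetical strict decoder: dies on anything that is not well-formed,
# padded base64 instead of silently passing the input through.
sub decode_base64_strict {
    my ($encoded) = @_;

    die "missing base64 value\n" if !defined($encoded) || $encoded eq '';

    die "value is not valid base64\n"
        if length($encoded) % 4 != 0
        || $encoded !~ m!^[A-Za-z0-9+/]+={0,2}\z!;

    return decode_base64($encoded);
}

print decode_base64_strict('eyJhIjoxfQ=='), "\n";
```

The explicit check is needed because `decode_base64` itself ignores invalid
characters rather than reporting them.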

> +    if (!defined $decoded_str || length($decoded_str) == 0) {
> +        return $encoded_str;
> +    }
> +    
> +    return $decoded_str;
> +}
> +
> +sub _parse_headers {
> +    my ($class, $headers_str) = @_;
> +    return {} unless defined $headers_str && $headers_str ne '';
> +    
> +    my $decoded_str = $class->_decode_base64_json($headers_str);
> +    
> +    my $headers = {};
> +    eval {
> +        my $json = JSON->new->decode($decoded_str);
> +        $headers = $json if ref($json) eq 'HASH';

It might be good to die here if ref($json) isn't a hash, to notify the
user about a possible misconfiguration?

> +    };
> +    if ($@) {
> +        warn "Failed to parse headers: $@";
> +        warn "Headers string was: $headers_str";

I would slightly prefer having a single warning here, as there is no
guarantee that the two warnings are output close to each other on busy
systems generating lots of logs. E.g.:

    warn "Failed to parse headers '$headers_str' - $@" if $@;
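
Put together with the type check, the parsing could be sketched as follows.
The function is a hypothetical stand-alone variant that expects the already
decoded JSON text; JSON::PP is used only so that the snippet runs on its own:

```perl
use strict;
use warnings;

use JSON::PP;

# Hypothetical variant of the header parsing: rejects non-object JSON and
# emits exactly one warning per failure.
sub parse_headers {
    my ($headers_str) = @_;
    return {} if !defined($headers_str) || $headers_str eq '';

    my $headers = eval {
        my $json = JSON::PP->new->decode($headers_str);
        die "expected a JSON object\n" if ref($json) ne 'HASH';
        $json;
    };
    if (my $err = $@) {
        warn "Failed to parse headers '$headers_str' - $err";
        return {};
    }

    return $headers;
}

my $headers = parse_headers('{"Authorization":"Bearer example-token"}');
print "$_: $headers->{$_}\n" for sort keys %$headers;
```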

> +    }
> +    return $headers;
> +}
> +
> +sub _parse_resource_attributes {
> +    my ($class, $json_str) = @_;
> +    return [] unless defined $json_str && $json_str ne '';
> +    
> +    my $decoded_str = $class->_decode_base64_json($json_str);
> +    
> +    my $attributes = [];
> +    eval {
> +        # Ensure the JSON string is properly decoded as UTF-8
> +        my $utf8_json = utf8::is_utf8($decoded_str) ? $decoded_str 
> +            : Encode::decode('utf-8', $decoded_str);
> +        my $parsed = JSON->new->utf8(0)->decode($utf8_json);
> +        for my $key (keys %$parsed) {
> +            push @$attributes, {
> +                key => $key,
> +                value => { stringValue => $parsed->{$key} }
> +            };
> +        }
> +    };
> +    if ($@) {
> +        warn "Failed to parse resource_attributes: $@";
> +        warn "Resource attributes string was: $json_str";

same as above w.r.t. single warning.

> +    }
> +    return $attributes;
> +}
> +
> +sub _compress_json {
> +    my ($class, $data) = @_;
> +    
> +    my $json_str = JSON->new->utf8->encode($data);
> +    my $compressed;
> +    
> +    gzip \$json_str => \$compressed
> +        or die "gzip failed: $GzipError";
> +    
> +    return $compressed;
> +}
> +
> +sub _build_otlp_metrics {
> +    my ($class, $metrics_data, $cfg) = @_;
> +    
> +    my $cluster_name = 'proxmox-cluster';

Wouldn't something like 'single-node' make more sense for the fallback
name here?

> +    eval {
> +        my $corosync_conf = PVE::Tools::file_get_contents(
> +            '/etc/pve/corosync.conf', 1);
> +        if ($corosync_conf && $corosync_conf =~ /cluster_name:\s*(\S+)/) {
> +            $cluster_name = $1;
> +        }

no, please do not parse the corosync config here, especially not in such a
hacky way!

Rather use our methods and the in memory cached info to get this, e.g.:

    my $clinfo = PVE::Cluster::get_clinfo();
    $clinfo->{cluster}->{name};
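
A small sketch of how the lookup plus fallback could be wrapped. The helper
name is hypothetical, and it takes the hash as a parameter so that it can be
exercised without a running cluster; in the plugin the argument would be the
result of `PVE::Cluster::get_clinfo()`:

```perl
use strict;
use warnings;

# Hypothetical helper: extract the cluster name from a clinfo-style hash,
# falling back to a default when the node is not part of a cluster.
sub cluster_name_from_clinfo {
    my ($clinfo, $fallback) = @_;
    $fallback //= 'single-node';

    my $cluster = ($clinfo // {})->{cluster} // {};
    my $name = $cluster->{name};

    return defined($name) && length($name) ? $name : $fallback;
}

print cluster_name_from_clinfo({ cluster => { name => 'prod' } }), "\n";
print cluster_name_from_clinfo({}), "\n";
```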

> +    };
> +    # If reading fails, use default cluster name
> +    
> +    my $node_name = PVE::INotify::nodename();
> +    my $pve_version = PVE::pvecfg::version_text();
> +    
> +    return {
> +        resourceMetrics => [{
> +            resource => {
> +                attributes => [
> +                    { key => 'service.name', 
> +                      value => { stringValue => 'proxmox-ve' } },
> +                    { key => 'service.version', 
> +                      value => { stringValue => $pve_version } },
> +                    { key => 'proxmox.cluster', 
> +                      value => { stringValue => $cluster_name } },
> +                    { key => 'proxmox.node', 
> +                      value => { stringValue => $node_name } },
> +                    @{$class->_parse_resource_attributes(
> +                        $cfg->{'otel-resource-attributes'})}
> +                ]
> +            },
> +            scopeMetrics => [{
> +                scope => {},
> +                metrics => $metrics_data
> +            }]
> +        }]
> +    };
> +}
> +
> +
> +sub _convert_node_metrics_recursive {
> +    my ($class, $data, $ctime, $metric_prefix, $attributes) = @_;
> +    
> +    my @metrics = ();
> +    
> +    # Skip non-metric fields
> +    my $skip_fields = {
> +        name => 1,
> +        tags => 1,
> +        vmid => 1,
> +        type => 1,
> +        status => 1,
> +        template => 1,
> +        pid => 1,
> +        agent => 1,
> +        serial => 1,
> +        ctime => 1,
> +        nics => 1,      # Skip nics - handled separately with device labels
> +        storages => 1,  # Skip storages - handled separately with storage labels
> +    };
> +    
> +    # Unit mapping for common metrics
> +    my $unit_mapping = {
> +        # Memory and storage (bytes)
> +        mem => 'bytes',
> +        memory => 'bytes',
> +        swap => 'bytes',
> +        disk => 'bytes',
> +        size => 'bytes',
> +        used => 'bytes',
> +        free => 'bytes',
> +        total => 'bytes',
> +        avail => 'bytes',
> +        available => 'bytes',
> +        arcsize => 'bytes',
> +        blocks => 'bytes',
> +        bavail => 'bytes',
> +        bfree => 'bytes',
> +        
> +        # Network (bytes)
> +        net => 'bytes',
> +        receive => 'bytes',
> +        transmit => 'bytes',
> +        
> +        # CPU and time (seconds or percentage)
> +        cpu => 'percent',
> +        wait => 'seconds',
> +        iowait => 'seconds',
> +        user => 'seconds',
> +        system => 'seconds',
> +        idle => 'seconds',
> +        nice => 'seconds',
> +        steal => 'seconds',
> +        guest => 'seconds',
> +        irq => 'seconds',
> +        softirq => 'seconds',
> +        
> +        # Load average
> +        avg => '1',
> +        
> +        # Counters
> +        cpus => '1',
> +        uptime => 'seconds',
> +        
> +        # File system
> +        files => '1',
> +        ffree => '1',
> +        fused => '1',
> +        favail => '1',
> +        per => 'percent',
> +        fper => 'percent',
> +    };
> +    
> +    for my $key (sort keys %$data) {
> +        next if $skip_fields->{$key};
> +        my $value = $data->{$key};
> +        next if !defined($value);
> +        
> +        my $metric_name = "${metric_prefix}_${key}";
> +        
> +        if (ref($value) eq 'HASH') {
> +            # Recursive call for nested hashes
> +            push @metrics, $class->_convert_node_metrics_recursive(
> +                $value, $ctime, $metric_name, $attributes);
> +        } elsif (!ref($value) && $value ne '' && $value =~ /^[+-]?[0-9]*\.?[0-9]+([eE][+-]?[0-9]+)?$/) {
> +            # Numeric value - create metric
> +            my $unit = '1';  # default unit
> +            
> +            # Try to determine unit based on key name
> +            for my $pattern (keys %$unit_mapping) {
> +                if ($key =~ /\Q$pattern\E/) {
> +                    $unit = $unit_mapping->{$pattern};
> +                    last;
> +                }
> +            }
> +            
> +            # Determine if it's an integer or double
> +            my $data_point = {
> +                timeUnixNano => $ctime * 1_000_000_000,
> +                attributes => $attributes,
> +            };
> +            
> +            if ($value =~ /\./ || $value =~ /[eE]/) {
> +                $data_point->{asDouble} = $value + 0;  # Convert to number
> +            } else {
> +                $data_point->{asInt} = int($value);
> +            }
> +            
> +            push @metrics, {
> +                name => $metric_name,
> +                unit => $unit,
> +                gauge => { dataPoints => [$data_point] }
> +            };
> +        }
> +    }
> +    
> +    return @metrics;
> +}
> +
> +sub update_node_status {
> +    my ($class, $txn, $node, $data, $ctime) = @_;
> +    
> +    my @metrics = ();
> +    my $base_attributes = [
> +        { key => 'node', value => { stringValue => $node } }
> +    ];
> +    
> +    # Convert all node metrics recursively
> +    push @metrics, $class->_convert_node_metrics_recursive($data, $ctime, 'proxmox_node', $base_attributes);
> +    
> +    # Handle special cases that need different attributes
> +    # Network metrics with device labels
> +    if (defined $data->{nics}) {
> +        for my $iface (keys %{$data->{nics}}) {
> +            my $nic_attributes = [
> +                { key => 'node', value => { stringValue => $node } },
> +                { key => 'device', value => { stringValue => $iface } }
> +            ];
> +            
> +            # Use recursive processing for network metrics with device-specific attributes
> +            push @metrics, $class->_convert_node_metrics_recursive($data->{nics}->{$iface}, $ctime, 'proxmox_node_network', $nic_attributes);
> +        }
> +    }
> +    
> +    # Storage metrics with storage labels
> +    if (defined $data->{storages}) {
> +        for my $storage (keys %{$data->{storages}}) {
> +            my $storage_attributes = [
> +                { key => 'node', value => { stringValue => $node } },
> +                { key => 'storage', value => { stringValue => $storage } }
> +            ];
> +            
> +            # Use recursive processing for storage metrics with storage-specific attributes
> +            push @metrics, $class->_convert_node_metrics_recursive($data->{storages}->{$storage}, $ctime, 'proxmox_node_storage', $storage_attributes);
> +        }
> +    }
> +    
> +    push @{$txn->{metrics}}, @metrics;
> +}
> +
> +sub update_qemu_status {
> +    my ($class, $txn, $vmid, $data, $ctime, $nodename) = @_;
> +    
> +    my @metrics = ();
> +    my $vm_attributes = [
> +        { key => 'vmid', value => { stringValue => $vmid } },
> +        { key => 'node', value => { stringValue => $nodename } },
> +        { key => 'name', value => { stringValue => $data->{name} || '' } },
> +        { key => 'type', value => { stringValue => 'qemu' } }
> +    ];
> +    
> +    # Use recursive processing for all VM metrics
> +    push @metrics, $class->_convert_node_metrics_recursive($data, $ctime, 'proxmox_vm', $vm_attributes);
> +    
> +    push @{$txn->{metrics}}, @metrics;
> +}
> +
> +sub update_lxc_status {
> +    my ($class, $txn, $vmid, $data, $ctime, $nodename) = @_;
> +    
> +    my @metrics = ();
> +    my $vm_attributes = [
> +        { key => 'vmid', value => { stringValue => $vmid } },
> +        { key => 'node', value => { stringValue => $nodename } },
> +        { key => 'name', value => { stringValue => $data->{name} || '' } },
> +        { key => 'type', value => { stringValue => 'lxc' } }
> +    ];
> +    
> +    # Use recursive processing for all LXC metrics
> +    push @metrics, $class->_convert_node_metrics_recursive($data, $ctime, 'proxmox_vm', $vm_attributes);
> +    
> +    push @{$txn->{metrics}}, @metrics;
> +}
> +
> +sub update_storage_status {
> +    my ($class, $txn, $nodename, $storeid, $data, $ctime) = @_;
> +    
> +    my @metrics = ();
> +    my $storage_attributes = [
> +        { key => 'node', value => { stringValue => $nodename } },
> +        { key => 'storage', value => { stringValue => $storeid } }
> +    ];
> +    
> +    # Use recursive processing for all storage metrics
> +    push @metrics, $class->_convert_node_metrics_recursive($data, $ctime, 'proxmox_storage', $storage_attributes);
> +    
> +    push @{$txn->{metrics}}, @metrics;
> +}
> +
> +sub flush_data {
> +    my ($class, $txn) = @_;
> +    
> +    return if !$txn->{connection};
> +    return if !$txn->{metrics} || !@{$txn->{metrics}};
> +    
> +    my $metrics = delete $txn->{metrics};
> +    $txn->{metrics} = [];
> +    
> +    eval {
> +        $class->_send_metrics_batched($txn->{connection}, $metrics, $txn->{cfg});
> +        $txn->{stats}->{successful_batches}++;
> +    };
> +    
> +    if (my $err = $@) {
> +        $txn->{stats}->{failed_batches}++;
> +        die "OpenTelemetry export failed '$txn->{id}': $err";
> +    }
> +}
> +
> +sub _send_metrics_batched {
> +    my ($class, $connection, $metrics, $cfg) = @_;
> +    
> +    my $max_body_size = $cfg->{'otel-max-body-size'} || 10_000_000;
> +    my $total_metrics = @$metrics;
> +    
> +    # Estimate metrics per batch based on size heuristics
> +    my $estimated_batch_size = $class->_estimate_batch_size($metrics, $max_body_size, $cfg);
> +    
> +    # If estimated batch size covers all metrics, try sending everything at once
> +    if ($estimated_batch_size >= $total_metrics) {
> +        my $otlp_data = $class->_build_otlp_metrics($metrics, $cfg);
> +        my $serialized_size = $class->_get_serialized_size($otlp_data, $cfg);
> +        
> +        if ($serialized_size <= $max_body_size) {
> +            $class->send($connection, $otlp_data, $cfg);
> +            return;
> +        }
> +        # If estimation was wrong, fall through to batching
> +    }
> +    
> +    # Send in batches
> +    for (my $i = 0; $i < $total_metrics; $i += $estimated_batch_size) {
> +        my $end_idx = $i + $estimated_batch_size - 1;
> +        $end_idx = $total_metrics - 1 if $end_idx >= $total_metrics;
> +        
> +        my @batch_metrics = @$metrics[$i..$end_idx];
> +        my $batch_otlp = $class->_build_otlp_metrics(\@batch_metrics, $cfg);
> +        
> +        # Verify batch size is within limits
> +        my $batch_size_bytes = $class->_get_serialized_size($batch_otlp, $cfg);
> +        if ($batch_size_bytes > $max_body_size) {
> +            # Fallback: send metrics one by one
> +            for my $single_metric (@batch_metrics) {
> +                my $single_otlp = $class->_build_otlp_metrics([$single_metric], $cfg);
> +                $class->send($connection, $single_otlp, $cfg);
> +            }
> +        } else {
> +            $class->send($connection, $batch_otlp, $cfg);
> +        }
> +    }
> +}
> +
> +sub _estimate_batch_size {
> +    my ($class, $metrics, $max_body_size, $cfg) = @_;
> +    
> +    return 1 if @$metrics == 0;
> +    
> +    # Sample first few metrics to estimate size per metric
> +    my $sample_size = @$metrics > 10 ? 10 : @$metrics;
> +    my @sample_metrics = @$metrics[0..$sample_size-1];
> +    
> +    my $sample_otlp = $class->_build_otlp_metrics(\@sample_metrics, $cfg);
> +    my $sample_bytes = $class->_get_serialized_size($sample_otlp, $cfg);
> +    
> +    # Calculate average bytes per metric with overhead
> +    my $bytes_per_metric = $sample_bytes / $sample_size;
> +    
> +    # Add 20% safety margin for OTLP structure overhead
> +    $bytes_per_metric *= 1.2;
> +    
> +    # Calculate how many metrics fit in max_body_size
> +    my $estimated_count = int($max_body_size / $bytes_per_metric);
> +    
> +    # Ensure at least 1 metric per batch, and cap at total metrics
> +    $estimated_count = 1 if $estimated_count < 1;
> +    $estimated_count = @$metrics if $estimated_count > @$metrics;
> +    
> +    return $estimated_count;
> +}
> +
> +
> +sub _get_serialized_size {
> +    my ($class, $data, $cfg) = @_;
> +    
> +    my $serialized;
> +    if ($cfg->{'otel-compression'} // 1) {
> +        $serialized = $class->_compress_json($data);
> +    } else {
> +        $serialized = JSON->new->utf8->encode($data);
> +    }
> +    
> +    return length($serialized);
> +}
> +
> +sub send {
> +    my ($class, $connection, $data, $cfg) = @_;
> +    
> +    my $ua = LWP::UserAgent->new(
> +        timeout => $cfg->{'otel-timeout'} || 5,
> +        ssl_opts => { verify_hostname => $cfg->{'otel-verify-ssl'} // 1 }
> +    );
> +    
> +    my $url = $class->_get_otlp_url($cfg);
> +    
> +    my $request_data;
> +    my %headers = (
> +        'Content-Type' => 'application/json',
> +    );
> +    
> +    # Safely add parsed headers
> +    my $parsed_headers = $class->_parse_headers($cfg->{'otel-headers'});
> +    if ($parsed_headers && ref($parsed_headers) eq 'HASH') {
> +        %headers = (%headers, %$parsed_headers);
> +    }
> +    
> +    if ($cfg->{'otel-compression'} // 1) {
> +        $request_data = $class->_compress_json($data);
> +        $headers{'Content-Encoding'} = 'gzip';
> +    } else {
> +        $request_data = JSON->new->utf8->encode($data);
> +    }
> +    
> +    my $req = HTTP::Request->new('POST', $url, [%headers], $request_data);
> +    
> +    my $response = $ua->request($req);
> +    die "OTLP request failed: " . $response->status_line unless $response->is_success;
> +}
> +
> +sub test_connection {
> +    my ($class, $cfg) = @_;
> +    
> +    my $ua = LWP::UserAgent->new(
> +        timeout => $cfg->{'otel-timeout'} || 5,
> +        ssl_opts => { verify_hostname => $cfg->{'otel-verify-ssl'} // 1 }
> +    );
> +    
> +    my $url = $class->_get_otlp_url($cfg);
> +    
> +    # Send empty metrics payload for testing
> +    my $test_data = {
> +        resourceMetrics => [{
> +            resource => { attributes => [] },
> +            scopeMetrics => [{
> +                scope => {},
> +                metrics => []
> +            }]
> +        }]
> +    };
> +    
> +    my $request_data;
> +    my %headers = (
> +        'Content-Type' => 'application/json',
> +    );
> +    
> +    # Safely add parsed headers
> +    my $parsed_headers = $class->_parse_headers($cfg->{'otel-headers'});
> +    if ($parsed_headers && ref($parsed_headers) eq 'HASH') {
> +        %headers = (%headers, %$parsed_headers);
> +    }
> +    
> +    if ($cfg->{'otel-compression'} // 1) {
> +        $request_data = $class->_compress_json($test_data);
> +        $headers{'Content-Encoding'} = 'gzip';
> +    } else {
> +        $request_data = JSON->new->utf8->encode($test_data);
> +    }
> +    
> +    my $req = HTTP::Request->new('POST', $url, [%headers], $request_data);
> +    
> +    my $response = $ua->request($req);
> +    die "Connection test failed: " . $response->status_line unless $response->is_success;
> +    
> +    return 1;
> +}
> +
> +1;
> \ No newline at end of file

Please add a trailing newline at the end of the file.

> diff --git a/www/manager6/dc/MetricServerView.js b/www/manager6/dc/MetricServerView.js
> index baae7d71..8f7920ee 100644
> --- a/www/manager6/dc/MetricServerView.js
> +++ b/www/manager6/dc/MetricServerView.js
> @@ -14,6 +14,8 @@ Ext.define('PVE.dc.MetricServerView', {
>                      return 'InfluxDB';
>                  case 'graphite':
>                      return 'Graphite';
> +                case 'opentelemetry':
> +                    return 'OpenTelemetry';
>                  default:
>                      return Proxmox.Utils.unknownText;
>              }
> @@ -106,6 +108,11 @@ Ext.define('PVE.dc.MetricServerView', {
>                      iconCls: 'fa fa-fw fa-bar-chart',
>                      handler: 'addServer',
>                  },
> +                {
> +                    text: 'OpenTelemetry',
> +                    iconCls: 'fa fa-fw fa-bar-chart',
> +                    handler: 'addServer',
> +                },
>              ],
>          },
>          {
> @@ -164,6 +171,29 @@ Ext.define('PVE.dc.MetricServerBaseEdit', {
>                  success: function (response, options) {
>                      let values = response.result.data;
>                      values.enable = !values.disable;
> +                    
> +                    // Handle OpenTelemetry advanced fields conversion
> +                    if (values.type === 'opentelemetry') {
> +                        if (values['otel-headers']) {
> +                            try {
> +                                // Use Proxmox standard base64 decode
> +                                values.headers_advanced = Ext.util.Base64.decode(values['otel-headers']);
> +                            } catch (_e) {
> +                                // Fallback for non-base64 encoded values
> +                                values.headers_advanced = values['otel-headers'];

Also here, would prefer sticking to always expect and enforce base64.

> +                            }
> +                        }
> +                        if (values['otel-resource-attributes']) {
> +                            try {
> +                                // Use Proxmox standard base64 decode
> +                                values.resource_attributes_advanced = Ext.util.Base64.decode(values['otel-resource-attributes']);
> +                            } catch (_e) {
> +                                // Fallback for non-base64 encoded values
> +                                values.resource_attributes_advanced = values['otel-resource-attributes'];
> +                            }
> +                        }
> +                    }
> +                    
>                      me.down('inputpanel').setValues(values);
>                  },
>              });
> @@ -499,3 +529,232 @@ Ext.define('PVE.dc.GraphiteEdit', {
>          },
>      ],
>  });
> +
> +Ext.define('PVE.dc.OpenTelemetryEdit', {
> +    extend: 'PVE.dc.MetricServerBaseEdit',
> +    xtype: 'pveOpenTelemetryEdit',
> +
> +    subject: gettext('OpenTelemetry Server'),
> +
> +    items: [
> +        {
> +            xtype: 'inputpanel',
> +            cbind: {
> +                isCreate: '{isCreate}',
> +            },
> +            onGetValues: function(values) {
> +                values.disable = values.enable ? 0 : 1;
> +                delete values.enable;
> +
> +                // Rename advanced fields to their final names and encode as base64 (same as webhook)
> +                if (values.headers_advanced && values.headers_advanced.trim()) {
> +                    values['otel-headers'] = Ext.util.Base64.encode(values.headers_advanced);
> +                } else {
> +                    values['otel-headers'] = '';
> +                }
> +                delete values.headers_advanced;
> +
> +                if (values.resource_attributes_advanced && values.resource_attributes_advanced.trim()) {
> +                    values['otel-resource-attributes'] = Ext.util.Base64.encode(values.resource_attributes_advanced);
> +                } else {
> +                    values['otel-resource-attributes'] = '';
> +                }
> +                delete values.resource_attributes_advanced;
> +
> +                return values;
> +            },
> +
> +            column1: [
> +                {
> +                    xtype: 'hidden',
> +                    name: 'type',
> +                    value: 'opentelemetry',
> +                    cbind: {
> +                        submitValue: '{isCreate}',
> +                    },
> +                },
> +                {
> +                    xtype: 'pmxDisplayEditField',
> +                    name: 'id',
> +                    fieldLabel: gettext('Name'),
> +                    allowBlank: false,
> +                    cbind: {
> +                        editable: '{isCreate}',
> +                        value: '{serverid}',
> +                    },
> +                },
> +                {
> +                    xtype: 'proxmoxtextfield',
> +                    name: 'server',
> +                    fieldLabel: gettext('Server'),
> +                    allowBlank: false,
> +                    emptyText: gettext('otel-collector.example.com'),
> +                },
> +                {
> +                    xtype: 'proxmoxintegerfield',
> +                    name: 'port',
> +                    fieldLabel: gettext('Port'),
> +                    value: 4318,
> +                    minValue: 1,
> +                    maxValue: 65535,
> +                    allowBlank: false,
> +                },
> +                {
> +                    xtype: 'proxmoxKVComboBox',
> +                    name: 'otel-protocol',
> +                    fieldLabel: gettext('Protocol'),
> +                    value: 'https',
> +                    comboItems: [
> +                        ['http', 'HTTP'],
> +                        ['https', 'HTTPS'],
> +                    ],
> +                    allowBlank: false,
> +                },
> +                {
> +                    xtype: 'proxmoxtextfield',
> +                    name: 'otel-path',
> +                    fieldLabel: gettext('Path'),
> +                    value: '/v1/metrics',
> +                    allowBlank: false,
> +                },
> +            ],
> +
> +            column2: [
> +                {
> +                    xtype: 'checkbox',
> +                    name: 'enable',
> +                    fieldLabel: gettext('Enabled'),
> +                    inputValue: 1,
> +                    uncheckedValue: 0,
> +                    checked: true,
> +                },
> +                {
> +                    xtype: 'proxmoxintegerfield',
> +                    name: 'otel-timeout',
> +                    fieldLabel: gettext('Timeout (s)'),
> +                    value: 5,
> +                    minValue: 1,
> +                    maxValue: 300,
> +                    allowBlank: false,
> +                },
> +                {
> +                    xtype: 'proxmoxcheckbox',
> +                    name: 'otel-verify-ssl',
> +                    fieldLabel: gettext('Verify SSL'),
> +                    inputValue: 1,
> +                    uncheckedValue: 0,
> +                    defaultValue: 1,
> +                    cbind: {
> +                        value: function(get) {
> +                            return get('isCreate') ? 1 : undefined;
> +                        }
> +                    },
> +                },
> +                {
> +                    xtype: 'proxmoxintegerfield',
> +                    name: 'otel-max-body-size',
> +                    fieldLabel: gettext('Max Body Size (bytes)'),
> +                    value: 10000000,
> +                    minValue: 1024,
> +                    allowBlank: false,
> +                },
> +                {
> +                    xtype: 'proxmoxcheckbox',
> +                    name: 'otel-compression',
> +                    fieldLabel: gettext('Enable Compression'),
> +                    inputValue: 1,
> +                    uncheckedValue: 0,
> +                    defaultValue: 1,
> +                    cbind: {
> +                        value: function(get) {
> +                            return get('isCreate') ? 1 : undefined;
> +                        }
> +                    },
> +                },
> +            ],
> +
> +
> +            columnB: [
> +                {
> +                    xtype: 'fieldset',
> +                    title: gettext('Advanced JSON Configuration'),
> +                    collapsible: true,
> +                    collapsed: true,
> +                    items: [
> +                        {
> +                            xtype: 'textarea',
> +                            name: 'headers_advanced',
> +                            fieldLabel: gettext('HTTP Headers (JSON)'),
> +                            labelAlign: 'top',
> +                            emptyText: gettext('{\n  "Authorization": "Bearer token",\n  "X-Custom-Header": "value"\n}'),
> +                            rows: 4,
> +                            validator: function(value) {
> +                                if (!value || value.trim() === '') {
> +                                    return true;
> +                                }
> +                                try {
> +                                    JSON.parse(value);
> +                                    return true;
> +                                } catch (_e) {
> +                                    return gettext('Invalid JSON format');
> +                                }
> +                            },
> +                        },
> +                        {
> +                            xtype: 'textarea',
> +                            name: 'resource_attributes_advanced',
> +                            fieldLabel: gettext('Resource Attributes (JSON)'),
> +                            labelAlign: 'top',
> +                            emptyText: gettext('{\n  "environment": "production",\n  "datacenter": "dc1",\n  "region": "us-east-1"\n}'),
> +                            rows: 4,
> +                            validator: function(value) {
> +                                if (!value || value.trim() === '') {
> +                                    return true;
> +                                }
> +                                try {
> +                                    JSON.parse(value);
> +                                    return true;
> +                                } catch (_e) {
> +                                    return gettext('Invalid JSON format');
> +                                }
> +                            },
> +                        },
> +                    ],
> +                },
> +            ],
> +        },
> +    ],
> +
> +    initComponent: function() {
> +        var me = this;
> +        var initialLoad = true;
> +
> +        me.callParent();
> +
> +        // Auto-adjust port when protocol changes (only for user interaction)
> +        me.on('afterrender', function() {
> +            var protocolField = me.down('[name=otel-protocol]');
> +            var portField = me.down('[name=port]');
> +
> +            if (protocolField && portField) {
> +                // Set flag to false after initial load
> +                me.on('loadrecord', function() {
> +                    setTimeout(function() {
> +                        initialLoad = false;
> +                    }, 100);
> +                });
> +
> +                protocolField.on('change', function(field, newValue) {
> +                    // Only auto-adjust port if this is user interaction, not initial load
> +                    if (!initialLoad) {
> +                        if (newValue === 'https') {
> +                            portField.setValue(4318);
> +                        } else {
> +                            portField.setValue(4317);
> +                        }
> +                    }
> +                });
> +            }
> +        });
> +    },
> +});
> \ No newline at end of file
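
Both textarea validators quoted above are identical, and they accept any valid JSON, including arrays and bare numbers. A minimal sketch of a shared helper follows, assuming both fields are meant to hold flat string-to-string maps (the patch does not state this). The name validateStringMap is hypothetical, and gettext() is left out so the snippet runs standalone:

```javascript
// Hypothetical helper, not part of the patch: accepts empty input or a JSON
// object whose values are all strings; returns an error message otherwise.
function validateStringMap(value) {
    if (!value || value.trim() === '') {
        return true; // empty input is allowed, as in the quoted validators
    }
    let parsed;
    try {
        parsed = JSON.parse(value);
    } catch (_e) {
        return 'Invalid JSON format';
    }
    // JSON.parse also succeeds for null, arrays, numbers and strings
    if (parsed === null || typeof parsed !== 'object' || Array.isArray(parsed)) {
        return 'Expected a JSON object';
    }
    for (const [key, val] of Object.entries(parsed)) {
        if (typeof val !== 'string') {
            return `Value for "${key}" must be a string`;
        }
    }
    return true;
}
```

Each textarea could then point its validator at this one function instead of carrying its own copy of the try/catch block.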