[pve-devel] [PATCH manager v2 4/7] revised task log API call for PVE

Wed Oct 5 14:30:50 CEST 2022

On 07/09/2022 at 10:56, Daniel Tschlatscher wrote:
> The API call for fetching a tasklog with limit=0 now returns the whole
> log as a file stream rather than reading all lines in memory and then
> transfering them in JSON format. The behaviour when the url parameter
> limit is undefined or not 0 is the same as before, in accordance
> with the API specification.
> 
> Signed-off-by: Daniel Tschlatscher <d.tschlatscher at proxmox.com>
> ---
>  PVE/API2/Tasks.pm | 13 ++++++++++---
>  1 file changed, 10 insertions(+), 3 deletions(-)
> 
> diff --git a/PVE/API2/Tasks.pm b/PVE/API2/Tasks.pm
> index 9cd1e56b..88762f2f 100644
> --- a/PVE/API2/Tasks.pm
> +++ b/PVE/API2/Tasks.pm
> @@ -5,6 +5,7 @@ use warnings;
>  use POSIX;
>  use IO::File;
>  use File::ReadBackwards;
> +use File::stat;
>  use PVE::Tools;
>  use PVE::SafeSyslog;
>  use PVE::RESTHandler;
> @@ -387,11 +388,17 @@ __PACKAGE__->register_method({
>  	    $rpcenv->check($user, "/nodes/$node", [ 'Sys.Audit' ]);
>  	}
>  
> -	my ($count, $lines) = PVE::Tools::dump_logfile($filename, $start, $limit);
> +	if ($limit == 0) {
> +	    # TCP Max Transfer Unit size is 1500, compression for lower numbers has no effect

What do you mean here? The MTU of the Ethernet link layer? That would make
more sense, as the most common (but not guaranteed) MTU there is indeed 1500.
TCP can send more data per packet just fine if the MTU of the underlying
(Ethernet) network is bigger, like the often used MTU of 9000 in LANs.

Note also that the actual file size that can be transmitted in one packet would
need to factor in IP, TCP and HTTP overhead though.

For IPv4 you'd use 20 bytes for the IP header and 20 bytes for the TCP header
(both without options), so it'd be 1460 bytes; the HTTP header isn't as
deterministic, and for IPv6 there's more overhead eating away at the possible
payload size.
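
To make the arithmetic concrete, here is a rough sketch. It assumes minimum
header sizes (20 bytes IPv4, 40 bytes IPv6, 20 bytes TCP, no options or
extension headers) and ignores the HTTP header entirely; max_tcp_payload is a
made-up name for illustration, not something that exists in PVE:

```perl
use strict;
use warnings;

# Minimum header sizes in bytes, assuming no IP options, no IPv6 extension
# headers and no TCP options.
my %ip_header = (ipv4 => 20, ipv6 => 40);
my $tcp_header = 20;

# Largest TCP payload that fits into a single packet for a given link MTU.
sub max_tcp_payload {
    my ($mtu, $ip_version) = @_;
    my $ip = $ip_header{$ip_version} // die "unknown IP version '$ip_version'\n";
    return $mtu - $ip - $tcp_header;
}

for my $mtu (1500, 9000) {
    for my $ver (qw(ipv4 ipv6)) {
        printf "MTU %4d %s: %d bytes payload\n", $mtu, $ver, max_tcp_payload($mtu, $ver);
    }
}
```

Any HTTP response header still has to come out of that payload, so the usable
size for the log itself is smaller and varies per request.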

That all said, we could still use 1500 as cut-off, nothing inherently against
that per se, but the comment should not refer to the confusing "TCP MTU" and
should mention that the actual boundary depends on the MTU and protocol overhead.

For finding a cut-off we could look at the size distribution of task logs in
existing (non-test) instances, e.g. with:

find /var/log/pve/tasks/ -mindepth 2 -type f -print0 | xargs -0 ls -l | awk '{size[int(log($5)/log(2))]++}END{for (i in size) printf("%10d %3d\n", 2^i, size[i])}' | sort -n

For three relatively active and long-lived hosts this gives:
      Size      A      B     C
                       1
         8    567    669   121
        16      3     28
        32     40    106     6
        64     60     23
       128     63     28
       256     22     25     8
       512      8     12     4
      1024      8     32    40
      2048      2     12    17
      4096      5      1
      8192     14      1   695
     16384     18            1
     32768     24
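
If someone prefers that without the shell pipeline, the same power-of-two
bucketing could be sketched in Perl roughly like this. The helper names are
made up, and unlike the find call above it counts all regular files below the
directory rather than only those at depth 2 and below:

```perl
use strict;
use warnings;
use File::Find;

# Lower bound of the power-of-two bucket a file size falls into,
# e.g. 1500 -> 1024. Empty files get their own bucket 0.
sub size_bucket {
    my ($size) = @_;
    return 0 if $size < 1;
    my $bucket = 1;
    $bucket *= 2 while $bucket * 2 <= $size;
    return $bucket;
}

# Count the regular files below $dir per size bucket.
sub size_histogram {
    my ($dir) = @_;
    my %count;
    find(sub {
        return if !-f $_;
        my $size = (-s _) || 0; # -s is false for empty files
        $count{size_bucket($size)}++;
    }, $dir);
    return \%count;
}

# Directory is taken from the first argument, defaulting to the PVE task logs.
my $dir = $ARGV[0] // '/var/log/pve/tasks';
if (-d $dir) {
    my $hist = size_histogram($dir);
    printf "%10d %6d\n", $_, $hist->{$_} for sort { $a <=> $b } keys %$hist;
}
```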

I then compressed all files with gzip and reran:

        32   1
        64 596
       128 154
       256  14
       512  13
      1024   4
      2048  14
      4096  18
      8192  24

        32   1
        64 784
       128  74
       256  38
       512  41
      1024   2
      4096   1
      8192   1

So, I'd just use 1024 as cut-off. That isn't bound to dynamic limits like the
actual TCP payload size per packet, but will still fit into a single packet in
most setups, and it captures most of the previously selected files in practice
anyway. Also, gzip is normally able to compress text quite well, so compressing
log text files (i.e., not random data) bigger than 1 KiB will almost never exceed
the uncompressed size. If you'd rather have a higher limit you can also use 2 or
4 KiB, both fine too IMO.
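
A quick way to see the gzip effect on both ends of the size range, as a sketch
only: the log content below is made up, and gzip_size is a hypothetical helper
built on the core IO::Compress::Gzip module:

```perl
use strict;
use warnings;
use IO::Compress::Gzip qw(gzip $GzipError);

# Return the gzip-compressed size in bytes of an in-memory string.
sub gzip_size {
    my ($data) = @_;
    my $out;
    gzip(\$data => \$out) or die "gzip failed: $GzipError\n";
    return length($out);
}

# Made-up task log content: a tiny status line and a longer, repetitive log.
my $tiny = "TASK OK\n";
my $log = join('', map { "2022-10-05T14:30:$_ INFO: transferred chunk $_ of 60\n" } 10 .. 59);

printf "%-5s raw %5d gzip %5d\n", 'tiny', length($tiny), gzip_size($tiny);
printf "%-5s raw %5d gzip %5d\n", 'log', length($log), gzip_size($log);
```

The tiny input grows because the gzip header and trailer alone are larger than
the data, while the repetitive log above 1 KiB shrinks; that is the reason for
having a cut-off at all.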

> +	    my $use_compression = stat($filename)->size > 1500;

should be able to use: -s $filename > X

> +	    return PVE::Tools::stream_file($filename, $param->{upid}, $use_compression);

IMO the helper isn't too useful, I'd just do this inline.

> +	} else {

no else required, other branch returns.
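
Roughly the shape I have in mind, as a standalone sketch and not the actual
Tasks.pm code: should_compress, read_task_log and both callbacks are
placeholders for illustration, the cut-off value is the one suggested above,
and the defined() check is only there so the sketch runs on its own:

```perl
use strict;
use warnings;

# Hypothetical cut-off in bytes; 2 or 4 KiB would work as well.
my $COMPRESS_CUTOFF = 1024;

# Decide whether a file is large enough for compression to pay off.
# -s yields the size in bytes and a false value for empty or missing files.
sub should_compress {
    my ($filename) = @_;
    return ((-s $filename) || 0) > $COMPRESS_CUTOFF ? 1 : 0;
}

# The limit == 0 branch returns early, so the paged path needs no else block.
# $stream_cb and $dump_cb stand in for the streaming and line-dumping code.
sub read_task_log {
    my ($filename, $limit, $stream_cb, $dump_cb) = @_;

    if (defined($limit) && $limit == 0) {
        return $stream_cb->($filename, should_compress($filename));
    }

    return $dump_cb->($filename, $limit);
}
```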

> +	    my ($count, $lines) = PVE::Tools::dump_logfile($filename, $start, $limit);
>  
> -	$rpcenv->set_result_attrib('total', $count);
> +	    $rpcenv->set_result_attrib('total', $count);
>  
> -	return $lines;
> +	    return $lines;
> +	}
>      }});
>  
>