[pve-devel] [PATCH v4 pve-storage 06/16] cephconfig: support line-continuations in parser

Max Carrara m.carrara at proxmox.com
Wed Mar 20 17:59:12 CET 2024


On Tue Mar 19, 2024 at 4:59 PM CET, Max Carrara wrote:
> On Tue Mar 19, 2024 at 10:37 AM CET, Fabian Grünbichler wrote:
> > On March 5, 2024 4:07 pm, Max Carrara wrote:
> > > Ceph's docs state the following [0]:
> > >> The backslash character `\` is used as the line-continuation marker
> > >> that combines the next line with the current one.
> > > 
> > > This commit implements the support of such line-continuations in our
> > > parser.
> > > 
> > > The line following a line ending with a '\' has its whitespace
> > > preserved, which matches the behaviour in Ceph's original
> > > implementation [1]. In other words, leading and trailing whitespace is
> > > not stripped from a continued line.
> >
> > it's actually a bit more complicated.. ceph only supports line
> > continuations inside values (well, in key value lines after the key ;)),
> > and only if they are unquoted..

Upon further research and confirming the behaviour via `ceph-conf`
(thanks for the tip btw!) line continuations are in fact supported in
different parts as well.

Consider the following example 'ceph.conf' file:
```
[clie\
nt]      # some comment
foo\
\
\
\
= \
bar
```

The continued `client` section header does actually get parsed by
`ceph-conf` without any issues - the trailing comment and whitespace
are also ignored.

Where it gets really interesting is the continuation right after 'foo':
Because keys are defined using `raw[]` [0], whatever is skipped by the
parser is still included in the parsed output [1].

This has the consequence that the four continued lines are in fact not
skipped and instead read as literal newline characters.

After the equals sign, the line continuation is skipped as expected.

By providing literal newlines via the shell, the above can easily be
verified:

$ ceph-conf -c ceph_cancer.conf -s client foo^M^M^M^M
bar

(The ^M is a literal newline and can usually be obtained by typing
CTRL+V, Enter in your shell.)

To make matters even worse, quoted values may in fact be *directly*
followed by continuations (`ceph-conf` fails otherwise):

```
[client]
foo = "bar"\

baz = qux
```

The above is considered "correct" because the escaped newline counts as
whitespace. If you were to put some spaces into the empty line after the
"foo" key, these would be skipped as well.

For completeness's sake, this also parses:

```
[client]
foo = "bar"\
   # some comment
baz = qux
```

However, the following is invalid:

```
[client]
foo = "bar"\
baz = qux
```

... because the parser sees:

```
[client]
foo = "bar"baz = qux
```

... which is not allowed, because a quoted value may only be followed
what the grammar defines as "empty_line" [2].

So, this doesn't really make the parsing logic regarding
line continuations any simpler:

  1. Section headers may contain line continuations
  2. Section headers may be followed by whitespace + comments (after ']'
  3. Keys are parsed "raw" and may therefore be continued
     --> Will probably just not handle this case, as there are no config
     keys that contain newline characters or anything of the sort
     - why would there be? Why would a user need this?
  4. Unquoted values may contain line continuations
  5. Quoted values may be *directly* followed by a line continuation
     character, as long as the remaining stuff is whitespace or a
     comment
  6. Bonus point: Quoted values MUST NOT *contain* line continuations,
     as they're parsed as `lexeme[]`s [3]

... so, see you in v5 ;)

[0]: https://git.proxmox.com/?p=ceph.git;a=blob;f=ceph/src/common/ConfUtils.cc;h=2f78fd02bf9e27467275752e6f3bca0c5e3946ce;hb=refs/heads/master#l182
[1]: https://www.boost.org/doc/libs/1_53_0/libs/spirit/doc/html/spirit/qi/reference/directive/raw.html
[2]: https://git.proxmox.com/?p=ceph.git;a=blob;f=ceph/src/common/ConfUtils.cc;h=2f78fd02bf9e27467275752e6f3bca0c5e3946ce;hb=refs/heads/master#l188
[3]: https://www.boost.org/doc/libs/1_53_0/libs/spirit/doc/html/spirit/qi/reference/directive/lexeme.html

>
> As mentioned in my other reply, I'll probably have to revise the whole
> parsing logic to take that into account... but thanks for being so
> thorough!
>
> >
> > > 
> > > [0]: https://docs.ceph.com/en/reef/rados/configuration/ceph-conf/#changes-introduced-in-octopus
> > > [1]: https://git.proxmox.com/?p=ceph.git;a=blob;f=ceph/src/common/ConfUtils.cc;h=2f78fd02bf9e27467275752e6f3bca0c5e3946ce;hb=refs/heads/master#l262
> > > 
> > > Signed-off-by: Max Carrara <m.carrara at proxmox.com>
> > > ---
> > > Changes v2 --> v3:
> > >   * new
> > > Changes v3 --> v4:
> > >   * none
> > > 
> > >  src/PVE/CephConfig.pm | 28 ++++++++++++++++++++++++----
> > >  1 file changed, 24 insertions(+), 4 deletions(-)
> > > 
> > > diff --git a/src/PVE/CephConfig.pm b/src/PVE/CephConfig.pm
> > > index 74a92eb..80f71b0 100644
> > > --- a/src/PVE/CephConfig.pm
> > > +++ b/src/PVE/CephConfig.pm
> > > @@ -19,13 +19,33 @@ sub parse_ceph_config {
> > >      return $cfg if !defined($raw);
> > >  
> > >      my @lines = split /\n/, $raw;
> > > +    my @lines_normalized;
> > > +
> > > +    my $re_comment_not_escaped = qr/(?<!\\)(#|;).*$/;
> > > +    my $re_leading_ws = qr/^\s+/;
> > > +    my $re_trailing_ws = qr/\s+$/;
> > > +
> > > +    while (scalar(@lines)) {
> > > +	my $line = shift(@lines);
> > > +	$line =~ s/$re_comment_not_escaped//;
> > > +	$line =~ s/$re_leading_ws//;
> > > +	$line =~ s/$re_trailing_ws//;
> > > +	next if !$line;
> > > +
> > > +	# merge lines ending with continuation character '\'
> > > +	while ($line =~ s/\\$//) {
> > > +	    my $next_line = shift(@lines);
> > > +	    $next_line =~ s/$re_comment_not_escaped//;
> > > +	    $next_line =~ s/$re_trailing_ws//;
> > > +	    $line .= $next_line;
> > > +	}
> > > +
> > > +	push(@lines_normalized, $line);
> > > +    }
> > >  
> > >      my $section;
> > >  
> > > -    for my $line (@lines) {
> > > -	$line =~ s/(?<!\\)(#|;).*$//;
> > > -	$line =~ s/^\s+//;
> > > -	$line =~ s/\s+$//;
> > > +    for my $line (@lines_normalized) {
> > >  	next if !$line;
> > >  
> > >  	if ($line =~ m/^\[(.+)\]$/) {
> > > -- 
> > > 2.39.2
> > > 
> > > 
> > > 
> > > _______________________________________________
> > > pve-devel mailing list
> > > pve-devel at lists.proxmox.com
> > > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel
> > > 
> > > 
> > > 
> >
> >
> > _______________________________________________
> > pve-devel mailing list
> > pve-devel at lists.proxmox.com
> > https://lists.proxmox.com/cgi-bin/mailman/listinfo/pve-devel





More information about the pve-devel mailing list