[pve-devel] firewall : possible bug/race when cluster.fw is replicated and rules are updated ?
Stefan Priebe - Profihost AG
s.priebe at profihost.ag
Tue Jan 8 21:59:44 CET 2019
Hi Alexandre,
Am 08.01.19 um 21:55 schrieb Alexandre DERUMIER:
>>> But, file_set_contents - which save_clusterfw_conf uses - does this already[0],
>>> so maybe this is the "high-level fuse rename isn't atomic" bug again...
>>> May need to take a closer look tomorrow.
>
> mmm, ok.
>
> In my case, it was with a simple file copy (cp /tmp/cluster.fw /etc/pve/firewall/cluster.fw). (I manage cluster.fw through Puppet for multiple clusters.)
> I can reproduce it too with a simple edit in vi.
>
> I think other users could trigger this too, if they use scripts to automate ipsets (blacklists, ...).
>
> Maybe it would be better to disable the firewall only when the option is explicitly set to "enable: 0", and to leave the rules unchanged if cluster.fw is missing.
For those cases I use something like this (pseudocode; I use Salt, not Puppet):
- manage a copy of the file
- if the file has changed, trigger:
  - mv -v $managedfile $realfile
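
The steps above could be sketched in shell roughly as follows. This is only a minimal sketch under assumptions: the function name install_atomically and the ".tmp.$$" suffix are made up for illustration, and it copies the managed file to a temporary name next to the target instead of moving it directly, so that the final mv is a rename within one filesystem. Whether such a rename is really atomic on /etc/pve is exactly the open question in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of any PVE tooling):
#   install_atomically MANAGED_FILE REAL_FILE
install_atomically() {
    managed=$1
    real=$2

    # Skip the update entirely when the content has not changed.
    if [ -e "$real" ] && cmp -s "$managed" "$real"; then
        return 0
    fi

    # Stage the new content next to the target, so the final mv is a
    # rename on the same filesystem rather than a copy plus unlink.
    tmp="$real.tmp.$$"
    cp "$managed" "$tmp" || return 1

    # Replace the target in one step; clean up the staging file on failure.
    mv "$tmp" "$real" || { rm -f "$tmp"; return 1; }
}
```

It would be called as, for example, install_atomically /tmp/cluster.fw /etc/pve/firewall/cluster.fw. A plain mv from /tmp crosses filesystems and falls back to copy plus delete, so readers may still see a partially written file in that case.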
Greets,
Stefan
>
>
>
> ----- Original Message -----
> From: "Thomas Lamprecht" <thomas at lamprecht.org>
> To: "pve-devel" <pve-devel at pve.proxmox.com>, "aderumier" <aderumier at odiso.com>
> Sent: Tuesday, January 8, 2019 20:58:51
> Subject: Re: [pve-devel] firewall : possible bug/race when cluster.fw is replicated and rules are updated ?
>
> Hi,
>
> On 1/8/19 7:37 PM, Alexandre DERUMIER wrote:
>> I'm able to reproduce with:
>> ---------------------------
>> on 1 host:
>>
>> cluster.fw:
>> [OPTIONS]
>>
>> enable: 1
>> policy_in: ACCEPT
>>
>>
>>
>>
>> #!/usr/bin/perl
>>
>> use IO::File;
>> use PVE::Firewall;
>> use Data::Dumper;
>> use Time::HiRes qw ( time alarm sleep usleep );
>>
>> while(1){
>>
>>     $filename = "/etc/pve/firewall/cluster.fw";
>>
>>     if (my $fh = IO::File->new($filename, O_RDONLY)) {
>>
>>         $cluster_conf = PVE::Firewall::parse_clusterfw_config($filename, $fh, $verbose);
>>         my $cluster_options = $cluster_conf->{options};
>>
>>         if (!$cluster_options->{enable}) {
>>             print Dumper($cluster_options);
>>             die "error\n";
>>         }
>>
>>     }
>>     usleep(100);
>> };
>>
>>
>> the script is running fine.
>>
>>
>> on another host, edit the file (simple open/write),
>> then the script on the first host returns:
>>
>> $VAR1 = {};
>> error
>
> That is expected, AFAICT. A modify operation shouldn't be:
> * read FILE -> modify -> write FILE
> but rather:
> * read FILE -> modify -> write FILE.TMP -> move FILE.TMP to FILE
> if readers should always see valid content. Otherwise, yes, there may be a small
> time window in which the file is truncated.
>
> But, file_set_contents - which save_clusterfw_conf uses - does this already[0],
> so maybe this is the "high-level fuse rename isn't atomic" bug again...
> May need to take a closer look tomorrow.
>
> [0]: https://git.proxmox.com/?p=pve-common.git;a=blob;f=src/PVE/Tools.pm;h=accf6539da94d2b5d5b6f4539310fe5c4d526c7e;hb=HEAD#l213
>
>>
>> ----- Original Message -----
>> From: "aderumier" <aderumier at odiso.com>
>> To: "pve-devel" <pve-devel at pve.proxmox.com>
>> Sent: Tuesday, January 8, 2019 19:15:06
>> Subject: [pve-devel] firewall : possible bug/race when cluster.fw is replicated and rules are updated ?
>>
>> Hi,
>> I'm currently debugging a possible firewalling problem.
>> I'm running some CephFS clients in VMs, firewalled by Proxmox.
>> CephFS clients are really sensitive to network problems, mainly packet loss or dropped packets.
>>
>> I'm really not sure, but I currently have Puppet updating my cluster.fw at regular intervals,
>> and sometimes all the VMs on a specific host (or on multiple hosts) have a short disconnect at the same time (maybe a few seconds).
>>
>>
>> I would like to know whether cluster.fw replication is atomic in /etc/pve/,
>> or whether there is any chance that, if the firewall tries to read the file during replication,
>> it could be empty?
>>
>>
>> I just wonder (I'm really really not sure) if I could trigger this:
>>
>>
>> sub update {
>>     my $code = sub {
>>
>>         my $cluster_conf = load_clusterfw_conf();
>>         my $cluster_options = $cluster_conf->{options};
>>
>>         if (!$cluster_options->{enable}) {
>>             PVE::Firewall::remove_pvefw_chains();
>>             return;
>>         }
>>
>>
>> That is: cluster.conf is not readable/absent/...., and remove_pvefw_chains gets called.
>> Then, after some seconds, the rules are applied again.
>>
>>
>> I'm going to add some logging to try to reproduce it. (BTW, it would be great to log rule changes; maybe an audit log with a diff.)
>> _______________________________________________
>> pve-devel mailing list
>> pve-devel at pve.proxmox.com
>> https://pve.proxmox.com/cgi-bin/mailman/listinfo/pve-devel