[pve-devel] pvefw: masquerade problems and conntrack zones

Alexandre DERUMIER aderumier at odiso.com
Wed Mar 12 04:05:23 CET 2014


OK, I have run some tests (I couldn't sleep tonight ;)

I get the same result as you.
Routing works perfectly (veth seems to work wonderfully for this),
but masquerade/SNAT doesn't work.


host config
-----------
auto vmbr1
iface vmbr1 inet manual
        bridge_ports bond0.94
        bridge_stp off
        bridge_fd 0
        post-up echo 0 > /sys/devices/virtual/net/vmbr0/bridge/multicast_snooping


auto pm1
iface pm1 inet static
        address 10.3.94.31
        netmask 255.255.255.0
        gateway 10.3.94.1
        VETH_BRIDGETO vmbr1



auto vmbr14
iface vmbr14 inet manual
        bridge_ports none
        bridge_stp off
        bridge_fd 0

auto pm14
iface pm14 inet static
        address 10.2.0.1
        netmask 255.255.255.0
        VETH_BRIDGETO vmbr14



iptables -t nat -A POSTROUTING -s 10.2.0.100/32 -o pm1 -j MASQUERADE


guest config (plugged into vmbr14)
----------------------------------
auto eth0
iface eth0 inet static
        address 10.2.0.100
        netmask 255.255.255.0
        gateway 10.2.0.1



test from guest : ping 8.8.8.8

iptables logs:
--------------
Mar 12 02:57:26 kvmtest1 kernel: PREROUTING: IN=vmbr14 OUT= PHYSIN=tap110i0 MAC=a6:16:41:ea:75:88:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=55364 DF PROTO=ICMP TYPE=8 CODE=0 ID=2047 SEQ=1 
Mar 12 02:57:26 kvmtest1 kernel: PVEFW-FORWARD: IN=vmbr14 OUT=vmbr14 PHYSIN=tap110i0 PHYSOUT=pm14peer SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=55364 DF PROTO=ICMP TYPE=8 CODE=0 ID=2047 SEQ=1 
Mar 12 02:57:26 kvmtest1 kernel: PVEFW-FORWARD: IN=vmbr14 OUT=vmbr14 PHYSIN=tap110i0 PHYSOUT=pm14peer SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=55364 DF PROTO=ICMP TYPE=8 CODE=0 ID=2047 SEQ=1 
Mar 12 02:57:26 kvmtest1 kernel: POSTROUTING: IN= OUT=vmbr14 PHYSIN=tap110i0 PHYSOUT=pm14peer SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=55364 DF PROTO=ICMP TYPE=8 CODE=0 ID=2047 SEQ=1 MARK=0x1 
Mar 12 02:57:26 kvmtest1 kernel: PVEFW-FORWARD: IN=pm14 OUT=pm1 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=55364 DF PROTO=ICMP TYPE=8 CODE=0 ID=2047 SEQ=1 
Mar 12 02:57:26 kvmtest1 kernel: PVEFW-FORWARD: IN=pm14 OUT=pm1 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=55364 DF PROTO=ICMP TYPE=8 CODE=0 ID=2047 SEQ=1 
Mar 12 02:57:26 kvmtest1 kernel: PVEFW-FORWARD: IN=vmbr1 OUT=vmbr1 PHYSIN=pm1peer PHYSOUT=bond0.94 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=55364 DF PROTO=ICMP TYPE=8 CODE=0 ID=2047 SEQ=1 


So POSTROUTING is hit only once, on the bridge (OUT=vmbr14), and never for the
routed packet (OUT=pm1), so the MASQUERADE rule never matches and no NAT is done.
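
To compare traces like the one above more easily, the hook sequence can be
pulled out of the log lines. This is only a minimal sketch, assuming the LOG
prefixes used in these tests (PREROUTING, PVEFW-FORWARD, POSTROUTING);
summarize_hooks is a hypothetical helper name, not part of pvefw:

```shell
# summarize_hooks: read netfilter LOG lines on stdin and print one short
# line per hook: "<chain> in=<IN> out=<OUT> src=<SRC>".
# Empty IN=/OUT= fields are shown as "-".
summarize_hooks() {
    awk '
    {
        chain = ""; in_if = "-"; out_if = "-"; src = "-"
        for (i = 1; i <= NF; i++) {
            if ($i ~ /^(PREROUTING|POSTROUTING|PVEFW-FORWARD):$/) {
                chain = $i
                sub(/:$/, "", chain)          # drop the trailing colon
            } else if ($i ~ /^IN=./) {
                in_if = substr($i, 4)         # PHYSIN= does not match ^IN=
            } else if ($i ~ /^OUT=./) {
                out_if = substr($i, 5)
            } else if ($i ~ /^SRC=/) {
                src = substr($i, 5)
            }
        }
        if (chain != "") print chain, "in=" in_if, "out=" out_if, "src=" src
    }'
}
```

Something like "dmesg | summarize_hooks" then gives one line per hook, so a
missing POSTROUTING with out=pm1, or a SRC that never changes, is easy to spot.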


adding iptables -t raw -A PREROUTING -s '10.2.0.100/32' -i vmbr14 -j CT --zone 1
----------------------------------------------------------------------------------
Now the packet is correctly NATed on the way out, but the reply doesn't seem to make it back correctly.

Mar 12 03:47:27 kvmtest1 kernel: [  705.085259] PREROUTING: IN=vmbr14 OUT= PHYSIN=tap110i0 MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2004 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085282] PVEFW-FORWARD: IN=vmbr14 OUT=vmbr14 PHYSIN=tap110i0 PHYSOUT=pm14peer MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2004 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085292] PVEFW-FORWARD: IN=vmbr14 OUT=vmbr14 PHYSIN=tap110i0 PHYSOUT=pm14peer MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2004 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085312] POSTROUTING: IN= OUT=vmbr14 PHYSIN=tap110i0 PHYSOUT=pm14peer SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2004 SEQ=1 MARK=0x1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085338] PREROUTING: IN=pm14 OUT= MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2004 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085355] PVEFW-FORWARD: IN=pm14 OUT=pm1 MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2004 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085365] PVEFW-FORWARD: IN=pm14 OUT=pm1 MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2004 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085374] POSTROUTING: IN= OUT=pm1 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2004 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085399] PREROUTING: IN=vmbr1 OUT= PHYSIN=pm1peer MAC=00:08:7c:bd:ae:40:46:f8:0d:60:cc:81:08:00 SRC=10.3.94.31 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2004 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085413] PVEFW-FORWARD: IN=vmbr1 OUT=vmbr1 PHYSIN=pm1peer PHYSOUT=bond0.94 MAC=00:08:7c:bd:ae:40:46:f8:0d:60:cc:81:08:00 SRC=10.3.94.31 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.085423] POSTROUTING: IN= OUT=vmbr1 PHYSIN=pm1peer PHYSOUT=bond0.94 SRC=10.3.94.31 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25454 DF PROTO=ICMP TYPE=8 CODE=0 ID=2 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.097420] PVEFW-FORWARD: IN=vmbr14 OUT=vmbr14 PHYSIN=pm14peer PHYSOUT=tap110i0 MAC=1e:0b:85:27:8d:65:3a:3d:76:04:9d:37:08:00 SRC=8.8.8.8 DST=10.2.0.100 LEN=84 TOS=0x08 PREC=0x40 TTL=46 ID=0 PROTO=ICMP TYPE=0 CODE=0 ID=2004 SEQ=1 
Mar 12 03:47:27 kvmtest1 kernel: [  705.097435] PVEFW-FORWARD: IN=vmbr14 OUT=vmbr14 PHYSIN=pm14peer PHYSOUT=tap110i0 MAC=1e:0b:85:27:8d:65:3a:3d:76:04:9d:37:08:00 SRC=8.8.8.8 DST=10.2.0.100 LEN=84 TOS=0x08 PREC=0x40 TTL=46 ID=0 PROTO=ICMP TYPE=0 CODE=0 ID=2004 SEQ=1 


adding iptables -t raw -A PREROUTING -d '10.2.0.100/32' -i vmbr14 -j CT --zone 1
-------------------------------------------------------------------------------
Now it works.

Mar 12 03:50:08 kvmtest1 kernel: [  865.632159] PREROUTING: IN=vmbr14 OUT= PHYSIN=tap110i0 MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=2009 SEQ=1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632182] PVEFW-FORWARD: IN=vmbr14 OUT=vmbr14 PHYSIN=tap110i0 PHYSOUT=pm14peer MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=2009 SEQ=1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632192] PVEFW-FORWARD: IN=vmbr14 OUT=vmbr14 PHYSIN=tap110i0 PHYSOUT=pm14peer MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=2009 SEQ=1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632213] POSTROUTING: IN= OUT=vmbr14 PHYSIN=tap110i0 PHYSOUT=pm14peer SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=2009 SEQ=1 MARK=0x1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632239] PREROUTING: IN=pm14 OUT= MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=64 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=2009 SEQ=1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632256] PVEFW-FORWARD: IN=pm14 OUT=pm1 MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=2009 SEQ=1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632265] PVEFW-FORWARD: IN=pm14 OUT=pm1 MAC=3a:3d:76:04:9d:37:1e:0b:85:27:8d:65:08:00 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=2009 SEQ=1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632274] POSTROUTING: IN= OUT=pm1 SRC=10.2.0.100 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=2009 SEQ=1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632299] PREROUTING: IN=vmbr1 OUT= PHYSIN=pm1peer MAC=00:08:7c:bd:ae:40:46:f8:0d:60:cc:81:08:00 SRC=10.3.94.31 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=2009 SEQ=1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632314] PVEFW-FORWARD: IN=vmbr1 OUT=vmbr1 PHYSIN=pm1peer PHYSOUT=bond0.94 MAC=00:08:7c:bd:ae:40:46:f8:0d:60:cc:81:08:00 SRC=10.3.94.31 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=7 SEQ=1 
Mar 12 03:50:08 kvmtest1 kernel: [  865.632324] POSTROUTING: IN= OUT=vmbr1 PHYSIN=pm1peer PHYSOUT=bond0.94 SRC=10.3.94.31 DST=8.8.8.8 LEN=84 TOS=0x00 PREC=0x00 TTL=63 ID=25462 DF PROTO=ICMP TYPE=8 CODE=0 ID=7 SEQ=1 
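
For reference, the three rules from this test can be gathered in one place.
This is a rough sketch, not a proposal for how pvefw should generate them: the
variable names, the run/setup_guest_nat helpers and the dry-run default are
assumptions of the sketch, and the MASQUERADE rule is written with an explicit
"-t nat", since MASQUERADE is only valid in the nat table.

```shell
#!/bin/sh
# Sketch only: collects the rules used in the test above.
# With DRY_RUN=1 (the default here) the commands are printed, not executed.
GUEST_IP="10.2.0.100/32"   # guest address from the test
GUEST_BR="vmbr14"          # bridge the guest tap is plugged into
WAN_IF="pm1"               # veth interface holding the outgoing address
ZONE=1                     # any zone other than the default zone 0

run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

setup_guest_nat() {
    # Put the bridged traversal of guest traffic (both directions) into
    # its own conntrack zone ...
    run iptables -t raw -A PREROUTING -s "$GUEST_IP" -i "$GUEST_BR" -j CT --zone "$ZONE"
    run iptables -t raw -A PREROUTING -d "$GUEST_IP" -i "$GUEST_BR" -j CT --zone "$ZONE"
    # ... so that, as the traces above suggest, the routed traversal is seen
    # as a new connection in zone 0 and reaches nat POSTROUTING.
    run iptables -t nat -A POSTROUTING -s "$GUEST_IP" -o "$WAN_IF" -j MASQUERADE
}

setup_guest_nat
```

Run as is, it only prints the three iptables commands; with DRY_RUN=0 (as root,
on a kernel with the CT target and zones) it would apply them.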




So it seems that POSTROUTING is traversed only once per conntrack zone, or something like that.

From the original commit introducing conntrack zones:
"
https://lists.linux-foundation.org/pipermail/containers/2010-January/022514.html

This is mainly useful when connecting multiple private networks using
the same addresses (which unfortunately happens occasionally) to pass
the packets through a set of veth devices and SNAT each network to a
unique address, after which they can pass through the "main" zone and
be handled like regular non-clashing packets and/or have NAT applied a
second time based f.i. on the outgoing interface.

Something like this, with multiple tunl and veth devices, each pair
using a unique zone:

  <tunl0 / zone 1>
     |
  PREROUTING
     |
  FORWARD
     |
  POSTROUTING: SNAT to unique network
     |
  <veth1 / zone 1>
  <veth0 / zone 0>
     |
  PREROUTING
     |
  FORWARD
     |
  POSTROUTING: SNAT to eth0 address
     |
  <eth0>
"


So I think using zones is the right way to go, but they are not available in 2.6.32.
(Maybe they can be backported easily?)
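
Whether a given kernel was built with zone support can be checked from its
build config. A small sketch, assuming the config file is available on disk
(the /boot/config-$(uname -r) path in the usage note is the Debian convention
and an assumption here); has_ct_zones is a hypothetical helper name:

```shell
# has_ct_zones FILE: succeed if the kernel config in FILE enables
# conntrack zones (CONFIG_NF_CONNTRACK_ZONES=y).
has_ct_zones() {
    grep -q '^CONFIG_NF_CONNTRACK_ZONES=y$' "$1"
}
```

Usage would be something like:
has_ct_zones "/boot/config-$(uname -r)" && echo "conntrack zones available"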


----- Original Message -----

From: "Dietmar Maurer" <dietmar at proxmox.com>
To: "Alexandre DERUMIER" <aderumier at odiso.com>
Cc: pve-devel at pve.proxmox.com
Sent: Tuesday, 11 March 2014 17:12:06
Subject: RE: [pve-devel] pvefw: masquerade problems and conntrack zones

> >>I guess arp is not very reliable, and we currently do not even have IPs on
> >>network interfaces.
> >>
> >>IMHO it is better to spend time writing an OVS controller instead of
> >>adding such hacks.
>
> OK, sure, no problem. I'll try with veth, now that I understand correctly what
> you want.

Sorry for the confusion. 


