Given the following network architecture: a virtual bridge br0 is connected to a remote Ethernet switch through a vxlan tunnel over WLAN:
                  host [br0: tap0,vxlan0]
                              |     ||
                              |     ===========
                              |              ||
                              |              ||
VM (WLAN access point) [br0: eth0, wlan0]    ||
                                     |       ||
                                     |       ||
                         -------------       ||
                         |                   ||
                  STA [wlan0, br0: eth0, vxlan0]
                                    |
                                    |
                  |----------------------------------|
                                 Switch
                                    |
                           ----------
                           |
                notebook [eth0]
The configuration of the vxlan is:
host: route add -net 224.0.0.0 netmask 240.0.0.0 dev br0
      ip li add vxlan0 type vxlan id 1 group 239.1.1.1 dev br0
STA:  route add -net 224.0.0.0 netmask 240.0.0.0 dev wlan0
      ip li add vxlan0 type vxlan id 1 group 239.1.1.1 dev wlan0
This means: the endpoints of the vxlan tunnel are br0 on the host and wlan0 on the STA.
Between them sits the WLAN AP (a VM running on the host).
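To double-check this setup on either endpoint, something like the following sketch could be used. It only prints the read-only inspection commands by default (DRY_RUN=1); the helper names run and check_vxlan are made up for illustration, and the device names are the ones from the configuration above.

```shell
# Sketch: print (DRY_RUN=1, the default) or run read-only checks of a vxlan endpoint.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = vxlan device, $2 = underlying device passed as "dev" when creating the tunnel
check_vxlan() {
    run ip -d link show dev "$1"    # tunnel details: id, group, underlying device
    run ip maddr show dev "$2"      # multicast groups joined on the underlying device
    run bridge fdb show dev "$1"    # forwarding entries known for the tunnel
}

check_vxlan vxlan0 br0              # host side; on the STA use: check_vxlan vxlan0 wlan0
```

Run with DRY_RUN=0 (as root) to actually execute the commands instead of printing them.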
Now the problem:
If the VM (=AP) runs e.g. Linux 3.4.x, everything works as expected.
If the VM runs 3.12.x or even 3.10.x, the tunnel works for a few minutes after creation and then breaks.
Broken means:
A "dhcpcd eth0", e.g. on the notebook, times out and no longer works. Traces show:
The UDP tunnel packets sent by the STA through vxlan0 can be seen on the host on tap0, but not on vxlan0 (when the tunnel works, they can be seen on the vxlan0 device, too).
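For reference, the comparison between the two devices can be reproduced roughly as sketched below. The port is an assumption: 8472 is the Linux default for vxlan when no other port has been configured, so adjust VXLAN_PORT if your setup differs. capture_cmds is a made-up helper that only prints the tcpdump invocations.

```shell
# Sketch: print the two capture commands used to compare encapsulated and inner traffic.
# VXLAN_PORT is an assumption (Linux default 8472); override it if needed.
VXLAN_PORT="${VXLAN_PORT:-8472}"

# $1 = device carrying the encapsulated UDP packets, $2 = vxlan device
capture_cmds() {
    echo "tcpdump -n -e -i $1 udp port $VXLAN_PORT"   # outer UDP packets
    echo "tcpdump -n -e -i $2"                        # decapsulated inner frames
}

capture_cmds tap0 vxlan0
```

In the broken state described above, the first capture still shows packets while the second one stays empty.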
The host runs Linux 3.10.x, the STA runs 3.11.6.
Any idea why vxlan is broken w/ Linux 3.12.x or 3.10.x on the VM (AP)?
Thanks in advance for any hint,
regards,
Andreas
On Fri, 3 Jan 2014 15:27:19 +0100
Andreas Hartmann <[email protected]> wrote:
[...]
> Now the problem:
>
> If the VM (=AP) runs e.g. Linux 3.4.x, all is working fine as expected.
> If the VM runs 3.12.x or even 3.10.x, the tunnel works fine a few minutes after creation. Afterwards it is broken.
>
> Broken means:
> A "dhcpcd eth0" e.g. on the notebook times out, doesn't work any more. Traces show:
> The udp-tunnel-packages sent by the STA through vxlan0 can be seen on the host / tap0, but they can't be seen on vxlan0 (if it works, they can be seen on the vxlan0 device, too).
>
> On the host runs Linux 3.10.x, on the STA 3.11.6.
Some more findings:
- The problem can be seen with Linux 3.7 in the AP (VM), too.
- *The problem disappears* if the bridge device br0 on the host is set to
  promiscuous mode.
- Sometimes the warning
  "notebook dhcpcd[2784]: eth0: bad UDP checksum, ignoring"
  appears when starting dhcpcd on the notebook with br0 on the host set to
  promiscuous mode (dhcpcd nevertheless worked fine). I never saw this
  warning before.
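For anyone who wants to reproduce the promiscuous-mode workaround, a minimal sketch follows. It prints the command by default (DRY_RUN=1); run and set_promisc are made-up helper names, and br0 is the bridge on the host from the setup above.

```shell
# Sketch: switch promiscuous mode on a device on or off (dry-run by default).
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# $1 = device, $2 = on|off
set_promisc() {
    case "$2" in
        on|off) run ip link set dev "$1" promisc "$2" ;;
        *) echo "usage: set_promisc DEV on|off" >&2; return 1 ;;
    esac
}

set_promisc br0 on
```

With DRY_RUN=0 (as root) the command is executed; "set_promisc br0 off" reverts it.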
Any idea how to fix the problem w/o running the bridge br0 on the host
in promiscuous mode?
Thanks for any hint,
Andreas
Hi!
For anyone else having problems with broken multicast:
see the solution here:
http://article.gmane.org/gmane.linux.kernel/1625590
Regards,
Andreas