Leaving this here in the hopes of helping some desperate future Googler…
I set up a VLAN trunk on a fresh install of Ubuntu Server 16.04.4 LTS that looks like this:
- bond0: native LAN (192.168.1.7/24)
- bond0.10: VLAN10 (192.168.10.7/24)
N.B. It doesn’t matter that I’m using a bonded interface. This could just as easily be eth0, eth0.10, eth0.20, etc. The following tip still applies.
- I can ping each interface from other machines on the same network segments, e.g. 192.168.10.7 from 192.168.10.10.
- I can ping all interfaces from the inter-VLAN router.
- I have firewall rules on the inter-VLAN router allowing all traffic from VLAN10 to the native LAN.
- I can ping other servers’ native LAN interfaces from VLAN10, e.g. 192.168.1.6 from 192.168.10.10.
- I can’t ping this server’s native LAN interface from VLAN10, e.g. 192.168.1.7 from 192.168.10.10.
What is different about the native LAN interface on this server? Why can’t the inter-VLAN router just pass the traffic to the correct interface?
Running ifdown bond0.10 fixes it, and ifup bond0.10 breaks it again. I spent too many hours today rebooting switches, fiddling with LAG, and trying to chase down bad ARP table entries.
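It turns out that the inter-VLAN router is in fact doing its job. The “problem” is one of the anti-spoofing mechanisms in the Linux kernel, reverse path filtering (rp_filter), which is enabled by default in Ubuntu Server 16.04.4 LTS. When a packet arrives on an interface, the kernel checks whether its own route back to the packet’s source address would leave through that same interface. If not, it considers the packet spoofed and drops it. Here, pings from 192.168.10.10 to 192.168.1.7 are routed onto the native LAN and arrive on bond0, but the server’s route back to 192.168.10.0/24 goes out bond0.10, so they are dropped. Fortunately, we can disable this mechanism.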
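To make the check concrete, here is a minimal Python sketch of the strict reverse-path test. It is only a model of the idea, not the kernel's code. The ROUTES table and both function names are made up for illustration, and it covers just the two connected subnets from this post (no default route).

```python
import ipaddress

# Hypothetical routing table for the server in this post: (subnet, interface).
# Only the two directly connected subnets are modelled.
ROUTES = [
    (ipaddress.ip_network("192.168.1.0/24"), "bond0"),
    (ipaddress.ip_network("192.168.10.0/24"), "bond0.10"),
]


def reverse_path_interface(source, routes=ROUTES):
    """Return the interface used to reach `source` (longest prefix match), or None."""
    addr = ipaddress.ip_address(source)
    matches = [(net, ifname) for net, ifname in routes if addr in net]
    if not matches:
        return None
    # The most specific (longest) prefix wins, as in a normal route lookup.
    return max(matches, key=lambda m: m[0].prefixlen)[1]


def passes_strict_rp_filter(source, arrived_on, routes=ROUTES):
    """Strict mode: accept only if the route back to `source` uses `arrived_on`."""
    return reverse_path_interface(source, routes) == arrived_on
```

With this model, passes_strict_rp_filter("192.168.10.10", "bond0") is False, which is the routed ping to 192.168.1.7 being dropped, while passes_strict_rp_filter("192.168.10.10", "bond0.10") is True, which is why pinging 192.168.10.7 directly from VLAN10 works.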
I applied this patch to /etc/sysctl.d/10-network-security.conf to disable it globally and for the VLAN trunk interface, while leaving it enabled for other interfaces by default:
    --- 10-network-security.conf.orig 2018-03-27 15:59:14.617843015 -0500
    +++ 10-network-security.conf 2018-03-27 15:28:27.711511823 -0500
    @@ -2,7 +2,10 @@
     # Turn on Source Address Verification in all interfaces by default to
     # prevent some spoofing attacks.
     net.ipv4.conf.default.rp_filter=1
    -net.ipv4.conf.all.rp_filter=1
    +
    +# Let VLAN trunk hold multiple addresses.
    +net.ipv4.conf.all.rp_filter=0
    +net.ipv4.conf.bond0.rp_filter=0
     
     # Turn on SYN-flood protections. Starting with 2.6.26, there is no loss
     # of TCP functionality/features under normal conditions. When flood
N.B. The exact location of these tunables may vary on your system, or you may have to just add them from scratch. Replace bond0 with your VLAN trunk interface.
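The patch touches both the "all" and the per-interface setting because, per the kernel's ip-sysctl documentation, the value actually applied to an interface is the larger of conf/all/rp_filter and conf/<interface>/rp_filter. After reloading the settings (for example with sysctl --system, or a reboot), something like the following sketch can confirm what the kernel is using. The function names are mine, and the base parameter exists only so the logic can be pointed at a test directory.

```python
from pathlib import Path

PROC_CONF = "/proc/sys/net/ipv4/conf"


def read_rp_filter(ifname, base=PROC_CONF):
    """Read the raw rp_filter value for `ifname` ("all", "default", "bond0", ...)."""
    return int((Path(base) / ifname / "rp_filter").read_text().strip())


def effective_rp_filter(ifname, base=PROC_CONF):
    """Value the kernel applies to `ifname`: max of the "all" and per-interface settings."""
    return max(read_rp_filter("all", base), read_rp_filter(ifname, base))
```

On the patched server, effective_rp_filter("bond0") should come back as 0. If it is still 1, one of the two settings has not been applied.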
The other solution is to simply make sure that devices on a given subnet use the appropriate interface on the server, e.g. devices on my VLAN10 should always use this server’s 192.168.10.7 address. Unfortunately, my local DNS resolver (Unbound) does not support per-interface name resolution at this time.