The situation is this: you have an upstream connection (say, to the Internet) for which you have a single IP address. Multiple hosts need access to this network, using the usual protocols like TCP or UDP. This is accomplished with a masquerading device, which translates the addresses of a different subnet (usually private RFC 1918 address space) to the single upstream address and rewrites port numbers to provide return paths to the individual hosts.
So far so good. Now, if you want to expose services that do not require the connection to be established from the private network first, the normal solution is to configure the router with a fixed mapping from the needed port numbers to the appropriate local address. In some cases an entire protocol that the NAT functionality cannot support, like IPsec or even SCTP, is mapped to a single address in the private network.
This works well enough, most of the time. But some protocols, especially those designed to work on large-scale networks, or at least on networks without the complex and essentially broken addressing structure that NAT imposes, do not keep working when the transport is simply rewritten. One solution here is to have the NAT device become ever more complex, understanding these protocols and doing higher-layer translations, acting as an ALG (application-level gateway) for the protocol.
The other is what I am attempting here. I seek to create a NAT router that does not itself use the upstream IP address for communications directly, but communicates via its private address, reserving the upstream address for the designated server host whenever no NAT mapping or other configuration applies. This allows the router to continue to firewall the entire network, but places a single 'designated' server in a sort of DMZ. The local network can access it, and it can access the local network (firewall rules permitting). The upstream IP address is assigned directly to the server's interface, so that protocols relying on accurate addressing information can run unaffected.
-----------------
| Uplink Device |
-----------------
        ^
        |
        v
   ----------       ---------------------
   | Router |  <->  | Designated Server |
   ----------       ---------------------
        ^
        |
        v
------------------
| Client Devices |
------------------
The premise of this setup is that the designated server need not have any special configuration: it appears to be directly connected to the upstream network, and is configured as such. The only complication or unusual piece of configuration is to ensure that the ephemeral ports used by the designated server do not overlap the ports used in NAT rewriting by the router. Similarly, clients do not have any unusual configuration; they behave exactly as any other host behind a masquerading NAT device.
All of the complexity of routing and address translation is confined to the router device in the diagram above.
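To make the port constraint concrete, here is a minimal sketch of a check that two port ranges are disjoint. The specific split (32768-49151 for the server, 49152-65535 for the router's NAT) and the helper name `ranges_overlap` are assumptions chosen for illustration, not values from this setup.

```shell
#!/bin/sh
# ranges_overlap LO1 HI1 LO2 HI2
# Exit status 0 when the two inclusive port ranges share at least one port.
ranges_overlap() {
    [ "$1" -le "$4" ] && [ "$3" -le "$2" ]
}

# Hypothetical split: the designated server keeps 32768-49151 for its own
# ephemeral ports, and the router's NAT rewriting would use 49152-65535.
server_lo=32768; server_hi=49151
nat_lo=49152;    nat_hi=65535

if ranges_overlap "$server_lo" "$server_hi" "$nat_lo" "$nat_hi"; then
    echo "overlap"
else
    echo "disjoint"
    # On the designated server (as root) the range could then be applied with:
    #   sysctl -w net.ipv4.ip_local_port_range="$server_lo $server_hi"
fi
```

The router side would need its NAT rules restricted to the other range; that part is left out here.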
            [wan-ip]
|---------------- wan0------------|
|                                 |
|             ROUTER              |
|  [lan-ip]          [wan-gw-ip]  |
|    lan0               dmz0      |
|---------------------------------|
       ^                  ^
       |                  |
       V                  V
|--------------|  |-----[wan-ip]-----|
|    CLIENT    |  |     server0      |
|    HOSTS     |  |                  |
|--------------|  |    DESIGNATED    |
                  |      SERVER      |
                  |------------------|
So in this configuration, the designated server is configured as if it were directly connected to the upstream network. The router needs specific forwarding and routing rules so that all traffic from the designated server destined for the upstream network is forwarded directly, while traffic from the LAN side is masqueraded. Return traffic is matched on port numbers: where the router recognizes them as belonging to its NAT function, it translates the traffic back to the LAN host. For other port numbers, or for any protocol in which port numbers are not available, the traffic is forwarded directly to the designated server.
Traffic from the LAN to the designated server is addressed to [wan-ip], so the router needs a route for that address. In addition, the router's [lan-ip] is configured as an on-link route for the designated server, and as the gateway for the LAN addresses.
Simple, on paper. But how to configure the router to do this?
After reading the documentation on iproute2 and nftables, I figured I may as well give it a go, setting up a pile of virtual machines to test the behavior and get the configuration working. (Or otherwise; at this point I am not convinced the networking stack will let me do this.)
For a test, I need to define the addresses, so here goes.
|192.168.34.0/24||WAN Subnet||The directly connected upstream subnet|
|192.168.34.21/24||WAN IP||The IP address assigned by the upstream|
|192.168.34.254/24||WAN GW||The gateway address assigned by the upstream|
|192.168.35.21/24||WAN Host||An arbitrary host 'beyond' the WAN gateway|
|10.10.10.0/24||LAN Subnet||The local (private) LAN subnet|
|10.10.10.254/24||LAN Gateway||The local address of the router on the LAN subnet|
|10.10.10.1/24||LAN Client Host||An arbitrary host on the LAN subnet which should be a "client" (subject to NAT to reach the WAN)|
With that defined, I've set about creating a set of virtual machines connected by virtual bridges using libvirt/kvm to test how to configure this. I'm using a Debian "Buster" install DVD to build machines without access to the normal network/Internet so that the testing can be isolated sensibly.
The networks and domains were defined using these XML files.
<network>
  <name>test-lan</name>
  <bridge name="test-lan" stp="off"/>
  <domain name="lan.invalid"/>
</network>
<domain type='kvm' id='15'> <name>router</name> <uuid>65de0f9d-8a54-4d56-a0b3-f651a835bb5f</uuid> <memory unit='KiB'>262144</memory> <currentMemory unit='KiB'>262144</currentMemory> <vcpu placement='static'>1</vcpu> <resource> <partition>/machine</partition> </resource> <os> <type arch='x86_64' machine='pc-1.1'>hvm</type> <boot dev='hd'/> <boot dev='cdrom'/> </os> <features> <acpi/> <apic/> <pae/> </features> <cpu mode='custom' match='exact' check='full'> <model fallback='forbid'>kvm64</model> <feature policy='require' name='hypervisor'/> </cpu> <clock offset='utc'/> <on_poweroff>destroy</on_poweroff> <on_reboot>restart</on_reboot> <on_crash>destroy</on_crash> <devices> <emulator>/usr/bin/qemu-system-x86_64</emulator> <disk type='file' device='disk'> <driver name='qemu' type='qcow2' cache='none'/> <source file='/home/enimihil/vm/test-router.qcow2'/> <backingStore/> <target dev='vda' bus='virtio'/> <alias name='virtio-disk0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x06' function='0x0'/> </disk> <disk type='file' device='cdrom'> <driver name='qemu' type='raw'/> <source file='/home/enimihil/vm/debian-testing-amd64-DVD-1.iso'/> <backingStore/> <target dev='hdc' bus='ide'/> <readonly/> <alias name='ide0-1-0'/> <address type='drive' controller='0' bus='1' target='0' unit='0'/> </disk> <controller type='usb' index='0' model='piix3-uhci'> <alias name='usb'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x2'/> </controller> <controller type='ide' index='0'> <alias name='ide'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x01' function='0x1'/> </controller> <controller type='virtio-serial' index='0'> <alias name='virtio-serial0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x05' function='0x0'/> </controller> <controller type='pci' index='0' model='pci-root'> <alias name='pci.0'/> </controller> <filesystem type='mount' accessmode='passthrough'> <driver type='path'/> <source dir='/home/enimihil/vm/'/> <target 
dir='/shared'/> <alias name='fs0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x0a' function='0x0'/> </filesystem> <interface type='network'> <mac address='52:54:00:4e:3f:5c'/> <source network='test-lan' bridge='test-lan'/> <target dev='vnet0'/> <model type='virtio'/> <alias name='net0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x03' function='0x0'/> </interface> <interface type='network'> <mac address='52:54:00:61:08:1f'/> <source network='test-wan' bridge='test-wan'/> <target dev='vnet1'/> <model type='virtio'/> <alias name='net1'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x08' function='0x0'/> </interface> <interface type='network'> <mac address='52:54:00:ad:8a:69'/> <source network='test-dmz' bridge='test-dmz'/> <target dev='vnet2'/> <model type='virtio'/> <alias name='net2'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x09' function='0x0'/> </interface> <serial type='pty'> <source path='/dev/pts/13'/> <target type='isa-serial' port='0'> <model name='isa-serial'/> </target> <alias name='serial0'/> </serial> <console type='pty' tty='/dev/pts/13'> <source path='/dev/pts/13'/> <target type='serial' port='0'/> <alias name='serial0'/> </console> <channel type='spicevmc'> <target type='virtio' name='com.redhat.spice.0' state='disconnected'/> <alias name='channel0'/> <address type='virtio-serial' controller='0' bus='0' port='1'/> </channel> <input type='tablet' bus='usb'> <alias name='input0'/> <address type='usb' bus='0' port='1'/> </input> <input type='mouse' bus='ps2'> <alias name='input1'/> </input> <input type='keyboard' bus='ps2'> <alias name='input2'/> </input> <graphics type='spice' port='5900' autoport='yes' listen='127.0.0.1'> <listen type='address' address='127.0.0.1'/> </graphics> <sound model='ich6'> <alias name='sound0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x04' function='0x0'/> </sound> <video> <model type='qxl' ram='65536' vram='65536' vgamem='16384' heads='1' primary='yes'/> <alias 
name='video0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x02' function='0x0'/> </video> <memballoon model='virtio'> <alias name='balloon0'/> <address type='pci' domain='0x0000' bus='0x00' slot='0x07' function='0x0'/> </memballoon> </devices> <seclabel type='none' model='none'/> <seclabel type='dynamic' model='dac' relabel='yes'> <label>+77:+1012</label> <imagelabel>+77:+1012</imagelabel> </seclabel> </domain>
Since the nftables package is not available on the installation DVD(s), I had to manually download a version from the mirrors and share it over 9pfs with the guest machines, along with the libnftnl7 dependency.
So, with my various machines created and connected to their virtual networks, I set about manually configuring the client machine, server, and wan-gw as fairly standard hosts.
Set up the client machine with a private address and default route.
client# ip link set ens9 name client0
client# ip link set up client0
client# ip addr add 10.10.10.1/24 dev client0
client# ip route add default via 10.10.10.254
client# sysctl -w net.ipv4.conf.all.log_martians=1
Set up the WAN gateway with the WAN subnet (eventually I will add a routed network beyond it).
wan-gw# ip link set ens9 name wan0
wan-gw# ip link set up wan0
wan-gw# ip addr add 192.168.34.254/24 dev wan0
wan-gw# sysctl -w net.ipv4.conf.all.log_martians=1
Set up the server as if it were directly connected to the WAN subnet.
server# ip link set ens9 name server0
server# ip link set up server0
server# ip addr add 192.168.34.21/24 dev server0
server# ip route add default via 192.168.34.254
server# sysctl -w net.ipv4.conf.all.log_martians=1
And now comes the complex part: how to set up the router to make all of this talk the way I want it to.
router# ip link set ens3 name lan0
router# ip link set up lan0
router# ip addr add 10.10.10.254/24 dev lan0
router# ip link set ens8 name wan0
router# ip link set up wan0
router# ip addr add 192.168.34.21/24 dev wan0
router# ip route add default via 192.168.34.254
router# ip link set ens9 name dmz0
router# ip link set up dmz0
router# ip addr add 192.168.34.254/24 dev dmz0
router# sysctl -w net.ipv4.ip_forward=1
router# sysctl -w net.ipv4.conf.all.log_martians=1
And now we see, looking at the routing table on the router, our first challenge.
router# ip route
default via 192.168.34.254 dev lan0
10.10.10.0/24 dev lan0 proto kernel scope link src 10.10.10.254
192.168.34.0/24 dev wan0 proto kernel scope link src 192.168.34.21
192.168.34.0/24 dev dmz0 proto kernel scope link src 192.168.34.254
We now have the same WAN subnet on two different interfaces. In reality only a single host, 192.168.34.21/32, is on the dmz0 interface, and the rest of the subnet is reachable via wan0.
A ping 192.168.34.21 from the client does not appear on the server's network interface, as the router absorbs the traffic and responds.
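One way to see why the router absorbs this traffic is to ask the kernel how it would route the address, and to look at the local table where addresses configured on the router's own interfaces are recorded. This is only a diagnostic sketch using read-only commands; no output is shown because it was not captured here.

```shell
# Run on the router. Both commands only read state.

# Which route would the kernel choose for the server's address?
ip route get 192.168.34.21

# Addresses the router considers its own live in the "local" table.
ip route show table local
```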
Removing the address on the router allows the client to ping the server via the router. However, the wan-gw cannot reach the server, as the router does not answer ARP requests and the wan-gw needs the server address to be on-link.
Similarly, the router has the WAN GW address configured as a local address so that the server can route through it. As a result the server can ping the WAN gateway address, but the ping never actually reaches the wan-gw host.
Let's try policy routing.
# Add a mapping to 1=dmz in /etc/iproute2/rt_tables
router# ip addr del 192.168.34.254/24 dev dmz0
router# ip addr del 192.168.34.21/24 dev wan0
router# ip rule add from 192.168.34.21 iif dmz0 table dmz
router# ip rule add to 192.168.34.21 iif wan0 table dmz
router# ip route add 192.168.34.254/32 table dmz dev wan0
router# ip route add 192.168.34.21/32 table dmz dev dmz0
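The table-name mapping mentioned in the comment has to exist before the `ip rule` commands can refer to `dmz` by name. A minimal sketch of that step, followed by a few read-only commands to inspect the result, might look like this. It assumes table id 1 is not already named in rt_tables.

```shell
# Run as root on the router, before the "ip rule add ... table dmz" commands.
# Assumes table id 1 is not already taken in /etc/iproute2/rt_tables.
echo "1 dmz" >> /etc/iproute2/rt_tables

# After the rules and routes are in place, inspect them.
ip rule show
ip route show table dmz

# Ask the kernel which route it would pick for a packet from the server
# arriving on dmz0 and heading for the wan-gw.
ip route get 192.168.34.254 from 192.168.34.21 iif dmz0
```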
Now our router doesn't know how to route to the 192.168.34.0/24 subnet from itself. But, now when we try to ping the wan-gw from the server, we see that the server is attempting to resolve the link address of the wan-gw (and failing to do so.) Likewise, the wan-gw attempts to resolve via ARP the link address of the server.
This is a job for the proxy ARP feature, so let's turn that on for the relevant interfaces.
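The exact commands are not recorded here. A plausible way to do it, assuming the relevant interfaces are wan0 and dmz0 as named earlier, is the per-interface proxy_arp sysctl:

```shell
# Run as root on the router. Assumes wan0 and dmz0 are the interfaces
# that need to answer ARP on behalf of the host on the other side.
sysctl -w net.ipv4.conf.wan0.proxy_arp=1
sysctl -w net.ipv4.conf.dmz0.proxy_arp=1
```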
Ah, success, we can now ping between the WAN GW and the server.
However, we can't ping or access the WAN GW from the router, nor the server. We are also unable to ping the server IP from the client, nor the wan-gw from the client.
Adding the WAN IP to the main routing table gets us the ability for the client to talk to the server.
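The command for this step is not shown; a sketch of what it could look like, assuming the same /32 host route used in the dmz table, is:

```shell
# Run as root on the router. Assumed form: a host route in the main table
# pointing the WAN IP at the interface facing the designated server.
ip route add 192.168.34.21/32 dev dmz0
```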
At this point, the policy routing rules and the dmz table aren't used.
Our default route still doesn't work, though. Adding a route to the main table allows the ping to traverse to the wan-gw from the client, but we need to configure NAT, as the client address is not routable from the wan-gw.
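For reference, the routes involved could be added along these lines. This is an assumed reconstruction, mirroring the /32 route from the dmz table and re-adding the default route through it; the NAT rules themselves are not part of this sketch.

```shell
# Run as root on the router. Assumed form: make the wan-gw reachable
# on-link via wan0, then point the default route at it.
ip route add 192.168.34.254/32 dev wan0
ip route add default via 192.168.34.254 dev wan0
```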
This seems to not work, for anything I've tried, without configuring addresses on the router's interfaces for wan0 and dmz0. So back to that.
router# ip addr add 192.168.34.21/24 dev wan0
router# ip addr add 192.168.34.254/24 dev dmz0
router# ip route del 192.168.34.0/24 dev wan0
router# ip route del 192.168.34.0/24 dev dmz0
router# ip route del table local local 192.168.34.21 dev wan0
router# ip route del table local local 192.168.34.254 dev dmz0
Here we delete all the automagic routes and local routes for the addresses on the router's interfaces that aren't its own. Proxy ARP is still active and the previous direct /32 routes for the specific IP addresses we need to reach are still there.
Now a ping from the server to the WAN IP goes to the WAN gateway and vice versa, due to the proxy ARP and the removed local routes.
Now for the NAT.