Hello,
Would anyone be so kind as to help me understand which policies I should apply to allow traffic from an external subnet?
I have a bunch of K8S-nodes and a separate server that works as a VPN gateway; it’s connected to the same VLAN. The nodes have the following IP-addresses: 10.13.17.1/22, 10.13.17.2/22, 10.13.17.3/22 and so on. The VPN gateway has 10.13.16.253/22.
The Cluster IP CIDR is 10.233.0.0/18, the pod IP CIDR is 10.233.64.0/18.
The VPN server maintains an IPsec site-to-site connection with a remote network, 10.103.103.0/24. This server also maintains BGP sessions with all K8S-nodes, so its route table is full of prefixes announced by the Calico nodes (10.233.0.0/18 is present as well, of course).
When I establish a connection to a service inside the cluster from the VPN server, everything is good. The client (10.13.16.253) sends a SYN-packet to the service (10.233.10.101:1337), the worker receives this packet, changes its destination IP-address to the IP-address of the pod (10.233.103.49:1337) and changes its source IP-address to another IP-address (10.233.110.0) that lets the worker receive the reply and pass it back to the connection initiator. Here’s what happens on the worker that receives this SYN-packet:
22:04:25.866546 IP 10.13.16.253.56297 > 10.233.10.101.1337: Flags [S], seq 3575679444, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 1385938010 ecr 0], length 0
22:04:25.866656 IP 10.233.110.0.54430 > 10.233.103.49.1337: Flags [S], seq 3575679444, win 65228, options [mss 1460,nop,wscale 7,sackOK,TS val 1385938010 ecr 0], length 0
22:04:25.867313 IP 10.233.103.49.1337 > 10.233.110.0.54430: Flags [S.], seq 2017844946, ack 3575679445, win 28960, options [mss 1460,sackOK,TS val 1201488363 ecr 1385938010,nop,wscale 7], length 0
22:04:25.867533 IP 10.233.10.101.1337 > 10.13.16.253.56297: Flags [S.], seq 2017844946, ack 3575679445, win 28960, options [mss 1460,sackOK,TS val 1201488363 ecr 1385938010,nop,wscale 7], length 0
So, the connection is established and everyone is happy.
But when I try to connect to the same service from the external network (10.103.103.0/24), the worker that receives the SYN-packet does NOT change the source IP-address. It changes the destination IP-address only:
21:56:05.794171 IP 10.103.103.1.52132 > 10.233.10.101.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195801472 ecr 0,nop,wscale 7], length 0
21:56:05.794242 IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195801472 ecr 0,nop,wscale 7], length 0
21:56:21.826153 IP 10.103.103.1.52132 > 10.233.10.101.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195817504 ecr 0,nop,wscale 7], length 0
21:56:21.826199 IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195817504 ecr 0,nop,wscale 7], length 0
21:56:53.924191 IP 10.103.103.1.52132 > 10.233.10.101.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195849600 ecr 0,nop,wscale 7], length 0
21:56:53.924254 IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195849600 ecr 0,nop,wscale 7], length 0
The destination IP-address is changed, so I can see these packets on the worker where the pod is running, but there are no replies to them:
21:56:05.794602 IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195801472 ecr 0,nop,wscale 7], length 0
21:56:21.826553 IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195817504 ecr 0,nop,wscale 7], length 0
21:56:53.924556 IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S], seq 3759345254, win 29200, options [mss 1460,sackOK,TS val 195849600 ecr 0,nop,wscale 7], length 0
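To make the difference between the two cases explicit, here is a minimal Python sketch that compares a packet before and after the worker rewrites it and reports which kind of NAT was applied. The function names (`endpoints`, `nat_kind`) are made up for illustration, and the sample lines are shortened copies of the first two lines of each capture above.

```python
import re

# Matches the "IP src.port > dst.port:" part of a tcpdump line.
PACKET_RE = re.compile(
    r"IP (\d+\.\d+\.\d+\.\d+)\.(\d+) > (\d+\.\d+\.\d+\.\d+)\.(\d+):"
)


def endpoints(line):
    """Return ((src_ip, src_port), (dst_ip, dst_port)) from one tcpdump line."""
    m = PACKET_RE.search(line)
    if m is None:
        raise ValueError(f"not a tcpdump IP line: {line!r}")
    src_ip, src_port, dst_ip, dst_port = m.groups()
    return (src_ip, int(src_port)), (dst_ip, int(dst_port))


def nat_kind(before, after):
    """Compare the same packet before and after the worker rewrote it."""
    src_before, dst_before = endpoints(before)
    src_after, dst_after = endpoints(after)
    kinds = []
    if src_before != src_after:
        kinds.append("SNAT")
    if dst_before != dst_after:
        kinds.append("DNAT")
    return "+".join(kinds) or "none"


# Shortened copies of the capture lines (TCP options removed).
from_gateway = (
    "22:04:25.866546 IP 10.13.16.253.56297 > 10.233.10.101.1337: Flags [S]",
    "22:04:25.866656 IP 10.233.110.0.54430 > 10.233.103.49.1337: Flags [S]",
)
from_remote = (
    "21:56:05.794171 IP 10.103.103.1.52132 > 10.233.10.101.1337: Flags [S]",
    "21:56:05.794242 IP 10.103.103.1.52132 > 10.233.103.49.1337: Flags [S]",
)

print("from VPN gateway:", nat_kind(*from_gateway))
print("from remote host:", nat_kind(*from_remote))
```

This only restates what the captures show (both rewrites for the gateway, destination rewrite only for the remote host); it does not explain why the worker treats the two sources differently.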
The external network (10.103.103.0/24) is being advertised by the VPN server via BGP, so all the workers know that this network is accessible via 10.13.16.253. When I run the ping-test from a host in the external network (10.103.103.1) to the IP-address of the service (10.233.10.101), the test passes, so the VPN works fine and the routing tables seem to be correct.
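For anyone following the address plan, a small Python sketch using the standard `ipaddress` module maps each address from the captures to the network it belongs to. The labels ("node VLAN" and so on) are just names chosen here, and the node subnet is derived from the 10.13.17.1/22 interface address rather than stated anywhere explicitly.

```python
import ipaddress

# Networks taken from the description above. The node VLAN prefix is
# derived from a node interface address (an assumption about the layout).
NETWORKS = {
    "node VLAN": ipaddress.ip_interface("10.13.17.1/22").network,
    "service CIDR": ipaddress.ip_network("10.233.0.0/18"),
    "pod CIDR": ipaddress.ip_network("10.233.64.0/18"),
    "remote network": ipaddress.ip_network("10.103.103.0/24"),
}


def classify(addr):
    """Return the label of the first network that contains addr."""
    ip = ipaddress.ip_address(addr)
    for name, net in NETWORKS.items():
        if ip in net:
            return name
    return "unknown"


for addr in ("10.13.16.253", "10.233.10.101", "10.233.103.49",
             "10.233.110.0", "10.103.103.1"):
    print(f"{addr:15} {classify(addr)}")
```

The VPN gateway sits in the same /22 as the nodes, the SNAT address 10.233.110.0 falls inside the pod CIDR, and 10.103.103.1 is the only source that belongs to none of the cluster-side networks.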
So, why does the network “trust” 10.13.16.253 and not trust 10.103.103.1? And why does the worker perform both SNAT and DNAT for the packets from 10.13.16.253 but no SNAT for the packets from 10.103.103.1? Should I add some policies to allow this traffic?
Thanks in advance for any clues!