Sunday, December 11, 2011

WAN simulation in LAN for Dev-Servers

Application behaviour under different network conditions is an area that is usually neglected until the day the application starts misbehaving during (or around) a network outage.

The networking stack takes care of reliably relaying packets to the application layer, so it is tempting to assume the network situation is ideal, i.e. 100% reliable. This assumption sticks mainly because applications are developed in LAN environments, which almost always behave ideally.

Things get worse when production applications start receiving malformed or out-of-sequence packets, sometimes with jitter or retransmissions. This happens rarely, but I believe in building applications that are designed to handle the worst situations as well.

So the question is how to reproduce a production (WAN) environment to test the behaviour of an application. We cannot control the networking layer on a WAN, so we cannot simply wait for things to go wrong on the internet and test our application then.

So, WAN simulation in a LAN environment is the topic, and we will see how to do this using simple commands on Linux servers.

First, let's look at our ideal LAN dev-server setup.

Normal Ping on LAN
We have Server-A and Server-B on Linux and a Windows client. Let's say Server-A is a VoIP server and Server-B is its DB server.

What we do next is introduce a 120ms delay on the traffic of Server-A. Note that a root qdisc shapes outbound (egress) traffic only; inbound packets are untouched, but since Server-A's replies are delayed, round trips to it grow by the same amount.

Server-A with 120ms Delay on Outbound Traffic

We use the following command on Server-A to do this.

# tc qdisc add dev eth0 root handle 1:0 netem delay 120msec

The above command attaches a qdisc (queueing discipline) to the device eth0 using the netem (network emulator) module to delay all outgoing packets by 120ms.
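
To verify that the qdisc is actually in place, we can list the active qdiscs on the interface; the output should show a netem qdisc at the root carrying the 120ms delay (the exact wording varies between iproute2 versions).

# tc qdisc show dev eth0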

Once this delay is in place, we can change it using this command.

# tc qdisc change dev eth0 root handle 1:0 netem delay 180msec

To remove the qdisc altogether from the eth0 interface, use the following command. (The netem parameters do not need to be repeated when deleting; removing the root qdisc is enough.)

# tc qdisc del dev eth0 root

So, putting just a delay in place was easy; let's start adding some jitter. The second time value below is the jitter, i.e. each packet gets delayed by 180ms ± 80ms.
# tc qdisc change dev eth0 root handle 1:0 netem delay 180ms 80ms

Here's the output from Server-B when pinging Server-A

[root@Asterisk ~]# ping 192.168.56.102
PING 192.168.56.102 (192.168.56.102) 56(84) bytes of data.
64 bytes from 192.168.56.102: icmp_seq=1 ttl=64 time=184 ms
64 bytes from 192.168.56.102: icmp_seq=2 ttl=64 time=119 ms
64 bytes from 192.168.56.102: icmp_seq=3 ttl=64 time=160 ms
64 bytes from 192.168.56.102: icmp_seq=4 ttl=64 time=198 ms
64 bytes from 192.168.56.102: icmp_seq=5 ttl=64 time=221 ms
64 bytes from 192.168.56.102: icmp_seq=6 ttl=64 time=257 ms
64 bytes from 192.168.56.102: icmp_seq=7 ttl=64 time=161 ms
64 bytes from 192.168.56.102: icmp_seq=8 ttl=64 time=102 ms
64 bytes from 192.168.56.102: icmp_seq=9 ttl=64 time=209 ms

--- 192.168.56.102 ping statistics ---
9 packets transmitted, 9 received, 0% packet loss, time 8007ms
rtt min/avg/max/mdev = 102.371/179.395/257.091/46.304 ms
[root@Asterisk ~]#
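
Notice how the RTTs now swing roughly between 100ms and 260ms, in line with the configured 180ms ± 80ms. netem can additionally take a correlation percentage after the jitter, and an optional delay distribution; the values below are purely illustrative.

# tc qdisc change dev eth0 root handle 1:0 netem delay 180ms 80ms 25%
# tc qdisc change dev eth0 root handle 1:0 netem delay 180ms 80ms distribution normal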

Now, we start dropping packets.
On Server-A (note change rather than add, since a qdisc already exists on the root; an add would fail with "RTNETLINK answers: File exists"):
# tc qdisc change dev eth0 root handle 1:0 netem delay 180ms drop 50%

And here are the ping stats from Server-B.

[root@Asterisk ~]# ping 192.168.56.102
PING 192.168.56.102 (192.168.56.102) 56(84) bytes of data.
64 bytes from 192.168.56.102: icmp_seq=6 ttl=64 time=181 ms
64 bytes from 192.168.56.102: icmp_seq=7 ttl=64 time=181 ms
64 bytes from 192.168.56.102: icmp_seq=9 ttl=64 time=183 ms
64 bytes from 192.168.56.102: icmp_seq=10 ttl=64 time=182 ms
64 bytes from 192.168.56.102: icmp_seq=11 ttl=64 time=182 ms
64 bytes from 192.168.56.102: icmp_seq=16 ttl=64 time=180 ms
64 bytes from 192.168.56.102: icmp_seq=19 ttl=64 time=182 ms
64 bytes from 192.168.56.102: icmp_seq=23 ttl=64 time=181 ms

--- 192.168.56.102 ping statistics ---
24 packets transmitted, 8 received, 66% packet loss, time 23013ms
rtt min/avg/max/mdev = 180.001/182.037/183.630/1.148 ms
[root@Asterisk ~]#

We can keep on playing with this using the netem usage summary below.

Usage: ... netem [ limit PACKETS ]
                 [ delay TIME [ JITTER [CORRELATION]]]
                 [ distribution {uniform|normal|pareto|paretonormal}]
                 [ drop PERCENT [CORRELATION]]
                 [ corrupt PERCENT [CORRELATION]]
                 [ duplicate PERCENT [CORRELATION]]
                 [ reorder PERCENT [CORRELATION] [ gap DISTANCE ]]
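
For a taste of the other options, here are a few illustrative one-liners; the percentages are arbitrary, and note that each change replaces the earlier netem parameters rather than stacking on top of them.

# tc qdisc change dev eth0 root handle 1:0 netem corrupt 5%
# tc qdisc change dev eth0 root handle 1:0 netem duplicate 10%
# tc qdisc change dev eth0 root handle 1:0 netem delay 50ms reorder 25% 50%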

The problem with the above is that a netem qdisc implemented on the root of an interface applies, by default, to all packets leaving it, regardless of which host or network they are going to.

We may want to impose these delays and packet distortions on traffic to just one particular IP or subnet instead, using the following set of commands.

# tc qdisc add dev eth0 root handle 1: prio
# tc qdisc add dev eth0 parent 1:3 handle 30: netem delay 180ms
# tc filter add dev eth0 protocol ip parent 1:0 prio 3 u32 match ip dst 192.168.56.2/32 flowid 1:3

We created a prio qdisc at the root, put a 180ms delay on one of its bands (1:3), and added a filter on the root so that packets destined for 192.168.56.2 are forced into the 180ms band, while all other packets flow through as usual.
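
The same approach covers a whole subnet; widening the mask in the match is enough. For example, to push everything destined for the 192.168.56.0/24 network into the delayed band:

# tc filter add dev eth0 protocol ip parent 1:0 prio 3 u32 match ip dst 192.168.56.0/24 flowid 1:3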

Here's what it looked like.

Server-A with 180ms delay for Server-B only
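
tc can also show the installed filters and per-qdisc statistics (packets sent, dropped and so on), which is handy for confirming that the right traffic is being hit; and a single delete on the root tears the whole tree down, filters included.

# tc -s qdisc show dev eth0
# tc filter show dev eth0
# tc qdisc del dev eth0 root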

So you see, it's easy to create a WAN simulation environment within your LAN. We can go further and apply the netem functions to just one particular port of Server-A as well; that requires using iptables.
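
Here is a minimal sketch of that idea, assuming the prio/netem tree from above is still in place and using UDP port 5060 (SIP) purely as an example. iptables marks the matching packets in the mangle table, and a tc fw filter steers the marked packets into the delayed band.

# iptables -t mangle -A POSTROUTING -o eth0 -p udp --dport 5060 -j MARK --set-mark 1
# tc filter add dev eth0 parent 1:0 protocol ip prio 2 handle 1 fw flowid 1:3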

This was the quickest post I could write, despite being very busy both at home and at work.