Floating Static Routes
Floating static routes are a simple and useful way to provide backup routes via another hop or link. In the simplest case, these are two default routes that differ only in metric or cost. As long as the preferred route with the better metric is available, the floating static route with the less attractive metric remains unused; it takes over only if the preferred route disappears. Note that a floating static route merely "lurks" in the background and does not provide load balancing. The operating system must support metrics for static routes. Lab 8-1 shows an example of this setup.
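The selection rule described above can be sketched in a few lines of Python. This is only a model, not how any particular kernel implements it: the StaticRoute class, the active_route function, and the reachable set (a stand-in for however the operating system decides that a route is still usable) are hypothetical names. The gateway addresses are borrowed from Example 8-8.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StaticRoute:
    prefix: str    # destination prefix, for example "0.0.0.0/0"
    gateway: str   # next-hop address
    metric: int    # lower values are preferred


def active_route(routes, reachable):
    """Return the usable route with the lowest metric, or None if none is usable."""
    usable = [r for r in routes if r.gateway in reachable]
    return min(usable, key=lambda r: r.metric, default=None)


# Two default routes that differ only in metric (hypothetical setup).
primary = StaticRoute("0.0.0.0/0", "192.168.14.254", metric=1)
floating = StaticRoute("0.0.0.0/0", "192.168.1.254", metric=2)
routes = [primary, floating]

# Both gateways usable: the floating route stays unused.
both_up = active_route(routes, {"192.168.14.254", "192.168.1.254"})
# Preferred gateway gone: the floating route takes over.
primary_down = active_route(routes, {"192.168.1.254"})

print(both_up.gateway)
print(primary_down.gateway)
```

Only one route is ever returned, which mirrors the point that a floating static route provides backup but no load balancing.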
Equal-Cost Multi-Path (ECMP) Routing
Equal-Cost Multi-Path (ECMP) is a forwarding mechanism that routes packets along multiple paths of equal cost, with the goal of distributing the load almost evenly across the links. This significantly affects a router's next-hop (path) decision.
For further details, look at RFC 2991, "Multipath Issues in Unicast and Multicast Next-Hop Selection," and RFC 2992, "Analysis of an Equal-Cost Multi-Path Algorithm."
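RFC 2992 analyzes the hash-threshold approach to next-hop selection. A minimal sketch of that idea follows, under several assumptions: the function names flow_key and hash_threshold are hypothetical, zlib.crc32 is used merely as a convenient deterministic hash, and only source and destination address identify a flow. A real implementation chooses its own hash function and header fields.

```python
import zlib

KEY_SPACE = 2 ** 32  # zlib.crc32 returns an unsigned 32-bit value


def flow_key(src, dst):
    """Hash the fields that identify a flow (here only source and destination)."""
    return zlib.crc32(f"{src}|{dst}".encode())


def hash_threshold(key, next_hops):
    """Split the key space into equal regions, one region per next hop."""
    region_size = KEY_SPACE // len(next_hops)
    # Clamp so that keys in the rounding remainder fall into the last region.
    index = min(key // region_size, len(next_hops) - 1)
    return next_hops[index]


# Next hops taken from the ECMP route in Example 8-8; the flow is made up.
next_hops = ["192.168.1.254", "192.168.14.254"]
key = flow_key("192.168.2.10", "10.1.5.20")
chosen = hash_threshold(key, next_hops)
print(chosen)
```

Because the key depends only on the flow, every packet of that flow maps to the same region and therefore to the same next hop.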
In the open-source UNIX world, ECMP is available only for Linux; it is purely a feature of the underlying network stack. The terminology stems from the world of link-state routing protocols, which use a cost-based metric; OSPF and Intermediate System-to-Intermediate System (IS-IS) explicitly allow ECMP routing. Load balancing can be carried out over equal-cost or unequal-cost paths, per packet or per destination.
Because static routes under Cisco IOS Software lack a metric (weight), they support only equal-cost load sharing. To disable destination-based fast switching, you can force Cisco IOS Software to process-switch on a per-packet basis with the interface command no ip route-cache. However, this does not affect CEF; use ip load-sharing per-packet in that case.
As previously mentioned, UNIX stacks in general have a per-packet view of the world, in contrast to the Cisco default per-connection view (CEF, fast-switching cache). This behavior changes when you enable ECMP in the Linux kernel or deploy a route cache. In that case, Linux performs per-flow balancing, which can be changed to per-packet behavior with the equalize flag of the ip route command. Despite its merits, per-packet balancing can introduce performance issues because the packets of a stream can arrive out of order, in particular with real-time Voice over IP (VoIP) traffic. Take this into consideration as well when changing Cisco forwarding settings from per destination to per packet. Figure 8-4 shows a possible scenario for ECMP where load balancing is desirable.
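The difference between the two modes can be illustrated with a small model. The following sketch is not taken from any kernel source; per_flow and per_packet are hypothetical helpers, the hash is an arbitrary choice, and the six-packet stream is invented for the example.

```python
import zlib
from itertools import cycle

LINKS = ["eth0", "eth1"]


def per_flow(packets, links=LINKS):
    """Choose the link from a hash of the flow identity, so a flow stays on one link."""
    chosen = []
    for src, dst, _seq in packets:
        key = zlib.crc32(f"{src}>{dst}".encode())
        chosen.append(links[key % len(links)])
    return chosen


def per_packet(packets, links=LINKS):
    """Ignore the flow and hand out links in strict rotation."""
    rotation = cycle(links)
    return [next(rotation) for _ in packets]


# One hypothetical VoIP stream of six packets between two hosts.
stream = [("192.168.2.10", "10.1.1.20", seq) for seq in range(6)]

print(sorted(set(per_flow(stream))))   # links used in per-flow mode
print(per_packet(stream))              # link used by each packet in per-packet mode
```

In per-flow mode the whole stream uses a single link, so its packets cannot overtake each other; in per-packet mode consecutive packets of the same stream travel over different links, which is where reordering can arise.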
Lab 8-1: Interface Metrics, Floating Static Routes, and Multiple Equal-Cost Routes (ECMP)
Example 8-8 demonstrates the use of different metrics for the same prefix (the least preferable route is referred to as a floating static route) as well as equal metrics for load balancing. The BSD world supplies metrics only in the context of interfaces (see Example 8-10). This is not a deficiency, but is instead a design choice to leave these issues to dynamic routing protocols.
If you use two routes with an equal metric value, load balancing is done on a per-connection basis; if you specify the Linux equalize keyword on two routes, load balancing is done on a per-packet basis. In this case, the route is just recomputed for every packet. Without it, the route stays cached and bound to a specific next hop as long as it is up and alive.
Example 8-8 starts by adding two unequal-cost routes to the same destination prefix, first with the route command and then, for a different prefix, with the ip route command, to demonstrate a floating static route setup. As the output in Example 8-9 shows, both unequal-metric routes are placed in the routing table. This is different on a Cisco router: The floating static route is added to the routing table only "on demand," when the preferable route fails, hence the name floating. The second command sequence of Example 8-8 establishes a mix of per-destination and per-packet (equalize) load-balanced ECMP. The ECMP next hops all have the same weight, 1, and the per-packet load-balanced ECMP route is labeled with the equalize flag (shaded text).
Example 8-8. Linux ECMP Setup Example
[root@callisto:~#] route add -net 11.1.1.0/24 metric 2 gw 192.168.1.254
[root@callisto:~#] route add -net 11.1.1.0/24 metric 1 gw 192.168.14.254
[root@callisto:~#] ip route add 11.1.2.0/24 via 192.168.1.254 metric 2
[root@callisto:~#] ip route add 11.1.2.0/24 via 192.168.14.254 metric 1
[root@callisto:~#] ip route add 10.1.1.0/24 equalize nexthop via 192.168.1.254 dev eth1 nexthop via 192.168.14.254 dev eth0
[root@callisto:~#] ip route add 10.1.5.0/24 nexthop via 192.168.1.254 nexthop via 192.168.14.254
Example 8-9. Linux ECMP Setup Result
[root@callisto:~#] ip route help
Usage: ip route { list | flush } SELECTOR
       ip route get ADDRESS [ from ADDRESS iif STRING ] [ oif STRING ] [ tos TOS ]
       ip route { add | del | change | append | replace | monitor } ROUTE
SELECTOR := [ root PREFIX ] [ match PREFIX ] [ exact PREFIX ] [ table TABLE_ID ] [ proto RTPROTO ] [ type TYPE ] [ scope SCOPE ]
ROUTE := NODE_SPEC [ INFO_SPEC ]
NODE_SPEC := [ TYPE ] PREFIX [ tos TOS ] [ table TABLE_ID ] [ proto RTPROTO ] [ scope SCOPE ] [ metric METRIC ]
INFO_SPEC := NH OPTIONS FLAGS [ nexthop NH ]...
NH := [ via ADDRESS ] [ dev STRING ] [ weight NUMBER ] NHFLAGS
OPTIONS := FLAGS [ mtu NUMBER ] [ advmss NUMBER ] [ rtt NUMBER ] [ rttvar NUMBER ] [ window NUMBER] [ cwnd NUMBER ] [ ssthresh REALM ] [ realms REALM ]
TYPE := [ unicast | local | broadcast | multicast | throw | unreachable | prohibit | blackhole | nat ]
TABLE_ID := [ local | main | default | all | NUMBER ]
SCOPE := [ host | link | global | NUMBER ]
FLAGS := [ equalize ]
NHFLAGS := [ onlink | pervasive ]
RTPROTO := [ kernel | boot | static | NUMBER ]

[root@callisto:~#] ip route show
192.168.1.0/24 dev eth1 scope link
192.168.1.0/24 dev ipsec0 proto kernel scope link src 192.168.1.1
10.1.5.0/24
        nexthop via 192.168.1.254 dev eth1 weight 1
        nexthop via 192.168.14.254 dev eth0 weight 1
192.168.14.0/24 dev eth0 scope link
11.1.2.0/24 via 192.168.14.254 dev eth0 metric 1
11.1.2.0/24 via 192.168.1.254 dev eth1 metric 2
10.1.1.0/24 equalize
        nexthop via 192.168.1.254 dev eth1 weight 1
        nexthop via 192.168.14.254 dev eth0 weight 1
11.1.1.0/24 via 192.168.14.254 dev eth0 metric 1
11.1.1.0/24 via 192.168.1.254 dev eth1 metric 2
127.0.0.0/8 dev lo scope link
default via 192.168.1.254 dev eth1

[root@callisto:~#] route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 ipsec0
10.1.5.0        192.168.1.254   255.255.255.0   UG    0      0        0 eth1
192.168.14.0    0.0.0.0         255.255.255.0   U     0      0        0 eth0
11.1.2.0        192.168.14.254  255.255.255.0   UG    1      0        0 eth0
11.1.2.0        192.168.1.254   255.255.255.0   UG    2      0        0 eth1
10.1.1.0        192.168.1.254   255.255.255.0   UG    0      0        0 eth1
11.1.1.0        192.168.14.254  255.255.255.0   UG    1      0        0 eth0
11.1.1.0        192.168.1.254   255.255.255.0   UG    2      0        0 eth1
127.0.0.0       0.0.0.0         255.0.0.0       U     0      0        0 lo
0.0.0.0         192.168.1.254   0.0.0.0         UG    0      0        0 eth1
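To see how the unequal-metric entries of Example 8-9 interact, consider the following simplified lookup model. It assumes the usual rule of longest prefix first and lowest metric among equally specific routes; the ROUTES list copies a few rows from the route -n output, and the lookup function is a hypothetical helper, not a kernel interface.

```python
import ipaddress

# (prefix, gateway, device, metric) rows copied from the route -n output.
ROUTES = [
    ("11.1.2.0/24", "192.168.14.254", "eth0", 1),
    ("11.1.2.0/24", "192.168.1.254", "eth1", 2),
    ("11.1.1.0/24", "192.168.14.254", "eth0", 1),
    ("11.1.1.0/24", "192.168.1.254", "eth1", 2),
    ("0.0.0.0/0", "192.168.1.254", "eth1", 0),
]


def lookup(destination, routes=ROUTES):
    """Return the best matching route tuple for destination, or None."""
    addr = ipaddress.ip_address(destination)
    matches = [r for r in routes if addr in ipaddress.ip_network(r[0])]
    if not matches:
        return None
    # Longest prefix wins; among equally long prefixes the lowest metric wins.
    return min(
        matches,
        key=lambda r: (-ipaddress.ip_network(r[0]).prefixlen, r[3]),
    )


print(lookup("11.1.1.7"))   # served by the metric 1 route via eth0
print(lookup("8.8.8.8"))    # falls through to the default route

# Model a failure of the preferred path by dropping every route over eth0.
without_eth0 = [r for r in ROUTES if r[2] != "eth0"]
print(lookup("11.1.1.7", without_eth0))   # the floating route via eth1 takes over
```

The metric 2 entries are present in the table all along, as on Linux, but the model never selects them while a metric 1 entry for the same prefix exists.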
As previously mentioned, BSD Unices do not provide metrics in context with the route command. However, you can assign metrics to interfaces, as demonstrated in Example 8-10 (shaded text).
Example 8-10. Example for OpenBSD Interface Metrics
[root@ganymed:~#] ifconfig ne4 metric 5
[root@ganymed:~#] ifconfig -A
...
ne4: flags=8863metric 5 mtu 1500
        media: Ethernet 10baseT full-duplex
        inet 192.168.2.254 netmask 0xffffff00 broadcast 192.168.2.255
        inet6 fe80::5054:5ff:fee3:e42f%ne4 prefixlen 64 scopeid 0x2
...
Linux TEQL (True Link Equalizer)
Several approaches exist to accomplish traffic flows over equal- or unequal-cost paths or interfaces. We have investigated Ethernet channel bonding (Layer 1) and ECMP so far. Other approaches are as follows:
- TEQL, the "true" (or "trivial") link equalizer, which is unique to the Linux kernel. TEQL facilitates a queuing approach via the tc (traffic control) tool, which is an integral part of the Linux iproute2 suite of tools.
As always with link equalizing or ECMP, consider the negative implications of packet reordering, especially with heavily unbalanced links. (Note the following caveat.) TEQL support has to be compiled as a kernel module. Example 8-11 shows an example setup equalizing over two Ethernet network interface cards (NICs). TEQL uses its own virtual device, teql0.
Example 8-11. Joining Slaves to a Linux Equalizer Interface
[root@ganymed:~#] insmod sch_teql
[root@ganymed:~#] tc qdisc add dev eth0 root teql0
[root@ganymed:~#] tc qdisc add dev eth1 root teql0
[root@callisto:~#] ifconfig -a
teql0     Link encap:UNSPEC HWaddr 00-00-00-00-00-00-00-00-00-00-00-00-00-00-00-00
          NOARP MTU:1500 Metric:1
          RX packets:0 errors:0 dropped:0 overruns:0 frame:0
          TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:100
          RX bytes:0 (0.0 b) TX bytes:0 (0.0 b)
[root@callisto:~#] tc -s qdisc
qdisc teql0 8001: dev eth0
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
qdisc teql0 8002: dev eth1
 Sent 0 bytes 0 pkts (dropped 0, overlimits 0)
CAUTION
As Alexej Kuznetsov said, "This device (teql0) puts no limitations on physical slave characteristics; [for example,] it will equalize 9600-baud line and 100-Mb Ethernet perfectly. Certainly, [a] large difference in link speeds will make the resulting equalized link unusable because of huge packet reordering. I estimate an upper useful difference as ~10 times."[3]
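The reordering effect mentioned in this caution can be made visible with a toy model. The sketch assumes strict round-robin dispatch of equally sized packets that are all queued at time zero, which simplifies what the kernel actually does; arrival_order and late_packets are hypothetical helper names, and the per-packet transmit times are arbitrary units.

```python
def arrival_order(n_packets, tx_times):
    """Round-robin n_packets over links; tx_times[i] is the time link i needs per packet.

    Returns the sequence numbers in the order in which they leave the links.
    """
    busy_until = [0.0] * len(tx_times)
    arrivals = []
    for seq in range(n_packets):
        link = seq % len(tx_times)
        busy_until[link] += tx_times[link]   # each link serializes its own packets
        arrivals.append((busy_until[link], seq))
    return [seq for _time, seq in sorted(arrivals)]


def late_packets(order):
    """Count packets that arrive after a packet with a higher sequence number."""
    highest_seen = -1
    late = 0
    for seq in order:
        if seq < highest_seen:
            late += 1
        highest_seen = max(highest_seen, seq)
    return late


equal = arrival_order(8, [1.0, 1.0])       # two links of the same speed
unequal = arrival_order(8, [1.0, 10.0])    # second link ten times slower

print(equal, late_packets(equal))
print(unequal, late_packets(unequal))
```

With equal links the packets arrive in sequence; with a tenfold speed difference every packet sent over the slow link except the last arrives behind packets that were sent after it, and the receiver has to resequence the stream.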
Adding Static Routes via Routing Daemons
Based on this chapter, you should now understand why adding static routes from the UNIX command line differs from adding static routes to the configuration of routing protocol daemons. The former acts directly on the FIB, whereas the latter acts on the RIB.
Because of this design paradigm, the routing engines also differentiate between redistributing static routes added via routing protocol daemons and kernel routes added via the shell. For completeness, Examples 8-12 through 8-14 offer three example configuration fragments for MRTd, GateD, and Zebra. Within a daemon, you can configure a preference value (administrative distance in Cisco notation) for comparison with other internal routing feeds, whereas from the UNIX shell only metrics are possible.
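The role of the preference value can be sketched as follows. This is a conceptual model only: Candidate, best_path, and build_fib are hypothetical names, the preference numbers are illustrative rather than the defaults of any daemon, and the sketch assumes that a lower preference wins, as with Cisco administrative distance.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    prefix: str
    gateway: str
    source: str       # routing feed, for example "static" or "rip"
    preference: int   # compared across feeds; lower wins (assumption)
    metric: int       # compared only when preferences tie; lower wins


def best_path(candidates):
    """Pick by preference first and use the metric only as a tie-breaker."""
    return min(candidates, key=lambda c: (c.preference, c.metric))


def build_fib(rib):
    """Reduce the RIB (all candidates) to one forwarding entry per prefix."""
    fib = {}
    for prefix in {c.prefix for c in rib}:
        fib[prefix] = best_path([c for c in rib if c.prefix == prefix])
    return fib


rib = [
    # Static route configured in the daemon with a low preference value.
    Candidate("172.16.1.0/24", "192.168.1.254", "static", preference=1, metric=5),
    # Same prefix learned via RIP with a better metric but a worse preference.
    Candidate("172.16.1.0/24", "192.168.14.254", "rip", preference=100, metric=2),
    # Two RIP candidates for another prefix: the metric decides.
    Candidate("172.16.2.0/24", "192.168.1.254", "rip", preference=100, metric=4),
    Candidate("172.16.2.0/24", "192.168.14.254", "rip", preference=100, metric=3),
]

fib = build_fib(rib)
for prefix in sorted(fib):
    print(prefix, fib[prefix].source, fib[prefix].gateway)
```

A route added from the shell bypasses this comparison entirely, which is why only metrics are available there.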
Example 8-12. MRTd Static Routes
[root@callisto:~#] cat /etc/mrtd.conf
...
router rip
 network 192.168.1.0/24
 network 192.168.14.0/24
 redistribute connected
 redistribute static
 redistribute kernel
!
route 0.0.0.0/0 192.168.1.254 1
...
Example 8-13. GateD Static Routes
[root@callisto:~#] cat /etc/gated.conf
...
static{
    host 172.16.5.5 gateway 192.168.1.254 reject;
    host 172.16.5.6 gateway 192.168.1.254 retain;
    host 172.16.5.7 gateway 192.168.1.254 noinstall;
    172.16.1.0 mask 255.255.255.0 gateway 192.168.1.254 preference 1 interface eth1;
    172.16.2.0 masklen 24 gateway 192.168.1.254 blackhole;
    default gateway 192.168.1.254;
};
export proto rip{
    proto static;
    proto direct;
    proto kernel;
};
...
Example 8-14. Zebra Static Routes
[root@callisto:~#] cat /usr/local/etc/zebra.conf
...
ip route 172.16.7.0/24 eth1 1
ip route 172.16.44.0/24 Null0
...