Redundancy for small sites.

We work with a lot of customers that have many small “sites”, where each site has anywhere from 1 to 20 devices. A device can be a user workstation, or it can be some sort of automated equipment with a VPN tunnel back to the main headquarters.

As the importance of each site grows, these customers are asking for a failover method for their sites, and with the rise of cheap 3G connections it seems natural to go down that route.

Cisco makes some small branch routers with 3G as part of the feature set. Routers such as the Cisco 819 suit the type of customer asking for this kind of solution.

With these routers you basically get a normal WAN interface consisting of a Gigabit or FastEthernet interface and a Cellular interface which you have to tie together with a dialer interface.

To create the automatic failover, I have a small topology consisting of 4 routers:

The CE router is the site router (think of an 819-like router), but instead of a Cellular interface, I have created a Serial interface. (I don't have access to 3G in my home lab.)

ISP1 and ISP2 are two different ISPs, but they could also be the same ISP providing services through different delivery methods.

What we want to accomplish is the ability to choose a different default route in case our primary provider fails. This failure can be either a link failure or a routing failure. How “far out” you want to measure this routing failure is up to you, but choose a sensible and stable point.

In our example, the 43.43.43.43/32 is our “stable” point.

This is how we are going to accomplish the task:

First off, we need to use IP SLA to give us an up/down state toward the stable point. We will just use a regular icmp-echo, sent at a 5-second interval.

We then create a tracking object which we will use with our static routing commands. The tracking object references the IP SLA.

But in order for this to work, we must make sure that we always use our primary interface/path for sending these echoes. If not, we would have continuous flapping: when the primary goes down, the router switches to the secondary, but since the stable point might be reachable through the secondary path, the tracking object comes back up, the primary default route is reinstalled, the echoes fail again, and on and on.

To pull this off, we will use a local policy (a policy that only traffic originated by the router itself must abide by). In this policy, we match ICMP traffic to our stable point and set the next-hop, and if that doesn't work, we send the ICMP to Null0 to drop it.

Let's check out the configuration on the CE router:

Define the IP SLA:
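A minimal sketch of what this could look like. The operation number 1 and the interface name GigabitEthernet0 are assumptions for illustration; only the target 43.43.43.43 and the 5-second interval come from the description above:

```
! Hypothetical SLA operation number (1) and primary WAN interface name
ip sla 1
 ! Probe the stable point, sourced from the primary interface
 icmp-echo 43.43.43.43 source-interface GigabitEthernet0
 ! Send one echo every 5 seconds
 frequency 5
```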

Schedule the IP SLA to start now and run forever:
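Assuming the operation was defined as number 1 (an illustrative choice), the schedule could be as simple as:

```
! Start operation 1 immediately and never stop it
ip sla schedule 1 life forever start-time now
```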

Create an access-list for our local traffic:
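One way to write this, as a sketch. The name SLA-ICMP is made up for this example; the ACL simply matches any ICMP headed to the stable point:

```
! Hypothetical ACL name; matches ICMP from the router to the stable point
ip access-list extended SLA-ICMP
 permit icmp any host 43.43.43.43
```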

Create a route-map that controls the path the ICMP traffic will take:
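A sketch of such a route-map, under a few assumptions: the names LOCAL-SLA and SLA-ICMP are hypothetical, and 192.0.2.1 is a placeholder for the ISP1 next-hop address on the primary link. If the next-hop is not usable, the second set statement sends the matched traffic to Null0:

```
! Hypothetical names; 192.0.2.1 stands in for the ISP1 next-hop
route-map LOCAL-SLA permit 10
 ! Only ICMP toward the stable point is affected
 match ip address SLA-ICMP
 ! Always try the primary path first
 set ip next-hop 192.0.2.1
 ! Otherwise drop it, so the probe can never succeed via the secondary path
 set interface Null0
```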

Apply the local policy:
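Assuming the route-map is called LOCAL-SLA (a name chosen for this example), it could be applied to router-originated traffic like this:

```
! Policy-route packets generated by the router itself
ip local policy route-map LOCAL-SLA
```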

Set up our static routing. Use the primary path if the tracking object is up; if not, use the secondary path, which has an AD of 253:

So let's verify our solution. Under normal circumstances:
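A sketch of the tracking object and the two default routes. The track and SLA numbers are assumptions, and 192.0.2.1 and 198.51.100.1 are placeholders for the ISP1 and ISP2 next-hops; only the AD of 253 comes from the text:

```
! Track object 1 follows the reachability result of IP SLA operation 1
track 1 ip sla 1 reachability
! Primary default route, installed only while track 1 is up
ip route 0.0.0.0 0.0.0.0 192.0.2.1 track 1
! Floating default route via the secondary path, AD 253
ip route 0.0.0.0 0.0.0.0 198.51.100.1 253
```

Because the primary route keeps the default AD of 1, it wins whenever the tracking object is up; the secondary only enters the routing table once the tracked route is withdrawn.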
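One way to check which default route is currently installed on the CE (a suggested command, not the original output):

```
! Lists the static routes currently in the routing table
show ip route static
```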

We can see that we have our static route to our primary path. This must mean that our tracking object is up:
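The state of the tracking object and the probe behind it can be inspected with the following commands, assuming both were created as number 1:

```
! State of the tracking object (Up/Down) and what it tracks
show track 1
! Success and failure counters for the IP SLA operation
show ip sla statistics 1
```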

Everything good here.

Let's verify that we have end-to-end connectivity:
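A simple ping from the CE would do here. One caveat, assuming the access-list matches all ICMP toward 43.43.43.43: pings to the stable point itself are pinned to the primary path by the local policy, so a destination other than the stable point is a better test once failover has happened. 203.0.113.10 below is a hypothetical host beyond the ISPs:

```
! Hypothetical destination reachable through either ISP
ping 203.0.113.10
```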

Superb!

Let's simulate an interface-down scenario on ISP1:
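On ISP1 this could be as simple as shutting down the interface facing the CE. The interface name is an assumption; use whatever connects ISP1 to the CE in your lab:

```
! On ISP1: hypothetical CE-facing interface
interface GigabitEthernet0/0
 shutdown
```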

And now on CE:

Do we still have reachability?

Awesome.

Let's instead bring the ISP1 link back up, but simulate a routing failure by denying ICMP on ISP1:
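A sketch of how this could be done on ISP1, assuming the same hypothetical CE-facing interface and an ACL name invented for this example:

```
! On ISP1: restore the link first
interface GigabitEthernet0/0
 no shutdown
!
! Drop all ICMP but let everything else through
ip access-list extended DENY-ICMP
 deny icmp any any
 permit ip any any
!
! Apply inbound on the CE-facing interface
interface GigabitEthernet0/0
 ip access-group DENY-ICMP in
```

The link stays up, so only the IP SLA probe can detect this failure.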

Again on CE:

Again our tracking object goes down and we still have reachability. Let's make sure we go through ISP2 now:
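A traceroute from the CE is one way to see the path. IOS traceroute sends UDP probes by default, so it is not caught by an ICMP-only local policy and should follow the currently installed default route:

```
! The first hop should now be the ISP2 next-hop
traceroute 43.43.43.43
```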

And when ISP1 comes back online:

Nice and very useful.

I hope this is something you can use as well.