Category Archives: Technology

ISIS Authentication types (packet captures)

In this post i would like to highlight a couple of “features” of ISIS.
More specifically the authentication mechanism used and how it looks in the data plane.

I will do this by configuring a couple of routers and configure the 2 authentication types available. I will then look at packet captures taken from the link between them and illustrate how its used by the ISIS process.

The 2 types of Authentication are link-level authentication of the Hello messages used to establish an adjacency and the second type is the authentication used to authenticate the LSP’s (Link State Packet) themselves.

First off, here is the extremely simple topology, but its all thats required for this purpose:

Simple, right? 2 routers with 1 link between them on Gig1. They are both running ISIS level-2-only mode, which means they will only try and establish a L2 adjacency with their neighbors. Each router has a loopback interface, which is also advertised into ISIS.

First off, lets look at the relevant configuration of CSR-02 for the Link-level authentication:

key chain MY-CHAIN
 key 1
  key-string WIPPIE
!
interface GigabitEthernet1
 ip address 10.1.2.2 255.255.255.0
 ip router isis 1
 negotiation auto
 no mop enabled
 no mop sysid
 isis authentication mode md5
 isis authentication key-chain MY-CHAIN

Without the same configuration on CSR-01, this is what we see in the data path (captured on CSR-02’s G1 interface):

And we also see that we dont have a full adjacency on CSR-01:

CSR-01#sh isis nei

Tag 1:
System Id       Type Interface     IP Address      State Holdtime Circuit Id
CSR-02          L2   Gi1           10.1.2.2        INIT  26       CSR-02.01

Lets apply the same authentication configuration on CSR-01 and see the result:

key chain MY-CHAIN
 key 1
  key-string WIPPIE
!
interface GigabitEthernet1
 ip address 10.1.2.1 255.255.255.0
 ip router isis 1
 negotiation auto
 no mop enabled
 no mop sysid
 isis authentication mode md5
 isis authentication key-chain MY-CHAIN

We now have a full adjacency:

CSR-01#sh isis neighbors 

Tag 1:
System Id       Type Interface     IP Address      State Holdtime Circuit Id
CSR-02          L2   Gi1           10.1.2.2        UP    8        CSR-02.01     

And we have routes from CSR-02:

CSR-01#sh ip route isis | beg Gate
Gateway of last resort is not set

      2.0.0.0/32 is subnetted, 1 subnets
i L2     2.2.2.2 [115/20] via 10.1.2.2, 00:01:07, GigabitEthernet1

Now, this is what we now see from CSR-02’s perspective again:

The Link-level authentication is fairly easy to spot in no time, because you simply wont have a stable adjacency formed.

The second type is LSP authentication. Lets look at the configuration of CSR-02 for this type of authentication:

CSR-02#sh run | sec router isis
 ip router isis 1
 ip router isis 1
router isis 1
 net 49.0000.0000.0002.00
 is-type level-2-only
 authentication mode text
 authentication key-chain MY-CHAIN

In this example, i have selected plain-text authentication, which i certainly dont recommend in production, but its great for our example.

Again, this is what it looks like in the data packet (from CSR-01 to CSR-02) without authentication enabled on CSR-01:

As you can see, we have the LSP that contains CSR-01’s prefixes, but nowhere is authentication present in the packet.

Lets enable it on CSR-01 and see the result:

CSR-01#sh run | sec router isis
 ip router isis 1
 ip router isis 1
router isis 1
 net 49.0000.0000.0001.00
 is-type level-2-only
 authentication mode text
 authentication key-chain MY-CHAIN

The result in the data packet:

Here we clearly have the authentication (with type = 10 (cleartext)) and we can see the password (WIPPIE) because we have selected cleartext.

The result is we a validated ISIS database on both routers.

Thats all folks, hope it helps to understand the difference between the 2 types of authentication in ISIS.

Take care!

 

 

Progress update – 10/07-2017

Hello folks,

Im currently going through the INE DC videos and learning a lot about fabrics and how they work along with a fair bit of UCS information on top of that!

Im spending an average of 2.5 hours on weekdays for study and a bit more in the weekends when time permits.

I still have no firm commitment to the CCIE DC track, but at some point I need to commit to it and really get behind it. One of these days 😉

I mentioned it to the wife-to-be a couple of days ago and while she didn’t applaud the idea, at least she wasn’t firmly against it, which is always something I guess! Its very important for me to have my family behind me in these endeavours!

Im still a bit concerned about the lack of rack rentals for DCv2 from INE, which is something I need to have in place before I order a bootcamp or more training materials from them. As people know by now, I really do my best learning in front of the “system”, trying out what works and what doesn’t.

Now to spin up a few N9K’s in the lab and play around with NX-OS unicast and multicast routing!

Take care.

New Lab Server & random updates

New Server:

So I just completed a purchase off eBay for a new server for my lab purposes.

For a while now I’ve been limited to 32Gb of memory on my old ESXi server, which is really more like 20Gb when my regular servers have had their share. Running a combination of different types of devices, each taking at least 4Gb of memory, doesn’t leave much room for larger labs.

I decided to go with a “real” server this time around. So I got an older Cisco UCS C200 M2 server with 2 x Xeon 5570 processors and an additional 96 Gb ram (on top of the 24 it came with). That stil leaves room for a bit of memory upgrades in the future (it supports a total of 192Gb) (had a budget on this one, so couldn’t go crazy).

Work:

Work has been crazy lately. 2 of my Team members just resigned so a lot of workload has to be shifted until we find suitable replacements. That means I’ve been working 65+ hour work weeks for a while now. Something that I dont find even remotely amusing to be honest. But I’ve been reassured that everything is being done to interview candidates, so im hopeful it will work out after the summer holidays.

We have a lot of interesting projects coming up, fx. our first production environment running Cisco ACI. This also included some very good training. Its really a different ball-game compared to the old way of doing Datacenters.

Also on my plate is some iWan solutions. Pretty interesting all in all.

Study:

Im still reading my way through the Cisco Intelligent WAN (IWAN) book. Its still on my list of things to do to take the exam I mentioned earlier, but keeping the work network running takes priority. Also I can’t help but feeling the pull of another CCIE when time permits, but its still just a thought (we all know how that usually goes right? 🙂 )

Personal:

September 16, my long-time girlfriend and I are getting married! Yes.. Married. Scary, but still something I look forward to. We’ve been together for an amazing 11 years on that date so its about time (she keeps telling me). As you may know, I proposed when I went to Las Vegas for Cisco Live last year, so its very memorable 🙂

Thats about it for now!

Take care!

A look at Auto-Tunnel Mesh Groups

In this post I would like to give a demonstration of using the Auto-Tunnel Mesh group feature.

As you may know, manual MPLS-TE tunnels are first and foremost unidirectional, meaning that if you do them between two PE nodes, you have to do a tunnel in each direction with the local PE node being the headend.

Now imagine if your network had 10 PE routers and you wanted to do a full mesh between them, this can become pretty burdensome and error-prone.
Thankfully there’s a method to avoid doing this manual configuration and instead rely on your IGP to signal its willingness to become part of a TE “Mesh”. Thats what the Auto-Tunnel Mesh Group feature is all about!

toplogy

In my small SP setup, I only have 3 PE devices, namely PE-1, PE-2 and PE-3. I also only have one P node, called P-1.
However small this setup is, its enough to demonstrate the power of the Auto-Tunnel mesh functionality.

Beyond that, I have setup a small MPLS L3 VPN service for customer CUST-A, which has a presence on all 3 PE nodes. The VPNv4 address-family is using a RR which for this purpose is P-1.

We are running OSPF as the IGP of choice. This means that our Mesh membership will be signaled using Opaque LSA’s, which I will show you later on.

The goal of the lab is to use the Auto-Tunnel mesh functionality to create a full mesh of tunnels between my PE nodes and use this exclusively for label switching and to do so with a general template that would scale to many more PE devices than just the 3 in this lab.

The very first thing you want to do is to enable MPLS-TE both globally and on your interfaces. We can verify this on PE-2:

PE-2:

mpls traffic-eng tunnels
!
interface GigabitEthernet2
ip address 10.2.100.2 255.255.255.0
negotiation auto
mpls traffic-eng tunnels
!

The second thing you want to do is to enable the mesh-feature globally using the following command as configured on PE-2 as well:

PE-2:

mpls traffic-eng auto-tunnel mesh

Starting off with MPLS-TE, we need to make sure our IGP is actually signaling this to begin with. I have configured MPLS-TE on the area 0 which is the only area in use in our topology:

PE-2:

router ospf 1
network 0.0.0.0 255.255.255.255 area 0
mpls traffic-eng router-id Loopback0
mpls traffic-eng area 0
mpls traffic-eng mesh-group 100 Loopback0 area 0

Dont get hung up on the last configuration line. I will explain this shortly. However notice the “mpls traffic-eng area 0” and “mpls traffic-eng router-id loopback0”. After those two lines are configured, you should be able to retrieve information on the MPLS-TE topology as seen from your IGP:

PE-2:

PE-2#sh mpls traffic-eng topology brief
My_System_id: 2.2.2.2 (ospf 1 area 0)

Signalling error holddown: 10 sec Global Link Generation 22

IGP Id: 1.1.1.1, MPLS TE Id:1.1.1.1 Router Node (ospf 1 area 0)
Area mg-id's:
: mg-id 100 1.1.1.1 :
link[0]: Broadcast, DR: 10.1.100.100, nbr_node_id:8, gen:14
frag_id: 2, Intf Address: 10.1.100.1
TE metric: 1, IGP metric: 1, attribute flags: 0x0
SRLGs: None

IGP Id: 2.2.2.2, MPLS TE Id:2.2.2.2 Router Node (ospf 1 area 0)
link[0]: Broadcast, DR: 10.2.100.100, nbr_node_id:9, gen:19
frag_id: 2, Intf Address: 10.2.100.2
TE metric: 1, IGP metric: 1, attribute flags: 0x0
SRLGs: None

IGP Id: 3.3.3.3, MPLS TE Id:3.3.3.3 Router Node (ospf 1 area 0)
Area mg-id's:
: mg-id 100 3.3.3.3 :
link[0]: Broadcast, DR: 10.3.100.100, nbr_node_id:11, gen:22
frag_id: 2, Intf Address: 10.3.100.3
TE metric: 1, IGP metric: 1, attribute flags: 0x0
SRLGs: None

IGP Id: 10.1.2.2, MPLS TE Id:22.22.22.22 Router Node (ospf 1 area 0)
link[0]: Broadcast, DR: 10.1.100.100, nbr_node_id:8, gen:17
frag_id: 3, Intf Address: 10.1.100.100
TE metric: 10, IGP metric: 10, attribute flags: 0x0
SRLGs: None

link[1]: Broadcast, DR: 10.2.100.100, nbr_node_id:9, gen:17
frag_id: 4, Intf Address: 10.2.100.100
TE metric: 10, IGP metric: 10, attribute flags: 0x0
SRLGs: None

link[2]: Broadcast, DR: 10.3.100.100, nbr_node_id:11, gen:17
frag_id: 5, Intf Address: 10.3.100.100
TE metric: 10, IGP metric: 10, attribute flags: 0x0
SRLGs: None

IGP Id: 10.1.100.100, Network Node (ospf 1 area 0)
link[0]: Broadcast, Nbr IGP Id: 10.1.2.2, nbr_node_id:5, gen:13

link[1]: Broadcast, Nbr IGP Id: 1.1.1.1, nbr_node_id:6, gen:13

IGP Id: 10.2.100.100, Network Node (ospf 1 area 0)
link[0]: Broadcast, Nbr IGP Id: 10.1.2.2, nbr_node_id:5, gen:18

link[1]: Broadcast, Nbr IGP Id: 2.2.2.2, nbr_node_id:1, gen:18

IGP Id: 10.3.100.100, Network Node (ospf 1 area 0)
link[0]: Broadcast, Nbr IGP Id: 10.1.2.2, nbr_node_id:5, gen:21

link[1]: Broadcast, Nbr IGP Id: 3.3.3.3, nbr_node_id:7, gen:21

The important thing to notice here is that we are indeed seeing the other routers in the network, all the PE devices as well as the P device.

Now to the last line of configuration under the router ospf process:

PE-2:

"mpls traffic-eng mesh-group 100 Loopback0 area 0"

What this states is that we would like to use the Auto-Tunnel Mesh group feature, with this PE node being a member of group 100, using loopback0 for communication on the tunnel and running within the area 0.

This by itself only handles the signaling, but we also want to deploy a template in order to create the individual tunnel interfaces. This is done in the following manner:

PE-2:

interface Auto-Template100
ip unnumbered Loopback0
tunnel mode mpls traffic-eng
tunnel destination mesh-group 100
tunnel mpls traffic-eng autoroute announce
tunnel mpls traffic-eng path-option 10 dynamic

Using the Auto-Template100 interface, we, as we would also do in manual TE, specify our loopback address, the tunnel mode and the path option. Note that here we are simply following the IGP, which sort of defeats the purpose of many MPLS-TE configurations. But with our topology there is no path diversity so it wouldnt matter anyways.

Also, the autoroute announce command is used to force traffic into the tunnels.

The important thing is the “tunnel destination mesh-group 100” which ties this configuration snippet into the OSPF one.

After everything is setup, you should see some dynamic tunnels being created on each PE node:

PE-2:

PE-2#sh ip int b | incl up
GigabitEthernet1 100.100.101.100 YES manual up up
GigabitEthernet2 10.2.100.2 YES manual up up
Auto-Template100 2.2.2.2 YES TFTP up up
Loopback0 2.2.2.2 YES manual up up
Tunnel64336 2.2.2.2 YES TFTP up up
Tunnel64337 2.2.2.2 YES TFTP up up

Lets verify the current RIB configuration after this step:

PE-2:

PE-2#sh ip route | beg Gateway
Gateway of last resort is not set

1.0.0.0/32 is subnetted, 1 subnets
O 1.1.1.1 [110/12] via 1.1.1.1, 00:29:13, Tunnel64336
2.0.0.0/32 is subnetted, 1 subnets
C 2.2.2.2 is directly connected, Loopback0
3.0.0.0/32 is subnetted, 1 subnets
O 3.3.3.3 [110/12] via 3.3.3.3, 00:28:48, Tunnel64337
10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
O 10.1.100.0/24 [110/11] via 10.2.100.100, 00:29:13, GigabitEthernet2
C 10.2.100.0/24 is directly connected, GigabitEthernet2
L 10.2.100.2/32 is directly connected, GigabitEthernet2
O 10.3.100.0/24 [110/11] via 10.2.100.100, 00:29:13, GigabitEthernet2
22.0.0.0/32 is subnetted, 1 subnets
O 22.22.22.22 [110/2] via 10.2.100.100, 00:29:13, GigabitEthernet2

Very good. We can see that in order to reach 1.1.1.1/32 which is PE-1’s loopback, we are indeed routing through one of the dynamic tunnels.
The same goes for 3.3.3.3/32 towards PE-3’s loopback.
PE-2:

PE-2#traceroute 1.1.1.1 so loo0
Type escape sequence to abort.
Tracing the route to 1.1.1.1
VRF info: (vrf in name/id, vrf out name/id)
1 10.2.100.100 [MPLS: Label 17 Exp 0] 16 msec 22 msec 22 msec
2 10.1.100.1 25 msec * 19 msec

We can see that traffic towards that loopback is indeed being label-switched. And just to make it obvious, let me make sure we are not using LDP 🙂

PE-2:

PE-2#sh mpls ldp neighbor
PE-2#

On P-1, it being the midpoint of our LSP’s, we would expect 6 unidirectional tunnels in total:

P-1:

P-1#sh mpls for
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
16 Pop Label 3.3.3.3 64336 [6853] \
472 Et2/0 10.1.100.1
17 Pop Label 2.2.2.2 64336 [2231] \
2880 Et2/0 10.1.100.1
18 Pop Label 1.1.1.1 64336 [4312] \
2924 Et2/1 10.2.100.2
19 Pop Label 1.1.1.1 64337 [4962] \
472 Et2/2 10.3.100.3
20 Pop Label 2.2.2.2 64337 [6013] \
562 Et2/2 10.3.100.3
21 Pop Label 3.3.3.3 64337 [4815] \
0 Et2/1 10.2.100.2

Exactly what we expected.
The following is the output of the command: “show ip ospf database opaque-area” on PE-2. I have cut it down to the relevant opaque-lsa part (we are using 2 types, one for the general MPLS-TE and one for the Mesh-Group feature):

LS age: 529
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 4.0.0.0
Opaque Type: 4
Opaque ID: 0
Advertising Router: 1.1.1.1
LS Seq Number: 80000002
Checksum: 0x5364
Length: 32

Capability Type: Mesh-group
Length: 8
Value:

0000 0064 0101 0101

LS age: 734
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 4.0.0.0
Opaque Type: 4
Opaque ID: 0
Advertising Router: 2.2.2.2
LS Seq Number: 80000002
Checksum: 0x6748
Length: 32

Capability Type: Mesh-group
Length: 8
Value:

0000 0064 0202 0202

LS age: 701
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 4.0.0.0
Opaque Type: 4
Opaque ID: 0
Advertising Router: 3.3.3.3
LS Seq Number: 80000002
Checksum: 0x7B2C
Length: 32

Capability Type: Mesh-group
Length: 8
Value:

0000 0064 0303 0303

I have highlighted the interesting parts, which is the Advertising Router and the value of the TLV, those starting with 0000 0064, which is in fact the membership of “100” being signaled across the IGP area.
Okay, all good i hear you say, but lets do an end-to-end test from the CE devices in Customer CUST-A’s domain:

R1:

R1#sh ip route | beg Gateway
Gateway of last resort is not set

10.0.0.0/32 is subnetted, 3 subnets
C 10.1.1.1 is directly connected, Loopback0
B 10.2.2.2 [20/0] via 100.100.100.100, 00:37:46
B 10.3.3.3 [20/0] via 100.100.100.100, 00:37:36
100.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
C 100.100.100.0/24 is directly connected, FastEthernet0/0
L 100.100.100.1/32 is directly connected, FastEthernet0/0

So we are learning the routes on the customer side (through standard IPv4 BGP).

R1:

R1#ping 10.2.2.2 so loo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.2.2.2, timeout is 2 seconds:
Packet sent with a source address of 10.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/72/176 ms
R1#ping 10.3.3.3 so loo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.3.3.3, timeout is 2 seconds:
Packet sent with a source address of 10.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/32/48 ms

We have reachability! – What about traceroute:

R1:

R1#traceroute 10.2.2.2 so loo0

Type escape sequence to abort.
Tracing the route to 10.2.2.2

1 100.100.100.100 28 msec 20 msec 12 msec
2 10.1.100.100 [MPLS: Labels 18/17 Exp 0] 44 msec 136 msec 60 msec
3 100.100.101.100 [MPLS: Label 17 Exp 0] 28 msec 32 msec 12 msec
4 100.100.101.3 28 msec 32 msec 24 msec
R1#traceroute 10.3.3.3 so loo0

Type escape sequence to abort.
Tracing the route to 10.3.3.3

1 100.100.100.100 48 msec 16 msec 8 msec
2 10.1.100.100 [MPLS: Labels 19/17 Exp 0] 48 msec 12 msec 52 msec
3 100.100.102.100 [MPLS: Label 17 Exp 0] 16 msec 28 msec 36 msec
4 100.100.102.4 68 msec 56 msec 48 msec

Just what we would expect from our L3 MPLS VPN service. A transport label (this time through MPLS-TE) and a VPN label as signaled through MP-BGP.

To round it off, I have attached the following from a packet capture on P-1’s interface toward PE-1 and then re-issued the ICMP-echo from R1’s loopback toward R2’s loopback adress:

wireshark-output

With that, I hope its been informative for you. Thanks for reading!

References:

http://www.cisco.com/c/en/us/td/docs/ios/12_0s/feature/guide/gsmeshgr.html

Configurations:

configurations

Practical DMVPN Example

In this post, I will put together a variety of different technologies involved in a real-life DMVPN deployment.

This includes things such as the correct tunnel configuration, routing-configuration using BGP as the protocol of choice, as well as NAT toward an upstream provider and front-door VRF’s in order to implement a default-route on both the Hub and the Spokes and last, but not least a newer feature, namely Per-Tunnel QoS using NHRP.

So I hope you will find the information relevant to your DMVPN deployments.

First off, lets take a look at the topology I will be using for this example:
Topology

As can be seen, we have a hub router which is connected to two different ISP’s. One to a general purpose internet provider (the internet cloud in this topology) which is being used as transport for our DMVPN setup, as well as a router in the TeleCom network (AS 59701), providing a single route for demonstration purposes (8.8.8.8/32). We have been assigned the 70.0.0.0/24 network from TeleCom to use for internet access as well.

Then we have to Spoke sites, with a single router in each site (Spoke-01 and Spoke-02 respectively).
Each one has a loopback interface which is being announced.

The first “trick” here, is to use the so-called Front Door VRF feature. What this basically allows is to have your transport interface located in a separate VRF. This allows us to have 2 default (0.0.0.0) routes. One used for the transport network and one for the “global” VRF, which is being used by the clients behind each router.

I have created a VRF on the 3 routers in our network (the Hub, Spoke-01 and Spoke-02) called “Inet_VRF”. Lets take a look at the configuration for this network along with its routing information (RIB):


HUB#sh run | beg vrf defi
vrf definition Inet_VRF
!
address-family ipv4
exit-address-family

HUB#sh ip route vrf Inet_VRF | beg Ga
Gateway of last resort is 130.0.0.1 to network 0.0.0.0

S* 0.0.0.0/0 [1/0] via 130.0.0.1
130.0.0.0/16 is variably subnetted, 2 subnets, 2 masks
C 130.0.0.0/30 is directly connected, GigabitEthernet1
L 130.0.0.2/32 is directly connected, GigabitEthernet1

Very simple indeed. We are just using the IPv4 address-family for this VRF and we have a static default route pointing toward the Internet Cloud.

The spokes are very similar:

Spoke-01:





Spoke-01#sh ip route vrf Inet_VRF | beg Gat
Gateway of last resort is 140.0.0.1 to network 0.0.0.0

S* 0.0.0.0/0 [1/0] via 140.0.0.1
140.0.0.0/16 is variably subnetted, 2 subnets, 2 masks
C 140.0.0.0/30 is directly connected, GigabitEthernet1
L 140.0.0.2/32 is directly connected, GigabitEthernet1



Spoke-02:




Spoke-02#sh ip route vrf Inet_VRF | beg Gat
Gateway of last resort is 150.0.0.1 to network 0.0.0.0

S* 0.0.0.0/0 [1/0] via 150.0.0.1
150.0.0.0/16 is variably subnetted, 2 subnets, 2 masks
C 150.0.0.0/30 is directly connected, GigabitEthernet1
L 150.0.0.2/32 is directly connected, GigabitEthernet1



With this in place, we should have full reachability to the internet interface address of each router in the Inet_VRF:


HUB#ping vrf Inet_VRF 140.0.0.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 140.0.0.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 6/22/90 ms

HUB#ping vrf Inet_VRF 150.0.0.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 150.0.0.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 4/16/65 ms

With this crucial piece of configuration for the transport network, we can now start building our DMVPN network, starting with the Hub configuration:

HUB#sh run int tun100
Building configuration...



Current configuration : 452 bytes
!
interface Tunnel100
ip address 172.16.0.100 255.255.255.0
no ip redirects
ip mtu 1400
ip nat inside
ip nhrp network-id 100
ip nhrp redirect
ip tcp adjust-mss 1360
load-interval 30
nhrp map group 10MB-Group service-policy output 10MB-Parent
nhrp map group 30MB-Group service-policy output 30MB-Parent
tunnel source GigabitEthernet1
tunnel mode gre multipoint
tunnel vrf Inet_VRF
tunnel protection ipsec profile DMVPN-PROFILE shared
end



There are a fair bit of configuration here. However, pay attention to the “tunnel vrf Inet_VRF” command, as this is the piece that ties into your transport address for the tunnel. So basically we use G1 for the interface, and this is located in the Inet_VRF.

Also notice that we are running crypto on top of our tunnel to protect it from prying eyes. The relevant configuration is here:

crypto keyring MY-KEYRING vrf Inet_VRF
pre-shared-key address 0.0.0.0 0.0.0.0 key SUPER-SECRET
!
!
!
!
!
crypto isakmp policy 1
encr aes 256
hash sha256
authentication pre-share
group 2
!
!
crypto ipsec transform-set TRANSFORM-SET esp-aes 256 esp-sha256-hmac
mode transport
!
crypto ipsec profile DMVPN-PROFILE
set transform-set TRANSFORM-SET



Pretty straightforward with a static pre shared key in place for all nodes.

With the crypto in place, you should have a SA for it installed:


HUB#sh crypto isakmp sa
IPv4 Crypto ISAKMP SA
dst src state conn-id status
130.0.0.2 150.0.0.2 QM_IDLE 1002 ACTIVE
130.0.0.2 140.0.0.2 QM_IDLE 1001 ACTIVE



One for each spoke is in place an in the correct state (QM_IDLE = Good).

So now, lets verify the entire DMVPN solution with a few “show” commands:

HUB#sh dmvpn
Legend: Attrb --> S - Static, D - Dynamic, I - Incomplete
N - NATed, L - Local, X - No Socket
T1 - Route Installed, T2 - Nexthop-override
C - CTS Capable
# Ent --> Number of NHRP entries with same NBMA peer
NHS Status: E --> Expecting Replies, R --> Responding, W --> Waiting
UpDn Time --> Up or Down Time for a Tunnel
==========================================================================

Interface: Tunnel100, IPv4 NHRP Details
Type:Hub, NHRP Peers:2,

# Ent Peer NBMA Addr Peer Tunnel Add State UpDn Tm Attrb
----- --------------- --------------- ----- -------- -----
1 140.0.0.2 172.16.0.1 UP 05:24:09 D
1 150.0.0.2 172.16.0.2 UP 05:24:03 D


We have 2 spokes associated with our Tunnel100. One where the public address (NBMA) is 140.0.02 and the other 150.0.0.2. Inside the tunnel, the spokes have an IPv4 address of 172.16.0.1 and .2 respectively. Also, we can see that these are being learned Dynamically (The D in the 5 column).

All is well and good so far. But we still need to run a routing protocol across the tunnel interface in order to exchange routes. BGP and EIGRP are good candidates for this and in this example I have used BGP. And since we are running Phase 3 DMVPN, we can actually get away with just receiving a default route on the spokes!. At this point remember that our previous default route pointing toward the internet was in the Inet_VRF table, so these two wont collide.

Take a look at the BGP configuration on the hub router:


HUB#sh run | beg router bgp
router bgp 100
bgp log-neighbor-changes
bgp listen range 172.16.0.0/24 peer-group MYPG
network 70.0.0.0 mask 255.255.255.0
neighbor MYPG peer-group
neighbor MYPG remote-as 100
neighbor MYPG next-hop-self
neighbor MYPG default-originate
neighbor MYPG route-map ONLY-DEFAULT-MAP out
neighbor 133.1.2.2 remote-as 59701
neighbor 133.1.2.2 route-map TO-UPSTREAM-SP out



And the referenced route-maps:


ip prefix-list ONLY-DEFAULT-PFX seq 5 permit 0.0.0.0/0
!
ip prefix-list OUR-SCOPE-PFX seq 5 permit 70.0.0.0/24
!
route-map TO-UPSTREAM-SP permit 5
match ip address prefix-list OUR-SCOPE-PFX
!
route-map TO-UPSTREAM-SP deny 10
!
route-map ONLY-DEFAULT-MAP permit 10
match ip address prefix-list ONLY-DEFAULT-PFX



We are using the BGP listen feature, which makes it dynamic to setup BGP peers. We allow everything in the 172.16.0.0/24 network to setup a BGP session and we are using the Peer-Group MYPG for controlling the settings. Notice that we are sending out only a default route to the spokes.

Also pay attention to the fact that we are sending the 70.0.0.0/24 upstream to the TeleCom ISP. Since we are going to use this network for NAT’ing purposes only, we have a static route to Null0 installed as well:


HUB#sh run | incl ip route
ip route 70.0.0.0 255.255.255.0 Null0



For the last part of our BGP configuration, lets take a look at the Spoke configuration, which is very simple and straightforward:


Spoke-01#sh run | beg router bgp
router bgp 100
bgp log-neighbor-changes
redistribute connected route-map ONLY-LOOPBACK0
neighbor 172.16.0.100 remote-as 100



And the associated route-map:


route-map ONLY-LOOPBACK0 permit 10
match interface Loopback0

So basically thats a cookie-cutter configuration thats being reused on Spoke-02 as well.

So what does the routing end up with on the Hub side of things:


HUB# sh ip bgp | beg Networ
Network Next Hop Metric LocPrf Weight Path
0.0.0.0 0.0.0.0 0 i
*>i 1.1.1.1/32 172.16.0.1 0 100 0 ?
*>i 2.2.2.2/32 172.16.0.2 0 100 0 ?
*> 8.8.8.8/32 133.1.2.2 0 0 59701 i
*> 70.0.0.0/24 0.0.0.0 0 32768 i

We have a default route, which we injected using the default-information originate command. Then we receive the loopback addresses from each of the two spokes. Finally we have the network statement, the hub inserted, sending it to the upstream TeleCom. Finally we have 8.8.8.8/32 from TeleCom 🙂

Lets look at the spokes:


Spoke-01#sh ip bgp | beg Network
Network Next Hop Metric LocPrf Weight Path
*>i 0.0.0.0 172.16.0.100 0 100 0 i
*> 1.1.1.1/32 0.0.0.0 0 32768 ?

Very straightforward and simple. A default from the hub and the loopback we ourself injected.

Right now, we are in a situation where we have the correct routing information and we are ready to let NHRP do its magic. Lets take a look what happens when Spoke-01 sends a ping to Spoke-02’s loopback address:


Spoke-01#ping 2.2.2.2 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2.2.2.2, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 8/17/32 ms

Everything works, and we the math is right, we should see an NHRP shortcut being created for the Spoke to Spoke tunnel:


Spoke-01#sh ip nhrp shortcut
2.2.2.2/32 via 172.16.0.2
Tunnel100 created 00:00:06, expire 01:59:53
Type: dynamic, Flags: router rib
NBMA address: 150.0.0.2
Group: 30MB-Group
172.16.0.2/32 via 172.16.0.2
Tunnel100 created 00:00:06, expire 01:59:53
Type: dynamic, Flags: router nhop rib
NBMA address: 150.0.0.2
Group: 30MB-Group

and on Spoke-02:


Spoke-02#sh ip nhrp shortcut
1.1.1.1/32 via 172.16.0.1
Tunnel100 created 00:01:29, expire 01:58:31
Type: dynamic, Flags: router rib
NBMA address: 140.0.0.2
Group: 10MB-Group
172.16.0.1/32 via 172.16.0.1
Tunnel100 created 00:01:29, expire 01:58:31
Type: dynamic, Flags: router nhop rib
NBMA address: 140.0.0.2
Group: 10MB-Group

And the RIB on both routers should reflect this as well:


Gateway of last resort is 172.16.0.100 to network 0.0.0.0



B* 0.0.0.0/0 [200/0] via 172.16.0.100, 06:09:37
1.0.0.0/32 is subnetted, 1 subnets
C 1.1.1.1 is directly connected, Loopback0
2.0.0.0/32 is subnetted, 1 subnets
H 2.2.2.2 [250/255] via 172.16.0.2, 00:02:08, Tunnel100
172.16.0.0/16 is variably subnetted, 3 subnets, 2 masks
C 172.16.0.0/24 is directly connected, Tunnel100
L 172.16.0.1/32 is directly connected, Tunnel100
H 172.16.0.2/32 is directly connected, 00:02:08, Tunnel100

The “H” routes are from NHRP

And just to verify that our crypto is working, heres a capture from wireshark on the internet “Cloud” when pinging from Spoke-01 to Spoke-02:

wireshark

Now lets turn our attention to the Quality of Service aspect of our solution.

We have 3 facts to deal with.

1) The Hub router has a line-rate Gigabit Ethernet circuit to the Internet.
2) The Spoke-01 site has a Gigabit Ethernet circuit, but its a subrate to 10Mbit access-rate.
3) The Spoke-02 site has a Gigabit Ethernet circuit, but its a subrate to 30Mbit access-rate.

We somehow want to signal to the Hub site to “respect” these access-rates. This is where the “Per-Tunnel QoS” feature comes into play.

If you remember the Hub tunnel100 configuration, which looks like this:


interface Tunnel100
ip address 172.16.0.100 255.255.255.0
no ip redirects
ip mtu 1400
ip nat inside
ip nhrp network-id 100
ip nhrp redirect
ip tcp adjust-mss 1360
load-interval 30
nhrp map group 10MB-Group service-policy output 10MB-Parent
nhrp map group 30MB-Group service-policy output 30MB-Parent
tunnel source GigabitEthernet1
tunnel mode gre multipoint
tunnel vrf Inet_VRF
tunnel protection ipsec profile DMVPN-PROFILE shared



We have these 2 “nhrp map” statements. What these in effect does is to provide a framework for the Spokes to register using one of these two maps for the Hub to use to that individual spoke.

So these are the policy-maps we reference:


HUB#sh policy-map
Policy Map 30MB-Child
Class ICMP
priority 5 (kbps)
Class TCP
bandwidth 50 (%)

Policy Map 10MB-Parent
Class class-default
Average Rate Traffic Shaping
cir 10000000 (bps)
service-policy 10MB-Child

Policy Map 10MB-Child
Class ICMP
priority 10 (%)
Class TCP
bandwidth 80 (%)


Policy Map 30MB-Parent
Class class-default
Average Rate Traffic Shaping
cir 30000000 (bps)
service-policy 30MB-Child



We have a hiearchical policy for both the 10Mbit and 30Mbit groups. each with their own child policies.

On the Spoke side of things, all we have to do is to tell the Hub which group to use:


interface Tunnel100
bandwidth 10000
ip address 172.16.0.1 255.255.255.0
no ip redirects
ip mtu 1400
ip nhrp map 172.16.0.100 130.0.0.2
ip nhrp map multicast 130.0.0.2
ip nhrp network-id 100
ip nhrp nhs 172.16.0.100
ip nhrp shortcut
ip tcp adjust-mss 1360
load-interval 30
nhrp group 10MB-Group
qos pre-classify
tunnel source GigabitEthernet1
tunnel mode gre multipoint
tunnel vrf Inet_VRF
tunnel protection ipsec profile DMVPN-PROFILE



Here on Spoke-01, we request that the QoS group to be used is 10MB-Group.

And on Spoke-02:


interface Tunnel100
bandwidth 30000
ip address 172.16.0.2 255.255.255.0
no ip redirects
ip mtu 1400
ip nhrp map 172.16.0.100 130.0.0.2
ip nhrp map multicast 130.0.0.2
ip nhrp network-id 100
ip nhrp nhs 172.16.0.100
ip nhrp shortcut
ip tcp adjust-mss 1360
load-interval 30
nhrp group 30MB-Group
qos pre-classify
tunnel source GigabitEthernet1
tunnel mode gre multipoint
tunnel vrf Inet_VRF
tunnel protection ipsec profile DMVPN-PROFILE



We request the 30MB-Group.

So lets verify that the Hub understands this and applies it accordingly:


HUB#sh nhrp group-map
Interface: Tunnel100
NHRP group: 10MB-Group
QoS policy: 10MB-Parent
Transport endpoints using the qos policy:
140.0.0.2



NHRP group: 30MB-Group
QoS policy: 30MB-Parent
Transport endpoints using the qos policy:
150.0.0.2



Excellent. and to see that its actually applied correctly:


HUB#sh policy-map multipoint tunnel 100

 

Interface Tunnel100 <--> 140.0.0.2

Service-policy output: 10MB-Parent

Class-map: class-default (match-any)
903 packets, 66746 bytes
30 second offered rate 0000 bps, drop rate 0000 bps
Match: any
Queueing
queue limit 64 packets
(queue depth/total drops/no-buffer drops) 0/0/0
(pkts output/bytes output) 900/124744
shape (average) cir 10000000, bc 40000, be 40000
target shape rate 10000000

Service-policy : 10MB-Child

queue stats for all priority classes:
Queueing
queue limit 512 packets
(queue depth/total drops/no-buffer drops) 0/0/0
(pkts output/bytes output) 10/1860

Class-map: ICMP (match-all)
10 packets, 1240 bytes
30 second offered rate 0000 bps, drop rate 0000 bps
Match: protocol icmp
Priority: 10% (1000 kbps), burst bytes 25000, b/w exceed drops: 0
Class-map: TCP (match-all)
890 packets, 65494 bytes
30 second offered rate 0000 bps, drop rate 0000 bps
Match: access-group 110
Queueing
queue limit 64 packets
(queue depth/total drops/no-buffer drops) 0/0/0
(pkts output/bytes output) 890/122884
bandwidth 80% (8000 kbps)

Class-map: class-default (match-any)
3 packets, 12 bytes
30 second offered rate 0000 bps, drop rate 0000 bps
Match: any

queue limit 64 packets
(queue depth/total drops/no-buffer drops) 0/0/0
(pkts output/bytes output) 0/0

Interface Tunnel100 <--> 150.0.0.2

Service-policy output: 30MB-Parent

Class-map: class-default (match-any)
901 packets, 66817 bytes
30 second offered rate 0000 bps, drop rate 0000 bps
Match: any
Queueing
queue limit 64 packets
(queue depth/total drops/no-buffer drops) 0/0/0
(pkts output/bytes output) 901/124898
shape (average) cir 30000000, bc 120000, be 120000
target shape rate 30000000

Service-policy : 30MB-Child

queue stats for all priority classes:
Queueing
queue limit 512 packets
(queue depth/total drops/no-buffer drops) 0/0/0
(pkts output/bytes output) 10/1860

Class-map: ICMP (match-all)
10 packets, 1240 bytes
30 second offered rate 0000 bps, drop rate 0000 bps
Match: protocol icmp
Priority: 5 kbps, burst bytes 1500, b/w exceed drops: 0
Class-map: TCP (match-all)
891 packets, 65577 bytes
30 second offered rate 0000 bps, drop rate 0000 bps
Match: access-group 110
Queueing
queue limit 64 packets
(queue depth/total drops/no-buffer drops) 0/0/0
(pkts output/bytes output) 891/123038
bandwidth 50% (15000 kbps)

Class-map: class-default (match-any)
0 packets, 0 bytes
30 second offered rate 0000 bps, drop rate 0000 bps
Match: any

queue limit 64 packets
(queue depth/total drops/no-buffer drops) 0/0/0
(pkts output/bytes output) 0/0


The last piece of the QoS puzzle is to make sure you have a service-policy applied on the transport interfaces on the spokes as well:


Spoke-01#sh run int g1
Building configuration...

&nbsp;


Current configuration : 210 bytes
!
interface GigabitEthernet1
description -= Towards Internet Router =-
bandwidth 30000
vrf forwarding Inet_VRF
ip address 140.0.0.2 255.255.255.252
negotiation auto
service-policy output 10MB-Parent
end



and on Spoke-02:


poke-02#sh run int g1
Building configuration...

&nbsp;


Current configuration : 193 bytes
!
interface GigabitEthernet1
description -= Towards Internet Router =-
vrf forwarding Inet_VRF
ip address 150.0.0.2 255.255.255.252
negotiation auto
service-policy output 30MB-Parent
end



The last thing i want to mention is the NAT on the hub to use the 70.0.0.0/24 network for the outside world. Pretty straight forward NAT (inside on the tunnel interface 100 and outside on the egress interface toward Telecom, G2):


HUB#sh run int g2
Building configuration...

&nbsp;


Current configuration : 106 bytes
!
interface GigabitEthernet2
ip address 133.1.2.1 255.255.255.252
ip nat outside
negotiation auto
end



Also the NAT configuration itself:


ip nat pool NAT-POOL 70.0.0.1 70.0.0.253 netmask 255.255.255.0
ip nat inside source list 10 pool NAT-POOL overload
!
HUB#sh access-list 10
Standard IP access list 10
10 permit 1.1.1.1
20 permit 2.2.2.2



We are only NAT’ing the two loopbacks from the spokes on our network.

Lets do a final verification on the spokes to 8.8.8.8/32:




Spoke-01#ping 8.8.8.8 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 8.8.8.8, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 12/23/56 ms

and Spoke-02:


Spoke-02#ping 8.8.8.8 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 8.8.8.8, timeout is 2 seconds:
Packet sent with a source address of 2.2.2.2
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 10/14/27 ms



Lets verify the NAT state table on the hub:


HUB#sh ip nat trans
Pro Inside global Inside local Outside local Outside global
icmp 70.0.0.1:1 1.1.1.1:5 8.8.8.8:5 8.8.8.8:1
icmp 70.0.0.1:2 2.2.2.2:0 8.8.8.8:0 8.8.8.8:2
Total number of translations: 2



All good!.

I hope you have had a chance to look at some of the fairly simple configuration snippets thats involved in these techniques and how they fit together in the overall scheme of things.

If you have any questions, please let me know!

Have fun with the lab!

(Configurations will be added shortly!)

Cisco Live US! 2016

I am fortunate enough, to be able to goto Cisco Live US! again this year.
Last year was such an experience, that my hopes are really high for this year as well.

I will be arriving on Friday the 8th and leaving on the 15th. Not a long stay this time, but it was what my boss could arrange for.
Again this year I will be bringing my better half, so she can experience the city and hopefully we’ll get a few hours of sightseeing in between commitments.

One of the things that im really looking forward to, is meeting up with friends and peers. This year is a bonus for me, as I get to say Congratulations to my friend Daniel (lostintransit.se) in person and not just on the phone, on passing the CCDE practical exam!

Also, a first for me, will be meeting up with Darren (mellowd.co.uk). We have been talking for a long time on twitter, mail and webex and im really looking forward to meeting him in person.

When we get closer to the event, I will be posting my Cisco Live! schedule here.

If you happen to be around the Las Vegas area, or even at Cisco Live!, drop me a line and maybe we can meet up!

See you there!

February – A busy month indeed!

Wow, what a busy month this has been!

So I started my new job on February 1st and thus far, everything has been really great.
My new coworkers are very friendly and helpful.

I’ve spent the better part of february, trying to get to grips with the SP network I will be focusing on from now on. Im still not where I want to be yet, but im getting there. One of the guys I will be working very closely with, started cleaning up the network when he was hired 9 months ago and he’s done a really great job with what he’s had to work with.

There are still some work to be done however, which is the very reason they have hired me and another very good friend of mine. A well run network is a dynamic beast which needs to be tamed. On top of that, the company growth has been around 30% a year, so alot of structure and processes needs to come with that growth, which is where I can really make a difference.

I’ve also had the good fortune of being selected as a 2016 Cisco Champion, which was a very nice surprise. I tried to squeeze in a few good technical posts last year, which I hope was useful to someone on the net. I’ve attended a few briefings so far, but they have mainly been about topics which I dont know enough about to offer any commentary on (UCS for example). Im hoping they are working on something in the routing realm as well 🙂

So for 2016, my primary objective is as previously stated, the CCDE certification.
Next month is Jeremy Filliben’s CCDE bootcamp, which I will be attending. I hope this will kick my butt into gear (I knew the change of jobs would hit my CCDE preparation). Im still aiming for a shot at the practical in late August.

Our Slack study group (Which Daniel Dib and I started) has grown quite a bit and includes a fair number of experts in different areas. If you are serious about CCDE or network design in general, dont be shy to mail me for an invite.

There are however, technologies which I also want to be familiar with to the level of blogging about them.

These include:

– Practical Segment Routing.
– Cisco’s iWAN solution.
– A deep knowledge of the ASR9K platform.
– Programmability (Python, API’s, etc.).
Now, back to work I go 🙂

/Kim

Doing right in the VAR role!

This post is my follow-up on a recent discussion on twitter.

Working for a VAR (Value Added Reseller) is not always the glamours life some make it out to be.

Working as a consultant, what you are really doing, is being the CEO of your own service company.
What you are selling, is basically your own services. The fact that your paycheck is being signed by someone else doesnt/shouldnt really matter.

The customer is building a relationship with you, as much as the company you are working for.
On top of that, you are continually building rapor in the networking world, so in my opinion, I would rather leave the customer with a good solution, rather than having to stick with the insane budgets that sales people end up shaving a project down to, just to get the contract.

So what can you do to create the outcome that is beneficial for all parties concerned (The customer, Your employer and yourself)?

Well, what I have tried in the past, is try and emphasize the importance of leaving the customer with the right solution based on his/her requirements and constraints. This discussion should involve both the technical side of things, as well as any account manager(s)/sales people involved. Try and focus on the long term results, such as customer satisfaction and reoccurring sales because of it.

Toward the customer, do your best to explain why solution X is better than Y, because of the requirements that are in place. Most people are sensible enough that, if you just take your time to explain the solution and have your reasoning in place, they will understand. Both of these (explaining and reasoning), is important for you to build the before mentioned rapor with the customer.

In the end, you should end up with a customer that will ask you for advice when in need, and trust your judgement when you recommend a solution. By doing this you effectively put the “fluff” from the account manager(s) aside and focus on the important work.

As engineers, we tend to focus on the technical side of a solution, but to be successful in our role(s), we also need to pay attention to the human/social aspect. Personally, this is an ongoing exercise, which I try to be very cognitive about when engaging with the customer.

So to summarize:

– Be a teamplayer, but know you are the one who has to face the customer regularly.
– Do your best to understand the customer and his/her requirements.
– Take your time to explain your solution to the customer.
– Never take the customer for granted.
– Pay attention to the social/human aspect when engaging with both the customer and your coworkers.

Now go and have a sit-down with that customer of yours!

/Kim

My first Cisco Live!

Even though im still in San Diego, Cisco Live! US 2015 is but a memory.

But what a memory it is! It being my first time attending a Cisco Live conference, I didn’t really know what to expect.

What I was met with, was a conference full of really sharp and nice people. The conference staff was very helpful and polite and really made an impression on me, from the time I first stepped onto the pavement outside San Diego convention center.

We (I brought my better half to the US) arrived very late on saturday, so after a good nights sleep I took the bus to the convention center to register and pick up the first piece of swag, the famous Cisco Live bag.

One of the great benefits of attending the conference was meeting with my good friend Daniel Dib (from lostintransit.se). I hadn’t seen him since January, so it was really cool to meet up with him during the week.

On Monday Daniel and I attended a session together, but most other sessions I went to alone. For the record, I paid for this trip out of my own pocket, so I didn’t have any co-workers or anything to tag along.

Tuesday was also spent in sessions, but in the evening there was the famous CCIE party, where my +1 was a friend of Daniel. Rihkka, a network engineer from Finland. She’s also in the VIP program along with Daniel. It was very nice meeting her. Also met with people from back home for morning coffee (which was actually Iced Coffee since it was fairly hot outside).

Wednesday I gave the CCDE written a shot, but unfortunately didn’t pass. Its a really weird exam if you ask me. Its supposed to be a technology design exam, but, I dont know. It just irks me somehow. Also, I had the good fortune of having a talk with Jeremy Filliben (http://www.jeremyfilliben.com), a renowned CCDE trainer. He gave some good advice along with information about his upcoming CCDE training programs. Very nice guy indeed. Since I lost track of Daniel for lunch, I was fortunate enough to run into Rihkka again. So we had lunch together where she could explain a bit more about Finland to me 🙂

Thursday, things slowed down a bit. I decided to cancel some sessions to get time to meet up with folks. During this day I also had a chance to meet with some folks from the Cisco Champion program, which was very good. Unfortunately a private QoS session for Cisco Champions was cancelled, which I was pretty bummed about. It would have been pretty awesome to meet Tim Szigeti. The QoS guy 🙂

Finally I said my goodbye’s to Daniel and headed back to the hotel. My first Cisco Live conference under my belt!

Would I recommend this as a good value for a network engineer? Absolutely!! the inspiration from the breakout sessions alone would be enough to justify it, but the social aspect of it all is what really makes the point!

I will do my very best to come back next year!

MPLS VPN’s over mGRE

This blog post outlines what “MPLS VPNs over mGRE” is all about as well as provide an example of such a configuration.

So what is “MPLS VPNs over mGRE”? – Well, basically its taking regular MPLS VPN’s and using it over an IP only core network. Since VPN’s over MPLS is one of the primary drivers for implementing an MPLS network in the first place, using the same functionality over an IP-only core might be very compelling for some not willing/able to run MPLS label switching in the core.

Instead of using labels to switch the traffic from one PE to another, mGRE (Multipoint GRE) is used as the encapsulation technology instead.

Be advised that 1 label is still being used however. This is the VPN label that’s used to identify which VRF interface to switch the traffic to when its received by a PE. This label is, just as in regular MPLS VPN’s, assigned by the PE through MP-BGP.

So how is this actually performed? – Well, lets take a look at an example.

The topology I will be using is as follows:

Topology for MPLS VPN's over mGRE

Topology for MPLS VPN’s over mGRE

** Note: I ran into an issue with VIRL, causing my CSR-3 to R3 to fail when establishing EIGRP adjacency. So i will not be using this in the examples to come. I noted this behavior on the VIRL community forums in case you are interested.

In this topology we have a core network, consisting of CSR-1 to CSR-5. They are all running OSPF in area 0. No MPLS is configured, so its pure IP routing end-to-end.

Lets take a look at CSR-5’s RIB:

CSR-5#sh ip route | beg Gateway
Gateway of last resort is not set

      1.0.0.0/32 is subnetted, 1 subnets
O        1.1.1.1 [110/2] via 192.168.15.1, 00:39:00, GigabitEthernet2
      2.0.0.0/32 is subnetted, 1 subnets
O        2.2.2.2 [110/2] via 192.168.25.2, 00:38:50, GigabitEthernet3
      3.0.0.0/32 is subnetted, 1 subnets
O        3.3.3.3 [110/2] via 192.168.35.3, 00:38:50, GigabitEthernet4
      4.0.0.0/32 is subnetted, 1 subnets
O        4.4.4.4 [110/2] via 192.168.45.4, 00:39:10, GigabitEthernet5
      5.0.0.0/32 is subnetted, 1 subnets
C        5.5.5.5 is directly connected, Loopback0
      192.168.15.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.168.15.0/24 is directly connected, GigabitEthernet2
L        192.168.15.5/32 is directly connected, GigabitEthernet2
      192.168.25.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.168.25.0/24 is directly connected, GigabitEthernet3
L        192.168.25.5/32 is directly connected, GigabitEthernet3
      192.168.35.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.168.35.0/24 is directly connected, GigabitEthernet4
L        192.168.35.5/32 is directly connected, GigabitEthernet4
      192.168.45.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.168.45.0/24 is directly connected, GigabitEthernet5
L        192.168.45.5/32 is directly connected, GigabitEthernet5

And to verify that we are not running any MPLS switching:

CSR-5#sh mpls for
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id     Switched      interface

So we have our connected interfaces along with the loopbacks of all the routers in the core network.

Lets take a look at CSR-1’s configuration, with regards to its VRF configuration in particular:

vrf definition CUST-A
 rd 100:100
 !
 address-family ipv4
  route-target export 100:100
  route-target import 100:100
 exit-address-family
!
!
interface GigabitEthernet2
 vrf forwarding CUST-A
 ip address 10.0.1.1 255.255.255.0
 negotiation auto
!
router eigrp 1
!
address-family ipv4 vrf CUST-A autonomous-system 100
  redistribute bgp 1 metric 1 1 1 1 1
  network 0.0.0.0
 exit-address-family

We have our VRF CUST-A configured, with a RD of 100:100 along with 100:100 as both import and export Route-Targets. Just as we would configure for a regular MPLS L3 VPN.

We use our GigabithEthernet2 interface as our attachment circuit to our CUST-A. In addition we have EIGRP 100 running as the VRF aware IGP towards R1. And finally we are redistributing BGP into the VRF.

Lets make sure we are receiving routes from R1 into the VRF RIB:

CSR-1#sh ip route vrf CUST-A eigrp | beg Gateway
Gateway of last resort is not set

      100.0.0.0/32 is subnetted, 3 subnets
D        100.100.100.1 [90/130816] via 10.0.1.100, 00:45:37, GigabitEthernet2

Looks good, we are receiving the loopback prefix from R1. This is as we would expect.

A similar configuration exists on CSR-2, CSR-3 and CSR-4. Nothing different from a regular MPLS L3 VPN service.

Now for the core configuration utilizing MP-BGP.
We are using CSR-5 as a VPN-v4 route-reflector in order to avoid having a full mesh of iBGP sessions.

So the configuration on R5 looks like this:

CSR-5#sh run | sec router bgp
router bgp 1
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 1.1.1.1 remote-as 1
 neighbor 1.1.1.1 update-source Loopback0
 neighbor 2.2.2.2 remote-as 1
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 3.3.3.3 remote-as 1
 neighbor 3.3.3.3 update-source Loopback0
 neighbor 4.4.4.4 remote-as 1
 neighbor 4.4.4.4 update-source Loopback0
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 1.1.1.1 activate
  neighbor 1.1.1.1 send-community extended
  neighbor 1.1.1.1 route-reflector-client
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 send-community extended
  neighbor 2.2.2.2 route-reflector-client
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 send-community extended
  neighbor 3.3.3.3 route-reflector-client
  neighbor 4.4.4.4 activate
  neighbor 4.4.4.4 send-community extended
  neighbor 4.4.4.4 route-reflector-client
 exit-address-family

Pretty straightforward really.

Then on CSR-1:

 CSR-1#sh run | sec router bgp
 router bgp 1
  bgp log-neighbor-changes
  no bgp default ipv4-unicast
  neighbor 5.5.5.5 remote-as 1
  neighbor 5.5.5.5 update-source Loopback0
  !
  address-family ipv4
  exit-address-family
  !
  address-family vpnv4
   neighbor 5.5.5.5 activate
   neighbor 5.5.5.5 send-community extended
   neighbor 5.5.5.5 route-map MODIFY-INBOUND in
  exit-address-family
  !
  address-family ipv4 vrf CUST-A
   redistribute eigrp 100
  exit-address-family

Here we have a single neighbor configured (R5 being the RR) using our loopback address. We are also redistributing routes from the VRF into BGP for VPNv4 announcements to the other PE’s. Whats really important (and differs from regular MPLS L3 VPN’s) is the route-map we apply inbound (MODIFY-INBOUND). Lets take a closer look at that:

CSR-1#sh route-map
route-map MODIFY-INBOUND, permit, sequence 10
  Match clauses:
  Set clauses:
    ip next-hop encapsulate l3vpn L3VPN-PROFILE
  Policy routing matches: 0 packets, 0 bytes

So all this does is set the next-hop according to a l3vpn profile called L3VPN-PROFILE. Now this is really the heart of the technology. Lets look at the profile in more detail:

CSR-1#sh run | beg L3VPN
l3vpn encapsulation ip L3VPN-PROFILE
 !

Well, that wasnt very informative. It simply defines a standard profile (which means mGRE) with our desired name.
You can get more detail by using the show commands:

CSR-1#sh l3vpn encapsulation ip 

 Profile: L3VPN-PROFILE
  transport ipv4 source Auto: Loopback0
  protocol gre
  payload mpls
   mtu default
  Tunnel Tunnel0 Created [OK]
  Tunnel Linestate [OK]
  Tunnel Transport Source (Auto) Loopback0 [OK]

So this tells us, that by default Loopback0 was chosen as the source of the tunnel and that Tunnel0 was created automatically. So lets take a look at the Tunnel0 in more detail:

 CSR-1#sh interface Tunnel0
 Tunnel0 is up, line protocol is up
   Hardware is Tunnel
   Interface is unnumbered. Using address of Loopback0 (1.1.1.1)
   MTU 9976 bytes, BW 10000 Kbit/sec, DLY 50000 usec,
      reliability 255/255, txload 1/255, rxload 1/255
   Encapsulation TUNNEL, loopback not set
   Keepalive not set
   Tunnel linestate evaluation up
   Tunnel source 1.1.1.1 (Loopback0)
    Tunnel Subblocks:
       src-track:
          Tunnel0 source tracking subblock associated with Loopback0
           Set of tunnels with source Loopback0, 1 member (includes iterators), on interface <OK>
   Tunnel protocol/transport multi-GRE/IP
     Key disabled, sequencing disabled
     Checksumming of packets disabled
   Tunnel TTL 255, Fast tunneling enabled
   Tunnel transport MTU 1476 bytes
   Tunnel transmit bandwidth 8000 (kbps)
   Tunnel receive bandwidth 8000 (kbps)
   Last input never, output never, output hang never
   Last clearing of "show interface" counters 00:54:16
   Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 3
   Queueing strategy: fifo
   Output queue: 0/0 (size/max)
   5 minute input rate 0 bits/sec, 0 packets/sec
   5 minute output rate 0 bits/sec, 0 packets/sec
      0 packets input, 0 bytes, 0 no buffer
      Received 0 broadcasts (0 IP multicasts)
      0 runts, 0 giants, 0 throttles
      0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
      0 packets output, 0 bytes, 0 underruns
      0 output errors, 0 collisions, 0 interface resets
      0 unknown protocol drops
      0 output buffer failures, 0 output buffers swapped out

Whats important here is that the Tunnel protocol/transport is multi-GRE/IP, which is the whole point of it all.

So to recap, when we receive prefixes reflected by our RR (this is besides the point, it could just as well be a full mesh), we set our IP Next-Hop to the other PE’s loopback address and tell the router to do the mGRE encapsulation when traffic is to be routed to these prefixes.

Lets take a look at our BGP table on CSR-1:

CSR-1#sh bgp vpnv4 uni vrf CUST-A | beg Network
     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:100 (default for vrf CUST-A)
 *>  10.0.1.0/24      0.0.0.0                  0         32768 ?
 *>i 10.0.2.0/24      2.2.2.2                  0    100      0 ?
 *>i 10.0.3.0/24      3.3.3.3                  0    100      0 ?
 *>i 10.0.4.0/24      4.4.4.4                  0    100      0 ?
 *>  100.100.100.1/32 10.0.1.100          130816         32768 ?
 *>i 100.100.100.2/32 2.2.2.2             130816    100      0 ?
 *>i 100.100.100.4/32 4.4.4.4             130816    100      0 ?

(**Note: Remember CSR-3 is broken because of VIRL)

Lets take a look at what information is present for 100.100.100.2/32:

 CSR-1#sh bgp vpnv4 uni vrf CUST-A 100.100.100.2/32
 BGP routing table entry for 100:100:100.100.100.2/32, version 19
 Paths: (1 available, best #1, table CUST-A)
   Not advertised to any peer
   Refresh Epoch 1
   Local
     2.2.2.2 (metric 3) (via default) (via Tunnel0) from 5.5.5.5 (5.5.5.5)
       Origin incomplete, metric 130816, localpref 100, valid, internal, best
       Extended Community: RT:100:100 Cost:pre-bestpath:128:130816
         0x8800:32768:0 0x8801:100:128256 0x8802:65281:2560 0x8803:65281:1500
         0x8806:0:1684300802
       Originator: 2.2.2.2, Cluster list: 5.5.5.5
       mpls labels in/out nolabel/17
       rx pathid: 0, tx pathid: 0x0

Important to note here is that we are being told to use label nr. 17 as the VPN label for this prefix when sending it to 2.2.2.2 (CSR-2).

And finally lets take a look at what CEF thinks about it all:

CSR-1#sh ip cef vrf CUST-A 100.100.100.2 detail
100.100.100.2/32, epoch 0, flags [rib defined all labels]
  nexthop 2.2.2.2 Tunnel0 label 17

So CEF will assign label 17 to the packet and then use Tunnel0 to reach CSR-2. Just as we would expect.

As a final verification ive done an Embedded Packet Capture on CSR-5 while doing a ping from R1’s loopback to R2’s loopback and this is what you can see here:

 6  142   29.028990   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800000 0000FF2F B5490101 01010202   ......./.I......
  0020:  02020000 88470001 11FE4500 00640000   .....G....E..d..
  0030:  0000FE01 2BCD6464 64016464 64020800   ....+.ddd.ddd...

   7  142   29.106989   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800001 0000FF2F B5480101 01010202   ......./.H......
  0020:  02020000 88470001 11FE4500 00640001   .....G....E..d..
  0030:  0000FE01 2BCC6464 64016464 64020800   ....+.ddd.ddd...

   8  142   29.184988   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800002 0000FF2F B5470101 01010202   ......./.G......
  0020:  02020000 88470001 11FE4500 00640002   .....G....E..d..
  0030:  0000FE01 2BCB6464 64016464 64020800   ....+.ddd.ddd...

   9  142   29.241037   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800003 0000FF2F B5460101 01010202   ......./.F......
  0020:  02020000 88470001 11FE4500 00640003   .....G....E..d..
  0030:  0000FE01 2BCA6464 64016464 64020800   ....+.ddd.ddd...

  10  142   29.287024   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800004 0000FF2F B5450101 01010202   ......./.E......
  0020:  02020000 88470001 11FE4500 00640004   .....G....E..d..
  0030:  0000FE01 2BC96464 64016464 64020800   ....+.ddd.ddd...

As you can see, the encapsulation is GRE, just as expected.

So thats all there is to this technology. Very useful if you have an IP-only core network.

I hope its been useful and i will soon attach all the configurations for the routers in case you want to take a closer look.

Thanks for reading!

Update: Link to configs here