Category Archives: Service Provider

A look at Auto-Tunnel Mesh Groups

In this post I would like to give a demonstration of using the Auto-Tunnel Mesh group feature.

As you may know, manual MPLS-TE tunnels are unidirectional. To connect two PE nodes, you have to configure a tunnel in each direction, with the local PE node being the headend.

Now imagine your network had 10 PE routers and you wanted a full mesh between them. Configuring that by hand becomes pretty burdensome and error-prone.
Thankfully there’s a method to avoid doing this manual configuration and instead rely on your IGP to signal its willingness to become part of a TE “Mesh”. That’s what the Auto-Tunnel Mesh Group feature is all about!
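To put a number on that burden: because every TE tunnel is unidirectional, a full mesh needs one tunnel per ordered pair of PE routers. Here is a minimal Python sketch of the arithmetic (the function name is purely illustrative and not part of any tool):

```python
def full_mesh_tunnels(pe_count: int) -> int:
    """Number of unidirectional TE tunnels needed for a full mesh of PE routers."""
    # Every PE is the headend of one tunnel towards each of the other PEs.
    return pe_count * (pe_count - 1)


for n in (3, 10):
    print(f"{n} PE routers -> {full_mesh_tunnels(n)} tunnels")
```

So the 3 PE routers in this lab need 6 tunnels, while 10 PE routers would already need 90.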

topology

In my small SP setup, I only have 3 PE devices, namely PE-1, PE-2 and PE-3. I also only have one P node, called P-1.
However small this setup is, it’s enough to demonstrate the power of the Auto-Tunnel mesh functionality.

Beyond that, I have set up a small MPLS L3 VPN service for customer CUST-A, which has a presence on all 3 PE nodes. The VPNv4 address-family is using a RR, which for this purpose is P-1.

We are running OSPF as the IGP of choice. This means that our Mesh membership will be signaled using Opaque LSAs, which I will show you later on.

The goal of the lab is to use the Auto-Tunnel mesh functionality to create a full mesh of tunnels between my PE nodes and to use these tunnels exclusively for label switching. This should be done with a general template that would scale to many more PE devices than just the 3 in this lab.

The very first thing you want to do is to enable MPLS-TE both globally and on your interfaces. We can verify this on PE-2:

PE-2:

mpls traffic-eng tunnels
!
interface GigabitEthernet2
ip address 10.2.100.2 255.255.255.0
negotiation auto
mpls traffic-eng tunnels
!

The second thing you want to do is to enable the mesh-feature globally using the following command as configured on PE-2 as well:

PE-2:

mpls traffic-eng auto-tunnel mesh

Starting off with MPLS-TE, we need to make sure our IGP is actually signaling this to begin with. I have configured MPLS-TE for area 0, which is the only area in use in our topology:

PE-2:

router ospf 1
network 0.0.0.0 255.255.255.255 area 0
mpls traffic-eng router-id Loopback0
mpls traffic-eng area 0
mpls traffic-eng mesh-group 100 Loopback0 area 0

Don’t get hung up on the last configuration line. I will explain this shortly. However, notice the “mpls traffic-eng area 0” and “mpls traffic-eng router-id Loopback0” lines. After those two lines are configured, you should be able to retrieve information on the MPLS-TE topology as seen from your IGP:

PE-2:

PE-2#sh mpls traffic-eng topology brief
My_System_id: 2.2.2.2 (ospf 1 area 0)

Signalling error holddown: 10 sec Global Link Generation 22

IGP Id: 1.1.1.1, MPLS TE Id:1.1.1.1 Router Node (ospf 1 area 0)
Area mg-id's:
: mg-id 100 1.1.1.1 :
link[0]: Broadcast, DR: 10.1.100.100, nbr_node_id:8, gen:14
frag_id: 2, Intf Address: 10.1.100.1
TE metric: 1, IGP metric: 1, attribute flags: 0x0
SRLGs: None

IGP Id: 2.2.2.2, MPLS TE Id:2.2.2.2 Router Node (ospf 1 area 0)
link[0]: Broadcast, DR: 10.2.100.100, nbr_node_id:9, gen:19
frag_id: 2, Intf Address: 10.2.100.2
TE metric: 1, IGP metric: 1, attribute flags: 0x0
SRLGs: None

IGP Id: 3.3.3.3, MPLS TE Id:3.3.3.3 Router Node (ospf 1 area 0)
Area mg-id's:
: mg-id 100 3.3.3.3 :
link[0]: Broadcast, DR: 10.3.100.100, nbr_node_id:11, gen:22
frag_id: 2, Intf Address: 10.3.100.3
TE metric: 1, IGP metric: 1, attribute flags: 0x0
SRLGs: None

IGP Id: 10.1.2.2, MPLS TE Id:22.22.22.22 Router Node (ospf 1 area 0)
link[0]: Broadcast, DR: 10.1.100.100, nbr_node_id:8, gen:17
frag_id: 3, Intf Address: 10.1.100.100
TE metric: 10, IGP metric: 10, attribute flags: 0x0
SRLGs: None

link[1]: Broadcast, DR: 10.2.100.100, nbr_node_id:9, gen:17
frag_id: 4, Intf Address: 10.2.100.100
TE metric: 10, IGP metric: 10, attribute flags: 0x0
SRLGs: None

link[2]: Broadcast, DR: 10.3.100.100, nbr_node_id:11, gen:17
frag_id: 5, Intf Address: 10.3.100.100
TE metric: 10, IGP metric: 10, attribute flags: 0x0
SRLGs: None

IGP Id: 10.1.100.100, Network Node (ospf 1 area 0)
link[0]: Broadcast, Nbr IGP Id: 10.1.2.2, nbr_node_id:5, gen:13

link[1]: Broadcast, Nbr IGP Id: 1.1.1.1, nbr_node_id:6, gen:13

IGP Id: 10.2.100.100, Network Node (ospf 1 area 0)
link[0]: Broadcast, Nbr IGP Id: 10.1.2.2, nbr_node_id:5, gen:18

link[1]: Broadcast, Nbr IGP Id: 2.2.2.2, nbr_node_id:1, gen:18

IGP Id: 10.3.100.100, Network Node (ospf 1 area 0)
link[0]: Broadcast, Nbr IGP Id: 10.1.2.2, nbr_node_id:5, gen:21

link[1]: Broadcast, Nbr IGP Id: 3.3.3.3, nbr_node_id:7, gen:21

The important thing to notice here is that we are indeed seeing the other routers in the network, all the PE devices as well as the P device.

Now to the last line of configuration under the router ospf process:

PE-2:

"mpls traffic-eng mesh-group 100 Loopback0 area 0"

What this states is that we would like to use the Auto-Tunnel Mesh group feature, with this PE node being a member of group 100, using loopback0 for communication on the tunnel and running within the area 0.

This by itself only handles the signaling, but we also want to deploy a template in order to create the individual tunnel interfaces. This is done in the following manner:

PE-2:

interface Auto-Template100
ip unnumbered Loopback0
tunnel mode mpls traffic-eng
tunnel destination mesh-group 100
tunnel mpls traffic-eng autoroute announce
tunnel mpls traffic-eng path-option 10 dynamic

Using the Auto-Template100 interface, we specify our loopback address, the tunnel mode and the path option, just as we would in manual TE. Note that here we are simply following the IGP, which sort of defeats the purpose of many MPLS-TE configurations. But with our topology there is no path diversity, so it wouldn’t matter anyway.

Also, the autoroute announce command is used to force traffic into the tunnels.

The important thing is the “tunnel destination mesh-group 100” which ties this configuration snippet into the OSPF one.

After everything is set up, you should see some dynamic tunnels being created on each PE node:

PE-2:

PE-2#sh ip int b | incl up
GigabitEthernet1 100.100.101.100 YES manual up up
GigabitEthernet2 10.2.100.2 YES manual up up
Auto-Template100 2.2.2.2 YES TFTP up up
Loopback0 2.2.2.2 YES manual up up
Tunnel64336 2.2.2.2 YES TFTP up up
Tunnel64337 2.2.2.2 YES TFTP up up

Let’s verify the current state of the RIB after this step:

PE-2:

PE-2#sh ip route | beg Gateway
Gateway of last resort is not set

1.0.0.0/32 is subnetted, 1 subnets
O 1.1.1.1 [110/12] via 1.1.1.1, 00:29:13, Tunnel64336
2.0.0.0/32 is subnetted, 1 subnets
C 2.2.2.2 is directly connected, Loopback0
3.0.0.0/32 is subnetted, 1 subnets
O 3.3.3.3 [110/12] via 3.3.3.3, 00:28:48, Tunnel64337
10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
O 10.1.100.0/24 [110/11] via 10.2.100.100, 00:29:13, GigabitEthernet2
C 10.2.100.0/24 is directly connected, GigabitEthernet2
L 10.2.100.2/32 is directly connected, GigabitEthernet2
O 10.3.100.0/24 [110/11] via 10.2.100.100, 00:29:13, GigabitEthernet2
22.0.0.0/32 is subnetted, 1 subnets
O 22.22.22.22 [110/2] via 10.2.100.100, 00:29:13, GigabitEthernet2

Very good. We can see that in order to reach 1.1.1.1/32 which is PE-1’s loopback, we are indeed routing through one of the dynamic tunnels.
The same goes for 3.3.3.3/32 towards PE-3’s loopback.

PE-2:

PE-2#traceroute 1.1.1.1 so loo0
Type escape sequence to abort.
Tracing the route to 1.1.1.1
VRF info: (vrf in name/id, vrf out name/id)
1 10.2.100.100 [MPLS: Label 17 Exp 0] 16 msec 22 msec 22 msec
2 10.1.100.1 25 msec * 19 msec

We can see that traffic towards that loopback is indeed being label-switched. And just to make it obvious, let me make sure we are not using LDP 🙂

PE-2:

PE-2#sh mpls ldp neighbor
PE-2#

On P-1, it being the midpoint of our LSPs, we would expect 6 unidirectional tunnels in total:

P-1:

P-1#sh mpls for
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
16 Pop Label 3.3.3.3 64336 [6853] \
472 Et2/0 10.1.100.1
17 Pop Label 2.2.2.2 64336 [2231] \
2880 Et2/0 10.1.100.1
18 Pop Label 1.1.1.1 64336 [4312] \
2924 Et2/1 10.2.100.2
19 Pop Label 1.1.1.1 64337 [4962] \
472 Et2/2 10.3.100.3
20 Pop Label 2.2.2.2 64337 [6013] \
562 Et2/2 10.3.100.3
21 Pop Label 3.3.3.3 64337 [4815] \
0 Et2/1 10.2.100.2

Exactly what we expected.

The following is the output of the command “show ip ospf database opaque-area” on PE-2. I have cut it down to the relevant opaque LSA part (we are using 2 types, one for the general MPLS-TE and one for the Mesh-Group feature):

LS age: 529
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 4.0.0.0
Opaque Type: 4
Opaque ID: 0
Advertising Router: 1.1.1.1
LS Seq Number: 80000002
Checksum: 0x5364
Length: 32

Capability Type: Mesh-group
Length: 8
Value:

0000 0064 0101 0101

LS age: 734
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 4.0.0.0
Opaque Type: 4
Opaque ID: 0
Advertising Router: 2.2.2.2
LS Seq Number: 80000002
Checksum: 0x6748
Length: 32

Capability Type: Mesh-group
Length: 8
Value:

0000 0064 0202 0202

LS age: 701
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 4.0.0.0
Opaque Type: 4
Opaque ID: 0
Advertising Router: 3.3.3.3
LS Seq Number: 80000002
Checksum: 0x7B2C
Length: 32

Capability Type: Mesh-group
Length: 8
Value:

0000 0064 0303 0303

The interesting parts are the Advertising Router and the value of the TLV. Each value starts with 0000 0064 (hex 0x64 is decimal 100), which is in fact the membership of group “100” being signaled across the IGP area.
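To make the TLV value a little less cryptic, here is a small Python sketch that decodes the three values from the output above. The layout it assumes (a 32-bit group number followed by a 32-bit IPv4 address) is only inferred from what the output shows, and the function name is my own invention for illustration:

```python
import ipaddress
import struct


def decode_mesh_group_value(hex_dump: str) -> tuple[int, str]:
    """Split an 8-byte mesh-group value into (group number, IPv4 address)."""
    raw = bytes.fromhex(hex_dump.replace(" ", ""))
    # Assumed layout: 32-bit group number followed by a 32-bit IPv4 address,
    # both in network byte order.
    group, address = struct.unpack("!II", raw)
    return group, str(ipaddress.IPv4Address(address))


for dump in ("0000 0064 0101 0101", "0000 0064 0202 0202", "0000 0064 0303 0303"):
    print(dump, "->", decode_mesh_group_value(dump))
```

Under that assumption each PE advertises group 100 together with its own loopback address, which lines up with the Advertising Router field in each LSA.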
Okay, all good I hear you say, but let’s do an end-to-end test from the CE devices in Customer CUST-A’s domain:

R1:

R1#sh ip route | beg Gateway
Gateway of last resort is not set

10.0.0.0/32 is subnetted, 3 subnets
C 10.1.1.1 is directly connected, Loopback0
B 10.2.2.2 [20/0] via 100.100.100.100, 00:37:46
B 10.3.3.3 [20/0] via 100.100.100.100, 00:37:36
100.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
C 100.100.100.0/24 is directly connected, FastEthernet0/0
L 100.100.100.1/32 is directly connected, FastEthernet0/0

So we are learning the routes on the customer side (through standard IPv4 BGP).

R1:

R1#ping 10.2.2.2 so loo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.2.2.2, timeout is 2 seconds:
Packet sent with a source address of 10.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/72/176 ms
R1#ping 10.3.3.3 so loo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.3.3.3, timeout is 2 seconds:
Packet sent with a source address of 10.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/32/48 ms

We have reachability! – What about traceroute:

R1:

R1#traceroute 10.2.2.2 so loo0

Type escape sequence to abort.
Tracing the route to 10.2.2.2

1 100.100.100.100 28 msec 20 msec 12 msec
2 10.1.100.100 [MPLS: Labels 18/17 Exp 0] 44 msec 136 msec 60 msec
3 100.100.101.100 [MPLS: Label 17 Exp 0] 28 msec 32 msec 12 msec
4 100.100.101.3 28 msec 32 msec 24 msec
R1#traceroute 10.3.3.3 so loo0

Type escape sequence to abort.
Tracing the route to 10.3.3.3

1 100.100.100.100 48 msec 16 msec 8 msec
2 10.1.100.100 [MPLS: Labels 19/17 Exp 0] 48 msec 12 msec 52 msec
3 100.100.102.100 [MPLS: Label 17 Exp 0] 16 msec 28 msec 36 msec
4 100.100.102.4 68 msec 56 msec 48 msec

Just what we would expect from our L3 MPLS VPN service. A transport label (this time through MPLS-TE) and a VPN label as signaled through MP-BGP.

To round it off, I captured packets on P-1’s interface toward PE-1 while re-issuing the ICMP echo from R1’s loopback toward R2’s loopback address. The capture is attached here:

wireshark-output

With that, I hope it’s been informative for you. Thanks for reading!

References:

http://www.cisco.com/c/en/us/td/docs/ios/12_0s/feature/guide/gsmeshgr.html

Configurations:

configurations

Unified/Seamless MPLS

In this post I would like to highlight a relatively new (to me) application of MPLS called Unified MPLS.
The goal of Unified MPLS is to separate your network into individual IGP segments in order to keep your core network as simple as possible while still maintaining an end-to-end LSP for regular MPLS applications such as L3 VPNs.

What we are doing is simply putting Route Reflectors into the forwarding path and changing the next hops along the way, essentially stitching together the final LSP.
Along with that, we are using BGP to signal a label value to maintain the LSP from one end of the network to the other without the use of LDP between IGPs.

Take a look at the topology that we will be using to demonstrate this feature:

Unified-MPLS-Topology

In this topology we have a simplified layout of a service provider. We have a core network consisting of R3, R4 and R5 along with distribution networks on the right and left of the core. R2 and R3 are in the left distribution network and R5 and R6 are in the right-hand one.

We have an MPLS L3VPN customer connected consisting of R1 in one site and R7 in another.

As is visible in the topology, we are running 3 separate IGPs to make a point about this feature: EIGRP AS 1, OSPF 100 and EIGRP AS 2. However, we are only running one autonomous system as seen from BGP, so it’s a pure iBGP network.

Now in order to make the L3VPN work, we need to have an end-to-end LSP going from R2 all the way to R6.
What is key here is that we have self-contained IGP domains, each of which is running LDP for labels. Between the domains, all we are doing is leaking a couple of loopback addresses into the distribution sections from the core. These are used exclusively for the iBGP sessions.

On top of that, R3 and R5 need to be route-reflectors that sit in the data path and allocate labels. This is done through the “send-label” command along with modifying the next hop (the “next-hop-self all” command).

This is illustrated in the following:

Unified-MPLS-iBGP-Topology

Enough theory, let’s take a look at the configuration necessary to pull this off. Let’s start out with R2’s IGP and LDP configuration:

R2#sh run | sec router eigrp
router eigrp 1
 network 2.0.0.0
 network 10.0.0.0
 passive-interface default
 no passive-interface GigabitEthernet3

R2#sh run int g3
interface GigabitEthernet3
 ip address 10.2.3.2 255.255.255.0
 negotiation auto
 mpls ip
end

Pretty vanilla configuration of IGP + LDP.

The same for R3:

R3#sh run | sec router eigrp 1
router eigrp 1
 network 10.0.0.0
 redistribute ospf 100 metric 1 1 1 1 1 route-map REDIST-LOOPBACK-MAP
 passive-interface default
 no passive-interface GigabitEthernet2

R3#sh run int g2
interface GigabitEthernet2
 ip address 10.2.3.3 255.255.255.0
 negotiation auto
 mpls ip
end

R3#sh route-map REDIST-LOOPBACK-MAP
route-map REDIST-LOOPBACK-MAP, permit, sequence 10
  Match clauses:
    ip address prefix-lists: REDIST-LOOPBACK-PREFIX-LIST
  Set clauses:
  Policy routing matches: 0 packets, 0 bytes

R3#sh ip prefix-list
ip prefix-list REDIST-LOOPBACK-PREFIX-LIST: 1 entries
   seq 5 permit 3.3.3.3/32

Apart from the redistribution part, it’s simply establishing an EIGRP adjacency with R2. On top of that we are redistributing R3’s loopback0 interface, which is in the Core area, into EIGRP. Again, this step is necessary for the iBGP session establishment.

An almost identical setup is present in the other distribution site, consisting of R5 and R6. Again we redistribute R5’s loopback0 address into the IGP (EIGRP AS 2), so we can have iBGP connectivity, which is our next step.

So let’s take a look at the BGP configuration on R2 all the way to R6. I’m leaving out the VPNv4 configuration for now, in order to make it more visible what we are trying to accomplish first:

R2:
---
router bgp 1000
 bgp router-id 2.2.2.2
 bgp log-neighbor-changes
 neighbor 3.3.3.3 remote-as 1000
 neighbor 3.3.3.3 update-source Loopback0
 !
 address-family ipv4
  network 2.2.2.2 mask 255.255.255.255
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 send-label

R3:
---
router bgp 1000
 bgp router-id 3.3.3.3
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 1000
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 2.2.2.2 route-reflector-client
 neighbor 2.2.2.2 next-hop-self all
 neighbor 2.2.2.2 send-label
 neighbor 5.5.5.5 remote-as 1000
 neighbor 5.5.5.5 update-source Loopback0
 neighbor 5.5.5.5 route-reflector-client
 neighbor 5.5.5.5 next-hop-self all
 neighbor 5.5.5.5 send-label

R5:
---
router bgp 1000
 bgp router-id 5.5.5.5
 bgp log-neighbor-changes
 neighbor 3.3.3.3 remote-as 1000
 neighbor 3.3.3.3 update-source Loopback0
 neighbor 3.3.3.3 route-reflector-client
 neighbor 3.3.3.3 next-hop-self all
 neighbor 3.3.3.3 send-label
 neighbor 6.6.6.6 remote-as 1000
 neighbor 6.6.6.6 update-source Loopback0
 neighbor 6.6.6.6 route-reflector-client
 neighbor 6.6.6.6 next-hop-self all
 neighbor 6.6.6.6 send-label

R6:
---
router bgp 1000
 bgp router-id 6.6.6.6
 bgp log-neighbor-changes
 neighbor 5.5.5.5 remote-as 1000
 neighbor 5.5.5.5 update-source Loopback0
 !
 address-family ipv4
  network 6.6.6.6 mask 255.255.255.255
  neighbor 5.5.5.5 activate
  neighbor 5.5.5.5 send-label

As is visible from the configuration, we have 2 IPv4 route-reflectors (R3 and R5), both of which put themselves into the datapath by using the next-hop-self command. On top of that we are allocating labels for all prefixes via BGP as well. Let’s verify this on all four routers:

R2#sh bgp ipv4 uni la
   Network          Next Hop      In label/Out label
   2.2.2.2/32       0.0.0.0         imp-null/nolabel
   6.6.6.6/32       3.3.3.3         nolabel/305

R3#sh bgp ipv4 uni la
   Network          Next Hop      In label/Out label
   2.2.2.2/32       2.2.2.2         300/imp-null
   6.6.6.6/32       5.5.5.5         305/500

R5#sh bgp ipv4 uni la
   Network          Next Hop      In label/Out label
   2.2.2.2/32       3.3.3.3         505/300
   6.6.6.6/32       6.6.6.6         500/imp-null

R6#sh bgp ipv4 uni la
   Network          Next Hop      In label/Out label
   2.2.2.2/32       5.5.5.5         nolabel/505
   6.6.6.6/32       0.0.0.0         imp-null/nolabel

Since we are only injecting 2 prefixes (loopbacks of R2 and R6) into BGP, that’s all we have allocated labels for.
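To see how these tables stitch the LSP together, here is a minimal Python sketch that follows the labels for 6.6.6.6/32 from R2 towards R6. The values are copied from the “sh bgp ipv4 uni la” output above; the dictionary layout and the function name are just my own illustration:

```python
# Labels copied from the "sh bgp ipv4 uni la" output above for 6.6.6.6/32:
# router -> (next hop, in label, out label)
LABELS_FOR_R6 = {
    "R2": ("3.3.3.3", None, 305),
    "R3": ("5.5.5.5", 305, 500),
    "R5": ("6.6.6.6", 500, "imp-null"),
}
# Loopback address -> router name, as used in this lab.
LOOPBACKS = {"3.3.3.3": "R3", "5.5.5.5": "R5", "6.6.6.6": "R6"}


def walk_lsp(start: str) -> list[str]:
    """Follow the BGP labels hop by hop and check that they line up."""
    hops = []
    router = start
    while router in LABELS_FOR_R6:
        next_hop, _in_label, out_label = LABELS_FOR_R6[router]
        downstream = LOOPBACKS[next_hop]
        if downstream in LABELS_FOR_R6:
            # The label we send must be the one the next hop allocated locally.
            assert LABELS_FOR_R6[downstream][1] == out_label
        hops.append(f"{router} -> {downstream} label {out_label}")
        router = downstream
    return hops


for hop in walk_lsp("R2"):
    print(hop)
```

The out label on each router matches the in label on its BGP next hop, which is what you get when every route-reflector sets itself as next hop and allocates its own label.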

Doing a traceroute from R2 to R6 (between loopbacks), will reveal if we truly have an LSP between them:

R2#traceroute 6.6.6.6 so loo0
Type escape sequence to abort.
Tracing the route to 6.6.6.6
VRF info: (vrf in name/id, vrf out name/id)
  1 10.2.3.3 [MPLS: Label 305 Exp 0] 26 msec 15 msec 18 msec
  2 10.3.4.4 [MPLS: Labels 401/500 Exp 0] 10 msec 24 msec 34 msec
  3 10.4.5.5 [MPLS: Label 500 Exp 0] 7 msec 23 msec 24 msec
  4 10.5.6.6 20 msec *  16 msec

This looks exactly like we wanted it to. (Note that the 401 label is on a pure P router in the core.)
This also means we can set up our VPNv4 configuration on R2 and R6:

R2#sh run | sec router bgp
router bgp 1000
 bgp router-id 2.2.2.2
 bgp log-neighbor-changes
 neighbor 3.3.3.3 remote-as 1000
 neighbor 3.3.3.3 update-source Loopback0
 neighbor 6.6.6.6 remote-as 1000
 neighbor 6.6.6.6 update-source Loopback0
 !
 address-family ipv4
  network 2.2.2.2 mask 255.255.255.255
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 send-label
  no neighbor 6.6.6.6 activate
 exit-address-family
 !
 address-family vpnv4
  neighbor 6.6.6.6 activate
  neighbor 6.6.6.6 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf CUSTOMER-A
  redistribute connected
  redistribute static
 exit-address-family
R2#

R6#sh run | sec router bgp
router bgp 1000
 bgp router-id 6.6.6.6
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 1000
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 5.5.5.5 remote-as 1000
 neighbor 5.5.5.5 update-source Loopback0
 !
 address-family ipv4
  network 6.6.6.6 mask 255.255.255.255
  no neighbor 2.2.2.2 activate
  neighbor 5.5.5.5 activate
  neighbor 5.5.5.5 send-label
 exit-address-family
 !
 address-family vpnv4
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf CUSTOMER-A
  redistribute connected
  redistribute static
 exit-address-family

Let’s verify that the iBGP VPNv4 peering is up and running:

R2#sh bgp vpnv4 uni all sum
..
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
6.6.6.6         4         1000      16      16       11    0    0 00:09:31        2

R6#sh bgp vpnv4 uni all sum
..
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
2.2.2.2         4         1000      17      17       11    0    0 00:10:26        2

We do have the prefixes and we should also have reachability from R1 to R7 (by way of their individual static default routes):

R1#ping 7.7.7.7 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 7.7.7.7, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 17/27/54 ms

Looks good. Let’s check the label path:

R1#traceroute 7.7.7.7 so loo0
Type escape sequence to abort.
Tracing the route to 7.7.7.7
VRF info: (vrf in name/id, vrf out name/id)
  1 10.1.2.2 19 msec 13 msec 12 msec
  2 10.2.3.3 [MPLS: Labels 305/600 Exp 0] 18 msec 19 msec 15 msec
  3 10.3.4.4 [MPLS: Labels 401/500/600 Exp 0] 12 msec 32 msec 34 msec
  4 10.4.5.5 [MPLS: Labels 500/600 Exp 0] 20 msec 27 msec 27 msec
  5 10.6.7.6 [MPLS: Label 600 Exp 0] 23 msec 15 msec 13 msec
  6 10.6.7.7 25 msec *  16 msec

What we are seeing here is basically the same path, but with the “VPN” label (label 600) added at the bottom of the label stack.
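To make the label operations behind that traceroute explicit, here is a rough Python sketch that replays them. The per-hop operations are my reading of the traceroute output (for example, that R4 removes label 401 as the penultimate hop towards R5), so treat them as assumptions rather than something taken from the routers’ forwarding tables:

```python
# Per-hop label operations, reconstructed from the traceroute above.
# Each entry: (router, operation, argument). The stack top is the first element.
OPERATIONS = [
    ("R2", "push", [305, 600]),  # BGP label for 6.6.6.6 on top of VPN label 600
    ("R3", "swap", 500),         # BGP label swap 305 -> 500 ...
    ("R3", "push", [401]),       # ... plus a transport label towards R5
    ("R4", "pop", None),         # assumed penultimate hop for R5, removes 401
    ("R5", "pop", None),         # implicit-null from R6, removes 500
]


def apply_operation(stack: list[int], operation: str, argument) -> list[int]:
    """Return the label stack after one push, swap or pop."""
    if operation == "push":
        return list(argument) + stack
    if operation == "swap":
        return [argument] + stack[1:]
    if operation == "pop":
        return stack[1:]
    raise ValueError(f"unknown operation: {operation}")


stack: list[int] = []
for router, operation, argument in OPERATIONS:
    stack = apply_operation(stack, operation, argument)
    print(f"after {router} {operation}: {stack}")
```

The intermediate stacks correspond to what each hop reports in the traceroute, and only the VPN label 600 is left when the packet reaches R6.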

So what have we really accomplished here? Well, let’s take a look at the RIB on R2 and look for the IGP (EIGRP AS 1) routes:

R2#sh ip route eigrp
..
      3.0.0.0/32 is subnetted, 1 subnets
D EX     3.3.3.3 [170/2560000512] via 10.2.3.3, 00:16:02, GigabitEthernet3
      10.0.0.0/8 is variably subnetted, 3 subnets, 2 masks
D        10.3.4.0/24 [90/3072] via 10.2.3.3, 00:16:02, GigabitEthernet3

A very small table indeed. And if we include what’s being learned by BGP:

R2#sh ip route bgp
..
      6.0.0.0/32 is subnetted, 1 subnets
B        6.6.6.6 [200/0] via 3.3.3.3, 00:17:02

R2#sh ip route 6.6.6.6
Routing entry for 6.6.6.6/32
  Known via "bgp 1000", distance 200, metric 0, type internal
  Last update from 3.3.3.3 00:17:43 ago
  Routing Descriptor Blocks:
  * 3.3.3.3, from 3.3.3.3, 00:17:43 ago
      Route metric is 0, traffic share count is 1
      AS Hops 0
      MPLS label: 305

Only 1 prefix to communicate with the remote distribution site’s PE router (which we need the label for).

This means you can scale your distribution sites to very large sizes, keep your core as efficient as possible and eliminate using areas and whatnot in your IGPs.

I hope this quick walkthrough of unified/seamless MPLS has been useful.

Trying out IPv6 Prefix Delegation

In this post I will show how and why to use a feature called IPv6 Prefix Delegation (PD).

IPv6 prefix delegation is a feature that provides the capability to delegate or hand out IPv6 prefixes to other routers without the need to hardcode these prefixes into the routers.

Why would you want to do this? Well, for one, there is the administrative overhead associated with manual configuration. If the end-customer only cares about the number of prefixes he or she receives, then they might as well be handed out automatically from a preconfigured pool, just like DHCP works today on end-user systems.

On top of that, by configuring a redistribution into BGP just once, you will automatically have reachability to the prefixes that have been handed out, from the rest of your SP network.

So how do you go about configuring this? Well, let’s take a look at the topology we’ll be using to demonstrate IPv6 Prefix Delegation.

PD-Post-Topology

First off, we have the SP core network which consists of R1, R2 and R3. They are running in AS 64512 with R1 being a BGP route-reflector for the IPv6 unicast address-family. As an IGP we are running OSPFv3 to provide reachability within the core. No IPv4 is configured on any device.

The SP has been allocated a /32 IPv6 prefix which is 2001:1111::/32, from which it will “carve” out IPv6 prefixes to both its internal network as well as customer networks.

We are using /125 for the links between the core routers, just to make it simple when looking at the routing tables and the topology.

R2 is really where all the magic is taking place. R2 is a PE for two customers, Customer A and Customer B. Customer A is being reached through Gigabit2 and Customer B through Gigabit3. The customers’ respective CE routers are R4 and R7.

There is a link-net between R2 and R4 as well as R2 and R7. These are respectively 2001:1111:101::/64 and 2001:1111:102::/64.

So Lab-ISP has decided to use a /48 network from which to hand out prefixes to its customers. This /48 is 2001:1111:2222::/48. Lab-ISP also decided to hand out /56 prefixes, which gives the customers 8 bits (from 56 to 64) to use for subnetting. This is a typical deployment.

Also, since we are using a /48 as the block to “carve” out from, this gives us 8 bits (from 48 to 56) of assignable subnets, which of course equals 256 /56 prefixes we can hand out.

All of this can be a bit confusing, so let’s look at it from a different perspective.

We start out with 2001:1111:2222::/48. We then want to look at what the first /56 looks like:

The 2001:1111:2222:0000::/56 is
2001:1111:2222:0000::
until
2001:1111:2222:00FF::

That last byte (remember this is all in hex) is what gives the customer 256 subnets to play around with.

The next /56 is:
2001:1111:2222:0100::/56

2001:1111:2222:0100::
until
2001:1111:2222:01FF::

We can do this all in all 256 times as mentioned earlier.

So in summary, with two customers, each receiving a /56 prefix, we would expect to see the bindings show up on R2 as:

2001:1111:2222::/56
2001:1111:2222:100::/56
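If you want to double-check this carving without doing the hex arithmetic by hand, Python’s standard ipaddress module can do it for you. This is only a quick sketch using the prefixes from this lab; the variable names are my own:

```python
import ipaddress

block = ipaddress.ip_network("2001:1111:2222::/48")

# Every /56 that can be delegated out of the /48 block.
delegations = list(block.subnets(new_prefix=56))
print(len(delegations))                # number of /56 prefixes available
print(delegations[0], delegations[1])  # the first two delegations

# Each customer can in turn split a /56 into /64 subnets.
customer_subnets = list(delegations[0].subnets(new_prefix=64))
print(len(customer_subnets), customer_subnets[-1])
```

The first two /56 prefixes are the ones we expect to see as bindings on R2, and each of them holds 256 /64 subnets for the customer.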

So with all this theory in place, let’s take a look at the configuration that makes all this work out.

First off we start out with creating a local IPv6 pool on R2:

ipv6 local pool IPv6-Local-Pool 2001:1111:2222::/48 56

This is in accordance with the requirements we have stated earlier.

Next up, we tie this local pool into a DHCPv6 pool used specifically for Prefix Delegation:

ipv6 dhcp pool PD-DHCP-POOL
 prefix-delegation pool IPv6-Local-Pool

Finally we attach the IPv6 DHCP pool to the interfaces of Customer A and Customer B:

R2#sh run int g2
Building configuration...

Current configuration : 132 bytes
!
interface GigabitEthernet2
 no ip address
 negotiation auto
 ipv6 address 2001:1111:101::2/64
 ipv6 dhcp server PD-DHCP-POOL
end

R2#sh run int g3
Building configuration...

Current configuration : 132 bytes
!
interface GigabitEthernet3
 no ip address
 negotiation auto
 ipv6 address 2001:1111:102::2/64
 ipv6 dhcp server PD-DHCP-POOL
end

That’s pretty much all that’s required from the SP point of view in order to hand out the prefixes.

Now, let’s take a look at what’s required on the CE routers.

Starting off with R4’s interface to the SP:

R4#sh run int g2
Building configuration...

Current configuration : 156 bytes
!
interface GigabitEthernet2
 no ip address
 negotiation auto
 ipv6 address 2001:1111:101::3/64
 ipv6 address autoconfig
 ipv6 dhcp client pd LOCAL-CE
end

Note that the “LOCAL-CE” is a local label we will use for the next step. It can be anything you desire.

Only when the “inside” interfaces request an IPv6 address will a request be sent to the SP for it to hand something out. This is done on R4’s g1.405 and g1.406 interfaces:

R4#sh run int g1.405
Building configuration...

Current configuration : 126 bytes
!
interface GigabitEthernet1.405
 encapsulation dot1Q 405
 ipv6 address LOCAL-CE ::1:0:0:0:1/64
 ipv6 address autoconfig
end

R4#sh run int g1.406
Building configuration...

Current configuration : 126 bytes
!
interface GigabitEthernet1.406
 encapsulation dot1Q 406
 ipv6 address LOCAL-CE ::2:0:0:0:1/64
 ipv6 address autoconfig
end

Here we reference the previous local label “LOCAL-CE”. Most interesting is the fact that we are now subnetting the /56 prefix we have received by doing the “::1:0:0:0:1/64” and “::2:0:0:0:1/64” respectively.

What this does is combine the configured address bits with the prefix that’s being given out. To repeat, for Customer A the delegated prefix is 2001:1111:2222::/56, which results in a final address of 2001:1111:2222:1:0:0:0:1/64 for interface g1.405 and 2001:1111:2222:2:0:0:0:1/64 for g1.406.
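The way the delegated prefix and the configured suffix fit together can be sketched in a few lines of Python. This only mimics the bit arithmetic and is not how IOS implements it; the function name is hypothetical:

```python
import ipaddress


def apply_delegated_prefix(delegated: str, suffix: str) -> ipaddress.IPv6Interface:
    """Combine a delegated prefix with the suffix configured on an interface."""
    prefix = ipaddress.ip_network(delegated)
    host = ipaddress.ip_interface(suffix)
    # The suffix has zero bits where the delegated prefix lives, so a bitwise
    # OR fills them in without touching the subnet and host bits.
    combined = int(prefix.network_address) | int(host.ip)
    address = ipaddress.IPv6Address(combined)
    return ipaddress.IPv6Interface(f"{address}/{host.network.prefixlen}")


print(apply_delegated_prefix("2001:1111:2222::/56", "::1:0:0:0:1/64"))
print(apply_delegated_prefix("2001:1111:2222::/56", "::2:0:0:0:1/64"))
print(apply_delegated_prefix("2001:1111:2222:100::/56", "::1:0:0:0:7/64"))
```

The first two lines correspond to R4’s g1.405 and g1.406. The third uses the second /56 with a suffix of ::1:0:0:0:7, where the subnet bits land in the same 16-bit group as the tail of the delegated prefix (0x0100 combined with 0x0001 gives 0x0101).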

Let’s turn our attention to Customer B on R7.

The same thing has been configured, just using a different “label” for the assigned pool to show that it’s arbitrary:

R7#sh run int g3
Building configuration...

Current configuration : 155 bytes
!
interface GigabitEthernet3
 no ip address
 negotiation auto
 ipv6 address 2001:1111:102::7/64
 ipv6 address autoconfig
 ipv6 dhcp client pd CE-POOL
end

And the inside interface g1.100:

R7#sh run int g1.100
Building configuration...

Current configuration : 100 bytes
!
interface GigabitEthernet1.100
 encapsulation dot1Q 100
 ipv6 address CE-POOL ::1:0:0:0:7/64
end

Again, we are subnetting the received /56 into a /64 and applying it on the inside interface.

Going back to the SP point of view, let’s verify that we are handing out some prefixes:

R2#sh ipv6 local pool
Pool                  Prefix                                       Free  In use
IPv6-Local-Pool       2001:1111:2222::/48                            254      2

We can see that our local pool has handed out 2 prefixes and if we dig further down into the bindings:

R2#sh ipv6 dhcp binding
Client: FE80::250:56FF:FEBE:93CC
  DUID: 00030001001EF6767600
  Username : unassigned
  VRF : default
  Interface : GigabitEthernet3
  IA PD: IA ID 0x00080001, T1 302400, T2 483840
    Prefix: 2001:1111:2222:100::/56
            preferred lifetime 604800, valid lifetime 2592000
            expires at Oct 16 2014 03:11 PM (2416581 seconds)
Client: FE80::250:56FF:FEBE:4754
  DUID: 00030001001EE5DF8700
  Username : unassigned
  VRF : default
  Interface : GigabitEthernet2
  IA PD: IA ID 0x00070001, T1 302400, T2 483840
    Prefix: 2001:1111:2222::/56
            preferred lifetime 604800, valid lifetime 2592000
            expires at Oct 16 2014 03:11 PM (2416575 seconds)
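As a side note, the T1 and T2 timers in the bindings above match the ratios commonly recommended for DHCPv6, namely 0.5 and 0.8 times the preferred lifetime. Whether IOS derives them exactly this way is an assumption on my part, but the arithmetic is easy to check with a few lines of Python:

```python
preferred_lifetime = 604800  # seconds, from the binding output (7 days)
valid_lifetime = 2592000     # seconds, from the binding output (30 days)

# Assumed ratios: renew at 50% and rebind at 80% of the preferred lifetime.
t1 = preferred_lifetime * 5 // 10
t2 = preferred_lifetime * 8 // 10

print(t1, t2)  # compare with T1 and T2 in the binding output
print(preferred_lifetime // 86400, "days preferred")
print(valid_lifetime // 86400, "days valid")
```

So with these values the CE would start renewing its prefix after 3.5 days, well before the 7 day preferred lifetime runs out.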

We see that we do indeed have some bindings taking place. What’s of more interest though, is the fact that static routes have been created:

R2#sh ipv6 route static | beg a - Ap
       a - Application
S   2001:1111:2222::/56 [1/0]
     via FE80::250:56FF:FEBE:4754, GigabitEthernet2
S   2001:1111:2222:100::/56 [1/0]
     via FE80::250:56FF:FEBE:93CC, GigabitEthernet3

So we have two static routes that point to the CE routers. This makes it extremely simple to propagate them further into the SP core:

R2#sh run | sec router bgp
router bgp 64512
 bgp router-id 2.2.2.2
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 2001:1111::12:1 remote-as 64512
 !
 address-family ipv4
 exit-address-family
 !
 address-family ipv6
  redistribute static
  neighbor 2001:1111::12:1 activate
 exit-address-family

Of course some sort of filtering should be used instead of just redistributing every static route on the PE, but you get the point. So let’s check it out on R3 for example:

R3#sh bgp ipv6 uni | beg Network
     Network          Next Hop            Metric LocPrf Weight Path
 *>i 2001:1111:2222::/56
                       2001:1111::12:2          0    100      0 ?
 *>i 2001:1111:2222:100::/56
                       2001:1111::12:2          0    100      0 ?

We do indeed have the two routes installed.

So how could the customer set up their routers to learn these prefixes automatically and use them actively?
Well, one solution would be stateless autoconfiguration, which I have opted to use here, along with learning the default route the same way. On R5:

R5#sh run int g1.405
Building configuration...

Current configuration : 96 bytes
!
interface GigabitEthernet1.405
 encapsulation dot1Q 405
 ipv6 address autoconfig default
end

R5#sh ipv6 route | beg a - Ap
       a - Application
ND  ::/0 [2/0]
     via FE80::250:56FF:FEBE:49F3, GigabitEthernet1.405
NDp 2001:1111:2222:1::/64 [2/0]
     via GigabitEthernet1.405, directly connected
L   2001:1111:2222:1:250:56FF:FEBE:3DFB/128 [0/0]
     via GigabitEthernet1.405, receive
L   FF00::/8 [0/0]
     via Null0, receive

and R6:

R6#sh run int g1.406
Building configuration...

Current configuration : 96 bytes
!
interface GigabitEthernet1.406
 encapsulation dot1Q 406
 ipv6 address autoconfig default
end

R6#sh ipv6 route | beg a - App
       a - Application
ND  ::/0 [2/0]
     via FE80::250:56FF:FEBE:49F3, GigabitEthernet1.406
NDp 2001:1111:2222:2::/64 [2/0]
     via GigabitEthernet1.406, directly connected
L   2001:1111:2222:2:250:56FF:FEBE:D054/128 [0/0]
     via GigabitEthernet1.406, receive
L   FF00::/8 [0/0]
     via Null0, receive
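To show where the autoconfigured addresses on R5 and R6 come from, here is a small Python sketch of the modified EUI-64 procedure. The MAC addresses are not shown anywhere in the lab output; I have inferred them by reversing the interface IDs, so treat them as assumptions.

```python
import ipaddress


def slaac_address(prefix, mac):
    """Build an address from a /64 prefix and a MAC using modified EUI-64."""
    net = ipaddress.IPv6Network(prefix)
    octets = bytearray(int(part, 16) for part in mac.split(":"))
    octets[0] ^= 0x02  # flip the universal/local bit
    # Insert FF:FE between the OUI and the NIC-specific half of the MAC.
    iid = bytes(octets[:3]) + b"\xff\xfe" + bytes(octets[3:])
    addr = int(net.network_address) | int.from_bytes(iid, "big")
    return str(ipaddress.IPv6Address(addr))


# MAC addresses below are inferred, not taken from the lab.
print(slaac_address("2001:1111:2222:1::/64", "00:50:56:be:3d:fb"))  # R5
print(slaac_address("2001:1111:2222:2::/64", "00:50:56:be:d0:54"))  # R6
```

Apart from the lowercase hex digits, the results line up with the local (L) routes in the two routing tables above.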

So now we have the SP core in place, and we have the customer’s internal network in place. All that’s really required now is for some sort of routing to take place on the CE routers toward the SP. I have chosen the simplest solution, a static default route:

R4#sh run | incl ipv6 route
ipv6 route ::/0 2001:1111:101::2

and on R7:

R7#sh run | incl ipv6 route
ipv6 route ::/0 2001:1111:102::2

Finally, it’s time to test all this stuff out in the data plane.

Let’s ping from R3 to R5 and R6:

R3#ping 2001:1111:2222:1:250:56FF:FEBE:3DFB
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001:1111:2222:1:250:56FF:FEBE:3DFB, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/12/20 ms
R3#ping 2001:1111:2222:2:250:56FF:FEBE:D054
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001:1111:2222:2:250:56FF:FEBE:D054, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/7/17 ms

And also to R7:

R3#ping 2001:1111:2222:101::7
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001:1111:2222:101::7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/8/18 ms

Excellent. Everything works.

Let’s summarize what we have done.

1) We created a local IPv6 pool on the PE router.
2) We created a DHCPv6 server utilizing this local pool as a prefix delegation.
3) We enabled the DHCPv6 server on the customer facing interfaces.
4) We enabled the DHCPv6 PD on the CE routers (R4 and R7) and used a local label as an identifier.
5) We enabled IPv6 addresses using PD on the local interfaces toward R5, R6 on Customer A and on R7 on Customer B.
6) We used stateless autoconfiguration internal to the customers to further propagate the IPv6 prefixes.
7) We created static routing on the CE routers toward the SP.
8) We redistributed statics into BGP on the PE router.
9) We verified that IPv6 prefixes were being delegated through DHCPv6.
10) And finally we verified that everything was working in the data plane.

I hope this has covered a pretty niche IPv6 topic and that it has been useful to you.

Take care!

VRF based path selection

In this post I will be showing you how it’s possible to use different paths between your PE routers on a per-VRF basis.

This is very useful if you have customers you want to “steer” away from your normal traffic flow between PE routers.
For example, this could be due to certain SLAs.

I will be using the following topology to demonstrate how this can be done:

Topology

A short walkthrough of the topology is in order.

In the service provider core we have 4 routers: R3, XRv-1, XRv-2 and R4. R3 and R4 are IOS-XE based routers and XRv-1 and XRv-2 are, as the names imply, IOS-XR routers. There is no significance attached to the fact that I’m running two XR routers. It’s simply how I could build the required topology.

The service provider is running OSPF as the IGP, with R3 and R4 being the PE routers for an MPLS L3 VPN service. On top of that, LDP is being used to build the required LSPs. The IGP has been modified to prefer the northbound path (R3 -> XRv-1 -> R4) by increasing the cost of the southbound links (R3 to XRv-2 and XRv-2 to R4) to 100.

So by default, traffic between R3 and R4 will flow northbound.
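The path preference is simply a matter of adding up link costs. A minimal Python sketch of that comparison follows. The cost of 100 on the southbound links is from the lab; the cost of 1 on the northbound links is my assumption, since the actual value is not shown.

```python
# Cost 100 on the southbound links is from the lab description.
# Cost 1 on the northbound links is an assumed value for illustration.
LINK_COST = {
    ("R3", "XRv-1"): 1,
    ("XRv-1", "R4"): 1,
    ("R3", "XRv-2"): 100,
    ("XRv-2", "R4"): 100,
}


def path_cost(path):
    """Sum the cost of every hop along a list of router names."""
    return sum(LINK_COST[(a, b)] for a, b in zip(path, path[1:]))


paths = [["R3", "XRv-1", "R4"], ["R3", "XRv-2", "R4"]]
for path in paths:
    print(" -> ".join(path), "cost", path_cost(path))

best = min(paths, key=path_cost)
print("preferred:", " -> ".join(best))
```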

We can easily verify this:

R3#traceroute 4.4.4.4
Type escape sequence to abort.
Tracing the route to 4.4.4.4
VRF info: (vrf in name/id, vrf out name/id)
  1 10.3.10.10 [MPLS: Label 16005 Exp 0] 16 msec 1 msec 1 msec
  2 10.4.10.4 1 msec *  5 msec

And the reverse path is the same:

R4#traceroute 3.3.3.3
Type escape sequence to abort.
Tracing the route to 3.3.3.3
VRF info: (vrf in name/id, vrf out name/id)
  1 10.4.10.10 [MPLS: Label 16000 Exp 0] 3 msec 2 msec 0 msec
  2 10.3.10.3 1 msec *  5 msec

Besides confirming that traffic flows the desired way, we can see that we are using label switching between the loopbacks. Exactly what we want in this type of setup.

On the customer side, we have 2 customers, Customer A and Customer B. Each of them has 2 sites, one behind R3 and one behind R4. Pretty simple. They are all running EIGRP between the CEs and the PEs.

Beyond this, we have MPLS Traffic Engineering running in the service provider core as well. Specifically, we are running a tunnel going from R3’s loopback200 (33.33.33.33/32) towards R4’s loopback200 (44.44.44.44/32), and one in the opposite direction. This has been accomplished by configuring an explicit path on both R3 and R4.

Let’s verify the tunnel configuration on both:

On R3:

R3#sh ip expl
PATH NEW-R3-TO-R4 (strict source route, path complete, generation 8)
    1: next-address 10.3.20.20
    2: next-address 10.4.20.4
R3#sh run int tunnel10
Building configuration...

Current configuration : 180 bytes
!
interface Tunnel10
 ip unnumbered Loopback200
 tunnel mode mpls traffic-eng
 tunnel destination 10.4.20.4
 tunnel mpls traffic-eng path-option 10 explicit name NEW-R3-TO-R4
end

And on R4:

R4#sh ip expl
PATH NEW-R4-TO-R3 (strict source route, path complete, generation 4)
    1: next-address 10.4.20.20
    2: next-address 10.3.20.3
R4#sh run int tun10
Building configuration...

Current configuration : 180 bytes
!
interface Tunnel10
 ip unnumbered Loopback200
 tunnel mode mpls traffic-eng
 tunnel destination 10.3.20.3
 tunnel mpls traffic-eng path-option 10 explicit name NEW-R4-TO-R3
end

On top of that, we have configured a static route on both R3 and R4 to steer traffic for each other’s loopback200 interface down the tunnel:

R3#sh run | incl ip route
ip route 44.44.44.44 255.255.255.255 Tunnel10

R4#sh run | incl ip route
ip route 33.33.33.33 255.255.255.255 Tunnel10

Resulting in the following RIBs:

R3#sh ip route 44.44.44.44
Routing entry for 44.44.44.44/32
  Known via "static", distance 1, metric 0 (connected)
  Routing Descriptor Blocks:
  * directly connected, via Tunnel10
      Route metric is 0, traffic share count is 1
	  
R4#sh ip route 33.33.33.33
Routing entry for 33.33.33.33/32
  Known via "static", distance 1, metric 0 (connected)
  Routing Descriptor Blocks:
  * directly connected, via Tunnel10
      Route metric is 0, traffic share count is 1

And to test that we are actually using the southbound path (R3 -> XRv-2 -> R4), let’s traceroute between the loopbacks (loopback200):

on R3:

R3#traceroute 44.44.44.44 so loopback200
Type escape sequence to abort.
Tracing the route to 44.44.44.44
VRF info: (vrf in name/id, vrf out name/id)
  1 10.3.20.20 [MPLS: Label 16007 Exp 0] 4 msec 2 msec 1 msec
  2 10.4.20.4 1 msec *  3 msec

and on R4:

R4#traceroute 33.33.33.33 so loopback200
Type escape sequence to abort.
Tracing the route to 33.33.33.33
VRF info: (vrf in name/id, vrf out name/id)
  1 10.4.20.20 [MPLS: Label 16008 Exp 0] 4 msec 1 msec 1 msec
  2 10.3.20.3 1 msec *  3 msec

This verifies that we have our two unidirectional tunnels and that communication between the loopback200 interfaces flows through the southbound path using our TE tunnels.

So let’s take a look at the very simple BGP PE configuration on both R3 and R4:

R3:

router bgp 100
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 4.4.4.4 remote-as 100
 neighbor 4.4.4.4 update-source Loopback100
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 4.4.4.4 activate
  neighbor 4.4.4.4 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf A
  redistribute eigrp 100
 exit-address-family
 !
 address-family ipv4 vrf B
  redistribute eigrp 100
 exit-address-family

and R4:

router bgp 100
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 3.3.3.3 remote-as 100
 neighbor 3.3.3.3 update-source Loopback100
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf A
  redistribute eigrp 100
 exit-address-family
 !
 address-family ipv4 vrf B
  redistribute eigrp 100
 exit-address-family

From this output, we can see that we are using the loopback100 interfaces for the BGP peering. As routing updates come in from one PE, the next-hop will be set to the remote PE’s loopback100 interface. This will then cause the transport label to be the one going to this loopback100 interface.

A traceroute from R1’s loopback0 interface to R5’s loopback0 interface will show us the path that traffic between the sites in VRF A (Customer A) will take:

R1:

R1#traceroute 5.5.5.5 so loo0
Type escape sequence to abort.
Tracing the route to 5.5.5.5
VRF info: (vrf in name/id, vrf out name/id)
  1 10.1.3.3 1 msec 1 msec 0 msec
  2 10.3.10.10 [MPLS: Labels 16005/408 Exp 0] 6 msec 1 msec 10 msec
  3 10.4.5.4 [MPLS: Label 408 Exp 0] 15 msec 22 msec 17 msec
  4 10.4.5.5 18 msec *  4 msec

And let’s compare that to what R3 will use as the transport label to reach R4’s loopback100 interface:

 
R3#sh mpls for
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id     Switched      interface
300        Pop Label  10.10.10.10/32   0             Gi1.310    10.3.10.10
301        Pop Label  10.4.10.0/24     0             Gi1.310    10.3.10.10
302        Pop Label  20.20.20.20/32   0             Gi1.320    10.3.20.20
303        16004      10.4.20.0/24     0             Gi1.310    10.3.10.10
304   [T]  Pop Label  44.44.44.44/32   0             Tu10       point2point
305        16005      4.4.4.4/32       0             Gi1.310    10.3.10.10
310        No Label   1.1.1.1/32[V]    2552          Gi1.13     10.1.3.1
311        No Label   10.1.3.0/24[V]   0             aggregate/A
312        No Label   2.2.2.2/32[V]    2552          Gi1.23     10.2.3.2
313        No Label   10.2.3.0/24[V]   0             aggregate/B

We can see that this matches: the transport label is 16005 (going to XRv-1), through the northbound path.

This raises the question: how do we steer our traffic through the southbound path using the loopback200 instead, when the peering is between the loopback100 interfaces?

Well, thankfully IOS has it covered. Under the VRF configuration for Customer B (VRF B), we have the option of setting which loopback interface is used as the next-hop of updates sent to the remote PE:

On R3:

vrf definition B
 rd 100:2
 !
 address-family ipv4
  route-target export 100:2
  route-target import 100:2
  bgp next-hop Loopback200
 exit-address-family

and the same on R4:

 vrf definition B
  rd 100:2
  !
  address-family ipv4
   route-target export 100:2
   route-target import 100:2
   bgp next-hop Loopback200
  exit-address-family

This causes the BGP updates to contain the “correct” next-hop:

R3:

R3#sh bgp vpnv4 uni vrf B | beg Route Dis
Route Distinguisher: 100:2 (default for vrf B)
 *>  2.2.2.2/32       10.2.3.2            130816         32768 ?
 *>i 6.6.6.6/32       44.44.44.44         130816    100      0 ?
 *>  10.2.3.0/24      0.0.0.0                  0         32768 ?
 *>i 10.4.6.0/24      44.44.44.44              0    100      0 ?

44.44.44.44/32 being the loopback200 of R4, and on R4:

R4#sh bgp vpnv4 uni vrf B | beg Route Dis
Route Distinguisher: 100:2 (default for vrf B)
 *>i 2.2.2.2/32       33.33.33.33         130816    100      0 ?
 *>  6.6.6.6/32       10.4.6.6            130816         32768 ?
 *>i 10.2.3.0/24      33.33.33.33              0    100      0 ?
 *>  10.4.6.0/24      0.0.0.0                  0         32768 ?

Let’s check out whether this actually works or not:

R2#traceroute 6.6.6.6 so loo0
Type escape sequence to abort.
Tracing the route to 6.6.6.6
VRF info: (vrf in name/id, vrf out name/id)
  1 10.2.3.3 1 msec 1 msec 0 msec
  2 10.3.20.20 [MPLS: Labels 16007/409 Exp 0] 4 msec 1 msec 10 msec
  3 10.4.6.4 [MPLS: Label 409 Exp 0] 15 msec 16 msec 17 msec
  4 10.4.6.6 19 msec *  4 msec

Excellent! We can see that we are indeed using the southbound path. To make sure we are using the tunnel, note the transport label of 16007, and compare that to:

R3:

R3#sh mpls traffic-eng tun tunnel 10

Name: R3_t10                              (Tunnel10) Destination: 10.4.20.4
  Status:
    Admin: up         Oper: up     Path: valid       Signalling: connected
    path option 10, type explicit NEW-R3-TO-R4 (Basis for Setup, path weight 200)

  Config Parameters:
    Bandwidth: 0        kbps (Global)  Priority: 7  7   Affinity: 0x0/0xFFFF
    Metric Type: TE (default)
    AutoRoute: disabled LockDown: disabled Loadshare: 0 [0] bw-based
    auto-bw: disabled
  Active Path Option Parameters:
    State: explicit path option 10 is active
    BandwidthOverride: disabled  LockDown: disabled  Verbatim: disabled


  InLabel  :  -
  OutLabel : GigabitEthernet1.320, 16007
  Next Hop : 10.3.20.20

I have deleted a lot of non-relevant output, but pay attention to the OutLabel, which is indeed 16007.
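To tie the pieces together, here is a small Python model of how the BGP next-hop decides the transport label on R3. The labels and interfaces are copied from the outputs in this post; the dictionary layout and function name are just my own illustration, not how IOS stores any of this.

```python
# Transport side on R3: how each remote loopback is reached.
TRANSPORT = {
    "4.4.4.4": {"label": 16005, "via": "Gi1.310"},           # LDP LSP, northbound
    "44.44.44.44": {"label": 16007, "via": "Tu10/Gi1.320"},  # TE tunnel, southbound
}

# Per-VRF BGP next-hop (R4's loopback100 or loopback200) and the VPN label
# seen in the traceroutes.
VRF = {
    "A": {"next_hop": "4.4.4.4", "vpn_label": 408},
    "B": {"next_hop": "44.44.44.44", "vpn_label": 409},  # bgp next-hop Loopback200
}


def label_stack(vrf_name):
    """Return ([transport label, VPN label], outgoing path) for a VRF."""
    vrf = VRF[vrf_name]
    transport = TRANSPORT[vrf["next_hop"]]
    return [transport["label"], vrf["vpn_label"]], transport["via"]


for name in ("A", "B"):
    stack, via = label_stack(name)
    print("VRF", name, "labels", stack, "via", via)
```

The two stacks correspond to the 16005/408 and 16007/409 label pairs seen in the traceroutes from R1 and R2.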

So that was a quick walkthrough of how easy it is to accomplish the stated goal once you know about that nifty IOS command.

I hope it’s been useful to you.

Take Care!

Using the OSPF Forwarding Address for traffic-steering

In this fairly short post, I’d like to address a topic that came up on IRC (#cciestudy @ freenode.net). It’s about how a route is selected when it is redistributed into an OSPF NSSA area and then comes into the OSPF backbone area 0.

For my post I will be using the very simple topology below. Nothing else is necessary to illustrate what is going on.

FA-NSSA-Topology

First off, I’d like to clarify a few things about what takes place when redistributing routes into an NSSA area.

What happens is that you have an external network, 4.4.4.4/32 in our example. This is _not_ part of the current area 1. When this network is being redistributed into area 1, its forwarding address will be set to the address of the highest active interface of the redistributing router in the area (R4 in our case). The highest interface in the area local to the router is Loopback100 with an address of 44.44.44.44/32.

*A reader noted that a loopback address will beat a physical interface even if it has a lower address. This is true and goes for OSPF in general. Thanks!

Let’s verify the configuration on R4 and the result of the redistribution in the OSPF database:

R4#sh run | sec router ospf
router ospf 100
router-id 144.144.144.144
log-adjacency-changes
area 1 nssa
redistribute connected subnets
network 10.2.0.0 0.0.255.255 area 1
network 10.3.0.0 0.0.255.255 area 1
network 44.44.44.44 0.0.0.0 area 1

So we are running Area 1 on three interfaces: the two links connecting to R2 and R3, along with the loopback100 interface.

And the output of the relevant section of the OSPF database is:

R4#sh ip os data nssa

OSPF Router with ID (144.144.144.144) (Process ID 100)

Type-7 AS External Link States (Area 1)

LS age: 408
Options: (No TOS-capability, Type 7/5 translation, DC)
LS Type: AS External Link
Link State ID: 4.4.4.4 (External Network Number )
Advertising Router: 144.144.144.144
LS Seq Number: 80000001
Checksum: 0x4A49
Length: 36
Network Mask: /32
Metric Type: 2 (Larger than any link state path)
MTID: 0
Metric: 20
Forward Address: 44.44.44.44
External Route Tag: 0

What we are verifying here is that the FA is in fact set according to the aforementioned rules, namely to 44.44.44.44.

Let’s take a look at the OSPF configuration of R2 and R3:

R2#sh run | sec router ospf
router ospf 100
router-id 22.22.22.22
log-adjacency-changes
area 1 nssa
network 10.1.2.0 0.0.0.255 area 0
network 10.2.4.0 0.0.0.255 area 1

And R3:

R3#sh run | sec router ospf
router ospf 100
log-adjacency-changes
area 1 nssa
network 10.1.3.0 0.0.0.255 area 0
network 10.3.4.0 0.0.0.255 area 1

Very straightforward so far, with the exception that I have manually set R2’s router-id to force it to be higher than R3’s. This is to prove the point below.

Now what we should ideally see is that the ABR (R2 or R3) with the highest router-id will do the type-7 to type-5 translation and preserve the FA of the type-7. What we would like to see on R1 is a type-5 LSA with a Forwarding Address of 44.44.44.44, with the advertising router being R2 (22.22.22.22). Let’s check it out:

R1#sh ip os data ex

OSPF Router with ID (10.1.3.1) (Process ID 100)

Type-5 AS External Link States

Routing Bit Set on this LSA in topology Base with MTID 0
LS age: 630
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 4.4.4.4 (External Network Number )
Advertising Router: 22.22.22.22
LS Seq Number: 80000001
Checksum: 0x394E
Length: 36
Network Mask: /32
Metric Type: 2 (Larger than any link state path)
MTID: 0
Metric: 20
Forward Address: 44.44.44.44
External Route Tag: 0

R1#sh ip route 4.4.4.4
Routing entry for 4.4.4.4/32
Known via "ospf 100", distance 110, metric 20, type extern 2, forward metric 3
Last update from 10.1.3.3 on FastEthernet1/1, 00:11:03 ago
Routing Descriptor Blocks:
10.1.3.3, from 22.22.22.22, 00:11:03 ago, via FastEthernet1/1
Route metric is 20, traffic share count is 1
* 10.1.2.2, from 22.22.22.22, 00:11:03 ago, via FastEthernet1/0
Route metric is 20, traffic share count is 1

Very good, we are in fact seeing this LSA with the information we expected. We can also see something you might not expect, namely the fact that we have two paths installed in the RIB for 4.4.4.4/32. Why is that?

Well, what R1 really cares about is “how” it can get to the Forwarding Address of the route and in this case, it can get to 44.44.44.44/32 through 2 paths, R2 and R3.

Let’s check out what happens if we block 44.44.44.44/32 going from Area 1 to Area 0 through R2.

R2#sh run | incl prefix-list
ip prefix-list BLOCK-R4-LOOPBACK seq 5 deny 44.44.44.44/32
ip prefix-list BLOCK-R4-LOOPBACK seq 10 permit 0.0.0.0/0 le 32

R2#sh run | sec router ospf
router ospf 100
router-id 22.22.22.22
log-adjacency-changes
area 1 nssa
area 1 filter-list prefix BLOCK-R4-LOOPBACK out
network 10.1.2.0 0.0.0.255 area 0
network 10.2.4.0 0.0.0.255 area 1

Let’s see what this does to the RIB of R1:

R1#sh ip route | beg Gateway
Gateway of last resort is not set

4.0.0.0/32 is subnetted, 1 subnets
O E2 4.4.4.4 [110/20] via 10.1.3.3, 00:16:43, FastEthernet1/1
10.0.0.0/8 is variably subnetted, 6 subnets, 2 masks
C 10.1.2.0/24 is directly connected, FastEthernet1/0
L 10.1.2.1/32 is directly connected, FastEthernet1/0
C 10.1.3.0/24 is directly connected, FastEthernet1/1
L 10.1.3.1/32 is directly connected, FastEthernet1/1
O IA 10.2.4.0/24 [110/2] via 10.1.2.2, 00:16:47, FastEthernet1/0
O IA 10.3.4.0/24 [110/2] via 10.1.3.3, 00:23:41, FastEthernet1/1
44.0.0.0/32 is subnetted, 1 subnets
O IA 44.44.44.44 [110/3] via 10.1.3.3, 00:16:48, FastEthernet1/1

and the LSA is still the same as before:

R1#sh ip os data ex

OSPF Router with ID (10.1.3.1) (Process ID 100)

Type-5 AS External Link States

Routing Bit Set on this LSA in topology Base with MTID 0
LS age: 1027
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 4.4.4.4 (External Network Number )
Advertising Router: 22.22.22.22
LS Seq Number: 80000001
Checksum: 0x394E
Length: 36
Network Mask: /32
Metric Type: 2 (Larger than any link state path)
MTID: 0
Metric: 20
Forward Address: 44.44.44.44
External Route Tag: 0

So what this tells us is that if the Forwarding Address is different from 0.0.0.0 (which we’ll cover in a minute) and you don’t have reachability to whatever it’s set to, you cannot install the route in the RIB.

In our case we still have one valid path through R3, so it’s still in the RIB, but not with load-balancing.

So to summarize what we have covered so far:
– Even though only 1 ABR creates the new type-5 (type-7 to type-5 translation), you can have load-balancing occurring.
– If you don’t have a valid path to the Forwarding Address, you cannot install the route in the RIB.

Let’s revert our configuration on R2:

R2#sh run | sec router ospf
router ospf 100
router-id 22.22.22.22
log-adjacency-changes
area 1 nssa
network 10.1.2.0 0.0.0.255 area 0
network 10.2.4.0 0.0.0.255 area 1

Now let’s take a look at FA suppression!

Instead of preserving the FA according to the previously mentioned rules, FA suppression sets the Forwarding Address to 0.0.0.0, indicating that the router originating the Type-5 should be used as the exit point.

We’ve already established that R2 is the router performing the Type-7 to Type-5 translation, so let’s do the following configuration on R2:

R2(config-router)#area 1 nssa translate type7 suppress-fa

What does this do to our OSPF database on R1, specifically the Type-5 LSA?

R1#sh ip os data ext

OSPF Router with ID (10.1.3.1) (Process ID 100)

Type-5 AS External Link States

Routing Bit Set on this LSA in topology Base with MTID 0
LS age: 33
Options: (No TOS-capability, DC)
LS Type: AS External Link
Link State ID: 4.4.4.4 (External Network Number )
Advertising Router: 22.22.22.22
LS Seq Number: 80000002
Checksum: 0x96A0
Length: 36
Network Mask: /32
Metric Type: 2 (Larger than any link state path)
MTID: 0
Metric: 20
Forward Address: 0.0.0.0
External Route Tag: 0

Indeed, the Forwarding Address has been set to 0.0.0.0, indicating that the Advertising Router (22.22.22.22) should be used as the exit point. This also has the effect of preventing our load-balancing from occurring:

R1#sh ip route 4.4.4.4
Routing entry for 4.4.4.4/32
Known via "ospf 100", distance 110, metric 20, type extern 2, forward metric 1
Last update from 10.1.2.2 on FastEthernet1/0, 00:03:48 ago
Routing Descriptor Blocks:
* 10.1.2.2, from 22.22.22.22, 00:03:48 ago, via FastEthernet1/0
Route metric is 20, traffic share count is 1

So depending on how you want to “steer” your traffic, you might want to consider whether you allow the Forwarding Address through your topology and if you want to use FA suppression or not.
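The three cases we walked through can be captured in a few lines of Python. This is a deliberately simplified model of R1’s decision, assuming an exact /32 match on the Forwarding Address rather than a real longest-prefix lookup; the function and variable names are mine, and the next hops come from the outputs above.

```python
def external_next_hops(lsa, routes, asbr_routes):
    """Return the next hops for a Type-5 LSA, or [] if it cannot be installed."""
    fa = lsa["forward_address"]
    if fa == "0.0.0.0":
        # No forwarding address: head toward the advertising router itself.
        return asbr_routes.get(lsa["advertising_router"], [])
    # Non-zero forwarding address: follow whatever route leads to it.
    return routes.get(fa + "/32", [])


lsa = {"advertising_router": "22.22.22.22", "forward_address": "44.44.44.44"}
to_r2 = {"22.22.22.22": ["10.1.2.2"]}

# 1) FA preserved, both ABRs advertise 44.44.44.44/32: two paths.
print(external_next_hops(lsa, {"44.44.44.44/32": ["10.1.2.2", "10.1.3.3"]}, to_r2))

# 2) R2 filters 44.44.44.44/32: only the path via R3 remains.
print(external_next_hops(lsa, {"44.44.44.44/32": ["10.1.3.3"]}, to_r2))

# 3) FA suppressed on R2: traffic goes to the translating ABR only.
suppressed = dict(lsa, forward_address="0.0.0.0")
print(external_next_hops(suppressed, {"44.44.44.44/32": ["10.1.3.3"]}, to_r2))

# 4) No route to the FA at all: nothing can be installed.
print(external_next_hops(lsa, {}, to_r2))
```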

I hope it’s been useful to you!

Take care!

Using LISP for IPv6 tunnelling.

In this post I would like to show how it’s possible to use a fairly new protocol, LISP, to interconnect IPv6 islands over an IPv4 backbone/core network.

LISP stands for Locator/ID Separation Protocol. As the name suggests, it’s actually meant to decouple location from identity. This means it can be used for such cool things as mobility, whether for VMs or a mobile data connection.

However, another aspect of using LISP involves its tunneling mechanism. This is what I will be using in my example to give the IPv6 islands the ability to communicate over the IPv4-only backbone.

There is a lot of terminology involved with LISP, but I will only use some of it here for clarity. If you want to know more about LISP, a good place to start is http://lisp.cisco.com.

The topology I will be using is a modified version of one presented in a Cisco Live presentation called “BRKRST-3046 – Advanced LISP – Whats in it for me?”. I encourage you to view this as well for more information.

Here is the topology:

LISP-IPv6-Topology

Some background information about the setup: both Site 1 and Site 2 are using EIGRP as the IGP. Both IPv4 and IPv6 are being routed internally. A default route is originated by R2, R3 and R6 in their respective sites.

The RIB on R1 for both IPv4 and IPv6:

R1#sh ip route eigrp | beg Gateway
Gateway of last resort is 172.16.10.3 to network 0.0.0.0

D*EX  0.0.0.0/0 [170/2560000512] via 172.16.10.3, 1d00h, GigabitEthernet1.100
                [170/2560000512] via 172.16.10.2, 1d00h, GigabitEthernet1.100
R1#sh ipv6 route eigrp
<snip>
EX  ::/0 [170/2816]
     via FE80::250:56FF:FEBE:675D, GigabitEthernet1.100
     via FE80::250:56FF:FEBE:9215, GigabitEthernet1.100

And R7:

R7#sh ip ro eigrp | beg Gateway
Gateway of last resort is 172.16.20.6 to network 0.0.0.0

D*EX  0.0.0.0/0 [170/2560000512] via 172.16.20.6, 1d00h, GigabitEthernet1.67

R7#sh ipv6 route eigrp
<snip>
EX  ::/0 [170/2816]
     via FE80::250:56FF:FEBE:D054, GigabitEthernet1.67

Now, in order to get anywhere, we need to set up our LISP infrastructure. This means configuring R2, R3 and R6 as what’s known as RLOCs, as well as configuring R5 as a map-server and map-resolver. A map-server/resolver is where RLOCs register the internal IP scopes they have in their sites. It’s also where each RLOC asks for information on how to reach other sites. So obviously it is a very important part of our LISP setup. Here is the relevant configuration on R5:

router lisp
 site SITE1
  authentication-key blah
  eid-prefix 153.16.1.1/32
  eid-prefix 153.16.1.2/32
  eid-prefix 172.16.10.0/24 accept-more-specifics
  eid-prefix 2001::1/128
  eid-prefix 2001::2/128
  eid-prefix 2001:100::/32 accept-more-specifics
  exit
 !
 site SITE2
  authentication-key blah
  eid-prefix 153.16.2.1/32
  eid-prefix 172.16.20.0/24
  eid-prefix 2001::7/128
  eid-prefix 2001:67::/32 accept-more-specifics
  exit
 !
 ipv4 map-server
 ipv4 map-resolver
 ipv6 map-server
 ipv6 map-resolver

On IOS-XE, which is what I’m using to build this lab, all configuration is done under the router lisp configuration mode.

As can be seen from the configuration, two sites have been defined, SITE1 and SITE2.
An authentication key has been configured for each site. Furthermore, the prefixes that we want to accept from each site have also been configured. If our addressing scheme had been somewhat more thought out, we could have relied on “accept-more-specifics” to accept the more specific subnets, but this configuration serves our purpose.
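To make the “accept-more-specifics” behaviour a bit more concrete, here is a rough Python model of the check a map-server performs when a registration arrives. It is only a sketch of the behaviour as I understand it, not the actual IOS-XE logic, and the example registrations at the bottom are hypothetical.

```python
import ipaddress

# SITE1 as configured on R5: (eid-prefix, accept-more-specifics?)
SITE1 = [
    ("153.16.1.1/32", False),
    ("153.16.1.2/32", False),
    ("172.16.10.0/24", True),
    ("2001::1/128", False),
    ("2001::2/128", False),
    ("2001:100::/32", True),
]


def registration_accepted(eid, site):
    """Accept exact matches, or subnets of prefixes that allow more specifics."""
    eid_net = ipaddress.ip_network(eid)
    for prefix, more_specifics in site:
        configured = ipaddress.ip_network(prefix)
        if configured.version != eid_net.version:
            continue  # never compare IPv4 against IPv6
        if eid_net == configured:
            return True
        if more_specifics and eid_net.subnet_of(configured):
            return True
    return False


# Hypothetical registrations for illustration.
for eid in ("172.16.10.0/25", "2001:100:1::/48", "153.16.1.0/24"):
    print(eid, registration_accepted(eid, SITE1))
```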

Pay attention to the fact that we do this for each address-family. For our IPv6 example this is really not necessary, but I wanted to provide both IPv4 and IPv6 connectivity, so I configured both.

Finally I’ve configured R5 as both a map-server and map-resolver for each address-family.

Next up is the configuration for R2:

R2#sh run | sec router lisp
router lisp
 locator-set SITE1
  10.1.1.1 priority 10 weight 50
  10.1.2.1 priority 10 weight 50
  exit
 !
 database-mapping 153.16.1.1/32 locator-set SITE1
 database-mapping 153.16.1.2/32 locator-set SITE1
 database-mapping 172.16.10.0/24 locator-set SITE1
 database-mapping 2001::1/128 locator-set SITE1
 database-mapping 2001::2/128 locator-set SITE1
 database-mapping 2001:100::/32 locator-set SITE1
 ipv4 itr map-resolver 10.1.3.1
 ipv4 itr
 ipv4 etr map-server 10.1.3.1 key blah
 ipv4 etr
 ipv6 itr map-resolver 10.1.3.1
 ipv6 itr
 ipv6 etr map-server 10.1.3.1 key blah
 ipv6 etr

The first part of this configuration lists a “Locator-Set”. This is where you want to list each RLOC for the site in question. For our SITE1 we have 2 RLOCs with IPv4 addresses in the IPv4 transport cloud, being 10.1.1.1 and 10.1.2.1 for R2 and R3 respectively.

One of the very cool things about LISP is how you can achieve redundancy and/or load-balancing signaled by the local RLOCs. If we changed the priority of R3 (10.1.2.1) to 20, we would effectively tell the other site(s) that we prefer R2 as the egress tunnel router (ETR), since the lower priority value wins, so all traffic would be sent to R2. However, if we instead leave the priorities the same and modify the weights, we can load-balance traffic. Again, this is signaled by the local site and replicated to the remote site(s).
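Here is a small Python sketch of how a remote ITR could interpret the priority and weight values it receives. The first call uses the locator-set exactly as configured; the other two use hypothetical values. I have left out special cases such as a priority of 255 or all-zero weights.

```python
def usable_rlocs(locators):
    """Keep only the locators with the best (numerically lowest) priority."""
    best = min(priority for _, priority, _ in locators)
    return [(addr, weight) for addr, priority, weight in locators if priority == best]


def traffic_share(locators):
    """Split traffic across the best-priority locators in proportion to weight."""
    chosen = usable_rlocs(locators)
    total = sum(weight for _, weight in chosen)
    return {addr: weight / total for addr, weight in chosen}


# As configured: equal priority and weight, so traffic is shared evenly.
print(traffic_share([("10.1.1.1", 10, 50), ("10.1.2.1", 10, 50)]))

# Hypothetical: R3 at priority 20, so R2 takes everything.
print(traffic_share([("10.1.1.1", 10, 50), ("10.1.2.1", 20, 50)]))

# Hypothetical: same priority, unequal weights, so a 75/25 split.
print(traffic_share([("10.1.1.1", 10, 75), ("10.1.2.1", 10, 25)]))
```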

Next up are our mappings. This is where we define which prefixes we want to use in this site. Here we have the loopbacks of R1 and the network used for connectivity in SITE 1, both for IPv4 and IPv6. Again, IPv4 is not necessary for our example.

Finally, we define a map-resolver for the ITR (Ingress Tunnel Router) role and a map-server for the ETR (Egress Tunnel Router) role. This is so we know where to send our mapping data (the map-server) as well as where to ask about other ETRs (the map-resolver). We also define ourselves as ITR and ETR for both address-families.

The exact same configuration has been applied on R3:

R3#sh run | sec router lisp
router lisp
 locator-set SITE1
  10.1.1.1 priority 10 weight 50
  10.1.2.1 priority 10 weight 50
  exit
 !
 database-mapping 153.16.1.1/32 locator-set SITE1
 database-mapping 153.16.1.2/32 locator-set SITE1
 database-mapping 172.16.10.0/24 locator-set SITE1
 database-mapping 2001::1/128 locator-set SITE1
 database-mapping 2001::2/128 locator-set SITE1
 database-mapping 2001:100::/32 locator-set SITE1
 ipv4 itr map-resolver 10.1.3.1
 ipv4 itr
 ipv4 etr map-server 10.1.3.1 key blah
 ipv4 etr
 ipv6 itr map-resolver 10.1.3.1
 ipv6 itr
 ipv6 etr map-server 10.1.3.1 key blah
 ipv6 etr

Now for some verification commands on R2:

R2#sh ip lisp
  Instance ID:                      0
  Router-lisp ID:                   0
  Locator table:                    default
  EID table:                        default
  Ingress Tunnel Router (ITR):      enabled
  Egress Tunnel Router (ETR):       enabled
  Proxy-ITR Router (PITR):          disabled
  Proxy-ETR Router (PETR):          disabled
  NAT-traversal Router (NAT-RTR):   disabled
  Mobility First-Hop Router:        disabled
  Map Server (MS):                  disabled
  Map Resolver (MR):                disabled
  Delegated Database Tree (DDT):    disabled
  Map-Request source:               10.1.1.1
  ITR Map-Resolver(s):              10.1.3.1
  ETR Map-Server(s):                10.1.3.1 (00:00:50)
  xTR-ID:                           0xA7F25A1D-0x982B7E10-0xDD2D66CC-0x436D28A5
  site-ID:                          unspecified
  ITR Solicit Map Request (SMR):    accept and process
    Max SMRs per map-cache entry:   8 more specifics
    Multiple SMR suppression time:  20 secs
  ETR accept mapping data:          disabled, verify disabled
  ETR map-cache TTL:                1d00h
  Locator Status Algorithms:
    RLOC-probe algorithm:           disabled
    LSB reports:                    process
    IPv4 RLOC minimum mask length:  /0
    IPv6 RLOC minimum mask length:  /0
  Static mappings configured:       0
  Map-cache size/limit:             1/1000
  Imported route count/limit:       0/1000
  Map-cache activity check period:  60 secs
  Map-cache FIB updates:            established
  Total database mapping size:      3
    static database size/limit:     3/5000
    dynamic database size/limit:    0/1000
    route-import database size:     0
  Persistent map-cache:             interval 01:00:00
    Earliest next store:            now
    Location:                       bootflash:LISP-MapCache-IPv4-00000000-00100

Lots of output. But pay attention to the fact that both ITR and ETR have been enabled and that the ITR Map-Resolver(s) and ETR Map-Server(s) have been set to 10.1.3.1 (R5).

We also want to verify our current map-cache, which is the cache each RLOC maintains of what it already “knows” about:

R2#sh ipv6 lisp map-cache
LISP IPv6 Mapping Cache for EID-table default (IID 0), 1 entries

::/0, uptime: 00:00:01, expires: never, via static send map-request
  Negative cache entry, action: send-map-request

Basically, this output tells you that we don’t know about any specific networks from other sites just yet.
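Conceptually, the map-cache works like a longest-prefix-match table where the ::/0 entry is a catch-all that triggers a map-request. Below is a minimal Python sketch of that idea. The second cache shows what things might look like after a map-reply for R7’s loopback; the action strings are my own wording for illustration, not IOS output.

```python
import ipaddress


def build_cache(entries):
    """Parse prefix strings into network objects for lookups."""
    return {ipaddress.ip_network(prefix): action for prefix, action in entries.items()}


def lookup(dest, cache):
    """Return (matching prefix, action) using longest-prefix match, or None."""
    addr = ipaddress.ip_address(dest)
    matches = [net for net in cache if net.version == addr.version and addr in net]
    if not matches:
        return None
    best = max(matches, key=lambda net: net.prefixlen)
    return str(best), cache[best]


# The cache as shown above: only the catch-all entry exists.
empty = build_cache({"::/0": "send-map-request"})
print(lookup("2001::7", empty))

# Illustrative state after a map-reply pointing at R6's RLOC (10.1.4.1).
learned = build_cache({
    "::/0": "send-map-request",
    "2001::7/128": "encapsulate to 10.1.4.1",
})
print(lookup("2001::7", learned))
```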

R6 is very similar to R2 and R3:

R6#sh run | sec router lisp
router lisp
 locator-set SITE2
  10.1.4.1 priority 10 weight 50
  exit
 !
 database-mapping 153.16.2.1/32 locator-set SITE2
 database-mapping 172.16.20.0/24 locator-set SITE2
 database-mapping 2001::7/128 locator-set SITE2
 ipv4 itr map-resolver 10.1.3.1
 ipv4 itr
 ipv4 etr map-server 10.1.3.1 key blah
 ipv4 etr
 ipv6 itr map-resolver 10.1.3.1
 ipv6 itr
 ipv6 etr map-server 10.1.3.1 key blah
 ipv6 etr

And verification:

R6#sh ip lisp
  Instance ID:                      0
  Router-lisp ID:                   0
  Locator table:                    default
  EID table:                        default
  Ingress Tunnel Router (ITR):      enabled
  Egress Tunnel Router (ETR):       enabled
  Proxy-ITR Router (PITR):          disabled
  Proxy-ETR Router (PETR):          disabled
  NAT-traversal Router (NAT-RTR):   disabled
  Mobility First-Hop Router:        disabled
  Map Server (MS):                  disabled
  Map Resolver (MR):                disabled
  Delegated Database Tree (DDT):    disabled
  Map-Request source:               10.1.4.1
  ITR Map-Resolver(s):              10.1.3.1
  ETR Map-Server(s):                10.1.3.1 (00:00:38)
  xTR-ID:                           0xFABA5140-0x6AA2BA6A-0x5F347223-0xF7E0CED0
  site-ID:                          unspecified
  ITR Solicit Map Request (SMR):    accept and process
    Max SMRs per map-cache entry:   8 more specifics
    Multiple SMR suppression time:  20 secs
  ETR accept mapping data:          disabled, verify disabled
  ETR map-cache TTL:                1d00h
  Locator Status Algorithms:
    RLOC-probe algorithm:           disabled
    LSB reports:                    process
    IPv4 RLOC minimum mask length:  /0
    IPv6 RLOC minimum mask length:  /0
  Static mappings configured:       0
  Map-cache size/limit:             1/1000
  Imported route count/limit:       0/1000
  Map-cache activity check period:  60 secs
  Map-cache FIB updates:            established
  Total database mapping size:      2
    static database size/limit:     2/5000
    dynamic database size/limit:    0/1000
    route-import database size:     0
  Persistent map-cache:             interval 01:00:00
    Earliest next store:            now
    Location:                       bootflash:LISP-MapCache-IPv4-00000000-00100

Along with the mapping-cache:

R6#sh ipv6 lisp map-cache
LISP IPv6 Mapping Cache for EID-table default (IID 0), 1 entries

::/0, uptime: 00:00:04, expires: never, via static send map-request
  Negative cache entry, action: send-map-request

If we now try a ping from R1’s loopback0 to R7’s loopback0 we see the following:

R1#ping 2001::7 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001::7, timeout is 2 seconds:
Packet sent with a source address of 2001::1
..!!!
Success rate is 60 percent (3/5), round-trip min/avg/max = 1/1/1 ms

What this tells us is that we have connectivity. Beyond that, the two lost ICMP echoes at the start show that the location data was still being retrieved when they were sent. Let’s now check the mapping-cache on R2:

R2#sh ipv6 lisp map-cache
LISP IPv6 Mapping Cache for EID-table default (IID 0), 2 entries

::/0, uptime: 00:01:08, expires: never, via static send map-request
  Negative cache entry, action: send-map-request
2001::7/128, uptime: 00:00:07, expires: 23:59:52, via map-reply, complete
  Locator   Uptime    State      Pri/Wgt
  10.1.4.1  00:00:07  up          10/50

Here we see that 2001::7/128 is currently in the cache and in order to get there we need to tunnel our traffic to the RLOC at 10.1.4.1 (R6).
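To make the lookup behaviour a bit more concrete, here is a minimal Python sketch of how such a cache could be consulted. It is only an illustration and not how IOS implements it: the dictionary layout, the `forward` action name and the `lookup` helper are my own assumptions, while the prefixes and the locator are taken from the output above.

```python
import ipaddress

# Hypothetical in-memory model of R2's IPv6 map-cache after the first ping.
# The prefixes and the locator come from the output above; the dictionary
# layout and the "forward" action name are assumptions for illustration.
MAP_CACHE = {
    ipaddress.ip_network("::/0"): {"action": "send-map-request", "locators": []},
    ipaddress.ip_network("2001::7/128"): {"action": "forward", "locators": ["10.1.4.1"]},
}


def lookup(cache, destination):
    """Return (prefix, entry) for the longest prefix that covers destination."""
    addr = ipaddress.ip_address(destination)
    matches = [net for net in cache if addr in net]
    if not matches:
        return None, None
    best = max(matches, key=lambda net: net.prefixlen)
    return best, cache[best]


if __name__ == "__main__":
    for dst in ("2001::7", "2001::99"):
        prefix, entry = lookup(MAP_CACHE, dst)
        print(dst, "->", prefix, entry["action"], entry["locators"])
```

A destination covered only by ::/0 hits the negative entry and would trigger a map-request, while 2001::7 matches the more specific /128 and is tunneled to 10.1.4.1.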

On the remote side we see something similar:

R6#sh ipv6 lisp map-cache
LISP IPv6 Mapping Cache for EID-table default (IID 0), 2 entries

::/0, uptime: 00:01:53, expires: never, via static send map-request
  Negative cache entry, action: send-map-request
2001::1/128, uptime: 00:01:34, expires: 23:58:25, via map-reply, complete
  Locator   Uptime    State      Pri/Wgt
  10.1.1.1  00:01:34  up          10/50
  10.1.2.1  00:01:34  up          10/50

This is the mapping that tells R6 that it can send traffic to either RLOC (they are both in the up state and share the same priority and weight).
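As a rough illustration of what priority and weight mean here, the following Python sketch picks an RLOC from the entry above. It assumes the usual LISP semantics (the lowest priority value wins, and weight splits traffic among locators of equal priority). Real routers hash flows rather than rolling dice per packet, so the random choice and the helper names are simplifications of my own.

```python
import random

# Locator set copied from R6's map-cache entry for 2001::1/128 above.
LOCATORS = [
    {"rloc": "10.1.1.1", "state": "up", "priority": 10, "weight": 50},
    {"rloc": "10.1.2.1", "state": "up", "priority": 10, "weight": 50},
]


def usable_locators(locators):
    """Keep only 'up' locators that share the best (lowest) priority value."""
    up = [loc for loc in locators if loc["state"] == "up"]
    if not up:
        return []
    best = min(loc["priority"] for loc in up)
    return [loc for loc in up if loc["priority"] == best]


def pick_locator(locators, rng=random):
    """Pick one usable RLOC, proportionally to its weight (simplified)."""
    candidates = usable_locators(locators)
    if not candidates:
        return None
    weights = [loc["weight"] for loc in candidates]
    return rng.choices(candidates, weights=weights, k=1)[0]["rloc"]


if __name__ == "__main__":
    picks = [pick_locator(LOCATORS) for _ in range(1000)]
    print({rloc: picks.count(rloc) for rloc in sorted(set(picks))})
```

With two locators at weight 50 each, the picks should split roughly evenly between 10.1.1.1 and 10.1.2.1.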

If we try a ping from R1 again:

R1#ping 2001::7 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001::7, timeout is 2 seconds:
Packet sent with a source address of 2001::1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/9/25 ms

We get full connectivity because the cache has already been populated.

Finally, let’s see what the packet capture looks like on R4:

R4#show monitor capture LISP buffer brief
 -------------------------------------------------------------
 #   size   timestamp     source             destination   protocol
 -------------------------------------------------------------
   0  154    0.000000   10.1.1.1         ->  10.1.4.1         UDP
   1  154    0.000000   10.1.4.1         ->  10.1.1.1         UDP
   2  154    0.001007   10.1.1.1         ->  10.1.4.1         UDP
   3  154    0.001007   10.1.4.1         ->  10.1.1.1         UDP
   4  154    0.001007   10.1.1.1         ->  10.1.4.1         UDP
   5  154    0.001999   10.1.4.1         ->  10.1.1.1         UDP
   6  154    0.003006   10.1.1.1         ->  10.1.4.1         UDP
   7  154    0.016006   10.1.4.1         ->  10.1.1.1         UDP
   8  154    0.025008   10.1.1.1         ->  10.1.4.1         UDP
   9  154    0.038008   10.1.4.1         ->  10.1.1.1         UDP
  10  162    2.282035   10.1.4.1         ->  10.1.3.1         UDP
  11  162    2.282035   10.1.3.1         ->  10.1.4.1         UDP

(I used an EPC (Embedded Packet Capture) on R4 to get the data).

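We clearly see that UDP traffic is flowing between R2 (10.1.1.1) and R6 (10.1.4.1): the pings between the sites are carried inside LISP’s UDP encapsulation. The last two packets are exchanged between R6 and 10.1.3.1 (R5).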
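If you want to double check the capture without counting lines by hand, a small Python sketch like the one below can tally the packets per direction. The `count_flows` helper is my own and simply splits the brief output shown above on whitespace; it is not part of any Cisco tooling.

```python
from collections import Counter

# The data lines of the brief capture output shown above.
CAPTURE = """\
   0  154    0.000000   10.1.1.1         ->  10.1.4.1         UDP
   1  154    0.000000   10.1.4.1         ->  10.1.1.1         UDP
   2  154    0.001007   10.1.1.1         ->  10.1.4.1         UDP
   3  154    0.001007   10.1.4.1         ->  10.1.1.1         UDP
   4  154    0.001007   10.1.1.1         ->  10.1.4.1         UDP
   5  154    0.001999   10.1.4.1         ->  10.1.1.1         UDP
   6  154    0.003006   10.1.1.1         ->  10.1.4.1         UDP
   7  154    0.016006   10.1.4.1         ->  10.1.1.1         UDP
   8  154    0.025008   10.1.1.1         ->  10.1.4.1         UDP
   9  154    0.038008   10.1.4.1         ->  10.1.1.1         UDP
  10  162    2.282035   10.1.4.1         ->  10.1.3.1         UDP
  11  162    2.282035   10.1.3.1         ->  10.1.4.1         UDP
"""


def count_flows(text):
    """Count packets per (source, destination) pair in the brief output."""
    flows = Counter()
    for line in text.splitlines():
        fields = line.split()
        # Expected fields: index, size, timestamp, source, '->', destination, protocol
        if len(fields) == 7 and fields[4] == "->":
            flows[(fields[3], fields[5])] += 1
    return flows


if __name__ == "__main__":
    for (src, dst), packets in sorted(count_flows(CAPTURE).items()):
        print(f"{src} -> {dst}: {packets}")
```

The tally gives five packets in each direction between 10.1.1.1 and 10.1.4.1, which lines up with the five echoes and five replies of the second ping, plus one packet each way between 10.1.4.1 and 10.1.3.1.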

So this tunneling characteristic is one way we can utilize LISP, but there are many other use cases as I mentioned before.

I hope this has been useful to you.

Until next time, take care!

CRS/ASR Switching fabrics

At the moment I’m going through whitepapers, Cisco Live 365 presentations and IOS XR fundamentals learning about switching fabrics.

It’s a steep learning curve, but in its own way it’s quite fascinating.

There are a lot of acronyms to be mastered, so later on I will post a list that might serve myself and others when looking at these sorts of architectures.

What’s next…

I have a lot of non-technical projects in the pipeline, but study-wise, what’s next up for me is the IOS XR specialist exam.

I think the blueprint for it looks interesting and it provides a way for me to learn more about IOS XR.

I don’t really have a date for the exam just yet, as I’m taking it easy and trying to lab out as much as I can to have it stick.

I will be posting about anything I find interesting or different from Classic IOS. Right now I’m trying to figure out the details of LPTS (Local Packet Transport Services) as implemented on XR platforms, which is a way of protecting the management/control plane of the router.

Take care!

Short update

It’s been a long time since my last update. I apologise for this. It wasn’t my intention, it just sort of happened.

In the meantime I have tried the CCIE SP lab and didn’t pass it, so I am still studying for my next attempt, which is coming up shortly.
Until then I have booked quite a number of rack hours. Hopefully I will learn from some of the mistakes I have identified.

Just last week Cisco announced the IOS XRv image that allows you to run a virtual instance of IOS XR. This is great news for the community at large, as it provides the ability to learn about XR without having to spend a lot of money on rack rentals or even buying platforms that run XR.

Unfortunately, there is a bug in the download system, which Cisco is trying to correct. It blocks the download for people with active partner status, which includes me. At the time of writing, we have to wait 3 days until it gets sorted out.

I suggest you take a look at FryGuy’s blog about the release of IOS XRv.
The link can be found here:
http://www.fryguy.net/2014/02/08/cisco-ios-xrv-v-as-in-virtual/

Take care!

ISIS csnp-interval

The CSNP on multiaccess networks

The CSNP (Complete Sequence Number PDU) on multi-access networks is sent out by the DIS (Designated Intermediate System), which acts as the pseudonode representing the multi-access network. It is ISIS’s way of making sure everybody on the multi-access network is up to date. If that’s not the case, the node that is missing some routing information can use PSNPs (Partial Sequence Number PDUs) to request the missing information from the DIS.
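The comparison a node makes when it receives a CSNP can be sketched roughly like this in Python. This is a simplified model of my own: it only compares sequence numbers and ignores checksums and remaining lifetime, and the LSP IDs and sequence numbers are made-up illustration values.

```python
def compare_with_csnp(local_lsdb, csnp_entries):
    """Compare a received CSNP summary with the local LSDB.

    Both arguments map an LSP ID to a sequence number.
    Returns (to_request, to_flood):
      to_request: LSPs we are missing or hold an older copy of (ask via PSNP)
      to_flood:   LSPs where our copy is newer or unknown to the sender
    """
    to_request = []
    to_flood = []
    for lsp_id, seq in csnp_entries.items():
        local_seq = local_lsdb.get(lsp_id)
        if local_seq is None or local_seq < seq:
            to_request.append(lsp_id)
        elif local_seq > seq:
            to_flood.append(lsp_id)
    for lsp_id in local_lsdb:
        if lsp_id not in csnp_entries:
            to_flood.append(lsp_id)
    return sorted(to_request), sorted(to_flood)


if __name__ == "__main__":
    # Illustration values only: we hold an old copy of one LSP and miss another.
    lsdb = {"0000.0000.0001.00-00": 6, "0000.0000.0002.00-00": 3}
    csnp = {
        "0000.0000.0001.00-00": 6,
        "0000.0000.0002.00-00": 4,
        "0000.0000.0003.00-00": 3,
    }
    print(compare_with_csnp(lsdb, csnp))
```

When every entry matches, both lists come back empty and nothing needs to be requested or flooded.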

The csnp-interval is simply the timer that controls how often the DIS sends out this CSNP. The default on IOS (and XR) is every 10 seconds.

It’s important to know that a separate timer is kept for the level-1 and the level-2 DIS.

For this example I will be using the topology shown in figure 1:

Topology

Take note of the fact that I have manually set the MAC address on the routers to make it more obvious which router is the DIS from the point of view of debugs.

Since everybody has the same priority (default 64), the highest SNPA (SubNetwork Point of Attachment), which translates to the MAC address, will be used as the tiebreaker. Highest one wins. In our case this will be R3.

The output below highlights this fact:

R1#sh isis nei

System Id      Type Interface   IP Address      State Holdtime Circuit Id
R2             L2   Fa1/0       100.100.100.2   UP    28       R3.01
R3             L2   Fa1/0       100.100.100.3   UP    7        R3.01
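The election rule described above (highest priority first, then highest SNPA) can be sketched in a few lines of Python. The MAC addresses below are hypothetical stand-ins, since the values I actually configured are only visible in the topology figure; the `elect_dis` helper is likewise just for illustration.

```python
# Hypothetical MAC addresses (SNPAs) chosen so that R3 has the highest one.
ROUTERS = [
    {"name": "R1", "priority": 64, "snpa": "0000.0000.0001"},
    {"name": "R2", "priority": 64, "snpa": "0000.0000.0002"},
    {"name": "R3", "priority": 64, "snpa": "0000.0000.0003"},
]


def snpa_value(snpa):
    """Turn a dotted MAC address such as 0000.0000.0003 into an integer."""
    return int(snpa.replace(".", ""), 16)


def elect_dis(routers):
    """Highest priority wins; the highest SNPA breaks a tie."""
    winner = max(routers, key=lambda r: (r["priority"], snpa_value(r["snpa"])))
    return winner["name"]


if __name__ == "__main__":
    print(elect_dis(ROUTERS))
```

With equal priorities the SNPA decides, so R3 wins here. Raising the priority on another router would make that router the DIS regardless of its MAC address.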

Now, to verify the CSNP timer, let’s look at what our debugs are telling us:

R1#
*Jun 24 16:58:10.267: ISIS-Snp: Rec L2 CSNP from 0000.0000.0003 (FastEthernet1/0)
*Jun 24 16:58:10.267: ISIS-SNP: CSNP range 0000.0000.0000.00-00 to FFFF.FFFF.FFFF.FF-FF
*Jun 24 16:58:10.267: ISIS-SNP: Same entry 0000.0000.0001.00-00, seq 6
*Jun 24 16:58:10.267: ISIS-SNP: Same entry 0000.0000.0002.00-00, seq 4
*Jun 24 16:58:10.267: ISIS-SNP: Same entry 0000.0000.0003.00-00, seq 3
*Jun 24 16:58:10.267: ISIS-SNP: Same entry 0000.0000.0003.01-00, seq 2
R1#
*Jun 24 16:58:19.467: ISIS-Snp: Rec L2 CSNP from 0000.0000.0003 (FastEthernet1/0)
*Jun 24 16:58:19.471: ISIS-SNP: CSNP range 0000.0000.0000.00-00 to FFFF.FFFF.FFFF.FF-FF
*Jun 24 16:58:19.471: ISIS-SNP: Same entry 0000.0000.0001.00-00, seq 6
*Jun 24 16:58:19.471: ISIS-SNP: Same entry 0000.0000.0002.00-00, seq 4
*Jun 24 16:58:19.471: ISIS-SNP: Same entry 0000.0000.0003.00-00, seq 3
*Jun 24 16:58:19.475: ISIS-SNP: Same entry 0000.0000.0003.01-00, seq 2
R1#
*Jun 24 16:58:27.639: ISIS-Snp: Rec L2 CSNP from 0000.0000.0003 (FastEthernet1/0)
*Jun 24 16:58:27.643: ISIS-SNP: CSNP range 0000.0000.0000.00-00 to FFFF.FFFF.FFFF.FF-FF
*Jun 24 16:58:27.643: ISIS-SNP: Same entry 0000.0000.0001.00-00, seq 6
*Jun 24 16:58:27.643: ISIS-SNP: Same entry 0000.0000.0002.00-00, seq 4
*Jun 24 16:58:27.643: ISIS-SNP: Same entry 0000.0000.0003.00-00, seq 3
*Jun 24 16:58:27.643: ISIS-SNP: Same entry 0000.0000.0003.01-00, seq 2

Roughly every 10 seconds (a little less in practice, which is consistent with the jitter IS-IS applies to its timers), R1 receives a level-2 CSNP from R3 (0000.0000.0003). So at least the theory is spot on. Now let’s modify the timer to see if it kicks in:

R3(config)#
R3(config)#int f1/0
R3(config-if)#isis csnp-interval 20

So now on R1, we should see the CSNP arrive every 20 seconds instead:

R1#
*Jun 24 16:59:32.883: ISIS-Snp: Rec L2 CSNP from 0000.0000.0003 (FastEthernet1/0)
*Jun 24 16:59:32.883: ISIS-SNP: CSNP range 0000.0000.0000.00-00 to FFFF.FFFF.FFFF.FF-FF
*Jun 24 16:59:32.883: ISIS-SNP: Same entry 0000.0000.0001.00-00, seq 6
*Jun 24 16:59:32.883: ISIS-SNP: Same entry 0000.0000.0002.00-00, seq 4
*Jun 24 16:59:32.887: ISIS-SNP: Same entry 0000.0000.0003.00-00, seq 3
*Jun 24 16:59:32.887: ISIS-SNP: Same entry 0000.0000.0003.01-00, seq 2
R1#
*Jun 24 16:59:49.679: ISIS-Snp: Rec L2 CSNP from 0000.0000.0003 (FastEthernet1/0)
*Jun 24 16:59:49.679: ISIS-SNP: CSNP range 0000.0000.0000.00-00 to FFFF.FFFF.FFFF.FF-FF
*Jun 24 16:59:49.679: ISIS-SNP: Same entry 0000.0000.0001.00-00, seq 6
*Jun 24 16:59:49.679: ISIS-SNP: Same entry 0000.0000.0002.00-00, seq 4
*Jun 24 16:59:49.679: ISIS-SNP: Same entry 0000.0000.0003.00-00, seq 3
*Jun 24 16:59:49.679: ISIS-SNP: Same entry 0000.0000.0003.01-00, seq 2
R1#
*Jun 24 17:00:07.479: ISIS-Snp: Rec L2 CSNP from 0000.0000.0003 (FastEthernet1/0)
*Jun 24 17:00:07.483: ISIS-SNP: CSNP range 0000.0000.0000.00-00 to FFFF.FFFF.FFFF.FF-FF
*Jun 24 17:00:07.483: ISIS-SNP: Same entry 0000.0000.0001.00-00, seq 6
*Jun 24 17:00:07.483: ISIS-SNP: Same entry 0000.0000.0002.00-00, seq 4
*Jun 24 17:00:07.483: ISIS-SNP: Same entry 0000.0000.0003.00-00, seq 3
*Jun 24 17:00:07.487: ISIS-SNP: Same entry 0000.0000.0003.01-00, seq 2

And lo and behold, it’s working!
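To put numbers on it, here is a small Python sketch that computes the gaps between the CSNP arrivals seen in the two debug outputs above. The timestamps are copied from the "Rec L2 CSNP" lines; the `intervals` helper is my own.

```python
from datetime import datetime

# Arrival times of the "Rec L2 CSNP" lines before and after the change on R3.
CSNP_BEFORE = ["16:58:10.267", "16:58:19.467", "16:58:27.639"]
CSNP_AFTER = ["16:59:32.883", "16:59:49.679", "17:00:07.479"]


def intervals(stamps):
    """Return the gaps in seconds between consecutive HH:MM:SS.mmm stamps."""
    times = [datetime.strptime(stamp, "%H:%M:%S.%f") for stamp in stamps]
    return [round((b - a).total_seconds(), 3) for a, b in zip(times, times[1:])]


if __name__ == "__main__":
    print("default interval:", intervals(CSNP_BEFORE))  # [9.2, 8.172]
    print("csnp-interval 20:", intervals(CSNP_AFTER))   # [16.796, 17.8]
```

The gaps land a bit under the configured values in both cases, but they clearly double once the interval is changed from 10 to 20 seconds.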

Conclusion

With this command you can modify how often a DIS sends out its CSNPs (Complete Sequence Number PDUs). Unless you have a specific requirement to change this timer, its default of 10 seconds should be able to scale to very large multi-access networks.

I hope the explanation of this timer has been useful to you.