
Unified/Seamless MPLS

In this post I would like to highlight a relatively new (to me) application of MPLS called Unified MPLS.
The goal of Unified MPLS is to separate your network into individual IGP domains in order to keep your core network as simple as possible, while still maintaining an end-to-end LSP for regular MPLS applications such as L3 VPNs.

What we are doing is simply putting route reflectors into the forwarding path and changing the next hops along the way, essentially stitching together the final LSP.
Along with that, we are using BGP to signal a label value (BGP labeled unicast, RFC 3107) to maintain the LSP from one end of the network to the other, without running LDP between the IGP domains.

Take a look at the topology that we will be using to demonstrate this feature:

Unified-MPLS-Topology

In this topology we have a simplified layout of a service provider. We have a core network consisting of R3, R4 and R5, along with distribution networks to the left and right of the core. R2 and R3 are in the left distribution network, and R5 and R6 are in the right-hand one.

We have an MPLS L3VPN customer connected, with R1 in one site and R7 in another.

As is visible in the topology, we are running three separate IGPs to make a point about this feature: EIGRP AS 1, OSPF 100 and EIGRP AS 2. However, we are only running one autonomous system as seen from BGP, so it's a pure iBGP network.

Now, in order to make the L3VPN work, we need an end-to-end LSP going from R2 all the way to R6.
What is key here is that in order to have end-to-end reachability, we have contained IGP areas, each of which is running LDP for labels. Between the areas, however, all we are doing is leaking a couple of loopback addresses from the core into the distribution sections. These are used exclusively for the iBGP sessions.

On top of that, we need R3 and R5 to be route reflectors, to be in the data path, and to allocate labels. This is done through the “send-label” command, along with modifying the next hop (the “next-hop-self all” command).

This is illustrated in the following:

Unified-MPLS-iBGP-Topology

Enough theory, let's take a look at the configuration necessary to pull this off. Let's start out with R2's IGP and LDP configuration:

R2#sh run | sec router eigrp
router eigrp 1
 network 2.0.0.0
 network 10.0.0.0
 passive-interface default
 no passive-interface GigabitEthernet3

R2#sh run int g3
interface GigabitEthernet3
 ip address 10.2.3.2 255.255.255.0
 negotiation auto
 mpls ip
end

Pretty vanilla configuration of IGP + LDP.

The same for R3:

R3#sh run | sec router eigrp 1
router eigrp 1
 network 10.0.0.0
 redistribute ospf 100 metric 1 1 1 1 1 route-map REDIST-LOOPBACK-MAP
 passive-interface default
 no passive-interface GigabitEthernet2

R3#sh run int g2
interface GigabitEthernet2
 ip address 10.2.3.3 255.255.255.0
 negotiation auto
 mpls ip
end

R3#sh route-map REDIST-LOOPBACK-MAP
route-map REDIST-LOOPBACK-MAP, permit, sequence 10
  Match clauses:
    ip address prefix-lists: REDIST-LOOPBACK-PREFIX-LIST
  Set clauses:
  Policy routing matches: 0 packets, 0 bytes

R3#sh ip prefix-list
ip prefix-list REDIST-LOOPBACK-PREFIX-LIST: 1 entries
   seq 5 permit 3.3.3.3/32

Apart from the redistribution, this simply establishes an EIGRP adjacency with R2. On top of that, we are redistributing R3's Loopback0 prefix, which is in the core area, into EIGRP. Again, this step is necessary for the iBGP session establishment.

An almost identical setup is present in the other distribution site, consisting of R5 and R6. Again we redistribute R5’s loopback0 address into the IGP (EIGRP AS 2), so we can have iBGP connectivity, which is our next step.
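For reference, a sketch of how that could look on R5, mirroring R3's configuration (the interface name toward R6 is assumed, as it is not shown in this post):

R5:
---
router eigrp 2
 network 10.0.0.0
 redistribute ospf 100 metric 1 1 1 1 1 route-map REDIST-LOOPBACK-MAP
 passive-interface default
 no passive-interface GigabitEthernet2
!
ip prefix-list REDIST-LOOPBACK-PREFIX-LIST seq 5 permit 5.5.5.5/32
!
route-map REDIST-LOOPBACK-MAP permit 10
 match ip address prefix-list REDIST-LOOPBACK-PREFIX-LIST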

So let's take a look at the BGP configuration from R2 all the way to R6. I'm leaving out the VPNv4 configuration for now, to make it clearer what we are trying to accomplish first:

R2:
---
router bgp 1000
 bgp router-id 2.2.2.2
 bgp log-neighbor-changes
 neighbor 3.3.3.3 remote-as 1000
 neighbor 3.3.3.3 update-source Loopback0
 !
 address-family ipv4
  network 2.2.2.2 mask 255.255.255.255
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 send-label

R3:
---
router bgp 1000
 bgp router-id 3.3.3.3
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 1000
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 2.2.2.2 route-reflector-client
 neighbor 2.2.2.2 next-hop-self all
 neighbor 2.2.2.2 send-label
 neighbor 5.5.5.5 remote-as 1000
 neighbor 5.5.5.5 update-source Loopback0
 neighbor 5.5.5.5 route-reflector-client
 neighbor 5.5.5.5 next-hop-self all
 neighbor 5.5.5.5 send-label

R5:
---
router bgp 1000
 bgp router-id 5.5.5.5
 bgp log-neighbor-changes
 neighbor 3.3.3.3 remote-as 1000
 neighbor 3.3.3.3 update-source Loopback0
 neighbor 3.3.3.3 route-reflector-client
 neighbor 3.3.3.3 next-hop-self all
 neighbor 3.3.3.3 send-label
 neighbor 6.6.6.6 remote-as 1000
 neighbor 6.6.6.6 update-source Loopback0
 neighbor 6.6.6.6 route-reflector-client
 neighbor 6.6.6.6 next-hop-self all
 neighbor 6.6.6.6 send-label

R6:
---
router bgp 1000
 bgp router-id 6.6.6.6
 bgp log-neighbor-changes
 neighbor 5.5.5.5 remote-as 1000
 neighbor 5.5.5.5 update-source Loopback0
 !
 address-family ipv4
  network 6.6.6.6 mask 255.255.255.255
  neighbor 5.5.5.5 activate
  neighbor 5.5.5.5 send-label

As visible from the configuration, we have two IPv4 route reflectors (R3 and R5), both of which put themselves into the data path by using the next-hop-self command. On top of that, we are allocating labels for all prefixes via BGP as well. Let's verify this on the routers:

R2#sh bgp ipv4 uni la
   Network          Next Hop      In label/Out label
   2.2.2.2/32       0.0.0.0         imp-null/nolabel
   6.6.6.6/32       3.3.3.3         nolabel/305

R3#sh bgp ipv4 uni la
   Network          Next Hop      In label/Out label
   2.2.2.2/32       2.2.2.2         300/imp-null
   6.6.6.6/32       5.5.5.5         305/500

R5#sh bgp ipv4 uni la
   Network          Next Hop      In label/Out label
   2.2.2.2/32       3.3.3.3         505/300
   6.6.6.6/32       6.6.6.6         500/imp-null

R6#sh bgp ipv4 uni la
   Network          Next Hop      In label/Out label
   2.2.2.2/32       5.5.5.5         nolabel/505
   6.6.6.6/32       0.0.0.0         imp-null/nolabel

Since we are only injecting two prefixes (the loopbacks of R2 and R6) into BGP, those are the only prefixes we have allocated labels for.

Doing a traceroute from R2 to R6 (between loopbacks) will reveal whether we truly have an LSP between them:

R2#traceroute 6.6.6.6 so loo0
Type escape sequence to abort.
Tracing the route to 6.6.6.6
VRF info: (vrf in name/id, vrf out name/id)
  1 10.2.3.3 [MPLS: Label 305 Exp 0] 26 msec 15 msec 18 msec
  2 10.3.4.4 [MPLS: Labels 401/500 Exp 0] 10 msec 24 msec 34 msec
  3 10.4.5.5 [MPLS: Label 500 Exp 0] 7 msec 23 msec 24 msec
  4 10.5.6.6 20 msec *  16 msec

This looks exactly like we wanted it to. (Note that the 401 label comes from a pure P router in the core.)
This also means we can set up our VPNv4 configuration on R2 and R6:

R2#sh run | sec router bgp
router bgp 1000
 bgp router-id 2.2.2.2
 bgp log-neighbor-changes
 neighbor 3.3.3.3 remote-as 1000
 neighbor 3.3.3.3 update-source Loopback0
 neighbor 6.6.6.6 remote-as 1000
 neighbor 6.6.6.6 update-source Loopback0
 !
 address-family ipv4
  network 2.2.2.2 mask 255.255.255.255
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 send-label
  no neighbor 6.6.6.6 activate
 exit-address-family
 !
 address-family vpnv4
  neighbor 6.6.6.6 activate
  neighbor 6.6.6.6 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf CUSTOMER-A
  redistribute connected
  redistribute static
 exit-address-family
R2#

R6#sh run | sec router bgp
router bgp 1000
 bgp router-id 6.6.6.6
 bgp log-neighbor-changes
 neighbor 2.2.2.2 remote-as 1000
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 5.5.5.5 remote-as 1000
 neighbor 5.5.5.5 update-source Loopback0
 !
 address-family ipv4
  network 6.6.6.6 mask 255.255.255.255
  no neighbor 2.2.2.2 activate
  neighbor 5.5.5.5 activate
  neighbor 5.5.5.5 send-label
 exit-address-family
 !
 address-family vpnv4
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf CUSTOMER-A
  redistribute connected
  redistribute static
 exit-address-family
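Not shown above is the VRF definition itself, which must exist on both PE routers before the “address-family ipv4 vrf CUSTOMER-A” section can be configured. A minimal sketch (the RD and route-target values are assumed, as they are not shown in this post):

vrf definition CUSTOMER-A
 rd 1000:1
 !
 address-family ipv4
  route-target export 1000:1
  route-target import 1000:1
 exit-address-family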

Let's verify that the iBGP VPNv4 peering is up and running:

R2#sh bgp vpnv4 uni all sum
..
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
6.6.6.6         4         1000      16      16       11    0    0 00:09:31        2

R6#sh bgp vpnv4 uni all sum
..
Neighbor        V           AS MsgRcvd MsgSent   TblVer  InQ OutQ Up/Down  State/PfxRcd
2.2.2.2         4         1000      17      17       11    0    0 00:10:26        2

We do have the prefixes, and we should also have reachability from R1 to R7 (by way of their individual static default routes):

R1#ping 7.7.7.7 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 7.7.7.7, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 17/27/54 ms
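For reference, the CE routers only need a static default route toward their PE. Judging by the link addressing visible in the traceroutes, that would look something like this (a sketch; the exact next-hop addresses are assumed):

R1:
ip route 0.0.0.0 0.0.0.0 10.1.2.2

R7:
ip route 0.0.0.0 0.0.0.0 10.6.7.6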

Looks good; let's check the label path:

R1#traceroute 7.7.7.7 so loo0
Type escape sequence to abort.
Tracing the route to 7.7.7.7
VRF info: (vrf in name/id, vrf out name/id)
  1 10.1.2.2 19 msec 13 msec 12 msec
  2 10.2.3.3 [MPLS: Labels 305/600 Exp 0] 18 msec 19 msec 15 msec
  3 10.3.4.4 [MPLS: Labels 401/500/600 Exp 0] 12 msec 32 msec 34 msec
  4 10.4.5.5 [MPLS: Labels 500/600 Exp 0] 20 msec 27 msec 27 msec
  5 10.6.7.6 [MPLS: Label 600 Exp 0] 23 msec 15 msec 13 msec
  6 10.6.7.7 25 msec *  16 msec

What we are seeing here is basically the same path, but with the VPN label (600) added at the bottom of the label stack.

So what have we really accomplished here? Well, let's take a look at the RIB on R2 and look for the IGP (EIGRP AS 1) routes:

R2#sh ip route eigrp
..
      3.0.0.0/32 is subnetted, 1 subnets
D EX     3.3.3.3 [170/2560000512] via 10.2.3.3, 00:16:02, GigabitEthernet3
      10.0.0.0/8 is variably subnetted, 3 subnets, 2 masks
D        10.3.4.0/24 [90/3072] via 10.2.3.3, 00:16:02, GigabitEthernet3

A very small table indeed. And if we include what's being learned by BGP:

R2#sh ip route bgp
..
      6.0.0.0/32 is subnetted, 1 subnets
B        6.6.6.6 [200/0] via 3.3.3.3, 00:17:02

R2#sh ip route 6.6.6.6
Routing entry for 6.6.6.6/32
  Known via "bgp 1000", distance 200, metric 0, type internal
  Last update from 3.3.3.3 00:17:43 ago
  Routing Descriptor Blocks:
  * 3.3.3.3, from 3.3.3.3, 00:17:43 ago
      Route metric is 0, traffic share count is 1
      AS Hops 0
      MPLS label: 305

Only one prefix is needed to communicate with the remote distribution site's PE router (which we need the label for).

This means you can scale your distribution sites to very large sizes, keep your core as efficient as possible, and eliminate the need for areas and similar mechanisms in your IGPs.

I hope this quick walkthrough of Unified/Seamless MPLS has been useful.

EIGRP OTP example

In this post I'd like to provide an example of a fairly new development in EIGRP called EIGRP Over the Top (OTP).

In all its simplicity, it establishes an EIGRP multihop adjacency, using LISP as the encapsulation method for transport through the WAN network.

One application would be to avoid relying on the SP's routing in an MPLS L3 VPN. You could simply use the L3 VPN for transport between the interfaces directly connected to the service provider and run your own adjacency directly between your CPE routers (without the use of a GRE tunnel, which would be another way of doing it).

The topology used for this example consists of 4 routers, all of which are running OSPF to provide connectivity (as an exercise, you could rebuild this example with an MPLS L3 VPN in between). I'm simply taking the lazy path and doing it this way 🙂

EIGRP-OTP-Topology


R1 and R4 are running EIGRP in a named process, “test”. This process is in autonomous system 100, and the Loopback0 interfaces are advertised into the IPv4 address family.

Let's verify that we have connectivity between R1's g1.102 interface and R4's g1.304 interface:

R1#ping 172.3.4.4 so g1.102
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 172.3.4.4, timeout is 2 seconds:
Packet sent with a source address of 172.1.2.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/5/19 ms

All looks good.

Now let's take a look at the configuration that ties R1 and R4 together with an EIGRP adjacency:

on R1:

R1#sh run | sec router eigrp
router eigrp test
 !
 address-family ipv4 unicast autonomous-system 100
  !
  topology base
  exit-af-topology
  neighbor 172.3.4.4 GigabitEthernet1.102 remote 10 lisp-encap 1
  network 1.0.0.0
  network 172.1.0.0
 exit-address-family

What's important here are the neighbor statement and the “network 172.1.0.0” statement.

With the neighbor statement, we specify that we have a remote neighbor (172.3.4.4) reachable through g1.102, that the maximum number of hops to reach it is 10, and that LISP encapsulation with an instance ID of 1 should be used.

It's also important to cover the outgoing interface with a network statement in the EIGRP process. I've found that without doing this, the adjacency won't come up; it's not enough to specify the interface in the neighbor command.

Let's verify which interfaces we are running EIGRP on at R1:

R1#sh ip eigrp interfaces
EIGRP-IPv4 VR(test) Address-Family Interfaces for AS(100)
                              Xmit Queue   PeerQ        Mean   Pacing Time   Multicast    Pending
Interface              Peers  Un/Reliable  Un/Reliable  SRTT   Un/Reliable   Flow Timer   Routes
Lo0                      0        0/0       0/0           0       0/0            0           0
Gi1.102                  1        0/0       0/0           1       0/0           50           0

On the reverse path, on R4:

R4#sh run | sec router eigrp
router eigrp test
 !
 address-family ipv4 unicast autonomous-system 100
  !
  topology base
  exit-af-topology
  neighbor 172.1.2.1 GigabitEthernet1.304 remote 10 lisp-encap 1
  network 4.0.0.0
  network 172.3.0.0
 exit-address-family

Same deal, just in the opposite direction.

That's about it. Let's take a look at whether we have the desired adjacency up and running:

R1#sh ip ei nei
EIGRP-IPv4 VR(test) Address-Family Neighbors for AS(100)
H   Address                 Interface              Hold Uptime   SRTT   RTO  Q  Seq
                                                   (sec)         (ms)       Cnt Num
0   172.3.4.4               Gi1.102                  12 01:14:16    1   100  0  3

Excellent! And the routing table:

R1#sh ip route eigrp | beg Gateway
Gateway of last resort is not set

      4.0.0.0/32 is subnetted, 1 subnets
D        4.4.4.4 [90/93994331] via 172.3.4.4, 01:14:50, LISP1

Pay attention to the fact that LISP1 is used as the outgoing interface.

And finally the data plane verification:

R1#ping 4.4.4.4 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/10/25 ms

Great! And that's about all there is to a simple EIGRP OTP scenario. (Look into EIGRP “route reflectors” if you want more information on hub-and-spoke topologies.)
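As a teaser for that: instead of statically configuring every spoke, a hub router can accept dynamically learned OTP spokes with the “remote-neighbors” command. A hedged sketch on R1, assuming the same interface and LISP instance ID as above:

router eigrp test
 !
 address-family ipv4 unicast autonomous-system 100
  remote-neighbors source GigabitEthernet1.102 unicast-listen lisp-encap 1
  network 1.0.0.0
  network 172.1.0.0
 exit-address-family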

Take care!

Trying out IPv6 Prefix Delegation

In this post I will show how and why to use a feature called IPv6 Prefix Delegation (PD).

IPv6 prefix delegation is a feature that provides the capability to delegate or hand out IPv6 prefixes to other routers without the need to hardcode these prefixes into the routers.

Why would you want to do this? Well, for one, there is the administrative overhead associated with manual configuration. If the end customer only cares about the number of prefixes he or she receives, then they might as well be handed out automatically from a preconfigured pool, just like DHCP works today on end-user systems.

On top of that, by configuring redistribution into BGP just once, you will automatically have reachability from the rest of your SP network to the prefixes that have been handed out.

So how do you go about configuring this? Let's take a look at the topology we'll be using to demonstrate IPv6 Prefix Delegation.

PD-Post-Topology

First off, we have the SP core network which consists of R1, R2 and R3. They are running in AS 64512 with R1 being a BGP route-reflector for the IPv6 unicast address-family. As an IGP we are running OSPFv3 to provide reachability within the core. No IPv4 is configured on any device.

The SP has been allocated a /32 IPv6 prefix, 2001:1111::/32, from which it will “carve” out IPv6 prefixes for both its internal network and its customer networks.

We are using /125 for the links between the core routers, just to make it simple when looking at the routing tables and the topology.

R2 is really where all the magic is taking place. R2 is a PE for two customers, Customer A and Customer B. Customer A is being reached through Gigabit2 and Customer B through Gigabit3. The customer’s respective CE routers are R4 and R7.

There is a link-net between R2 and R4 as well as R2 and R7. These are respectively 2001:1111:101::/64 and 2001:1111:102::/64.

So Lab-ISP has decided on a /48 network from which to hand out prefixes to its customers. This /48 is 2001:1111:2222::/48. Lab-ISP has also decided to hand out /56 prefixes, which gives the customers 8 bits (bits 56 to 64) to use for subnetting. This is a typical deployment.

Also, since we are using a /48 as the block to “carve” out from, we have 8 bits (bits 48 to 56) of assignable subnets, which of course equals 256 /56 prefixes we can hand out.

All of this can be a bit confusing, so let's look at it from a different perspective.

We start out with 2001:1111:2222::/48. We then want to look at what the first /56 looks like:

2001:1111:2222:0000::/56 runs from
2001:1111:2222:0000::
until
2001:1111:2222:00FF::

That last byte (remember this is all in hex) is what gives the customer 256 subnets to play around with.

The next /56 is:
2001:1111:2222:0100::/56

2001:1111:2222:0100::
until
2001:1111:2222:01FF::

All in all, we can do this 256 times, as mentioned earlier.

So in summary, with two customers, each receiving a /56 prefix, we would expect to see the bindings show up on R2 as:

2001:1111:2222::/56
2001:1111:2222:100::/56

So with all this theory in place, let's take a look at the configuration that makes it all work.

First off, we create a local IPv6 pool on R2:

ipv6 local pool IPv6-Local-Pool 2001:1111:2222::/48 56

This is in accordance with the requirements we stated earlier.

Next up, we reference this local pool from a DHCPv6 pool used specifically for prefix delegation:

ipv6 dhcp pool PD-DHCP-POOL
 prefix-delegation pool IPv6-Local-Pool

Finally we attach the IPv6 DHCP pool to the interfaces of Customer A and Customer B:

R2#sh run int g2
Building configuration...

Current configuration : 132 bytes
!
interface GigabitEthernet2
 no ip address
 negotiation auto
 ipv6 address 2001:1111:101::2/64
 ipv6 dhcp server PD-DHCP-POOL
end

R2#sh run int g3
Building configuration...

Current configuration : 132 bytes
!
interface GigabitEthernet3
 no ip address
 negotiation auto
 ipv6 address 2001:1111:102::2/64
 ipv6 dhcp server PD-DHCP-POOL
end

That's pretty much all that's required from the SP point of view in order to hand out the prefixes.

Now let's take a look at what's required on the CE routers.

Starting off with R4’s interface to the SP:

R4#sh run int g2
Building configuration...

Current configuration : 156 bytes
!
interface GigabitEthernet2
 no ip address
 negotiation auto
 ipv6 address 2001:1111:101::3/64
 ipv6 address autoconfig
 ipv6 dhcp client pd LOCAL-CE
end

Note that “LOCAL-CE” is a local name we will use in the next step. It can be anything you desire.

Only when the “inside” interfaces request an IPv6 address will a request be sent to the SP to hand something out. This is configured on R4's g1.405 and g1.406 interfaces:

R4#sh run int g1.405
Building configuration...

Current configuration : 126 bytes
!
interface GigabitEthernet1.405
 encapsulation dot1Q 405
 ipv6 address LOCAL-CE ::1:0:0:0:1/64
 ipv6 address autoconfig
end

R4#sh run int g1.406
Building configuration...

Current configuration : 126 bytes
!
interface GigabitEthernet1.406
 encapsulation dot1Q 406
 ipv6 address LOCAL-CE ::2:0:0:0:1/64
 ipv6 address autoconfig
end

Here we reference the previously defined name “LOCAL-CE”. Most interesting is the fact that we are now subnetting the /56 prefix we have received, by specifying “::1:0:0:0:1/64” and “::2:0:0:0:1/64” respectively.

What this does is append the specified suffix to the delegated prefix. To repeat: for Customer A this prefix is 2001:1111:2222::/56, which results in a final address of 2001:1111:2222:1:0:0:0:1/64 for interface g1.405 and 2001:1111:2222:2:0:0:0:1/64 for g1.406.

Let's turn our attention to Customer B on R7.

The same thing has been configured, just using a different name for the assigned pool to show that it's arbitrary:

R7#sh run int g3
Building configuration...

Current configuration : 155 bytes
!
interface GigabitEthernet3
 no ip address
 negotiation auto
 ipv6 address 2001:1111:102::7/64
 ipv6 address autoconfig
 ipv6 dhcp client pd CE-POOL
end

And the inside interface g1.100:

R7#sh run int g1.100
Building configuration...

Current configuration : 100 bytes
!
interface GigabitEthernet1.100
 encapsulation dot1Q 100
 ipv6 address CE-POOL ::1:0:0:0:7/64
end

Again, we are subnetting the received /56 into a /64 and applying it on the inside interface.

Going back to the SP point of view, let's verify that we are handing out some prefixes:


R2#sh ipv6 local pool
Pool                  Prefix                                       Free  In use
IPv6-Local-Pool       2001:1111:2222::/48                            254      2

We can see that our local pool has handed out two prefixes. Let's dig further down into the bindings:


R2#sh ipv6 dhcp binding
Client: FE80::250:56FF:FEBE:93CC
  DUID: 00030001001EF6767600
  Username : unassigned
  VRF : default
  Interface : GigabitEthernet3
  IA PD: IA ID 0x00080001, T1 302400, T2 483840
    Prefix: 2001:1111:2222:100::/56
            preferred lifetime 604800, valid lifetime 2592000
            expires at Oct 16 2014 03:11 PM (2416581 seconds)
Client: FE80::250:56FF:FEBE:4754
  DUID: 00030001001EE5DF8700
  Username : unassigned
  VRF : default
  Interface : GigabitEthernet2
  IA PD: IA ID 0x00070001, T1 302400, T2 483840
    Prefix: 2001:1111:2222::/56
            preferred lifetime 604800, valid lifetime 2592000
            expires at Oct 16 2014 03:11 PM (2416575 seconds)

We see that we do indeed have some bindings in place. What's of more interest, though, is the fact that static routes have been created:


R2#sh ipv6 route static | beg a - Ap
       a - Application
S   2001:1111:2222::/56 [1/0]
     via FE80::250:56FF:FEBE:4754, GigabitEthernet2
S   2001:1111:2222:100::/56 [1/0]
     via FE80::250:56FF:FEBE:93CC, GigabitEthernet3

So we have two static routes pointing to the CE routers. This makes it extremely simple to propagate them further into the SP core:


R2#sh run | sec router bgp
router bgp 64512
 bgp router-id 2.2.2.2
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 2001:1111::12:1 remote-as 64512
 !
 address-family ipv4
 exit-address-family
 !
 address-family ipv6
  redistribute static
  neighbor 2001:1111::12:1 activate
 exit-address-family

Of course, some sort of filtering should be used instead of just redistributing every static route on the PE, but you get the point. So let's check it out on R3, for example:


R3#sh bgp ipv6 uni | beg Network
     Network          Next Hop            Metric LocPrf Weight Path
 *>i 2001:1111:2222::/56
                       2001:1111::12:2          0    100      0 ?
 *>i 2001:1111:2222:100::/56
                       2001:1111::12:2          0    100      0 ?

We do indeed have the two routes installed.
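As a sketch of the filtering mentioned above (the prefix-list and route-map names are my own), we could limit the redistribution to the delegated /56 prefixes:

ipv6 prefix-list PD-CUSTOMERS seq 5 permit 2001:1111:2222::/48 ge 56 le 56
!
route-map STATIC-TO-BGP permit 10
 match ipv6 address prefix-list PD-CUSTOMERS
!
router bgp 64512
 address-family ipv6
  redistribute static route-map STATIC-TO-BGP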

So how could the customer set up their routers to learn these prefixes automatically and use them actively?
Well, one solution is stateless autoconfiguration, which I have opted to use here, along with letting it set the default route. On R5:


R5#sh run int g1.405
Building configuration...

Current configuration : 96 bytes
!
interface GigabitEthernet1.405
 encapsulation dot1Q 405
 ipv6 address autoconfig default
end

R5#sh ipv6 route | beg a - Ap
       a - Application
ND  ::/0 [2/0]
     via FE80::250:56FF:FEBE:49F3, GigabitEthernet1.405
NDp 2001:1111:2222:1::/64 [2/0]
     via GigabitEthernet1.405, directly connected
L   2001:1111:2222:1:250:56FF:FEBE:3DFB/128 [0/0]
     via GigabitEthernet1.405, receive
L   FF00::/8 [0/0]
     via Null0, receive

and R6:


R6#sh run int g1.406
Building configuration...

Current configuration : 96 bytes
!
interface GigabitEthernet1.406
 encapsulation dot1Q 406
 ipv6 address autoconfig default
end

R6#sh ipv6 route | beg a - App
       a - Application
ND  ::/0 [2/0]
     via FE80::250:56FF:FEBE:49F3, GigabitEthernet1.406
NDp 2001:1111:2222:2::/64 [2/0]
     via GigabitEthernet1.406, directly connected
L   2001:1111:2222:2:250:56FF:FEBE:D054/128 [0/0]
     via GigabitEthernet1.406, receive
L   FF00::/8 [0/0]
     via Null0, receive

So now we have the SP core in place and the customer internals in place. All that's really required now is some sort of routing on the CE routers toward the SP. I have chosen the simplest solution, a static default route:


R4#sh run | incl ipv6 route
ipv6 route ::/0 2001:1111:101::2

and on R7:


R7#sh run | incl ipv6 route
ipv6 route ::/0 2001:1111:102::2

Finally, it's time to test all of this in the data plane.

Let's ping from R3 to R5 and R6:


R3#ping 2001:1111:2222:1:250:56FF:FEBE:3DFB
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001:1111:2222:1:250:56FF:FEBE:3DFB, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/12/20 ms
R3#ping 2001:1111:2222:2:250:56FF:FEBE:D054
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001:1111:2222:2:250:56FF:FEBE:D054, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/7/17 ms

And also to R7:


R3#ping 2001:1111:2222:101::7
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001:1111:2222:101::7, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/8/18 ms

Excellent. Everything works.

Let's summarize what we have done.

1) We created a local IPv6 pool on the PE router.
2) We created a DHCPv6 server utilizing this local pool as a prefix delegation.
3) We enabled the DHCPv6 server on the customer facing interfaces.
4) We enabled the DHCPv6 PD on the CE routers (R4 and R7) and used a local label as an identifier.
5) We assigned IPv6 addresses from the delegated prefixes to the inside interfaces: toward R5 and R6 for Customer A, and on R7's inside interface for Customer B.
6) We used stateless autoconfiguration internal to the customers to further propagate the IPv6 prefixes.
7) We created static routing on the CE routers toward the SP.
8) We redistributed statics into BGP on the PE router.
9) We verified that IPv6 prefixes were being delegated through DHCPv6.
10) And finally we verified that everything was working in the data plane.

I hope this has covered a pretty niche IPv6 topic and that it has been useful to you.

Take care!

VRF based path selection

In this post I will be showing you how it's possible to use different paths between your PE routers on a per-VRF basis.

This is very useful if you have customers you want to “steer” away from your normal traffic flow between PE routers.
For example, this could be due to certain SLAs.

I will be using the following topology to demonstrate how this can be done:

Topology

A short walkthrough of the topology is in order.

In the service provider core we have 4 routers: R3, XRv-1, XRv-2 and R4. R3 and R4 are IOS-XE based routers, and XRv-1 and XRv-2 are, as the name implies, IOS-XR routers. There is no significance attached to the fact that I'm running two XR routers; it's simply how I could build the required topology.

The service provider is running OSPF as the IGP, with R3 and R4 being the PE routers for an MPLS L3 VPN service. On top of that, LDP is being used to build the required LSPs. The IGP has been modified to prefer the northbound path (R3 -> XRv-1 -> R4) by increasing the cost of the links between R3, XRv-2 and R4 to 100.
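As a sketch (the actual interface names are not shown in this post), the cost increase on R3's southbound link toward XRv-2 could look like:

interface GigabitEthernet3
 ip ospf cost 100

with the equivalent applied on the other links along the R3 -> XRv-2 -> R4 path.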

So by default, traffic between R3 and R4 will flow northbound.

We can easily verify this:

R3#traceroute 4.4.4.4
Type escape sequence to abort.
Tracing the route to 4.4.4.4
VRF info: (vrf in name/id, vrf out name/id)
  1 10.3.10.10 [MPLS: Label 16005 Exp 0] 16 msec 1 msec 1 msec
  2 10.4.10.4 1 msec *  5 msec

And the reverse path is the same:

R4#traceroute 3.3.3.3
Type escape sequence to abort.
Tracing the route to 3.3.3.3
VRF info: (vrf in name/id, vrf out name/id)
  1 10.4.10.10 [MPLS: Label 16000 Exp 0] 3 msec 2 msec 0 msec
  2 10.3.10.3 1 msec *  5 msec

Besides traffic flowing the desired way, we can see that we are label switching between the loopbacks. Exactly what we want in this type of setup.

On the customer side, we have two customers, Customer A and Customer B. Each of them has two sites, one behind R3 and one behind R4. Pretty simple. They are all running EIGRP between the CEs and the PEs.

Beyond this, we have MPLS Traffic Engineering running in the service provider core as well. Specifically, we are running a tunnel from R3's Loopback200 (33.33.33.33/32) toward R4's Loopback200 (44.44.44.44/32), and one in the opposite direction. This has been accomplished by configuring an explicit path on both R3 and R4.

Let's verify the tunnel configuration on both:

On R3:

R3#sh ip expl
PATH NEW-R3-TO-R4 (strict source route, path complete, generation 8)
    1: next-address 10.3.20.20
    2: next-address 10.4.20.4
R3#sh run int tunnel10
Building configuration...

Current configuration : 180 bytes
!
interface Tunnel10
 ip unnumbered Loopback200
 tunnel mode mpls traffic-eng
 tunnel destination 10.4.20.4
 tunnel mpls traffic-eng path-option 10 explicit name NEW-R3-TO-R4
end

And on R4:

R4#sh ip expl
PATH NEW-R4-TO-R3 (strict source route, path complete, generation 4)
    1: next-address 10.4.20.20
    2: next-address 10.3.20.3
R4#sh run int tun10
Building configuration...

Current configuration : 180 bytes
!
interface Tunnel10
 ip unnumbered Loopback200
 tunnel mode mpls traffic-eng
 tunnel destination 10.3.20.3
 tunnel mpls traffic-eng path-option 10 explicit name NEW-R4-TO-R3
end

On top of that, we have configured a static route on both R3 and R4 to steer traffic for each other's Loopback200 down the tunnel:

R3#sh run | incl ip route
ip route 44.44.44.44 255.255.255.255 Tunnel10

R4#sh run | incl ip route
ip route 33.33.33.33 255.255.255.255 Tunnel10

Resulting in the following RIBs:

R3#sh ip route 44.44.44.44
Routing entry for 44.44.44.44/32
  Known via "static", distance 1, metric 0 (connected)
  Routing Descriptor Blocks:
  * directly connected, via Tunnel10
      Route metric is 0, traffic share count is 1
	  
R4#sh ip route 33.33.33.33
Routing entry for 33.33.33.33/32
  Known via "static", distance 1, metric 0 (connected)
  Routing Descriptor Blocks:
  * directly connected, via Tunnel10
      Route metric is 0, traffic share count is 1

And to test that we are actually using the southbound path (R3 -> XRv-2 -> R4), let's traceroute between the Loopback200 interfaces:

on R3:

R3#traceroute 44.44.44.44 so loopback200
Type escape sequence to abort.
Tracing the route to 44.44.44.44
VRF info: (vrf in name/id, vrf out name/id)
  1 10.3.20.20 [MPLS: Label 16007 Exp 0] 4 msec 2 msec 1 msec
  2 10.4.20.4 1 msec *  3 msec

and on R4:

R4#traceroute 33.33.33.33 so loopback200
Type escape sequence to abort.
Tracing the route to 33.33.33.33
VRF info: (vrf in name/id, vrf out name/id)
  1 10.4.20.20 [MPLS: Label 16008 Exp 0] 4 msec 1 msec 1 msec
  2 10.3.20.3 1 msec *  3 msec

This verifies that we have our two unidirectional tunnels and that communication between the loopback200 interfaces flows through the southbound path using our TE tunnels.

So let's take a look at the very simple BGP PE configuration on both R3 and R4:

R3:

router bgp 100
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 4.4.4.4 remote-as 100
 neighbor 4.4.4.4 update-source Loopback100
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 4.4.4.4 activate
  neighbor 4.4.4.4 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf A
  redistribute eigrp 100
 exit-address-family
 !
 address-family ipv4 vrf B
  redistribute eigrp 100
 exit-address-family

and R4:

router bgp 100
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 3.3.3.3 remote-as 100
 neighbor 3.3.3.3 update-source Loopback100
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 send-community extended
 exit-address-family
 !
 address-family ipv4 vrf A
  redistribute eigrp 100
 exit-address-family
 !
 address-family ipv4 vrf B
  redistribute eigrp 100
 exit-address-family

From this output, we can see that we are using the loopback100 interfaces for the BGP peering. As routing updates come in from one PE, the next-hop will be set to the remote PE's loopback100 interface. This in turn causes the transport label to be the one leading to that loopback100 interface.
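That lookup chain (VPN label from the BGP route, transport label from the LFIB entry for the route's next-hop) can be sketched in a few lines of Python. This is just a model of the idea, not device code, and all label values below simply mirror the lab outputs shown in this post:

```python
# Sketch of how a PE builds the label stack for a VPN packet:
# the VPN label comes from the BGP route itself, while the
# transport label is found by looking up the route's BGP
# next-hop in the LFIB. Values mirror R3's tables in this lab.

# LFIB: prefix -> outgoing transport label (as on R3)
lfib = {
    "4.4.4.4/32": 16005,      # R4's loopback100, northbound path
    "44.44.44.44/32": 16007,  # R4's loopback200, via the TE tunnel
}

# VPN routes: customer prefix -> (BGP next-hop, VPN label)
vpn_routes = {
    "5.5.5.5/32": ("4.4.4.4/32", 408),      # VRF A, next-hop loopback100
    "6.6.6.6/32": ("44.44.44.44/32", 409),  # VRF B, next-hop loopback200
}

def label_stack(prefix):
    """Return [transport_label, vpn_label] for a customer prefix."""
    next_hop, vpn_label = vpn_routes[prefix]
    return [lfib[next_hop], vpn_label]

print(label_stack("5.5.5.5/32"))  # [16005, 408] - northbound path
print(label_stack("6.6.6.6/32"))  # [16007, 409] - southbound TE tunnel
```

This is why changing only the advertised next-hop is enough to move the customer traffic onto a different LSP: the VPN label stays the same, but the transport label lookup resolves differently.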

A traceroute from R1's loopback0 interface to R5's loopback0 interface will show us the path that traffic between each site in VRF A (Customer A) will take:

R1:

R1#traceroute 5.5.5.5 so loo0
Type escape sequence to abort.
Tracing the route to 5.5.5.5
VRF info: (vrf in name/id, vrf out name/id)
  1 10.1.3.3 1 msec 1 msec 0 msec
  2 10.3.10.10 [MPLS: Labels 16005/408 Exp 0] 6 msec 1 msec 10 msec
  3 10.4.5.4 [MPLS: Label 408 Exp 0] 15 msec 22 msec 17 msec
  4 10.4.5.5 18 msec *  4 msec

And let's compare that to what R3 will use as the transport label to reach R4's loopback100 interface:

 
R3#sh mpls for
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id     Switched      interface
300        Pop Label  10.10.10.10/32   0             Gi1.310    10.3.10.10
301        Pop Label  10.4.10.0/24     0             Gi1.310    10.3.10.10
302        Pop Label  20.20.20.20/32   0             Gi1.320    10.3.20.20
303        16004      10.4.20.0/24     0             Gi1.310    10.3.10.10
304   [T]  Pop Label  44.44.44.44/32   0             Tu10       point2point
305        16005      4.4.4.4/32       0             Gi1.310    10.3.10.10
310        No Label   1.1.1.1/32[V]    2552          Gi1.13     10.1.3.1
311        No Label   10.1.3.0/24[V]   0             aggregate/A
312        No Label   2.2.2.2/32[V]    2552          Gi1.23     10.2.3.2
313        No Label   10.2.3.0/24[V]   0             aggregate/B

We can see that this matches up being 16005 (going to XRv-1) through the northbound path.

This raises the question: how do we steer our traffic through the southbound path using loopback200 instead, when the peering is between the loopback100 interfaces?

Well, thankfully IOS has it covered. Under the VRF configuration for Customer B (VRF B), we have the option of setting the next-hop of updates sent to the remote PE to a specific loopback interface:

On R3:

vrf definition B
 rd 100:2
 !
 address-family ipv4
  route-target export 100:2
  route-target import 100:2
  bgp next-hop Loopback200
 exit-address-family

and the same on R4:

 vrf definition B
  rd 100:2
  !
  address-family ipv4
   route-target export 100:2
   route-target import 100:2
   bgp next-hop Loopback200
  exit-address-family

This causes the BGP updates to contain the “correct” next-hop:

R3:

R3#sh bgp vpnv4 uni vrf B | beg Route Dis
Route Distinguisher: 100:2 (default for vrf B)
 *>  2.2.2.2/32       10.2.3.2            130816         32768 ?
 *>i 6.6.6.6/32       44.44.44.44         130816    100      0 ?
 *>  10.2.3.0/24      0.0.0.0                  0         32768 ?
 *>i 10.4.6.0/24      44.44.44.44              0    100      0 ?

44.44.44.44/32 being the loopback200 of R4, and on R4:

R4#sh bgp vpnv4 uni vrf B | beg Route Dis
Route Distinguisher: 100:2 (default for vrf B)
 *>i 2.2.2.2/32       33.33.33.33         130816    100      0 ?
 *>  6.6.6.6/32       10.4.6.6            130816         32768 ?
 *>i 10.2.3.0/24      33.33.33.33              0    100      0 ?
 *>  10.4.6.0/24      0.0.0.0                  0         32768 ?

Let's check whether this actually works:

R2#traceroute 6.6.6.6 so loo0
Type escape sequence to abort.
Tracing the route to 6.6.6.6
VRF info: (vrf in name/id, vrf out name/id)
  1 10.2.3.3 1 msec 1 msec 0 msec
  2 10.3.20.20 [MPLS: Labels 16007/409 Exp 0] 4 msec 1 msec 10 msec
  3 10.4.6.4 [MPLS: Label 409 Exp 0] 15 msec 16 msec 17 msec
  4 10.4.6.6 19 msec *  4 msec

Excellent! – We can see that we are indeed using the southbound path. To make sure we are using the tunnel, note the transport label of 16007, and compare that to:

R3:

R3#sh mpls traffic-eng tun tunnel 10

Name: R3_t10                              (Tunnel10) Destination: 10.4.20.4
  Status:
    Admin: up         Oper: up     Path: valid       Signalling: connected
    path option 10, type explicit NEW-R3-TO-R4 (Basis for Setup, path weight 200)

  Config Parameters:
    Bandwidth: 0        kbps (Global)  Priority: 7  7   Affinity: 0x0/0xFFFF
    Metric Type: TE (default)
    AutoRoute: disabled LockDown: disabled Loadshare: 0 [0] bw-based
    auto-bw: disabled
  Active Path Option Parameters:
    State: explicit path option 10 is active
    BandwidthOverride: disabled  LockDown: disabled  Verbatim: disabled


  InLabel  :  -
  OutLabel : GigabitEthernet1.320, 16007
  Next Hop : 10.3.20.20

I have deleted a lot of non-relevant output, but pay attention to the OutLabel, which is indeed 16007.

So that was a quick walkthrough of how easy it is to accomplish the stated goal once you know about that nifty IOS command.

I hope it's been useful to you.

Take Care!

Using LISP for IPv6 tunnelling.

In this post I would like to show how it's possible to use a fairly new protocol, LISP, to interconnect IPv6 islands over an IPv4 backbone/core network.

LISP stands for Locator/ID Separation Protocol. As the name suggests, it's meant to decouple location from identity. This means it can be used for such cool things as mobility, whether for VMs or mobile data connections.

However another aspect of using LISP involves its tunneling mechanism. This is what I will be using in my example to provide the IPv6 islands the ability to communicate over the IPv4-only backbone.

There is a lot of terminology involved with LISP, but I will only use some of it here for clarity. If you want to know more about LISP, a good place to start is http://lisp.cisco.com.

The topology I will be using is a modified version of one presented in a Cisco Live presentation called "BRKRST-3046 – Advanced LISP – Whats in it for me?". I encourage you to view it as well for more information.

Here is the topology:

LISP-IPv6-Topology

Some background information about the setup: both Site 1 and Site 2 are using EIGRP as the IGP. Both IPv4 and IPv6 are being routed internally. A default route is originated by R2, R3 and R6 in their respective sites.

The RIB on R1 for both IPv4 and IPv6:


R1#sh ip route eigrp | beg Gateway
Gateway of last resort is 172.16.10.3 to network 0.0.0.0

D*EX  0.0.0.0/0 [170/2560000512] via 172.16.10.3, 1d00h, GigabitEthernet1.100
                [170/2560000512] via 172.16.10.2, 1d00h, GigabitEthernet1.100
R1#sh ipv6 route eigrp
<snip>
EX  ::/0 [170/2816]
     via FE80::250:56FF:FEBE:675D, GigabitEthernet1.100
     via FE80::250:56FF:FEBE:9215, GigabitEthernet1.100

And R7:

R7#sh ip ro eigrp | beg Gateway
Gateway of last resort is 172.16.20.6 to network 0.0.0.0

D*EX  0.0.0.0/0 [170/2560000512] via 172.16.20.6, 1d00h, GigabitEthernet1.67

R7#sh ipv6 route eigrp
<snip>
EX  ::/0 [170/2816]
     via FE80::250:56FF:FEBE:D054, GigabitEthernet1.67

Now in order to get anywhere, we need to set up our LISP infrastructure. This means configuring R2, R3 and R6 as what's known as RLOCs, as well as configuring R5 as a map-server and map-resolver. A map-server/resolver is where RLOCs register which internal IP scopes their sites contain. It's also where each RLOC asks for information on how to reach other sites. So obviously it is a very important part of our LISP setup. Here is the relevant configuration on R5:

router lisp
 site SITE1
  authentication-key blah
  eid-prefix 153.16.1.1/32
  eid-prefix 153.16.1.2/32
  eid-prefix 172.16.10.0/24 accept-more-specifics
  eid-prefix 2001::1/128
  eid-prefix 2001::2/128
  eid-prefix 2001:100::/32 accept-more-specifics
  exit
 !
 site SITE2
  authentication-key blah
  eid-prefix 153.16.2.1/32
  eid-prefix 172.16.20.0/24
  eid-prefix 2001::7/128
  eid-prefix 2001:67::/32 accept-more-specifics
  exit
 !
 ipv4 map-server
 ipv4 map-resolver
 ipv6 map-server
 ipv6 map-resolver

On IOS-XE, which is what I'm using to build this lab, all configuration is done under the router lisp mode.

As can be seen from the configuration, two sites have been defined, SITE1 and SITE2.
An authentication key has been configured for each site. Furthermore, the prefixes that we want to accept from each site have also been configured. If our addressing scheme had been somewhat more thought out, we could have used "accept-more-specifics" to cover the more specific subnets with fewer entries, but this configuration serves our purpose.
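The effect of "accept-more-specifics" can be modelled with Python's `ipaddress` module. This is a simplified sketch of the matching rule as I understand it, not the actual map-server code: an exact eid-prefix only matches a registration for that exact prefix, while "accept-more-specifics" also admits longer prefixes contained within it.

```python
# Simplified model of map-server eid-prefix matching: an exact
# eid-prefix accepts only a registration for that exact prefix,
# while "accept-more-specifics" also accepts longer prefixes
# falling inside it. Illustration only, not the real code.
import ipaddress

def accepts(configured, more_specifics, registered):
    conf = ipaddress.ip_network(configured)
    reg = ipaddress.ip_network(registered)
    if reg == conf:
        return True
    return bool(more_specifics and reg.subnet_of(conf))

# eid-prefix 172.16.10.0/24 accept-more-specifics
print(accepts("172.16.10.0/24", True, "172.16.10.128/25"))  # True
# eid-prefix 172.16.20.0/24 (no accept-more-specifics)
print(accepts("172.16.20.0/24", False, "172.16.20.0/25"))   # False
# works the same for IPv6 eid-prefixes
print(accepts("2001:100::/32", True, "2001:100:1::/48"))    # True
```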

Pay attention to the fact that we do this for each address-family. For our IPv6 example this is not really necessary, but I wanted to provide both IPv4 and IPv6 connectivity, so I configured both.

Finally I’ve configured R5 as both a map-server and map-resolver for each address-family.

Next up is the configuration for R2:

R2#sh run | sec router lisp
router lisp
 locator-set SITE1
  10.1.1.1 priority 10 weight 50
  10.1.2.1 priority 10 weight 50
  exit
 !
 database-mapping 153.16.1.1/32 locator-set SITE1
 database-mapping 153.16.1.2/32 locator-set SITE1
 database-mapping 172.16.10.0/24 locator-set SITE1
 database-mapping 2001::1/128 locator-set SITE1
 database-mapping 2001::2/128 locator-set SITE1
 database-mapping 2001:100::/32 locator-set SITE1
 ipv4 itr map-resolver 10.1.3.1
 ipv4 itr
 ipv4 etr map-server 10.1.3.1 key blah
 ipv4 etr
 ipv6 itr map-resolver 10.1.3.1
 ipv6 itr
 ipv6 etr map-server 10.1.3.1 key blah
 ipv6 etr

The first part of this configuration lists a "locator-set". This is where you list each RLOC for the site in question. For SITE1 we have two RLOCs with IPv4 addresses in the IPv4 transport cloud: 10.1.1.1 and 10.1.2.1, for R2 and R3 respectively.

One of the very cool things about LISP is how redundancy and/or load-balancing can be signaled by the local RLOCs. By modifying the priority of R3 (10.1.2.1) to 20, we would effectively tell the other site(s) that we prefer R2 as the egress tunnel router (ETR), so all traffic would be sent to R2. If we instead leave the priorities the same and modify the weights, we can load-balance traffic. Again, this is signaled by the local site and replicated to the remote site(s).
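The selection rule can be sketched in a few lines (a model of the documented priority/weight semantics, not LISP code): only the locators with the numerically lowest priority are used, and among those, weight sets the traffic share.

```python
# Sketch of RLOC selection: ITRs use only the locators with the
# numerically lowest priority; among those, weight sets the
# load-balancing ratio. Values below mirror SITE1 in this lab.

def usable_rlocs(locators):
    """locators: list of (address, priority, weight).
    Return {address: share_of_traffic} for the best-priority set."""
    best = min(p for _, p, _ in locators)
    active = [(a, w) for a, p, w in locators if p == best]
    total = sum(w for _, w in active)
    return {a: w / total for a, w in active}

site1 = [("10.1.1.1", 10, 50), ("10.1.2.1", 10, 50)]
print(usable_rlocs(site1))  # both RLOCs, 50/50 split

# Raise R3's priority to 20: only R2 (10.1.1.1) carries traffic.
print(usable_rlocs([("10.1.1.1", 10, 50), ("10.1.2.1", 20, 50)]))
```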

Next up are our mappings. This is where we define which prefixes we want to use in this site: here, the loopbacks of R1 and the network used for connectivity in SITE1, both for IPv4 and IPv6. Again, IPv4 is not necessary for our example.

Finally, we define a map-resolver and map-server for both the ITR (Ingress Tunnel Router) and ETR (Egress Tunnel Router) roles. This is so we can define where we want to send our mapping data as well as where to ask about other ETRs. We also enable ourselves as ITR and ETR for both address-families.

The exact same configuration has been applied on R3:

R3#sh run | sec router lisp
router lisp
 locator-set SITE1
  10.1.1.1 priority 10 weight 50
  10.1.2.1 priority 10 weight 50
  exit
 !
 database-mapping 153.16.1.1/32 locator-set SITE1
 database-mapping 153.16.1.2/32 locator-set SITE1
 database-mapping 172.16.10.0/24 locator-set SITE1
 database-mapping 2001::1/128 locator-set SITE1
 database-mapping 2001::2/128 locator-set SITE1
 database-mapping 2001:100::/32 locator-set SITE1
 ipv4 itr map-resolver 10.1.3.1
 ipv4 itr
 ipv4 etr map-server 10.1.3.1 key blah
 ipv4 etr
 ipv6 itr map-resolver 10.1.3.1
 ipv6 itr
 ipv6 etr map-server 10.1.3.1 key blah
 ipv6 etr

Now for some verification commands on R2:

R2#sh ip lisp
  Instance ID:                      0
  Router-lisp ID:                   0
  Locator table:                    default
  EID table:                        default
  Ingress Tunnel Router (ITR):      enabled
  Egress Tunnel Router (ETR):       enabled
  Proxy-ITR Router (PITR):          disabled
  Proxy-ETR Router (PETR):          disabled
  NAT-traversal Router (NAT-RTR):   disabled
  Mobility First-Hop Router:        disabled
  Map Server (MS):                  disabled
  Map Resolver (MR):                disabled
  Delegated Database Tree (DDT):    disabled
  Map-Request source:               10.1.1.1
  ITR Map-Resolver(s):              10.1.3.1
  ETR Map-Server(s):                10.1.3.1 (00:00:50)
  xTR-ID:                           0xA7F25A1D-0x982B7E10-0xDD2D66CC-0x436D28A5
  site-ID:                          unspecified
  ITR Solicit Map Request (SMR):    accept and process
    Max SMRs per map-cache entry:   8 more specifics
    Multiple SMR suppression time:  20 secs
  ETR accept mapping data:          disabled, verify disabled
  ETR map-cache TTL:                1d00h
  Locator Status Algorithms:
    RLOC-probe algorithm:           disabled
    LSB reports:                    process
    IPv4 RLOC minimum mask length:  /0
    IPv6 RLOC minimum mask length:  /0
  Static mappings configured:       0
  Map-cache size/limit:             1/1000
  Imported route count/limit:       0/1000
  Map-cache activity check period:  60 secs
  Map-cache FIB updates:            established
  Total database mapping size:      3
    static database size/limit:     3/5000
    dynamic database size/limit:    0/1000
    route-import database size:     0
  Persistent map-cache:             interval 01:00:00
    Earliest next store:            now
    Location:                       bootflash:LISP-MapCache-IPv4-00000000-00100

Lots of output, but pay attention to the fact that both ITR and ETR have been enabled and that the ITR Map-Resolver(s) and ETR Map-Server(s) have been set to 10.1.3.1 (R5).

We also want to verify our current map-cache, which is the cache maintained by the RLOCs of what they already "know" about:

R2#sh ipv6 lisp map-cache
LISP IPv6 Mapping Cache for EID-table default (IID 0), 1 entries

::/0, uptime: 00:00:01, expires: never, via static send map-request
  Negative cache entry, action: send-map-request

Basically this output tells you that we don't know about any specific networks from other sites just yet.

R6 is very similar to R2 and R3:

R6#sh run | sec router lisp
router lisp
 locator-set SITE2
  10.1.4.1 priority 10 weight 50
  exit
 !
 database-mapping 153.16.2.1/32 locator-set SITE2
 database-mapping 172.16.20.0/24 locator-set SITE2
 database-mapping 2001::7/128 locator-set SITE2
 ipv4 itr map-resolver 10.1.3.1
 ipv4 itr
 ipv4 etr map-server 10.1.3.1 key blah
 ipv4 etr
 ipv6 itr map-resolver 10.1.3.1
 ipv6 itr
 ipv6 etr map-server 10.1.3.1 key blah
 ipv6 etr

And verification:

R6#sh ip lisp
  Instance ID:                      0
  Router-lisp ID:                   0
  Locator table:                    default
  EID table:                        default
  Ingress Tunnel Router (ITR):      enabled
  Egress Tunnel Router (ETR):       enabled
  Proxy-ITR Router (PITR):          disabled
  Proxy-ETR Router (PETR):          disabled
  NAT-traversal Router (NAT-RTR):   disabled
  Mobility First-Hop Router:        disabled
  Map Server (MS):                  disabled
  Map Resolver (MR):                disabled
  Delegated Database Tree (DDT):    disabled
  Map-Request source:               10.1.4.1
  ITR Map-Resolver(s):              10.1.3.1
  ETR Map-Server(s):                10.1.3.1 (00:00:38)
  xTR-ID:                           0xFABA5140-0x6AA2BA6A-0x5F347223-0xF7E0CED0
  site-ID:                          unspecified
  ITR Solicit Map Request (SMR):    accept and process
    Max SMRs per map-cache entry:   8 more specifics
    Multiple SMR suppression time:  20 secs
  ETR accept mapping data:          disabled, verify disabled
  ETR map-cache TTL:                1d00h
  Locator Status Algorithms:
    RLOC-probe algorithm:           disabled
    LSB reports:                    process
    IPv4 RLOC minimum mask length:  /0
    IPv6 RLOC minimum mask length:  /0
  Static mappings configured:       0
  Map-cache size/limit:             1/1000
  Imported route count/limit:       0/1000
  Map-cache activity check period:  60 secs
  Map-cache FIB updates:            established
  Total database mapping size:      2
    static database size/limit:     2/5000
    dynamic database size/limit:    0/1000
    route-import database size:     0
  Persistent map-cache:             interval 01:00:00
    Earliest next store:            now
    Location:                       bootflash:LISP-MapCache-IPv4-00000000-00100

Along with the mapping-cache:

R6#sh ipv6 lisp map-cache
LISP IPv6 Mapping Cache for EID-table default (IID 0), 1 entries

::/0, uptime: 00:00:04, expires: never, via static send map-request
  Negative cache entry, action: send-map-request

If we now try a ping from R1’s loopback0 to R7’s loopback0 we see the following:

R1#ping 2001::7 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001::7, timeout is 2 seconds:
Packet sent with a source address of 2001::1
..!!!
Success rate is 60 percent (3/5), round-trip min/avg/max = 1/1/1 ms

What this tells us is that we have connectivity, but beyond that it also means that for the first two ICMP echoes, location data was being retrieved. Let's now check the mapping-cache on R2:

R2#sh ipv6 lisp map-cache
LISP IPv6 Mapping Cache for EID-table default (IID 0), 2 entries

::/0, uptime: 00:01:08, expires: never, via static send map-request
  Negative cache entry, action: send-map-request
2001::7/128, uptime: 00:00:07, expires: 23:59:52, via map-reply, complete
  Locator   Uptime    State      Pri/Wgt
  10.1.4.1  00:00:07  up          10/50

Here we see that 2001::7/128 is currently in the cache and in order to get there we need to tunnel our traffic to the RLOC at 10.1.4.1 (R6).

On the remote side we see something similar:

R6#sh ipv6 lisp map-cache
LISP IPv6 Mapping Cache for EID-table default (IID 0), 2 entries

::/0, uptime: 00:01:53, expires: never, via static send map-request
  Negative cache entry, action: send-map-request
2001::1/128, uptime: 00:01:34, expires: 23:58:25, via map-reply, complete
  Locator   Uptime    State      Pri/Wgt
  10.1.1.1  00:01:34  up          10/50
  10.1.2.1  00:01:34  up          10/50

This is the mapping that tells R6 it can send traffic to either RLOC (both are in the up state).

If we try a ping from R1 again:

R1#ping 2001::7 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 2001::7, timeout is 2 seconds:
Packet sent with a source address of 2001::1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/9/25 ms

We get full connectivity because the cache has already been populated.
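The behaviour of the two pings can be modelled with a toy map-cache. This is a deliberate simplification (class and method names are my own, and real ITRs may handle the in-flight packet differently): on a miss, the ITR sends a map-request and the packet is lost; once the map-reply populates the cache, forwarding succeeds.

```python
# Toy model of an ITR's map-cache: on a cache miss the ITR sends
# a map-request and (in this simplified model) drops the packet;
# once the map-reply populates the cache, later packets are
# encapsulated toward the learned RLOC.

class MapCache:
    def __init__(self, mapping_system):
        self.cache = {}
        # EID -> RLOC, i.e. what R5 (the map-server) knows
        self.mapping_system = mapping_system

    def forward(self, eid):
        if eid in self.cache:
            return f"encap to {self.cache[eid]}"
        # Miss: send a map-request, drop this packet.
        self.cache[eid] = self.mapping_system[eid]
        return "dropped (map-request sent)"

itr = MapCache({"2001::7": "10.1.4.1"})
print(itr.forward("2001::7"))  # dropped (map-request sent)
print(itr.forward("2001::7"))  # encap to 10.1.4.1
```

This mirrors what we saw on the wire: the first ping loses a couple of echoes while resolution happens, and the second ping is 100% successful.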

Finally, let's see what the packet capture looks like on R4:

R4#show monitor capture LISP buffer brief
 -------------------------------------------------------------
 #   size   timestamp     source	     destination   protocol
 -------------------------------------------------------------
   0  154    0.000000   10.1.1.1         ->  10.1.4.1         UDP
   1  154    0.000000   10.1.4.1         ->  10.1.1.1         UDP
   2  154    0.001007   10.1.1.1         ->  10.1.4.1         UDP
   3  154    0.001007   10.1.4.1         ->  10.1.1.1         UDP
   4  154    0.001007   10.1.1.1         ->  10.1.4.1         UDP
   5  154    0.001999   10.1.4.1         ->  10.1.1.1         UDP
   6  154    0.003006   10.1.1.1         ->  10.1.4.1         UDP
   7  154    0.016006   10.1.4.1         ->  10.1.1.1         UDP
   8  154    0.025008   10.1.1.1         ->  10.1.4.1         UDP
   9  154    0.038008   10.1.4.1         ->  10.1.1.1         UDP
  10  162    2.282035   10.1.4.1         ->  10.1.3.1         UDP
  11  162    2.282035   10.1.3.1         ->  10.1.4.1         UDP

(I used an EPC (Embedded Packet Capture) on R4 to get the data).

We clearly see that UDP traffic is flowing between R2 and R6.

So this tunneling characteristic is one way we can utilize LISP, but there are many other use cases as I mentioned before.

I hope this has been useful to you.

Until next time, take care!

Short update

It's been a long time since my last update, and I apologise for that. It wasn't my intention; it just sort of happened.

In the meantime I have attempted the CCIE SP lab and didn't pass it, so I am still studying for my next attempt, which is coming up shortly.
Until then I have booked quite a number of rack hours. Hopefully I will learn from some of the mistakes I have identified.

Just last week Cisco announced the IOS XRv image, which allows you to run a virtual instance of IOS XR. This is great news for the community at large, as it provides the ability to learn about XR without having to spend a lot of money on rack rentals or on buying platforms that run XR.

Unfortunately, there is a bug in the download system, which Cisco is trying to correct: it disallows the download for people with active partner status. This includes me, so at the time of writing we have to wait three days until it gets sorted out.

I suggest you take a look at FryGuy’s blog about the release of IOS XRv.
The link can be found here:
http://www.fryguy.net/2014/02/08/cisco-ios-xrv-v-as-in-virtual/

Take care!

Fixing multicast RPF failure with BGP

In this post I would like to explain how you can fix a multicast RPF failure using BGP.

If you take a look at the topology in figure 1, we have a network running EIGRP as the IGP
and where R1 advertises its loopback 0 (1.1.1.1/32). R4 also has a loopback 0 with the 4.4.4.4/32 address.
EIGRP adjacencies are running between R1 and R2, R1 and R3, R2 and R3 and finally R3 and R4.
Basically on all links in the topology.

Figure 1

Everything is working fine, and we can verify that we have reachability through the network using ICMP echo between R1 and R4:

R1#ping 4.4.4.4 so loo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 4.4.4.4, timeout is 2 seconds:
Packet sent with a source address of 1.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 32/59/96 ms

To create the multicast network, we enable multicast routing on all the routers and enable
PIM sparse-mode on all links except the R1 to R3 link! (Remember to include the loopbacks.)

Rx(config-if)#ip pim sparse-mode

To simulate traffic, R1 will be the source and R4's loopback will be the receiver, so on R4, let's join
239.1.1.1:

R4(config-if)# ip igmp join-group 239.1.1.1

What we end up with is a PIM topology as in figure 2.

Figure 2

Now what we want to do is make R1 our RP using BSR. What will happen is that R3 will receive a BSR
message on its f2/0 interface and determine that it's not the RPF interface it should be getting this message on;
that would instead be f1/0, as per the IGP.

Using “debug ip pim bsr” a message will appear on the console:

*Jun 12 11:07:20.063: PIM-BSR(0): bootstrap (1.1.1.1) on non-RPF path FastEthernet2/0 or from non-RPF neighbor 0.0.0.0 discarded

This basically renders our multicast network incomplete, as R3 and R4 won't have any RP.

So let's fix it!

On R2 and R3 we enable BGP with the multicast address family:

R2:

router bgp 100
no bgp default ipv4-unicast
bgp log-neighbor-changes
neighbor 10.2.3.3 remote-as 100
!
address-family ipv4
no synchronization
no auto-summary
exit-address-family
!
address-family ipv4 multicast
network 1.1.1.1 mask 255.255.255.255
neighbor 10.2.3.3 activate
no auto-summary
exit-address-family

R3:

router bgp 100
no bgp default ipv4-unicast
bgp log-neighbor-changes
neighbor 10.2.3.2 remote-as 100
!
address-family ipv4
no synchronization
no auto-summary
exit-address-family
!
address-family ipv4 multicast
neighbor 10.2.3.2 activate
distance bgp 20 70 1
no auto-summary
exit-address-family

So what did we do here? For one thing, we created an iBGP peering between R2 and R3.
On this peering we enabled the multiprotocol extension for multicast.

Then on R2, we advertise the 1.1.1.1/32 network as learned through EIGRP.
A similar configuration is used on R3, except we are not advertising anything and, importantly, we are using the
distance command to set the administrative distance for iBGP-learned routes to 70. This last step would not be necessary
if we had an eBGP peering instead, since the AD would then already be lower than that of the IGP-learned route.
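The effect of the distance command can be sketched as a distance-preferred lookup (a simplification of what "show ip rpf" reports below; the table labels are mine): the RPF check considers candidate routes to the source from both the unicast and multicast tables and prefers the one with the lowest administrative distance.

```python
# Sketch of a distance-preferred RPF lookup: R3 sees the source
# 1.1.1.1/32 both in the unicast RIB (EIGRP, AD 90, via f1/0) and
# in the multicast BGP table (iBGP, AD lowered to 70, via f2/0).
# The candidate with the lowest AD wins the RPF check.

def rpf_choice(candidates):
    """candidates: list of (table, admin_distance, interface)."""
    return min(candidates, key=lambda c: c[1])

routes_to_source = [
    ("unicast/EIGRP", 90, "FastEthernet1/0"),
    ("multicast/iBGP", 70, "FastEthernet2/0"),  # after "distance bgp 20 70 1"
]
print(rpf_choice(routes_to_source))
# -> ('multicast/iBGP', 70, 'FastEthernet2/0')

# Without the distance command, iBGP would keep its default AD of
# 200 and the EIGRP route would win, leaving the RPF failure in place.
print(rpf_choice([("unicast/EIGRP", 90, "FastEthernet1/0"),
                  ("multicast/iBGP", 200, "FastEthernet2/0")]))
```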

On R3, we should now see the following:

R3#sh bgp ipv4 mu
BGP table version is 2, local router ID is 10.3.4.3
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal,
r RIB-failure, S Stale, m multipath, b backup-path, x best-external
Origin codes: i - IGP, e - EGP, ? - incomplete

Network          Next Hop            Metric LocPrf Weight Path
*>i1.1.1.1/32       10.1.2.1            156160    100      0 i

and using “show ip rpf 1.1.1.1”:

R3#sh ip rpf 1.1.1.1
RPF information for ? (1.1.1.1)
RPF interface: FastEthernet2/0
RPF neighbor: ? (10.2.3.2)
RPF route/mask: 1.1.1.1/32
RPF type: multicast (bgp 100)
Doing distance-preferred lookups across tables
RPF topology: ipv4 multicast base, originated from ipv4 unicast base

As can be seen, R3 now uses the BGP-learned route through f2/0 to get "back" to 1.1.1.1/32.
And finally, the RP should be installed on both R3 and R4:

R3#sh ip pim rp map
PIM Group-to-RP Mappings

Group(s) 224.0.0.0/4
RP 1.1.1.1 (?), v2
Info source: 1.1.1.1 (?), via bootstrap, priority 0, holdtime 150
Uptime: 00:52:22, expires: 00:01:51

R4#sh ip pim rp map
PIM Group-to-RP Mappings

Group(s) 224.0.0.0/4
RP 1.1.1.1 (?), v2
Info source: 1.1.1.1 (?), via bootstrap, priority 0, holdtime 150
Uptime: 00:52:36, expires: 00:01:36

So that's how you can fix a multicast RPF failure using BGP.

Understanding the “NTP access-group” command in IOS.

NTP has always been one of those things I have found tricky to really lab up. It's fairly easy to set up, but verifying whether everything is working as you expect can be hard, because it takes a while to synchronize (and even to unsynchronize).

In this post I will try and shed some light on the “ntp access-group” command set in Cisco IOS.

When you perform a “?” on the command set it looks like the following (on 12.2(33)SRD7):

R1(config)#ntp access-group ?
  peer        Provide full access
  query-only  Allow only control queries
  serve       Provide server and query access
  serve-only  Provide only server access

For each of the options you can specify an access-list:

R1(config)#ntp access-group peer ?
  <1-99>       Standard IP access list
  <1300-1999>  Standard IP access list (expanded range)

The key to understanding how this lightweight security system works is the following sentence in the documentation:

“If you specify any access groups, only the specified access is granted.”

Along with the ordered list of the options, from most open to least open:

peer
serve
serve-only
query-only

Let's illustrate this with an example. If you apply the following:

access-list 90 deny any 
access-list 91 permit 10.1.2.10 
ntp access-group peer 90 
ntp access-group serve-only 91 

What you are really doing is telling the router that it can't "peer" with anything (allow time requests and allow the system itself to synchronize). However, processing of an incoming time request will continue down the list and hit the "ntp access-group serve-only 91" command. This allows time requests from the hosts permitted in the access-list.

In our case host 10.1.2.10 can get its time from the local system.

The example above is for demonstration purposes, since "ntp access-group peer 90" is the same as not having specified the "ntp access-group peer" command in the first place.

So you see, an incoming request goes down the list of things "allowed", and if it finds itself permitted by anything, it succeeds. However, if it reaches the end of the list and nothing has permitted the request, it is discarded.

Caution

In my example, I have actually locked out any chance for the router itself to synchronize its time. This is because only the peer and serve-only groups are configured, and of those two, only the peer option allows the router itself to synchronize; in this example, that option denies everything.
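To tie it all together, here is a small Python sketch of the evaluation logic as I understand it. This is a simplified model of the documented behaviour, not the actual IOS implementation, and the access-type names are my own:

```python
# Simplified model of "ntp access-group" processing: each
# configured group grants a set of access types; a request is
# allowed if some group's ACL permits the source AND that group
# grants the requested access type. Groups are scanned from most
# open to least open: peer, serve, serve-only, query-only.

GRANTS = {
    "peer":       {"time-request", "control-query", "sync-self"},
    "serve":      {"time-request", "control-query"},
    "serve-only": {"time-request"},
    "query-only": {"control-query"},
}

def allowed(groups, source, access_type):
    """groups: {category: set of permitted sources (modelling the ACL)}."""
    for category in ("peer", "serve", "serve-only", "query-only"):
        acl = groups.get(category)
        if acl is not None and source in acl and access_type in GRANTS[category]:
            return True
    return False  # end of list and nothing permitted: discard

# access-list 90 deny any / access-list 91 permit 10.1.2.10
groups = {"peer": set(), "serve-only": {"10.1.2.10"}}
print(allowed(groups, "10.1.2.10", "time-request"))  # True
print(allowed(groups, "10.1.2.10", "sync-self"))     # False - locked out
```

The second call shows the lockout from the caution above: no configured group that permits anything also grants the router the right to synchronize itself.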