MPLS VPNs over mGRE


This blog post outlines what “MPLS VPNs over mGRE” is all about and provides an example of such a configuration.

So what is “MPLS VPNs over mGRE”? Basically, it’s taking regular MPLS VPNs and running them over an IP-only core network. Since VPNs are one of the primary drivers for implementing an MPLS network in the first place, getting the same functionality over an IP-only core can be very compelling for anyone not willing or able to run MPLS label switching in the core.

Instead of using transport labels to switch the traffic from one PE to another, mGRE (Multipoint GRE) is used as the encapsulation technology.

Be advised that one label is still being used, however. This is the VPN label, which identifies the VRF interface to switch the traffic to when it’s received by a PE. Just as in regular MPLS VPNs, this label is assigned by the PE and advertised through MP-BGP.

So how is this actually performed? Well, let’s take a look at an example.

The topology I will be using is as follows:

Topology for MPLS VPNs over mGRE

Note: I ran into an issue with VIRL that kept the EIGRP adjacency between CSR-3 and R3 from establishing, so I will not be using R3 in the examples to come. I reported this behavior on the VIRL community forums, in case you are interested.

In this topology we have a core network consisting of CSR-1 to CSR-5. They are all running OSPF in area 0. No MPLS is configured, so it’s pure IP routing end-to-end.

Let’s take a look at CSR-5’s RIB:

CSR-5#sh ip route | beg Gateway
Gateway of last resort is not set
      1.0.0.0/32 is subnetted, 1 subnets
O        1.1.1.1 [110/2] via 192.168.15.1, 00:39:00, GigabitEthernet2
      2.0.0.0/32 is subnetted, 1 subnets
O        2.2.2.2 [110/2] via 192.168.25.2, 00:38:50, GigabitEthernet3
      3.0.0.0/32 is subnetted, 1 subnets
O        3.3.3.3 [110/2] via 192.168.35.3, 00:38:50, GigabitEthernet4
      4.0.0.0/32 is subnetted, 1 subnets
O        4.4.4.4 [110/2] via 192.168.45.4, 00:39:10, GigabitEthernet5
      5.0.0.0/32 is subnetted, 1 subnets
C        5.5.5.5 is directly connected, Loopback0
      192.168.15.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.168.15.0/24 is directly connected, GigabitEthernet2
L        192.168.15.5/32 is directly connected, GigabitEthernet2
      192.168.25.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.168.25.0/24 is directly connected, GigabitEthernet3
L        192.168.25.5/32 is directly connected, GigabitEthernet3
      192.168.35.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.168.35.0/24 is directly connected, GigabitEthernet4
L        192.168.35.5/32 is directly connected, GigabitEthernet4
      192.168.45.0/24 is variably subnetted, 2 subnets, 2 masks
C        192.168.45.0/24 is directly connected, GigabitEthernet5
L        192.168.45.5/32 is directly connected, GigabitEthernet5

And to verify that we are not running any MPLS switching:

CSR-5#sh mpls for
Local      Outgoing   Prefix           Bytes Label   Outgoing   Next Hop
Label      Label      or Tunnel Id     Switched      interface

So the RIB holds our connected interfaces along with the loopbacks of all the routers in the core network, and the MPLS forwarding table is empty.

Let’s take a look at CSR-1’s configuration, its VRF configuration in particular:

vrf definition CUST-A
 rd 100:100
 !
 address-family ipv4
  route-target export 100:100
  route-target import 100:100
 exit-address-family
!
!
interface GigabitEthernet2
 vrf forwarding CUST-A
 ip address 10.0.1.1 255.255.255.0
 negotiation auto
!
router eigrp 1
 !
 address-family ipv4 vrf CUST-A autonomous-system 100
  redistribute bgp 1 metric 1 1 1 1 1
  network 0.0.0.0
 exit-address-family

We have our VRF CUST-A configured with an RD of 100:100 and with 100:100 as both the import and export route target, just as we would for a regular MPLS L3 VPN.
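If the RD/route-target split is new to you, here is a small Python sketch of the idea: the RD makes the prefix unique as a VPNv4 route, while the route targets decide which VRFs import it. This is only a model for illustration; the class names are mine, and CUST-B is a hypothetical second VRF that does not exist in this lab.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VpnRoute:
    rd: str                   # route distinguisher prepended to the prefix
    prefix: str
    route_targets: frozenset  # extended communities attached on export


@dataclass(frozen=True)
class Vrf:
    name: str
    rd: str
    import_rts: frozenset
    export_rts: frozenset

    def export(self, prefix: str) -> VpnRoute:
        # Exporting tags the route with the VRF's RD and export route targets.
        return VpnRoute(self.rd, prefix, self.export_rts)

    def imports(self, route: VpnRoute) -> bool:
        # A VRF imports a route if at least one route target matches.
        return bool(self.import_rts & route.route_targets)


cust_a = Vrf("CUST-A", "100:100", frozenset({"100:100"}), frozenset({"100:100"}))
# Hypothetical second customer, only here to show a non-matching import.
cust_b = Vrf("CUST-B", "200:200", frozenset({"200:200"}), frozenset({"200:200"}))

route = cust_a.export("100.100.100.1/32")
vpnv4_key = f"{route.rd}:{route.prefix}"
print(vpnv4_key)
print(cust_a.imports(route), cust_b.imports(route))
```

The `vpnv4_key` string has the same RD:prefix shape that BGP prints for VPNv4 table entries.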

GigabitEthernet2 is our attachment circuit to CUST-A. In addition, we have EIGRP AS 100 running as the VRF-aware IGP towards R1. Finally, we are redistributing BGP into EIGRP within the VRF.

Let’s make sure we are receiving routes from R1 into the VRF RIB:

CSR-1#sh ip route vrf CUST-A eigrp | beg Gateway
Gateway of last resort is not set
      100.0.0.0/32 is subnetted, 3 subnets
D        100.100.100.1 [90/130816] via 10.0.1.100, 00:45:37, GigabitEthernet2

Looks good, we are receiving the loopback prefix from R1. This is as we would expect.

A similar configuration exists on CSR-2, CSR-3 and CSR-4. Nothing different from a regular MPLS L3 VPN service.

Now for the core configuration utilizing MP-BGP.

We are using CSR-5 as a VPNv4 route reflector in order to avoid having a full mesh of iBGP sessions.

So the configuration on CSR-5 looks like this:

CSR-5#sh run | sec router bgp
router bgp 1
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 1.1.1.1 remote-as 1
 neighbor 1.1.1.1 update-source Loopback0
 neighbor 2.2.2.2 remote-as 1
 neighbor 2.2.2.2 update-source Loopback0
 neighbor 3.3.3.3 remote-as 1
 neighbor 3.3.3.3 update-source Loopback0
 neighbor 4.4.4.4 remote-as 1
 neighbor 4.4.4.4 update-source Loopback0
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 1.1.1.1 activate
  neighbor 1.1.1.1 send-community extended
  neighbor 1.1.1.1 route-reflector-client
  neighbor 2.2.2.2 activate
  neighbor 2.2.2.2 send-community extended
  neighbor 2.2.2.2 route-reflector-client
  neighbor 3.3.3.3 activate
  neighbor 3.3.3.3 send-community extended
  neighbor 3.3.3.3 route-reflector-client
  neighbor 4.4.4.4 activate
  neighbor 4.4.4.4 send-community extended
  neighbor 4.4.4.4 route-reflector-client
 exit-address-family

Pretty straightforward really.
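Since the route-reflector configuration is the same three lines per client, it lends itself to being generated. Below is a minimal Python sketch that renders the neighbor statements from a list of PE loopbacks. The function name and parameters are my own invention, it only emits commands that already appear in the output above, and it leaves out the empty "address-family ipv4" stanza.

```python
def rr_bgp_config(local_as: int, clients: list[str], source: str = "Loopback0") -> str:
    """Render a VPNv4 route-reflector BGP stanza for the given client loopbacks."""
    lines = [
        f"router bgp {local_as}",
        " bgp log-neighbor-changes",
        " no bgp default ipv4-unicast",
    ]
    # Global neighbor definitions: iBGP sessions sourced from the loopback.
    for ip in clients:
        lines.append(f" neighbor {ip} remote-as {local_as}")
        lines.append(f" neighbor {ip} update-source {source}")
    lines += [" !", " address-family vpnv4"]
    # Per-neighbor VPNv4 activation, extended communities and RR client role.
    for ip in clients:
        lines.append(f"  neighbor {ip} activate")
        lines.append(f"  neighbor {ip} send-community extended")
        lines.append(f"  neighbor {ip} route-reflector-client")
    lines.append(" exit-address-family")
    return "\n".join(lines)


pe_loopbacks = ["1.1.1.1", "2.2.2.2", "3.3.3.3", "4.4.4.4"]
config = rr_bgp_config(1, pe_loopbacks)
print(config)
```

Adding another PE then only means adding its loopback to the list.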

Then on CSR-1:

CSR-1#sh run | sec router bgp
router bgp 1
 bgp log-neighbor-changes
 no bgp default ipv4-unicast
 neighbor 5.5.5.5 remote-as 1
 neighbor 5.5.5.5 update-source Loopback0
 !
 address-family ipv4
 exit-address-family
 !
 address-family vpnv4
  neighbor 5.5.5.5 activate
  neighbor 5.5.5.5 send-community extended
  neighbor 5.5.5.5 route-map MODIFY-INBOUND in
 exit-address-family
 !
 address-family ipv4 vrf CUST-A
  redistribute eigrp 100
 exit-address-family

Here we have a single neighbor configured (CSR-5 being the RR), with the session sourced from our loopback address. We are also redistributing routes from the VRF into BGP for VPNv4 announcements to the other PEs. What’s really important (and differs from regular MPLS L3 VPNs) is the route-map we apply inbound (MODIFY-INBOUND). Let’s take a closer look at that:

CSR-1#sh route-map
route-map MODIFY-INBOUND, permit, sequence 10
  Match clauses:
  Set clauses:
    ip next-hop encapsulate l3vpn L3VPN-PROFILE
  Policy routing matches: 0 packets, 0 bytes

So all this does is tell the router to encapsulate traffic towards the next hop according to an L3VPN profile called L3VPN-PROFILE. This is really the heart of the technology. Let’s look at the profile in more detail:

CSR-1#sh run | beg L3VPN
l3vpn encapsulation ip L3VPN-PROFILE
 !

Well, that wasn’t very informative. It simply defines a profile with our desired name and default settings (which means mGRE).

You can get more detail by using the show commands:

CSR-1#sh l3vpn encapsulation ip
 Profile: L3VPN-PROFILE
  transport ipv4 source Auto: Loopback0
  protocol gre
  payload mpls
   mtu default
  Tunnel Tunnel0 Created [OK]
  Tunnel Linestate [OK]
  Tunnel Transport Source (Auto) Loopback0 [OK]

So this tells us that Loopback0 was chosen by default as the source of the tunnel and that Tunnel0 was created automatically. Let’s take a look at Tunnel0 in more detail:

CSR-1#sh interface Tunnel0
 Tunnel0 is up, line protocol is up
   Hardware is Tunnel
   Interface is unnumbered. Using address of Loopback0 (1.1.1.1)
   MTU 9976 bytes, BW 10000 Kbit/sec, DLY 50000 usec,
      reliability 255/255, txload 1/255, rxload 1/255
   Encapsulation TUNNEL, loopback not set
   Keepalive not set
   Tunnel linestate evaluation up
   Tunnel source 1.1.1.1 (Loopback0)
    Tunnel Subblocks:
       src-track:
          Tunnel0 source tracking subblock associated with Loopback0
           Set of tunnels with source Loopback0, 1 member (includes iterators), on interface <OK>
   Tunnel protocol/transport multi-GRE/IP
     Key disabled, sequencing disabled
     Checksumming of packets disabled
   Tunnel TTL 255, Fast tunneling enabled
   Tunnel transport MTU 1476 bytes
   Tunnel transmit bandwidth 8000 (kbps)
   Tunnel receive bandwidth 8000 (kbps)
   Last input never, output never, output hang never
   Last clearing of "show interface" counters 00:54:16
   Input queue: 0/375/0/0 (size/max/drops/flushes); Total output drops: 3
   Queueing strategy: fifo
   Output queue: 0/0 (size/max)
   5 minute input rate 0 bits/sec, 0 packets/sec
   5 minute output rate 0 bits/sec, 0 packets/sec
      0 packets input, 0 bytes, 0 no buffer
      Received 0 broadcasts (0 IP multicasts)
      0 runts, 0 giants, 0 throttles
      0 input errors, 0 CRC, 0 frame, 0 overrun, 0 ignored, 0 abort
      0 packets output, 0 bytes, 0 underruns
      0 output errors, 0 collisions, 0 interface resets
      0 unknown protocol drops
      0 output buffer failures, 0 output buffers swapped out

What’s important here is that the Tunnel protocol/transport is multi-GRE/IP, which is the whole point of it all.
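If you want to sanity-check the 1476-byte tunnel transport MTU, the arithmetic is simple. Here is a rough Python sketch, assuming a 1500-byte MTU on the core links (my assumption, it is not shown in the output above), a 20-byte outer IPv4 header without options, and a 4-byte GRE header, since key, sequencing and checksumming are all disabled. The function names are made up for illustration.

```python
OUTER_IPV4_BYTES = 20  # outer IPv4 header, no options
GRE_BASE_BYTES = 4     # GRE header without key, sequence number or checksum
MPLS_ENTRY_BYTES = 4   # one label stack entry (the VPN label)


def tunnel_transport_mtu(link_mtu: int) -> int:
    """Bytes left for the GRE payload after the outer IPv4 and GRE headers."""
    return link_mtu - OUTER_IPV4_BYTES - GRE_BASE_BYTES


def max_customer_packet(link_mtu: int, labels: int = 1) -> int:
    """Largest customer IP packet that fits in one core packet unfragmented."""
    return tunnel_transport_mtu(link_mtu) - labels * MPLS_ENTRY_BYTES


def outer_packet_length(inner_len: int, labels: int = 1) -> int:
    """Outer IPv4 total length for a customer packet of inner_len bytes."""
    return inner_len + labels * MPLS_ENTRY_BYTES + GRE_BASE_BYTES + OUTER_IPV4_BYTES


print(tunnel_transport_mtu(1500))  # 1476, matching "Tunnel transport MTU"
print(max_customer_packet(1500))   # 1472
print(outer_packet_length(100))    # 128
```

The last value lines up with the packet capture further down, where a 100-byte ping is carried in an outer IP packet with a total length of 0x0080 (128) bytes.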

So to recap: when we receive prefixes reflected by our RR (the RR is beside the point, it could just as well be a full mesh), the BGP next hop is the other PE’s loopback address, and the inbound route-map tells the router to do the mGRE encapsulation when traffic is to be routed to these prefixes.

Let’s take a look at our BGP table on CSR-1:

CSR-1#sh bgp vpnv4 uni vrf CUST-A | beg Network
     Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 100:100 (default for vrf CUST-A)
 *>  10.0.1.0/24      0.0.0.0                  0         32768 ?
 *>i 10.0.2.0/24      2.2.2.2                  0    100      0 ?
 *>i 10.0.3.0/24      3.3.3.3                  0    100      0 ?
 *>i 10.0.4.0/24      4.4.4.4                  0    100      0 ?
 *>  100.100.100.1/32 10.0.1.100          130816         32768 ?
 *>i 100.100.100.2/32 2.2.2.2             130816    100      0 ?
 *>i 100.100.100.4/32 4.4.4.4             130816    100      0 ?

(Note: Remember that the CSR-3 to R3 adjacency is broken because of VIRL, which is why no loopback prefix is learned via 3.3.3.3.)

Let’s take a look at what information is present for 100.100.100.2/32:

CSR-1#sh bgp vpnv4 uni vrf CUST-A 100.100.100.2/32
 BGP routing table entry for 100:100:100.100.100.2/32, version 19
 Paths: (1 available, best #1, table CUST-A)
   Not advertised to any peer
   Refresh Epoch 1
   Local
     2.2.2.2 (metric 3) (via default) (via Tunnel0) from 5.5.5.5 (5.5.5.5)
       Origin incomplete, metric 130816, localpref 100, valid, internal, best
       Extended Community: RT:100:100 Cost:pre-bestpath:128:130816
         0x8800:32768:0 0x8801:100:128256 0x8802:65281:2560 0x8803:65281:1500
         0x8806:0:1684300802
       Originator: 2.2.2.2, Cluster list: 5.5.5.5
       mpls labels in/out nolabel/17
       rx pathid: 0, tx pathid: 0x0

The important things to note here are that the next hop 2.2.2.2 (CSR-2) is reached via Tunnel0, and that we are being told to use label 17 as the VPN label when sending traffic for this prefix to it.

And finally, let’s take a look at what CEF thinks about it all:

CSR-1#sh ip cef vrf CUST-A 100.100.100.2 detail
100.100.100.2/32, epoch 0, flags [rib defined all labels]
  nexthop 2.2.2.2 Tunnel0 label 17

So CEF will assign label 17 to the packet and then use Tunnel0 to reach CSR-2. Just as we would expect.
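To make the "label plus tunnel" step a bit more concrete, here is a minimal Python sketch of the bytes that end up between the outer IP header and the customer packet: a 4-byte GRE header with protocol type 0x8847 (MPLS unicast), followed by one MPLS label stack entry. The helper names are hypothetical, and the label TTL of 254 is simply the value seen in the capture further down. This is a byte-level illustration, not how IOS implements it.

```python
import struct

GRE_PROTO_MPLS_UNICAST = 0x8847  # EtherType carried in the GRE protocol field


def gre_header(protocol: int = GRE_PROTO_MPLS_UNICAST) -> bytes:
    """Basic GRE header: 16 bits of flags/version (all zero) plus protocol type."""
    return struct.pack("!HH", 0, protocol)


def mpls_entry(label: int, ttl: int, tc: int = 0, bottom: bool = True) -> bytes:
    """One MPLS label stack entry: 20-bit label, 3-bit TC, 1-bit S, 8-bit TTL."""
    if not 0 <= label < 2**20:
        raise ValueError("label must fit in 20 bits")
    if not 0 <= tc < 8 or not 0 <= ttl < 256:
        raise ValueError("tc must fit in 3 bits and ttl in 8 bits")
    word = (label << 12) | (tc << 9) | (int(bottom) << 8) | ttl
    return struct.pack("!I", word)


# VPN label 17 from the BGP and CEF output; it is the only label, so S=1.
header = gre_header() + mpls_entry(label=17, ttl=254)
print(header.hex())  # 00008847000111fe
```

The same eight bytes (0000 8847 0001 11FE) show up in every packet of the capture below, right after the outer IP header.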

As a final verification, I’ve done an Embedded Packet Capture on CSR-5 while pinging from R1’s loopback to R2’s loopback, and this is what it shows:

   6  142   29.028990   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800000 0000FF2F B5490101 01010202   ......./.I......
  0020:  02020000 88470001 11FE4500 00640000   .....G....E..d..
  0030:  0000FE01 2BCD6464 64016464 64020800   ....+.ddd.ddd...
   7  142   29.106989   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800001 0000FF2F B5480101 01010202   ......./.H......
  0020:  02020000 88470001 11FE4500 00640001   .....G....E..d..
  0030:  0000FE01 2BCC6464 64016464 64020800   ....+.ddd.ddd...
   8  142   29.184988   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800002 0000FF2F B5470101 01010202   ......./.G......
  0020:  02020000 88470001 11FE4500 00640002   .....G....E..d..
  0030:  0000FE01 2BCB6464 64016464 64020800   ....+.ddd.ddd...
   9  142   29.241037   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800003 0000FF2F B5460101 01010202   ......./.F......
  0020:  02020000 88470001 11FE4500 00640003   .....G....E..d..
  0030:  0000FE01 2BCA6464 64016464 64020800   ....+.ddd.ddd...
  10  142   29.287024   1.1.1.1          ->  2.2.2.2          GRE
  0000:  FA163E39 EC39FA16 3ECEC705 08004500   ..>9.9..>.....E.
  0010:  00800004 0000FF2F B5450101 01010202   ......./.E......
  0020:  02020000 88470001 11FE4500 00640004   .....G....E..d..
  0030:  0000FE01 2BC96464 64016464 64020800   ....+.ddd.ddd...

As you can see, the traffic between 1.1.1.1 and 2.2.2.2 is GRE encapsulated, just as expected.
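The hex dump actually tells us more than just "GRE". Here is a short Python sketch that decodes the first captured frame (packet 6) by hand. It assumes fixed offsets, meaning an untagged Ethernet frame, no IP options and a basic 4-byte GRE header, which holds for this capture but is not a general-purpose parser. The function name is my own.

```python
import ipaddress
import struct

# First 64 bytes of packet 6, copied from the capture above.
FRAME_HEX = (
    "FA163E39EC39FA163ECEC70508004500"
    "008000000000FF2FB549010101010202"
    "020200008847000111FE450000640000"
    "0000FE012BCD64646401646464020800"
)


def decode(frame: bytes) -> dict:
    """Decode Ethernet / IPv4 / GRE / MPLS / IPv4 using fixed offsets."""
    outer = frame[14:34]                                     # outer IPv4 header
    gre_flags, gre_proto = struct.unpack_from("!HH", frame, 34)
    (entry,) = struct.unpack_from("!I", frame, 38)           # MPLS label stack entry
    inner = frame[42:62]                                     # customer IPv4 header
    return {
        "outer_src": str(ipaddress.IPv4Address(outer[12:16])),
        "outer_dst": str(ipaddress.IPv4Address(outer[16:20])),
        "outer_proto": outer[9],                             # 47 = GRE
        "outer_len": struct.unpack_from("!H", outer, 2)[0],
        "gre_flags": gre_flags,
        "gre_proto": hex(gre_proto),                         # 0x8847 = MPLS unicast
        "label": entry >> 12,
        "bottom_of_stack": bool((entry >> 8) & 1),
        "label_ttl": entry & 0xFF,
        "inner_src": str(ipaddress.IPv4Address(inner[12:16])),
        "inner_dst": str(ipaddress.IPv4Address(inner[16:20])),
        "inner_proto": inner[9],                             # 1 = ICMP
    }


info = decode(bytes.fromhex(FRAME_HEX))
for field, value in info.items():
    print(f"{field:16} {value}")
```

The decode shows an outer IP packet from 1.1.1.1 to 2.2.2.2 with protocol 47 (GRE), a GRE protocol type of 0x8847 (MPLS), a single label 17 with the bottom-of-stack bit set, and inside that the ICMP packet from 100.100.100.1 to 100.100.100.2. Label 17 is the same VPN label we saw in the BGP and CEF output.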

So that’s all there is to this technology. Very useful if you have an IP-only core network.

I hope it’s been useful. I will soon attach all the configurations for the routers in case you want to take a closer look.

Thanks for reading!

Update: Link to configs here