Author Archives: kim

VxLAN on the CSR1Kv

By now, VxLAN is becoming the standard way of tunneling in the Datacenter.
Using VxLAN, I will show how to use the CSR1Kv to extend your Datacenter L2 reach between sites as well.

First off, what is VxLAN?
It stands for Virtual Extensible LAN. Basically you have a way of decoupling your VLANs into a new scheme.

You map each VLAN to a VNI (VXLAN Network Identifier), which in essence makes your VLAN numbering scheme locally significant.

Also, since the VNI is a 24-bit identifier, you have a lot more flexibility than the regular 4096 definable VLANs (12-bit 802.1Q tags).
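To put numbers on that difference, here is a quick sketch of the arithmetic. It only compares the raw size of the two ID spaces and ignores reserved values.

```python
# Compare the ID space of a 12-bit 802.1Q VLAN ID with a 24-bit VXLAN VNI.
VLAN_ID_BITS = 12
VNI_BITS = 24

vlan_ids = 2 ** VLAN_ID_BITS   # number of distinct VLAN IDs
vni_ids = 2 ** VNI_BITS        # number of distinct VNIs

print(f"VLAN IDs: {vlan_ids}")                 # 4096
print(f"VNIs:     {vni_ids}")                  # 16777216
print(f"Ratio:    {vni_ids // vlan_ids}x")     # 4096x
```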

Each endpoint that does the encapsulation/decapsulation is called a VTEP (VxLAN Tunnel EndPoint). In our example this would be CSR3 and CSR5.

After the VxLAN header is added, the packet is further encapsulated into a UDP packet and forwarded across the network. This is a great solution as it doesn’t impose any technical restrictions on the core of the network. Only the VTEPs need to understand VxLAN (and probably have hardware acceleration for it as well).
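As a rough illustration of what a VTEP puts in front of the original frame, the sketch below builds and parses the 8-byte VxLAN header (one flags byte, 24 reserved bits, the 24-bit VNI, 8 reserved bits). The function names are my own, and the VNI value is simply the one used later in this post.

```python
import struct

VXLAN_UDP_PORT = 4789          # standard VxLAN destination port
VXLAN_FLAG_VNI_VALID = 0x08    # "I" flag: the VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Return the 8-byte VxLAN header for the given 24-bit VNI."""
    if not 0 <= vni < 2 ** 24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) followed by 24 reserved bits.
    # Word 2: VNI (24 bits) followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from an 8-byte VxLAN header."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI-valid flag is not set")
    return vni_word >> 8


header = build_vxlan_header(1000100)
print(header.hex())        # 080000000f42a400
print(parse_vni(header))   # 1000100
```

The header itself is then carried as the payload of a UDP datagram sent to port 4789, followed by the original L2 frame.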

Since we won’t be using BGP EVPN, we will rely solely on multicast in the network to discover which VTEPs are involved for the traffic in question. The only supported mode is BiDir mode, which is an optimization of the control plane (not the data plane), since it only keeps (*,G) entries in its multicast routing tables.
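To get a feel for why keeping only (*,G) state matters, here is a back-of-the-envelope sketch. The function and the scale numbers are made up for illustration, and the non-BiDir figure is a rough upper bound that assumes every source ends up with its own (S,G) entry.

```python
def mroute_entries(groups: int, sources_per_group: int, bidir: bool) -> int:
    """Rough upper bound on multicast route entries a core router may hold."""
    shared_tree = groups                        # one (*,G) per group
    if bidir:
        return shared_tree                      # BiDir never creates (S,G) state
    source_trees = groups * sources_per_group   # one (S,G) per active source
    return shared_tree + source_trees


# Hypothetical scale: 100 flood groups, each with 50 VTEPs acting as sources.
print(mroute_entries(100, 50, bidir=False))   # 5100
print(mroute_entries(100, 50, bidir=True))    # 100
```

With VxLAN every VTEP is both a sender and a receiver for its flood groups, which is why the number of sources grows with the number of VTEPs.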

Let’s take a look at the topology I will be using for the example:

 

I have used a regular IOS based device in Site 1 and Site 2, to represent our L2 devices. These could be servers or end-clients for that matter. What I want to accomplish is to run EIGRP between R1 and R2 over the “fabric” using VxLAN as the tunneling mechanism.

CSR3 is the VTEP for Site 1 and CSR5 is the VTEP for Site 2.

In the “fabric” we have CSR4, along with its loopback0 (4.4.4.4/32), which is the BiDir RP. It announces this using BSR so that CSR3 and CSR5 know this RP information (along with the BiDir functionality). We are using OSPF as the IGP in the “fabric” to establish routing between the loopback interfaces, which are the VTEP source addresses for CSR3 and CSR5 respectively.

Let’s verify that routing between the loopbacks is working and that our RIB is correct:

On CSR3:

CSR3#sh ip route | beg Gate
Gateway of last resort is not set

      3.0.0.0/32 is subnetted, 1 subnets
C        3.3.3.3 is directly connected, Loopback0
      4.0.0.0/32 is subnetted, 1 subnets
O        4.4.4.4 [110/2] via 10.3.4.4, 00:38:27, GigabitEthernet2
      5.0.0.0/32 is subnetted, 1 subnets
O        5.5.5.5 [110/3] via 10.3.4.4, 00:38:27, GigabitEthernet2
      10.0.0.0/8 is variably subnetted, 3 subnets, 2 masks
C        10.3.4.0/24 is directly connected, GigabitEthernet2
L        10.3.4.3/32 is directly connected, GigabitEthernet2
O        10.4.5.0/24 [110/2] via 10.3.4.4, 00:38:27, GigabitEthernet2

CSR3#ping 5.5.5.5 so loo0
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 5.5.5.5, timeout is 2 seconds:
Packet sent with a source address of 3.3.3.3 
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/7/22 ms

This means we have full reachability through the “fabric” from VTEP to VTEP.

Let’s make sure our multicast routing is working properly. We will take a look at CSR4 first, since it’s the RP for the network:

CSR4#sh run | incl ip pim|interface
interface Loopback0
 ip pim sparse-mode
interface GigabitEthernet1
 ip pim sparse-mode
interface GigabitEthernet2
 ip pim sparse-mode
interface GigabitEthernet3
interface GigabitEthernet4
ip pim bidir-enable
ip pim bsr-candidate Loopback0 0
ip pim rp-candidate Loopback0 bidir

We can see from this output that we are running PIM on all the relevant interfaces as well as making sure that bidir is enabled. We have also verified that we are indeed running BSR to announce Loopback0 as the RP.

Let’s verify the multicast routing table:

CSR4#sh ip mroute | beg Outgoing   
Outgoing interface flags: H - Hardware switched, A - Assert winner, p - PIM Join
 Timers: Uptime/Expires
 Interface state: Interface, Next-Hop or VCD, State/Mode

(*,224.0.0.0/4), 00:45:05/-, RP 4.4.4.4, flags: B
  Bidir-Upstream: Loopback0, RPF nbr: 4.4.4.4
  Incoming interface list:
    GigabitEthernet2, Accepting/Sparse
    GigabitEthernet1, Accepting/Sparse
    Loopback0, Accepting/Sparse

(*, 239.1.1.1), 00:44:03/00:02:46, RP 4.4.4.4, flags: B
  Bidir-Upstream: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    GigabitEthernet1, Forward/Sparse, 00:44:03/00:02:38
    GigabitEthernet2, Forward/Sparse, 00:44:03/00:02:46

(*, 224.0.1.40), 00:45:05/00:01:56, RP 0.0.0.0, flags: DCL
  Incoming interface: Null, RPF nbr 0.0.0.0
  Outgoing interface list:
    Loopback0, Forward/Sparse, 00:45:04/00:01:56

We can see that we do have some (*,G) entries installed (more on the (*, 239.1.1.1) later).
Excellent.

Now let’s look at what CSR3 has learned about the RP:

CSR3#sh ip pim rp mapping 
PIM Group-to-RP Mappings

Group(s) 224.0.0.0/4
  RP 4.4.4.4 (?), v2, bidir
    Info source: 4.4.4.4 (?), via bootstrap, priority 0, holdtime 150
         Uptime: 00:45:39, expires: 00:02:23

We see that we have learned the RP and its BiDir functionality, and that it was learned through BSR.

So far so good. Now let’s turn our attention to the VxLAN part of the configuration.

The VTEP functionality is implemented by a new interface type, called an NVE. This is where you define which source address to use, along with the multicast group to use for flooding.

This is the configuration for CSR3:

CSR3#sh run int nve1
Building configuration...

Current configuration : 137 bytes
!
interface nve1
 no ip address
 source-interface Loopback0
 member vni 1000100 mcast-group 239.1.1.1
 no mop enabled
 no mop sysid
end

What’s important here is that we source our VTEP from loopback0 (3.3.3.3/32) and use multicast group 239.1.1.1 for VNI 1000100. The VNI can be whatever number you choose; I have just chosen a very large number that encodes which VLAN the VNI is used for (VLAN 100).
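The numbering convention is purely my own choice and not anything VxLAN requires, but it is easy to express as a small helper. The base offset and function names below are hypothetical.

```python
VNI_BASE = 1_000_000   # hypothetical offset used only for this numbering scheme


def vni_for_vlan(vlan_id: int) -> int:
    """Map a VLAN ID to a VNI that visibly encodes the VLAN number."""
    if not 1 <= vlan_id <= 4094:
        raise ValueError("VLAN ID must be between 1 and 4094")
    return VNI_BASE + vlan_id


def vlan_for_vni(vni: int) -> int:
    """Recover the VLAN ID from a VNI created by vni_for_vlan()."""
    return vni - VNI_BASE


print(vni_for_vlan(100))       # 1000100
print(vlan_for_vni(1000100))   # 100
```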

On the opposite side, we have a similar configuration for the NVE:

CSR5#sh run int nve1
Building configuration...

Current configuration : 137 bytes
!
interface nve1
 no ip address
 source-interface Loopback0
 member vni 1000100 mcast-group 239.1.1.1
 no mop enabled
 no mop sysid
end

It’s very important that the multicast group matches on both sides, as this is the group they will use to flood BUM (Broadcast, Unknown unicast and Multicast) traffic, for example ARP.

The next configuration piece is that we need to create an EFP (Ethernet Flow Point) on the interface towards the site routers (R1 and R2) where we accept traffic tagged with vlan 100:

CSR3#sh run int g1
Building configuration...

Current configuration : 195 bytes
!
interface GigabitEthernet1
 no ip address
 negotiation auto
 no mop enabled
 no mop sysid
 service instance 100 ethernet
  encapsulation dot1q 100
  rewrite ingress tag pop 1 symmetric
 !
end

This configuration piece states that the encapsulation is dot1q VLAN 100, that the tag is stripped inbound before further processing, and that it is added again on egress.
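To make the pop/push behaviour concrete, here is a small sketch that strips and re-adds an 802.1Q tag on a raw Ethernet frame. The frame contents are made up, the function names are my own, and the push side ignores the priority bits for simplicity.

```python
import struct
from typing import Tuple

TPID_DOT1Q = 0x8100   # EtherType value that marks an 802.1Q tag


def pop_dot1q(frame: bytes) -> Tuple[int, bytes]:
    """Strip the outer 802.1Q tag; return (vlan_id, untagged frame)."""
    tpid, tci = struct.unpack("!HH", frame[12:16])
    if tpid != TPID_DOT1Q:
        raise ValueError("frame is not 802.1Q tagged")
    # The tag sits right after the two 6-byte MAC addresses.
    return tci & 0x0FFF, frame[:12] + frame[16:]


def push_dot1q(frame: bytes, vlan_id: int) -> bytes:
    """Insert an 802.1Q tag (priority 0) after the MAC addresses."""
    tag = struct.pack("!HH", TPID_DOT1Q, vlan_id & 0x0FFF)
    return frame[:12] + tag + frame[12:]


# Made-up frame: dst MAC, src MAC, tag for VLAN 100, EtherType IPv4, payload.
dst = bytes.fromhex("aabbcc002000")
src = bytes.fromhex("aabbcc001000")
tagged = dst + src + struct.pack("!HH", TPID_DOT1Q, 100) + b"\x08\x00" + b"payload"

vlan, untagged = pop_dot1q(tagged)       # ingress: tag removed
restored = push_dot1q(untagged, vlan)    # egress: tag added back
print(vlan, len(tagged) - len(untagged), restored == tagged)   # 100 4 True
```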

Now for the piece that ties it all together, namely the bridge-domain:

bridge-domain 100 
 member vni 1000100
 member GigabitEthernet1 service-instance 100

Here we have a bridge domain configuration where we have 2 members. The local interface G1 on its service instance 100 and our VNI / VTEP. This is basically the glue to tie the bridge domain together end to end.

The same configuration is present on CSR5 as well.

Let’s verify the control plane on CSR3:

CSR3#sh bridge-domain 100
Bridge-domain 100 (2 ports in all)
State: UP                    Mac learning: Enabled
Aging-Timer: 300 second(s)
    GigabitEthernet1 service instance 100
    vni 1000100
   AED MAC address    Policy  Tag       Age  Pseudoport
   0   AABB.CC00.1000 forward dynamic   298  GigabitEthernet1.EFP100
   0   AABB.CC00.2000 forward dynamic   300  nve1.VNI1000100, VxLAN 
                                             src: 3.3.3.3 dst: 5.5.5.5

This command will show the MAC addresses learned in this particular bridge domain. On our EFP on G1 we have dynamically learned the MAC address of R1’s interface and through the NVE1 interface using VNI 1000100 we have learned the MAC address of R2. Pay attention to the fact that we know which VTEP endpoints to send the traffic to now. This means that further communication between these two end-hosts (R1 and R2) is done solely using unicast between 3.3.3.3 and 5.5.5.5 using VxLAN as the tunneling mechanism.
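The flood-and-learn behaviour described above can be sketched as a toy MAC table for a single VNI. This is only an illustration of the idea, not how the CSR implements it: the class is my own invention, and it treats only broadcast and unknown destinations as flooded traffic.

```python
from typing import Dict, Optional

FLOOD_GROUP = "239.1.1.1"   # multicast group configured for VNI 1000100


class VtepMacTable:
    """Toy flood-and-learn table for one VNI on a VTEP."""

    def __init__(self) -> None:
        # MAC address -> remote VTEP IP (None means the MAC is local).
        self.entries: Dict[str, Optional[str]] = {}

    def learn(self, mac: str, remote_vtep: Optional[str]) -> None:
        """Record where a source MAC was seen."""
        self.entries[mac.lower()] = remote_vtep

    def outer_destination(self, dst_mac: str) -> Optional[str]:
        """Return the outer IP destination, or None if the MAC is local."""
        mac = dst_mac.lower()
        if mac == "ffff.ffff.ffff" or mac not in self.entries:
            return FLOOD_GROUP       # broadcast or unknown: flood to the group
        return self.entries[mac]     # known unicast: send to a single VTEP


csr3 = VtepMacTable()
csr3.learn("AABB.CC00.1000", None)                  # R1, learned on the EFP
before = csr3.outer_destination("AABB.CC00.2000")   # R2 not yet known
csr3.learn("AABB.CC00.2000", "5.5.5.5")             # R2, learned via nve1
after = csr3.outer_destination("AABB.CC00.2000")
print(before, after)   # 239.1.1.1 5.5.5.5
```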

CSR3#show nve interface nve 1 detail 
Interface: nve1, State: Admin Up, Oper Up, Encapsulation: Vxlan,
BGP host reachability: Disable, VxLAN dport: 4789
VNI number: L3CP 0 L2DP 1
source-interface: Loopback0 (primary:3.3.3.3 vrf:0)
   Pkts In   Bytes In   Pkts Out  Bytes Out
      3273     268627       3278     269026

This command shows the status of our NVE interface. From this we can see that it’s in an Up/Up state. The VxLAN port is the standard destination port (4789) and we have some packets going back and forth.

Now that everything checks out in the control plane, let’s see if the data plane is working by issuing an ICMP ping from R1 to R2 (they are on the same subnet, 192.168.100.0/24):

R1#ping 192.168.100.2
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 192.168.100.2, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/11/26 ms
R1#sh arp
Protocol  Address          Age (min)  Hardware Addr   Type   Interface
Internet  192.168.100.1           -   aabb.cc00.1000  ARPA   Ethernet0/0.100
Internet  192.168.100.2           8   aabb.cc00.2000  ARPA   Ethernet0/0.100

This looks excellent! In fact, the EIGRP peering I had set up between them works as well:

R1#sh ip eigrp neighbors 
EIGRP-IPv4 Neighbors for AS(100)
H   Address                 Interface              Hold Uptime   SRTT   RTO  Q  Seq
                                                   (sec)         (ms)       Cnt Num
0   192.168.100.2           Et0/0.100                12 04:14:30    4   100  0  3

R1#sh ip route eigrp | beg Gateway
Gateway of last resort is not set

      100.0.0.0/32 is subnetted, 2 subnets
D        100.100.100.2 
           [90/409600] via 192.168.100.2, 04:14:46, Ethernet0/0.100

This address is the loopback of R2.

Finally I want to show how the ICMP ping works in the data plane by doing a capture on CSR4’s G2 interface:

Here we can see a ping I issued from R1’s loopback interface towards R2’s loopback interface.
I have expanded the view, so you can see the encapsulation with the VxLAN header running atop the UDP packet.
The outer IP packet carrying the UDP datagram has the VTEP endpoints (3.3.3.3/32 and 5.5.5.5/32) as the source and destination.

The VNI is what we selected to use and is used for differentiation on the VTEP.
Finally we have our L2 packet in its entirety.
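Since the whole L2 frame rides inside the tunnel, it is worth working out what that costs in MTU terms. The sketch below assumes an IPv4 underlay with no IP options, and the function name is my own.

```python
INNER_ETHERNET = 14   # inner Ethernet header carried inside the tunnel
DOT1Q_TAG = 4         # only counted if the inner frame keeps its VLAN tag
VXLAN_HEADER = 8
OUTER_UDP = 8
OUTER_IPV4 = 20       # assumes an IPv4 underlay without options


def underlay_ip_mtu(inner_ip_mtu: int, inner_tagged: bool = False) -> int:
    """IP MTU the underlay needs so the inner IP MTU fits without fragmentation."""
    size = inner_ip_mtu + INNER_ETHERNET + VXLAN_HEADER + OUTER_UDP + OUTER_IPV4
    if inner_tagged:
        size += DOT1Q_TAG
    return size


print(underlay_ip_mtu(1500))                      # 1550
print(underlay_ip_mtu(1500, inner_tagged=True))   # 1554
```

In this lab the dot1q tag is popped on ingress, so the untagged figure is the relevant one.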

That’s all I wanted to show for now. Next time I will extend this a bit and involve BGP as the control plane.
Thanks for reading!

ISIS Authentication types (packet captures)

In this post I would like to highlight a couple of “features” of ISIS.
More specifically, the authentication mechanisms used and how they look in the data plane.

I will do this by configuring a couple of routers with the 2 authentication types available. I will then look at packet captures taken from the link between them and illustrate how each is used by the ISIS process.

The first type is link-level authentication of the Hello messages used to establish an adjacency. The second type authenticates the LSPs (Link State Packets) themselves.

First off, here is the extremely simple topology, but it’s all that’s required for this purpose:

Simple, right? 2 routers with 1 link between them on Gig1. They are both running ISIS level-2-only mode, which means they will only try and establish a L2 adjacency with their neighbors. Each router has a loopback interface, which is also advertised into ISIS.

First off, let’s look at the relevant configuration of CSR-02 for the Link-level authentication:

key chain MY-CHAIN
 key 1
  key-string WIPPIE
!
interface GigabitEthernet1
 ip address 10.1.2.2 255.255.255.0
 ip router isis 1
 negotiation auto
 no mop enabled
 no mop sysid
 isis authentication mode md5
 isis authentication key-chain MY-CHAIN

Without the same configuration on CSR-01, this is what we see in the data path (captured on CSR-02’s G1 interface):

And we also see that we don’t have a full adjacency on CSR-01:

CSR-01#sh isis nei

Tag 1:
System Id       Type Interface     IP Address      State Holdtime Circuit Id
CSR-02          L2   Gi1           10.1.2.2        INIT  26       CSR-02.01

Let’s apply the same authentication configuration on CSR-01 and see the result:

key chain MY-CHAIN
 key 1
  key-string WIPPIE
!
interface GigabitEthernet1
 ip address 10.1.2.1 255.255.255.0
 ip router isis 1
 negotiation auto
 no mop enabled
 no mop sysid
 isis authentication mode md5
 isis authentication key-chain MY-CHAIN

We now have a full adjacency:

CSR-01#sh isis neighbors 

Tag 1:
System Id       Type Interface     IP Address      State Holdtime Circuit Id
CSR-02          L2   Gi1           10.1.2.2        UP    8        CSR-02.01     

And we have routes from CSR-02:

CSR-01#sh ip route isis | beg Gate
Gateway of last resort is not set

      2.0.0.0/32 is subnetted, 1 subnets
i L2     2.2.2.2 [115/20] via 10.1.2.2, 00:01:07, GigabitEthernet1

This is what we now see from CSR-02’s perspective:

Problems with Link-level authentication are fairly easy to spot, because you simply won’t have a stable adjacency formed.
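For the MD5 mode, the hello carries an authentication TLV (type 10) whose value is an authentication type byte (54 for HMAC-MD5) followed by a 16-byte digest. Here is a minimal sketch of how such a TLV could be assembled. The PDU bytes are a placeholder rather than a real hello, and the sketch assumes the digest is computed over the PDU with the digest field zeroed out.

```python
import hashlib
import hmac

AUTH_TLV_TYPE = 10        # IS-IS authentication TLV
AUTH_TYPE_HMAC_MD5 = 54   # authentication type carried inside the TLV


def hmac_md5_auth_tlv(key: bytes, pdu_with_zeroed_auth: bytes) -> bytes:
    """Build an authentication TLV holding an HMAC-MD5 digest of the PDU."""
    digest = hmac.new(key, pdu_with_zeroed_auth, hashlib.md5).digest()  # 16 bytes
    value = bytes([AUTH_TYPE_HMAC_MD5]) + digest
    return bytes([AUTH_TLV_TYPE, len(value)]) + value


# Placeholder bytes standing in for a hello PDU; not a real packet.
placeholder_pdu = b"\x83" + b"\x00" * 26
tlv = hmac_md5_auth_tlv(b"WIPPIE", placeholder_pdu)
print(tlv[0], tlv[1], tlv[2], len(tlv))   # 10 17 54 19
```

The receiver repeats the same calculation with its own key and drops the hello if the digests differ, which is why a mismatch leaves the adjacency stuck.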

The second type is LSP authentication. Let’s look at the configuration of CSR-02 for this type of authentication:

CSR-02#sh run | sec router isis
 ip router isis 1
 ip router isis 1
router isis 1
 net 49.0000.0000.0002.00
 is-type level-2-only
 authentication mode text
 authentication key-chain MY-CHAIN

In this example, I have selected plain-text authentication, which I certainly don’t recommend in production, but it’s great for our example.

Again, this is what it looks like in the packet capture (from CSR-01 to CSR-02) without authentication enabled on CSR-01:

As you can see, we have the LSP that contains CSR-01’s prefixes, but nowhere is authentication present in the packet.

Let’s enable it on CSR-01 and see the result:

CSR-01#sh run | sec router isis
 ip router isis 1
 ip router isis 1
router isis 1
 net 49.0000.0000.0001.00
 is-type level-2-only
 authentication mode text
 authentication key-chain MY-CHAIN

The result in the data packet:

Here we clearly have the authentication TLV (type 10) carrying cleartext authentication, and we can see the password (WIPPIE) in the capture because we selected cleartext.
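To show why the password is readable on the wire, here is a small sketch of the cleartext variant of the authentication TLV: TLV type 10, a length byte, authentication type 1 for cleartext, and then the password bytes as-is. The function names are my own.

```python
AUTH_TLV_TYPE = 10        # IS-IS authentication TLV
AUTH_TYPE_CLEARTEXT = 1   # authentication type carried inside the TLV


def cleartext_auth_tlv(password: str) -> bytes:
    """Build an authentication TLV carrying the password in the clear."""
    value = bytes([AUTH_TYPE_CLEARTEXT]) + password.encode("ascii")
    return bytes([AUTH_TLV_TYPE, len(value)]) + value


def read_cleartext_password(tlv: bytes) -> str:
    """Recover the password from a cleartext authentication TLV."""
    tlv_type, length, auth_type = tlv[0], tlv[1], tlv[2]
    if tlv_type != AUTH_TLV_TYPE or auth_type != AUTH_TYPE_CLEARTEXT:
        raise ValueError("not a cleartext authentication TLV")
    return tlv[3:2 + length].decode("ascii")


tlv = cleartext_auth_tlv("WIPPIE")
print(tlv)                            # b'\n\x07\x01WIPPIE'
print(read_cleartext_password(tlv))   # WIPPIE
```

Anyone with a capture of the link can do the same, which is exactly why I would not use this mode in production.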

The result is that we have a validated ISIS database on both routers.

That’s all folks, hope it helps to understand the difference between the 2 types of authentication in ISIS.

Take care!

 

 

Progress update – 10/07-2017

Hello folks,

I’m currently going through the INE DC videos and learning a lot about fabrics and how they work along with a fair bit of UCS information on top of that!

I’m spending an average of 2.5 hours on weekdays for study and a bit more on the weekends when time permits.

I still have no firm commitment to the CCIE DC track, but at some point I need to commit to it and really get behind it. One of these days 😉

I mentioned it to the wife-to-be a couple of days ago and while she didn’t applaud the idea, at least she wasn’t firmly against it, which is always something I guess! It’s very important for me to have my family behind me in these endeavours!

I’m still a bit concerned about the lack of rack rentals for DCv2 from INE, which is something I need to have in place before I order a bootcamp or more training materials from them. As people know by now, I really do my best learning in front of the “system”, trying out what works and what doesn’t.

Now to spin up a few N9K’s in the lab and play around with NX-OS unicast and multicast routing!

Take care.

New Lab Server & random updates

New Server:

So I just completed a purchase off eBay for a new server for my lab purposes.

For a while now I’ve been limited to 32 GB of memory on my old ESXi server, which is really more like 20 GB when my regular servers have had their share. Running a combination of different types of devices, each taking at least 4 GB of memory, doesn’t leave much room for larger labs.

I decided to go with a “real” server this time around. So I got an older Cisco UCS C200 M2 server with 2 x Xeon 5570 processors and an additional 96 GB of RAM (on top of the 24 it came with). That still leaves room for memory upgrades in the future, as it supports a total of 192 GB (I had a budget on this one, so I couldn’t go crazy).

Work:

Work has been crazy lately. 2 of my team members just resigned, so a lot of workload has to be shifted until we find suitable replacements. That means I’ve been working 65+ hour work weeks for a while now, something that I don’t find even remotely amusing to be honest. But I’ve been reassured that everything is being done to interview candidates, so I’m hopeful it will work out after the summer holidays.

We have a lot of interesting projects coming up, for example our first production environment running Cisco ACI. This also included some very good training. It’s really a different ball game compared to the old way of doing Datacenters.

Also on my plate are some iWAN solutions. Pretty interesting all in all.

Study:

I’m still reading my way through the Cisco Intelligent WAN (IWAN) book. It’s still on my list of things to do to take the exam I mentioned earlier, but keeping the work network running takes priority. Also I can’t help but feel the pull of another CCIE when time permits, but it’s still just a thought (we all know how that usually goes, right? 🙂 )

Personal:

On September 16, my long-time girlfriend and I are getting married! Yes.. Married. Scary, but still something I look forward to. We’ve been together for an amazing 11 years on that date so it’s about time (she keeps telling me). As you may know, I proposed when I went to Las Vegas for Cisco Live last year, so it’s very memorable 🙂

That’s about it for now!

Take care!

Onto the next one…

Yesterday I passed the CCNA-W exam. Now onto the next partner certification I need to do before summer.

It’s called 500-452 ENCWE – Enterprise Networks Core and WAN Essentials and a large part of it involves iWAN, which I’m not too familiar with.

To that effect I have ordered the official Cisco Press iWAN book and downloaded all the presentations I could find on iWAN from CiscoLive365. That should keep me busy for the foreseeable future 🙂

I will hopefully be doing some labs on iWAN and will post any findings I have here. It should be fun!

I’m still debating whether or not I will go to CLUS this year. What’s really pulling me over there is the people I rarely get to meet. I need to make up my mind soon though.

Take care!

/Kim

So what’s next?

I’ve had a little time readjusting after my exam and I’ve given some thought on what to keep me busy next.

Basically I have 3 projects to keep me busy for the foreseeable future.

1) CCNA-Wireless
My boss came to me a week ago and tasked me with this. He was very humble about it, which was amusing. I will be allocated some time from my normal work projects to study for the exam, which is really helpful. Fortunately some of my CCDE study friends are also going for this exam, so I won’t be going down the road alone on this one either.
I’m actually quite positive about this as it’s a technology area I have not really paid much attention to and it’s very different from what I’m used to. A shakeup is good every now and then 🙂

2) The IOS-XR Specialist exam
This is one I have been looking forward to for some time. It’s basically an exam about all things IOS-XR and the platforms that support it. I tried studying for this before I decided to go down the CCDE path, so it will be nice to pick it back up.

3) Work on improving my health.
This is by far the most important one. I have been neglecting my health for far too long now and this needs to change. I’m thinking about a different blog site where I’ll provide updates on what my current health situation is, along with how I aim to improve it. More on this to come when I’ve gathered my thoughts on this bullet point.

So it didn’t take me long to come up with something to keep me busy 🙂

/Kim

CCDE – A different Journey

On Wednesday the 22nd of February, in a testing center in the middle of London, my journey towards achieving the CCDE certification finally ended with me passing this beast of an exam.

This learning journey was a very different one than either of my CCIEs. Whereas going for the CCIE meant spending countless hours at the command line, the CCDE meant spending all of those hours reading and discussing use cases for technologies. It also meant dipping my toes into the business side, picking up the “Why?” behind selecting a specific technology.

It all started a few years ago when my friend Daniel (lostintransit.se) and I started going back and forth on how to approach this thing. We decided we should team up and share notes, discuss technologies and generally use each other as a sparring partner.

At that point I had already decided that this was going to be a marathon for me, because at the time I could not allocate as much time each day for study as I had for the CCIEs. Fast forward a good amount of time and I had finally passed the written part of the exam and was ready to really focus on the practical aspect.

Anyone reading the CCDE Practical blueprint basically comes to the same conclusion: it involves every technology under the sun and then some. This in turn made us start a small study group using Slack, with only 4 members to begin with. This was the turning point for me as we had discussions around scenarios and use cases. It helped tremendously that we all come from different backgrounds and industries as well as diverse geographies.

I attended Jeremy Filliben’s CCDE bootcamp in Orlando in April 2016 (http://www.jeremyfilliben.com), which was a great experience. Jeremy is a very good teacher and his scenarios are top notch. Going over them really makes you think in a completely different way than, for example, the CCIE track(s). As an implementation engineer, you tend to think in a certain way, which does not do you any good in a design context.

Early on, I also purchased an All Access Pass from INE in order to gain access to their CCDE training material. Unfortunately it has not been updated since and they have no plan to 🙁

I also used the Self-Paced material from Orhan Ergun (orhanergun.net); the quizzes were especially helpful to me.

I had my first attempt at the practical in late summer 2016 and I did fairly well, but it certainly opened my eyes to what I had gotten myself into. At the same time I had the fortune of meeting up with some of the Slack study group members which was an added bonus! (Thanks Andre for introducing me to a proper beer 🙂 )…

At this point I had some real life stuff to attend to (we purchased a house with all that entails). So I was unable to commit to the November testing date. However a good number of people from our study group passed which really motivated me to get on with my studies. So ever since late November I have been doing 3-4 hours of study each week day and 5-6 hours each day during the weekend.

The last month and a half before the exam I focused on doing practice scenarios and watching recordings made from our study group discussing designs. I can also highly recommend watching the new Safari Livelessons on QoS and Large Scale Network Designs. If time is of the essence to you, do some speed-viewing, which is what I did as well.

I left for London on the day before the exam and checked into my hotel, which is only a 10 minute walk from the testing center. I had already mentally prepared on what to do on the day itself, so basically the day before was the last day to recharge my “batteries” and try and find some focus. I spent an hour or so going over my notes, especially the QoS and IPv6 parts, then went for dinner. I went to bed early as planned.

On the day I woke up very early, which is not unusual for me in any way, so I basically followed what I had mentally prepared for myself, which was to take a long and hot shower, get a decent breakfast (the hard part for me as I don’t normally have breakfast) and then go back to the room to collect my things (Exam registration, wallet and passport (You need two forms of ID)). I left the hotel at 7:20’ish and took the short walk to the testing center. Since I was early I waited outside for a bit for the other guy from our study group to show up, to say Hi before the exam, but he was running a bit late, so I finally decided that I had to get started and went inside.

Since the exam is under heavy NDA like any other Cisco exam, I won’t get into any details regarding the experience, except to say that at no point in the exam did I feel really comfortable about passing it, quite the contrary in fact. However I knew this is a feeling most candidates experience during the CCDE, so I just decided to press on and do my very best.

At lunch I took the advice of several people and had a bite to eat (even though I still didn’t feel like eating) and some water, and used close to the full hour of lunch available. It being London and all, there are several good places for lunch very close to the testing center, so you don’t need to go very far at all.
The afternoon went by and I finally clicked the End Exam button and instantly my mood picked up. I had passed!! I honestly didn’t know what to do with myself at this point 🙂

I had a brisk walk back to the hotel where I could finally put my “guard” down and smile a bit!
I called my better half and told her the good news and she was even more ecstatic than I was.

So what sort of advice can I give to others?

Read, watch sessions (Cisco Live, Safari, NANOG etc.) and constantly ask “Why?” about everything. Learn to read through documents and pick up on business requirements and details that will pertain to certain design choices. Read some more.

If you are like me and you are easily distracted during your studies, I can highly recommend using the “Pomodoro” method (Look it up), which is essentially 25 minute slots of focus on your studies and then a short break. It gives you a scope of focus and helps keep you on track. On top of that I mark down everything I do, study wise, in my calendar so I can look back at it and get a good “feel” for how much time I spend on it. It helps to give you a boost when you feel that you haven’t put enough effort into it.

If you want some recommendation on which book(s) to read, here’s a subset of the books I’ve read:

1) Definitive MPLS Network Designs.
2) CCDE Study Guide.
3) Optimal Routing Design.
4) End-To-End QoS.

These are by far the most important ones, but by no means the only ones you want to read through. You have to assess which technologies you need to learn (more) about and then pick the right material for those cases. The books above are very good for general theory but especially Definitive MPLS Network Designs is good for putting all the relevant pieces into 4 distinct use cases.

Some of the Cisco Live sessions I went through include:

– Scaling BGP (BRKRST-3321)
– Wan and Remote-Site Deployment using CVDs (BRKRST-2040)
– Highly available Wide Area Network Design (BRKRST-2042)
– WAN Architectures and Design Principles (BRKRST-2041)
– Layer 3 Network Virtualization Design Concepts over the WAN (BRKRST-2045)
– Deploying a virtualized campus network infrastructure (BRKCRS-2033)
– Best practices to deploy HA in SP edge and aggregation architectures (BRKSPG-2402)
– Advanced enterprise campus design: routed access (BRKCRS-3036)
– Deploying BGP Fast Convergence / BGP PIC (BRKIPM-2265)
– The QoS Paradigm Shift (BRKRST-2056)
– Deploying OSPF in modern networks (BRKRST-2337)
– ISIS Deployment in modern networks (BRKRST-2338)
– IPv6 Transition Technologies (BRKSPG-2067)
– Choosing the right VPN technology for your network (BRKSEC-1050)
– Firewall architectures in the Data Centre (BRKSEC-2021)
– MAP-E/MAP-T IPv6 transitioning (BRKSPG-2606)

There are a bunch of others and I recommend you search for “Design” and “Use Case” on CiscoLive365.com (It really is an awesome resource for learning).

The last piece of advice I can give, is to join a study group, or if you are more inclined, create one yourself along with some of your friends who are serious about the CCDE as well.
Have discussions revolving around technologies and have even more discussions on scenarios and use cases for those technologies. It really is quite important to expand your horizons in order to be successful in this exam.

If you would like, at certain times we have openings in our study group (which counts 60+ people at the time of writing, including several well known others and vendors), so get in touch if that’s your preference.

With that said, I would like to thank you for reading through this post. It’s been a fun learning experience all the way through my CCDE – Journey!

/Kim (CCDE #20170021)

A look at Auto-Tunnel Mesh Groups

In this post I would like to give a demonstration of using the Auto-Tunnel Mesh group feature.

As you may know, manual MPLS-TE tunnels are first and foremost unidirectional, meaning that if you do them between two PE nodes, you have to do a tunnel in each direction with the local PE node being the headend.

Now imagine your network had 10 PE routers and you wanted to do a full mesh between them. This can become pretty burdensome and error-prone.
Thankfully there’s a method to avoid doing this manual configuration and instead rely on your IGP to signal its willingness to become part of a TE “Mesh”. That’s what the Auto-Tunnel Mesh Group feature is all about!
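To see how quickly the manual approach grows, here is the arithmetic as a short sketch. Because TE tunnels are unidirectional, a full mesh needs one tunnel per ordered pair of PE routers. The function names are my own.

```python
def tunnels_per_headend(pe_count: int) -> int:
    """Tunnels each PE has to originate to reach every other PE."""
    return pe_count - 1


def tunnels_for_full_mesh(pe_count: int) -> int:
    """Total unidirectional tunnels: one per ordered pair of PEs."""
    return pe_count * tunnels_per_headend(pe_count)


print(tunnels_for_full_mesh(3))    # 6
print(tunnels_for_full_mesh(10))   # 90
```

So the 10 PE example means 90 manually configured tunnels, and adding an 11th PE means touching every existing router again.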

Topology

In my small SP setup, I only have 3 PE devices, namely PE-1, PE-2 and PE-3. I also only have one P node, called P-1.
However small this setup is, it’s enough to demonstrate the power of the Auto-Tunnel mesh functionality.

Beyond that, I have set up a small MPLS L3 VPN service for customer CUST-A, which has a presence on all 3 PE nodes. The VPNv4 address-family is using a RR which for this purpose is P-1.

We are running OSPF as the IGP of choice. This means that our Mesh membership will be signaled using Opaque LSAs, which I will show you later on.

The goal of the lab is to use the Auto-Tunnel mesh functionality to create a full mesh of tunnels between my PE nodes and use this exclusively for label switching and to do so with a general template that would scale to many more PE devices than just the 3 in this lab.

The very first thing you want to do is to enable MPLS-TE both globally and on your interfaces. We can verify this on PE-2:

PE-2:

mpls traffic-eng tunnels
!
interface GigabitEthernet2
 ip address 10.2.100.2 255.255.255.0
 negotiation auto
 mpls traffic-eng tunnels
!

The second thing you want to do is to enable the mesh-feature globally using the following command as configured on PE-2 as well:

PE-2:

mpls traffic-eng auto-tunnel mesh

Starting off with MPLS-TE, we need to make sure our IGP is actually signaling this to begin with. I have configured MPLS-TE on the area 0 which is the only area in use in our topology:

PE-2:

router ospf 1
 network 0.0.0.0 255.255.255.255 area 0
 mpls traffic-eng router-id Loopback0
 mpls traffic-eng area 0
 mpls traffic-eng mesh-group 100 Loopback0 area 0

Don’t get hung up on the last configuration line. I will explain this shortly. However notice the “mpls traffic-eng area 0” and “mpls traffic-eng router-id loopback0”. After those two lines are configured, you should be able to retrieve information on the MPLS-TE topology as seen from your IGP:

PE-2:

PE-2#sh mpls traffic-eng topology brief
My_System_id: 2.2.2.2 (ospf 1 area 0)

Signalling error holddown: 10 sec Global Link Generation 22

IGP Id: 1.1.1.1, MPLS TE Id:1.1.1.1 Router Node (ospf 1 area 0)
Area mg-id's:
: mg-id 100 1.1.1.1 :
link[0]: Broadcast, DR: 10.1.100.100, nbr_node_id:8, gen:14
frag_id: 2, Intf Address: 10.1.100.1
TE metric: 1, IGP metric: 1, attribute flags: 0x0
SRLGs: None

IGP Id: 2.2.2.2, MPLS TE Id:2.2.2.2 Router Node (ospf 1 area 0)
link[0]: Broadcast, DR: 10.2.100.100, nbr_node_id:9, gen:19
frag_id: 2, Intf Address: 10.2.100.2
TE metric: 1, IGP metric: 1, attribute flags: 0x0
SRLGs: None

IGP Id: 3.3.3.3, MPLS TE Id:3.3.3.3 Router Node (ospf 1 area 0)
Area mg-id's:
: mg-id 100 3.3.3.3 :
link[0]: Broadcast, DR: 10.3.100.100, nbr_node_id:11, gen:22
frag_id: 2, Intf Address: 10.3.100.3
TE metric: 1, IGP metric: 1, attribute flags: 0x0
SRLGs: None

IGP Id: 10.1.2.2, MPLS TE Id:22.22.22.22 Router Node (ospf 1 area 0)
link[0]: Broadcast, DR: 10.1.100.100, nbr_node_id:8, gen:17
frag_id: 3, Intf Address: 10.1.100.100
TE metric: 10, IGP metric: 10, attribute flags: 0x0
SRLGs: None

link[1]: Broadcast, DR: 10.2.100.100, nbr_node_id:9, gen:17
frag_id: 4, Intf Address: 10.2.100.100
TE metric: 10, IGP metric: 10, attribute flags: 0x0
SRLGs: None

link[2]: Broadcast, DR: 10.3.100.100, nbr_node_id:11, gen:17
frag_id: 5, Intf Address: 10.3.100.100
TE metric: 10, IGP metric: 10, attribute flags: 0x0
SRLGs: None

IGP Id: 10.1.100.100, Network Node (ospf 1 area 0)
link[0]: Broadcast, Nbr IGP Id: 10.1.2.2, nbr_node_id:5, gen:13

link[1]: Broadcast, Nbr IGP Id: 1.1.1.1, nbr_node_id:6, gen:13

IGP Id: 10.2.100.100, Network Node (ospf 1 area 0)
link[0]: Broadcast, Nbr IGP Id: 10.1.2.2, nbr_node_id:5, gen:18

link[1]: Broadcast, Nbr IGP Id: 2.2.2.2, nbr_node_id:1, gen:18

IGP Id: 10.3.100.100, Network Node (ospf 1 area 0)
link[0]: Broadcast, Nbr IGP Id: 10.1.2.2, nbr_node_id:5, gen:21

link[1]: Broadcast, Nbr IGP Id: 3.3.3.3, nbr_node_id:7, gen:21

The important thing to notice here is that we are indeed seeing the other routers in the network, all the PE devices as well as the P device.

Now to the last line of configuration under the router ospf process:

PE-2:

"mpls traffic-eng mesh-group 100 Loopback0 area 0"

What this states is that we would like to use the Auto-Tunnel Mesh group feature, with this PE node being a member of group 100, using Loopback0 for communication on the tunnel and running within area 0.

This by itself only handles the signaling, but we also want to deploy a template in order to create the individual tunnel interfaces. This is done in the following manner:

PE-2:

interface Auto-Template100
ip unnumbered Loopback0
tunnel mode mpls traffic-eng
tunnel destination mesh-group 100
tunnel mpls traffic-eng autoroute announce
tunnel mpls traffic-eng path-option 10 dynamic

In the Auto-Template100 interface we specify, just as we would in manual TE, our loopback address, the tunnel mode and the path option. Note that here we are simply following the IGP, which sort of defeats the purpose of many MPLS-TE configurations. But our topology has no path diversity, so it wouldn’t matter anyway.

Also, the autoroute announce command is used to force traffic into the tunnels.

The important thing is the “tunnel destination mesh-group 100” which ties this configuration snippet into the OSPF one.
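
Since the whole point is a template that scales, here is a minimal Python sketch that renders the same configuration lines for any mesh-group number. The function name `render_mesh_config` and its parameters are hypothetical and only there for illustration; the IOS lines themselves are the ones from this lab. The sketch also assumes the Auto-Template number simply reuses the group number, as it happens to do here.

```python
def render_mesh_config(group_id: int, area: int = 0,
                       loopback: str = "Loopback0", ospf_pid: int = 1) -> str:
    """Render the per-PE Auto-Tunnel mesh configuration used in this lab.

    Assumption: the Auto-Template interface number equals the mesh-group
    number (100 in the lab). Nothing here is pushed to a device.
    """
    lines = [
        "mpls traffic-eng tunnels",
        "mpls traffic-eng auto-tunnel mesh",
        "!",
        f"router ospf {ospf_pid}",
        f" mpls traffic-eng router-id {loopback}",
        f" mpls traffic-eng area {area}",
        f" mpls traffic-eng mesh-group {group_id} {loopback} area {area}",
        "!",
        f"interface Auto-Template{group_id}",
        f" ip unnumbered {loopback}",
        " tunnel mode mpls traffic-eng",
        f" tunnel destination mesh-group {group_id}",
        " tunnel mpls traffic-eng autoroute announce",
        " tunnel mpls traffic-eng path-option 10 dynamic",
    ]
    return "\n".join(lines)


# The same text can be pasted on PE-1, PE-2, PE-3 or any future PE.
print(render_mesh_config(100))
```

Because nothing in the output is specific to one PE, adding a fourth PE would only need the same block plus "mpls traffic-eng tunnels" on its core-facing interfaces.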

After everything is setup, you should see some dynamic tunnels being created on each PE node:

PE-2:

PE-2#sh ip int b | incl up
GigabitEthernet1 100.100.101.100 YES manual up up
GigabitEthernet2 10.2.100.2 YES manual up up
Auto-Template100 2.2.2.2 YES TFTP up up
Loopback0 2.2.2.2 YES manual up up
Tunnel64336 2.2.2.2 YES TFTP up up
Tunnel64337 2.2.2.2 YES TFTP up up

Let’s verify the RIB after this step:

PE-2:

PE-2#sh ip route | beg Gateway
Gateway of last resort is not set

1.0.0.0/32 is subnetted, 1 subnets
O 1.1.1.1 [110/12] via 1.1.1.1, 00:29:13, Tunnel64336
2.0.0.0/32 is subnetted, 1 subnets
C 2.2.2.2 is directly connected, Loopback0
3.0.0.0/32 is subnetted, 1 subnets
O 3.3.3.3 [110/12] via 3.3.3.3, 00:28:48, Tunnel64337
10.0.0.0/8 is variably subnetted, 4 subnets, 2 masks
O 10.1.100.0/24 [110/11] via 10.2.100.100, 00:29:13, GigabitEthernet2
C 10.2.100.0/24 is directly connected, GigabitEthernet2
L 10.2.100.2/32 is directly connected, GigabitEthernet2
O 10.3.100.0/24 [110/11] via 10.2.100.100, 00:29:13, GigabitEthernet2
22.0.0.0/32 is subnetted, 1 subnets
O 22.22.22.22 [110/2] via 10.2.100.100, 00:29:13, GigabitEthernet2

Very good. We can see that in order to reach 1.1.1.1/32, which is PE-1’s loopback, we are indeed routing through one of the dynamic tunnels.
The same goes for 3.3.3.3/32 towards PE-3’s loopback.

PE-2:

PE-2#traceroute 1.1.1.1 so loo0
Type escape sequence to abort.
Tracing the route to 1.1.1.1
VRF info: (vrf in name/id, vrf out name/id)
1 10.2.100.100 [MPLS: Label 17 Exp 0] 16 msec 22 msec 22 msec
2 10.1.100.1 25 msec * 19 msec

We can see that traffic towards that loopback is indeed being label-switched. And just to make it obvious, let me make sure we are not using LDP 🙂

PE-2:

PE-2#sh mpls ldp neighbor
PE-2#

On P-1, which is the midpoint of our LSPs, we would expect 6 unidirectional tunnels in total:

P-1:

P-1#sh mpls for
Local Outgoing Prefix Bytes Label Outgoing Next Hop
Label Label or Tunnel Id Switched interface
16 Pop Label 3.3.3.3 64336 [6853] \
472 Et2/0 10.1.100.1
17 Pop Label 2.2.2.2 64336 [2231] \
2880 Et2/0 10.1.100.1
18 Pop Label 1.1.1.1 64336 [4312] \
2924 Et2/1 10.2.100.2
19 Pop Label 1.1.1.1 64337 [4962] \
472 Et2/2 10.3.100.3
20 Pop Label 2.2.2.2 64337 [6013] \
562 Et2/2 10.3.100.3
21 Pop Label 3.3.3.3 64337 [4815] \
0 Et2/1 10.2.100.2

Exactly what we expected.
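
The number 6 comes from the fact that TE tunnels are unidirectional, so a full mesh of n PEs needs n × (n − 1) of them. A small Python sketch makes that concrete; the helper name `mesh_lsps` is hypothetical, and the claim that P-1 sees all of them relies on P-1 being the only transit node in this lab.

```python
from itertools import permutations


def mesh_lsps(pes):
    """Return every (head-end, tail-end) pair in a full mesh of PEs."""
    return list(permutations(pes, 2))


lsps = mesh_lsps(["PE-1", "PE-2", "PE-3"])
for head, tail in lsps:
    print(f"{head} -> {tail}")
print(len(lsps))  # 6 unidirectional tunnels, all transiting P-1 in this lab
```

With 10 PEs the same calculation gives 90 tunnels, which is exactly why you do not want to configure them by hand.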
The following is the output of the command “show ip ospf database opaque-area” on PE-2. I have cut it down to the relevant opaque LSAs (two opaque types are in use, one for general MPLS-TE and one for the Mesh-Group feature):

LS age: 529
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 4.0.0.0
Opaque Type: 4
Opaque ID: 0
Advertising Router: 1.1.1.1
LS Seq Number: 80000002
Checksum: 0x5364
Length: 32

Capability Type: Mesh-group
Length: 8
Value:

0000 0064 0101 0101

LS age: 734
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 4.0.0.0
Opaque Type: 4
Opaque ID: 0
Advertising Router: 2.2.2.2
LS Seq Number: 80000002
Checksum: 0x6748
Length: 32

Capability Type: Mesh-group
Length: 8
Value:

0000 0064 0202 0202

LS age: 701
Options: (No TOS-capability, DC)
LS Type: Opaque Area Link
Link State ID: 4.0.0.0
Opaque Type: 4
Opaque ID: 0
Advertising Router: 3.3.3.3
LS Seq Number: 80000002
Checksum: 0x7B2C
Length: 32

Capability Type: Mesh-group
Length: 8
Value:

0000 0064 0303 0303

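The interesting parts are the Advertising Router and the value of the TLV. Each value starts with 0000 0064, and since 0x64 is 100 in decimal, this is in fact the membership of group “100” being signaled across the IGP area. The remaining four bytes match the advertising PE’s loopback address (0101 0101 is 1.1.1.1, for example).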
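
To show how the value breaks down, here is a minimal Python sketch that decodes it. The layout (a 4-byte group number followed by a 4-byte IPv4 address) is inferred from the three dumps above rather than taken from a specification, and the function name `decode_mesh_group_value` is hypothetical.

```python
import ipaddress
import struct


def decode_mesh_group_value(hex_dump: str):
    """Decode an 8-byte mesh-group value as (group number, IPv4 address).

    Assumption: first 4 bytes are the group number, last 4 bytes an IPv4
    address, both in network byte order, as the dumps above suggest.
    """
    raw = bytes.fromhex(hex_dump.replace(" ", ""))
    group_id, addr = struct.unpack("!I4s", raw)
    return group_id, str(ipaddress.IPv4Address(addr))


for value in ("0000 0064 0101 0101",
              "0000 0064 0202 0202",
              "0000 0064 0303 0303"):
    print(decode_mesh_group_value(value))
```

Each of the three LSAs decodes to group 100 together with the loopback address of the PE that advertised it.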
Okay, all good, I hear you say, but let’s do an end-to-end test from the CE devices in customer CUST-A’s domain:

R1:

R1#sh ip route | beg Gateway
Gateway of last resort is not set

10.0.0.0/32 is subnetted, 3 subnets
C 10.1.1.1 is directly connected, Loopback0
B 10.2.2.2 [20/0] via 100.100.100.100, 00:37:46
B 10.3.3.3 [20/0] via 100.100.100.100, 00:37:36
100.0.0.0/8 is variably subnetted, 2 subnets, 2 masks
C 100.100.100.0/24 is directly connected, FastEthernet0/0
L 100.100.100.1/32 is directly connected, FastEthernet0/0

So we are learning the routes on the customer side (through standard IPv4 BGP).

R1:

R1#ping 10.2.2.2 so loo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.2.2.2, timeout is 2 seconds:
Packet sent with a source address of 10.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 40/72/176 ms
R1#ping 10.3.3.3 so loo0

Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.3.3.3, timeout is 2 seconds:
Packet sent with a source address of 10.1.1.1
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 20/32/48 ms

We have reachability! – What about traceroute:

R1:

R1#traceroute 10.2.2.2 so loo0

Type escape sequence to abort.
Tracing the route to 10.2.2.2

1 100.100.100.100 28 msec 20 msec 12 msec
2 10.1.100.100 [MPLS: Labels 18/17 Exp 0] 44 msec 136 msec 60 msec
3 100.100.101.100 [MPLS: Label 17 Exp 0] 28 msec 32 msec 12 msec
4 100.100.101.3 28 msec 32 msec 24 msec
R1#traceroute 10.3.3.3 so loo0

Type escape sequence to abort.
Tracing the route to 10.3.3.3

1 100.100.100.100 48 msec 16 msec 8 msec
2 10.1.100.100 [MPLS: Labels 19/17 Exp 0] 48 msec 12 msec 52 msec
3 100.100.102.100 [MPLS: Label 17 Exp 0] 16 msec 28 msec 36 msec
4 100.100.102.4 68 msec 56 msec 48 msec

Just what we would expect from our L3 MPLS VPN service. A transport label (this time through MPLS-TE) and a VPN label as signaled through MP-BGP.
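
If you want to pull the label stack out of traceroute output like the above, a minimal Python sketch could look like this. The regular expression is written against the exact text format shown in this post, and the function name `label_stack` is hypothetical.

```python
import re

# Matches "[MPLS: Label 17 Exp 0]" as well as "[MPLS: Labels 18/17 Exp 0]".
LABELS_RE = re.compile(r"\[MPLS: Labels? ([\d/]+) Exp \d+\]")


def label_stack(line: str):
    """Return the MPLS labels on a traceroute hop, top of stack first."""
    match = LABELS_RE.search(line)
    if match is None:
        return []  # hop was not label-switched
    return [int(label) for label in match.group(1).split("/")]


hops = [
    "1 100.100.100.100 28 msec 20 msec 12 msec",
    "2 10.1.100.100 [MPLS: Labels 18/17 Exp 0] 44 msec 136 msec 60 msec",
    "3 100.100.101.100 [MPLS: Label 17 Exp 0] 28 msec 32 msec 12 msec",
]
for hop in hops:
    print(label_stack(hop))
```

On hop 2 the first label (18) is the TE transport label and the second (17) is the VPN label. On hop 3 only the VPN label is left, because P-1 pops the transport label before handing the packet to PE-2.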

To round it off, I started a packet capture on P-1’s interface toward PE-1 and then re-issued the ICMP echo from R1’s loopback toward R2’s loopback address:

wireshark-output

With that, I hope it’s been informative for you. Thanks for reading!

References:

http://www.cisco.com/c/en/us/td/docs/ios/12_0s/feature/guide/gsmeshgr.html

Configurations:

configurations

Happy New Year

Hi Everyone,

I wish you all a Happy New Year!

Currently I’m very busy studying for my 2nd attempt at the CCDE Practical exam.
I have it booked for the next slot, which is February 22nd in London.

Thankfully there is more and more material available for the CCDE than just a year ago. One of my primary sources is the study group I have mentioned before, which Daniel (lostintransit.se) and I started way back.

I’m also going through the INE scenarios as well as LiveLessons available through a Safari subscription. Those are really good and I highly recommend them.

One of the primary things I’m practicing at the moment is picking up business requirements from a given scenario. This is quite hard, as I’m at heart an implementation-focused guy. But it’s good to learn something new and very useful.

If you are not following it just yet, I can highly recommend the “Unleashing CCDE” site on Cisco Learning Network (https://learningnetwork.cisco.com/blogs/unleashing-ccde). There are a lot of good posts there on how to pick up these “soft” skills.

I will keep the blog updated with my study progress through February and we’ll see what happens February 22nd 🙂

Take Care.

/Kim

Snippet: The story of the EFP

For a while now, the concept of EVCs (Ethernet Virtual Circuits) and EFPs (Ethernet Flow Points) has eluded me.

In this short post, I will provide you with a simple example of a couple of EFPs. In a later post I will discuss the MEF concept of EVCs.

As always, here is the topology I will be using:

topology

It’s a very simple setup. R1 connects to R2 through its G1 interface and connects to R3 through its G2 interface.

On R2 and R3, we have the very common configuration of using subinterfaces for the individual VLANs in question, namely VLAN 10 for the connection between R1 and R2 and VLAN 20 between R1 and R3.

Here is the configuration of R2 and R3:

R2#sh run int g1.10
Building configuration...
Current configuration : 98 bytes
!
interface GigabitEthernet1.10
encapsulation dot1Q 10
ip address 10.10.10.2 255.255.255.0
end

R3#sh run int g1.20
Building configuration...
Current configuration : 98 bytes
!
interface GigabitEthernet1.20
encapsulation dot1Q 20
ip address 10.10.10.3 255.255.255.0
end

Now on R1 is where the “different” configuration takes place:

R1#sh run int g1
Building configuration...
Current configuration : 182 bytes
!
interface GigabitEthernet1
no ip address
negotiation auto
service instance 10 ethernet
encapsulation dot1q 10
rewrite ingress tag pop 1 symmetric
bridge-domain 10
!
end

R1#sh run int g2
Building configuration...
Current configuration : 182 bytes
!
interface GigabitEthernet2
no ip address
negotiation auto
service instance 20 ethernet
encapsulation dot1q 20
rewrite ingress tag pop 1 symmetric
bridge-domain 10
!
end

R1#sh run int bdi10
Building configuration...
Current configuration : 96 bytes
!
interface BDI10
description -= Our L3 interface =-
ip address 10.10.10.1 255.255.255.0
end

So what does this all mean!? – Well, basically what you are looking at is the very nature of an EFP. One on each physical interface in this case. It is defined under the “service instance” command structure.

An Ethernet Flow Point (EFP) is a way to match a certain Ethernet frame and perform an action on it on ingress (and, in our case, also on egress). On top of that you can attach it to a bridge-domain.

The result of the above configuration is that on G1, we match on the dot1q tag when it’s tagged with vlan 10. On ingress we then pop 1 tag before performing any other “upstream” action. With the symmetric keyword, the vlan 10 tag is pushed back on when egressing.

On G2, we are doing the same, but with vlan 20 instead.
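
To make the pop/push behaviour concrete, here is a minimal Python model of the two EFPs. It is only a sketch of the logic described above, not of how the platform implements it, and the class names `Frame` and `Efp` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Frame:
    tags: list      # VLAN tags, outermost first
    payload: str


@dataclass
class Efp:
    match_vlan: int     # "encapsulation dot1q <vlan>"
    bridge_domain: int  # "bridge-domain <id>"

    def ingress(self, frame):
        """Match the outer tag, then pop it ("rewrite ingress tag pop 1")."""
        if not frame.tags or frame.tags[0] != self.match_vlan:
            return None  # frame does not belong to this EFP
        return Frame(tags=frame.tags[1:], payload=frame.payload)

    def egress(self, frame):
        """The "symmetric" keyword: push the matched tag back on egress."""
        return Frame(tags=[self.match_vlan] + frame.tags, payload=frame.payload)


g1 = Efp(match_vlan=10, bridge_domain=10)
g2 = Efp(match_vlan=20, bridge_domain=10)

from_r2 = Frame(tags=[10], payload="ICMP echo")
inside_bd = g1.ingress(from_r2)   # untagged inside bridge-domain 10
to_r3 = g2.egress(inside_bd)      # re-tagged with vlan 20 towards R3
print(inside_bd.tags, to_r3.tags)
```

Inside the bridge-domain the frame carries no tag at all, which is what lets VLAN 10 on one side and VLAN 20 on the other end up in the same L2 segment.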

With both EFPs we attach a bridge-domain (ID 10), which can be verified like this:

R1#show bridge-domain
Bridge-domain 10 (3 ports in all)
State: UP Mac learning: Enabled
Aging-Timer: 300 second(s)
BDI10 (up)
GigabitEthernet1 service instance 10
GigabitEthernet2 service instance 20
AED MAC address Policy Tag Age Pseudoport
- 001E.7AE0.11BF to_bdi static 0 BDI10

Right now we only have one MAC address learned, namely that of our L3 BDI interface. But we can see that both G1 and G2 have a service instance in this bridge-domain.

Let’s try and do some ICMP tests from R2:

R2#ping 10.10.10.1
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.10.10.1, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 1/120/598 ms

Let’s again verify our bridge-domain on R1:

R1#show bridge-domain
Bridge-domain 10 (3 ports in all)
State: UP Mac learning: Enabled
Aging-Timer: 300 second(s)
BDI10 (up)
GigabitEthernet1 service instance 10
GigabitEthernet2 service instance 20
AED MAC address Policy Tag Age Pseudoport
- 001E.7AE0.11BF to_bdi static 0 BDI10
0 0050.56BE.18D8 forward dynamic 276 GigabitEthernet1.EFP10

What we see now is that a MAC address has been dynamically learned through the G1.EFP10 EFP.

Since we are technically “bridging” these two distinct vlans, we should be able to ping R3 from R2 as well:

R2#ping 10.10.10.3
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 10.10.10.3, timeout is 2 seconds:
!!!!!
Success rate is 100 percent (5/5), round-trip min/avg/max = 2/24/104 ms

And again on R1:

R1#show bridge-domain
Bridge-domain 10 (3 ports in all)
State: UP Mac learning: Enabled
Aging-Timer: 300 second(s)
BDI10 (up)
GigabitEthernet1 service instance 10
GigabitEthernet2 service instance 20
AED MAC address Policy Tag Age Pseudoport
- 001E.7AE0.11BF to_bdi static 0 BDI10
0 0050.56BE.320A forward dynamic 287 GigabitEthernet2.EFP20
0 0050.56BE.18D8 forward dynamic 287 GigabitEthernet1.EFP10

We have now learned all the MAC addresses in our small test environment.
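
The learning behaviour seen in the outputs above can be sketched in a few lines of Python. This is a simplified model of ordinary transparent bridging (no aging timer, no static entries), and the class name `BridgeDomain` and the port strings are hypothetical labels.

```python
class BridgeDomain:
    """Very small model of MAC learning and forwarding in a bridge-domain."""

    def __init__(self, ports):
        self.ports = set(ports)
        self.mac_table = {}  # MAC address -> port it was learned on

    def receive(self, in_port, src_mac, dst_mac):
        """Learn the source MAC, then return the list of egress ports."""
        self.mac_table[src_mac] = in_port
        out_port = self.mac_table.get(dst_mac)
        if out_port is None:
            # Unknown or broadcast destination: flood to all other ports.
            return sorted(self.ports - {in_port})
        return [] if out_port == in_port else [out_port]


bd10 = BridgeDomain(["BDI10", "Gi1.EFP10", "Gi2.EFP20"])

# R2 sends a broadcast (e.g. ARP): it is flooded and R2's MAC is learned.
print(bd10.receive("Gi1.EFP10", "0050.56BE.18D8", "ffff.ffff.ffff"))
# R3 answers R2 directly: forwarded only to the port where R2 was learned.
print(bd10.receive("Gi2.EFP20", "0050.56BE.320A", "0050.56BE.18D8"))
print(bd10.mac_table)
```

After those two frames the table holds one entry per EFP, mirroring the two dynamic entries in the last “show bridge-domain” output.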

So that’s basically all there is to an EFP: a simple and flexible way of matching frames.

Until next time! – Take care.