VXLAN Configuration

In this chapter, we look at

Enablement

On the spines, we need to enable the EVPN control plane:

nv overlay evpn

On the leaves we also need to enable this, as well as the features that map VLANs to VN segments and enable the NVE (VXLAN) interface:

nv overlay evpn
feature vn-segment-vlan-based
feature nv overlay

Layer 2 VXLAN

We define four different VLANs: two on one leaf switch, and two on the other. However, we only assign two different VXLAN VNIs. I've done this for educational purposes to show the decoupling of VLANs and VNIs. In a real deployment you may try to keep the VLAN IDs identical on every leaf, or adhere to a defined VLAN-to-VNI schema.

# Leaf01
vlan 128
  vn-segment 1024
vlan 129
  vn-segment 1025

# Leaf02
vlan 256
  vn-segment 1024
vlan 257
  vn-segment 1025

We can confirm these mappings with the following command:

# Run on Leaf01
Leaf01# show vxlan
Vlan            VN-Segment
====            ==========
128             1024
129             1025
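
To make the decoupling concrete, here is a minimal Python sketch that groups the mappings configured above by VNI. It is purely illustrative and not something you run on the switches; the dictionary and function names are my own.

```python
from collections import defaultdict

# VLAN -> VNI mappings copied from the Leaf01 and Leaf02 configuration above.
LEAF_VLAN_TO_VNI = {
    "Leaf01": {128: 1024, 129: 1025},
    "Leaf02": {256: 1024, 257: 1025},
}


def segments_by_vni(leaf_maps):
    """Group (leaf, vlan) pairs by the VNI they are mapped to."""
    segments = defaultdict(list)
    for leaf, vlan_map in sorted(leaf_maps.items()):
        for vlan, vni in sorted(vlan_map.items()):
            segments[vni].append((leaf, vlan))
    return dict(segments)


if __name__ == "__main__":
    for vni, members in sorted(segments_by_vni(LEAF_VLAN_TO_VNI).items()):
        print(vni, members)
```

Each VNI ends up with one VLAN from each leaf, which is the point: the VLAN ID is only locally significant, and the VNI is what identifies the layer 2 segment across the fabric.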

We now define the virtual interface that acts as the VTEP for the switch. It's sourced from the loopback, and reachability information is shared using BGP (that's the EVPN control plane). The two VNIs we've created are added, and we specify the multicast group that allows the underlay to replicate broadcast, unknown unicast and multicast (BUM) traffic.

interface nve1
  no shutdown
  source-interface loopback0
  host-reachability protocol bgp
  member vni 1024
    mcast-group 239.1.1.1
  member vni 1025
    mcast-group 239.1.1.1

Finally, from a layer 2 perspective, we define the route distinguisher and the route target import/exports for the VNIs. The auto keyword allows us to do this in a deterministic manner, as long as all the switches are in the same autonomous system.

evpn
  vni 1024 l2
    rd auto
    route-target import auto
    route-target export auto
  vni 1025 l2
    rd auto
    route-target import auto
    route-target export auto
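
As far as I understand the NX-OS behaviour, for a layer 2 VNI the auto keyword derives the route distinguisher as router-id:(32767 + VLAN ID) and the route target as ASN:VNI (for a 2-byte AS number). The sketch below is Python, not switch configuration, and the function names and the base constant are my own labels. It reproduces the values that appear in the outputs later in this chapter.

```python
# Assumed base value for auto-derived L2VNI route distinguishers on NX-OS.
L2_RD_BASE = 32767


def auto_rd(router_id: str, vlan_id: int) -> str:
    """Auto RD for a layer 2 VNI: router ID plus an offset VLAN ID."""
    return f"{router_id}:{L2_RD_BASE + vlan_id}"


def auto_rt(asn: int, vni: int) -> str:
    """Auto route target for a 2-byte AS number: ASN and VNI."""
    return f"{asn}:{vni}"


if __name__ == "__main__":
    print(auto_rd("192.0.2.3", 128))  # Leaf01, VLAN 128
    print(auto_rd("192.0.2.4", 256))  # Leaf02, VLAN 256
    print(auto_rt(65000, 1024))       # same on both leaves for VNI 1024
```

Note that the RD depends on the router ID and the local VLAN, so it differs per leaf, while the route target depends only on the AS number and the VNI. That is why the import/export statements line up across leaves only when they share an autonomous system.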

Confirmation

We have a router with two links connected into each of the switches, with each link in a different VRF to allow us to ping through the overlay to itself. Interface connectivity is as follows:

  • Te0/0/0 is connected to Leaf02, ethernet1/25 and tethered in Vlan 256.
  • Te0/0/1 is connected to Leaf01, ethernet1/25 and tethered in Vlan 128.

Pseudo_Hosts#show ip int brief 
Interface              IP-Address      OK? Method Status                Protocol
Te0/0/0                203.0.113.7     YES manual up                    up      
Te0/0/1                203.0.113.6     YES manual up                    up

We ping from the interface attached to Leaf01 through to the interface attached to Leaf02 and get successful responses:

Pseudo_Hosts#ping vrf A 203.0.113.7
Type escape sequence to abort.
Sending 5, 100-byte ICMP Echos to 203.0.113.7, timeout is 2 seconds:
!!!!!

Looking on Leaf01, we can see the local and remote MAC addresses in the MAC table:

Leaf01# show mac address-table 
Legend: 
        * - primary entry, G - Gateway MAC, (R) - Routed MAC, O - Overlay MAC
        age - seconds since last seen,+ - primary entry using vPC Peer-Link,
        (T) - True, (F) - False, C - ControlPlane MAC
   VLAN     MAC Address      Type      age     Secure NTFY Ports
---------+-----------------+--------+---------+------+----+------------------
C  128     a46c.2a77.5800   dynamic  0         F      F    nve1(192.0.2.4)
*  128     a46c.2a77.5801   dynamic  0         F      F    Eth1/25

We also see this in the new "layer 2 routing table", which shows us the local MAC address has been exported with the appropriate route distinguisher and route target from the local MAC address table. We also see the remote MAC address in this table, and it's been imported into the local MAC address table. As it's been imported, it must mean that the route target it is tagged with matches our (auto-generated) import statement under the evpn section.

Leaf01# show l2route mac all 

Flags -(Rmac):Router MAC (Stt):Static (L):Local (R):Remote (V):vPC link 
(Dup):Duplicate (Spl):Split (Rcv):Recv (AD):Auto-Delete(D):Del Pending (S):Stale (C):Clear
(Ps):Peer Sync (O):Re-Originated 

Topology    Mac Address    Prod   Flags         Seq No     Next-Hops      
----------- -------------- ------ ------------- ---------- ----------------
128         a46c.2a77.5800 BGP    Rcv           0          192.0.2.4      
128         a46c.2a77.5801 Local  L,            0          Eth1/25

Taking a look at the BGP table on Leaf01:

Leaf01# show bgp l2vpn evpn 
BGP routing table information for VRF default, address family L2VPN EVPN
BGP table version is 144, local router ID is 192.0.2.3
Status: s-suppressed, x-deleted, S-stale, d-dampened, h-history, *-valid, >-best
Path type: i-internal, e-external, c-confed, l-local, a-aggregate, r-redist, I-injected
Origin codes: i - IGP, e - EGP, ? - incomplete, | - multipath, & - backup

   Network            Next Hop            Metric     LocPrf     Weight Path
Route Distinguisher: 192.0.2.3:32895    (L2VNI 1024)
*>i[2]:[0]:[0]:[48]:[a46c.2a77.5800]:[0]:[0.0.0.0]/216
                      192.0.2.4                         100          0 i
*>l[2]:[0]:[0]:[48]:[a46c.2a77.5801]:[0]:[0.0.0.0]/216
                      192.0.2.3                         100      32768 i

Route Distinguisher: 192.0.2.4:33023
*>i[2]:[0]:[0]:[48]:[a46c.2a77.5800]:[0]:[0.0.0.0]/216
                      192.0.2.4                         100          0 i

The first entry is a type 2 entry for the MAC address a46c.2a77.5800, which was learned from Leaf02 (next hop 192.0.2.4). The second entry is a type 2 entry for the MAC address a46c.2a77.5801, which was learned locally. The interesting thing to note is that there is another entry, under a different route distinguisher, for the MAC learned from Leaf02.

This IOS Hints article provides some insight on it from an MPLS perspective, but the theory is the same here. The route distinguisher of the type 2 prefix learned over VNI 1024 is different to the route distinguisher locally associated with VNI 1024. Thus, when the route is imported, a copy is made and the local route distinguisher is assigned to the prefix.
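
The import step can be sketched in a few lines of Python. This is a conceptual model only, not how NX-OS implements it; the EvpnMacRoute class and import_route function are names I've made up for illustration. The values come from the BGP table above.

```python
from dataclasses import dataclass, replace
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class EvpnMacRoute:
    rd: str
    mac: str
    next_hop: str
    route_targets: FrozenSet[str]


def import_route(
    route: EvpnMacRoute, local_rd: str, import_rts: FrozenSet[str]
) -> Optional[EvpnMacRoute]:
    """Return a copy of the route under the local RD if any RT matches."""
    if route.route_targets.isdisjoint(import_rts):
        return None  # no matching route target, so the route is not imported
    # The original route is untouched; the imported copy carries the local RD.
    return replace(route, rd=local_rd)


if __name__ == "__main__":
    received = EvpnMacRoute(
        rd="192.0.2.4:33023",
        mac="a46c.2a77.5800",
        next_hop="192.0.2.4",
        route_targets=frozenset({"65000:1024"}),
    )
    print(import_route(received, "192.0.2.3:32895", frozenset({"65000:1024"})))
```

The received route and its imported copy both stay in the table, which is why the same MAC appears once under 192.0.2.4:33023 and once under 192.0.2.3:32895.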

EVPN Type-2 Advertisement

Let's take a look at the BGP NLRI and its constituent parts. We'll take a look at the NLRI after it's been imported (i.e. the RD has changed to the local RD):

Leaf01# show bgp l2vpn evpn a46c.2a77.5800 
BGP routing table information for VRF default, address family L2VPN EVPN
Route Distinguisher: 192.0.2.3:32895    (L2VNI 1024)
BGP routing table entry for [2]:[0]:[0]:[48]:[a46c.2a77.5800]:[0]:[0.0.0.0]/216, version 156
Paths: (1 available, best #1)
Flags: (0x000212) on xmit-list, is in l2rib/evpn, is not in HW

  Advertised path-id 1
  Path type: internal, path is valid, is best path, no labeled nexthop, in rib
             Imported from 192.0.2.4:33023:[2]:[0]:[0]:[48]:[a46c.2a77.5800]:[0]:[0.0.0.0]/112 
  AS-Path: NONE, path sourced internal to AS
    192.0.2.4 (metric 9) from 192.0.2.1 (192.0.2.1)
      Origin IGP, MED not set, localpref 100, weight 0
      Received label 1024
      Extcommunity:  RT:65000:1024 ENCAP:8
      Originator: 192.0.2.4 Cluster list: 192.0.2.1 

  Path-id 1 not advertised to any peer

The key fields in this output are:

  • NLRI
    • [2] - this denotes an EVPN type-2 advertisement, a "MAC/IP Advertisement Route"
    • [0] - the Ethernet Segment Identifier.
    • [0] - the Ethernet Tag Identifier.
    • [48] - MAC address length.
    • [a46c.2a77.5800] - MAC address.
    • [0] - IP address length.
    • [0.0.0.0] - IP address.
    • /216 - total length of the 'prefix'.
  • Received label 1024 - this is the VNI (and thus VLAN) that this MAC is associated with.
  • Extended Communities
    • RT:65000:1024 - this is the layer 2 route target.
    • ENCAP:8 - this is the overlay encapsulation. 8 refers to VXLAN.
    • MAC Mobility Sequence - incremented when a MAC address moves between leaves and used as a tiebreaker when there are duplicate entries for the same MAC. It is not present in the output above.
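
If you want to pull these fields out of CLI output programmatically, the bracketed notation is easy to split with a regular expression. The following Python sketch is one way to do it; the pattern and the field names are my own, chosen to mirror the list above, and it only handles the type-2 format shown in this chapter.

```python
import re

# One capture group per bracketed field, in the order NX-OS prints them.
NLRI_PATTERN = re.compile(
    r"\[(?P<route_type>\d+)\]:\[(?P<esi>[^\]]+)\]:\[(?P<eth_tag>\d+)\]:"
    r"\[(?P<mac_len>\d+)\]:\[(?P<mac>[0-9a-f.]+)\]:"
    r"\[(?P<ip_len>\d+)\]:\[(?P<ip>[^\]]+)\]/(?P<prefix_len>\d+)"
)

INT_FIELDS = ("route_type", "eth_tag", "mac_len", "ip_len", "prefix_len")


def parse_type2_nlri(text: str) -> dict:
    """Split a type-2 NLRI string as displayed by 'show bgp l2vpn evpn'."""
    match = NLRI_PATTERN.fullmatch(text.strip())
    if match is None:
        raise ValueError(f"not a recognised type-2 NLRI: {text!r}")
    fields = match.groupdict()
    for key in INT_FIELDS:
        fields[key] = int(fields[key])
    if fields["route_type"] != 2:
        raise ValueError("only type-2 (MAC/IP Advertisement) routes are handled")
    return fields


if __name__ == "__main__":
    print(parse_type2_nlri("[2]:[0]:[0]:[48]:[a46c.2a77.5800]:[0]:[0.0.0.0]/216"))
```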

Packet Analysis

The following image shows a Wireshark capture of an intra-VLAN flow. The leaf switches don't have MAC address information for the endpoints; however, in this capture the endpoints themselves do have entries for each other in their ARP tables. The Source and Destination columns show two IP addresses. The first is the VXLAN encapsulation src/dst addresses, the second is the inner payload's src/dst.

When describing the hosts I will use the names Host01 and Host02, each being connected to Leaf01 and Leaf02 respectively.

[Image: Wireshark capture of the intra-VLAN flow]

The first ICMP echo request is sent out from Host01. As its leaf switch does not have an entry for the destination in its L2 route table, the packet is encapsulated in VXLAN and sent with a source address of 192.0.2.3, the VTEP IP address, and a multicast destination address of 239.1.1.1. We do not see a reply from Host02, which implies that the Leaf02 switch is dropping this request as it doesn't have a MAC entry for the destination.

As the first packet is sent out, Leaf01 learns the MAC address, exports this into its L2 routing table and advertises it via BGP. Let's have a look at the UPDATE message:

Border Gateway Protocol - UPDATE Message
    Marker: ffffffffffffffffffffffffffffffff
    Length: 104
    Type: UPDATE Message (2)
    Withdrawn Routes Length: 0
    Total Path Attribute Length: 81
    Path attributes
        Path Attribute - ORIGIN: IGP
        Path Attribute - AS_PATH: empty
        Path Attribute - LOCAL_PREF: 100
        Path Attribute - EXTENDED_COMMUNITIES
            Type Code: EXTENDED_COMMUNITIES (16)
            Length: 16
            Carried extended communities: (2 communities)
                Community Transitive Two-Octet AS Route Target: 65000:1024
                    Community type high: Transitive Two-Octet AS (0x00)
                    Subtype as2: Route Target (0x02)
                    Two octets AS specific: 65000
                    Four octets AN specific: 1024
                Community Transitive Opaque Encapsulation: VXLAN Encapsulation
                    Community type high: Transitive Opaque (0x03)
                    Subtype opaque: Encapsulation (0x0c)
                    Four octets Value specific: 0x00000000
                    Tunnel types: VXLAN Encapsulation (8)
        Path Attribute - MP_REACH_NLRI
            Type Code: MP_REACH_NLRI (14)
            Length: 44
            Address family identifier (AFI): Layer-2 VPN (25)
            Subsequent address family identifier (SAFI): EVPN (70)
            Next hop network address (4 bytes)
            Number of Subnetwork points of attachment (SNPA): 0
            Network layer reachability information (35 bytes)
                EVPN NLRI: MAC Advertisement Route
                    AFI: MAC Advertisement Route (2)
                    Length: 33
                    Route Distinguisher: 0001c0000203807f (192.0.2.3:32895)
                    ESI: 00 00 00 00 00 00 00 00 00
                        ESI Type: ESI 9 bytes value (0)
                        ESI 9 bytes value: 00 00 00 00 00 00 00 00 00
                    Ethernet Tag ID: 0
                    MAC Address Length: 48
                    MAC Address: Cisco_77:58:01 (a4:6c:2a:77:58:01)
                    IP Address Length: 0
                    IP Address: NOT INCLUDED
                    MPLS Label Stack: 64, (BOGUS: Bottom of Stack NOT set!)

The well-known path attributes (ORIGIN, AS_PATH and LOCAL_PREF) are all standard. We can see that there are two extended communities attached to the advertisement:

  • A Two-Octet AS Route Target. This has been auto-generated from the AS number and the VNI.
  • An Opaque Encapsulation, notifying that VXLAN encapsulation should be used.
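
Both communities are 8 bytes on the wire, which is why the attribute length is 16. The Python sketch below decodes them by hand. The byte strings in the example are reconstructed by me from the dissected fields above rather than copied from the capture, and the function name is my own; the layouts follow the two-octet AS route target and the opaque encapsulation community formats as I understand them.

```python
import struct


def decode_ext_community(raw: bytes) -> str:
    """Decode the two extended community types seen in this UPDATE."""
    if len(raw) != 8:
        raise ValueError("an extended community is always 8 bytes")
    type_high, subtype = raw[0], raw[1]
    if (type_high, subtype) == (0x00, 0x02):
        # Two-octet AS route target: 2-byte AS number, 4-byte assigned number.
        asn, number = struct.unpack("!HI", raw[2:])
        return f"RT:{asn}:{number}"
    if (type_high, subtype) == (0x03, 0x0C):
        # Opaque encapsulation: 4 reserved bytes, 2-byte tunnel type.
        _reserved, tunnel_type = struct.unpack("!IH", raw[2:])
        return f"ENCAP:{tunnel_type}"
    return f"unknown:{raw.hex()}"


if __name__ == "__main__":
    # Reconstructed from the dissection above (assumption, not raw capture data).
    print(decode_ext_community(bytes.fromhex("0002fde800000400")))
    print(decode_ext_community(bytes.fromhex("030c000000000008")))
```

The decoded strings use the same RT:65000:1024 and ENCAP:8 notation that the switch prints under Extcommunity.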

We then see the multi-protocol network layer reachability information (MP_REACH_NLRI). The reachability information is part of the layer-2 VPN address family, and the sub-address family is EVPN.

Following the sizing definitions we have the network layer reachability information.

  • We can see this is a type-2 EVPN advertisement.
  • The route distinguisher is the auto-generated 192.0.2.3:32895.
  • The ESI, used to distinguish between the sources of advertisements if the CE is multi-homed to a PE.
  • The Ethernet tag, which "identifies a particular broadcast domain (e.g. VLAN) within an EVPN instance".
  • The MAC address length.
  • The MAC address value.
  • The IP address length. This is 0 as this is a MAC advertisement exported from the MAC address table, as opposed to a MAC-IP advertisement exported from the ARP table.
  • The IP address, which is not included.
  • The last value is the 'MPLS Label', which in this instance holds the VXLAN VNI. Wireshark is incorrectly dissecting this, so the output above is not correct. The last 3 octets hold the label, and in our capture they are 0x00 0x04 0x00. This is 1024 in decimal, which matches the VNI assigned to VLAN 128 on Leaf01.
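
We can check both the route distinguisher and the label by hand. The Python sketch below decodes the 8-byte RD shown in the capture and interprets the 3 label octets two ways: as a 24-bit VNI, and as an MPLS label stack entry (20-bit label, 3-bit traffic class, 1-bit bottom of stack), which is how Wireshark appears to be reading it. The function names are my own.

```python
import ipaddress
import struct


def decode_rd_type1(raw: bytes) -> str:
    """Decode a type 1 route distinguisher (IPv4 address : 2-byte number)."""
    rd_type, addr, number = struct.unpack("!H4sH", raw)
    if rd_type != 1:
        raise ValueError("only type 1 route distinguishers are handled")
    return f"{ipaddress.IPv4Address(addr)}:{number}"


def label_as_vni(raw: bytes) -> int:
    """Read all 24 bits of the label field as a VXLAN VNI."""
    return int.from_bytes(raw, "big")


def label_as_mpls(raw: bytes):
    """Read the field as an MPLS entry: (20-bit label, bottom-of-stack bit)."""
    value = int.from_bytes(raw, "big")
    return value >> 4, bool(value & 0x1)


if __name__ == "__main__":
    print(decode_rd_type1(bytes.fromhex("0001c0000203807f")))
    print(label_as_vni(bytes([0x00, 0x04, 0x00])))
    print(label_as_mpls(bytes([0x00, 0x04, 0x00])))
```

Read as a VNI the octets give 1024. Read as an MPLS entry they give label 64 with the bottom-of-stack bit clear, which is exactly the "64, (BOGUS: Bottom of Stack NOT set!)" that the dissector reports.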

The next ICMP echo request is sent, and the multicast address is still used as Leaf01 does not know where this host is. This echo request makes it to Host02 which sends a reply. As Leaf02 has learnt the MAC address via the BGP UPDATE, this is sent as a unicast back to Leaf01.

Due to the echo reply, Leaf02 has now learnt the MAC address of Host02 and advertises this via a BGP UPDATE. From this point on both leaf switches have the MAC reachability information and send unicast VXLAN encapsulated frames across the overlay.
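
The forwarding decision described in this walkthrough boils down to a table lookup. Here is a deliberately simplified Python model of it, under the assumption that a leaf only needs a (VNI, MAC) to remote VTEP table and a per-VNI multicast group; the names are hypothetical and a real switch does far more than this.

```python
# Multicast groups as configured under interface nve1.
BUM_GROUP = {1024: "239.1.1.1", 1025: "239.1.1.1"}


def outer_destination(remote_macs: dict, vni: int, dst_mac: str) -> str:
    """Pick the outer IP destination for a frame entering the overlay."""
    vtep = remote_macs.get((vni, dst_mac))
    if vtep is not None:
        return vtep  # known remote MAC: unicast to the owning VTEP
    return BUM_GROUP[vni]  # unknown MAC: flood via the VNI's multicast group


if __name__ == "__main__":
    leaf01 = {}  # Leaf01 starts with no remote MAC entries
    print(outer_destination(leaf01, 1024, "a46c.2a77.5800"))

    # After Leaf02's BGP UPDATE is received and imported on Leaf01.
    leaf01[(1024, "a46c.2a77.5800")] = "192.0.2.4"
    print(outer_destination(leaf01, 1024, "a46c.2a77.5800"))
```

Before the UPDATE arrives the lookup falls through to 239.1.1.1; afterwards the same frame is sent as unicast to 192.0.2.4.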
