The Thinking Network (Installment 6): The BGP Overlay & EVPN Services
Installment 6 of The Thinking Network. We deploy the BGP overlay and EVPN address families to create the control plane lever our autonomous AI will use for traffic engineering.
Architecture Overview: Phase 6 (The Overlay)
Objective: Deploy the BGP control plane that the autonomous AI agent will manipulate.
Core Technologies: BGP (Border Gateway Protocol), iBGP Full-Mesh, EVPN (Ethernet VPN), and SR Linux Overlay Services.
The Goal: Establish an iBGP overlay across the IS-IS underlay, configuring the specific BGP attributes (like Local Preference) that our AI model will dynamically alter to steer traffic.
The Overlay
IS-IS built the map. Every router knows where every other router is. The loopback addresses are distributed. The routing table is populated.
Now the overlay goes on top.
BGP (Border Gateway Protocol) is the protocol that carries the information the AI layer will eventually use to make decisions. It is also the protocol the AI will reach into when it decides to act.
Two Protocols, Two Jobs
A question that comes up regularly: if IS-IS is already routing packets, why does this lab also need BGP?
The answer is scope.
IS-IS operates within a single routing domain. It knows the topology of this network and distributes reachability information for the addresses within it. It does not carry service routes. It does not carry VPN information. It does not scale to the hundreds of thousands of prefixes that flow across internet exchange points.
BGP was designed to carry routing information across boundaries - between the distinct, independently-administered networks that form the internet. In this lab, BGP runs as iBGP - internal BGP, a full mesh of sessions within a single autonomous system. All four nodes are in AS 65000. Each node peers with every other node.
The Role of EVPN and L2 Bridges
When we talk about a service route, we are usually talking about an EVPN route. EVPN stands for Ethernet VPN. It is the modern standard for extending Layer 2 networks over a Layer 3 infrastructure.
An L2 bridge domain is essentially a virtual switch. When you connect two servers on opposite sides of the fabric, they need to believe they are on the same local Ethernet segment. They need to broadcast ARP requests and find each other's MAC addresses.
In a traditional network, this would require a Layer 2 path of switches and trunks spanning the entire distance. In this lab, we use EVPN to bridge that gap. The SR Linux routers perform a process called MAC learning. When a client sends a frame, the router learns the source MAC address and encapsulates that Ethernet frame inside an IP packet that the Layer 3 fabric can route.
BGP then distributes the reachability of that MAC address across the iBGP mesh. To the clients, the network appears as a single, transparent switch, even though the traffic is actually traversing a complex Layer 3 fabric managed by IS-IS.
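The learn-then-advertise flow described above can be sketched as a toy model. This is not how SR Linux implements it; the `BridgeDomain` class, the dictionary used as a stand-in for an EVPN MAC advertisement, and the port name `ethernet-1/10` are all hypothetical and exist only to illustrate the idea.

```python
from dataclasses import dataclass, field


@dataclass
class BridgeDomain:
    """Toy model of one router's view of an L2 bridge domain."""
    router: str
    # MAC address -> ("local", port) or ("remote", advertising router)
    mac_table: dict = field(default_factory=dict)

    def learn_local(self, mac: str, port: str) -> dict:
        """Learn a MAC from a frame on a local port and return a route to advertise."""
        self.mac_table[mac] = ("local", port)
        # Hypothetical stand-in for the MAC advertisement BGP would carry.
        return {"mac": mac, "next_hop": self.router}

    def install_remote(self, route: dict) -> None:
        """Install a MAC received from another router over the iBGP mesh."""
        # Locally learned entries are kept; remote routes fill in the rest.
        self.mac_table.setdefault(route["mac"], ("remote", route["next_hop"]))

    def lookup(self, mac: str):
        """Return the forwarding decision for a destination MAC, or None if unknown."""
        return self.mac_table.get(mac)


srl1 = BridgeDomain("172.1.255.255")
srl4 = BridgeDomain("172.4.255.255")

# A client behind srl1 sends a frame; srl1 learns its MAC and advertises it.
route = srl1.learn_local("aa:bb:cc:00:00:01", "ethernet-1/10")
srl4.install_remote(route)

print(srl4.lookup("aa:bb:cc:00:00:01"))  # ('remote', '172.1.255.255')
```

From srl4's point of view, the client's MAC now resolves to srl1's loopback, which is exactly the address IS-IS already knows how to reach.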
When a service route needs to be distributed - an EVPN route for an L2 bridge domain, a VPN route for an L3 service - it travels through the iBGP mesh. When the AI layer needs to influence traffic flow, it will do so through BGP. Not by changing routes directly, but by changing a BGP attribute called Local Preference. IS-IS gets packets between routers, but BGP decides where services go and which path carries them.
Configuring BGP by Hand
If you were building this lab without automation, this is what you would type. Each node needs its own router ID, its autonomous system number, a peer group definition, and a neighbor statement for each of the other three nodes.
The router ID for each node is its loopback address. The loopback is not tied to any physical interface, so it does not go down when a physical link fails, and it stays reachable as long as IS-IS has any path to the node. BGP sessions built on loopback addresses survive link failures and reconverge through whatever alternate path IS-IS provides. This is the IS-IS/BGP interdependency made explicit: IS-IS makes BGP resilient.
srl1:
--{ candidate }--[ ]--
A:srl1# set / network-instance default protocols bgp admin-state enable
A:srl1# set / network-instance default protocols bgp autonomous-system 65000
A:srl1# set / network-instance default protocols bgp router-id 172.1.255.255
A:srl1# set / network-instance default protocols bgp afi-safi ipv4-unicast admin-state enable
A:srl1# set / network-instance default protocols bgp afi-safi evpn admin-state enable
A:srl1# set / network-instance default protocols bgp group ibgp-mesh peer-as 65000
A:srl1# set / network-instance default protocols bgp group ibgp-mesh afi-safi ipv4-unicast admin-state enable
A:srl1# set / network-instance default protocols bgp group ibgp-mesh afi-safi evpn admin-state enable
A:srl1# set / network-instance default protocols bgp neighbor 172.2.255.255 peer-group ibgp-mesh
A:srl1# set / network-instance default protocols bgp neighbor 172.3.255.255 peer-group ibgp-mesh
A:srl1# set / network-instance default protocols bgp neighbor 172.4.255.255 peer-group ibgp-mesh
A:srl1# commit stay
srl2:
A:srl2# set / network-instance default protocols bgp admin-state enable
A:srl2# set / network-instance default protocols bgp autonomous-system 65000
A:srl2# set / network-instance default protocols bgp router-id 172.2.255.255
A:srl2# set / network-instance default protocols bgp afi-safi ipv4-unicast admin-state enable
A:srl2# set / network-instance default protocols bgp afi-safi evpn admin-state enable
A:srl2# set / network-instance default protocols bgp group ibgp-mesh peer-as 65000
A:srl2# set / network-instance default protocols bgp group ibgp-mesh afi-safi ipv4-unicast admin-state enable
A:srl2# set / network-instance default protocols bgp group ibgp-mesh afi-safi evpn admin-state enable
A:srl2# set / network-instance default protocols bgp neighbor 172.1.255.255 peer-group ibgp-mesh
A:srl2# set / network-instance default protocols bgp neighbor 172.3.255.255 peer-group ibgp-mesh
A:srl2# set / network-instance default protocols bgp neighbor 172.4.255.255 peer-group ibgp-mesh
A:srl2# commit stay
srl3:
A:srl3# set / network-instance default protocols bgp admin-state enable
A:srl3# set / network-instance default protocols bgp autonomous-system 65000
A:srl3# set / network-instance default protocols bgp router-id 172.3.255.255
A:srl3# set / network-instance default protocols bgp afi-safi ipv4-unicast admin-state enable
A:srl3# set / network-instance default protocols bgp afi-safi evpn admin-state enable
A:srl3# set / network-instance default protocols bgp group ibgp-mesh peer-as 65000
A:srl3# set / network-instance default protocols bgp group ibgp-mesh afi-safi ipv4-unicast admin-state enable
A:srl3# set / network-instance default protocols bgp group ibgp-mesh afi-safi evpn admin-state enable
A:srl3# set / network-instance default protocols bgp neighbor 172.1.255.255 peer-group ibgp-mesh
A:srl3# set / network-instance default protocols bgp neighbor 172.2.255.255 peer-group ibgp-mesh
A:srl3# set / network-instance default protocols bgp neighbor 172.4.255.255 peer-group ibgp-mesh
A:srl3# commit stay
srl4:
A:srl4# set / network-instance default protocols bgp admin-state enable
A:srl4# set / network-instance default protocols bgp autonomous-system 65000
A:srl4# set / network-instance default protocols bgp router-id 172.4.255.255
A:srl4# set / network-instance default protocols bgp afi-safi ipv4-unicast admin-state enable
A:srl4# set / network-instance default protocols bgp afi-safi evpn admin-state enable
A:srl4# set / network-instance default protocols bgp group ibgp-mesh peer-as 65000
A:srl4# set / network-instance default protocols bgp group ibgp-mesh afi-safi ipv4-unicast admin-state enable
A:srl4# set / network-instance default protocols bgp group ibgp-mesh afi-safi evpn admin-state enable
A:srl4# set / network-instance default protocols bgp neighbor 172.1.255.255 peer-group ibgp-mesh
A:srl4# set / network-instance default protocols bgp neighbor 172.2.255.255 peer-group ibgp-mesh
A:srl4# set / network-instance default protocols bgp neighbor 172.3.255.255 peer-group ibgp-mesh
A:srl4# commit stay
Four nodes. Eleven set commands each, followed by a commit. Three neighbor statements per node, each pointing at the loopback of another node. A full mesh means every node peers with every other node directly - no route reflector, no hierarchy. Six sessions total. Every node learns every route straight from the node that originated it.
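The session count follows from the full-mesh rule: one session per unordered pair of nodes, or n(n-1)/2. A quick sketch that enumerates the pairs for this lab's four loopbacks:

```python
from itertools import combinations

# Loopbacks as used in this lab: 172.<site>.255.255 for sites 1..4.
loopbacks = [f"172.{site}.255.255" for site in range(1, 5)]

# A full mesh has one session per unordered pair of nodes: n * (n - 1) / 2.
sessions = list(combinations(loopbacks, 2))

for a, b in sessions:
    print(f"{a} <-> {b}")

print(len(sessions))  # 6
```

The same formula gives 780 sessions at 40 nodes, which is why larger networks move to route reflectors. At four nodes, the full mesh is the simpler choice.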
The peer group ibgp-mesh is a template. The neighbor statements reference it rather than repeating the same attributes three times per node, once for each neighbor. Both the ipv4-unicast and evpn address families are enabled. ipv4-unicast carries IP prefixes. evpn carries the MAC and IP binding information for the L2 and L3 services the client containers will use.
What the Script Does Instead
The Python script generates the neighbor statements dynamically from the node list in lab_config.py:
neighbors = "".join([
    f"set / network-instance default protocols bgp neighbor 172.{i}.255.255 peer-group ibgp-mesh\n"
    for i in range(1, 5) if str(i) != sid
])
For srl1, sid is the string "1", so the loop generates neighbors for sites 2, 3, and 4. For srl4, sid is "4", and the loop generates neighbors for sites 1, 2, and 3. The router ID is 172.{sid}.255.255 - the same loopback address IS-IS distributed.
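To see how the neighbor loop fits into a complete per-node config, here is a minimal self-contained sketch. The function name `render_bgp_config` and its parameters are assumptions made for illustration, not the actual contents of the lab script. The command strings are the same ones shown in the hand-typed configuration.

```python
def render_bgp_config(sid: str, node_count: int = 4, asn: int = 65000) -> str:
    """Return the BGP set commands for one node (hypothetical helper)."""
    base = "set / network-instance default protocols bgp"
    lines = [
        f"{base} admin-state enable",
        f"{base} autonomous-system {asn}",
        f"{base} router-id 172.{sid}.255.255",
        f"{base} afi-safi ipv4-unicast admin-state enable",
        f"{base} afi-safi evpn admin-state enable",
        f"{base} group ibgp-mesh peer-as {asn}",
        f"{base} group ibgp-mesh afi-safi ipv4-unicast admin-state enable",
        f"{base} group ibgp-mesh afi-safi evpn admin-state enable",
    ]
    # One neighbor statement per other node; a node never peers with itself.
    lines += [
        f"{base} neighbor 172.{i}.255.255 peer-group ibgp-mesh"
        for i in range(1, node_count + 1)
        if str(i) != sid
    ]
    return "\n".join(lines) + "\n"


config = render_bgp_config("1")
print(config)
```

Calling it with "1" through "4" produces the four blocks typed by hand earlier, minus the prompt and the commit.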
The script writes all four configs in one pass. The startup files load at boot. BGP sessions form as soon as IS-IS has distributed the loopback addresses - which is why IS-IS is configured first and why the audit verifies IS-IS before checking BGP.
Why BGP Is the AI's Lever
Here is the specific reason BGP matters to this series beyond just completing the control plane.
The AI layer, when it determines it needs to reroute traffic, does not reprogram the forwarding table. It does not touch IS-IS. It reaches into BGP and changes one attribute: Local Preference.
Local Preference is a number that tells BGP how much to favor a route relative to alternatives. The default is 100. Higher values win. Because the attribute is carried between iBGP peers, every router in AS 65000 sees the same value and reaches the same decision. If the AI detects that latency on Path A is trending toward the 5ms SLA threshold, it changes the Local Preference on the Path A route from 100 to 50. BGP reconverges. Path B becomes preferred. Traffic moves.
When latency recovers, the AI restores Local Preference to 100. BGP reconverges again. Traffic returns to Path A.
The AI did not redesign the network. It changed a number. The network did what it does - it followed the policy.
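The demote-and-restore cycle can be sketched in a few lines. This is a simplified illustration, not the AI layer itself. The 80 percent trigger (`warn_fraction`), the `igp_metric` tie-breaker, and the function names are assumptions. Real BGP best-path selection has more steps after Local Preference.

```python
from dataclasses import dataclass

DEFAULT_LOCAL_PREF = 100   # BGP default, as stated above
DEMOTED_LOCAL_PREF = 50    # value the text uses for the degraded path
SLA_MS = 5.0               # latency SLA threshold from the text


@dataclass
class Path:
    name: str
    local_pref: int = DEFAULT_LOCAL_PREF
    igp_metric: int = 10   # hypothetical tie-breaker for this sketch


def best_path(paths):
    """Highest Local Preference wins; ties fall back to the lowest IGP metric."""
    return max(paths, key=lambda p: (p.local_pref, -p.igp_metric))


def local_pref_for(latency_ms: float, warn_fraction: float = 0.8) -> int:
    """Demote a path once latency reaches warn_fraction of the SLA (assumed trigger)."""
    if latency_ms >= SLA_MS * warn_fraction:
        return DEMOTED_LOCAL_PREF
    return DEFAULT_LOCAL_PREF


path_a = Path("Path A", igp_metric=10)
path_b = Path("Path B", igp_metric=20)

# Latency on Path A rises toward the SLA, then recovers.
for latency in (2.0, 4.5, 2.0):
    path_a.local_pref = local_pref_for(latency)
    print(latency, path_a.local_pref, best_path([path_a, path_b]).name)
```

With these sample latencies, the preferred path goes A, B, A: the only thing that changed between iterations was one integer on one route.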
This is why the overlay had to be built before the intelligence layer is meaningful. The intelligence does not work around BGP. It works through it. And it only works through it because IS-IS is underneath providing the loopback reachability that makes BGP sessions stable.
Verification
After IS-IS has converged and the startup configs have loaded:
docker exec clab-nbl-diamond-v1-srl1 sr_cli \
-c "show network-instance default protocols bgp neighbor"
+----------+---------------+-----------+-------+--------+-------------+-----------+
| Net-Inst | Peer          | Group     | Flags | Peer-AS| State       | Uptime    |
+==========+===============+===========+=======+========+=============+===========+
| default  | 172.2.255.255 | ibgp-mesh | S     | 65000  | established | 0d:0h:28m |
| default  | 172.3.255.255 | ibgp-mesh | S     | 65000  | established | 0d:0h:28m |
| default  | 172.4.255.255 | ibgp-mesh | S     | 65000  | established | 0d:0h:28m |
+----------+---------------+-----------+-------+--------+-------------+-----------+
3 configured neighbors, 3 configured sessions are established
Three neighbors. Three sessions. All established. The iBGP full mesh is up.
Every node shows the same result - three established sessions pointing at the other three loopbacks. Six sessions total across the fabric. The control plane is complete.
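A check like this is easy to automate. The sketch below parses a table in the layout shown above and confirms that every session is established. The sample text is copied from this article; the exact columns may differ between SR Linux releases, and `parse_neighbors` is a hypothetical helper, not part of the lab's audit script.

```python
SAMPLE_OUTPUT = """\
+----------+---------------+-----------+-------+--------+-------------+-----------+
| Net-Inst | Peer          | Group     | Flags | Peer-AS| State       | Uptime    |
+==========+===============+===========+=======+========+=============+===========+
| default  | 172.2.255.255 | ibgp-mesh | S     | 65000  | established | 0d:0h:28m |
| default  | 172.3.255.255 | ibgp-mesh | S     | 65000  | established | 0d:0h:28m |
| default  | 172.4.255.255 | ibgp-mesh | S     | 65000  | established | 0d:0h:28m |
+----------+---------------+-----------+-------+--------+-------------+-----------+
3 configured neighbors, 3 configured sessions are established
"""


def parse_neighbors(text: str) -> list:
    """Turn the pipe-delimited table into a list of dicts keyed by column header."""
    header = None
    rows = []
    for line in text.splitlines():
        if not line.startswith("|"):
            continue  # skip borders and the summary line
        cells = [cell.strip() for cell in line.strip().strip("|").split("|")]
        if header is None:
            header = cells  # first pipe-delimited line is the header row
        else:
            rows.append(dict(zip(header, cells)))
    return rows


neighbors = parse_neighbors(SAMPLE_OUTPUT)
down = [n["Peer"] for n in neighbors if n["State"] != "established"]

print(len(neighbors), "neighbors,", len(down), "not established")
```

In practice the text would come from the docker exec command above, for example captured with `subprocess.run`, rather than from a hard-coded string.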
What the Fabric Is Now
Four nodes. IS-IS underlay distributing loopback reachability across the diamond. iBGP full mesh running across those loopbacks in AS 65000. EVPN address family enabled for L2 and L3 service distribution. End-to-end client connectivity verified.
The fabric is a digital twin of carrier infrastructure running on a laptop. Not a simulation. The actual Nokia SR Linux operating system, running the actual protocols, carrying actual traffic.
The next phase is to give it the ability to watch itself.
Next installment: The Health Audit. The automated verification that the fabric is exactly what we think it is - before we hand it to the AI.