BGP and IPv6 routing courses
Several times a year I teach two training courses, one about BGP and one about IPv6. The BGP course is half theory and half hands-on practice, and so is the new IPv6 routing course. Previously, we did an IPv6 course without a hands-on part.
The courses consist of a theory part in the morning and a practical part in the afternoon, where the participants implement several assignments on a Cisco router (in groups of two participants per router).
Dates for upcoming courses in 2015 are:
Interdomain Routing & IPv6 News
At the NANOG meeting in San Francisco two weeks ago, there was a session on "The benefits of deploying IPv6 only". Someone from T-Mobile explained that the latest Windows Mobile and Android support 464XLAT, which allows IPv4-only applications to work over IPv6 with NAT64, so those devices now only get IPv6. Other devices only get IPv4; there's no dual stack. At that point, the panelists didn't know yet that Apple is requiring iOS 9 apps to work over IPv6, so those can work through NAT64 without 464XLAT.
Another interesting data point is the observation by Facebook that IPv6 tends to perform better than IPv4, with the margin being as large as 40%:
However, it is unclear why this is: the RTTs are the same, yet the performance/bandwidth over IPv6 is better. There was some frustration because Apple's implementation of "happy eyeballs" only looks at the RTT to choose between IPv4 and IPv6, and thus lands on IPv4 a good deal of the time and doesn't enjoy the benefits of that better IPv6 performance.
Earlier this month, RIPE Labs had a lengthy blog post about transfers of IPv4 addresses within the RIPE region. A lot of addresses went from Romania to Saudi Arabia, but the rest of Europe and the Middle East has been busy, too. However:
In the subsequent months of January 2015 through to April 2015, levels of transfer were significantly lower. Because the RIPE NCC listing service continues to show strong demand, the lower amounts transferred may well be a sign that the market in the RIPE region is capped by availability; total demand cannot be met by available supplies. This may change after the recently accepted RIPE policy for inter-RIR transfers has been implemented.
It probably wasn't an accident that two of the sponsors of the RIPE-70 meeting were businesses that facilitate IPv4 address trading.
For some years now, the Regional Internet Registries have been rolling out RPKI. The Resource Public Key Infrastructure allows holders of IP addresses to authorize an autonomous system to inject those addresses in BGP. (See here for an overview of how RPKI works and more links.)
I've always thought it would be hard to deploy RPKI in the real world, because it's just way too easy for a certificate or ROA (route origination authorization) to expire. If that then leads to routes becoming invalid and the addresses in question being unreachable, that would be a good example of the cure being worse than the disease.
Fortunately, that's not the case: RPKI is ready for real-world deployment today.
So packets will follow a path that is RPKI-validated if available. If not, they follow a path that isn't covered by RPKI if that's available. Only if there are no "valid" or "unknown" paths will the packets be sent over an "invalid" path: one that is covered by RPKI but failed validation. The trouble with this approach is that it still allows invalid more specific prefixes to hijack traffic. For instance:
RIPE has a ROA for prefix 188.8.131.52/21 that allows AS 3333 to originate that prefix, with a maximum prefix length of /21. So if AS 4444 originates 184.108.40.206/21, that will result in the following BGP table:
   Network            Next Hop        Metric LocPrf Weight Path
>* 220.127.116.11/21  18.104.22.168       10    200      0 3333 i
*                     22.214.171.124      10     50      0 4444 i
So effectively, the path through AS 4444 is ignored. However, AS 4444 could also do this:
   Network            Next Hop        Metric LocPrf Weight Path
>* 126.96.36.199/21   188.8.131.52        10    200      0 3333 i
>* 184.108.40.206/24  220.127.116.11      10     50      0 4444 i
>* 18.104.22.168/24   22.214.171.124      10     50      0 4444 i
>* 126.96.36.199/24   188.8.131.52        10     50      0 4444 i
>* 184.108.40.206/24  220.127.116.11      10     50      0 4444 i
>* 18.104.22.168/24   22.214.171.124      10     50      0 4444 i
>* 126.96.36.199/24   188.8.131.52        10     50      0 4444 i
>* 184.108.40.206/24  220.127.116.11      10     50      0 4444 i
>* 18.104.22.168/24   22.214.171.124      10     50      0 4444 i
So even though the path towards the /21 is still routed to AS 3333, the packets flow to AS 4444 because of the longest match first rule. Solution: filter out "invalid" prefixes completely.
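The effect can be reproduced with a small model of a routing table. This is only a sketch under stated assumptions: the prefix 10.0.0.0/21 is a placeholder from RFC 1918 space rather than RIPE's real prefix, the local preference values 200 and 50 follow the tables above, and the names `rib`, `best_path`, `lookup` and `drop_invalid` are made up for this illustration and don't correspond to any router's implementation.

```python
import ipaddress

# prefix -> list of (local_pref, origin_as, rpki_state) paths for that prefix.
# 10.0.0.0/21 is a placeholder prefix, not the one from the tables above.
rib = {
    "10.0.0.0/21": [(200, 3333, "valid"), (50, 4444, "invalid")],
}
# AS 4444 additionally announces the eight /24s inside the /21.
for subnet in ipaddress.ip_network("10.0.0.0/21").subnets(new_prefix=24):
    rib[str(subnet)] = [(50, 4444, "invalid")]


def best_path(paths):
    """Local preference is only compared between paths for the SAME prefix."""
    return max(paths, key=lambda path: path[0])


def lookup(table, address):
    """Return the origin AS chosen by longest match, or None if nothing matches."""
    addr = ipaddress.ip_address(address)
    matches = [ipaddress.ip_network(p) for p in table
               if addr in ipaddress.ip_network(p)]
    if not matches:
        return None
    longest = max(matches, key=lambda net: net.prefixlen)
    return best_path(table[str(longest)])[1]


def drop_invalid(table):
    """Filter "invalid" paths completely; prefixes left without paths disappear."""
    filtered = {}
    for prefix, paths in table.items():
        kept = [path for path in paths if path[2] != "invalid"]
        if kept:
            filtered[prefix] = kept
    return filtered


print(lookup(rib, "10.0.3.1"))                # 4444: the /24 wins despite local pref 50
print(lookup(drop_invalid(rib), "10.0.3.1"))  # 3333: only the valid /21 is left
```

Lowering the local preference never gets a chance to help here, because the /21 and the /24s are different prefixes and are never compared with each other. Only removing the invalid routes changes where the packets go.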
But then, what happens when RIPE forgets to renew their certificate or ROA in time? If their prefix then reverted to "invalid", it would disappear from routing tables everywhere, and RIPE would be unreachable:
Network Next Hop Metric LocPrf Weight Path
In this scenario, it would be very dangerous to filter "invalid" prefixes, as RPKI is still relatively immature and mistakes will happen.
❝If ARIN (or another other RIR) went offline or signed broken data, all signed prefixes that previously has the RPKI status "Valid", would fall back to the state "Unknown", as if they were never signed in the first place. The state would NOT be "Invalid".❞
So what would happen is this:
   Network            Next Hop        Metric LocPrf Weight Path
>* 126.96.36.199/21   188.8.131.52        10    100      0 3333 i
Obviously, in this case the protection against unauthorized origination of the prefixes in question would go away, but in the normal situation where nobody tries to hijack those prefixes, they would still be reachable and a mistake with certificate or ROA expiration wouldn't immediately lead to a network disappearing off of the internet.
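The difference between "invalid" and "unknown" follows from how origin validation classifies a route, which can be sketched along the lines of RFC 6811. This is a simplified illustration and not a replacement for a real validator: the `Roa` tuple and the `validate` function are hypothetical names, 10.0.0.0/21 is a placeholder prefix, and all certificate checking is reduced to "the ROA is either in the list or it isn't".

```python
import ipaddress
from typing import NamedTuple


class Roa(NamedTuple):
    prefix: str       # prefix the address holder has authorized
    max_length: int   # longest prefix length that may be announced
    origin_as: int    # AS that is allowed to originate the prefix


def validate(route_prefix, origin_as, roas):
    """Return "valid", "invalid" or "unknown" for a single BGP route."""
    route = ipaddress.ip_network(route_prefix)
    covering = []
    for roa in roas:
        roa_net = ipaddress.ip_network(roa.prefix)
        # A ROA covers the route if the route is equal to or inside its prefix.
        if roa_net.version == route.version and route.subnet_of(roa_net):
            covering.append(roa)
    if not covering:
        return "unknown"   # no ROA says anything about this route
    for roa in covering:
        if roa.origin_as == origin_as and route.prefixlen <= roa.max_length:
            return "valid"
    return "invalid"       # covered by a ROA, but none of them matches


roas = [Roa("10.0.0.0/21", 21, 3333)]
print(validate("10.0.0.0/21", 3333, roas))  # valid
print(validate("10.0.0.0/21", 4444, roas))  # invalid: wrong origin AS
print(validate("10.0.2.0/24", 3333, roas))  # invalid: longer than max_length
print(validate("10.0.0.0/21", 3333, []))    # unknown: the ROA is no longer there
```

A route can only be "invalid" if some ROA covers it. Once the ROA has expired and is no longer in the validated set, nothing covers the prefix anymore, so it is treated like any other unsigned prefix.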
In other words: deploy RPKI today. It doesn't protect against all forms of malicious address hijacking, but it does offer very robust protection against accidental unauthorized route origination, such as the infamous YouTube/Pakistan incident. Also, you can run an RPKI validator locally without the need for your upstream ISPs or peers to do the same.
As you may have noticed, I write about BGP from time to time. When coming up with example configurations, there's always the challenge of which AS numbers and IP addresses/prefixes to use. Although it's unlikely people will simply copy numbers and addresses from examples into their own BGP configurations, experience with NTP has shown that this can be a real problem, so it's a good idea to avoid "real" addresses and numbers in examples.
One obvious choice for IPv4 addresses in examples is the RFC 1918 space: 10.0.0.0/8, 172.16.0.0/12 and 192.168.0.0/16. For IPv6, you could use the unique (site) local addresses (ULA, RFC 4193): fc00::/7, or, more precisely, the ones you get to generate yourself in fd00::/8. I wouldn't recommend using the original site local IPv6 addresses (fec0::/10), as these are "deprecated" in RFC 3879.
For AS numbers, there's the private range 64512 - 65534 (16 bit) and 4200000000 - 4294967294 (32 bit) in the IANA registry.
However, there are also address and AS number ranges specifically set aside for example and documentation use. These have the advantage that they can easily be recognized as being intended for documentation, and won't clash with ranges used in private networks. They are:
I actually didn't know about the documentation AS number ranges and the second and third IPv4 documentation ranges. The extra IPv4 ranges will be very useful, as just 192.0.2.0/24 often isn't enough in more complex BGP examples, especially as I don't want to give the impression that it's possible to deaggregate a /24 into smaller parts. The 16-bit documentation AS numbers will also be useful. Unfortunately, the 32-bit ones aren't really useful as they look too much like 16-bit numbers. However, the 65552 - 131071 range is "reserved" so I guess I'll continue to use AS numbers in the 9xxxx range as examples of 32-bit ASNs.
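A quick way to check whether the numbers in an example configuration are safe to publish is to test them against these ranges. The sketch below assumes the documentation ranges as they are defined in RFC 5737 (IPv4), RFC 3849 (IPv6) and RFC 5398 (AS numbers), plus the private AS ranges mentioned earlier. The function names `is_doc_prefix` and `classify_asn` are made up for this illustration.

```python
import ipaddress

# Documentation prefixes: RFC 5737 (IPv4) and RFC 3849 (IPv6).
DOC_PREFIXES = [ipaddress.ip_network(p) for p in (
    "192.0.2.0/24",
    "198.51.100.0/24",
    "203.0.113.0/24",
    "2001:db8::/32",
)]

# Documentation AS numbers (RFC 5398): 64496 - 64511 and 65536 - 65551.
DOC_ASNS = [range(64496, 64512), range(65536, 65552)]

# Private AS numbers: 64512 - 65534 and 4200000000 - 4294967294.
PRIVATE_ASNS = [range(64512, 65535), range(4200000000, 4294967295)]


def is_doc_prefix(prefix):
    """True if the prefix lies entirely inside a documentation range."""
    net = ipaddress.ip_network(prefix)
    return any(net.version == doc.version and net.subnet_of(doc)
               for doc in DOC_PREFIXES)


def classify_asn(asn):
    """Return "documentation", "private" or "other" for an AS number."""
    if any(asn in r for r in DOC_ASNS):
        return "documentation"
    if any(asn in r for r in PRIVATE_ASNS):
        return "private"
    return "other"


print(is_doc_prefix("198.51.100.0/24"))  # True
print(is_doc_prefix("10.0.0.0/8"))       # False: private, not documentation
print(classify_asn(64500))               # documentation
print(classify_asn(65000))               # private
print(classify_asn(3333))                # other
```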
My Books: "BGP" and "Running IPv6"
On this page you can find more information about my book "BGP". Or you can jump immediately to chapter 6, "Traffic Engineering" (approx. 150kB), which O'Reilly has put online as a sample chapter. Information about the Japanese translation can be found here.
More information about my second book, "Running IPv6", is available here.
BGP Security
BGP has some security holes. This sounds very bad, and of course it isn't good, but don't be overly alarmed. There are basically two problems: sessions can be hijacked, and someone who can hijack a session, or who has a legitimate BGP session, can inject incorrect information into the BGP tables.
Session hijacking is hard to do for someone who can't see the TCP sequence number of the TCP session that BGP runs over, and with good anti-spoofing filters in place it is impossible altogether. And of course, using the TCP MD5 password option (RFC 2385) makes all of this nearly impossible even for someone who can sniff the BGP traffic.
Nearly all ISPs filter BGP information from customers, so in most cases it isn't possible to successfully inject false information. However, filtering on peering sessions between ISPs isn't as widespread, although some networks do this. A rogue ISP could do some real damage here.
There are now two efforts underway to better secure BGP:
The IETF RPSEC (routing protocol security) working group is active in this area.
What is BGPexpert.com?
BGPexpert.com is a website dedicated to Internet routing issues. What we want is for packets to find their way from one end of the globe to another, and make the jobs of the people that make this happen a little easier.
Ok, but what is BGP?
Have a look at the "what is BGP" page. There is also a list of BGP and interdomain routing terms on this page.
BGP and Multihoming
If you are not an ISP, your main reason to be interested in BGP will probably be to multihome. By connecting to two or more ISPs at the same time, you are "multihomed" and you no longer have to depend on a single ISP for your network connectivity.
This sounds simple enough, but as always, there is a catch. For regular customers, it's the Internet Service Provider who makes sure the rest of the Internet knows where packets have to be sent to reach their customer. If you are multihomed, you can't let your ISP do this, because then you would have to depend on a single ISP again. This is where the BGP protocol comes in: this is the protocol used to carry this information from ISP to ISP. By announcing reachability information for your network to two ISPs, you can make sure everybody still knows how to reach you if one of those ISPs has an outage.
For those of you interested in multihoming in IPv6 (which is pretty much impossible at the moment), have a look at the "IPv6 multihoming solutions" page.
Are you a BGP expert? Take the test to find out!
These questions are somewhat Cisco-centric. We now also have another set of questions and answers for self-study purposes.