draft-ietf-grow-diverse-bgp-path-dist-01.txt | draft-ietf-grow-diverse-bgp-path-dist-02.txt | |||
---|---|---|---|---|
GROW Working Group R. Raszuk, Ed. | GROW Working Group R. Raszuk, Ed. | |||
Internet-Draft R. Fernando | Internet-Draft R. Fernando | |||
Intended status: Informational K. Patel | Intended status: Informational K. Patel | |||
Expires: December 25, 2010 Cisco Systems | Expires: January 9, 2011 Cisco Systems | |||
D. McPherson | D. McPherson | |||
Arbor Networks | Verisign | |||
K. Kumaki | K. Kumaki | |||
KDDI Corporation | KDDI Corporation | |||
June 23, 2010 | July 8, 2010 | |||
Distribution of diverse BGP paths. | Distribution of diverse BGP paths. | |||
draft-ietf-grow-diverse-bgp-path-dist-01 | draft-ietf-grow-diverse-bgp-path-dist-02 | |||
Abstract | Abstract | |||
The BGP4 protocol specifies the selection and propagation of a single | The BGP4 protocol specifies the selection and propagation of a single | |||
best path for each prefix. As defined today BGP has no mechanisms to | best path for each prefix. As defined today BGP has no mechanisms to | |||
distribute paths other than best path between it's speakers. This | distribute paths other than best path between its speakers. This | |||
behaviour results in number of disadvantages for new applications and | behaviour results in number of disadvantages for new applications and | |||
services. | services. | |||
This document presents an alternative mechanism for solving the | This document presents an alternative mechanism for solving the | |||
problem based on the concept of parallel route reflector planes. It | problem based on the concept of parallel route reflector planes. | |||
also compares existing solutions and proposed ideas that enable | Such planes can be built in parallel or they can co-exist on the | |||
distribution of more paths than just the best path. | current route reflection platforms. The document also compares existing | |||
solutions and proposed ideas that enable distribution of more paths | ||||
than just the best path. | ||||
This proposal does not specify any changes to the BGP protocol | This proposal does not specify any changes to the BGP protocol | |||
definition. It does not require upgrades to provider edge or core | definition. It does not require upgrades to provider edge or core | |||
routers nor does it need network wide upgrades. The authors believe | routers nor does it need network wide upgrades. The authors believe | |||
that the GROW WG would be the best place for this work. | that the GROW WG would be the best place for this work. | |||
Status of this Memo | Status of this Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on January 9, 2011. | ||||
This Internet-Draft will expire on December 25, 2010. | ||||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2010 IETF Trust and the persons identified as the | Copyright (c) 2010 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
2. History . . . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 2. History . . . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
2.1. BGP Add-Paths Proposal . . . . . . . . . . . . . . . . . . 3 | 2.1. BGP Add-Paths Proposal . . . . . . . . . . . . . . . . . . 4 | |||
3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 3. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6 | |||
4. Multi plane route reflection . . . . . . . . . . . . . . . . . 5 | 4. Multi plane route reflection . . . . . . . . . . . . . . . . . 6 | |||
4.1. Co-located best and backup path RRs . . . . . . . . . . . 8 | 4.1. Co-located best and backup path RRs . . . . . . . . . . . 9 | |||
4.2. Randomly located best and backup path RRs . . . . . . . . 9 | 4.2. Randomly located best and backup path RRs . . . . . . . . 10 | |||
4.3. Multi plane route servers for Internet Exchanges . . . . . 11 | 4.3. Multi plane route servers for Internet Exchanges . . . . . 13 | |||
5. Discussion on current models of IBGP route distribution . . . 12 | 5. Discussion on current models of IBGP route distribution . . . 13 | |||
5.1. Full Mesh . . . . . . . . . . . . . . . . . . . . . . . . 12 | 5.1. Full Mesh . . . . . . . . . . . . . . . . . . . . . . . . 13 | |||
5.2. Confederations . . . . . . . . . . . . . . . . . . . . . . 13 | 5.2. Confederations . . . . . . . . . . . . . . . . . . . . . . 15 | |||
5.3. Route reflectors . . . . . . . . . . . . . . . . . . . . . 14 | 5.3. Route reflectors . . . . . . . . . . . . . . . . . . . . . 15 | |||
6. Deployment considerations . . . . . . . . . . . . . . . . . . 14 | 6. Deployment considerations . . . . . . . . . . . . . . . . . . 15 | |||
7. Summary of benefits . . . . . . . . . . . . . . . . . . . . . 16 | 7. Summary of benefits . . . . . . . . . . . . . . . . . . . . . 17 | |||
8. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 16 | 8. Applications . . . . . . . . . . . . . . . . . . . . . . . . . 18 | |||
9. Security considerations . . . . . . . . . . . . . . . . . . . 17 | 9. Security considerations . . . . . . . . . . . . . . . . . . . 18 | |||
10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 17 | 10. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 18 | |||
11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 17 | 11. Contributors . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 18 | 12. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 18 | 13. References . . . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
13.1. Normative References . . . . . . . . . . . . . . . . . . . 18 | 13.1. Normative References . . . . . . . . . . . . . . . . . . . 19 | |||
13.2. Informative References . . . . . . . . . . . . . . . . . . 18 | 13.2. Informative References . . . . . . . . . . . . . . . . . . 20 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 19 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 21 | |||
1. Introduction | 1. Introduction | |||
Current BGP4 [RFC4271] protocol specification allows for the | Current BGP4 [RFC4271] protocol specification allows for the | |||
selection and propagation of only one best path for each prefix. The | selection and propagation of only one best path for each prefix. The | |||
BGP protocol as defined today has no mechanism to distribute other | BGP protocol as defined today has no mechanism to distribute other | |||
than best path between it's speakers. This behaviour results in a | than best path between its speakers. This behaviour results in a | |||
number of problems in the deployment of new applications and | number of problems in the deployment of new applications and | |||
services. | services. | |||
This document presents an alternative mechanism for solving the | This document presents an alternative mechanism for solving the | |||
problem based on the concept of parallel route reflector planes. It | problem based on the concept of parallel route reflector planes. It | |||
also compares existing solutions and proposed ideas that enable | also compares existing solutions and proposed ideas that enable | |||
distribution of more paths than just the best path. The parallel | distribution of more paths than just the best path. The parallel | |||
route reflector planes solution brings very significant benefits at a | route reflector planes solution brings very significant benefits at a | |||
negligible capex and opex deployment price as compared to the | negligible capex and opex deployment price as compared to the | |||
alternative techniques and is being considered by a number of network | alternative techniques and is being considered by a number of network | |||
operators for deployment in their networks. | operators for deployment in their networks. | |||
This proposal does not specify any changes to the BGP protocol | This proposal does not specify any changes to the BGP protocol | |||
definition. It does not require upgrades to provider edge or core | definition. It does not require upgrades to provider edge or core | |||
routers nor does it need network wide upgrades. The authors believe | routers nor does it need network wide upgrades. The only upgrade | |||
that the GROW WG would be the best place for this work. | required is the new functionality on the new or current route | |||
reflectors. The authors believe that the GROW WG would be the best | ||||
place for this work. | ||||
2. History | 2. History | |||
The need to disseminate more paths than just the best path is | The need to disseminate more paths than just the best path is | |||
primarily driven by two requirements. One of them is the problem of | primarily driven by three requirements. First is the problem of BGP | |||
BGP oscillations [I-D.ietf-idr-route-oscillation]. The second is the | oscillations [I-D.ietf-idr-route-oscillation]. The second is the | |||
desire for reduction of time of reachability restoration in the event | desire for reduction of time of reachability restoration in the event | |||
of network or network element's failure. These two reasons have led | of network or network element's failure. Third requirement is to | |||
to the proposal of BGP add-paths [I-D.ietf-idr-add-paths]. | enhance BGP load balancing capabilities. Those reasons have led to | |||
the proposal of BGP add-paths [I-D.ietf-idr-add-paths]. | ||||
2.1. BGP Add-Paths Proposal | 2.1. BGP Add-Paths Proposal | |||
As it has been proven that distribution of only the best path of a | As it has been proven that distribution of only the best path of a | |||
route is not sufficient to meet the needs of continuously growing | route is not sufficient to meet the needs of continuously growing | |||
number of services carried over BGP the add-paths proposal was | number of services carried over BGP the add-paths proposal was | |||
submitted in 2002 to enable BGP to distribute more than one path. | submitted in 2002 to enable BGP to distribute more than one path. | |||
This is achieved by including as a part of the NLRI an additional | This is achieved by including as a part of the NLRI an additional | |||
four octet value called the Path Identifier. | four octet value called the Path Identifier. | |||
The implication of this change on a BGP implementation is that it | The implication of this change on a BGP implementation is that it | |||
must now maintain per path, instead of per prefix, peer advertisement | must now maintain per path, instead of per prefix, peer advertisement | |||
state to track which of the peers each path was advertised to. This | state to track which of the peers each path was advertised to. This | |||
new requirement has it's own memory and processing cost. Suffice to | new requirement has its own memory and processing cost. Suffice to | |||
say that by the middle of 2009 none of the commercial BGP | say that by the end of 2009 none of the commercial BGP implementations | |||
implementations can claim to support the new add-path behaviour in | could claim to support the new add-path behaviour in production code, | |||
production code, in part because of this resource overhead. | in part because of this resource overhead. | |||
An important observation is that distribution of more than one best | An important observation is that distribution of more than one best | |||
path by Autonomous System Border Routers (ASBRs) with multiple EBGP | path by Autonomous System Border Routers (ASBRs) with multiple EBGP | |||
peers attached to it where no "next hop self" is set may result in | peers attached to it where no "next hop self" is set may result in | |||
bestpath selection inconsistency within the autonomous system. | bestpath selection inconsistency within the autonomous system. | |||
Therefore it is also required to attach in the form of a new | Therefore it is also required to attach in the form of a new | |||
attribute the possible tie breakers and propagate those within the | attribute the possible tie breakers and propagate those within the | |||
domain. The example of such attribute for the purpose of fast | domain. The example of such attribute for the purpose of fast | |||
connectivity restoration to address that very case of ASBR injecting | connectivity restoration to address that very case of ASBR injecting | |||
multiple external paths into the IBGP mesh has been presented and | multiple external paths into the IBGP mesh has been presented and | |||
skipping to change at page 4, line 27 | skipping to change at page 5, line 30 | |||
propagated information also best path selection is recommended to be | propagated information also best path selection is recommended to be | |||
modified to make sure that best and backup path selection within the | modified to make sure that best and backup path selection within the | |||
domain stays consistent. More discussion on this particular point | domain stays consistent. More discussion on this particular point | |||
will be contained in the deployment considerations section below. In | will be contained in the deployment considerations section below. In | |||
the proposed solution in this document we observe that in order to | the proposed solution in this document we observe that in order to | |||
address most of the applications just use of best external | address most of the applications just use of best external | |||
advertisement is required. For ASBRs which are peering to multiple | advertisement is required. For ASBRs which are peering to multiple | |||
upstream ASs setting "next hop self" is recommended. | upstream ASs setting "next hop self" is recommended. | |||
The add paths protocol extensions have to be implemented by all the | The add paths protocol extensions have to be implemented by all the | |||
routers within an AS in order for the system to work correctly. The | routers within an AS in order for the system to work correctly. It | |||
required code modifications include enhancements such as the Fast | remains quite a research topic to analyze benefits or risk associated | |||
with partial add-paths deployments. The risk becomes even greater in | ||||
networks not using some form of edge to edge encapsulation. | ||||
The required code modifications include enhancements such as the Fast | ||||
Connectivity Restoration Using BGP Add-path | Connectivity Restoration Using BGP Add-path | |||
[I-D.pmohapat-idr-fast-conn-restore]. The deployment of such | [I-D.pmohapat-idr-fast-conn-restore]. The deployment of such | |||
technology in an entire service provider network requires software | technology in an entire service provider network requires software | |||
and perhaps sometimes in the cases of End-of-Engineering or End-of- | and perhaps sometimes in the cases of End-of-Engineering or End-of- | |||
Life equipment even hardware upgrades. Such an operation may or may | Life equipment even hardware upgrades. Such operation may or may not | |||
not be economically feasible. Even if add-path functionality was | be economically feasible. Even if add-path functionality was | |||
available today on all commercial routing equipment and across all | available today on all commercial routing equipment and across all | |||
vendors, experience indicates that to achieve 100% deployment | vendors, experience indicates that to achieve 100% deployment | |||
coverage within any medium or large global network may easily take | coverage within any medium or large global network may easily take | |||
years. | years. | |||
While it needs to be clearly acknowledged that the add-path mechanism | While it needs to be clearly acknowledged that the add-path mechanism | |||
provides the most general way to address the problem of distributing | provides the most general way to address the problem of distributing | |||
more than one path between BGP speakers, this document provides a | many paths between BGP speakers, this document provides a much easier | |||
much easier to deploy solution that requires no modification to the | to deploy solution that requires no modification to the BGP protocol | |||
BGP protocol. The alternative method presented is capable of | where only a few additional paths may be required. The alternative | |||
addressing critical service provider requirements for disseminating | method presented is capable of addressing critical service provider | |||
more than a single path across an AS with a significantly lower | requirements for disseminating more than a single path across an AS | |||
deployment cost. | with a significantly lower deployment cost. | |||
3. Goals | 3. Goals | |||
The proposal described in this document is not intended to compete | The proposal described in this document is not intended to compete | |||
with add-paths. Instead if deployed it is to be used as a very easy | with add-paths. Instead if deployed it is to be used as a very easy | |||
method to accommodate the majority of applications which may require | method to accommodate the majority of applications which may require | |||
presence of alternative BGP exit points. | presence of alternative BGP exit points. | |||
It is presented to network operators as a possible choice and | It is presented to network operators as a possible choice and | |||
provides those operators who need additional paths today an | provides those operators who need additional paths today an | |||
skipping to change at page 7, line 19 | skipping to change at page 8, line 19 | |||
ASBR will be available to RRs since the other peering ASBR will | ASBR will be available to RRs since the other peering ASBR will | |||
consider the IBGP path as best and will not announce (or if | consider the IBGP path as best and will not announce (or if | |||
already announced will withdraw) its own external path. The | already announced will withdraw) its own external path. The | |||
exception here is the use of BGP Best-External proposal which | exception here is the use of BGP Best-External proposal which | |||
will allow stated ASBR to still propagate to the RRs its own | will allow stated ASBR to still propagate to the RRs its own | |||
external path. Unfortunately RRs will not be able to distribute | external path. Unfortunately RRs will not be able to distribute | |||
it any further to other clients as only the overall best path | it any further to other clients as only the overall best path | |||
will be reflected. | will be reflected. | |||
The proposed solution is based on the use of additional route | The proposed solution is based on the use of additional route | |||
reflectors or new functionality enabled on the exisiting route | reflectors or new functionality enabled on the existing route | |||
reflectors that instead of distributing the best path for each route | reflectors that instead of distributing the best path for each route | |||
will distribute an alternative path other than best. The best path | will distribute an alternative path other than best. The best path | |||
(main) reflector plane distributes the best path for each route as it | (main) reflector plane distributes the best path for each route as it | |||
does today. The second plane distributes the second best path for | does today. The second plane distributes the second best path for | |||
each route and so on. Distribution of N paths for each route can be | each route and so on. Distribution of N paths for each route can be | |||
achieved by using N reflector planes. | achieved by using N reflector planes. | |||
Each plane of route reflectors is a logical entity and may or may not | Each plane of route reflectors is a logical entity and may or may not | |||
be co-located with the existing best path route reflectors. Adding a | be co-located with the existing best path route reflectors. Adding a | |||
route reflector plane to a network may be as easy as enabling a | route reflector plane to a network may be as easy as enabling a | |||
logical router partition, new BGP process or just a new configuration | logical router partition, new BGP process or just a new configuration | |||
knob on an existing route reflector and configuring an additional | knob on an existing route reflector and configuring an additional | |||
IBGP session from the current clients if required. There are no code | IBGP session from the current clients if required. There are no code | |||
changes required on the route reflector clients for this mechanism to | changes required on the route reflector clients for this mechanism to | |||
work. It is easy to observe that the installation of one or more | work. It is easy to observe that the installation of one or more | |||
additional route reflector control planes is much cheaper and | additional route reflector control planes is much cheaper and | |||
easier than the need of upgrading 100s of routers in the entire | easier than the need of upgrading 100s of route reflector clients in | |||
network to support different protocol encoding. | the entire network to support different protocol encoding. | |||
Diverse path route reflectors need the new ability to calculate and | Diverse path route reflectors need the new ability to calculate and | |||
propagate the Nth best path instead of the overall best path. An | propagate the Nth best path instead of the overall best path. An | |||
implementation is encouraged to enable this new functionality on a | implementation is encouraged to enable this new functionality on a | |||
per neighbor basis. | per neighbor basis. | |||
While this is an implementation detail, the code to calculate Nth | While this is an implementation detail, the code to calculate Nth | |||
best path is also required by other BGP solutions. For example in | best path is also required by other BGP solutions. For example in | |||
the application of fast connectivity restoration BGP must calculate a | the application of fast connectivity restoration BGP must calculate a | |||
backup path for installation into the RIB and FIB ahead of the actual | backup path for installation into the RIB and FIB ahead of the actual | |||
skipping to change at page 8, line 47 | skipping to change at page 9, line 47 | |||
*** *** | *** *** | |||
ASBR1 ASBR2 | ASBR1 ASBR2 | |||
EBGP | EBGP | |||
Figure2: Co-located 2nd best RR plane | Figure2: Co-located 2nd best RR plane | |||
The following is a list of configuration changes required to enable | The following is a list of configuration changes required to enable | |||
the 2nd best path route reflector plane: | the 2nd best path route reflector plane: | |||
1. Adding RR1' and RR2' either as logical or physical new control | 1. Unless same RR1/RR2 platform is being used adding RR1' and RR2' | |||
plane RRs in the same IGP points as RR1 and RR2 respectively | either as logical or physical new control plane RRs in the same | |||
IGP points as RR1 and RR2 respectively. | ||||
2. Enabling RR1' and RR2' for 2nd plane route reflection | 2. Enabling best-external on ASBRs | |||
3. Enabling best-external on ASBRs | 3. Enabling RR1' and RR2' for 2nd plane route reflection. | |||
Alternatively instructing existing RR1 and RR2 to calculate also | ||||
2nd best path. | ||||
4. Configuring ASBR-RR's IBGP sessions | 4. Unless one of the existing RRs is turned to advertise only | |||
diverse path to its current clients configuring new ASBRs-RR' | ||||
IBGP sessions | ||||
The expected behaviour is that under any BGP condition the ASBR3 and | The expected behaviour is that under any BGP condition the ASBR3 and | |||
P routers will receive both paths P1 and P2 for destination D. The | P routers will receive both paths P1 and P2 for destination D. The | |||
availability of both paths will allow them to implement a number of | availability of both paths will allow them to implement a number of | |||
new services as listed in the applications section below. | new services as listed in the applications section below. | |||
As an alternative to fully meshing all RRs and RRs' an operator who | As an alternative to fully meshing all RRs and RRs' an operator who | |||
has a large number of reflectors deployed today may choose to peer | has a large number of reflectors deployed today may choose to peer | |||
newly introduced RRs' to a hierarchical RR' which would be an IBGP | newly introduced RRs' to a hierarchical RR' which would be an IBGP | |||
interconnect point within the 2nd plane as well as between planes. | interconnect point within the 2nd plane as well as between planes. | |||
One of the deployment models of this scenario can be achieved by | One of the deployment models of this scenario can be achieved by | |||
simple upgrade of the existing route reflectors without the need to | simple upgrade of the existing route reflectors without the need to | |||
deploy any new logical or physical platforms. Such upgrade would | deploy any new logical or physical platforms. Such upgrade would | |||
allow route reflectors to service both upgraded to add-paths peers as | allow route reflectors to service both upgraded to add-paths peers as | |||
well as those peers which can not be immediately upgraded while at | well as those peers which can not be immediately upgraded while at | |||
the same time allowing distribution of more than a single best path. | the same time allowing distribution of more than a single best path. The | |||
obvious protocol benefit of using existing RRs to distribute towards | ||||
their clients best and diverse bgp paths over different IBGP session | ||||
is the automatic assurance that such client would always get | ||||
different paths with their next hop being different. | ||||
The way to accomplish this would be to create a separate IBGP session | The way to accomplish this would be to create a separate IBGP session | |||
for each N-th BGP path. Such session should be preferably terminated | for each N-th BGP path. Such session should be preferably terminated | |||
at a different loopback address of the route reflector. At the BGP | at a different loopback address of the route reflector. At the BGP | |||
OPEN stage of each such session a different bgp_router_id should be | OPEN stage of each such session a different bgp_router_id may be | |||
used. Correspondingly route reflector should also allow its clients | used. Correspondingly route reflector should also allow its clients | |||
to use the same bgp_router_id on each such session. | to use the same bgp_router_id on each such session. | |||
4.2. Randomly located best and backup path RRs | 4.2. Randomly located best and backup path RRs | |||
Now let's consider a deployment case where an operator wishes to | Now let's consider a deployment case where an operator wishes to | |||
enable a 2nd RR' plane using only a single additional router in a | enable a 2nd RR' plane using only a single additional router in a | |||
different network location to his current route reflectors. | different network location to his current route reflectors. This | |||
model would be of particular use in networks where some form of end- | ||||
to-end encapsulation (IP or MPLS) is enabled between provider edge | ||||
routers. | ||||
Note that this model of operation assumes that the present best path | Note that this model of operation assumes that the present best path | |||
route reflectors are only control plane devices. If the route | route reflectors are only control plane devices. If the route | |||
reflector is in the data forwarding path then the implementation must | reflector is in the data forwarding path then the implementation must | |||
be able to clearly separate the Nth best-path selection from the | be able to clearly separate the Nth best-path selection from the | |||
selection of the paths to be used for data forwarding. The basic | selection of the paths to be used for data forwarding. The basic | |||
premise of this mode of deployment assumes that all reflector planes | premise of this mode of deployment assumes that all reflector planes | |||
have the same information to choose from which includes the same set | have the same information to choose from which includes the same set | |||
of BGP paths. It also requires the ability to skip the comparison of | of BGP paths. It also requires the ability to ignore the step of | |||
the IGP metric to reach the bgp next hop during best-path | comparison of the IGP metric to reach the bgp next hop during best- | |||
calculation. | path calculation. | |||
ASBR3 | ASBR3 | |||
*** | *** | |||
* * | * * | |||
+------------* *-----------+ | +------------* *-----------+ | |||
| AS1 * * | | | AS1 * * | | |||
| IBGP *** | | | IBGP *** | | |||
| | | | | | |||
| *** | | | *** | | |||
| * * | | | * * | | |||
skipping to change at page 10, line 35 | skipping to change at page 11, line 44 | |||
+-----* *---------* *----+ | +-----* *---------* *----+ | |||
* * * * | * * * * | |||
*** *** | *** *** | |||
ASBR1 ASBR2 | ASBR1 ASBR2 | |||
EBGP | EBGP | |||
Figure3: Experimental deployment of 2nd best RR | Figure3: Experimental deployment of 2nd best RR | |||
The following is a list of configuration changes required to enable | The following is a list of configuration changes required to enable | |||
the 2nd best path route reflector RR' as a single platform: | the 2nd best path route reflector RR' as a single platform or to | |||
enable one of the existing control plane RRs for diverse-path | ||||
functionality: | ||||
1. Adding RR' logical or physical as new route reflector anywhere in | 1. If needed adding RR' logical or physical as new route reflector | |||
the network | anywhere in the network | |||
2. Enabling RR' for 2nd plane route reflection | 2. Enabling best-external on ASBRs | |||
3. Enabling best-external on ASBRs | 3. Disabling IGP metric check in BGP best path on all route | |||
reflectors. | ||||
4. Fully meshing newly added RRs' with the all other reflectors in | 4. Enabling RR' or any of the existing RR for 2nd plane path | |||
both planes. That condition does not apply if the newly added | calculation | |||
RR'(s) already have peering to all ASBRs/PEs. | ||||
5. Configuring ASBRs-RR' IBGP sessions | 5. If required fully meshing newly added RRs' with the all other | |||
reflectors in both planes. That condition does not apply if the | ||||
newly added RR'(s) already have peering to all ASBRs/PEs. | ||||
6. Disabling IGP metric check in BGP best path on all route | 6. Unless one of the existing RRs is turned to advertise only | |||
reflectors. | diverse path to it's current clients configuring new ASBRs-RR' | |||
IBGP sessions | ||||
In this scenario the operator has the flexibility to instroduce the | In this scenario the operator has the flexibility to introduce the | |||
new additional route reflector on any existing or new hardware in the | new additional route reflector functionality on any existing or new | |||
network. Any of the existing routers that are not already members of | hardware in the network. Any of the existing routers that are not | |||
the best path route reflector plane can be easily configured to serve | already members of the best path route reflector plane can be easily | |||
the 2nd plane either via using a logical / virtual router partition | configured to serve the 2nd plane either via using a logical / | |||
or by local implementation hooks. | virtual router partition or by having their bgp implementation | |||
compliant to this specification. | ||||
Even if the IGP metric is not taken into consideration when comparing | Even if the IGP metric is not taken into consideration when comparing | |||
paths during the bestpath calculation, an implementation still has to | paths during the bestpath calculation, an implementation still has to | |||
consider paths with unreachable nexthops as invalid. It is worth | consider paths with unreachable nexthops as invalid. It is worth | |||
pointing out that some implementations today already allow for | pointing out that some implementations today already allow for | |||
configuration which results in no IGP metric comparison during the | configuration which results in no IGP metric comparison during the | |||
best path calculation. | best path calculation. | |||
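As a minimal sketch of the rule just described (not taken from either draft revision), the following Python ranks candidate paths without looking at IGP metric, yet still discards paths whose next hop is unreachable. The `Path` fields, the reduced tie-break order and the plain string comparison of router IDs are illustrative assumptions; the full decision process in [RFC4271] has more steps. Treating the Nth best path as the Nth entry of that ranking is likewise an assumption made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Path:
    """Hypothetical, heavily reduced view of a BGP path."""
    next_hop: str
    local_pref: int
    as_path_len: int
    med: int
    router_id: str
    next_hop_reachable: bool = True


def rank_paths(paths):
    """Order valid paths best-first; IGP metric is deliberately not a key."""
    # Unreachable next hops make a path invalid even when IGP metric is ignored.
    valid = [p for p in paths if p.next_hop_reachable]
    # Simplified tie-break: higher local-pref, shorter AS path, lower MED,
    # then lower router ID (compared as a string here for brevity).
    return sorted(
        valid,
        key=lambda p: (-p.local_pref, p.as_path_len, p.med, p.router_id),
    )


def nth_best(paths, n):
    """Return the n-th best valid path (1-based), or None if there is none."""
    ranked = rank_paths(paths)
    return ranked[n - 1] if 0 < n <= len(ranked) else None
```

Under these assumptions a reflector serving the 2nd plane would advertise `nth_best(paths, 2)` to its clients, while the primary plane keeps advertising `nth_best(paths, 1)`.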
The additional planes of route reflectors do not need to be fully | The additional planes of route reflectors do not need to be fully | |||
redundant as the primary one does. If we are preparing for a single | redundant as the primary one does. If we are preparing for a single | |||
skipping to change at page 11, line 36 | skipping to change at page 13, line 8 | |||
redundantly by installing not one, but two or more route reflectors | redundantly by installing not one, but two or more route reflectors | |||
serving each additional plane the additional robustness will be | serving each additional plane the additional robustness will be | |||
achieved. | achieved. | |||
As a result of this solution ASBR3 and other ASBRs peering to RR' | As a result of this solution ASBR3 and other ASBRs peering to RR' | |||
will be receiving the 2nd best path. | will be receiving the 2nd best path. | |||
Similarly to section 4.1 as an alternative to fully meshing all RRs & | Similarly to section 4.1 as an alternative to fully meshing all RRs & | |||
RRs' an operator who may have a large number of reflectors already | RRs' an operator who may have a large number of reflectors already | |||
deployed today may choose to peer newly introduced RRs' to a | deployed today may choose to peer newly introduced RRs' to a | |||
hierarchical RR' which would be an IBGP interconnect point within the | hierarchical RR' which would be an IBGP interconnect point between | |||
2nd plane as well as between planes. | planes. | |||
4.3. Multi plane route servers for Internet Exchanges | 4.3. Multi plane route servers for Internet Exchanges | |||
Another group of devices where the proposed multi-plane architecture | Another group of devices where the proposed multi-plane architecture | |||
may be of particular applicability are EBGP route servers used at the | may be of particular applicability are EBGP route servers used at | |||
majority of internet exchange points. | many of internet exchange points. | |||
In such cases 100s of ISPs are interconnected on a common LAN. | In such cases 100s of ISPs are interconnected on a common LAN. | |||
Instead of having 100s of direct EBGP sessions on each exchange | Instead of having 100s of direct EBGP sessions on each exchange | |||
client, a single peering is created to the transparent route server. | client, a single peering is created to the transparent route server. | |||
The route server can only propagate a single best path. Mandating | The route server can only propagate a single best path. Mandating | |||
the upgrade for 100s of different service providers in order to | the upgrade for 100s of different service providers in order to | |||
implement add-path may be much more difficult as compared to asking | implement add-path may be much more difficult as compared to asking | |||
them for provisioning one new EBGP session to an Nth best-path route | them for provisioning one new EBGP session to an Nth best-path route | |||
server plane. | server plane. That will allow to distribute more then single best | |||
BGP path from a given route server to such IX peer. | ||||
The solution proposed in this document fits very well with the | The solution proposed in this document fits very well with the | |||
requirement of having broader EBGP path diversity among the members | requirement of having broader EBGP path diversity among the members | |||
of any Internet Exchange Point. | of any Internet Exchange Point. | |||
5. Discussion on current models of IBGP route distribution | 5. Discussion on current models of IBGP route distribution | |||
In today's networks BGP4 operates as specified in [RFC4271] | In today's networks BGP4 operates as specified in [RFC4271] | |||
There are a number of technology choices for intra-AS BGP route | There are a number of technology choices for intra-AS BGP route | |||
skipping to change at page 14, line 33 | skipping to change at page 16, line 8 | |||
6. Deployment considerations | 6. Deployment considerations | |||
The diverse BGP path dissemination proposal allows the distribution | The diverse BGP path dissemination proposal allows the distribution | |||
of more paths than just the best-path to route reflector or route | of more paths than just the best-path to route reflector or route | |||
server clients of today's BGP4 implementations. | server clients of today's BGP4 implementations. | |||
From the client's point of view receiving additional paths via | From the client's point of view receiving additional paths via | |||
separate IBGP sessions terminated at the new route reflector plane | separate IBGP sessions terminated at the new route reflector plane | |||
is functionally equivalent to constructing a full mesh peering | is functionally equivalent to constructing a full mesh peering | |||
without the problems that such a full mesh would come with (discussed | without the problems that such a full mesh would come with set of | |||
in section 2.1). | problems as discussed in earlier section. | |||
By precisely defining the number of reflector planes, network | By precisely defining the number of reflector planes, network | |||
operators have full control over the number of redundant paths in the | operators have full control over the number of redundant paths in the | |||
network. This number can be defined to address the needs of the | network. This number can be defined to address the needs of the | |||
service(s) being deployed. | service(s) being deployed. | |||
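The relationship between the number of planes and the number of paths a client learns can be illustrated with a short Python sketch. It assumes, for illustration only, that the candidate paths are already ordered best-first and that plane k advertises exactly the k-th entry; the function name and the string path labels are hypothetical.

```python
def paths_received_by_client(ranked_paths, num_planes):
    """Return the paths a client learns when it peers with every plane.

    ranked_paths is assumed to be ordered best-first. Plane k (1-based)
    advertises only the k-th entry; a plane with no k-th path stays silent.
    """
    return [
        ranked_paths[k - 1]
        for k in range(1, num_planes + 1)
        if k <= len(ranked_paths)
    ]
```

With three candidate paths and two planes the client ends up with the best and 2nd best path. Adding planes beyond the number of available paths yields nothing further, which is why the plane count can be sized to the redundancy the service needs.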
The Nth plane route reflectors should be acting as control plane | The Nth plane route reflectors should be acting as control plane | |||
devices. While they can be provisioned on the current production | network entities. While they can be provisioned on the current | |||
routers selected backup BGP paths should not be used directly in the | production routers selected Nth best BGP paths should not be used | |||
data plane. Use of the calculated Nth path by the RRs can lead to | directly in the data plane with the exception of such paths being BGP | |||
inconsistent best-path selection in the domain. For the purposes of | multipath eligible and such functionality is enabled. On RRs being | |||
local RIB / FIB installation, any router (including the RRs) which is | in the data plane unless multipath is enabled 2nd best path is | |||
in the data path must use the overall global best and Nth best paths. | expected to be a backup path and should be installed as such into | |||
local RIB/FIB. | ||||
The proposed architecture deployed along with the BGP best-external | The proposed architecture deployed along with the BGP best-external | |||
functionality covers all three cases where the classic BGP route | functionality covers all three cases where the classic BGP route | |||
reflection paradigm would fail to distribute alternate exit points | reflection paradigm would fail to distribute alternate exit points | |||
paths. | paths. | |||
1. ASBRs advertising their single best external paths with no local- | 1. ASBRs advertising their single best external paths with no local- | |||
preference or multi-exit-discriminator present. | preference or multi-exit-discriminator present. | |||
2. ASBRs advertising their single best external paths with local- | 2. ASBRs advertising their single best external paths with local- | |||
preference or multi-exit-discriminator present and with BGP best- | preference or multi-exit-discriminator present and with BGP best- | |||
external functionality enabled. | external functionality enabled. | |||
3. ASBRs with multiple external paths. | 3. ASBRs with multiple external paths. | |||
Let's discuss the last (3rd) case in more detail. This describes the | Let's discuss the 3rd above case in more detail. This describes the | |||
scenario of a single ASBR connected to multiple EBGP peers. In | scenario of a single ASBR connected to multiple EBGP peers. In | |||
practice this peering scenario is quite common. It is mostly due to | practice this peering scenario is quite common. It is mostly due to | |||
the geographic location of EBGP peers and the diversity of those | the geographic location of EBGP peers and the diversity of those | |||
peers (for example peering to multiple tier 1 ISPs etc...). It is | peers (for example peering to multiple tier 1 ISPs etc...). It is | |||
not designed for failure recovery scenarios as single failure of the | not designed for failure recovery scenarios as single failure of the | |||
ASBR would simultaneously result in loss of connectivity to all of | ASBR would simultaneously result in loss of connectivity to all of | |||
the peers. In most medium and large geographically distributed | the peers. In most medium and large geographically distributed | |||
networks there is always another ASBR or multiple ASBRs providing | networks there is always another ASBR or multiple ASBRs providing | |||
peering backups, typically in other geographically diverse locations | peering backups, typically in other geographically diverse locations | |||
in the network. | in the network. | |||
skipping to change at page 15, line 40 | skipping to change at page 17, line 15 | |||
common reason for not setting next hop self is traditionally the | common reason for not setting next hop self is traditionally the | |||
associated drawback of losing ability to signal the external | associated drawback of losing ability to signal the external | |||
failures of peering ASBRs or links to those ASBRs by fast IGP | failures of peering ASBRs or links to those ASBRs by fast IGP | |||
flooding. Such potential drawback can be easily avoided by using | flooding. Such potential drawback can be easily avoided by using | |||
different peering address from the address used for next hop mapping | different peering address from the address used for next hop mapping | |||
as well as removing such next hop from IGP at the last possible BGP | as well as removing such next hop from IGP at the last possible BGP | |||
path failure. | path failure. | |||
Herein one may correctly observe that in the case of setting next hop | Herein one may correctly observe that in the case of setting next hop | |||
self on an ASBR, attributes of other external paths such ASBR is | self on an ASBR, attributes of other external paths such ASBR is | |||
peering with may be different from the attributes of it's best | peering with may be different from the attributes of its best | |||
external path. Therefore, not injecting all of those external paths | external path. Therefore, not injecting all of those external paths | |||
with their corresponding attribute can not be compared to equivalent | with their corresponding attribute can not be compared to equivalent | |||
paths for the same prefix coming from different ASBRs. | paths for the same prefix coming from different ASBRs. | |||
While such observation in principle is correct one should put things | While such observation in principle is correct one should put things | |||
in perspective of the overall goal which is to provide data plane | in perspective of the overall goal which is to provide data plane | |||
connectivity upon a single failure with minimal interruption/packet | connectivity upon a single failure with minimal interruption/packet | |||
loss. During such transient conditions, using even potentially | loss. During such transient conditions, using even potentially | |||
suboptimal exit points is reasonable, so long as forwarding | suboptimal exit points is reasonable, so long as forwarding | |||
information loops are not introduced. In the mean time BGP control | information loops are not introduced. In the mean time BGP control | |||
plane will on it's own re-advertise newly elected best external path, | plane will on its own re-advertise newly elected best external path, | |||
route reflector planes will calculate their Nth best paths and | route reflector planes will calculate their Nth best paths and | |||
propagate to it's clients. The result is that after seconds even if | propagate to its clients. The result is that after seconds even if | |||
potential sub-optimality were encountered it will be quickly and | potential sub-optimality were encountered it will be quickly and | |||
naturally healed. | naturally healed. | |||
7. Summary of benefits | 7. Summary of benefits | |||
The diverse BGP path dissemination proposal provides the following | The diverse BGP path dissemination proposal provides the following | |||
benefits when compared to the alternatives: | benefits when compared to the alternatives: | |||
1. No modifications to BGP4 protocol. | 1. No modifications to BGP4 protocol. | |||
2. No requirement for upgrades to edge and core routers. Backward | 2. No requirement for upgrades to edge and core routers. Backward | |||
compatible with the existing BGP deployments. | compatible with the existing BGP deployments. | |||
3. Can be easily enabled by introduction of a new route reflector / | 3. Can be easily enabled by introduction of a new route reflector, | |||
route server plane dedicated to the selection and distribution of | route server plane dedicated to the selection and distribution of | |||
Nth best-path. | Nth best-path or just by new configuration of the upgraded | |||
current route reflector(s). | ||||
4. Does not require major modification to BGP implementations in the | 4. Does not require major modification to BGP implementations in the | |||
entire network which will result in an unnecessary increase of | entire network which will result in an unnecessary increase of | |||
memory and CPU consumption due to the shift from today's per | memory and CPU consumption due to the shift from today's per | |||
prefix to a per path advertisement state tracking. | prefix to a per path advertisement state tracking. | |||
5. Can be safely deployed gradually through addition of a single | 5. Can be safely deployed gradually on a RR cluster basis. | |||
logical or physical route reflector with the new functionality | ||||
described in this document. | ||||
6. The proposed solution is equally applicable to any BGP address | 6. The proposed solution is equally applicable to any BGP address | |||
family as described in Multiprotocol Extensions for BGP-4 RFC4760 | family as described in Multiprotocol Extensions for BGP-4 RFC4760 | |||
[RFC4760]. In particular it can be used "as is" without any | [RFC4760]. In particular it can be used "as is" without any | |||
modifications to both IPv4 and IPv6 address families. | modifications to both IPv4 and IPv6 address families. | |||
8. Applications | 8. Applications | |||
This section lists the most common applications which require | This section lists the most common applications which require | |||
presence of redundant BGP paths: | presence of redundant BGP paths: | |||
skipping to change at page 17, line 12 | skipping to change at page 18, line 33 | |||
maintenance requirements as described in | maintenance requirements as described in | |||
[I-D.decraene-bgp-graceful-shutdown-requirements]. | [I-D.decraene-bgp-graceful-shutdown-requirements]. | |||
2. Multi-path load balancing for both IBGP and EBGP. | 2. Multi-path load balancing for both IBGP and EBGP. | |||
3. BGP control plane churn reduction both intra-domain and inter- | 3. BGP control plane churn reduction both intra-domain and inter- | |||
domain. | domain. | |||
An important point to observe is that all of the above intra-domain | An important point to observe is that all of the above intra-domain | |||
applications based on the use of reflector planes but are also | applications based on the use of reflector planes but are also | |||
applicable in the inter-domain Internet exchange case. As discussed | applicable in the inter-domain Internet exchange point examples. As | |||
in section 4.3 an internet exchange can deploy shadow route server | discussed in section 4.3 an internet exchange can conceptually deploy | |||
slices each responsible for distribution of an Nth best path to it's | shadow route server planes each responsible for distribution of an | |||
EBGP peers. | Nth best path to its EBGP peers. In practice it may just equal to | |||
new short configuration and establishment of new BGP sessions to IX | ||||
peers. | ||||
9. Security considerations | 9. Security considerations | |||
The new mechanism for diverse BGP path dissemination proposed in this | The new mechanism for diverse BGP path dissemination proposed in this | |||
document does not introduce any new security concerns as compared to | document does not introduce any new security concerns as compared to | |||
base BGP4 specification [RFC4271]. | base BGP4 specification [RFC4271]. | |||
10. IANA Considerations | 10. IANA Considerations | |||
The new mechanism for diverse BGP path dissemination does not require | The new mechanism for diverse BGP path dissemination does not require | |||
skipping to change at page 18, line 20 | skipping to change at page 19, line 33 | |||
Isidor Kouvelas | Isidor Kouvelas | |||
Cisco Systems | Cisco Systems | |||
170 West Tasman Drive | 170 West Tasman Drive | |||
San Jose, CA 95134 | San Jose, CA 95134 | |||
US | US | |||
Email: kouvelas@cisco.com | Email: kouvelas@cisco.com | |||
12. Acknowledgments | 12. Acknowledgments | |||
The authors would like to thank Bruno Decraene, Bart Peirens and Eric | The authors would like to thank Bruno Decraene, Bart Peirens, Eric | |||
Rosen for their valuable input. | Rosen, Jim Uttaro, Renwei Li and George Wes for their valuable input. | |||
The authors would also like to express special thank you to number of | ||||
operators who helped to optimize the provided solution to be as close | ||||
as possible to their daily operational practices. Especially many | ||||
thx goes to Ted Seely, Shan Amante, Benson Schliesser and Seiichi | ||||
Kawamura. | ||||
13. References | 13. References | |||
13.1. Normative References | 13.1. Normative References | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | |||
Requirement Levels", BCP 14, RFC 2119, March 1997. | Requirement Levels", BCP 14, RFC 2119, March 1997. | |||
[RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway | [RFC4271] Rekhter, Y., Li, T., and S. Hares, "A Border Gateway | |||
Protocol 4 (BGP-4)", RFC 4271, January 2006. | Protocol 4 (BGP-4)", RFC 4271, January 2006. | |||
skipping to change at page 20, line 21 | skipping to change at page 21, line 39 | |||
Keyur Patel | Keyur Patel | |||
Cisco Systems | Cisco Systems | |||
170 West Tasman Drive | 170 West Tasman Drive | |||
San Jose, CA 95134 | San Jose, CA 95134 | |||
US | US | |||
Email: keyupate@cisco.com | Email: keyupate@cisco.com | |||
Danny McPherson | Danny McPherson | |||
Arbor Networks | Verisign | |||
21345 Ridgetop Circle | ||||
Email: danny@arbor.net | Dulles, VA 20166 | |||
US | ||||
Email: dmcpherson@verisign.com | ||||
Kenji Kumaki | Kenji Kumaki | |||
KDDI Corporation | KDDI Corporation | |||
Garden Air Tower | Garden Air Tower | |||
Iidabashi, Chiyoda-ku, Tokyo 102-8460 | Iidabashi, Chiyoda-ku, Tokyo 102-8460 | |||
Japan | Japan | |||
Email: ke-kumaki@kddi.com | Email: ke-kumaki@kddi.com | |||
End of changes. 50 change blocks. | ||||
116 lines changed or deleted | 153 lines changed or added | |||
This html diff was produced by rfcdiff 1.38. The latest version is available from http://tools.ietf.org/tools/rfcdiff/ |