draft-ietf-grow-bgp-gshut-08.txt | draft-ietf-grow-bgp-gshut-09.txt | |||
---|---|---|---|---|
Network Working Group Pierre Francois | Network Working Group P. Francois | |||
Internet-Draft Individual Contributor | Internet-Draft Individual Contributor | |||
Intended status: Informational B. Decraene | Intended status: Informational B. Decraene | |||
Expires: December 27, 2017 Orange | Expires: January 4, 2018 Orange | |||
C. Pelsser | C. Pelsser | |||
Strasbourg University | Strasbourg University | |||
K. Patel | K. Patel | |||
Arrcus, Inc. | Arrcus, Inc. | |||
C. Filsfils | C. Filsfils | |||
Cisco Systems | Cisco Systems | |||
June 25, 2017 | July 3, 2017 | |||
Graceful BGP session shutdown | Graceful BGP session shutdown | |||
draft-ietf-grow-bgp-gshut-08 | draft-ietf-grow-bgp-gshut-09 | |||
Abstract | Abstract | |||
This draft describes operational procedures aimed at reducing the | This draft describes operational procedures aimed at reducing the | |||
amount of traffic lost during planned maintenance of routers or | amount of traffic lost during planned maintenance of routers or | |||
links, involving the shutdown of BGP peering sessions. It defines a | links, involving the shutdown of BGP peering sessions. It defines a | |||
well-known BGP community, called g-shut, to signal the graceful | well-known BGP community, called GRACEFUL_SHUTDOWN, to signal the | |||
shutdown of paths to other Autonomous Systems. | graceful shutdown of paths. | |||
Status of This Memo | Status of This Memo | |||
This Internet-Draft is submitted in full conformance with the | This Internet-Draft is submitted in full conformance with the | |||
provisions of BCP 78 and BCP 79. | provisions of BCP 78 and BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at http://datatracker.ietf.org/drafts/current/. | Drafts is at http://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on December 27, 2017. | This Internet-Draft will expire on January 4, 2018. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2017 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(http://trustee.ietf.org/license-info) in effect on the date of | (http://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
carefully, as they describe your rights and restrictions with respect | carefully, as they describe your rights and restrictions with respect | |||
to this document. Code Components extracted from this document must | to this document. Code Components extracted from this document must | |||
include Simplified BSD License text as described in Section 4.e of | include Simplified BSD License text as described in Section 4.e of | |||
the Trust Legal Provisions and are provided without warranty as | the Trust Legal Provisions and are provided without warranty as | |||
described in the Simplified BSD License. | described in the Simplified BSD License. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . 2 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
3. Packet loss upon manual EBGP session shutdown . . . . . . . . 3 | 3. Packet loss upon manual EBGP session shutdown . . . . . . . . 4 | |||
4. Practices to avoid packet losses . . . . . . . . . . . . . . 4 | 4. Practices to avoid packet losses . . . . . . . . . . . . . . 4 | |||
4.1. Improving availability of alternate paths . . . . . . . . 4 | 4.1. Improving availability of alternate paths . . . . . . . . 4 | |||
4.2. Make before break convergence: g-shut . . . . . . . . . . 4 | 4.2. Make before break convergence: graceful shutdown . . . . 5 | |||
5. Forwarding modes and transient forwarding loops during | 4.3. Forwarding modes and transient forwarding loops during | |||
convergence . . . . . . . . . . . . . . . . . . . . . . . . . 7 | convergence . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
6. Link Up cases . . . . . . . . . . . . . . . . . . . . . . . . 7 | 5. EBGP graceful shutdown procedure . . . . . . . . . . . . . . 5 | |||
6.1. Unreachability local to the ASBR . . . . . . . . . . . . 7 | 5.1. Pre-configuration . . . . . . . . . . . . . . . . . . . . 5 | |||
6.2. iBGP convergence . . . . . . . . . . . . . . . . . . . . 7 | 5.2. Operations at maintenance time . . . . . . . . . . . . . 6 | |||
5.3. BGP implementation support for g-Shut . . . . . . . . . . 6 | ||||
6. Beyond EBGP graceful shutdown . . . . . . . . . . . . . . . . 7 | ||||
6.1. IBGP graceful shutdown . . . . . . . . . . . . . . . . . 7 | ||||
6.2. Link Up cases . . . . . . . . . . . . . . . . . . . . . . 7 | ||||
7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | 7. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 8 | |||
8. Security Considerations . . . . . . . . . . . . . . . . . . . 8 | 8. Security Considerations . . . . . . . . . . . . . . . . . . . 9 | |||
9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9 | 9. Acknowledgments . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . 9 | |||
10.1. Normative References . . . . . . . . . . . . . . . . . . 9 | 10.1. Normative References . . . . . . . . . . . . . . . . . . 9 | |||
10.2. Informative References . . . . . . . . . . . . . . . . . 9 | 10.2. Informative References . . . . . . . . . . . . . . . . . 9 | |||
Appendix A. Alternative techniques with limited applicability . 9 | Appendix A. Alternative techniques with limited applicability . 10 | |||
A.1. Multi Exit Discriminator tweaking . . . . . . . . . . . . 10 | A.1. Multi Exit Discriminator tweaking . . . . . . . . . . . . 10 | |||
A.2. IGP distance Poisoning . . . . . . . . . . . . . . . . . 10 | A.2. IGP distance Poisoning . . . . . . . . . . . . . . . . . 10 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 10 | Appendix B. Configuration Examples . . . . . . . . . . . . . . . 10 | |||
B.1. Cisco IOS XR . . . . . . . . . . . . . . . . . . . . . . 11 | ||||
B.2. BIRD . . . . . . . . . . . . . . . . . . . . . . . . . . 11 | ||||
B.3. OpenBGPD . . . . . . . . . . . . . . . . . . . . . . . . 12 | ||||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . 12 | ||||
1. Introduction | 1. Introduction | |||
Routing changes in BGP can be caused by planned maintenance | Routing changes in BGP can be caused by planned maintenance | |||
operations. This document discusses operational procedures to be | operations. This document discusses operational procedures to be | |||
applied in order to reduce or eliminate losses of packets during the | applied in order to reduce or eliminate losses of packets during the | |||
maintenance. These losses come from the transient lack of | maintenance. These losses come from the transient lack of | |||
reachability during the BGP convergence following the shutdown of an | reachability during the BGP convergence following the shutdown of an | |||
EBGP peering session between two Autonomous System Border Routers | EBGP peering session between two Autonomous System Border Routers | |||
(ASBR). | (ASBR). | |||
skipping to change at page 3, line 15 | skipping to change at page 3, line 24 | |||
trigger, in both involved ASes, rerouting to the alternate path, | trigger, in both involved ASes, rerouting to the alternate path, | |||
while allowing routers to keep using old paths until alternate ones | while allowing routers to keep using old paths until alternate ones | |||
are learned, installed in the RIB and in the FIB. This ensures that | are learned, installed in the RIB and in the FIB. This ensures that | |||
routers always have a valid route available during the convergence | routers always have a valid route available during the convergence | |||
process. | process. | |||
The goal of the document is to meet the requirements described in | The goal of the document is to meet the requirements described in | |||
[RFC6198] as closely as possible, without changing the BGP protocol. | [RFC6198] as closely as possible, without changing the BGP protocol. | |||
This document defines a well-known community [RFC1997], called | This document defines a well-known community [RFC1997], called | |||
g-shut, for the purpose of reducing the management overhead of | GRACEFUL_SHUTDOWN, for the purpose of reducing the management | |||
gracefully shutting down BGP sessions. The well-known community | overhead of gracefully shutting down BGP sessions. The well-known | |||
allows implementers to provide an automated graceful shutdown | community allows implementers to provide an automated graceful | |||
mechanism that does not require any router reconfiguration at | shutdown mechanism that does not require any router reconfiguration | |||
maintenance time. | at maintenance time. | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 2119 [RFC2119]. | |||
2. Terminology | 2. Terminology | |||
g-shut initiator: a router on which the session shutdown is performed | graceful shutdown initiator: a router on which the session shutdown | |||
for the maintenance. | is performed for the maintenance. | |||
g-shut neighbor: a router that has a BGP session, to be shut down, | graceful shutdown receiver: a router that has a BGP session, to be | |||
with the g-shut initiator. | shut down, with the graceful shutdown initiator. | |||
Initiator AS: the Autonomous System of the g-shut initiator. | Initiator AS: the Autonomous System of the graceful shutdown | |||
initiator. | ||||
Neighbor AS: the Autonomous System of the g-shut neighbor. | Receiver AS: the Autonomous System of the graceful shutdown receiver. | |||
Loss of Connectivity (LoC): the state when a router has no path | Loss of Connectivity (LoC): the state when a router has no path toward | |||
towards an affected prefix. | an affected prefix. | |||
3. Packet loss upon manual EBGP session shutdown | 3. Packet loss upon manual EBGP session shutdown | |||
Packets can be lost during a manual shutdown of an EBGP session for | Packets can be lost during a manual shutdown of an EBGP session for | |||
two reasons. | two reasons. | |||
First, routers involved in the convergence process can transiently | First, routers involved in the convergence process can transiently | |||
lack paths towards an affected prefix, and drop traffic destined | lack paths toward an affected prefix, and drop traffic destined to | |||
to this prefix. This is because alternate paths can be hidden by | this prefix. This is because alternate paths can be hidden by nodes | |||
nodes of an AS. This happens when the paths are not selected as best | of an AS. This happens when the paths are not selected as best by | |||
by the ASBR that receives them on an EBGP session, or by Route | the ASBR that receives them on an EBGP session, or by Route Reflectors | |||
Reflectors that do not propagate them further in the iBGP topology | that do not propagate them further in the IBGP topology because they | |||
because they do not select them as best. | do not select them as best. | |||
Second, within the AS, the FIB of routers can be transiently | Second, within the AS, the FIB of routers can be transiently | |||
inconsistent during the BGP convergence and packets towards affected | inconsistent during the BGP convergence and packets toward affected | |||
prefixes can loop and be dropped. Note that these loops only happen | prefixes can loop and be dropped. Note that these loops only happen | |||
when ASBR-to-ASBR encapsulation is not used within the AS. | when ASBR-to-ASBR encapsulation is not used within the AS. | |||
This document only addresses the first reason. | This document only addresses the first reason. | |||
4. Practices to avoid packet losses | 4. Practices to avoid packet losses | |||
This section describes means for an ISP to reduce the transient loss | This section describes means for an ISP to reduce the transient loss | |||
of packets upon a manual shutdown of a BGP session. | of packets upon a manual shutdown of a BGP session. | |||
skipping to change at page 4, line 28 | skipping to change at page 4, line 41 | |||
All solutions that increase the availability of alternate BGP paths | All solutions that increase the availability of alternate BGP paths | |||
at routers performing packet lookups in BGP tables such as | at routers performing packet lookups in BGP tables such as | |||
[I-D.ietf-idr-best-external] and [RFC7911] help in reducing the LoC | [I-D.ietf-idr-best-external] and [RFC7911] help in reducing the LoC | |||
associated with the manual shutdown of EBGP sessions. | associated with the manual shutdown of EBGP sessions. | |||
One such solution, increasing diversity in such a way that, at any | One such solution, increasing diversity in such a way that, at any | |||
single step of the convergence process following the EBGP session | single step of the convergence process following the EBGP session | |||
shutdown, a BGP router does not receive a message withdrawing the | shutdown, a BGP router does not receive a message withdrawing the | |||
only path it currently knows for a given NLRI, allows for a | only path it currently knows for a given NLRI, allows for a | |||
simplified g-shut procedure. | simplified graceful shutdown procedure. | |||
Note that the LoC for the inbound traffic of the maintained router, | Note that the LoC for the inbound traffic of the maintained router, | |||
induced by a lack of alternate path propagation within the iBGP | induced by a lack of alternate path propagation within the IBGP | |||
topology of a neighboring AS is not under the control of the operator | topology of a receiver AS is not under the control of the operator | |||
performing the maintenance. The part of the procedure aimed at | performing the maintenance. The part of the procedure aimed at | |||
avoiding LoC for incoming paths can thus be applied even if no LoC | avoiding LoC for incoming paths can thus be applied even if no LoC | |||
is expected for the outgoing paths. | is expected for the outgoing paths. | |||
4.2. Make before break convergence: g-shut | 4.2. Make before break convergence: graceful shutdown | |||
This section describes configurations and actions to be performed for | The goal of this procedure is to retain the paths to be shut down | |||
between the peers, but with a lower LOCAL_PREF value, allowing the | ||||
paths to remain in use while alternate paths are selected and | ||||
propagated, rather than simply withdrawing the paths. | ||||
Section 5 describes configurations and actions to be performed for | ||||
the graceful shutdown of BGP sessions. | the graceful shutdown of BGP sessions. | |||
The goal of this procedure is to let, in both ASes, the paths being | 4.3. Forwarding modes and transient forwarding loops during convergence | |||
shutdown visible, but with a lower LOCAL_PREF value, while alternate | ||||
paths spread through the iBGP topology. Instead of withdrawing the | ||||
path, routers of an AS will keep on using it until they become aware | ||||
of alternate paths. | ||||
4.2.1. EBGP g-shut | The graceful shutdown procedure or the solutions improving the | |||
availability of alternate paths do not change the fact that BGP | ||||
convergence and the subsequent FIB updates are run independently on | ||||
each router of the ASes. If the AS applying the solution does not | ||||
rely on encapsulation to forward packets from the Ingress Border | ||||
Router to the Egress Border Router, then transient forwarding loops | ||||
and consequent packet losses can occur during the convergence | ||||
process. If zero LoC is required, encapsulation is required between | ||||
ASBRs of the AS. | ||||
5. EBGP graceful shutdown procedure | ||||
This section describes configurations and actions to be performed for | This section describes configurations and actions to be performed for | |||
the graceful shutdown of EBGP peering links. | the graceful shutdown of EBGP peering links. | |||
4.2.1.1. Pre-configuration | 5.1. Pre-configuration | |||
On each ASBR supporting the g-shut procedure, an outbound BGP route | ||||
policy is applied on all iBGP sessions of the ASBR, that: | ||||
o matches the g-shut community | ||||
o sets the LOCAL_PREF attribute of the paths tagged with the g-shut | On each ASBR supporting the graceful shutdown receiver procedure, an | |||
community to a low value | inbound BGP route policy is applied on all EBGP sessions of the ASBR, | |||
that: | ||||
o removes the g-shut community from the paths. | o matches the GRACEFUL_SHUTDOWN community | |||
o optionally, adds an AS specific g-shut community on these paths to | o sets the LOCAL_PREF attribute of the paths tagged with the | |||
indicate that these are to be withdrawn soon. If some ingress | GRACEFUL_SHUTDOWN community to a low value | |||
ASBRs reset the LOCAL_PREF attribute, this AS specific g-shut | ||||
community will be used to override other LOCAL_PREF preference | ||||
changes. | ||||
Note that in the case where an AS is aggregating multiple routes | Note that in the case where an AS is aggregating multiple routes | |||
under a covering prefix, it is recommended to filter out the g-shut | under a covering prefix, it is recommended to filter out the | |||
community from the resulting aggregate BGP route. By doing so, the | GRACEFUL_SHUTDOWN community from the resulting aggregate BGP route. | |||
setting of the g-shut community on one of the aggregated routes will | By doing so, the setting of the GRACEFUL_SHUTDOWN community on one of | |||
not let the entire aggregate inherit the community. Not doing so | the aggregated routes will not let the entire aggregate inherit the | |||
would let the entire aggregate undergo the g-shut behavior. | community. Not doing so would let the entire aggregate undergo the | |||
graceful shutdown behavior. | ||||
4.2.1.2. Operations at maintenance time | 5.2. Operations at maintenance time | |||
On the g-shut initiator, upon maintenance time, it is required to: | On the graceful shutdown initiator, upon maintenance time, it is | |||
required to: | ||||
o apply an outbound BGP route policy on the EBGP session to be | o apply an outbound BGP route policy on the EBGP session to be | |||
shut down. This policy tags the paths propagated over the session | shut down. This policy tags the paths propagated over the session | |||
with the g-shut community. This will trigger the BGP | with the GRACEFUL_SHUTDOWN community. This will trigger the BGP | |||
implementation to re-advertise all active routes previously | implementation to re-advertise all active routes previously | |||
advertised, and tag them with the g-shut community. | advertised, and tag them with the GRACEFUL_SHUTDOWN community. | |||
o apply an inbound BGP route policy on the maintained EBGP session | o apply an inbound BGP route policy on the maintained EBGP session | |||
to tag the paths received over the session with the g-shut | to tag the paths received over the session with the | |||
community. | GRACEFUL_SHUTDOWN community. | |||
o wait for convergence to happen. | o wait for convergence to happen. | |||
o shut down the EBGP session, optionally using | o shut down the EBGP session, optionally using | |||
[I-D.ietf-idr-shutdown] to communicate the reason for the shutdown. | [I-D.ietf-idr-shutdown] to communicate the reason for the shutdown. | |||
4.2.1.3. BGP implementation support for g-Shut | In the case of a shutdown of the whole router, in addition to the | |||
graceful shutdown of all EBGP sessions, there is a need to gracefully | ||||
shut down the routes originated by this router (e.g., BGP aggregates | ||||
redistributed from other protocols, including static routes). This | ||||
can be performed by tagging such routes with the GRACEFUL_SHUTDOWN | ||||
community. | ||||
5.3. BGP implementation support for g-Shut | ||||
A BGP router implementation MAY provide features aimed at automating | A BGP router implementation MAY provide features aimed at automating | |||
the application of the graceful shutdown procedures described above. | the application of the graceful shutdown procedures described above. | |||
Upon a session shutdown specified as graceful by the operator, a BGP | Upon a session shutdown specified as graceful by the operator, a BGP | |||
implementation supporting a g-shut feature SHOULD: | implementation supporting a graceful shutdown feature SHOULD: | |||
1. On the EBGP side, update all the paths propagated over the | 1. Update all the paths propagated over the corresponding EBGP | |||
corresponding EBGP session, tagging the g-shut community to them. | session, tagging the GRACEFUL_SHUTDOWN community to them. Any | |||
Any subsequent update sent over the session being gracefully shut | subsequent update sent over the session being gracefully shut | |||
down would be tagged with the g-shut community. | down would be tagged with the GRACEFUL_SHUTDOWN community. | |||
2. On the iBGP side, lower the LOCAL_PREF value of the paths | 2. Lower the LOCAL_PREF value of the paths received over the EBGP | |||
received over the EBGP session being shut down, upon their | session being shut down. | |||
propagation over iBGP sessions. Optionally, also tag these paths | ||||
with an AS specific g-shut community. | ||||
3. Optionally shut down the session after a configured time. | 3. Optionally shut down the session after a configured time. | |||
4. Prevent the g-shut community from being inherited by a path that | 4. Prevent the GRACEFUL_SHUTDOWN community from being inherited by a | |||
would aggregate some paths tagged with the g-shut community. This | path that would aggregate some paths tagged with the | |||
behavior prevents the g-shut procedure from being applied to the | GRACEFUL_SHUTDOWN community. This behavior prevents the graceful | |||
aggregate upon the graceful shutdown of one of its covered | shutdown procedure from being applied to the aggregate upon the | |||
prefixes. | graceful shutdown of one of its covered prefixes. | |||
A BGP implementation supporting a g-shut feature SHOULD also | ||||
automatically install the BGP policies that are supposed to be | ||||
configured, as described in Section 4.2.1.1 for sessions over which | ||||
g-shut is to be supported. | ||||
4.2.2. iBGP g-shut | ||||
For the shutdown of an iBGP session, provided the iBGP topology is | ||||
viable after the maintenance of the session, i.e., if all BGP speakers | ||||
of the AS have an iBGP signaling path for all prefixes advertised on | ||||
this g-shut iBGP session, then the shutdown of an iBGP session does | ||||
not lead to transient unreachability. As a consequence, no specific | ||||
g-shut action is required. | ||||
4.2.3. Router g-shut | A BGP implementation supporting a graceful shutdown feature SHOULD | |||
also automatically install the BGP policies that are supposed to be | ||||
configured, as described in Section 5.1 for sessions over which | ||||
graceful shutdown is to be supported. | ||||
In the case of a shutdown of the whole router, in addition to the | 6. Beyond EBGP graceful shutdown | |||
g-shut of all EBGP sessions, there is a need to g-shut the routes | ||||
originated by this router (e.g., BGP aggregates redistributed from | ||||
other protocols, including static routes). This can be performed by | ||||
tagging such routes with the g-shut community. | ||||
5. Forwarding modes and transient forwarding loops during convergence | 6.1. IBGP graceful shutdown | |||
The g-shut procedure or the solutions improving the availability of | For the shutdown of an IBGP session, provided the IBGP topology is | |||
alternate paths do not change the fact that BGP convergence and the | viable after the maintenance of the session, i.e., if all BGP speakers | |||
subsequent FIB updates are run independently on each router of the | of the AS have an IBGP signaling path for all prefixes advertised on | |||
ASes. If the AS applying the solution does not rely on encapsulation | this graceful shutdown IBGP session, then the shutdown of an IBGP | |||
to forward packets from the Ingress Border Router to the Egress | session does not lead to transient unreachability. As a consequence, | |||
Border Router, then transient forwarding loops and consequent packet | no specific graceful shutdown action is required. | |||
losses can occur during the convergence process. If zero LoC is | ||||
required, encapsulation is required between ASBRs of the AS. | ||||
6. Link Up cases | 6.2. Link Up cases | |||
We identify two potential causes for transient packet losses upon an | We identify two potential causes for transient packet losses upon an | |||
EBGP link up event. The first one is local to the g-no-shut | EBGP link up event. The first one is local to the graceful no-shut | |||
initiator, the second one is due to the BGP convergence following the | initiator, the second one is due to the BGP convergence following the | |||
injection of new best paths within the iBGP topology. | injection of new best paths within the IBGP topology. | |||
6.1. Unreachability local to the ASBR | 6.2.1. Unreachability local to the ASBR | |||
An ASBR that selects as best a path received over a newly brought up | An ASBR that selects as best a path received over a newly brought up | |||
EBGP session may transiently drop traffic. This can typically happen | EBGP session may transiently drop traffic. This can typically happen | |||
when the nexthop attribute differs from the IP address of the EBGP | when the NEXT_HOP attribute differs from the IP address of the EBGP | |||
peer, and the receiving ASBR has not yet resolved the MAC address | peer, and the receiving ASBR has not yet resolved the MAC address | |||
associated with the IP address of that "third party" nexthop. | associated with the IP address of that "third party" NEXT_HOP. | |||
A BGP speaker implementation could avoid such losses by ensuring that | A BGP speaker implementation could avoid such losses by ensuring that | |||
"third party" nexthops are resolved before installing paths using | "third party" NEXT_HOPs are resolved before installing paths using | |||
these in the RIB. | these in the RIB. | |||
If the link up event corresponds to an EBGP session that is being | If the link up event corresponds to an EBGP session that is being | |||
manually brought up, over an already up multi-access link, then the | manually brought up, over an already up multi-access link, then the | |||
operator can ping third party nexthops that are expected to be used | operator can ping third party NEXT_HOPs that are expected to be used | |||
before actually bringing the session up, or ping directed broadcast | before actually bringing the session up, or ping directed broadcast | |||
the subnet IP address of the link. By proceeding like this, the MAC | the subnet IP address of the link. By proceeding like this, the MAC | |||
addresses associated with these third party nexthops will be resolved | addresses associated with these third party NEXT_HOPs will be resolved | |||
by the g-no-shut initiator. | by the graceful no-shut initiator. | |||
6.2. iBGP convergence | 6.2.2. IBGP convergence | |||
Corner cases leading to LoC can occur during an EBGP link up event. | Corner cases leading to LoC can occur during an EBGP link up event. | |||
A typical example for such transient unreachability for a given | A typical example for such transient unreachability for a given | |||
prefix is the following: | prefix is the following: | |||
Let's consider 3 route reflectors RR1, RR2, RR3. There is a full | Let's consider 3 route reflectors RR1, RR2, RR3. There is a full | |||
mesh of iBGP sessions between them. | mesh of IBGP sessions between them. | |||
1. RR1 is initially advertising the current best path to the | 1. RR1 is initially advertising the current best path to the | |||
members of its iBGP RR full-mesh. It propagated that path within | members of its IBGP RR full-mesh. It propagated that path within | |||
its RR full-mesh. RR2 knows only that path toward the prefix. | its RR full-mesh. RR2 knows only that path toward the prefix. | |||
2. RR3 receives a new best path originated by the "g-no-shut" | 2. RR3 receives a new best path originated by the "graceful no- | |||
initiator, being one of its RR clients. RR3 selects it as best, | shut" initiator, being one of its RR clients. RR3 selects it as | |||
and propagates an UPDATE within its RR full-mesh, i.e., to RR1 and | best, and propagates an UPDATE within its RR full-mesh, i.e., to | |||
RR2. | RR1 and RR2. | |||
3. RR1 receives that path, reruns its decision process, and picks | 3. RR1 receives that path, reruns its decision process, and picks | |||
this new path as best. As a result, RR1 withdraws its previously | this new path as best. As a result, RR1 withdraws its previously | |||
announced best-path on the iBGP sessions of its RR full-mesh. | announced best-path on the IBGP sessions of its RR full-mesh. | |||
4. If, for any reason, RR3 processes the withdraw generated in | 4. If, for any reason, RR3 processes the withdraw generated in | |||
step 3, before processing the update generated in step 2, RR3 | step 3, before processing the update generated in step 2, RR3 | |||
transiently suffers from unreachability for the affected prefix. | transiently suffers from unreachability for the affected prefix. | |||
The use of [I-D.ietf-idr-best-external] among the RRs of the iBGP | The use of [I-D.ietf-idr-best-external] among the RRs of the IBGP | |||
full-mesh can solve these corner cases by ensuring that within an AS, | full-mesh can solve these corner cases by ensuring that within an AS, | |||
the advertisement of a new route is not translated into the withdrawal | the advertisement of a new route is not translated into the withdrawal | |||
of a former route. | of a former route. | |||
Indeed, "best-external" ensures that an ASBR does not withdraw a | Indeed, "best-external" ensures that an ASBR does not withdraw a | |||
previously advertised (EBGP) path when it receives an additional, | previously advertised (EBGP) path when it receives an additional, | |||
preferred path over an iBGP session. Also, "best-intra-cluster" | preferred path over an IBGP session. Also, "best-intra-cluster" | |||
ensures that an RR does not withdraw a previously advertised (iBGP) | ensures that an RR does not withdraw a previously advertised (IBGP) | |||
path to its non-clients (e.g., other RRs in a mesh of RRs) when it | path to its non-clients (e.g., other RRs in a mesh of RRs) when it | |||
receives a new, preferred path over an iBGP session. | receives a new, preferred path over an IBGP session. | |||
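The ordering problem described in the steps above can be sketched as follows. This is a minimal illustration, assuming a reflector whose only path to the prefix is the one advertised by RR1; the function name and the message tuples are hypothetical and do not come from any BGP implementation.

```python
def reachable_throughout(initial_paths, messages):
    """Replay (kind, peer) messages against the set of paths a reflector knows.

    Returns False if the reflector is ever left with no path after
    processing a message, i.e. it suffers transient unreachability.
    """
    paths = set(initial_paths)
    for kind, peer in messages:
        if kind == "update":
            paths.add(peer)
        elif kind == "withdraw":
            paths.discard(peer)
        else:
            raise ValueError(f"unknown message kind: {kind}")
        if not paths:
            return False
    return True


# Messages from the scenario: the UPDATE for the new path and the
# withdraw that RR1 sends once it prefers that new path.
update_new_path = ("update", "RR3")
withdraw_old_path = ("withdraw", "RR1")

# UPDATE processed first: a path is always available.
in_order = reachable_throughout({"RR1"}, [update_new_path, withdraw_old_path])

# Withdraw processed first: the reflector briefly has no path.
reordered = reachable_throughout({"RR1"}, [withdraw_old_path, update_new_path])

# No withdraw at all, as when the old path stays advertised.
no_withdraw = reachable_throughout({"RR1"}, [update_new_path])
```

When the advertisement of the new path is not translated into a withdraw, the message order no longer matters, which is the property the text above attributes to "best-external" and "best-intra-cluster".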
7. IANA Considerations | 7. IANA Considerations | |||
The IANA has assigned the community value 0xFFFF0000 to the planned- | The IANA has assigned the community value 0xFFFF0000 to the planned- | |||
shut community in the "BGP Well-known Communities" registry. IANA is | shut community in the "BGP Well-known Communities" registry. IANA is | |||
requested to change the name planned-shut to g-shut and set this | requested to change the name planned-shut to GRACEFUL_SHUTDOWN and | |||
document as the reference. | set this document as the reference. | |||
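The registry value 0xFFFF0000 and the 65535:0 notation used in router configurations denote the same community. A small sketch of the conversion, with illustrative function names:

```python
GRACEFUL_SHUTDOWN = 0xFFFF0000  # community value assigned by IANA


def community_to_text(value: int) -> str:
    """Render a 32-bit BGP community as 'high:low' (two 16-bit halves)."""
    if not 0 <= value <= 0xFFFFFFFF:
        raise ValueError("a standard community must fit in 32 bits")
    return f"{value >> 16}:{value & 0xFFFF}"


def text_to_community(text: str) -> int:
    """Parse 'high:low' back into the 32-bit community value."""
    high, low = (int(part) for part in text.split(":"))
    if not (0 <= high <= 0xFFFF and 0 <= low <= 0xFFFF):
        raise ValueError("each half must fit in 16 bits")
    return (high << 16) | low


graceful_shutdown_text = community_to_text(GRACEFUL_SHUTDOWN)
```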
8. Security Considerations | 8. Security Considerations | |||
By providing the g-shut service to a neighboring AS, an ISP provides | By providing the graceful shutdown service to a neighboring AS, an | |||
means to this neighbor and possibly its downstream ASes to lower the | ISP provides means to this neighbor and possibly its downstream ASes | |||
LOCAL_PREF value assigned to the paths received from this neighbor. | to lower the LOCAL_PREF value assigned to the paths received from | |||
this neighbor. | ||||
The neighbor could abuse the technique and do inbound traffic | The neighbor could abuse the technique and do inbound traffic | |||
engineering by declaring some prefixes as undergoing maintenance so | engineering by declaring some prefixes as undergoing maintenance so | |||
as to switch traffic to another peering link. | as to switch traffic to another peering link. | |||
If this behavior is not tolerated by the ISP, it SHOULD monitor the | If this behavior is not tolerated by the ISP, it SHOULD monitor the | |||
use of the g-shut community by this neighbor. | use of the graceful shutdown community by this neighbor. | |||
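One possible form of such monitoring is to track, per neighbor, the share of received routes that carry the community. The sketch below is only an illustration: the input format (tuples of neighbor, prefix, communities) and the function name are assumptions, and deciding what share or duration counts as abuse is left to the operator's policy.

```python
from collections import Counter

GRACEFUL_SHUTDOWN = (65535, 0)  # 0xFFFF0000 written as (high, low)


def tagged_share_by_neighbor(routes):
    """Return {neighbor: fraction of its routes carrying GRACEFUL_SHUTDOWN}.

    routes is an iterable of (neighbor, prefix, communities) tuples,
    a hypothetical export format chosen for this example.
    """
    total = Counter()
    tagged = Counter()
    for neighbor, _prefix, communities in routes:
        total[neighbor] += 1
        if GRACEFUL_SHUTDOWN in communities:
            tagged[neighbor] += 1
    return {neighbor: tagged[neighbor] / total[neighbor] for neighbor in total}


sample_routes = [
    ("AS64497", "2001:db8:a::/48", {(65535, 0)}),
    ("AS64497", "2001:db8:b::/48", set()),
    ("AS64498", "2001:db8:c::/48", {(64498, 100)}),
]
shares = tagged_share_by_neighbor(sample_routes)
```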
9. Acknowledgments | 9. Acknowledgments | |||
The authors wish to thank Olivier Bonaventure, Pradosh Mohapatra and | The authors wish to thank Olivier Bonaventure, Pradosh Mohapatra and | |||
Job Snijders for their useful comments on this work. | Job Snijders for their useful comments on this work. | |||
10. References | 10. References | |||
10.1. Normative References | 10.1. Normative References | |||
skipping to change at page 9, line 49 ¶ | skipping to change at page 10, line 17 ¶ | |||
Administrative Shutdown Communication", draft-ietf-idr- | Administrative Shutdown Communication", draft-ietf-idr- | |||
shutdown-10 (work in progress), June 2017. | shutdown-10 (work in progress), June 2017. | |||
[RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder, | [RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder, | |||
"Advertisement of Multiple Paths in BGP", RFC 7911, | "Advertisement of Multiple Paths in BGP", RFC 7911, | |||
DOI 10.17487/RFC7911, July 2016, | DOI 10.17487/RFC7911, July 2016, | |||
<http://www.rfc-editor.org/info/rfc7911>. | <http://www.rfc-editor.org/info/rfc7911>. | |||
Appendix A. Alternative techniques with limited applicability | Appendix A. Alternative techniques with limited applicability | |||
A few alternative techniques have been considered to provide g-shut | A few alternative techniques have been considered to provide graceful | |||
capabilities but have been rejected due to their limited | shutdown capabilities but have been rejected due to their limited | |||
applicability. This section describes them for possible reference. | applicability. This section describes them for possible reference. | |||
A.1. Multi Exit Discriminator tweaking | A.1. Multi Exit Discriminator tweaking | |||
The MED attribute of the paths to be avoided can be increased so as | The MED attribute of the paths to be avoided can be increased so as | |||
to force the routers in the neighboring AS to select other paths. | to force the routers in the neighboring AS to select other paths. | |||
The solution only works if the alternate paths are as good as the | The solution only works if the alternate paths are as good as the | |||
initial ones with respect to the Local-Pref value and the AS Path | initial ones with respect to the Local-Pref value and the AS Path | |||
Length value. In the other cases, increasing the MED value will not | Length value. In the other cases, increasing the MED value will not | |||
have an impact on the decision process of the routers in the | have an impact on the decision process of the routers in the | |||
neighboring AS. | neighboring AS. | |||
A.2. IGP distance Poisoning | A.2. IGP distance Poisoning | |||
The distance to the BGP nexthop corresponding to the maintained | The distance to the BGP NEXT_HOP corresponding to the maintained | |||
session can be increased in the IGP so that the old paths will be | session can be increased in the IGP so that the old paths will be | |||
less preferred during the application of the IGP distance tie-break | less preferred during the application of the IGP distance tie-break | |||
rule. However, this solution only works for the paths whose | rule. However, this solution only works for the paths whose | |||
alternates are as good as the old paths with respect to their Local- | alternates are as good as the old paths with respect to their Local- | |||
Pref value, their AS Path length, and their MED value. | Pref value, their AS Path length, and their MED value. | |||
Also, this poisoning cannot be applied when nexthop self is used as | Also, this poisoning cannot be applied when nexthop self is used as | |||
there is no nexthop specific to the maintained session to poison in | there is no nexthop specific to the maintained session to poison in | |||
the IGP. | the IGP. | |||
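The limits of both techniques follow from the order of the BGP decision process. The sketch below uses a deliberately simplified comparison (LOCAL_PREF, AS path length, MED, IGP distance only) and assumes all paths come from the same neighboring AS so that MED values are comparable; the class and function names are illustrative.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Path:
    name: str
    local_pref: int
    as_path_len: int
    med: int
    igp_distance: int


def preference_key(path: Path):
    # Higher LOCAL_PREF wins, then shorter AS path, lower MED,
    # and finally lower IGP distance to the next hop.
    return (-path.local_pref, path.as_path_len, path.med, path.igp_distance)


def best_path(paths):
    return min(paths, key=preference_key)


old = Path("old", local_pref=100, as_path_len=2, med=0, igp_distance=10)
equal_alternate = Path("alt", local_pref=100, as_path_len=2, med=0, igp_distance=20)
longer_alternate = Path("alt", local_pref=100, as_path_len=3, med=0, igp_distance=20)

# A.1: raising the MED moves traffic only if earlier attributes tie.
old_high_med = replace(old, med=100)
med_moves_traffic = best_path([old_high_med, equal_alternate]).name == "alt"
med_has_no_effect = best_path([old_high_med, longer_alternate]).name == "old"

# A.2: raising the IGP distance has the same limitation, one step later.
old_poisoned = replace(old, igp_distance=1000)
igp_moves_traffic = best_path([old_poisoned, equal_alternate]).name == "alt"
igp_has_no_effect = best_path([old_poisoned, longer_alternate]).name == "old"
```

Setting LOCAL_PREF instead acts at the first comparison step, so it does not depend on the alternate path tying on the later attributes.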
Appendix B. Configuration Examples | ||||
This appendix is non-normative. | ||||
Example routing policy configurations to honor the GRACEFUL_SHUTDOWN | ||||
well-known BGP community. | ||||
B.1. Cisco IOS XR | ||||
community-set comm-graceful-shutdown | ||||
  65535:0 | ||||
end-set | ||||
! | ||||
route-policy AS64497-ebgp-inbound | ||||
  ! normally this policy would contain much more | ||||
  if community matches-any comm-graceful-shutdown then | ||||
    set local-preference 0 | ||||
  endif | ||||
end-policy | ||||
! | ||||
router bgp 64496 | ||||
 neighbor 2001:db8:1:2::1 | ||||
  remote-as 64497 | ||||
  description a fantastic EBGP neighbor | ||||
  address-family ipv6 unicast | ||||
   send-community-ebgp | ||||
   route-policy AS64497-ebgp-inbound in | ||||
   route-policy AS65040v6-bgp-out out | ||||
  ! | ||||
 ! | ||||
! | ||||
B.2. BIRD | ||||
function honor_graceful_shutdown() { | ||||
    if (65535, 0) ~ bgp_community then { | ||||
        bgp_local_pref = 0; | ||||
    } | ||||
} | ||||
filter AS64497_ebgp_inbound | ||||
{ | ||||
    # normally this policy would contain much more | ||||
    honor_graceful_shutdown(); | ||||
} | ||||
protocol bgp peer_64497_1 { | ||||
    description "a fantastic EBGP neighbor"; | ||||
    neighbor 2001:db8:1:2::1 as 64497; | ||||
    local as 64496; | ||||
    import keep filtered; | ||||
    import filter AS64497_ebgp_inbound; | ||||
    export filter AS64497_ebgp_outbound; | ||||
} | ||||
B.3. OpenBGPD | ||||
AS 64496 | ||||
router-id 192.0.2.1 | ||||
neighbor 2001:db8:1:2::1 { | ||||
    descr "a fantastic EBGP neighbor" | ||||
    remote-as 64497 | ||||
} | ||||
# normally this policy would contain much more | ||||
match from any community GRACEFUL_SHUTDOWN set { localpref 0 } | ||||
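The three configurations above share the same logic: an inbound policy that sets LOCAL_PREF to 0 on any route carrying the GRACEFUL_SHUTDOWN community. A vendor-neutral sketch of that logic follows; the dictionary layout of a route and the function name are assumptions made for this example only.

```python
GRACEFUL_SHUTDOWN = (65535, 0)  # 0xFFFF0000 written as (high, low)


def apply_inbound_policy(route: dict) -> dict:
    """Return a copy of route with LOCAL_PREF 0 if it is tagged.

    Routes without the GRACEFUL_SHUTDOWN community are returned
    with their attributes unchanged.
    """
    updated = dict(route)
    if GRACEFUL_SHUTDOWN in route.get("communities", ()):
        updated["local_pref"] = 0
    return updated


tagged_route = {
    "prefix": "2001:db8:a::/48",
    "local_pref": 100,
    "communities": {(65535, 0), (64497, 10)},
}
untagged_route = {
    "prefix": "2001:db8:b::/48",
    "local_pref": 100,
    "communities": {(64497, 10)},
}

tagged_result = apply_inbound_policy(tagged_route)
untagged_result = apply_inbound_policy(untagged_route)
```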
Authors' Addresses | Authors' Addresses | |||
Pierre Francois | Pierre Francois | |||
Individual Contributor | Individual Contributor | |||
Email: pfrpfr@gmail.com | Email: pfrpfr@gmail.com | |||
Bruno Decraene | Bruno Decraene | |||
Orange | Orange | |||
End of changes. 67 change blocks. | ||||
148 lines changed or deleted | 216 lines changed or added | |||