draft-ietf-grow-bgp-gshut-11.txt | draft-ietf-grow-bgp-gshut-12.txt | |||
---|---|---|---|---|
Network Working Group P. Francois, Ed. | Network Working Group P. Francois, Ed. | |||
Internet-Draft Individual Contributor | Internet-Draft Individual Contributor | |||
Intended status: Informational B. Decraene, Ed. | Intended status: Informational B. Decraene, Ed. | |||
Expires: March 24, 2018 Orange | Expires: April 14, 2018 Orange | |||
C. Pelsser | C. Pelsser | |||
Strasbourg University | Strasbourg University | |||
K. Patel | K. Patel | |||
Arrcus, Inc. | Arrcus, Inc. | |||
C. Filsfils | C. Filsfils | |||
Cisco Systems | Cisco Systems | |||
September 20, 2017 | October 11, 2017 | |||
Graceful BGP session shutdown | Graceful BGP session shutdown | |||
draft-ietf-grow-bgp-gshut-11 | draft-ietf-grow-bgp-gshut-12 | |||
Abstract | Abstract | |||
This draft standardizes a new well-known BGP community | This draft standardizes a new well-known BGP community | |||
GRACEFUL_SHUTDOWN to signal the graceful shutdown of paths. This | GRACEFUL_SHUTDOWN to signal the graceful shutdown of paths. This | |||
draft also describes operational procedures which use this community | draft also describes operational procedures which use this community | |||
to reduce the amount of traffic lost when BGP peering sessions are | to reduce the amount of traffic lost when BGP peering sessions are | |||
about to be shut down deliberately, e.g. for planned maintenance. | about to be shut down deliberately, e.g. for planned maintenance. | |||
Status of This Memo | Status of This Memo | |||
skipping to change at page 1, line 41 ¶ | skipping to change at page 1, line 41 ¶ | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF). Note that other groups may also distribute | Task Force (IETF). Note that other groups may also distribute | |||
working documents as Internet-Drafts. The list of current Internet- | working documents as Internet-Drafts. The list of current Internet- | |||
Drafts is at https://datatracker.ietf.org/drafts/current/. | Drafts is at https://datatracker.ietf.org/drafts/current/. | |||
Internet-Drafts are draft documents valid for a maximum of six months | Internet-Drafts are draft documents valid for a maximum of six months | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
This Internet-Draft will expire on March 24, 2018. | This Internet-Draft will expire on April 14, 2018. | |||
Copyright Notice | Copyright Notice | |||
Copyright (c) 2017 IETF Trust and the persons identified as the | Copyright (c) 2017 IETF Trust and the persons identified as the | |||
document authors. All rights reserved. | document authors. All rights reserved. | |||
This document is subject to BCP 78 and the IETF Trust's Legal | This document is subject to BCP 78 and the IETF Trust's Legal | |||
Provisions Relating to IETF Documents | Provisions Relating to IETF Documents | |||
(https://trustee.ietf.org/license-info) in effect on the date of | (https://trustee.ietf.org/license-info) in effect on the date of | |||
publication of this document. Please review these documents | publication of this document. Please review these documents | |||
skipping to change at page 2, line 50 ¶ | skipping to change at page 2, line 50 ¶ | |||
Routing changes in BGP can be caused by planned maintenance | Routing changes in BGP can be caused by planned maintenance | |||
operations. This document defines a well-known community [RFC1997], | operations. This document defines a well-known community [RFC1997], | |||
called GRACEFUL_SHUTDOWN, for the purpose of reducing the management | called GRACEFUL_SHUTDOWN, for the purpose of reducing the management | |||
overhead of gracefully shutting down BGP sessions. The well-known | overhead of gracefully shutting down BGP sessions. The well-known | |||
community allows implementers to provide an automated graceful | community allows implementers to provide an automated graceful | |||
shutdown mechanism that does not require any router reconfiguration | shutdown mechanism that does not require any router reconfiguration | |||
at maintenance time. | at maintenance time. | |||
This document discusses operational procedures to be applied in order | This document discusses operational procedures to be applied in order | |||
to reduce or eliminate loss of packets during a maintenance. Loss | to reduce or eliminate loss of packets during a maintenance | |||
comes from transient lack of reachability during BGP convergence | operation. Loss comes from transient lack of reachability during BGP | |||
which follows the shutdown of an EBGP peering session between two | convergence which follows the shutdown of an EBGP peering session | |||
Autonomous System Border Routers (ASBR). | between two Autonomous System Border Routers (ASBR). | |||
This document presents procedures for the cases where the forwarding | This document presents procedures for the cases where the forwarding | |||
plane is impacted by the maintenance, hence when the use of Graceful | plane is impacted by the maintenance, hence when the use of Graceful | |||
Restart does not apply. | Restart does not apply. | |||
The procedures described in this document can be applied to reduce or | The procedures described in this document can be applied to reduce or | |||
avoid packet loss for outbound and inbound traffic flows initially | avoid packet loss for outbound and inbound traffic flows initially | |||
forwarded along the peering link to be shut down. These procedures | forwarded along the peering link to be shut down. These procedures | |||
trigger, in both ASes, rerouting to alternate paths if they exist | trigger, in both Autonomous Systems (AS), rerouting to alternate paths | |||
within the AS, while allowing the use of the old path until alternate | if they exist within the AS, while allowing the use of the old path | |||
ones are learned. This ensures that routers always have a valid | until alternate ones are learned. This ensures that routers always | |||
route available during the convergence process. | have a valid route available during the convergence process. | |||
The goal of the document is to meet the requirements described in | The goal of the document is to meet the requirements described in | |||
[RFC6198] at best, without changing the BGP protocol. | [RFC6198] at best, without changing the BGP protocol. | |||
The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", | |||
"SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this | |||
document are to be interpreted as described in RFC 2119 [RFC2119]. | document are to be interpreted as described in RFC 8174 [RFC8174]. | |||
2. Terminology | 2. Terminology | |||
graceful shutdown initiator: a router on which the session shutdown | graceful shutdown initiator: a router on which the session shutdown | |||
is performed for the maintenance. | is performed for the maintenance. | |||
graceful shutdown receiver: a router that has a BGP session, to be | graceful shutdown receiver: a router that has a BGP session, to be | |||
shut down, with the graceful shutdown initiator. | shut down, with the graceful shutdown initiator. | |||
3. Packet loss upon manual EBGP session shutdown | 3. Packet loss upon manual EBGP session shutdown | |||
skipping to change at page 4, line 14 ¶ | skipping to change at page 4, line 14 ¶ | |||
4. EBGP graceful shutdown procedure | 4. EBGP graceful shutdown procedure | |||
This section describes configurations and actions to be performed for | This section describes configurations and actions to be performed for | |||
the graceful shutdown of EBGP peering links. | the graceful shutdown of EBGP peering links. | |||
The goal of this procedure is to retain the paths to be shut down | The goal of this procedure is to retain the paths to be shut down | |||
between the peers, but with a lower LOCAL_PREF value, allowing the | between the peers, but with a lower LOCAL_PREF value, allowing the | |||
paths to remain in use while alternate paths are selected and | paths to remain in use while alternate paths are selected and | |||
propagated, rather than simply withdrawing the paths. The LOCAL_PREF | propagated, rather than simply withdrawing the paths. The LOCAL_PREF | |||
value must be lower than the one of the alternate path. 0 being the | value SHOULD be lower than any of the alternative paths. The | |||
lowest value, it can be used in all cases, except if it already has a | RECOMMENDED value is 0. | |||
special meaning within the AS. | ||||
4.1. Pre-configuration | 4.1. Pre-configuration | |||
On each ASBR supporting the graceful shutdown receiver procedure, an | On each ASBR supporting the graceful shutdown receiver procedure, an | |||
inbound BGP route policy is applied on all EBGP sessions of the ASBR, | inbound BGP route policy is applied on all EBGP sessions of the ASBR, | |||
that: | that: | |||
o matches the GRACEFUL_SHUTDOWN community. | o matches the GRACEFUL_SHUTDOWN community. | |||
o sets the LOCAL_PREF attribute of the paths tagged with the | o sets the LOCAL_PREF attribute of the paths tagged with the | |||
skipping to change at page 4, line 48 ¶ | skipping to change at page 4, line 47 ¶ | |||
advertised, and tag them with the GRACEFUL_SHUTDOWN community. | advertised, and tag them with the GRACEFUL_SHUTDOWN community. | |||
o applies an inbound BGP route policy on the EBGP session to be | o applies an inbound BGP route policy on the EBGP session to be | |||
shut down. This policy tags the paths received over the session | shut down. This policy tags the paths received over the session | |||
with the GRACEFUL_SHUTDOWN community and sets LOCAL_PREF to a low | with the GRACEFUL_SHUTDOWN community and sets LOCAL_PREF to a low | |||
value. | value. | |||
o wait for route readvertisement over the EBGP session, and BGP | o wait for route readvertisement over the EBGP session, and BGP | |||
routing convergence on both ASBRs. | routing convergence on both ASBRs. | |||
o shut down the EBGP session, optionally using | o shut down the EBGP session, optionally using [RFC8203] to | |||
[I-D.ietf-idr-shutdown] to communicate the reason of the shutdown. | communicate the reason of the shutdown. | |||
In the case of a shutdown of the whole router, in addition to the | In the case of a shutdown of the whole router, in addition to the | |||
graceful shutdown of all EBGP sessions, there is a need to gracefully | graceful shutdown of all EBGP sessions, there is a need to gracefully | |||
shut down the routes originated by this router (e.g., BGP aggregates | shut down the routes originated by this router (e.g., BGP aggregates | |||
redistributed from other protocols, including static routes). This | redistributed from other protocols, including static routes). This | |||
can be performed by tagging these routes with the GRACEFUL_SHUTDOWN | can be performed by tagging these routes with the GRACEFUL_SHUTDOWN | |||
community and setting LOCAL_PREF to a low value. | community and setting LOCAL_PREF to a low value. | |||
4.3. BGP implementation support for graceful shutdown | 4.3. BGP implementation support for graceful shutdown | |||
skipping to change at page 5, line 40 ¶ | skipping to change at page 5, line 39 ¶ | |||
The neighbor could abuse the technique and do inbound traffic | The neighbor could abuse the technique and do inbound traffic | |||
engineering by declaring some prefixes as undergoing a maintenance so | engineering by declaring some prefixes as undergoing a maintenance so | |||
as to switch traffic to another peering link. | as to switch traffic to another peering link. | |||
If this behavior is not tolerated by the ISP, it SHOULD monitor the | If this behavior is not tolerated by the ISP, it SHOULD monitor the | |||
use of the graceful shutdown community. | use of the graceful shutdown community. | |||
7. Acknowledgments | 7. Acknowledgments | |||
The authors wish to thank Olivier Bonaventure, Pradosh Mohapatra, Job | The authors wish to thank Olivier Bonaventure, Pradosh Mohapatra, Job | |||
Snijders John Heasley, and Christopher Morrow for their useful | Snijders, John Heasley, and Christopher Morrow for their useful | |||
comments. | comments. | |||
8. References | 8. References | |||
8.1. Normative References | 8.1. Normative References | |||
[RFC1997] Chandra, R., Traina, P., and T. Li, "BGP Communities | [RFC1997] Chandra, R., Traina, P., and T. Li, "BGP Communities | |||
Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996, | Attribute", RFC 1997, DOI 10.17487/RFC1997, August 1996, | |||
<https://www.rfc-editor.org/info/rfc1997>. | <https://www.rfc-editor.org/info/rfc1997>. | |||
[RFC2119] Bradner, S., "Key words for use in RFCs to Indicate | ||||
Requirement Levels", BCP 14, RFC 2119, | ||||
DOI 10.17487/RFC2119, March 1997, | ||||
<https://www.rfc-editor.org/info/rfc2119>. | ||||
[RFC6198] Decraene, B., Francois, P., Pelsser, C., Ahmad, Z., | [RFC6198] Decraene, B., Francois, P., Pelsser, C., Ahmad, Z., | |||
Elizondo Armengol, A., and T. Takeda, "Requirements for | Elizondo Armengol, A., and T. Takeda, "Requirements for | |||
the Graceful Shutdown of BGP Sessions", RFC 6198, | the Graceful Shutdown of BGP Sessions", RFC 6198, | |||
DOI 10.17487/RFC6198, April 2011, | DOI 10.17487/RFC6198, April 2011, | |||
<https://www.rfc-editor.org/info/rfc6198>. | <https://www.rfc-editor.org/info/rfc6198>. | |||
[RFC8174] Leiba, B., "Ambiguity of Uppercase vs Lowercase in RFC | ||||
2119 Key Words", BCP 14, RFC 8174, DOI 10.17487/RFC8174, | ||||
May 2017, <https://www.rfc-editor.org/info/rfc8174>. | ||||
8.2. Informative References | 8.2. Informative References | |||
[I-D.ietf-idr-best-external] | [I-D.ietf-idr-best-external] | |||
Marques, P., Fernando, R., Chen, E., Mohapatra, P., and H. | Marques, P., Fernando, R., Chen, E., Mohapatra, P., and H. | |||
Gredler, "Advertisement of the best external route in | Gredler, "Advertisement of the best external route in | |||
BGP", draft-ietf-idr-best-external-05 (work in progress), | BGP", draft-ietf-idr-best-external-05 (work in progress), | |||
January 2012. | January 2012. | |||
[I-D.ietf-idr-shutdown] | ||||
Snijders, J., Heitz, J., and J. Scudder, "BGP | ||||
Administrative Shutdown Communication", draft-ietf-idr- | ||||
shutdown-10 (work in progress), June 2017. | ||||
[RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder, | [RFC7911] Walton, D., Retana, A., Chen, E., and J. Scudder, | |||
"Advertisement of Multiple Paths in BGP", RFC 7911, | "Advertisement of Multiple Paths in BGP", RFC 7911, | |||
DOI 10.17487/RFC7911, July 2016, | DOI 10.17487/RFC7911, July 2016, | |||
<https://www.rfc-editor.org/info/rfc7911>. | <https://www.rfc-editor.org/info/rfc7911>. | |||
[RFC8203] Snijders, J., Heitz, J., and J. Scudder, "BGP | ||||
Administrative Shutdown Communication", RFC 8203, | ||||
DOI 10.17487/RFC8203, July 2017, | ||||
<https://www.rfc-editor.org/info/rfc8203>. | ||||
Appendix A. Alternative techniques with limited applicability | Appendix A. Alternative techniques with limited applicability | |||
A few alternative techniques have been considered to provide graceful | A few alternative techniques have been considered to provide graceful | |||
shutdown capabilities but have been rejected due to their limited | shutdown capabilities but have been rejected due to their limited | |||
applicability. This section describe them for possible reference. | applicability. This section describes them for possible reference. | |||
A.1. Multi Exit Discriminator tweaking | A.1. Multi Exit Discriminator tweaking | |||
The MED attribute of the paths to be avoided can be increased so as | The MED attribute of the paths to be avoided can be increased so as | |||
to force the routers in the neighboring AS to select other paths. | to force the routers in the neighboring AS to select other paths. | |||
The solution only works if the alternate paths are as good as the | The solution only works if the alternate paths are as good as the | |||
initial ones with respect to the Local-Pref value and the AS Path | initial ones with respect to the LOCAL_PREF value and the AS Path | |||
Length value. In the other cases, increasing the MED value will not | Length value. In the other cases, increasing the MED value will not | |||
have an impact on the decision process of the routers in the | have an impact on the decision process of the routers in the | |||
neighboring AS. | neighboring AS. | |||
A.2. IGP distance Poisoning | A.2. IGP distance Poisoning | |||
The distance to the BGP NEXT_HOP corresponding to the maintained | The distance to the BGP NEXT_HOP corresponding to the maintained | |||
session can be increased in the IGP so that the old paths will be | session can be increased in the IGP so that the old paths will be | |||
less preferred during the application of the IGP distance tie-break | less preferred during the application of the IGP distance tie-break | |||
rule. However, this solution only works for the paths whose | rule. However, this solution only works for the paths whose | |||
alternates are as good as the old paths with respect to their Local- | alternates are as good as the old paths with respect to their | |||
Pref value, their AS Path length, and their MED value. | LOCAL_PREF value, their AS Path length, and their MED value. | |||
Also, this poisoning cannot be applied when nexthop self is used as | Also, this poisoning cannot be applied when BGP NEXT_HOP self is used | |||
there is no nexthop specific to the maintained session to poison in | as there is no BGP NEXT_HOP specific to the maintained session to | |||
the IGP. | poison in the IGP. | |||
Appendix B. Configuration Examples | Appendix B. Configuration Examples | |||
This appendix is non-normative. | This appendix is non-normative. | |||
Example routing policy configurations to honor the GRACEFUL_SHUTDOWN | Example routing policy configurations to honor the GRACEFUL_SHUTDOWN | |||
well-known BGP community. | well-known BGP community. | |||
B.1. Cisco IOS XR | B.1. Cisco IOS XR | |||
skipping to change at page 9, line 13 ¶ | skipping to change at page 9, line 13 ¶ | |||
following the injection of new best paths within the IBGP topology. | following the injection of new best paths within the IBGP topology. | |||
C.2.1. Unreachability local to the ASBR | C.2.1. Unreachability local to the ASBR | |||
An ASBR that selects as best a path received over a newly established | An ASBR that selects as best a path received over a newly established | |||
EBGP session may transiently drop traffic. This can typically happen | EBGP session may transiently drop traffic. This can typically happen | |||
when the NEXT_HOP attribute differs from the IP address of the EBGP | when the NEXT_HOP attribute differs from the IP address of the EBGP | |||
peer, and the receiving ASBR has not yet resolved the MAC address | peer, and the receiving ASBR has not yet resolved the MAC address | |||
associated with the IP address of that "third party" NEXT_HOP. | associated with the IP address of that "third party" NEXT_HOP. | |||
A BGP speaker implementation may avoid such losses by ensuring that | A BGP speaker implementation MAY avoid such losses by ensuring that | |||
"third party" NEXT_HOPs are resolved before installing paths using | "third party" NEXT_HOPs are resolved before installing paths using | |||
these in the RIB. | these in the RIB. | |||
Alternatively, the operator (script) may ping third party NEXT_HOPs | Alternatively, the operator (script) MAY ping third party NEXT_HOPs | |||
that are expected to be used before establishing the session. By | that are expected to be used before establishing the session. By | |||
proceeding like this, the MAC addresses associated with these third | proceeding like this, the MAC addresses associated with these third | |||
party NEXT_HOPs are resolved by the startup initiator. | party NEXT_HOPs are resolved by the startup initiator. | |||
C.2.2. IBGP convergence | C.2.2. IBGP convergence | |||
During the establishment of an EBGP session, in some corner cases a | During the establishment of an EBGP session, in some corner cases a | |||
router may have no path toward an affected prefix, leading to loss of | router may have no path toward an affected prefix, leading to loss of | |||
connectivity. | connectivity. | |||
A typical example for such transient unreachability for a given | A typical example for such transient unreachability for a given | |||
prefix is the following: | prefix is the following: | |||
Let's consider 3 route reflectors RR1, RR2, RR3. There is a full | Let's consider three Route Reflectors (RR): RR1, RR2, RR3. There is | |||
mesh of IBGP sessions between them. | a full mesh of IBGP sessions between them. | |||
1. RR1 is initially advertising the current best path to the | 1. RR1 is initially advertising the current best path to the | |||
members of its IBGP RR full-mesh. It propagated that path within | members of its IBGP RR full-mesh. It propagated that path within | |||
its RR full-mesh. RR2 knows only that path toward the prefix. | its RR full-mesh. RR2 knows only that path toward the prefix. | |||
2. RR3 receives a new best path originated by the startup | 2. RR3 receives a new best path originated by the startup | |||
initiator, being one of its RR clients. RR3 selects it as best, | initiator, being one of its RR clients. RR3 selects it as best, | |||
and propagates an UPDATE within its RR full-mesh, i.e., to RR1 and | and propagates an UPDATE within its RR full-mesh, i.e., to RR1 and | |||
RR2. | RR2. | |||
End of changes. 21 change blocks. | ||||
40 lines changed or deleted | 38 lines changed or added | |||
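
Illustrative sketch (non-normative, not part of either draft revision): the receiver policy of Section 4.1 and the initiator inbound policy of Section 4.2 can be modeled in a few lines of Python. The `Path` class, the function names, the peer names, and the default LOCAL_PREF of 100 are all hypothetical and chosen only for illustration. The community value 65535:0 (0xFFFF0000) is the value registered for GRACEFUL_SHUTDOWN in the IANA well-known communities registry; it does not appear in the excerpt above. The best-path function is a toy that compares LOCAL_PREF only and omits every other step of the BGP decision process.

```python
from dataclasses import dataclass, replace

# GRACEFUL_SHUTDOWN well-known community, 65535:0 (taken from the IANA
# registry, not from the diffed text above).
GRACEFUL_SHUTDOWN = 0xFFFF0000
# Section 4 of the -12 revision: the RECOMMENDED LOCAL_PREF value is 0.
GSHUT_LOCAL_PREF = 0


@dataclass(frozen=True)
class Path:
    """Hypothetical, minimal view of a BGP path for this sketch."""
    prefix: str
    peer: str
    local_pref: int = 100  # assumed default, not specified by the draft
    communities: frozenset = frozenset()


def receiver_inbound_policy(path: Path) -> Path:
    """Section 4.1: lower LOCAL_PREF on paths tagged GRACEFUL_SHUTDOWN."""
    if GRACEFUL_SHUTDOWN in path.communities:
        return replace(path, local_pref=GSHUT_LOCAL_PREF)
    return path


def initiator_inbound_policy(path: Path) -> Path:
    """Section 4.2: tag paths received over the session to be shut down
    and set their LOCAL_PREF to a low value."""
    return replace(
        path,
        communities=path.communities | {GRACEFUL_SHUTDOWN},
        local_pref=GSHUT_LOCAL_PREF,
    )


def best_path(paths):
    """Toy decision process: highest LOCAL_PREF wins, nothing else."""
    return max(paths, key=lambda p: p.local_pref, default=None)


if __name__ == "__main__":
    tagged = receiver_inbound_policy(
        Path("192.0.2.0/24", "asbr-a",
             communities=frozenset({GRACEFUL_SHUTDOWN}))
    )
    alternate = Path("192.0.2.0/24", "asbr-b")
    # With no alternate yet, the de-preferred path is still usable.
    print(best_path([tagged]).peer)
    # Once an alternate is learned, it is preferred over the tagged path.
    print(best_path([tagged, alternate]).peer)
```

The point of the sketch is the property stated in the Introduction: the old path is de-preferred rather than withdrawn, so a router that has not yet learned an alternate still has a valid route during convergence.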