Internet Engineering Task Force                                R. Shakir
Internet-Draft                                                        BT
Intended status: Informational                             July 30, 2012
Expires: January 31, 2013

Operational Requirements for Enhanced Error Handling Behaviour in BGP-4



   BGP is utilised as a key intra- and inter-Autonomous System routing
   protocol in modern IP networks.  The failure modes, as defined by
   the original protocol standards, are based on a number of
   assumptions around the impact of session failure.  Numerous
   incidents both in the global Internet routing table and within
   Service Provider networks have been caused by strict handling of a
   single invalid UPDATE message causing large-scale failures in one or
   more Autonomous Systems.

   This memo describes the current use of BGP-4 within Service Provider
   networks, and outlines a set of requirements for further work to
   enhance the mechanisms available to a BGP-4 implementation when
   erroneous data is detected.  Whilst this document does not provide
   specification of any standard, it is intended as an overview of a
   set of enhancements to BGP-4 to improve the protocol's robustness to
   suit its current deployment.

Status of this Memo

   This Internet-Draft is submitted in full conformance with the
   provisions of BCP 78 and BCP 79.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF).  Note that other groups may also distribute
   working documents as Internet-Drafts.  The list of current Internet-
   Drafts is at

   Internet-Drafts are draft documents valid for a maximum of six months
   and may be updated, replaced, or obsoleted by other documents at any
   time.  It is inappropriate to use Internet-Drafts as reference
   material or to cite them other than as "work in progress."

   This Internet-Draft will expire on January 31, 2013.

Copyright Notice

   Copyright (c) 2012 IETF Trust and the persons identified as the
   document authors.  All rights reserved.

   This document is subject to BCP 78 and the IETF Trust's Legal
   Provisions Relating to IETF Documents
   ( in effect on the date of
   publication of this document.  Please review these documents
   carefully, as they describe your rights and restrictions with respect
   to this document.  Code Components extracted from this document must
   include Simplified BSD License text as described in Section 4.e of
   the Trust Legal Provisions and are provided without warranty as
   described in the Simplified BSD License.

Table of Contents

   1.  Introduction
     1.1.  Role of BGP-4 in Service Provider Networks
     1.2.  Overview of Operator Requirements for BGP-4 Error Handling
   2.  Errors within BGP-4 UPDATE Messages
     2.1.  Classifying BGP Errors and Expected Error Handling
       2.1.1.  Critical BGP Errors
       2.1.2.  Semantic BGP Errors
   3.  Avoiding use of NOTIFICATION
   4.  Recovering RIB Consistency
   5.  Reducing the Impact of Session Reset
   6.  Operational Toolset for Monitoring BGP
   7.  Operational Complexities Introduced by Altering RFC4271
     7.1.  Reducing the Network Impact of Session Teardown
   8.  IANA Considerations
   9.  Security Considerations
   10. Acknowledgements
   11. References
     11.1. Normative References
     11.2. Informational References
   Author's Address

1.  Introduction

   Where BGP-4 [RFC4271] is deployed within both the Internet and
   Service Provider networks, numerous incidents have been recorded due
   to the manner in which [RFC4271] specifies that errors in routing
   information should be handled.  Whilst the error handling behaviour
   defined in the existing standards retains utility, the deployments
   of the protocol have changed within modern networks, resulting in
   significantly different demands for protocol robustness.  Whilst a
   number of Internet-Drafts have been written to begin to enhance the
   behaviour of BGP-4 in terms of the handling of erroneous messages,
   this memo intends to define a set of requirements for ongoing work.
   These requirements are considered from the perspective of a Network
   Operator, and hence this draft does not intend to define the
   protocol mechanisms by which such error handling behaviour is to be
   implemented.

1.1.  Role of BGP-4 in Service Provider Networks

   BGP was designed as an inter-Autonomous System (AS) routing
   protocol, and hence many of the error handling mechanisms within the
   protocol specification are designed to be conducive to this role.
   In general, this consideration as an inter-AS routing propagation
   mechanism results in the view that a BGP session propagates a
   relatively small amount of network-layer reachability information
   (NLRI) between two ASes.  In this case, resilience is expected to be
   provided at the session level for those adjacencies that are key to
   routing continuity (for example, it is expected that two networks
   peering via BGP would connect multiple times in order to safeguard
   against equipment or protocol failure).  In addition, there is some
   expectation of multiple paths to a particular NLRI being available -
   it would be expected that a network can fall back to utilising
   alternate, less direct, paths where a failure occurs.  The
   assumptions of the nature of BGP deployments resulted in the
   specification made in [RFC4271] whereby the receipt of an erroneous
   UPDATE message is reacted to by sending a NOTIFICATION message, and
   tearing down the adjacency with the remote speaker from whom the
   error was observed.

   Historically, network architectures would deploy an Interior Gateway
   Protocol (IGP) to carry infrastructure and customer routes, with an
   Exterior Gateway Protocol (EGP) such as BGP being utilised to
   propagate these routes to other Autonomous Systems.  However, with
   the growth of IP-based services, this is no longer considered best
   practice.  In order to ensure that route convergence within an AS is
   within acceptable time bounds, the amount of information carried
   within the IGP is significantly reduced - and tends to be only
   infrastructure routes.  iBGP is then utilised to propagate both
   customer and external routes within an AS.  As such, BGP has become
   an IGP, with traditional IGPs acting as a means by which to
   propagate the routing information which is required to establish a
   BGP session, and reach the egress node within the local routing
   domain.  This change in role presents different requirements for the
   robustness of BGP as a routing protocol - with the expectation of a
   similar level of robustness to that of an IGP being set.  The loss
   of an iBGP session can result in significant levels of
   unreachability internally to an AS, especially since there are
   typically limited (when compared to the Internet) signalling and
   forwarding paths available.

   Along with this change in role, the nature of the IP routing
   information that is carried has also changed.  BGP has become a
   ubiquitous means by which service information can be propagated
   between devices.  For instance, it is utilised to carry routing
   information for IP/MPLS VPN services as described in [RFC4364].
   Since there is an existing deployment of the protocol between PE
   devices in numerous networks, it has been adapted to propagate this
   routing information, as its use limits the number of routing
   protocols required on each device.  This additional information
   being propagated represents a large change in requirement for the
   error handling of the protocol - where session failure occurs, it is
   likely that a complete service outage is experienced by at least a
   subset of a network's customers, even where the erroneous packet
   occurred within a different sub-topology or service (a different
   address family, for example).  For this reason, there is a
   significant demand to avoid service affecting failures that may be
   triggered by routing information within a single sub-topology or
   service.

   The combination of the increased number of deployments of BGP-4 as
   an intra-AS routing protocol, its use for the propagation of
   additional types of routing and service information, and the growth
   of IP services has resulted in a substantial increase in the volume
   of information carried within BGP-4.  In numerous networks, RIB
   sizes of the order of millions of entries exist within individual
   BGP speakers, with particularly high-scale points exhibited at BGP
   speakers performing aggregation and border roles (such as ASBR or
   route reflector devices).  Clearly, an increase in the amount of
   routing information carried in BGP results in a significant number
   of services being impacted during failures, which is only amplified
   by a corresponding increase in recovery times.  Following a failure,
   there is a substantial time to learn, compute and distribute new
   paths, which results in a greater observed impact to services
   affected, and hence adds further weight to the requirement to avoid
   failures altogether or, at least, mitigate their impact to the
   narrowest scope possible (e.g., a specific NLRI).  Whilst an
   argument could be made that convergence time of BGP-4 could
   potentially be reduced through deployment of additional
   computational resource, it is notable that this solution is not
   necessarily straightforward from an implementation or deployment
   perspective (e.g., scaling computation resources within a single
   address-family is difficult).  Thus, significant challenges continue
   to exist for operators when scaling BGP-4 deployments, and hence
   mechanisms which improve the scalability of BGP-4 are very
   important.

   Both within Internet and multi-service routing architectures, a
   number of BGP sessions propagate a large proportion of the required
   routing information for network operation.  For instance, in
   Internet routing, these are typically BGP sessions which propagate
   the global routing table to an AS - failure of these sessions may
   have a large impact on network service, based on a single erroneous
   update.  In a multi-service environment, typical deployments utilise
   a small number of core-facing BGP sessions, typically towards route
   reflector devices.  Failure of these sessions may also result in a
   large impact to network operation.  Clearly, the avoidance of
   conditions requiring these sessions to fail is of great utility to
   any network operator, and provides further motivation for the
   revision of the existing error handling behaviour.

   Whilst the behaviour described in [RFC4271] is suited to ensuring
   that BGP messages with erroneous routing information are limited in
   scope (by means of session reset), with the above considerations, it
   is clear that this mechanism is not suited to all deployments.  It
   should, however, be noted that the change in scope affects only
   errors occurring after BGP session establishment.  There is no
   current operational requirement to amend the means by which error
   handling in session establishment, or liveliness detection, are
   performed.

1.2.  Overview of Operator Requirements for BGP-4 Error Handling

   It is the intention of this document to define a set of criteria to
   which a revised error handling mechanism in BGP-4 is required to
   conform.  The motivation for the definition of these requirements
   can be summarised based on certain behaviour currently present in
   the protocol that is not deemed acceptable within current
   operational deployments, or where there is a short-fall in the tool
   set available to an operator.  These key requirements can be
   summarised as follows:

   o  It is unacceptable within modern deployments of the BGP-4 protocol
      that a single erroneous UPDATE packet affects routes that it does
      not carry.  This requirement therefore requires some modification
      to the means by which erroneous UPDATE packets are handled, and
      reacted to - with a particular focus on avoiding the use of the
      NOTIFICATION message.

   o  It is recognised that some error conditions that occur within the
      BGP-4 protocol may not always be handled gracefully, and may
      result in conditions whereby an implementation cannot recover.
      In these (and similar) cases, it is undesirable for an operator
      that the reset of the BGP-4 session results in interruption to
      forwarding packets (by means of withdrawing routes installed by
      BGP-4 into a device's RIB, and subsequently FIB).  To this end,
      there is a requirement to define a session reset mechanism which
      provides session re-initialisation in a non-destructive manner.

   o  Further to the requirements to provide a more robust protocol, the
      current visibility into error conditions within the BGP-4 protocol
      is extremely limited - where further modifications to this
      behaviour are to be made, complexity is likely to be added.  Thus,
      to ensure that BGP-4 is manageable, there are requirements for
      mechanisms by which the protocol can be examined and monitored.

   This document describes each of these requirements in further depth,
   along with an overview of means by which they are expected to be
   achieved.  In addition, the mechanism by which the enhancements
   meeting these requirements are to interact is discussed.

2.  Errors within BGP-4 UPDATE Messages

   Both through analysis of incidents occurring within the Internet
   DFZ, and multi-service environments utilising BGP-4 to signal
   service or routing information, a number of different classes of
   errors within BGP-4 UPDATE messages have been observed.  In order to
   consider the applicability of enhanced error handling mechanisms, it
   is possible to divide these errors into a number of sub-classes,
   particularly focusing around the location of the error within the
   UPDATE message.

   Where an UPDATE message is considered invalid by a BGP speaker due
   to an error within a path attribute that is not NLRI (where the
   definition of NLRI includes reachability information encoded in the
   MP_REACH_NLRI and MP_UNREACH_NLRI attributes as specified in
   [RFC4760]), it is a requirement of any enhanced error handling
   mechanism to handle the error in a manner focused on the NLRI
   contained within the message found to be erroneous.  Since in this
   case, the message received from the remote peer is syntactically
   valid, it is considered that such an UPDATE is indicative of
   erroneous data within one or more path attributes.  The impact of the
   current behaviour defined within the protocol makes the implication
   that the BGP speaker from whom the message is received is now an
   invalid path for all NLRI announced via the session - which results
   in a disproportionate impact to overall network operation.  In
   particular scenarios (such as networks with centralised BGP route
   reflection) such action can result in a loss of all reachability to a
   network.  In other contexts (such as the Internet DFZ), it cannot be
   assumed that the BGP speaker from whom the UPDATE message is received
   is directly responsible for the erroneous information contained
   within the message.

   Two further error cases exist within UPDATE messages, both of which
   arise where some difficulty exists in parsing the entire BGP
   message: those where a valid NLRI attribute can still be extracted,
   and those where the NLRI attribute cannot be parsed.  In both cases,
   errors in the packing of attributes within a BGP message may have
   occurred.  Such errors are likely indicative of an error
   specifically caused by the remote BGP speaker.  It is, however,
   desirable to an operator that such errors are handled without
   affecting all NLRI across a BGP session.  As such, there is a key
   requirement to maximise the number of cases in which it is possible
   to extract NLRI from a BGP UPDATE message.  To this end, it is
   required that, where possible, the MP_REACH_NLRI and MP_UNREACH_NLRI
   attributes are utilised for encoding all NLRI (including IPv4
   Unicast), and that these attributes are included as the first
   attribute of a BGP UPDATE message (as originally recommended in
   [I-D.chen-ebgp-error-handling]).  Placing the NLRI-bearing attribute
   first maximises the number of cases in which NLRI can be extracted
   from an UPDATE, since an error in a later attribute does not prevent
   the NLRI from being read.  Where this is possible, it is again
   required that the error handling mechanisms utilised are applied
   directly to the NLRI included in the UPDATE.
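
   The benefit of placing the NLRI-bearing attribute first can be
   illustrated with a short Python sketch.  This is a minimal
   illustration and not part of any specification: the function names
   read_attribute and leading_mp_nlri are hypothetical, the attribute
   header layout (flags, type code, one- or two-octet length) is assumed
   to be that of [RFC4271], and the type codes 14 and 15 are assumed to
   be those assigned to MP_REACH_NLRI and MP_UNREACH_NLRI.

```python
import struct

MP_REACH_NLRI = 14       # assumed attribute type code
MP_UNREACH_NLRI = 15     # assumed attribute type code
EXTENDED_LENGTH = 0x10   # attribute flag: length field is two octets


def read_attribute(buf, offset):
    """Read one path attribute; return (type_code, value, next_offset).

    Raises ValueError if the attribute does not fit within buf.
    """
    if offset + 3 > len(buf):
        raise ValueError("truncated attribute header")
    flags, type_code = buf[offset], buf[offset + 1]
    if flags & EXTENDED_LENGTH:
        if offset + 4 > len(buf):
            raise ValueError("truncated extended length")
        (length,) = struct.unpack_from("!H", buf, offset + 2)
        start = offset + 4
    else:
        length = buf[offset + 2]
        start = offset + 3
    end = start + length
    if end > len(buf):
        raise ValueError("attribute overruns the path attribute field")
    return type_code, buf[start:end], end


def leading_mp_nlri(path_attrs):
    """Return (type_code, value) if the FIRST attribute carries MP NLRI.

    Only the first attribute is examined, so errors in any later
    attribute cannot prevent the NLRI from being recovered.
    """
    try:
        type_code, value, _ = read_attribute(path_attrs, 0)
    except ValueError:
        return None
    if type_code in (MP_REACH_NLRI, MP_UNREACH_NLRI):
        return type_code, value
    return None
```

   In this sketch, an UPDATE whose first attribute is MP_UNREACH_NLRI
   still yields its NLRI when a later attribute is malformed.  If the
   malformed attribute were placed first, the sketch would return None,
   as it makes no attempt to skip past an attribute it cannot trust.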

   For all cases whereby NLRI can be obtained from an UPDATE message, it
   is expected that the requirements outlined in Section 3 should be
   considered by any enhancement to the BGP-4 protocol.

   In the case that it is not possible to completely parse the NLRI
   attribute from the UPDATE message received from a peer, it is
   extremely likely that this is indicative of a serious error with
   either the process of attribute packing, or buffer usage on the
   remote BGP speaker.  In this case, clearly, it is not possible to
   apply any error handling mechanism that is limited to a specific set
   of NLRI, since an implementation has no knowledge of the NLRI
   included within the UPDATE message.  In addition, such errors are
   considered to be relatively fundamental to the operation of a BGP
   implementation, and hence may indicate a case whereby significant
   system errors have occurred.  The current BGP-4 standard results in a
   BGP speaker restarting a session with the remote BGP speaker.
   However, where such an error does occur, it is required that a
   graceful mechanism is utilised to provide a lower impact to network
   operation.  The requirements for enhancements of this nature to BGP-4
   are outlined in Section 5, with the requirements outlined therein
   focused on providing a means by which system integrity can be
   restored whilst allowing for continued network operation.

2.1.  Classifying BGP Errors and Expected Error Handling

   It is clearly of advantage for BGP-4 implementations to utilise a
   consistent set of error handling mechanisms for the different types
   of errors that are described in Section 2, and to provide consistent
   nomenclature to refer to them.  It is therefore suggested that
   errors that are indicative of larger scale failures of a BGP
   speaker, and hence require some error handling at the session level,
   are referred to as 'critical' errors, whilst those errors that are
   identified based on incorrect content of one or more attributes of a
   message are referred to as 'semantic' errors.

   The following sections consider only those errors that exist within
   the specifications at the time of writing.  It is recommended that
   the definition of any future extension to the BGP-4 specification
   also defines the error handling behaviour for that extension, and
   the category within which an implementation should consider errors
   within the extension.

2.1.1.  Critical BGP Errors

   As described in this document, it is of advantage to limit the
   number of 'critical' errors that occur within the protocol.
   Therefore, based on analysis of the processing of BGP UPDATE
   messages, it is required that 'critical' error handling behaviour is
   applied to:

   o  UPDATE Message Length errors - whereby the specified overall
      UPDATE message length is inconsistent with the sum of the Total
      Path Attribute and Withdrawn Routes lengths.  This is indicative
      of message packing failure, whereby the NLRI may not be correctly
      extracted.

   o  Errors parsing the NLRI attributes of an UPDATE message - where
      NLRI carried in either the IPv4 Unicast Advertised or Withdrawn
      Routes, or in the MP_REACH_NLRI or MP_UNREACH_NLRI attributes
      [RFC2858], cannot be parsed, it is not possible to target error
      handling mechanisms to specific NLRI, and hence session level
      mechanisms must be utilised.
   It is expected that those requirements outlined in Section 5 are
   utilised to provide session-level handling of those errors identified
   as 'critical'.
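
   The first of these critical cases can be sketched in Python as a
   length consistency check.  This is an illustrative sketch under
   stated assumptions, not a definitive implementation: the function
   name split_update is hypothetical, and the constants assume the
   [RFC4271] layout of a 19-octet message header followed by two
   2-octet length fields (23 octets in total before any variable-length
   content).  The message type is not checked here.

```python
import struct

BGP_HEADER_LEN = 19   # assumed: marker (16) + length (2) + type (1)
MIN_UPDATE_LEN = 23   # assumed: header + two 2-octet length fields


def split_update(message):
    """Split a whole UPDATE message into (withdrawn, path_attrs, nlri).

    Raises ValueError when the length fields are inconsistent, which
    is the 'critical' case: the NLRI cannot be reliably located.
    """
    if len(message) < MIN_UPDATE_LEN:
        raise ValueError("UPDATE shorter than the minimum length")
    (declared_len,) = struct.unpack_from("!H", message, 16)
    if declared_len != len(message):
        raise ValueError("header length does not match message size")

    body = message[BGP_HEADER_LEN:]
    (withdrawn_len,) = struct.unpack_from("!H", body, 0)
    if 4 + withdrawn_len > len(body):
        raise ValueError("Withdrawn Routes Length too large")
    (attrs_len,) = struct.unpack_from("!H", body, 2 + withdrawn_len)
    if withdrawn_len + attrs_len + MIN_UPDATE_LEN > declared_len:
        raise ValueError("length fields exceed the message length")

    withdrawn = body[2:2 + withdrawn_len]
    attrs_start = 4 + withdrawn_len
    path_attrs = body[attrs_start:attrs_start + attrs_len]
    nlri = body[attrs_start + attrs_len:]
    return withdrawn, path_attrs, nlri
```

   A ValueError from this sketch corresponds to an error that must be
   handled at the session level, since no boundary between the
   attributes and the NLRI can be trusted.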

2.1.2.  Semantic BGP Errors

   Where a BGP message is correctly formed, a number of cases exist
   whereby the contents of the UPDATE are not valid.  These cases
   represent errors that can be identified as affecting specific NLRI.
   The following cases are expected to be classified as semantic
   errors:
   o  Zero or invalid length errors in path attributes excluding those
      containing NLRI, or where the length of all path attributes
      contained within the UPDATE does not correspond to the total path
      attributes length.  In this case, the NLRI can be correctly
      extracted, and hence acted upon.

   o  Messages where invalid data or flags are contained in a path
      attribute that does not relate to the NLRI.

   o  UPDATE messages missing mandatory attributes, unrecognised non-
      optional attributes or those that contain duplicate or invalid
      attributes (be they unsupported or unexpected).

   o  Those messages where the NEXT_HOP, or MP_REACH next-hop, values
      are missing, of zero length, or invalid for the relevant AFI/SAFI.

   In these cases, it is expected that these errors can be handled
   gracefully, following the requirements detailed in Section 3 and
   Section 4 of this memo.
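
   The two categories described in Section 2.1.1 and Section 2.1.2 can
   be summarised as a small decision function.  The Python sketch below
   is illustrative only: the names ErrorClass and classify_update are
   hypothetical, and the three inputs are assumed to be produced by an
   implementation's own UPDATE parser.

```python
from enum import Enum


class ErrorClass(Enum):
    NONE = "none"
    SEMANTIC = "semantic"    # handled per NLRI
    CRITICAL = "critical"    # handled at the session level


def classify_update(lengths_consistent, nlri_parsed, attribute_errors):
    """Classify an UPDATE using the categories described above.

    lengths_consistent: the message length fields agree with each other
    nlri_parsed:        the NLRI could be extracted and parsed
    attribute_errors:   list of problems found in non-NLRI attributes
    """
    if not lengths_consistent or not nlri_parsed:
        # The affected NLRI cannot be identified.
        return ErrorClass.CRITICAL
    if attribute_errors:
        # The NLRI is known, so handling can be limited to it.
        return ErrorClass.SEMANTIC
    return ErrorClass.NONE
```

   The ordering of the checks reflects the text: an error is only
   treated as semantic once it is known that the NLRI can be extracted.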

3.  Avoiding use of NOTIFICATION

   The error handling behaviour defined in [RFC4271] is problematic due
   to the limited options that are available to an implementation.  At
   the current time, when an erroneous BGP message is received, the
   implementation must either ignore the error, or send a NOTIFICATION
   message, after which it is mandatory to terminate the BGP session.
   It is apparent that this requirement is at odds with that of
   protocol robustness.
   There is significant complexity to this requirement.  The mechanism
   defined in [I-D.chen-ebgp-error-handling] describes a means by which
   no NOTIFICATION message is generated for all cases whereby NLRI can
   be extracted from an UPDATE.  The NLRI contained within the erroneous
   UPDATE message is considered as though the remote BGP speaker has
   provided an UPDATE marking it as withdrawn.  This limits the
   propagation of the invalid routing information, whilst also
   ensuring that no traffic is forwarded via a previously-known path
   that may no longer be valid.  This mechanism is referred to as
   "treat-as-withdraw".
   Whilst this behaviour results in avoiding a NOTIFICATION message,
   keeping other routing information advertised by the remote BGP
   speaker within the RIB, it may result in unreachability for a sub-set
   of the NLRI advertised by the remote speaker.  Two cases should be
   considered - that where the entry for a route in the Adj-RIB-In of
   the neighbour propagating an erroneous packet is utilised, and that
   where the route installed in the device's RIB is learnt from another
   BGP speaker.  In the former case, should the identified NLRI not be
   treated as withdrawn, the original NLRI is utilised within the global
   RIB.  However, this information is potentially now invalid (i.e. it
   no longer provides a valid forwarding path), whilst an alternate
   (valid) path may exist in another Adj-RIB-In.  By continuing to
   utilise the NLRI for which the UPDATE was considered invalid, traffic
   may be forwarded via an invalid path, resulting in routing loops, or
   black-holing.  In the second case, no impact to the forwarding of
   traffic, or global RIB, is incurred, yet where treat-as-withdraw is
   implemented, possibly stale routing information is purged from the
   Adj-RIB-In of the neighbour propagating errors.
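
   The two cases above can be illustrated with a small Python model of
   per-neighbour Adj-RIB-Ins.  This is a sketch under simplifying
   assumptions and not a definitive implementation: the class name Rib
   and its methods are hypothetical, and best path selection is reduced
   to "highest local_pref wins", which is far simpler than the decision
   process of a real BGP speaker.

```python
class Rib:
    """Toy model: one Adj-RIB-In per neighbour, keyed by prefix."""

    def __init__(self):
        self.adj_rib_in = {}   # neighbour -> {prefix: attributes}

    def learn(self, neighbour, prefix, attrs):
        self.adj_rib_in.setdefault(neighbour, {})[prefix] = attrs

    def treat_as_withdraw(self, neighbour, prefixes):
        """Remove only the NLRI carried in the erroneous UPDATE."""
        routes = self.adj_rib_in.get(neighbour, {})
        for prefix in prefixes:
            routes.pop(prefix, None)

    def best_path(self, prefix):
        """Return the neighbour providing the best path, or None.

        Assumption: the highest local_pref is preferred.
        """
        candidates = []
        for neighbour, routes in self.adj_rib_in.items():
            if prefix in routes:
                local_pref = routes[prefix]["local_pref"]
                candidates.append((local_pref, neighbour))
        if not candidates:
            return None
        return max(candidates)[1]
```

   With routes for the same prefix learnt from two neighbours, treating
   the preferred neighbour's route as withdrawn causes the alternate
   path to be selected, whilst other prefixes learnt from that
   neighbour are unaffected.  Where no alternate path exists, the
   prefix becomes unreachable, which is the cost of this mechanism
   noted above.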

   Whilst mechanisms such as "treat-as-withdraw" are currently
   documented, the proposals are limited in their scope - particularly
   in terms of restrictions to implementation only on eBGP sessions.
   This limitation is made based on the view that the BGP RIB must be
   consistent across an autonomous system.  By implementing treat-as-
   withdraw for an iBGP session, one or more routers within the
   autonomous system may not have reachability to a route, and hence
   blackholing of traffic, or routing loops, may occur.  It should,
   however, be considered whether this view is valid, in light of the
   manner in which BGP is utilised within operator networks.
   Inconsistency in a RIB based on a single UPDATE being treated as
   withdrawn may cause an inconsistency in a single sub-topology (e.g.,
   a Layer 3 VPN service), or a service not operating completely (in
   the case of an UPDATE carrying service membership information).
   Where a NOTIFICATION and teardown is utilised, this is destructive
   to all sub-topologies in all address family identifiers (AFIs)
   carried by the session in question.
   Even where mechanisms such as multi-session BGP are utilised, a whole
   AFI is affected by such a NOTIFICATION message.  In terms of routing
   operation, it is therefore far less costly to endure a situation
   where a limited sub-set of routing information within an AS is
   invalid, than to consider all routing information as invalid based on
   a single trigger.

   At the time of writing, error handling mechanisms related to
   optional, transitive attributes, such as
   [I-D.ietf-idr-optional-transitive], are restricted to handling only a
   subset of attribute errors, whereas the operational requirement is
   to expand this coverage to the widest set of errors possible (i.e.,
   all semantic errors within UPDATE messages).  Additionally, where
   approaches applicable to a greater number of attributes are proposed
   (e.g., [I-D.chen-ebgp-error-handling]), these are limited to
   deployment in eBGP applications only, where requirements also exist
   in intra-domain cases.  As such, it is envisaged that if extended to
   cover these expanded cases, these mechanisms provide a means to avoid
   the transmission of a NOTIFICATION message to a remote BGP speaker,
   based on a single erroneous message, where at all possible, and hence
   meet this requirement.  Critical errors, including those whereby the
   NLRI cannot be extracted from the UPDATE message, represent cases
   whereby the receiving system cannot handle the error gracefully based
   on this mechanism.
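
   As an illustration only, the distinction drawn above between errors
   for which the NLRI can still be extracted and critical errors can be
   sketched as follows.  This is a minimal sketch rather than a
   specification: the ParsedUpdate structure, the Action names and the
   handle_update function are hypothetical, and the Adj-RIB-In is
   modelled as a simple dictionary keyed by prefix.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Optional


class Action(Enum):
    ACCEPT = "accept"
    TREAT_AS_WITHDRAW = "treat-as-withdraw"
    SESSION_RESET = "session-reset"


@dataclass
class ParsedUpdate:
    """Hypothetical result of parsing one UPDATE message."""
    nlri: Optional[List[str]]  # None when the NLRI could not be extracted
    attributes: Dict[str, object] = field(default_factory=dict)
    attribute_errors: List[str] = field(default_factory=list)


def handle_update(update: ParsedUpdate, adj_rib_in: Dict[str, dict]) -> Action:
    """Apply one UPDATE to the neighbour's Adj-RIB-In (prefix -> attributes)."""
    if update.nlri is None:
        # Critical error: the affected prefixes are unknown, so the error
        # cannot be contained to a subset of the routing information.
        return Action.SESSION_RESET
    if update.attribute_errors:
        # Semantic error with extractable NLRI: remove only these prefixes
        # and leave every other route learnt from the neighbour in place.
        for prefix in update.nlri:
            adj_rib_in.pop(prefix, None)
        return Action.TREAT_AS_WITHDRAW
    for prefix in update.nlri:
        adj_rib_in[prefix] = dict(update.attributes)
    return Action.ACCEPT
```

   In this sketch a single erroneous UPDATE removes only the prefixes it
   carried, whereas the session reset branch corresponds to the existing
   NOTIFICATION behaviour.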

4.  Recovering RIB Consistency

   The recommendations described in Section 3 may result in the RIB for
   a topology within an AS being inconsistent across the AS' internal
   routers.  Alternatively, where such mechanisms are deployed at an AS
   boundary, interconnects between two ASes may be inconsistent with
   each other.  There are therefore risks of traffic blackholing, due to
   missing routing information, or forwarding loops.  Whilst this is
   deemed an acceptable compromise in the short term, clearly, it is
   suboptimal.  Therefore, a requirement exists to provide mechanisms by
   which a BGP speaker is able to recover the consistency of the Adj-
   RIB-In for a particular neighbour.

   In the general case, the consistency of the BGP RIB can be recovered
   by requesting that the entire Adj-RIB-Out of a remote BGP speaker be
   re-advertised.  A mechanism to achieve this re-advertisement is
   defined within the ROUTE-REFRESH specification [RFC2918].  It is
   envisaged that by requesting a refresh of all NLRI advertised by a
   BGP speaker, any NLRI which has been withdrawn due to being contained
   within an invalid UPDATE message is re-learnt.  Where a ROUTE-REFRESH
   is used to directly perform a consistency check between the Adj-RIB-
   Out of a remote device, and the Adj-RIB-In of the local BGP speaker,
   a demarcation between the ROUTE-REFRESH, and normal UPDATE messages
   is required (in order that an "end" of the refresh can be used to
   identify any 'stale' NLRI).
   [I-D.ietf-idr-bgp-enhanced-route-refresh] provides a means by which
   the ROUTE-REFRESH mechanism can be extended to meet this requirement.
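
   The role of the demarcation can be illustrated with a mark-and-sweep
   over the Adj-RIB-In.  The following is a minimal sketch under stated
   assumptions: the AdjRibIn class and its method names are hypothetical
   and do not reflect the encoding of any particular extension; the
   begin and end calls stand in for whatever markers delimit the
   refresh.

```python
class AdjRibIn:
    """Hypothetical Adj-RIB-In for one neighbour (prefix -> attributes)."""

    def __init__(self):
        self.routes = {}
        self._stale = None  # prefixes not yet re-advertised in this refresh

    def begin_refresh(self):
        # Start of refresh: every route currently held is a stale candidate.
        self._stale = set(self.routes)

    def update(self, prefix, attrs):
        self.routes[prefix] = attrs
        if self._stale is not None:
            self._stale.discard(prefix)

    def withdraw(self, prefix):
        self.routes.pop(prefix, None)
        if self._stale is not None:
            self._stale.discard(prefix)

    def end_refresh(self):
        # End of refresh: anything not re-advertised is purged as stale.
        purged = sorted(self._stale or ())
        for prefix in purged:
            del self.routes[prefix]
        self._stale = None
        return purged
```

   Without an explicit end marker the receiver has no point at which it
   can conclude that a route absent from the refresh is stale, which is
   why the demarcation is required.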

   Whilst re-advertisement of the whole BGP RIB provides a means by
   which withdrawn NLRI can be re-advertised, there are some scaling
   implications that must be considered.  In the case that a ROUTE-
   REFRESH is generated, all NLRI must be re-packed into UPDATE messages
   and advertised by one speaker on the BGP session, whilst the other
   must receive all UPDATE messages, and validate the RIB's consistency.
   In order to avoid the control-plane load, it is therefore a
   requirement to utilise targeted mechanisms where possible, rather
   than incurring the additional load on both the advertising and
   receiving speaker of building and processing UPDATEs for the entire
   contents of the RIB.

   It is envisaged that during routing inconsistencies caused by
   utilising the 'treat-as-withdraw' mechanism, the local BGP speaker is
   aware that some routing information was not able to be processed -
   due to the fact that an UPDATE message was not parsed correctly.
   Since this mechanism (as discussed in Section 3) requires the local
   BGP speaker to have determined the set of NLRI for which an erroneous
   UPDATE message was received, it is possible to use a targeted
   mechanism to re-request the specific NLRI that was contained within
   the erroneous UPDATE message.  By re-requesting, this provides the
   remote BGP speaker an opportunity to re-transmit the NLRI - possibly
   providing an opportunity to leverage alternative methods to build the
   UPDATE message.  Such a request requires extension to the existing
   BGP-4 protocol, in terms of specific UPDATE generation filters with a
   transient lifetime.  It is envisaged that the work within
   [I-D.zeng-idr-one-time-prefix-orf] provides a mechanism allowing
   targeted elements of the Adj-RIB-In for a BGP neighbour to be
   re-requested.

   It is of particular note for both means of recovering RIB consistency
   described that these are effective only when considering transient
   errors within an implementation - for instance, should an RFC
   interpretation error within an implementation be present, regardless
   of the number of times a specific UPDATE is generated, it is likely
   that this error condition will persist (as it may with the existing
   behaviour defined by [RFC4271]).  For this reason, there is a
   requirement to consider the means by which such consistency recovery
   mechanisms are utilised.  It is not advisable that a dynamic filter
   and advertisement mechanism is triggered by all error handling events
   due to the load this is likely to place on the neighbour receiving
   such a request.  Where this BGP speaker is a relatively centralised
   device - a route reflector (as described by [RFC4456]) for example -
   the act of generation of UPDATE messages with such frequency is
   likely to cause disproportionate load.  It is therefore an
   operational requirement that any such extension provide a means of
   request dampening.
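
   One possible form of request dampening is an exponential back-off
   applied per neighbour.  The sketch below is illustrative only: the
   RefreshDampener class, its parameters and the chosen intervals are
   assumptions, and the current time is passed in explicitly so that the
   behaviour is deterministic.

```python
class RefreshDampener:
    """Hypothetical per-neighbour limiter for targeted refresh requests."""

    def __init__(self, base_interval=5.0, max_interval=300.0):
        self.base_interval = base_interval
        self.max_interval = max_interval
        self._state = {}  # neighbour -> (earliest next request, next interval)

    def allow(self, neighbour, now):
        """Return True if a request may be sent to `neighbour` at time `now`."""
        earliest, interval = self._state.get(neighbour, (0.0, self.base_interval))
        if now < earliest:
            return False  # dampened: too soon after the previous request
        # Each permitted request doubles the wait before the next one,
        # up to the configured ceiling.
        next_interval = min(interval * 2, self.max_interval)
        self._state[neighbour] = (now + interval, next_interval)
        return True
```

   A deployed mechanism would also need to decay the interval once
   errors cease; this is omitted here for brevity.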

   In cases whereby the consistency of the Adj-RIB-In is to be restored
   (e.g., following the 'treat-as-withdraw' behaviour described in
   Section 3), and mechanisms such as those described herein are
   triggered, such a condition should be noted to an operator by means
   of a specific flag, SNMP trap, or other logging mechanism.  In order
   to identify the subset of NLRI that are considered to be
   inconsistent, this information is of operational benefit and hence
   should be logged.

5.  Reducing the Impact of Session Reset

   Even where protocol enhancements allow errors in the BGP-4 protocol
   to cease to trigger NOTIFICATION messages, and hence reset a BGP
   session, it is clear that some error conditions may not be exited.
   In particular, errors due to existing state, or memory structures,
   associated with a specific BGP session will not be handled.  It is
   therefore important to consider how these error conditions are
   currently handled by the protocol.  It should be noted that the
   following discussion and analysis considers only those NOTIFICATION
   messages generated in response to errors in UPDATE messages (as
   defined by Section 6.3 in [RFC4271]).

   The existing NOTIFICATION behaviour triggers a reset of all elements
   of the BGP-4 session, as described in Section 6 of [RFC4271].  It is
   expected that session teardown requires an implementation to re-
   initialise all structures and state required for session maintenance.
   Clearly, there is some utility to this requirement, as error
   conditions in BGP are, in general, exited from.  However, this
   definition is responsible for the forwarding outages within networks
   utilising BGP for propagation of routing or service when each error
   is experienced.  The requirement described in Section 3 is intended
   to reduce the cases whereby a NOTIFICATION is required; however, any
   mechanism implemented as a response to this requirement by definition
   cannot provide a session reset to the extent of that achieved by the
   current behaviour.

   In order to address this, there is a requirement for a means by which
   a BGP speaker can signal that an unhandled error condition in an
   UPDATE message occurred - requiring a session reset - yet also
   continue to utilise the paths advertised by the neighbour that are
   currently in use within the RIB.  In this case, the Adj-RIB-In
   received from the neighbour is not considered invalid, despite a
   NOTIFICATION, and session reset, being required.  This set of
   requirements is akin to those answered by the BGP Graceful Restart
   mechanism described in [RFC4724].  Since the operational requirement
   in this case is to provide a means to achieve a complete session
   restart without disrupting the forwarding path of those routes in use
   within a BGP speaker's RIB, it is expected that utilising a procedure
   similar to the Graceful Restart mechanism meets the error handling
   requirement.  Responding to an error condition (repeated or
   otherwise) with a message indicating that an error that cannot be
   handled has occurred, forcing session reset, whilst retaining
   forwarding information within the RIB, allows forwarding to all
   routes within a system's RIB to continue during the period in which
   the session restarts.  It is envisaged that the additional complexity
   introduced by the introduction of such a mechanism can be limited by
   extending existing BGP messages - one such approach is proposed in
   [I-D.ietf-idr-bgp-gr-notification].  By placing a time bound on the
   restart lifetime, should an error condition not be transient - for
   example, should an error have occurred with the BGP process, rather
   than with a specific BGP session - the remote BGP speaker is still
   detected as an invalid device for forwarding.
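
   The effect of the time bound can be sketched as follows.  This is a
   simplified illustration, not the procedure of [RFC4724]: the
   forwarding_routes function and its parameter names are hypothetical,
   and the re-advertisement that follows a successful restart is not
   modelled.

```python
from typing import Dict, Optional


def forwarding_routes(retained: Dict[str, str], reset_at: float,
                      restart_time: float, now: float,
                      reestablished_at: Optional[float] = None) -> Dict[str, str]:
    """Return the neighbour's routes still eligible for forwarding at `now`.

    `retained` holds the routes learnt before the reset (prefix -> next hop);
    `restart_time` is the bound placed on the session restart.
    """
    deadline = reset_at + restart_time
    recovered = reestablished_at is not None and reestablished_at <= deadline
    if recovered or now <= deadline:
        # Either the session came back within the bound, or the bound has
        # not yet expired: forwarding continues on the retained routes.
        return dict(retained)
    # The restart did not complete in time: the neighbour is treated as an
    # invalid device for forwarding and its routes are removed.
    return {}
```

   A persistent failure therefore still results in the routes being
   withdrawn, only later than under an immediate teardown.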

   In some cases, the erroneous condition may be due to corruption of
   the Adj-RIB-Out on the advertising BGP speaker - rather than caused
   by the receiving speaker's state.  In these cases, where existing
   structures are replayed whilst performing graceful restart
   functionality, the error condition is not necessarily resolved.
   Therefore, it is recommended that during a session restart event, as
   described within this section, the advertising speaker purge and
   rebuild RIB structures, in order to resolve any corruption within
   these structures.

   It should be noted that a protocol enhancement meeting this
   requirement is not able to solve all error conditions - however, a
   complete restart of the BGP and TCP session between two BGP speakers
   implements an identical recovery mechanism to that which is achieved
   by the existing behaviour.  Where an error condition such as memory
   or configuration corruption has occurred in a BGP implementation, it
   is expected that a mechanism meeting this requirement continues to
   detect this, by means of a bound on time for session restart to
   occur.  Whilst there may be some consideration that packets continue
   to be forwarded through a device which can be in a failure mode of
   this nature for a longer period due to this requirement, the
   architecture of modern IP routers should be considered.  A divided
   forwarding and control plane is common in many devices, as well as
   process separation for software-based devices - corruption of a
   specific protocol daemon does not necessarily imply forwarding is
   affected.  Indeed, where forwarding behaviour of a device is
   affected, it is envisaged that a failure detection mechanism (be it
   Bidirectional Forwarding Detection, or indeed BGP KEEPALIVE packets)
   will detect such a failure in almost all cases, with the symptomatic
   behaviour of such a failure being an invalid UPDATE message in very
   few other cases.

6.  Operational Toolset for Monitoring BGP

   A significant complexity that is introduced through the requirements
   defined in this document is that of monitoring BGP session status for
   an operator.  Although the existing error handling behaviour causes a
   disproportionate failure, session failure is extremely visible to
   most operational personnel within a Network Operator due to both
   existing definitions of SNMP trap mechanisms for BGP, along with the
   forwarding impact typically caused by such a failure.  By introducing
   mechanisms by which errors of this nature are not as visible, this is
   no longer the case.  There is a requirement that where subsets of the
   RIB on a device are no longer reachable from a BGP speaker, or indeed
   an AS, some visibility of this situation, alongside a mechanism
   to determine the cause, is available to an operator.  Whilst, to some
   extent, this can be solved by mandating a sub-requirement of each of
   the aforementioned requirements that a BGP speaker must log where
   such errors occur, and are hence handled, this does not solve all
   cases.  In order to clarify this requirement, the example of the
   transmission of an erroneous Optional Transitive attribute can be
   considered.  Since, by definition, there is no requirement for all
   BGP speakers to parse such an attribute, a receiving router may treat
   NLRI as withdrawn based on an erroneous attribute not examined by its
   neighbour.  In this case, the upstream device or network, propagating
   the UPDATE, has no visibility of this error.  Operationally, however,
   it is of interest to the upstream router operator that such invalid
   information was propagated.

   The requirement for logging of error conditions in transmitted BGP
   messages, which are visible to only the receiver, cannot be achieved
   by any existing BGP message, or capability.  It is envisaged that
   each erroneous event should be transmitted to the remote peer -
   including the information as to the set of NLRI that were considered
   invalid.  Whilst with some mechanisms this is achieved by default
   (for example, One-Time Prefix ORF (Outbound Route Filtering)
   [I-D.zeng-idr-one-time-prefix-orf] will transmit the set of routes
   that are required), the operator requirement is to know which routes
   may have been unreachable in all cases.  It is envisaged that an
   extension to meet this requirement will allow for such information to
   be transmitted between peers, and hence logged.  Such a mechanism may
   provide further utility as either a diagnostic, or logging toolset.

   As such, it is possible to divide the messages that are required in
   order to provide further visibility into BGP for an operator.  Such a
   division can be made both due to the required means of message
   transmission, alongside the criticality of each request.

   o  Messages required to replace NOTIFICATION - In cases where the
      error handling mechanisms defined by [RFC4271] currently result in
      a NOTIFICATION message being generated, a number of the
      requirements detailed within this document result in this message
      being suppressed.  Despite this change, the error condition's
      occurrence is still of interest to an operator in order to provide
      both monitoring and troubleshooting capabilities, since some form
      of invalid data has been received on a session.  It is therefore
      considered that an implementation must generate a message both
      locally, and transmitted to the remote peer, based on such a
      condition.  Where such a message is transmitted to the remote
      peer, it is considered that the BGP session via which the
      erroneous UPDATE message was received should be used as transport
      to the remote peer.  The information transmitted in such a message
      should be minimised to allow identification of the paths which
      were considered erroneous (i.e. restricting the information to
      that which is directly relevant to a network operator in the case
      of an error condition occurring).  Any delay to convergence on the
      session in question is considered to be acceptable, given the
      suboptimal nature of the reception of invalid routing information
      via a BGP session.  Further concerns regarding such a mechanism
      relate to the load generated on the BGP speaker in question;
      however, it must be considered that in the case of an erroneous
      UPDATE being received, and the 'treat-as-withdraw' mechanism being
      utilised, where the erroneous path is removed from the Loc-RIB,
      there is likely to be a requirement to generate UPDATE messages
      withdrawing the route from all further BGP speakers to which the
      prefix is advertised.  The load generated by the generation of
      such UPDATEs is likely to be much greater than that of
      transmitting error information via a logging message type back to
      the speaker from which it was received.  It is envisaged that
      light-weight BGP message-based signalling mechanisms such as the
      ADVISORY message types detailed in
      [I-D.ietf-idr-operational-message] provide a suitable means to
      satisfy this requirement.

   o  Additional Diagnostic Capabilities for BGP - In a number of cases,
      there is an operational requirement to further debug erroneous BGP
      UPDATE messages, along with the particulars of the state of a BGP
      speaker.  For instance, where an invalid BGP UPDATE message is
      transmitted between two BGP speakers, the exact format of the
      UPDATE message is of interest to an operator, as this information
      provides a clear indication of a message considered to be
      erroneous by the BGP speaker to which it was transmitted.  In this
      case, it is considered of great utility that the entire UPDATE
      message is transmitted back to the advertising speaker, in order
      to allow for further debugging to occur.  Whilst such information
      is particularly useful to an operator, it clearly provides
      information that is not key to protocol operation - for this
      reason, it is expected that some of the additional complexity, and
      load, that a BGP speaker is subjected to is not acceptable.  It is
      therefore required that where mechanisms are developed to support
      this requirement, messages of this nature can be supported both
      within an existing BGP session, and via a dedicated separate
      session, be it BGP carrying messages such as those defined in
      [I-D.ietf-idr-operational-message] or a dedicated monitoring
      protocol akin to BMP described in [I-D.ietf-grow-bmp].

   Whilst the operational requirement for such monitoring tools to allow
   for visibility into BGP is clearly agreed upon, the means by which
   such messages are transmitted between two BGP speakers is likely to
   be dependent upon the positions of the speakers in question (for
   instance, the requirements for such a protocol may differ where a
   session is between two ASBRs under separate administration).  The
   introduction of additional message types to the BGP protocol clearly
   introduces further complexity - and leaves room for further
   implementation and standardisation errors that may compromise the
   robustness of the BGP protocol.  In addition, next-hop
      values are missing, zero-length, or invalid for the queuing and
   scheduling of relevant
      address family.

   For these BGP messages must be interleaved with the
   transmission of Non-Critical errors, the key protocol messages - such as KEEPALIVE and
   UPDATE packets.  It is therefore a concern that NLRI-targeted error handling
   requirements described in Section 4 should a large number
   of messages specifically for operational visibility be transmitted,
   this will delay followed.

   In order to maximise the transmission number of UPDATE packets, and hence
   adversely affect cases whereby the end-to-end convergence time for NLRI carried
   within BGP.  The operational requirement for why messages are
   advantageous to attributes
   can be in-band to reliably extracted from a protocol should also be considered.
   In particular, it should be noted that received message, where such information is to
   be transmitted between administrative boundaries a BGP session
   represents an existing channel between the two ASes.  This channel is
   considered to be secure insofar as the routing information, and
   requests sent via
   speaker supports multi-protocol extensions, the session are considered to come from a trusted
   source.  Since error information relates to both a particular
   attachment, MP_REACH_NLRI and is key to ensuring that such a session is operating
   as expected, it is considered of great operational benefit that this
   information is transmitted over this channel.  In addition, the
   overall system scalability is improved by such in-band transmission.
   It is expected that erroneous information resulting in the 'treat-as-
   withdraw' mechanism being
   MP_UNREACH_NLRI attributes SHOULD be utilised is relatively infrequently
   transmitted between two peers (when compared to the frequency of
   UPDATE messages transmission).  The impact of including an additional
   BGP message type for such operational visibility is relatively small
   from a resource utilisation perspective - additional processing
   overhead is only experienced when such a message is received.  Where
   a separate session is maintained, particular network elements within
   a service provider topology may require hundreds, or thousands, of
   additional sessions for the transmission of this information.  Such
   an resource consumption overhead is likely to all address
   families (including IPv4 Unicast) and these attributes should be unacceptable the
   first attribute contained within the UPDATE message.

   Where attributes are introduced by future extensions to some
   network operators.

   For the reasons explained above, it is expected BGP
   protocol the error handling behaviour applied MUST be assumed that mechanisms
   applied to meet the requirements for event visibility consider Non-Critical errors, unless otherwise specified within the
   relative impacts of additional monitoring sessions,
   per-extension memo, or message
   inclusion in band the attribute relates directly to carrying
   NLRI.  Authors of future BGP in order not to compromise extensions SHOULD specify the security,
   scalability and robustness error
   handling behaviour required for new attributes in terms of the BGP-4 protocol.

7.  Operational Complexities Introduced by Altering RFC4271

   The existing NOTIFICATION and subsequent teardown of
   classification into a BGP session
   upon encountering an Critical or Non-Critical error has the advantage that on a consistent
   approach to per-
   attribute error basis.

4.  Error Handling for Non-Critical Errors

4.1.  NLRI-level Error Handling Requirements

   When a Non-Critical error handling is required of all implementations of detected within an UPDATE message a BGP
   speaker MUST NOT send a NOTIFICATION message to the
   BGP-4 protocol.  This is of operational advantage remote neighbour.
   Instead, the NLRI contained within the message MUST be considered as it provides
   no longer viable until they are updated by a
   clear expectation of subsequent UPDATE
   message, thus treating the NLRI as withdrawn as per the treat-as-
   withdraw mechanism described in [I-D.chen-ebgp-error-handling].

   Network operators SHOULD recognise that where such behaviour is
   implemented black-holing or looping of traffic may occur in the protocol.  The requirements
   defined herein add further complexity to
   period between the error-handling within
   BGP, NLRI being treated as withdrawn, and hence are liable to compromise subsequent
   updates, dependent upon the existing deterministic
   protocol behaviour. routing topology.  It is therefore deemed SHOULD be noted
   that there is a further
   requirement to define a set of recommended behaviours based on the
   reception such periods of RIB inconsistency (where one speaker has
   advertised a particular class of erroneous UPDATE message,
   alongside highlighting some of prefix, which has been treated as withdrawn by the
   receiving speaker) may be relatively long lived, based on situations
   such as an erroneous implementation complexities that
   may need to be handled in at the case that particular recommendations
   made receiver, or the error
   occurring within this memo are deployed.

   Utilising an optional, transitive attribute not examined by
   the classes advertising device.  In order to allow operators to select
   sessions on which this risk of erroneous UPDATE message described in
   Section 2, the recommended behaviour for a BGP-4 inconsistency is acceptable, an
   implementation SHOULD provide means by which NLRI-level error
   handling for Non-Critical errors can be divided into two branches.  Primarily, where disabled on a semantic error is
   identified, an implementation is expected to utilise per-session

   Since the reduced-
   impact Non-Critical error handling approach, as described in Section 3.  In the
   case that such an approach required within this section
   results in known NLRI no NOTIFICATION message being withdrawn from transmitted, the BGP speaker's RIB, and an implementation provides functionality
   such fact that these errors are recovered from through an automatically
   triggered means, such as those described within Section 4, some
   consideration of the scalability of these recovery mechanisms is
   required.  Clearly, there is
   an computational error has occurred and bandwidth overhead
   associated with the re-advertisement of NLRI hence there may be inconsistency between two
   the local and remote BGP speakers
   - both due speaker MUST be flagged to the generation of UPDATE messages, their transmission
   between network
   operator through standard operational interfaces (e.g., SNMP,
   syslog).  The information highlighted MUST include the two speakers, and NLRI
   identified to be contained within the parsing error message, and processing into SHOULD
   contain a exact copy of the RIB
   required.  This overhead is directly proportional to received message for further analysis.

   In order that the number operator of the BGP speaker from whom an erroneous
   UPDATE messages message has been advertised is aware of the fact that are required.  Where some
   NLRI advertised to the remote speaker have been considered withdrawn
   due to being contained within an erroneous UPDATE, a semantic BGP speaker
   SHOULD support mechanisms to report the occurrence of Non-Critical
   error is
   experienced, by definition handling to the remote speaker.  The receiving speaker SHOULD
   transmit the NLRI contained within the UPDATE can
   be extracted.  It is therefore possible erroneous message to minimise the proportion
   advertising speaker.  An exact copy of the RIB that received UPDATE message
   SHOULD also be sent.

   The exchange of information related to events occurring as a result
   of BGP messages is re-advertised not currently supported by targeting any recovery mechanism on extension to the NLRI contained
   protocol.  Clearly, where the two speakers reside within the erroneous UPDATE.  Such a targeted
   mechanism same
   administrative domain, shared logging infrastructure can be achieved through a means such as One-Time ORF, or
   other means utilised
   to identify the root cause of targeting UPDATE messages not discussed errors, however, in many cases
   neighbouring BGP speakers reside within separate administrative
   domains (e.g., are ASBRs for Internet or private networks).  In this
   memo.  It is recommended that where available, any automatic (or
   manual) triggered recovery mechanism behaviour utilises such targeted
   means in preference
   case, mechanisms allowing transmission in-band to any whole RIB refresh mechanism (such as

   In the case that BGP session
   SHOULD be utilised (e.g., the OPERATIONAL message described in
   [I-D.ietf-idr-operational-message]).  Such an erroneous UPDATE has been processed through in-band channel is
   preferred based on the BGP session representing a
   means such as treat-as-withdraw (described pre-established
   trusted channel which is related to a specific BGP-speaking device
   within Section 3), a
   recovering mechanism may be considered superfluous, if the assumption network.  It is made expected that the RIB inconsistency will only be recovered from based
   on overall system scalability
   of a path re-convergence (or change in BGP attribute) for speaker is improved through utilising the
   advertising BGP speaker. existing channel,
   rather than incurring overhead for maintaining many additional
   logging-specific protocol sessions for relatively infrequent
   messaging events when errors occur.  However, where this assumption is not
   considered the extensions
   providing such a channel MUST consider their impact to provide adequate recovery behaviour, base BGP
   protocol functions such as the transmission of UPDATE or KEEPALIVE
   messages, and a mechanism SHOULD limit the volume of messaging to direct
   reactions to
   restore RIB consistency automatically is implemented, some
   consideration must Non-Critical errors occurring.  These considerations
   SHOULD be made for where repeated erroneous messages
   occur.  In this case, in order to limit the impact ensure that no compromise is made to the
   security, scalability and robustness of BGP.  Where additional BGP
   speaker's network operation, at a pre-defined point it is recommended
   monitoring information that such automatic recovery is not suitable to be carried in-band is
   required, out-of-band mechanisms towards such as the BGP speaker from
   which BMP protocol described
   in [I-D.ietf-grow-bmp] could be utilised to provide further
   information relating to erroneous UPDATEs are repeatedly received are suppressed, messages.

4.2.  Recovering RIB Consistency following NLRI-level Error Handling

   Following NLRI being treated as withdrawn due to Non-Critical error
   handling, inconsistencies exist between the Adj-RIB-Out of the
   advertising BGP speaker, and the fact Adj-RIB-In of the receiving device.
   These inconsistencies may result in forwarding loops or blackholing
   of traffic in some routing topologies.  In order to ensure that such suppression has occurred is highlighted
   cases can be recovered from a means by which a validation and
   recovery of consistency can be achieved SHOULD be provided to an
   operator.  The point at which such behaviour is suppressed is to  This function may be
   defined on a per-implementation basis, taking into account feedback
   from provided through enhancing the Network Operator community based on ROUTE-
   REFRESH [RFC2918] mechanism to add means to identify the deployment beginning
   and end of the
   recommendations described in this document.  It is expected that such
   trigger points are dependent upon the mechanisms implemented for a
   particular BGP-4 implementations, and replay of the impact upon entire Adj-RIB-Out of the advertising
   speaker of
   these means of RIB recovery.

   Where critical errors are experienced, such that a session reset is
   required, (as per the mechanism discussed suggestion in Section 5 should be used.
   Again, since such

   As Non-Critical error handling is localised to the NLRI contained
   within the erroneous UPDATE message, a targeted recovery mechanism
   MAY be provided allowing a speaker to request re-advertisement of a
   particular subset of the Adj-RIB-Out.  Where such targeted refresh
   functions are available, they SHOULD be preferred to mechanisms
   requesting re-advertisement of the whole Adj-RIB-Out, based on their
   more limited use of CPU and network resources.

   A BGP speaker may automatically trigger recovery mechanisms such as
   those described in this section following the receipt of an
   erroneous UPDATE message identified as Non-Critical, in order to
   expedite recovery.  It should be noted that if automatic recovery
   mechanisms trigger only re-advertisement of an identical erroneous
   message, they are likely to be ineffective.  Additionally, where the
   best-path to be advertised by the remote speaker changes, this will
   be advertised directly, without a requirement for a request from the
   receiver.  However, in some cases, RIB consistency recovery
   mechanisms may prompt alternate UPDATE message packing, and hence
   allow quicker recovery.  Where such mechanisms are implemented,
   mechanisms focused on smaller sets of NLRI SHOULD be preferred over
   those requesting the entire RIB.  In addition, such mechanisms SHOULD
   have dampening mechanisms to ensure that their impact on
   computational and network resources is limited.
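
   The preference for targeted refresh and the dampening requirement
   described in this section can be illustrated with a minimal sketch.
   It is not taken from any implementation: the names RecoveryDampener
   and choose_recovery, the action labels and the 30-second interval are
   all assumptions made purely for illustration.

```python
import time
from dataclasses import dataclass, field


@dataclass
class RecoveryDampener:
    """Limits how often automatic RIB recovery is requested from one peer."""
    min_interval: float = 30.0  # assumed minimum seconds between requests
    _last_request: float = field(default=float("-inf"), init=False)

    def allow(self, now: float) -> bool:
        # Refuse a new request if the previous one was too recent.
        if now - self._last_request < self.min_interval:
            return False
        self._last_request = now
        return True


def choose_recovery(affected_nlri, targeted_supported, dampener, now=None):
    """Pick a recovery action after a Non-Critical error (hypothetical labels)."""
    now = time.monotonic() if now is None else now
    if not dampener.allow(now):
        return ("suppress", frozenset())
    if targeted_supported and affected_nlri:
        # Prefer re-advertisement of only the NLRI treated as withdrawn.
        return ("targeted-refresh", frozenset(affected_nlri))
    # Fall back to a replay of the whole Adj-RIB-Out.
    return ("full-refresh", frozenset())
```

   In this sketch a second error arriving within the dampening interval
   triggers no further request, which bounds the CPU and network cost of
   repeated identical erroneous messages.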

5.  Error Handling for Critical Errors

   Where an UPDATE message containing a Critical error is received,
   since the NLRI cannot be extracted, error handling mechanisms must be
   applied at the per-session level.  In order to limit the impact on
   network operation, these session-level mechanisms MUST be applied in
   a manner which allows the NLRI received from the remote speaker to
   continue to be utilised for forwarding during the session reset and
   re-establishment.  It is envisaged that this requirement may be met
   through means such as the extension of the BGP Graceful Restart
   mechanism ([RFC4724]) to be triggered by NOTIFICATION messages
   indicating the occurrence of a Critical error.  Such an extension
   allows a restart of the TCP and BGP sessions between two speakers,
   in a similar manner to the current session restart behaviour
   triggered by a NOTIFICATION message.  In order to maximise the level
   of re-initialisation which occurs during such a restart triggered by
   a Critical error, BGP speakers MAY re-initialise memory structures
   related to the Adj-RIB-In and Adj-RIB-Out associated with the BGP
   session on which the erroneous UPDATE was observed.

   Where such a restart event occurs, the continued liveness of the
   remote device MAY be verified by BGP KEEPALIVE packets or other OAM
   functions such as Bidirectional Forwarding Detection ([RFC5880]).  In
   cases where the observed Critical BGP error is indicative of a wider
   device failure of the remote speaker, it is expected that BGP
   sessions will not re-establish correctly.  Each BGP speaker SHOULD
   maintain a limited time window in which session restart is expected
   in order to mitigate this possibility.
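
   The limited time window for session restart can be sketched as
   follows.  This is an illustrative assumption rather than a specified
   mechanism: the RestartWindow class, its method names and the
   120-second default are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RestartWindow:
    """Tracks how long routes from a restarting peer may still be used."""
    window: float = 120.0            # assumed window length, seconds
    started: Optional[float] = None  # time the restart began, if any

    def begin(self, now: float) -> None:
        # Called when a Critical error triggers a session restart.
        self.started = now

    def established(self) -> None:
        # Called when the session re-establishes within the window.
        self.started = None

    def expired(self, now: float) -> bool:
        # True once the peer has failed to return in time; retained
        # routes should then no longer be used for forwarding.
        return self.started is not None and now - self.started > self.window
```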

   When a Critical error occurs, the network operator MUST be made aware
   of its occurrence through local logging mechanisms (e.g., SNMP traps
   or syslog).  The BGP speaker receiving an UPDATE message identified
   as a Critical error MUST log its occurrence and a copy of the UPDATE
   message.  Where an inter-device messaging mechanism is implemented
   (as discussed in Section 4.1), a copy of the erroneous UPDATE message
   SHOULD be transmitted to the remote speaker.  Both BGP speakers MUST
   indicate to an operator that the cause of the session restart was a
   Critical error in an UPDATE message.

   Since repeated Critical errors (and session restarts) may have an
   impact on overall device scaling if the failure condition is not
   resolved by session restart, a BGP speaker MAY choose to revert to
   the session tear-down behaviour described in the base BGP
   specification.  This reversion SHOULD only be utilised after a set
   number of attempts, which SHOULD be controllable by the network
   operator.  Where a particular session is shut down, the
   implementation MAY utilise a back-off from session restart attempts
   (as per the IdleHoldTimer described in the BGP FSM [RFC4271]).  Where
   reversion to tearing down the BGP session is performed, a speaker
   SHOULD limit the impact of withdrawing prefixes from downstream
   speakers where possible.  It is envisaged that this can be achieved
   by utilising a mechanism such as the BGP Graceful Shutdown procedure
   described in [I-D.ietf-grow-bgp-gshut].
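
   The reversion to session tear-down after a configurable number of
   restart attempts, followed by a back-off between later attempts, can
   be sketched as below.  The CriticalErrorPolicy class, its default
   values and its action labels are assumptions for illustration only;
   [RFC4271] does not mandate a doubling IdleHoldTimer, which is used
   here simply as one plausible back-off.

```python
from dataclasses import dataclass


@dataclass
class CriticalErrorPolicy:
    """Decides how a speaker reacts to successive Critical errors."""
    max_restarts: int = 3    # operator-configurable limit (assumed default)
    base_hold: float = 5.0   # assumed initial IdleHoldTimer, seconds
    max_hold: float = 300.0  # assumed ceiling for the back-off, seconds
    restarts: int = 0
    teardowns: int = 0

    def on_critical_error(self):
        """Return (action, idle_hold_seconds) for the next attempt."""
        if self.restarts < self.max_restarts:
            # Restart the session while retaining received routes.
            self.restarts += 1
            return ("restart-retaining-routes", 0.0)
        # Limit reached: revert to tear-down with a doubling back-off.
        hold = min(self.base_hold * 2 ** self.teardowns, self.max_hold)
        self.teardowns += 1
        return ("teardown", hold)

    def on_session_stable(self):
        # Reset the counters once the session has remained healthy.
        self.restarts = 0
        self.teardowns = 0
```

   With max_restarts set to 2, the first two Critical errors lead to a
   restart, and the third and fourth lead to tear-down with hold times
   of 5 and 10 seconds respectively.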

6.  IANA Considerations

   This memo includes no request to IANA.


7.  Security Considerations

   The requirements outlined in this document provide mechanisms which
   limit the overall impact of the response to an error in a BGP UPDATE
   message.  This is of benefit to the security of a BGP speaker.
   Without these mechanisms, where erroneous UPDATE messages relating to
   a single NLRI entry can be propagated to a BGP speaker, all other
   NLRI carried via the same session are affected by the resulting
   session tear-down.  This may result in an AS being isolated from
   particular routing domains (such as the Internet) should an UPDATE
   message be propagated via specific paths.  It is envisaged that, by
   reducing the impact of the reaction of the receiving speaker to these
   messages, the isolation can be constrained to specific sets of NLRI,
   or a specific topology.

   A number of the mechanisms meeting the requirements specified in the
   document (particularly those relating to operational monitoring) may
   raise further security concerns.  Such concerns will be addressed
   during the specification of these mechanisms.

8.  Acknowledgements

   The author would like to thank the following network operators for
   their insight and valuable input into defining the requirements for
   a variety of operational deployments of the BGP protocol: Shane
   Amante, Bruno Decraene, Rob Evans, David Freedman, Wes George, Tom
   Hodgson, Sven Huster, Jonathan Newton, Neil McRae, Thomas Mangin, Tom
   Scholl and Ilya Varlashkin.

   In addition, many thanks are extended to Jeff Haas, Wim Hendrickx,
   Tony Li, Alton Lo, Keyur Patel, John Scudder, Adam Simpson and Robert
   Raszuk for their expertise relating to implementations of the BGP
   protocol.


9.  References


9.1.  Normative References

   [RFC2119]  Bradner, S., "Key words for use in RFCs to Indicate
              Requirement Levels", BCP 14, RFC 2119, March 1997.

   [RFC2858]  Bates, T., Rekhter, Y., Chandra, R., and D. Katz,
              "Multiprotocol Extensions for BGP-4", RFC 2858, June 2000.

   [RFC2918]  Chen, E., "Route Refresh Capability for BGP-4", RFC 2918,
              September 2000.

   [RFC4271]  Rekhter, Y., Li, T., and S. Hares, "A Border Gateway
              Protocol 4 (BGP-4)", RFC 4271, January 2006.

   [RFC4364]  Rosen, E. and Y. Rekhter, "BGP/MPLS IP Virtual Private
              Networks (VPNs)", RFC 4364, February 2006.

   [RFC4456]  Bates, T., Chen, E., and R. Chandra, "BGP Route
              Reflection: An Alternative to Full Mesh Internal BGP
              (IBGP)", RFC 4456, April 2006.

   [RFC4724]  Sangli, S., Chen, E., Fernando, R., Scudder, J., and Y.
              Rekhter, "Graceful Restart Mechanism for BGP", RFC 4724,
              January 2007.

   [RFC4760]  Bates, T., Chandra, R., Katz, D., and Y. Rekhter,
              "Multiprotocol Extensions for BGP-4", RFC 4760,
              January 2007.

   [RFC4761]  Kompella, K. and Y. Rekhter, "Virtual Private LAN Service
              (VPLS) Using BGP for Auto-Discovery and Signaling",
              RFC 4761, January 2007.

   [RFC5880]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD)", RFC 5880, June 2010.

9.2.  Informational References

              Chen, E., Mohapatra, P., and K. Patel, "Revised Error
              Handling for BGP Updates from External Neighbors",
              draft-chen-ebgp-error-handling-01 (work in progress),
              September 2011.

   [I-D.ietf-grow-bgp-gshut]
              Francois, P., Decraene, B., Pelsser, C., Patel, K., and C.
              Filsfils, "Graceful BGP session shutdown",
              draft-ietf-grow-bgp-gshut-04 (work in progress),
              October 2012.

   [I-D.ietf-grow-bmp]
              Scudder, J., Fernando, R., and S. Stuart, "BGP Monitoring
              Protocol", draft-ietf-grow-bmp-07 (work in progress),
              October 2012.

              Patel, K., Chen, E., and B. Venkatachalapathy, "Enhanced
              Route Refresh Capability for BGP-4",
              draft-ietf-idr-bgp-enhanced-route-refresh-02 (work in
              progress), June 2012.

              Patel, K., Fernando, R., and J. Scudder, "Notification
              Message support for BGP Graceful Restart",
              draft-ietf-idr-bgp-enhanced-route-refresh-03 (work in
              progress), December 2011.

              Patel, K., Chen, E., Fernando, R., and J. Scudder,
              "Accelerated Routing Convergence for BGP Graceful
              Restart", draft-ietf-idr-enhanced-gr-01 (work in
              progress), June 2012.

              Freedman, D., Raszuk, R., and R. Shakir, "BGP OPERATIONAL
              Message", draft-ietf-idr-operational-message-00 (work in
              progress), March 2012.

              Scudder, J., Chen, E., Mohapatra, P., and K. Patel,
              "Revised Error Handling for BGP UPDATE Messages",
              draft-ietf-idr-optional-transitive-04 (work in progress),
              October 2011.

              Zeng, Q., Dong, J., Heitz, J., Patel, K., Shakir, R., and
              Z. Huang, "One-time Address-Prefix Based Outbound Route
              Filter for BGP-4", draft-zeng-idr-one-time-prefix-orf-02
              (work in progress), July 2012.

   [RFC5881]  Katz, D. and D. Ward, "Bidirectional Forwarding Detection
              (BFD) for IPv4 and IPv6 (Single Hop)", RFC 5881,
              June 2010.

Author's Address

   Rob Shakir
   pp C3L, BT Centre
   81, Newgate Street
   London  EC1A 7AJ