draft-ietf-grow-anycast-02.txt | draft-ietf-grow-anycast-03.txt | |||
---|---|---|---|---|
Network Working Group J. Abley | Network Working Group J. Abley | |||
Internet-Draft ISC | Internet-Draft ISC | |||
Expires: April 24, 2006 K. Lindqvist | Expires: July 28, 2006 K. Lindqvist | |||
Netnod Internet Exchange | Netnod Internet Exchange | |||
October 21, 2005 | January 24, 2006 | |||
Operation of Anycast Services | Operation of Anycast Services | |||
draft-ietf-grow-anycast-02 | draft-ietf-grow-anycast-03 | |||
Status of this Memo | Status of this Memo | |||
By submitting this Internet-Draft, each author represents that any | By submitting this Internet-Draft, each author represents that any | |||
applicable patent or other IPR claims of which he or she is aware | applicable patent or other IPR claims of which he or she is aware | |||
have been or will be disclosed, and any of which he or she becomes | have been or will be disclosed, and any of which he or she becomes | |||
aware will be disclosed, in accordance with Section 6 of BCP 79. | aware will be disclosed, in accordance with Section 6 of BCP 79. | |||
Internet-Drafts are working documents of the Internet Engineering | Internet-Drafts are working documents of the Internet Engineering | |||
Task Force (IETF), its areas, and its working groups. Note that | Task Force (IETF), its areas, and its working groups. Note that | |||
skipping to change at page 1, line 35 | skipping to change at page 1, line 35 | |||
and may be updated, replaced, or obsoleted by other documents at any | and may be updated, replaced, or obsoleted by other documents at any | |||
time. It is inappropriate to use Internet-Drafts as reference | time. It is inappropriate to use Internet-Drafts as reference | |||
material or to cite them other than as "work in progress." | material or to cite them other than as "work in progress." | |||
The list of current Internet-Drafts can be accessed at | The list of current Internet-Drafts can be accessed at | |||
http://www.ietf.org/ietf/1id-abstracts.txt. | http://www.ietf.org/ietf/1id-abstracts.txt. | |||
The list of Internet-Draft Shadow Directories can be accessed at | The list of Internet-Draft Shadow Directories can be accessed at | |||
http://www.ietf.org/shadow.html. | http://www.ietf.org/shadow.html. | |||
This Internet-Draft will expire on April 24, 2006. | This Internet-Draft will expire on July 28, 2006. | |||
Copyright Notice | Copyright Notice | |||
Copyright (C) The Internet Society (2005). | Copyright (C) The Internet Society (2006). | |||
Abstract | Abstract | |||
As the Internet has grown, and as systems and networked services | As the Internet has grown, and as systems and networked services | |||
within enterprises have become more pervasive, many services with | within enterprises have become more pervasive, many services with | |||
high availability requirements have emerged. These requirements have | high availability requirements have emerged. These requirements have | |||
increased the demands on the reliability of the infrastructure on | increased the demands on the reliability of the infrastructure on | |||
which those services rely. | which those services rely. | |||
Various techniques have been employed to increase the availability of | Various techniques have been employed to increase the availability of | |||
services deployed on the Internet. This document presents commentary | services deployed on the Internet. This document presents commentary | |||
and recommendations for distribution of services using anycast. | and recommendations for distribution of services using anycast. | |||
Table of Contents | Table of Contents | |||
1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | 1. Introduction . . . . . . . . . . . . . . . . . . . . . . . . . 3 | |||
2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | 2. Terminology . . . . . . . . . . . . . . . . . . . . . . . . . 4 | |||
3. Anycast Service Distribution . . . . . . . . . . . . . . . . . 5 | 3. Anycast Service Distribution . . . . . . . . . . . . . . . . . 5 | |||
3.1 General Description . . . . . . . . . . . . . . . . . . . 5 | 3.1. General Description . . . . . . . . . . . . . . . . . . . 5 | |||
3.2 Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | 3.2. Goals . . . . . . . . . . . . . . . . . . . . . . . . . . 5 | |||
4. Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | 4. Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7 | |||
4.1 Protocol Suitability . . . . . . . . . . . . . . . . . . . 7 | 4.1. Protocol Suitability . . . . . . . . . . . . . . . . . . . 7 | |||
4.2 Node Placement . . . . . . . . . . . . . . . . . . . . . . 7 | 4.2. Node Placement . . . . . . . . . . . . . . . . . . . . . . 7 | |||
4.3 Routing Systems . . . . . . . . . . . . . . . . . . . . . 8 | 4.3. Routing Systems . . . . . . . . . . . . . . . . . . . . . 8 | |||
4.3.1 Anycast within an IGP . . . . . . . . . . . . . . . . 8 | 4.3.1. Anycast within an IGP . . . . . . . . . . . . . . . . 8 | |||
4.3.2 Anycast within the Global Internet . . . . . . . . . . 9 | 4.3.2. Anycast within the Global Internet . . . . . . . . . . 9 | |||
4.4 Routing Considerations . . . . . . . . . . . . . . . . . . 9 | 4.4. Routing Considerations . . . . . . . . . . . . . . . . . . 9 | |||
4.4.1 Signalling Service Availability . . . . . . . . . . . 9 | 4.4.1. Signalling Service Availability . . . . . . . . . . . 9 | |||
4.4.2 Covering Prefix . . . . . . . . . . . . . . . . . . . 10 | 4.4.2. Covering Prefix . . . . . . . . . . . . . . . . . . . 10 | |||
4.4.3 Equal-Cost Paths . . . . . . . . . . . . . . . . . . . 10 | 4.4.3. Equal-Cost Paths . . . . . . . . . . . . . . . . . . . 10 | |||
4.4.4 Route Dampening . . . . . . . . . . . . . . . . . . . 12 | 4.4.4. Route Dampening . . . . . . . . . . . . . . . . . . . 12 | |||
4.4.5 Reverse Path Forwarding Checks . . . . . . . . . . . . 13 | 4.4.5. Reverse Path Forwarding Checks . . . . . . . . . . . . 13 | |||
4.4.6 Propagation Scope . . . . . . . . . . . . . . . . . . 13 | 4.4.6. Propagation Scope . . . . . . . . . . . . . . . . . . 13 | |||
4.4.7 Other Peoples' Networks . . . . . . . . . . . . . . . 14 | 4.4.7. Other Peoples' Networks . . . . . . . . . . . . . . . 14 | |||
4.4.8 Aggregation Risks . . . . . . . . . . . . . . . . . . 14 | 4.4.8. Aggregation Risks . . . . . . . . . . . . . . . . . . 14 | |||
4.5 Addressing Considerations . . . . . . . . . . . . . . . . 15 | 4.5. Addressing Considerations . . . . . . . . . . . . . . . . 15 | |||
4.6 Data Synchronisation . . . . . . . . . . . . . . . . . . . 15 | 4.6. Data Synchronisation . . . . . . . . . . . . . . . . . . . 15 | |||
4.7 Node Autonomy . . . . . . . . . . . . . . . . . . . . . . 16 | 4.7. Node Autonomy . . . . . . . . . . . . . . . . . . . . . . 16 | |||
4.8 Multi-Service Nodes . . . . . . . . . . . . . . . . . . . 16 | 4.8. Multi-Service Nodes . . . . . . . . . . . . . . . . . . . 17 | |||
4.8.1 Multiple Covering Prefixes . . . . . . . . . . . . . . 17 | 4.8.1. Multiple Covering Prefixes . . . . . . . . . . . . . . 17 | |||
4.8.2 Pessimistic Withdrawal . . . . . . . . . . . . . . . . 17 | 4.8.2. Pessimistic Withdrawal . . . . . . . . . . . . . . . . 17 | |||
4.8.3 Intra-Node Interior Connectivity . . . . . . . . . . . 17 | 4.8.3. Intra-Node Interior Connectivity . . . . . . . . . . . 18 | |||
5. Service Management . . . . . . . . . . . . . . . . . . . . . . 19 | 5. Service Management . . . . . . . . . . . . . . . . . . . . . . 19 | |||
5.1 Monitoring . . . . . . . . . . . . . . . . . . . . . . . . 19 | 5.1. Monitoring . . . . . . . . . . . . . . . . . . . . . . . . 19 | |||
6. Security Considerations . . . . . . . . . . . . . . . . . . . 20 | 6. Security Considerations . . . . . . . . . . . . . . . . . . . 20 | |||
6.1 Denial-of-Service Attack Mitigation . . . . . . . . . . . 20 | 6.1. Denial-of-Service Attack Mitigation . . . . . . . . . . . 20 | |||
6.2 Service Compromise . . . . . . . . . . . . . . . . . . . . 20 | 6.2. Service Compromise . . . . . . . . . . . . . . . . . . . . 20 | |||
6.3 Service Hijacking . . . . . . . . . . . . . . . . . . . . 20 | 6.3. Service Hijacking . . . . . . . . . . . . . . . . . . . . 20 | |||
7. Protocol Considerations . . . . . . . . . . . . . . . . . . . 21 | 7. Protocol Considerations . . . . . . . . . . . . . . . . . . . 21 | |||
8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 | 8. IANA Considerations . . . . . . . . . . . . . . . . . . . . . 22 | |||
9. Acknowlegements . . . . . . . . . . . . . . . . . . . . . . . 23 | 9. Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . 23 | |||
10. References . . . . . . . . . . . . . . . . . . . . . . . . . 24 | 10. References . . . . . . . . . . . . . . . . . . . . . . . . . . 24 | |||
10.1 Normative References . . . . . . . . . . . . . . . . . . . 24 | 10.1. Normative References . . . . . . . . . . . . . . . . . . . 24 | |||
10.2 Informative References . . . . . . . . . . . . . . . . . . 24 | 10.2. Informative References . . . . . . . . . . . . . . . . . . 24 | |||
Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . 26 | Appendix A. Change History . . . . . . . . . . . . . . . . . . . 27 | |||
A. Change History . . . . . . . . . . . . . . . . . . . . . . . . 27 | Authors' Addresses . . . . . . . . . . . . . . . . . . . . . . . . 28 | |||
Intellectual Property and Copyright Statements . . . . . . . . 28 | Intellectual Property and Copyright Statements . . . . . . . . . . 29 | |||
1. Introduction | 1. Introduction | |||
To distribute a service using anycast, the service is first | To distribute a service using anycast, the service is first | |||
associated with a stable set of IP addresses, and reachability to | associated with a stable set of IP addresses, and reachability to | |||
those addresses is advertised in a routing system from multiple, | those addresses is advertised in a routing system from multiple, | |||
independent service nodes. Various techniques for anycast deployment | independent service nodes. Various techniques for anycast deployment | |||
of services are discussed in [RFC1546], [ISC-TN-2003-1] and [ISC-TN- | of services are discussed in [RFC1546], [ISC-TN-2003-1] and [ISC-TN- | |||
2004-1]. | 2004-1]. | |||
 | The techniques and considerations described in this document apply to | |||
 | services reachable over both IPv4 and IPv6. | |||
Anycast has in recent years become increasingly popular for adding | Anycast has in recent years become increasingly popular for adding | |||
redundancy to DNS servers to complement the redundancy which the DNS | redundancy to DNS servers to complement the redundancy which the DNS | |||
architecture itself already provides. Several root DNS server | architecture itself already provides. Several root DNS server | |||
operators have distributed their servers widely around the Internet, | operators have distributed their servers widely around the Internet, | |||
and both resolver and authority servers are commonly distributed | and both resolver and authority servers are commonly distributed | |||
within the networks of service providers. Anycast distribution has | within the networks of service providers. Anycast distribution has | |||
been used by commercial DNS authority server operators for several | been used by commercial DNS authority server operators for several | |||
years. The use of anycast is not limited to the DNS, although the | years. The use of anycast is not limited to the DNS, although the | |||
use of anycast imposes some additional limitations on the nature of | use of anycast imposes some additional limitations on the nature of | |||
the service being distributed, including transaction longevity, | the service being distributed, including transaction longevity, | |||
skipping to change at page 4, line 18 | skipping to change at page 4, line 18 | |||
(e.g. the destination address used by DNS resolvers to reach a | (e.g. the destination address used by DNS resolvers to reach a | |||
particular authority server). | particular authority server). | |||
Anycast: the practice of making a particular Service Address | Anycast: the practice of making a particular Service Address | |||
available in multiple, discrete, autonomous locations, such that | available in multiple, discrete, autonomous locations, such that | |||
datagrams sent are routed to one of several available locations. | datagrams sent are routed to one of several available locations. | |||
Anycast Node: an internally-connected collection of hosts and routers | Anycast Node: an internally-connected collection of hosts and routers | |||
which together provide service for an anycast Service Address. An | which together provide service for an anycast Service Address. An | |||
Anycast Node might be as simple as a single host participating in | Anycast Node might be as simple as a single host participating in | |||
a routing protocol with adjacent routers, or it might include a | a routing system with adjacent routers, or it might include a | |||
number of hosts connected in some more elaborate fashion; in | number of hosts connected in some more elaborate fashion; in | |||
either case, to the routing system across which the service is | either case, to the routing system across which the service is | |||
being anycast, each Anycast Node presents a unique path to the | being anycast, each Anycast Node presents a unique path to the | |||
Service Address. The entire anycast system for the service | Service Address. The entire anycast system for the service | |||
consists of two or more separate Anycast Nodes. | consists of two or more separate Anycast Nodes. | |||
 | Catchment: in physical geography, an area drained by a river, also | |||
 | known as a drainage basin. By analogy, as used in this document, | |||
 | the topological region of a network within which packets directed | |||
 | at an anycast address are routed to one particular node. | |||
Local-Scope Anycast: reachability information for the anycast Service | Local-Scope Anycast: reachability information for the anycast Service | |||
Address is propagated through a routing system in such a way that | Address is propagated through a routing system in such a way that | |||
a particular anycast node is only visible to a subset of the whole | a particular anycast node is only visible to a subset of the whole | |||
routing system. | routing system. | |||
Local Node: an Anycast Node providing service using a Local-Scope | Local Node: an Anycast Node providing service using a Local-Scope | |||
Anycast address. | Anycast address. | |||
Global-Scope Anycast: reachability information for the anycast | Global-Scope Anycast: reachability information for the anycast | |||
Service Address is propagated through a routing system in such a | Service Address is propagated through a routing system in such a | |||
way that a particular anycast node is potentially visible to the | way that a particular anycast node is potentially visible to the | |||
whole routing system. | whole routing system. | |||
Global Node: an Anycast Node providing service using a Global-Scope | Global Node: an Anycast Node providing service using a Global-Scope | |||
Anycast address. | Anycast address. | |||
3. Anycast Service Distribution | 3. Anycast Service Distribution | |||
3.1 General Description | 3.1. General Description | |||
Anycast is the name given to the practice of making a Service Address | Anycast is the name given to the practice of making a Service Address | |||
available to a routing system at Anycast Nodes in two or more | available to a routing system at Anycast Nodes in two or more | |||
discrete locations. The service provided by each node is consistent | discrete locations. The service provided by each node is generally | |||
regardless of the particular node chosen by the routing system to | consistent regardless of the particular node chosen by the routing | |||
handle a particular request. | system to handle a particular request (although some services may | |||
 | benefit from deliberate differences in the behaviours of individual | |||
 | nodes, in order to facilitate locality-specific behaviour; see | |||
 | Section 4.6). | |||
For services distributed using anycast, there is no inherent | For services distributed using anycast, there is no inherent | |||
requirement for referrals to other servers or name-based service | requirement for referrals to other servers or name-based service | |||
distribution ("round-robin DNS"), although those techniques could be | distribution ("round-robin DNS"), although those techniques could be | |||
combined with anycast service distribution if an application required | combined with anycast service distribution if an application required | |||
it. The routing system decides which node is used for each request, | it. The routing system decides which node is used for each request, | |||
based on the topological design of the routing system and the point | based on the topological design of the routing system and the point | |||
in the network at which the request originates. | in the network at which the request originates. | |||
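The node selection described above can be illustrated with a small Python sketch. It is only a model, under stated assumptions: the five-router topology, the link costs and the function name catchments are hypothetical, links are taken to be symmetric, and selection is reduced to lowest total path cost. Real routing systems, BGP in particular, also apply policy that this sketch ignores.

```python
import heapq

def catchments(links, anycast_nodes):
    """Map every router to the anycast node with the lowest path cost.

    links: dict of router -> {neighbour: cost}; links are assumed symmetric.
    anycast_nodes: routers that all originate the same Service Address.
    """
    best = {}
    # Multi-source shortest path: every anycast node starts at cost 0.
    queue = [(0, node, node) for node in sorted(anycast_nodes)]
    heapq.heapify(queue)
    while queue:
        cost, router, origin = heapq.heappop(queue)
        if router in best:
            continue  # a cheaper path to this router was already recorded
        best[router] = (origin, cost)
        for neighbour, link_cost in links[router].items():
            if neighbour not in best:
                heapq.heappush(queue, (cost + link_cost, neighbour, origin))
    return best

# Hypothetical topology: a chain A-B-C-D-E with one expensive link C-D.
links = {
    "A": {"B": 1},
    "B": {"A": 1, "C": 1},
    "C": {"B": 1, "D": 3},
    "D": {"C": 3, "E": 1},
    "E": {"D": 1},
}
result = catchments(links, {"A", "E"})
for router in sorted(result):
    node, cost = result[router]
    print(f"requests from {router} reach anycast node {node} (cost {cost})")
```

In this model, requests originating at A, B and C are served by the node at A, and requests originating at D and E by the node at E. Changing a link cost or withdrawing a node moves the boundary between the two regions.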
The Anycast Node chosen to service a particular query can be | The Anycast Node chosen to service a particular query can be | |||
skipping to change at page 5, line 42 | skipping to change at page 5, line 45 | |||
nodes for the purposes of reliability, and coarse-grained | nodes for the purposes of reliability, and coarse-grained | |||
distribution of load for the purposes of making popular services | distribution of load for the purposes of making popular services | |||
scalable can often be achieved, however. | scalable can often be achieved, however. | |||
The scale of the routing system through which a service is anycast | The scale of the routing system through which a service is anycast | |||
can vary from a small Interior Gateway Protocol (IGP) connecting a | can vary from a small Interior Gateway Protocol (IGP) connecting a | |||
small handful of components, to the Border Gateway Protocol (BGP) | small handful of components, to the Border Gateway Protocol (BGP) | |||
[RFC1771] connecting the global Internet, depending on the nature of | [RFC1771] connecting the global Internet, depending on the nature of | |||
the service distribution that is required. | the service distribution that is required. | |||
3.2 Goals | 3.2. Goals | |||
A service may be anycast for a variety of reasons. A number of | A service may be anycast for a variety of reasons. A number of | |||
common objectives are: | common objectives are: | |||
1. Coarse ("unbalanced") distribution of load across nodes, to allow | 1. Coarse ("unbalanced") distribution of load across nodes, to allow | |||
infrastructure to scale to increased numbers of queries and to | infrastructure to scale to increased numbers of queries and to | |||
accommodate transient query peaks; | accommodate transient query peaks; | |||
2. Mitigation of non-distributed denial of service attacks by | 2. Mitigation of non-distributed denial of service attacks by | |||
localising damage to single anycast nodes; | localising damage to single anycast nodes; | |||
3. Constraint of distributed denial of service attacks or flash | 3. Constraint of distributed denial of service attacks or flash | |||
crowds to local regions around anycast nodes (perhaps restricting | crowds to local regions around anycast nodes (perhaps restricting | |||
query traffic to local peering links, rather than paid transit | query traffic to local peering links, rather than paid transit | |||
circuits); | circuits); | |||
4. To provide additional information to help locate location of | 4. To provide additional information to help locate location of | |||
traffic sources in the case of attack (or query) traffic which | traffic sources in the case of attack (or query) traffic which | |||
incorporates spoofed source addresses. This information is | incorporates spoofed source addresses. This information is | |||
derived from the property of anycast service distribution that | derived from the property of anycast service distribution that | |||
the the selection of the Anycast Node used to service a | the selection of the Anycast Node used to service a particular | |||
particular query may be related to the topological source of the | query may be related to the topological source of the request. | |||
request. | ||||
5. Improvement of query response time, by reducing the network | 5. Improvement of query response time, by reducing the network | |||
distance between client and server with the provision of a local | distance between client and server with the provision of a local | |||
Anycast Node. The extent to which query response time is | Anycast Node. The extent to which query response time is | |||
improved depends on the way that nodes are selected for the | improved depends on the way that nodes are selected for the | |||
clients by the routing system. Topological nearness within the | clients by the routing system. Topological nearness within the | |||
routing system does not, in general, correlate to round-trip | routing system does not, in general, correlate to round-trip | |||
performance across a network; in some cases response times may | performance across a network; in some cases response times may | |||
see no reduction, and may increase. | see no reduction, and may increase. | |||
6. To reduce a list of servers to a single, distributed address. | 6. To reduce a list of servers to a single, distributed address. | |||
For example, a large number of authoritative nameservers for a | For example, a large number of authoritative nameservers for a | |||
zone may be deployed using a small set of anycast Service | zone may be deployed using a small set of anycast Service | |||
Addresses; this approach can increase the accessibility of zone | Addresses; this approach can increase the accessibility of zone | |||
data in the DNS without increasing the size of a referral | data in the DNS without increasing the size of a referral | |||
response from a nameserver authoritative for the parent zone. | response from a nameserver authoritative for the parent zone. | |||
4. Design | 4. Design | |||
4.1 Protocol Suitability | 4.1. Protocol Suitability | |||
When a service is anycast between two or more nodes, the routing | When a service is anycast between two or more nodes, the routing | |||
system makes the node selection decision on behalf of a client. | system makes the node selection decision on behalf of a client. | |||
Since it is usually a requirement that a single client-server | Since it is usually a requirement that a single client-server | |||
interaction is carried out between a client and the same server node | interaction is carried out between a client and the same server node | |||
for the duration of the transaction, it follows that the routing | for the duration of the transaction, it follows that the routing | |||
system's node selection decision ought to be stable for substantially | system's node selection decision ought to be stable for substantially | |||
longer than the expected transaction time, if the service is to be | longer than the expected transaction time, if the service is to be | |||
provided reliably. | provided reliably. | |||
Some services have very short transaction times, and may even be | Some services have very short transaction times, and may even be | |||
carried out using a single packet request and a single packet reply | carried out using a single packet request and a single packet reply | |||
in some cases (e.g. DNS transactions over UDP transport). Other | (e.g. DNS transactions over UDP transport). Other services involve | |||
services involve far longer-lived transactions (e.g. bulk file | far longer-lived transactions (e.g. bulk file downloads and audio- | |||
downloads and audio-visual media streaming). | visual media streaming). | |||
Some anycast deployments have very predictable routing systems, which | Services may be anycast within very predictable routing systems, | |||
can remain stable for long periods of time (e.g. anycast within an | which can remain stable for long periods of time (e.g. anycast within | |||
well-managed and topologically-simple IGP, where node selection | a well-managed and topologically-simple IGP, where node selection | |||
changes only occur as a response to node failures). Other | changes only occur as a response to node failures). Other | |||
deployments have far less predictable characteristics (see | deployments have far less predictable characteristics (see | |||
Section 4.4.7). | Section 4.4.7). | |||
The stability of the routing system together with the transaction | The stability of the routing system together with the transaction | |||
time of the service should be carefully compared when deciding | time of the service should be carefully compared when deciding | |||
whether a service is suitable for distribution using anycast. In | whether a service is suitable for distribution using anycast. In | |||
some cases, for new protocols, it may be practical to split large | some cases, for new protocols, it may be practical to split large | |||
transactions into an initialisation phase which is handled by anycast | transactions into an initialisation phase which is handled by anycast | |||
servers, and a sustained phase which is provided by non-anycast | servers, and a sustained phase which is provided by non-anycast | |||
servers, perhaps chosen during the initialisation phase. | servers, perhaps chosen during the initialisation phase. | |||
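The two-phase split suggested above might look like the following Python sketch. Everything in it is a hypothetical illustration rather than a protocol defined by either draft: the Node type, the function names and the addresses (taken from a documentation range) are assumptions. The point it shows is that only the short initialisation exchange depends on the routing system's node selection.

```python
from dataclasses import dataclass

@dataclass
class Node:
    name: str
    unicast_address: str  # address that identifies this node only

def initialise(selected_node):
    """Anycast phase: a short exchange with whichever node routing selected.

    The reply carries a non-anycast address for the rest of the transaction.
    """
    return selected_node.unicast_address

def sustained_transfer(unicast_address, chunks):
    """Sustained phase: every chunk goes to the same non-anycast address."""
    return [(unicast_address, chunk) for chunk in chunks]

# Hypothetical node that the routing system happened to select.
node = Node(name="node-1", unicast_address="192.0.2.10")
destination = initialise(node)
sent = sustained_transfer(destination, [b"part-1", b"part-2", b"part-3"])
```

A change in node selection after initialise() has returned does not affect the transfer, because the sustained phase no longer uses the anycast Service Address.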
This document deliberately avoids prescribing rules as to which | This document deliberately avoids prescribing rules as to which | |||
protocols or services are suitable for distribution by anycast; to | protocols or services are suitable for distribution by anycast; to | |||
attempt to do so would be presumptuous. | attempt to do so would be presumptuous. | |||
4.2 Node Placement | 4.2. Node Placement | |||
Decisions as to where Anycast Nodes should be placed will depend to a | Decisions as to where Anycast Nodes should be placed will depend to a | |||
large extent on the goals of the service distribution. For example: | large extent on the goals of the service distribution. For example: | |||
o A DNS recursive resolver service might be distributed within an | o A DNS recursive resolver service might be distributed within an | |||
ISP's network, one Anycast Node per site. | ISP's network, one Anycast Node per site. | |||
o A root DNS server service might be distributed throughout the | o A root DNS server service might be distributed throughout the | |||
Internet with nodes located in regions with poor external | Internet; Anycast Nodes could be located in regions with poor | |||
connectivity, to ensure that the DNS functions adequately within | external connectivity to ensure that the DNS functions adequately | |||
the region during times of external network failure. | within the region during times of external network failure. | |||
o An FTP mirror service might include local nodes located at | o An FTP mirror service might include local nodes located at | |||
exchange points, so that ISPs connected to that exchange point | exchange points, so that ISPs connected to that exchange point | |||
could download bulk data more cheaply than if they had to use | could download bulk data more cheaply than if they had to use | |||
expensive transit circuits. | expensive transit circuits. | |||
In general node placement decisions should be made with consideration | In general node placement decisions should be made with consideration | |||
of likely traffic requirements, the potential for flash crowds or | of likely traffic requirements, the potential for flash crowds or | |||
denial-of-service traffic, the stability of the local routing system | denial-of-service traffic, the stability of the local routing system | |||
and the failure modes with respect to node failure, or local routing | and the failure modes with respect to node failure, or local routing | |||
system failure. | system failure. | |||
4.3 Routing Systems | 4.3. Routing Systems | |||
4.3.1 Anycast within an IGP | 4.3.1. Anycast within an IGP | |||
There are several common motivations for the distribution of a | There are several common motivations for the distribution of a | |||
Service Address within the scope of an IGP: | Service Address within the scope of an IGP: | |||
1. to improve service response times, by hosting a service close to | 1. to improve service response times, by hosting a service close to | |||
other users of the network; | other users of the network; | |||
2. to improve service reliability by providing automatic fail-over | 2. to improve service reliability by providing automatic fail-over | |||
to backup nodes; and | to backup nodes; and | |||
skipping to change at page 8, line 50 | skipping to change at page 8, line 50 | |||
When a service is anycast within an IGP the routing system is | When a service is anycast within an IGP the routing system is | |||
typically under the control of the same organisation that is | typically under the control of the same organisation that is | |||
providing the service, and hence the relationship between service | providing the service, and hence the relationship between service | |||
transaction characteristics and network stability are likely to be | transaction characteristics and network stability are likely to be | |||
well-understood. This technique is consequently applicable to a | well-understood. This technique is consequently applicable to a | |||
larger number of applications than Internet-wide anycast service | larger number of applications than Internet-wide anycast service | |||
distribution (see Section 4.1). | distribution (see Section 4.1). | |||
An IGP will generally have no inherent restriction on the length of | An IGP will generally have no inherent restriction on the length of | |||
prefix that can be introduced to it. There may well therefore be no | prefix that can be introduced to it. In this case there is no need | |||
need to construct a covering prefix for particular Service Addresses; | to construct a covering prefix for particular Service Addresses; host | |||
host routes corresponding to the Service Address can instead be | routes corresponding to the Service Address can instead be introduced | |||
introduced to the routing system. See Section 4.4.2 for more | to the routing system. See Section 4.4.2 for more discussion of the | |||
discussion of the requirement for a covering prefix. | requirement for a covering prefix. | |||
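As a rough illustration of the difference between a host route and a covering prefix, the following Python sketch uses the standard ipaddress module. The Service Addresses and the /24 covering prefix are assumptions drawn from documentation ranges, and the helper names are hypothetical; none of them come from either draft.

```python
import ipaddress

# Hypothetical Service Addresses from documentation ranges.
service_v4 = ipaddress.ip_address("192.0.2.53")
service_v6 = ipaddress.ip_address("2001:db8::53")

def host_route(address):
    """Return the host route for one address (/32 for IPv4, /128 for IPv6)."""
    return ipaddress.ip_network(f"{address}/{address.max_prefixlen}")

def is_covered(address, covering_prefix):
    """True if the shorter covering prefix includes the Service Address."""
    return address in ipaddress.ip_network(covering_prefix)

print(host_route(service_v4))                  # 192.0.2.53/32
print(host_route(service_v6))                  # 2001:db8::53/128
print(is_covered(service_v4, "192.0.2.0/24"))  # True
```

Within an IGP the host route alone can be introduced to the routing system. Where longer prefixes are not accepted, a shorter covering prefix such as the /24 assumed here would be advertised instead.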
IGPs often feature little or no aggregation of routes, partly due to | IGPs often feature little or no aggregation of routes, partly due to | |||
algorithmic complexities in supporting aggregation. There is little | algorithmic complexities in supporting aggregation. There is little | |||
motivation for aggregation in many networks' IGPs in any case, since | motivation for aggregation in many networks' IGPs in many cases, | |||
the amount of routing information carried in the IGP is small enough | since the amount of routing information carried in the IGP is small | |||
that scaling concerns in routers do not arise. For discussion of | enough that scaling concerns in routers do not arise. For discussion | |||
aggregation risks in other routing systems, see Section 4.4.8. | of aggregation risks in other routing systems, see Section 4.4.8. | |||
By reducing the scope of the IGP to just the hosts providing service | By reducing the scope of the IGP to just the hosts providing service | |||
(together with one or more gateway routers) this technique can be | (together with one or more gateway routers) this technique can be | |||
applied to the construction of server clusters. This application is | applied to the construction of server clusters. This application is | |||
discussed in some detail in [ISC-TN-2004-1]. | discussed in some detail in [ISC-TN-2004-1]. | |||
4.3.2 Anycast within the Global Internet | 4.3.2. Anycast within the Global Internet | |||
Service Addresses may be anycast within the global Internet routing | Service Addresses may be anycast within the global Internet routing | |||
system in order to distribute services across the entire network. | system in order to distribute services across the entire network. | |||
The principal differences between this application and the IGP-scope | The principal differences between this application and the IGP-scope | |||
distribution discussed in Section 4.3.1 are that: | distribution discussed in Section 4.3.1 are that: | |||
1. the routing system is, in general, controlled by other people; | 1. the routing system is, in general, controlled by other people; | |||
2. the routing protocol concerned (BGP), and commonly-accepted | 2. the routing protocol concerned (BGP), and commonly-accepted | |||
practices in its deployment, impose some additional constraints | practices in its deployment, impose some additional constraints | |||
(see Section 4.4). | (see Section 4.4). | |||
4.4 Routing Considerations | 4.4. Routing Considerations | |||
4.4.1 Signalling Service Availability | 4.4.1. Signalling Service Availability | |||
When a routing system is provided with reachability information for a | When a routing system is provided with reachability information for a | |||
Service Address from an individual node, packets addressed to that | Service Address from an individual node, packets addressed to that | |||
Service Address will start to arrive at the node. Since it is | Service Address will start to arrive at the node. Since it is | |||
essential for the node to be ready to accept requests before they | essential for the node to be ready to accept requests before they | |||
start to arrive, a coupling between the routing information and the | start to arrive, a coupling between the routing information and the | |||
availability of the service at a particular node is desirable. | availability of the service at a particular node is desirable. | |||
Where a routing advertisement from a node corresponds to a single | Where a routing advertisement from a node corresponds to a single | |||
Service Address, this coupling might be such that availability of the | Service Address, this coupling might be such that availability of the | |||
skipping to change at page 10, line 11 | skipping to change at page 10, line 10 | |||
routing protocol implementations on the same server which provide the | routing protocol implementations on the same server which provide the | |||
service being distributed, which are configured to advertise and | service being distributed, which are configured to advertise and | |||
withdraw the route advertisement in conjunction with the availability | withdraw the route advertisement in conjunction with the availability | |||
(and health) of the software on the host which processes service | (and health) of the software on the host which processes service | |||
requests. An example of such an arrangement for a DNS service is | requests. An example of such an arrangement for a DNS service is | |||
included in [ISC-TN-2004-1]. | included in [ISC-TN-2004-1]. | |||
Where a routing advertisement from a node corresponds to two or more | Where a routing advertisement from a node corresponds to two or more | |||
Service Addresses, it may not be appropriate to trigger a route | Service Addresses, it may not be appropriate to trigger a route | |||
withdrawal due to the non-availability of a single service. Another | withdrawal due to the non-availability of a single service. Another | |||
approach is to route requests for the service which is down at one | approach in the case where the service is down at one Anycast Node is | |||
Anycast Node to a different Anycast Node at which the service is up. | to route requests to a different Anycast Node where the service is | |||
This approach is discussed in Section 4.8. | working normally. This approach is discussed in Section 4.8. | |||
Rapid advertisement/withdrawal oscillations can cause operational | Rapid advertisement/withdrawal oscillations can cause operational | |||
problems, and nodes should be configured such that rapid oscillations | problems, and nodes should be configured such that rapid oscillations | |||
are avoided (e.g. by implementing a minimum delay following a | are avoided (e.g. by implementing a minimum delay following a | |||
withdrawal before the service can be re-advertised). See | withdrawal before the service can be re-advertised). See | |||
Section 4.4.4 for a discussion of route oscillations in BGP. | Section 4.4.4 for a discussion of route oscillations in BGP. | |||
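
The coupling between service health and route advertisement described in Section 4.4.1, including a minimum delay before re-advertisement, might be sketched as follows. This is an illustrative sketch only, not taken from the draft or from [ISC-TN-2004-1]: the class name RouteAnnouncer, the 60-second hold_down default and the string return values are all assumptions, and a real deployment would drive an actual routing daemon rather than return strings.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RouteAnnouncer:
    """Couples a route advertisement to service health, with a hold-down.

    All names and the default hold-down value are hypothetical.
    """
    hold_down: float = 60.0                  # assumed minimum delay (seconds) after a withdrawal
    advertised: bool = False                 # whether the route is currently advertised
    last_withdrawal: Optional[float] = None  # time of the most recent withdrawal

    def update(self, healthy: bool, now: float) -> str:
        """Return 'advertise', 'withdraw' or 'none' for one health observation."""
        if not healthy:
            if self.advertised:
                # Service failed: withdraw at once so requests stop arriving.
                self.advertised = False
                self.last_withdrawal = now
                return "withdraw"
            return "none"
        if self.advertised:
            return "none"
        if (self.last_withdrawal is not None
                and now - self.last_withdrawal < self.hold_down):
            # Healthy again, but still inside the hold-down window.
            return "none"
        self.advertised = True
        return "advertise"
```

Withdrawing immediately but re-advertising only after the hold-down expires bounds how often a single unstable node can change its advertisement.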
4.4.2 Covering Prefix | 4.4.2. Covering Prefix | |||
In some routing systems (e.g. the BGP-based routing system of the | In some routing systems (e.g. the BGP-based routing system of the | |||
global Internet) it is not possible, in general, to propagate a host | global Internet) it is not possible, in general, to propagate a host | |||
route with confidence that the route will propagate throughout the | route with confidence that the route will propagate throughout the | |||
network. This is a consequence of operational policy, and not a | network. This is a consequence of operational policy, and not a | |||
protocol restriction. | protocol restriction. | |||
In such cases it is necessary to propagate a route which covers the | In such cases it is necessary to propagate a route which covers the | |||
Service Address, and which has a sufficiently short prefix that it | Service Address, and which has a sufficiently short prefix that it | |||
will not be discarded by commonly-deployed import policies. For IPv4 | will not be discarded by commonly-deployed import policies. For IPv4 | |||
skipping to change at page 10, line 45 | skipping to change at page 10, line 44 | |||
some experimentation may be prudent. Corresponding import policies | some experimentation may be prudent. Corresponding import policies | |||
for IPv6 prefixes also exist. See Section 4.5 for more discussion of | for IPv6 prefixes also exist. See Section 4.5 for more discussion of | |||
IPv6 Service Addresses and corresponding anycast routes. | IPv6 Service Addresses and corresponding anycast routes. | |||
The propagation of a single route per service has some associated | The propagation of a single route per service has some associated | |||
scaling issues which are discussed in Section 4.4.8. | scaling issues which are discussed in Section 4.4.8. | |||
Where multiple Service Addresses are covered by the same covering | Where multiple Service Addresses are covered by the same covering | |||
route, there is no longer a tight coupling between the advertisement | route, there is no longer a tight coupling between the advertisement | |||
of that route and the individual services associated with the covered | of that route and the individual services associated with the covered | |||
host routes. The resulting impact on signaling availability of | host routes. The resulting impact on signalling availability of | |||
individual services is discussed in Section 4.4.1 and Section 4.8. | individual services is discussed in Section 4.4.1 and Section 4.8. | |||
4.4.3 Equal-Cost Paths | 4.4.3. Equal-Cost Paths | |||
Some routing systems support equal-cost paths to the same | Some routing systems support equal-cost paths to the same | |||
destination. Where multiple, equal-cost paths exist and lead to | destination. Where multiple, equal-cost paths exist and lead to | |||
different anycast nodes, there is a risk that different request | different anycast nodes, there is a risk that different request | |||
packets associated with a single transaction might be delivered to | packets associated with a single transaction might be delivered to | |||
more than one node. Services provided over TCP [RFC0793] necessarily | more than one node. Services provided over TCP [RFC0793] necessarily | |||
involve transactions with multiple request packets, due to the TCP | involve transactions with multiple request packets, due to the TCP | |||
setup handshake. | setup handshake. | |||
For services which are distributed across the global Internet using | For services which are distributed across the global Internet using | |||
skipping to change at page 11, line 46 | skipping to change at page 11, line 45 | |||
are deployed within the routing system, and on where the PPLB is | are deployed within the routing system, and on where the PPLB is | |||
being performed: | being performed: | |||
1. PPLB across multiple, parallel links between the same pair of | 1. PPLB across multiple, parallel links between the same pair of | |||
routers should cause no node selection problems; | routers should cause no node selection problems; | |||
2. PPLB across diverse paths within a single autonomous system (AS), | 2. PPLB across diverse paths within a single autonomous system (AS), | |||
where the paths converge to a single exit as they leave the AS, | where the paths converge to a single exit as they leave the AS, | |||
should cause no node selection problems; | should cause no node selection problems; | |||
3. PPLB across links to different neighbour ASes where where the | 3. PPLB across links to different neighbour ASes where the neighbour | |||
neighbour ASes have selected different nodes for a particular | ASes have selected different nodes for a particular anycast | |||
anycast destination will, in general, cause request packets to be | destination will, in general, cause request packets to be | |||
distributed across multiple anycast nodes. This will have the | distributed across multiple anycast nodes. This will have the | |||
effect that the anycast service is unavailable to clients | effect that the anycast service is unavailable to clients | |||
downstream of the router performing PPLB. | downstream of the router performing PPLB. | |||
The uses of PPLB which have the potential to interact badly with | The uses of PPLB which have the potential to interact badly with | |||
anycast service distribution can also cause persistent packet | anycast service distribution can also cause persistent packet | |||
reordering. A network path that persistently reorders segments will | reordering. A network path that persistently reorders segments will | |||
degrade the performance of traffic carried by TCP [Allman2000]. TCP, | degrade the performance of traffic carried by TCP [Allman2000]. TCP, | |||
according to several documented measurements, accounts for the bulk | according to several documented measurements, accounts for the bulk | |||
of traffic carried on the Internet ([McCreary2000], [Fomenkov2004]). | of traffic carried on the Internet ([McCreary2000], [Fomenkov2004]). | |||
Consequently, in many cases it is reasonable to consider networks | Consequently, in many cases it is reasonable to consider networks | |||
making such use of PPLB to be pathological. | making such use of PPLB to be pathological. | |||
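
The third PPLB case listed in Section 4.4.3 can be illustrated with a toy model. The sketch below assumes two hypothetical anycast nodes ("node-A", "node-B"), each selected by a different neighbour AS, and compares per-packet round-robin with a per-flow hash. The function names, the addresses (drawn from documentation ranges) and the use of CRC32 as the flow hash are assumptions made for illustration and do not describe any particular router implementation.

```python
import zlib

# Hypothetical anycast nodes, each reached through a different neighbour AS.
NODES = ["node-A", "node-B"]


def per_packet(packets):
    """Round-robin each packet across exits, ignoring the flow it belongs to."""
    return [NODES[i % len(NODES)] for i, _ in enumerate(packets)]


def per_flow(packets):
    """Hash the flow identifier so every packet of a flow takes the same exit."""
    return [NODES[zlib.crc32(repr(flow).encode()) % len(NODES)]
            for flow, _ in packets]


# One TCP connection: SYN, ACK, then a request segment, all on the same 4-tuple.
flow = ("192.0.2.10", 40000, "198.51.100.1", 53)
packets = [(flow, "SYN"), (flow, "ACK"), (flow, "DATA")]

print(per_packet(packets))  # packets of one connection are spread over both nodes
print(per_flow(packets))    # all packets of the connection reach a single node
```

With per-packet balancing the handshake segments of a single connection arrive at different nodes, so the connection cannot complete; with per-flow balancing node selection stays stable for the life of the flow.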
4.4.4 Route Dampening | 4.4.4. Route Dampening | |||
Frequent advertisements and withdrawals of individual prefixes in BGP | Frequent advertisements and withdrawals of individual prefixes in BGP | |||
are known as flaps. Rapid flapping can lead to CPU exhaustion on | are known as flaps. Rapid flapping can lead to CPU exhaustion on | |||
routers quite remote from the source of the instability, and for this | routers quite remote from the source of the instability, and for this | |||
reason rapid route oscillations are frequently "dampened", as | reason rapid route oscillations are frequently "dampened", as | |||
described in [RFC2439]. | described in [RFC2439]. | |||
A dampened path will be suppressed by routers for an interval which | A dampened path will be suppressed by routers for an interval which | |||
increases according to the frequency of the observed oscillation; a | increases according to the frequency of the observed oscillation; a | |||
suppressed path will not propagate. Hence a single router can | suppressed path will not propagate. Hence a single router can | |||
skipping to change at page 12, line 41 | skipping to change at page 12, line 41 | |||
For this reason, network instability which leads to route flapping | For this reason, network instability which leads to route flapping | |||
from a single anycast node ought not to cause advertisements from | from a single anycast node ought not to cause advertisements from | |||
other nodes (which have different AS_PATH attributes) to be dampened. | other nodes (which have different AS_PATH attributes) to be dampened. | |||
To limit the opportunity of such implementations to penalise | To limit the opportunity of such implementations to penalise | |||
advertisements originating from different Anycast Nodes in response | advertisements originating from different Anycast Nodes in response | |||
to oscillations from just a single node, care should be taken to | to oscillations from just a single node, care should be taken to | |||
arrange that the AS_PATH attributes on routes from different nodes | arrange that the AS_PATH attributes on routes from different nodes | |||
are as diverse as possible. For example, Anycast Nodes should use | are as diverse as possible. For example, Anycast Nodes should use | |||
the same origin AS for their advertisements, but might have different | the same origin AS for their advertisements, but might have different | |||
upstream ASs. | upstream ASes. | |||
Where different implementations of flap dampening are prevalent, | Where different implementations of flap dampening are prevalent, | |||
individual nodes' instability may result in stable nodes becoming | individual nodes' instability may result in stable nodes becoming | |||
unavailable. In mitigation, the following measures may be useful: | unavailable. In mitigation, the following measures may be useful: | |||
1. Judicious deployment of Local Nodes in combination with | 1. Judicious deployment of Local Nodes in combination with | |||
especially stable Global Nodes (with high inter-AS path splay, | especially stable Global Nodes (with high inter-AS path splay, | |||
redundant hardware, power, etc) may help limit oscillation | redundant hardware, power, etc) may help limit oscillation | |||
problems to the Local Nodes' limited regions of influence; | problems to the Local Nodes' limited regions of influence; | |||
2. Aggressive flap-dampening of the service prefix close to the | 2. Aggressive flap-dampening of the service prefix close to the | |||
origin (e.g. within an Anycast Node, or in adjcacent ASes of each | origin (e.g. within an Anycast Node, or in adjacent ASes of each | |||
Anycast Node) may also help reduce the opportunity of remote ASes | Anycast Node) may also help reduce the opportunity of remote ASes | |||
to see oscillations at all. | to see oscillations at all. | |||
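
The way in which suppression time grows with flap frequency, as discussed in Section 4.4.4, can be approximated with an exponentially decaying penalty of the kind described in [RFC2439]. The sketch below is a simplified model under stated assumptions: the function names are hypothetical, and the figures used in the usage lines (a reuse threshold of 750 and a half-life of 900 seconds) are illustrative values only; they are not taken from this document and vary between implementations.

```python
import math


def decayed_penalty(penalty: float, elapsed: float, half_life: float) -> float:
    """Penalty remaining after `elapsed` seconds of exponential decay."""
    return penalty * 0.5 ** (elapsed / half_life)


def suppression_time(penalty: float, reuse: float, half_life: float) -> float:
    """Seconds until a penalty decays to the reuse threshold (0 if already below)."""
    if penalty <= reuse:
        return 0.0
    return half_life * math.log2(penalty / reuse)


# Illustrative values only: a larger accumulated penalty (more frequent
# flapping) yields a longer suppression interval.
print(suppression_time(1500.0, 750.0, 900.0))
print(suppression_time(3000.0, 750.0, 900.0))
```

Because each flap adds to the penalty before it has decayed, a node that oscillates more often remains suppressed for longer, which is the behaviour the mitigations above aim to keep away from stable nodes.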
4.4.5 Reverse Path Forwarding Checks | 4.4.5. Reverse Path Forwarding Checks | |||
Reverse Path Forwarding (RPF) checks, first described in [RFC2267], | Reverse Path Forwarding (RPF) checks, first described in [RFC2267], | |||
are commonly deployed as part of ingress interface packet filters on | are commonly deployed as part of ingress interface packet filters on | |||
routers in the Internet in order to deny packets whose source | routers in the Internet in order to deny packets whose source | |||
addresses are spoofed (see also RFC 2827 [RFC2827]). Deployed | addresses are spoofed (see also RFC 2827 [RFC2827]). Deployed | |||
implementations of RPF make several modes of operation available | implementations of RPF make several modes of operation available | |||
(e.g. "loose" and "strict"). | (e.g. "loose" and "strict"). | |||
Some modes of RPF can cause non-spoofed packets to be denied when | Some modes of RPF can cause non-spoofed packets to be denied when | |||
they originate from a multi-homed site, since selected paths might | they originate from a multi-homed site, since selected paths might | |||
skipping to change at page 13, line 32 | skipping to change at page 13, line 32 | |||
[RFC3704]. | [RFC3704]. | |||
A collection of anycast nodes deployed across the Internet is largely | A collection of anycast nodes deployed across the Internet is largely | |||
indistinguishable from a distributed, multi-homed site to the routing | indistinguishable from a distributed, multi-homed site to the routing | |||
system, and hence this risk also exists for anycast nodes, even if | system, and hence this risk also exists for anycast nodes, even if | |||
individual nodes are not multi-homed. Care should be taken to ensure | individual nodes are not multi-homed. Care should be taken to ensure | |||
that each anycast node is treated as a multi-homed network, and that | that each anycast node is treated as a multi-homed network, and that | |||
the corresponding recommendations in [RFC3704] with respect to RPF | the corresponding recommendations in [RFC3704] with respect to RPF | |||
checks are heeded. | checks are heeded. | |||
4.4.6 Propagation Scope | 4.4.6. Propagation Scope | |||
In the context of Anycast service distribution across the global | In the context of Anycast service distribution across the global | |||
Internet, Global Nodes are those which are capable of providing | Internet, Global Nodes are those which are capable of providing | |||
service to clients anywhere in the network; reachability information | service to clients anywhere in the network; reachability information | |||
for the service is propagated globally, without restriction, by | for the service is propagated globally, without restriction, by | |||
advertising the routes covering the Service Addresses for global | advertising the routes covering the Service Addresses for global | |||
transit to one or more providers. | transit to one or more providers. | |||
More than one Global Node can exist for a single service (and indeed | More than one Global Node can exist for a single service (and indeed | |||
this is often the case, for reasons of redundancy and load-sharing). | this is often the case, for reasons of redundancy and load-sharing). | |||
skipping to change at page 14, line 15 | skipping to change at page 14, line 14 | |||
Local Nodes advertise covering routes for Service Addresses in such a | Local Nodes advertise covering routes for Service Addresses in such a | |||
way that their propagation is restricted. This might be done using | way that their propagation is restricted. This might be done using | |||
well-known community string attributes such as NO_EXPORT [RFC1997] or | well-known community string attributes such as NO_EXPORT [RFC1997] or | |||
NOPEER [RFC3765], or by arranging with peers to apply a conventional | NOPEER [RFC3765], or by arranging with peers to apply a conventional | |||
"peering" import policy instead of a "transit" import policy, or some | "peering" import policy instead of a "transit" import policy, or some | |||
suitable combination of measures. | suitable combination of measures. | |||
Advertising reachability to Service Addresses from Local Nodes should | Advertising reachability to Service Addresses from Local Nodes should | |||
ideally be made using a routing policy that requires the presence of | ideally be made using a routing policy that requires the presence of | |||
explicit attributes for propagation, rather than reling on implicit | explicit attributes for propagation, rather than relying on implicit | |||
(default) policy. Inadvertant propagation of a route beyond its | (default) policy. Inadvertent propagation of a route beyond its | |||
intended horizon can result in capacity problems for Local Nodes | intended horizon can result in capacity problems for Local Nodes | |||
which might degrade service performance network-wide. | which might degrade service performance network-wide. | |||
4.4.7 Other Peoples' Networks | 4.4.7. Other Peoples' Networks | |||
When Anycast services are deployed across networks operated by | When Anycast services are deployed across networks operated by | |||
others, their reachability is dependent on routing policies and | others, their reachability is dependent on routing policies and | |||
topology changes (planned and unplanned) which are unpredictable and | topology changes (planned and unplanned) which are unpredictable and | |||
sometimes difficult to identify. Since the routing system may | sometimes difficult to identify. Since the routing system may | |||
include networks operated by multiple, unrelated organisations, the | include networks operated by multiple, unrelated organisations, the | |||
possibility of unforeseen interactions resulting from the | possibility of unforeseen interactions resulting from the | |||
combinations of unrelated changes also exists. | combinations of unrelated changes also exists. | |||
The stability and predictability of such a routing system should be | The stability and predictability of such a routing system should be | |||
skipping to change at page 14, line 42 | skipping to change at page 14, line 41 | |||
a distribution strategy for particular services and protocols (see | a distribution strategy for particular services and protocols (see | |||
also Section 4.1). | also Section 4.1). | |||
By way of mitigation, routing policies used by Anycast Nodes across | By way of mitigation, routing policies used by Anycast Nodes across | |||
such routing systems should be conservative, individual nodes' | such routing systems should be conservative, individual nodes' | |||
internal and external/connecting infrastructure should be scaled to | internal and external/connecting infrastructure should be scaled to | |||
support loads far in excess of the average, and the service should be | support loads far in excess of the average, and the service should be | |||
monitored proactively from many points in order to avoid unpleasant | monitored proactively from many points in order to avoid unpleasant | |||
surprises (see Section 5.1). | surprises (see Section 5.1). | |||
4.4.8 Aggregation Risks | 4.4.8. Aggregation Risks | |||
The propagation of a single route for each anycast service does not | The propagation of a single route for each anycast service does not | |||
scale well for routing systems in which the load of routing | scale well for routing systems in which the load of routing | |||
information which must be carried is a concern, and where there are | information which must be carried is a concern, and where there are | |||
potentially many services to distribute. For example, an autonomous | potentially many services to distribute. For example, an autonomous | |||
system which provides services to the Internet with N Service | system which provides services to the Internet with N Service | |||
Addresses covered by a single exported route, would need to advertise | Addresses covered by a single exported route, would need to advertise | |||
(N+1) routes if each of those services were to be distributed using | (N+1) routes if each of those services were to be distributed using | |||
anycast. | anycast. | |||
The common practice of applying minimum prefix-length filters in | The common practice of applying minimum prefix-length filters in | |||
import policies on the Internet (see Section 4.4.2) means that for a | import policies on the Internet (see Section 4.4.2) means that for a | |||
route covering a Service Address to be usefully propagated the prefix | route covering a Service Address to be usefully propagated the prefix | |||
length must be substantially less than that required to advertise | length must be substantially less than that required to advertise | |||
just the host route. Widespread advertisement of short prefixes for | just the host route. Widespread advertisement of short prefixes for | |||
individual services hence also has a negative impact on address | individual services hence also has a negative impact on address | |||
conservation. | conservation. | |||
Both of these issues can be mitigated to some extent by the use of a | Both of these issues can be mitigated to some extent by the use of a | |||
single covering prefix to accommodate multiple Service Addresses, as | single covering prefix to accommodate multiple Service Addresses, as | |||
described in Section 4.8. This implies a decoupling of the route | described in Section 4.8. This implies a de-coupling of the route | |||
advertisement from individual service availability (see | advertisement from individual service availability (see | |||
Section 4.4.1), however, with attendant risks to the stability of the | Section 4.4.1), however, with attendant risks to the stability of the | |||
service as a whole (see Section 4.7). | service as a whole (see Section 4.7). | |||
In general, the scaling problems described here prevent anycast from | In general, the scaling problems described here prevent anycast from | |||
being a useful, general approach for service distribution on the | being a useful, general approach for service distribution on the | |||
global Internet. It remains, however, a useful technique for | global Internet. It remains, however, a useful technique for | |||
distributing a limited number of Internet-critical services, as well | distributing a limited number of Internet-critical services, as well | |||
as in smaller networks where the aggregation concerns discussed here | as in smaller networks where the aggregation concerns discussed here | |||
do not apply. | do not apply. | |||
4.5 Addressing Considerations | 4.5. Addressing Considerations | |||
Service Addresses should be unique within the routing system that | Service Addresses should be unique within the routing system that | |||
connects all Anycast Nodes to all possible clients of the service. | connects all Anycast Nodes to all possible clients of the service. | |||
Service Addresses must also be chosen so that corresponding routes | Service Addresses must also be chosen so that corresponding routes | |||
will be allowed to propagate within that routing system. | will be allowed to propagate within that routing system. | |||
For an IPv4-numbered service deployed across the Internet, for | For an IPv4-numbered service deployed across the Internet, for | |||
example, an address might be chosen from a block where the minimum | example, an address might be chosen from a block where the minimum | |||
RIR allocation size is 24 bits, and reachability to that address | RIR allocation size is 24 bits, and reachability to that address | |||
might be provided by originating the covering 24-bit prefix. | might be provided by originating the covering 24-bit prefix. | |||
For an IPv4-numbered service deployed within a private network, a | For an IPv4-numbered service deployed within a private network, a | |||
locally-unused [RFC1918] address might be chosen, and rechability to | locally-unused [RFC1918] address might be chosen, and reachability to | |||
that address might be signalled using a (32-bit) host route. | that address might be signalled using a (32-bit) host route. | |||
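
The two IPv4 cases in Section 4.5 can be made concrete with Python's standard ipaddress module. This is a minimal sketch: the helper names covering_prefix and host_route are hypothetical, and the addresses used (192.0.2.53 from a documentation range, 10.53.0.1 from [RFC1918] space) are examples chosen here, not addresses given in the draft.

```python
import ipaddress


def covering_prefix(service_address: str, prefix_len: int = 24) -> ipaddress.IPv4Network:
    """Return the prefix of the given length that covers the Service Address."""
    # strict=False masks off the host bits instead of raising an error.
    return ipaddress.ip_network(f"{service_address}/{prefix_len}", strict=False)


def host_route(service_address: str) -> ipaddress.IPv4Network:
    """Return the 32-bit host route for an IPv4 Service Address."""
    return ipaddress.ip_network(service_address)


# Internet-wide service: originate the covering 24-bit prefix.
print(covering_prefix("192.0.2.53"))
# Private network: signal reachability with a host route.
print(host_route("10.53.0.1"))
```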
For IPv6-numbered services, Anycast Addresses are not scoped | For IPv6-numbered services, Anycast Addresses are not scoped | |||
differently from unicast addresses. As such the guidelines presented | differently from unicast addresses. As such the guidelines presented | |||
for IPv4 with respect to address suitability follow for IPv6. Note | for IPv4 with respect to address suitability follow for IPv6. Note | |||
that historical prohibitions on anycast distribution of services over | that historical prohibitions on anycast distribution of services over | |||
IPv6 have been removed from the IPv6 addressing specification in | IPv6 have been removed from the IPv6 addressing specification in | |||
[I-D.ietf-ipv6-addr-arch-v4]. | [I-D.ietf-ipv6-addr-arch-v4]. | |||
4.6 Data Synchronisation | 4.6. Data Synchronisation | |||
Although some services have been deployed in localised form (such | Although some services have been deployed in localised form (such | |||
that clients from particular regions are presented with regionally- | that clients from particular regions are presented with regionally- | |||
relevant content) many services have the property that responses to | relevant content) many services have the property that responses to | |||
client requests should be consistent, regardless of where the request | client requests should be consistent, regardless of where the request | |||
originates. For a service distributed using anycast, that implies | originates. For a service distributed using anycast, that implies | |||
that different Anycast Nodes must operate in a consistent manner and, | that different Anycast Nodes must operate in a consistent manner and, | |||
where that consistent behaviour is based on a data set, that the data | where that consistent behaviour is based on a data set, that the data | |||
concerned be synchronised between nodes. | concerned be synchronised between nodes. | |||
The mechanism by which data is synchronised depends on the nature of | The mechanism by which data is synchronised depends on the nature of | |||
the service; examples are zone transfers for authoritative DNS | the service; examples are zone transfers for authoritative DNS | |||
servers and rsync for FTP archives. In general, the synchronisation | servers and rsync for FTP archives. In general, the synchronisation | |||
of data between Anycast Nodes will involve transactions between non- | of data between Anycast Nodes will involve transactions between non- | |||
anycast addresses. | anycast addresses. | |||
Data synchronisation across public networks should be carried out | Data synchronisation across public networks should be carried out | |||
with appropriate authentication and encryption. | with appropriate authentication and encryption. | |||
4.7 Node Autonomy | 4.7. Node Autonomy | |||
For an Anycast deployment whose goals include improved reliability | For an Anycast deployment whose goals include improved reliability | |||
through redundancy, it is important to minimise the opportunity for a | through redundancy, it is important to minimise the opportunity for a | |||
single defect to compromise many (or all) nodes, or for the failure | single defect to compromise many (or all) nodes, or for the failure | |||
of one node to provide a cascading failure bringing down additional | of one node to provide a cascading failure bringing down additional | |||
successive nodes until the service as a whole is defeated. | successive nodes until the service as a whole is defeated. | |||
Co-dependencies are avoided by making each node as autonomous and | Co-dependencies are avoided by making each node as autonomous and | |||
self-sufficient as possible. The degree to which nodes can survive | self-sufficient as possible. The degree to which nodes can survive | |||
failure elsewhere depends on the nature of the service being | failure elsewhere depends on the nature of the service being | |||
skipping to change at page 16, line 49 | skipping to change at page 16, line 48 | |||
general, from Local Node to Global Node; traffic that might sink one | general, from Local Node to Global Node; traffic that might sink one | |||
Local Node is unlikely to sink all Local Nodes, except in the most | Local Node is unlikely to sink all Local Nodes, except in the most | |||
degenerate cases. | degenerate cases. | |||
The chance of cascading failure due to a software defect in an | The chance of cascading failure due to a software defect in an | |||
operating system or server can be reduced in many cases by deploying | operating system or server can be reduced in many cases by deploying | |||
nodes running different implementations of operating system, server | nodes running different implementations of operating system, server | |||
software, routing protocol software, etc, such that a defect which | software, routing protocol software, etc, such that a defect which | |||
appears in a single component does not affect the whole system. | appears in a single component does not affect the whole system. | |||
4.8 Multi-Service Nodes | It should be noted that these approaches to increase node autonomy | |||
are, to varying degrees, contrary to the practical goals of making a | ||||
deployed service straightforward to operate. A service which is | ||||
over-complex is more likely to suffer from operator error than a | ||||
service which is more straightforward to run. Careful consideration | ||||
should be given to all of these aspects so that an appropriate | ||||
balance may be found. | ||||
4.8. Multi-Service Nodes | ||||
For a service distributed across a routing system where covering | For a service distributed across a routing system where covering | |||
prefixes are required to announce reachability to a single Service | prefixes are required to announce reachability to a single Service | |||
Address (see Section 4.4.2), special consideration is required in the | Address (see Section 4.4.2), special consideration is required in the | |||
case where multiple services need to be distributed across a single | case where multiple services need to be distributed across a single | |||
set of nodes. This results from the requirement to signal | set of nodes. This results from the requirement to signal | |||
availability of individual services to the routing system so that | availability of individual services to the routing system so that | |||
requests for service are not received by nodes which are not able to | requests for service are not received by nodes which are not able to | |||
process them (see Section 4.4.1). | process them (see Section 4.4.1). | |||
Several approaches are described in the following sections. | Several approaches are described in the following sections. | |||
4.8.1 Multiple Covering Prefixes | 4.8.1. Multiple Covering Prefixes | |||
Each Service Address is chosen such that only one Service Address is | Each Service Address is chosen such that only one Service Address is | |||
covered by each advertised prefix. Advertisement and withdrawal of a | covered by each advertised prefix. Advertisement and withdrawal of a | |||
single covering prefix can be tightly coupled to the availability of | single covering prefix can be tightly coupled to the availability of | |||
the single associated service. | the single associated service. | |||
This is the most straightforward approach. However, since it makes | This is the most straightforward approach. However, since it makes | |||
very poor utilisation of globally-unique addresses, it is only | very poor utilisation of globally-unique addresses, it is only | |||
suitable for use for a small number of critical, infrastructural | suitable for use for a small number of critical, infrastructural | |||
services such as root DNS servers. General Internet-wide deployment | services such as root DNS servers. General Internet-wide deployment | |||
of services using this approach will not scale. | of services using this approach will not scale. | |||
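The one-Service-Address-per-prefix property described above can be checked mechanically. The following is a minimal sketch, not part of either draft revision: the function names are hypothetical, and the prefixes and addresses in the usage note are documentation ranges chosen only for illustration.

```python
import ipaddress


def covered_services(prefixes, service_addresses):
    """Map each advertised prefix to the Service Addresses it covers."""
    networks = [ipaddress.ip_network(p) for p in prefixes]
    addresses = [ipaddress.ip_address(a) for a in service_addresses]
    return {
        str(net): [str(addr) for addr in addresses if addr in net]
        for net in networks
    }


def one_service_per_prefix(prefixes, service_addresses):
    """True when every advertised prefix covers exactly one Service Address."""
    coverage = covered_services(prefixes, service_addresses)
    return all(len(covered) == 1 for covered in coverage.values())
```

For example, one_service_per_prefix(["192.0.2.0/24", "198.51.100.0/24"], ["192.0.2.1", "198.51.100.1"]) holds, whereas placing both addresses inside 192.0.2.0/24 does not.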
4.8.2 Pessimistic Withdrawal | 4.8.2. Pessimistic Withdrawal | |||
Multiple Service Addresses are chosen such that they are covered by a | Multiple Service Addresses are chosen such that they are covered by a | |||
single prefix. Advertisement and withdrawl of the single covering | single prefix. Advertisement and withdrawal of the single covering | |||
prefix is coupled to the availability of all associated services; if | prefix is coupled to the availability of all associated services; if | |||
any individual service becomes unavailable, the covering prefix is | any individual service becomes unavailable, the covering prefix is | |||
withdrawn. | withdrawn. | |||
The coupling between service availability and advertisement of the | The coupling between service availability and advertisement of the | |||
covering prefix is complicated by the requirement that all Service | covering prefix is complicated by the requirement that all Service | |||
Addresses must be available -- the announcement needs to be triggered | Addresses must be available -- the announcement needs to be triggered | |||
by the presence of all component routes, and not just a single | by the presence of all component routes, and not just a single | |||
covered route. | covered route. | |||
The fact that a single malfunctioning service causes all deployed | The fact that a single malfunctioning service causes all deployed | |||
services in a node to be taken off-line may make this approach | services in a node to be taken off-line may make this approach | |||
unsuitable for many applications. | unsuitable for many applications. | |||
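The all-or-nothing coupling described in this section can be expressed as a single predicate. This is a minimal sketch under stated assumptions: the function name and the health-flag mapping are hypothetical, and how the result is fed into the routing system is left out entirely.

```python
def advertise_covering_prefix(service_available):
    """Return True only when every covered service is available.

    service_available maps a Service Address (str) to a bool health flag.
    An empty mapping yields False; treating "no services" as "do not
    announce" is an assumption of this sketch.
    """
    return bool(service_available) and all(service_available.values())
```

A single False value in the mapping is enough to withdraw the covering prefix, which is the behaviour that may make this approach unsuitable for many applications.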
4.8.3 Intra-Node Interior Connectivity | 4.8.3. Intra-Node Interior Connectivity | |||
Multiple Service Addresses are chosen such that they are covered by a | Multiple Service Addresses are chosen such that they are covered by a | |||
single prefix. Advertisement and withdrawal of the single covering | single prefix. Advertisement and withdrawal of the single covering | |||
prefix is coupled to the availability of any one service. Nodes have | prefix is coupled to the availability of any one service. Nodes have | |||
interior connectivity, e.g. using tunnels, and host routes for | interior connectivity, e.g. using tunnels, and host routes for | |||
service addresses are distributed using an IGP which extends to | service addresses are distributed using an IGP which extends to | |||
include routers at all nodes. | include routers at all nodes. | |||
In the event that a service is unavailable at one node, but available | In the event that a service is unavailable at one node, but available | |||
at other nodes, a request may be routed over the interior network | at other nodes, a request may be routed over the interior network | |||
skipping to change at page 19, line 7 | skipping to change at page 19, line 7 | |||
is disconnected from other nodes, continued advertisement of the | is disconnected from other nodes, continued advertisement of the | |||
covering prefix might cause requests to become black-holed. | covering prefix might cause requests to become black-holed. | |||
This approach allows reasonable address utilisation of the netblock | This approach allows reasonable address utilisation of the netblock | |||
covered by the announced prefix, at the expense of reduced autonomy | covered by the announced prefix, at the expense of reduced autonomy | |||
of individual nodes; the IGP in which all nodes participate can be | of individual nodes; the IGP in which all nodes participate can be | |||
viewed as a single point of failure. | viewed as a single point of failure. | |||
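The contrast with pessimistic withdrawal, and the black-holing risk noted above, can be sketched as follows. The function names and the interior_reachable flag are hypothetical; in particular, how a node learns that it has lost interior connectivity is not specified here and is assumed to be supplied by the caller.

```python
def advertise_covering_prefix(local_services_up):
    """Advertise while any one covered service is available locally."""
    return any(local_services_up.values())


def blackhole_risk(local_services_up, interior_reachable):
    """True when requests for a locally failed service cannot be forwarded.

    The covering prefix is still advertised (some service is up), at least
    one service is down locally, and the node cannot reach other nodes over
    the interior network.
    """
    advertised = advertise_covering_prefix(local_services_up)
    some_down = not all(local_services_up.values())
    return advertised and some_down and not interior_reachable
```

Whether a node should withdraw the prefix when blackhole_risk is true is a policy decision for the operator; the sketch only reports the condition.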
5. Service Management | 5. Service Management | |||
5.1 Monitoring | 5.1. Monitoring | |||
Monitoring a service which is distributed is more complex than | Monitoring a service which is distributed is more complex than | |||
monitoring a non-distributed service, since the observed accuracy and | monitoring a non-distributed service, since the observed accuracy and | |||
availability of the service is, in general, different when viewed | availability of the service is, in general, different when viewed | |||
from clients attached to different parts of the network. When a | from clients attached to different parts of the network. When a | |||
problem is identified, it is also not always obvious which node | problem is identified, it is also not always obvious which node | |||
served the request, and hence which node is malfunctioning. | served the request, and hence which node is malfunctioning. | |||
It is recommended that distributed services are monitored from probes | It is recommended that distributed services are monitored from probes | |||
distributed representatively across the routing system, and, where | distributed representatively across the routing system, and, where | |||
skipping to change at page 20, line 7 | skipping to change at page 20, line 7 | |||
Service [2] and the University of Oregon Route Views Project [3]. | Service [2] and the University of Oregon Route Views Project [3]. | |||
Monitoring the health of the component devices in an Anycast | Monitoring the health of the component devices in an Anycast | |||
deployment of a service (hosts, routers, etc) is straightforward, and | deployment of a service (hosts, routers, etc) is straightforward, and | |||
can be achieved using the same tools and techniques commonly used to | can be achieved using the same tools and techniques commonly used to | |||
manage other network-connected infrastructure, without the additional | manage other network-connected infrastructure, without the additional | |||
complexity involved in monitoring Anycast service addresses. | complexity involved in monitoring Anycast service addresses. | |||
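Because the observed availability differs between vantage points, probe results are easier to interpret when grouped by the node that answered. The sketch below assumes that each probe can record some identifier for the answering node; that identifier, the tuple layout, and the function name are all hypothetical.

```python
from collections import defaultdict


def availability_by_node(observations):
    """Summarise probe results per answering node.

    observations is an iterable of (probe, node_id, ok) tuples, where ok is
    True for a correct answer. Returns {node_id: fraction of correct answers}.
    """
    counts = defaultdict(lambda: [0, 0])  # node_id -> [good, total]
    for _probe, node_id, ok in observations:
        counts[node_id][1] += 1
        if ok:
            counts[node_id][0] += 1
    return {node: good / total for node, (good, total) in counts.items()}
```

A node whose fraction falls well below that of its peers is a candidate for closer inspection with ordinary host and router monitoring tools.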
6. Security Considerations | 6. Security Considerations | |||
6.1 Denial-of-Service Attack Mitigation | 6.1. Denial-of-Service Attack Mitigation | |||
This document describes mechanisms for deploying services on the | This document describes mechanisms for deploying services on the | |||
Internet which can be used to mitigate vulnerability to attack: | Internet which can be used to mitigate vulnerability to attack: | |||
1. An Anycast Node can act as a sink for attack traffic originated | 1. An Anycast Node can act as a sink for attack traffic originated | |||
within its sphere of influence, preventing nodes elsewhere from | within its sphere of influence, preventing nodes elsewhere from | |||
having to deal with that traffic; | having to deal with that traffic; | |||
2. The task of dealing with attack traffic whose sources are widely | 2. The task of dealing with attack traffic whose sources are widely | |||
distributed is itself distributed across all the nodes which | distributed is itself distributed across all the nodes which | |||
contribute to the service. Since the problem of sorting between | contribute to the service. Since the problem of sorting between | |||
legitimate and attack traffic is distributed, this may lead to | legitimate and attack traffic is distributed, this may lead to | |||
better scaling properties than a service which is not | better scaling properties than a service which is not | |||
distributed. | distributed. | |||
6.2 Service Compromise | 6.2. Service Compromise | |||
The distribution of a service across several (or many) autonomous | The distribution of a service across several (or many) autonomous | |||
nodes imposes increased monitoring as well as an increased systems | nodes imposes increased monitoring as well as an increased systems | |||
administration burden on the operator of the service which might | administration burden on the operator of the service which might | |||
reduce the effectiveness of host and router security. | reduce the effectiveness of host and router security. | |||
The potential benefit of being able to take compromised servers off- | The potential benefit of being able to take compromised servers off- | |||
line without compromising the service can only be realised if there | line without compromising the service can only be realised if there | |||
are working procedures to do so quickly and reliably. | are working procedures to do so quickly and reliably. | |||
6.3 Service Hijacking | 6.3. Service Hijacking | |||
It is possible that an unauthorised party might advertise routes | It is possible that an unauthorised party might advertise routes | |||
corresponding to anycast Service Addresses across a network, and by | corresponding to anycast Service Addresses across a network, and by | |||
doing so capture legitimate request traffic or process requests in a | doing so capture legitimate request traffic or process requests in a | |||
manner which compromises the service (or both). A rogue Anycast Node | manner which compromises the service (or both). A rogue Anycast Node | |||
might be difficult to detect by clients or by the operator of the | might be difficult to detect by clients or by the operator of the | |||
service. | service. | |||
The risk of service hijacking by manipulation of the routing sytem | The risk of service hijacking by manipulation of the routing system | |||
exists regardless of whether a service is distributed using anycast. | exists regardless of whether a service is distributed using anycast. | |||
However, the fact that legitimate Anycast Nodes are observable in the | However, the fact that legitimate Anycast Nodes are observable in the | |||
routing system may make it more difficult to detect rogue nodes. | routing system may make it more difficult to detect rogue nodes. | |||
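One simple check an operator might run against observed routing data is to compare the origin AS of each route covering a Service Address with the set of origins the operator has authorised. This technique is an illustration and is not described in either draft revision; the function name, the input layout, and the AS numbers used in the usage note (taken from the documentation range) are assumptions. A rogue node that forges an authorised origin would not be caught by it.

```python
def unexpected_origins(observed_routes, authorised_origins):
    """Return observed (prefix, origin_asn) pairs with an unauthorised origin.

    observed_routes is an iterable of (prefix, origin_asn) pairs, for example
    extracted from a route collector; authorised_origins is a set of AS
    numbers the operator expects to originate the covering prefix.
    """
    suspicious = {
        (prefix, asn)
        for prefix, asn in observed_routes
        if asn not in authorised_origins
    }
    return sorted(suspicious)
```

For example, with authorised origin 64496, a route for 192.0.2.0/24 observed with origin 64511 would be reported.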
7. Protocol Considerations | 7. Protocol Considerations | |||
This document does not impose any protocol considerations. | This document does not impose any protocol considerations. | |||
8. IANA Considerations | 8. IANA Considerations | |||
This document requests no action from IANA. | This document requests no action from IANA. | |||
9. Acknowlegements | 9. Acknowledgements | |||
The authors gratefully acknowledge the contributions from various | The authors gratefully acknowledge the contributions from various | |||
participants of the grow working group, and in particular Geoff | participants of the grow working group, and in particular Geoff | |||
Huston, Pekka Savola, Danny McPherson, Ben Black and Alan Barrett. | Huston, Pekka Savola, Danny McPherson, Ben Black and Alan Barrett. | |||
This work was supported by the US National Science Foundation | This work was supported by the US National Science Foundation | |||
(research grant SCI-0427144) and DNS-OARC. | (research grant SCI-0427144) and DNS-OARC. | |||
10. References | 10. References | |||
10.1 Normative References | 10.1. Normative References | |||
[I-D.ietf-ipv6-addr-arch-v4] | [I-D.ietf-ipv6-addr-arch-v4] | |||
Hinden, R. and S. Deering, "IP Version 6 Addressing | Hinden, R. and S. Deering, "IP Version 6 Addressing | |||
Architecture", draft-ietf-ipv6-addr-arch-v4-04 (work in | Architecture", draft-ietf-ipv6-addr-arch-v4-04 (work in | |||
progress), May 2005. | progress), May 2005. | |||
[RFC0793] Postel, J., "Transmission Control Protocol", STD 7, | [RFC0793] Postel, J., "Transmission Control Protocol", STD 7, | |||
RFC 793, September 1981. | RFC 793, September 1981. | |||
[RFC1771] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 | [RFC1771] Rekhter, Y. and T. Li, "A Border Gateway Protocol 4 | |||
skipping to change at page 24, line 37 | skipping to change at page 24, line 37 | |||
[RFC2439] Villamizar, C., Chandra, R., and R. Govindan, "BGP Route | [RFC2439] Villamizar, C., Chandra, R., and R. Govindan, "BGP Route | |||
Flap Damping", RFC 2439, November 1998. | Flap Damping", RFC 2439, November 1998. | |||
[RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: | [RFC2827] Ferguson, P. and D. Senie, "Network Ingress Filtering: | |||
Defeating Denial of Service Attacks which employ IP Source | Defeating Denial of Service Attacks which employ IP Source | |||
Address Spoofing", BCP 38, RFC 2827, May 2000. | Address Spoofing", BCP 38, RFC 2827, May 2000. | |||
[RFC3704] Baker, F. and P. Savola, "Ingress Filtering for Multihomed | [RFC3704] Baker, F. and P. Savola, "Ingress Filtering for Multihomed | |||
Networks", BCP 84, RFC 3704, March 2004. | Networks", BCP 84, RFC 3704, March 2004. | |||
10.2 Informative References | 10.2. Informative References | |||
[Allman2000] | [Allman2000] | |||
Allman, M. and E. Blanton, "On Making TCP More Robust to | Allman, M. and E. Blanton, "On Making TCP More Robust to | |||
Packet Reordering", January 2000, | Packet Reordering", January 2000, | |||
<http://www.icir.org/mallman/papers/tcp-reorder-ccr.ps>. | <http://www.icir.org/mallman/papers/tcp-reorder-ccr.ps>. | |||
[Fomenkov2004] | [Fomenkov2004] | |||
Fomenkov, M., Keys, K., Moore, D., and k. claffy, | Fomenkov, M., Keys, K., Moore, D., and k. claffy, | |||
"Longitudinal Study of Internet Traffic from 1999-2003", | "Longitudinal Study of Internet Traffic from 1999-2003", | |||
January 2004, <http://www.caida.org/outreach/papers/2003/ | January 2004, <http://www.caida.org/outreach/papers/2003/ | |||
skipping to change at page 26, line 13 | skipping to change at page 27, line 5 | |||
(BGP) Route Scope Control", RFC 3765, April 2004. | (BGP) Route Scope Control", RFC 3765, April 2004. | |||
URIs | URIs | |||
[1] <http://dnsmon.ripe.net/> | [1] <http://dnsmon.ripe.net/> | |||
[2] <http://ris.ripe.net> | [2] <http://ris.ripe.net> | |||
[3] <http://www.route-views.org> | [3] <http://www.route-views.org> | |||
Authors' Addresses | ||||
Joe Abley | ||||
Internet Systems Consortium, Inc. | ||||
950 Charter Street | ||||
Redwood City, CA 94063 | ||||
USA | ||||
Phone: +1 650 423 1317 | ||||
Email: jabley@isc.org | ||||
URI: http://www.isc.org/ | ||||
Kurt Erik Lindqvist | ||||
Netnod Internet Exchange | ||||
Bellmansgatan 30 | ||||
118 47 Stockholm | ||||
Sweden | ||||
Email: kurtis@kurtis.pp.se | ||||
URI: http://www.netnod.se/ | ||||
Appendix A. Change History | Appendix A. Change History | |||
This section should be removed before publication. | This section should be removed before publication. | |||
draft-kurtis-anycast-bcp-00: Initial draft. Discussed at IETF 61 in | draft-kurtis-anycast-bcp-00: Initial draft. Discussed at IETF 61 in | |||
the grow meeting and adopted as a working group document shortly | the grow meeting and adopted as a working group document shortly | |||
afterwards. | afterwards. | |||
draft-ietf-grow-anycast-00: Missing and empty sections completed; | draft-ietf-grow-anycast-00: Missing and empty sections completed; | |||
some structural reorganisation; general wordsmithing. Document | some structural reorganisation; general wordsmithing. Document | |||
skipping to change at page 28, line 5 | skipping to change at page 27, line 27 | |||
draft-ietf-grow-anycast-01: This appendix added; acknowledgements | draft-ietf-grow-anycast-01: This appendix added; acknowledgements | |||
section added; commentary on RFC3513 prohibition of anycast on | section added; commentary on RFC3513 prohibition of anycast on | |||
hosts removed; minor sentence re-casting and related jiggery- | hosts removed; minor sentence re-casting and related jiggery- | |||
pokery. This revision published for discussion at IETF 63. | pokery. This revision published for discussion at IETF 63. | |||
draft-ietf-grow-anycast-02: Normative reference to [I-D.ietf-ipv6- | draft-ietf-grow-anycast-02: Normative reference to [I-D.ietf-ipv6- | |||
addr-arch-v4] added (in the RFC editor's queue at the time of | addr-arch-v4] added (in the RFC editor's queue at the time of | |||
writing; reference should be updated to an RFC number when | writing; reference should be updated to an RFC number when | |||
available). Added commentary on per-packet load balancing. | available). Added commentary on per-packet load balancing. | |||
draft-ietf-grow-anycast-03: Editorial changes and language clean-up | ||||
at the request of the IESG. | ||||
Authors' Addresses | ||||
Joe Abley | ||||
Internet Systems Consortium, Inc. | ||||
950 Charter Street | ||||
Redwood City, CA 94063 | ||||
USA | ||||
Phone: +1 650 423 1317 | ||||
Email: jabley@isc.org | ||||
URI: http://www.isc.org/ | ||||
Kurt Erik Lindqvist | ||||
Netnod Internet Exchange | ||||
Bellmansgatan 30 | ||||
118 47 Stockholm | ||||
Sweden | ||||
Email: kurtis@kurtis.pp.se | ||||
URI: http://www.netnod.se/ | ||||
Intellectual Property Statement | Intellectual Property Statement | |||
The IETF takes no position regarding the validity or scope of any | The IETF takes no position regarding the validity or scope of any | |||
Intellectual Property Rights or other rights that might be claimed to | Intellectual Property Rights or other rights that might be claimed to | |||
pertain to the implementation or use of the technology described in | pertain to the implementation or use of the technology described in | |||
this document or the extent to which any license under such rights | this document or the extent to which any license under such rights | |||
might or might not be available; nor does it represent that it has | might or might not be available; nor does it represent that it has | |||
made any independent effort to identify any such rights. Information | made any independent effort to identify any such rights. Information | |||
on the procedures with respect to rights in RFC documents can be | on the procedures with respect to rights in RFC documents can be | |||
found in BCP 78 and BCP 79. | found in BCP 78 and BCP 79. | |||
skipping to change at page 28, line 41 | skipping to change at page 29, line 41 | |||
This document and the information contained herein are provided on an | This document and the information contained herein are provided on an | |||
"AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | "AS IS" basis and THE CONTRIBUTOR, THE ORGANIZATION HE/SHE REPRESENTS | |||
OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET | OR IS SPONSORED BY (IF ANY), THE INTERNET SOCIETY AND THE INTERNET | |||
ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, | ENGINEERING TASK FORCE DISCLAIM ALL WARRANTIES, EXPRESS OR IMPLIED, | |||
INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE | INCLUDING BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE | |||
INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | INFORMATION HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED | |||
WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. | |||
Copyright Statement | Copyright Statement | |||
Copyright (C) The Internet Society (2005). This document is subject | Copyright (C) The Internet Society (2006). This document is subject | |||
to the rights, licenses and restrictions contained in BCP 78, and | to the rights, licenses and restrictions contained in BCP 78, and | |||
except as set forth therein, the authors retain all their rights. | except as set forth therein, the authors retain all their rights. | |||
Acknowledgment | Acknowledgment | |||
Funding for the RFC Editor function is currently provided by the | Funding for the RFC Editor function is currently provided by the | |||
Internet Society. | Internet Society. | |||
End of changes. 65 change blocks. | ||||
132 lines changed or deleted | 153 lines changed or added | |||