DCCP-Lite                                                    August 2003


DCCP-Lite
Internet Draft                                                 T. Phelan
Document: draft-phelan-dccp-lite-00.txt                   Sonus Networks
Expires: February 2004                                       August 2003


        Datagram Congestion Control Protocol - Lite (DCCP-Lite)


Status of this Memo

   This document is an Internet-Draft and is in full conformance with
   all provisions of Section 10 of RFC2026.

   Internet-Drafts are working documents of the Internet Engineering
   Task Force (IETF), its areas, and its working groups.  Note that
   other groups may also distribute working documents as
   Internet-Drafts.

   Internet-Drafts are draft documents valid for a maximum of six
   months and may be updated, replaced, or obsoleted by other documents
   at any time.  It is inappropriate to use Internet-Drafts as
   reference material or to cite them other than as "work in progress."

   The list of current Internet-Drafts can be accessed at
        http://www.ietf.org/ietf/1id-abstracts.txt
   The list of Internet-Draft Shadow Directories can be accessed at
        http://www.ietf.org/shadow.html.


Abstract

   DCCP-Lite is a simplified version of the Datagram Congestion Control
   Protocol (DCCP).  It implements a congestion-controlled, unreliable
   flow of datagrams suitable for use by applications such as streaming
   media.


Phelan                  Expires - February 2004                [Page 1]


Table of Contents

   1. Introduction
   2. Concepts and Terminology
      2.1 Anatomy of DCCP-Lite Connection
      2.2 Congestion Control
      2.3 Connection Initiation and Termination
      2.4 DCCP-Lite Connection Sequence
   3. Packet Formats
      3.1 Generic Packet Header
      3.2 Sequence Number Validity
      3.3 DCCPL-Request Packet Format
      3.4 DCCPL-Response Packet Format
      3.5 DCCPL-Connect Packet Format
      3.6 DCCPL-Data, DCCPL-Ack, and DCCPL-DataAck Packet Formats
      3.7 DCCPL-Close Packet Format
      3.8 DCCPL-Reset Packet Format
   4. DCCP-Lite Operation
      4.1 Server State Diagram
      4.2 Server State Table
      4.3 Client State Diagram
      4.4 Client State Table
   5. Congestion Control IDs
   6. Maximum Transfer Unit
   7. Security Considerations
   8. IANA Considerations
   9. Normative References
   10. Informative References
   11. Acknowledgments
   12. Author's Address


1. Introduction

   The Datagram Congestion Control Protocol (DCCP), as defined in
   [DCCP], is a transport protocol that implements a
   congestion-controlled unreliable service.  DCCP provides many
   features and options to users, and has been criticized for the
   resulting complexity.

   This document presents a simplified version of DCCP, DCCP-Lite.  The
   design approach has been to start with [DCCP] and simplify by
   elimination.
   The simplifications were achieved through the following techniques:

   o  Eliminate options (but not all of the features supported by
      options).

   o  Eliminate back-and-forth negotiation.

   o  Eliminate features with limited use or applicability.  Decisions
      about what constitutes "limited use" are subjective.

   o  Where similar results are supported by multiple features or
      methods, eliminate all but one.

   o  Push congestion control specific features and topics to the CCID
      documents.  While this does not simplify things overall, it can
      make an implementation that only supports one CCID simpler.

   This document assumes that the reader is familiar with [DCCP].


2. Concepts and Terminology

   The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
   "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
   document are to be interpreted as described in [RFC 2119].

   All multi-byte numerical quantities in DCCP-Lite, such as Sequence
   Numbers, are transmitted in network byte order (most significant
   byte first).

2.1 Anatomy of DCCP-Lite Connection

   Each DCCP-Lite connection runs between two endpoints, which we often
   name DCCPL A and DCCPL B.  Data may pass over the connection in
   either or both directions.  The DCCP-Lite connection between DCCPL A
   and DCCPL B consists of four sets of packets, as follows:

   (1) Data packets from DCCPL A to DCCPL B.

   (2) Acknowledgements from DCCPL B to DCCPL A.

   (3) Data packets from DCCPL B to DCCPL A.

   (4) Acknowledgements from DCCPL A to DCCPL B.

   We use the following terms to refer to subsets and endpoints of a
   DCCP-Lite connection.

   Subflows
      A subflow consists of either data or acknowledgement packets,
      sent in one direction.  Each of the four sets of packets above is
      a subflow.  (Subflows may overlap to some extent, since
      acknowledgements may be piggybacked on data packets.)

   Sequences
      A sequence consists of all packets sent in one direction,
      regardless of whether they are data or acknowledgements.  The
      sets 1+4 and 2+3, above, are sequences.  Each packet on a
      sequence has a different sequence number.

   Half-connections
      A half-connection consists of the data packets sent in one
      direction, plus the corresponding acknowledgements.  The sets 1+2
      and 3+4, above, are half-connections.  Half-connections are named
      after the direction of data flow, so the A-to-B half-connection
      contains the data packets from A to B and the acknowledgements
      from B to A.

   HC-Sender and HC-Receiver
      In the context of a single half-connection, the HC-Sender is the
      endpoint sending data, while the HC-Receiver is the endpoint
      sending acknowledgements.  For example, in the A-to-B
      half-connection, DCCPL A is the HC-Sender and DCCPL B is the
      HC-Receiver.

2.2 Congestion Control

   Each half-connection is managed by a congestion control mechanism.
   The endpoints negotiate these mechanisms at connection setup; the
   mechanisms for the two half-connections need not be the same.

   Conformant congestion control mechanisms correspond to single-byte
   congestion control identifiers, or CCIDs.  The CCID for a
   half-connection describes how the HC-Sender limits data packet
   rates; how it maintains necessary parameters, such as congestion
   windows; how the HC-Receiver sends congestion feedback via
   acknowledgements; and how it manages the acknowledgement rate.
   Section 5 introduces the currently allocated CCIDs, which are
   defined in separate profile documents.

2.3 Connection Initiation and Termination

   Every DCCP-Lite connection is actively initiated by one DCCPL, which
   connects to a DCCPL socket in the passive listening state.  We refer
   to the active endpoint as "the client" and the passive endpoint as
   "the server".
   The DCCP-Lite specification provides separate state machines for
   client and server, but most of DCCP-Lite is indifferent to whether a
   DCCPL is client or server.

   DCCP-Lite does not support TCP-style simultaneous open.  In
   particular, a host MUST NOT respond to a DCCPL-Request packet with a
   DCCPL-Response packet unless the destination port specified in the
   DCCPL-Request corresponds to a local socket opened for listening.
   This preserves the invariant that every connection has one client
   and one server.

   DCCP-Lite shuts down both half-connections as a unit; it has no
   states analogous to TCP's FINWAIT and CLOSEWAIT states, where one
   TCP "half-connection" is closed and the other remains open.
   However, DCCP-Lite implementations SHOULD allow applications to
   declare that they are no longer interested in receiving data.  This
   would allow DCCP-Lite implementations to streamline state for
   certain half-connections.

2.4 DCCP-Lite Connection Sequence

   The progress of a typical DCCP-Lite connection is as follows.  (This
   description is informative, not normative.)

   1. The client sends the server a DCCPL-Request packet specifying the
      client and server ports, the CCID for the client-to-server
      half-connection, and a Connection Nonce the server can use for
      identification.

   2. The server sends the client a DCCPL-Response packet specifying
      the loss window and CCID for the server-to-client
      half-connection, any supporting data for the CCID, an Init Cookie
      that wraps up all the server state information necessary to
      complete the connection, and a Connection Nonce that the client
      can use to identify itself in some circumstances.  At this point,
      the server does not hold any local state information concerning
      the connection.

   3. The client sends the server a DCCPL-Connect packet that echoes
      the Init Cookie received from the server and includes CCID
      support data.  The DCCPL-Connect packet may also contain user
      data.
      The server instantiates local state information for the
      connection and sends a DCCPL-Ack or DCCPL-DataAck packet to
      complete the connection handshake.

   4. The client and server exchange DCCPL-Data packets, and
      acknowledgements as prescribed by the CCIDs.  Either the client
      or the server may initiate the connection close.

   5. The closer sends a DCCPL-Close packet requesting a close.

   6. The other DCCPL receives the DCCPL-Close packet, sends a
      DCCPL-Reset packet whose Reason field is set to "Closed", and
      holds the connection state for a reasonable interval of time to
      allow any remaining packets to clear the network.

   7. The closer receives the DCCPL-Reset packet and destroys its
      connection state.


3. Packet Formats

   Each packet starts with a generic header, followed by type-specific
   information.  There are 8 packet types:

      DCCPL-Request
      DCCPL-Response
      DCCPL-Connect
      DCCPL-Data
      DCCPL-Ack
      DCCPL-DataAck
      DCCPL-Close
      DCCPL-Reset

   All reserved fields SHOULD be set to zero on transmit and ignored on
   receive.

3.1 Generic Packet Header

   All DCCP-Lite packets begin with a generic DCCPL packet header:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |          Source Port          |           Dest Port           |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                        Sequence Number                        |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     Type      |   Type Spec   |           Checksum            |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Source and Destination Ports: 16 bits each
      These fields identify the connection, similar to the
      corresponding fields in TCP and UDP.  The Source Port represents
      the relevant port on the endpoint that sent this packet, the
      Destination Port the relevant port on the other endpoint.

   Sequence Number: 32 bits
      The sequence number field is initialized by a DCCPL-Request or
      DCCPL-Response packet, and increases by one (modulo 4294967296)
      with every packet sent.  The receiver uses this information to
      determine whether packet losses have occurred.  All packets, even
      packets containing no data, update the sequence number.

      Sequence numbers also provide some protection against old and
      malicious packets; see Section 3.2 on sequence number validity.

      The two subflows' initial sequence numbers are set by the first
      DCCPL-Request and DCCPL-Response packets sent, and SHOULD be
      chosen as for TCP.  In particular, the initial sequence number
      choice MUST include a random or pseudorandom component to make it
      harder for attackers to complete sequence number attacks [RFC
      1948].  The initial sequence number chosen for a given connection
      identifier (source address and port plus destination address and
      port) SHOULD increase over time, as TCP suggests [RFC 793], to
      prevent inappropriate delivery of old packets.

   Type: 8 bits
      The type field specifies the type of the DCCPL message.  The
      following values are defined:

         Value    Packet Type
           0      DCCPL-Request
           1      DCCPL-Response
           2      DCCPL-Connect
           3      DCCPL-Data
           4      DCCPL-Ack
           5      DCCPL-DataAck
           6      DCCPL-Close
           7      DCCPL-Reset
         8-255    Reserved for future use

   Type Specific (Type Spec): 8 bits
      Reserved for packet type specific use.  If not used by a
      particular packet type, it SHOULD be set to zero on transmit and
      ignored on receive.

   Checksum: 16 bits
      DCCP-Lite uses the TCP/IP checksum algorithm.  The checksum field
      equals the 16 bit one's complement of the one's complement sum of
      all 16 bit words in the DCCPL header, a pseudoheader taken from
      the network-layer header, and all of the payload.  When
      calculating the checksum, the checksum field itself is treated as
      0.
      If a packet contains an odd number of header and text bytes to be
      checksummed, 8 zero bits are added on the right to form a 16 bit
      word for checksum purposes.  The pad byte is not transmitted as
      part of the packet.

      The pseudoheader is calculated as for TCP.  For IPv4, it is 96
      bits long, and consists of the IPv4 source and destination
      addresses, the IP protocol number for DCCP-Lite (padded on the
      left with 8 zero bits), and the DCCPL length as a 16-bit quantity
      (the length of the DCCPL header, plus the length of any data);
      see Section 3.1 of [RFC 793].  For IPv6, it is 320 bits long, and
      consists of the IPv6 source and destination addresses, the DCCPL
      length as a 32-bit quantity, and the IP protocol number for
      DCCP-Lite (padded on the left with 24 zero bits); see Section 8.1
      of [RFC 2460].

      Packets with invalid checksums MUST be ignored.

3.2 Sequence Number Validity

   DCCPL endpoints SHOULD ignore packets with invalid sequence numbers,
   which may arise if the network delivers a very old packet or an
   attacker attempts to hijack a connection.  TCP solves this problem
   with its window.  In DCCP-Lite, however, sequence numbers change
   with each packet sent, even pure acknowledgements.  Thus, a loss
   event that dropped many consecutive packets could cause two DCCPLs
   to get out of sync relative to any window.

   DCCP-Lite uses a Loss Window mechanism to determine whether a given
   packet's sequence number is valid.  Each HC-Sender gives the
   corresponding HC-Receiver a loss window width W; see sections 3.4
   and 3.5.  This reflects how many packets the sender expects to be in
   flight.  Only the sender can anticipate this number.  One good
   guideline is to set it to about 3 or 4 times the maximum number of
   packets the sender expects to send in any round-trip time.
   Too-small values increase the risk of the endpoints getting out of
   sync after bursts of loss; too-large values increase the risk of
   connection hijacking.  A suggested default value for W is 1024.
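
   As a rough illustration of the loss window test defined in this
   section, the sketch below checks whether a received sequence number
   falls inside a window of W sequence numbers centered on the GSN (the
   greatest sequence number received so far), using circular 32-bit
   arithmetic.  The function names, the factor of 4 in the sizing
   helper, and the inclusive W/2 boundary are assumptions made for this
   example; the text does not fix the exact edge handling.

```python
SEQ_MOD = 1 << 32  # sequence numbers are 32 bits and wrap modulo 4294967296


def suggested_loss_window(max_packets_per_rtt: int, factor: int = 4) -> int:
    """Sizing guideline from the text: about 3 or 4 times the packets per RTT.

    The default factor of 4 is an assumption for this sketch.
    """
    return factor * max_packets_per_rtt


def circular_distance(a: int, b: int) -> int:
    """Shortest distance between two sequence numbers in circular space."""
    d = (a - b) % SEQ_MOD
    return min(d, SEQ_MOD - d)


def seq_is_valid(seq: int, gsn: int, w: int = 1024) -> bool:
    """Assumed test: seq is valid if it lies within W/2 of the GSN.

    This models a window of W numbers centered on the GSN; whether the
    edges are inclusive is a choice made here, not by the specification.
    """
    return circular_distance(seq, gsn) <= w // 2
```

   With the suggested default W = 1024, a packet numbered 5 is accepted
   when the GSN is 4294967290, because the two values are only 11 apart
   once wraparound is taken into account.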

   The HC-Receiver sets up a loss window of W consecutive sequence
   numbers centered on the GSN, the Greatest Sequence Number it has
   received on any valid packet from the sender.  ("Consecutive" and
   "greatest" are measured in circular sequence space.)  Sequence
   numbers outside this loss window are invalid.

   Packets with invalid sequence numbers are themselves invalid.  The
   receiving DCCPL SHOULD ignore invalid packets.  In particular, it
   SHOULD NOT pass any enclosed data to the application, update its
   congestion control or feature state, or close the connection.

   A DCCPL sender SHOULD monitor the current number of unacknowledged
   packets (the difference between its current sequence number and the
   greatest acknowledgement number received).  If that number
   approaches half of the loss window W (say within five or so
   percent), the connection SHOULD be closed by sending a DCCPL-Close
   packet with reason set to "Loss Window exceeded" to avoid permanent
   lack of sequence number synchronization.  Some congestion control
   mechanisms MAY choose to close the connection earlier.

3.3 DCCPL-Request Packet Format

   A DCCPL connection is initiated by sending a DCCPL-Request packet
   from the client to the server.  In this phase, the client specifies
   the CCID to use for the client-to-server half-connection and the
   Connection Nonce the server will use to identify itself.  The format
   of a DCCPL request packet is:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCPL Header (12 bytes)                /
   /                    Type=0 (DCCPL-Request)                     /
   /                       Type Spec = CCID                        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                       Connection Nonce                        +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Congestion Control ID (CCID): 8 bits
      This indicates the congestion control mechanism to use for the
      client-to-server direction of data transfer.

   Connection Nonce: 64 bits
      This is a transparent value created by the client that the server
      can use to prove identity in certain packets.  Normally, this
      will be a randomly generated, unguessable string of 8 bytes.  To
      prevent spoofing, this string MUST NOT have any trivially
      predictable value.  For example, it MUST NOT be set
      deterministically to zero, and it SHOULD change on every
      connection.

   To proceed with connection establishment, the server responds with a
   DCCPL-Response packet.

   A server in the LISTEN state MAY refuse the connection by ignoring
   the DCCPL-Request, or by responding with a DCCPL-Reset packet.
   Relevant Reset Reasons for refusing a connection are "CCID
   Rejected", if the proposed CCID is not supported; and "Too Busy",
   when the server is currently too busy to respond to requests.  The
   server SHOULD limit the rate at which it generates these resets.

   If the client does not receive a DCCPL-Response packet in response
   to a DCCPL-Request, it MAY retransmit the DCCPL-Request after a
   suitable timeout.  The timeout SHOULD be exponentially increased
   after each retry.

3.4 DCCPL-Response Packet Format

   In the second phase of the connection handshake, the server sends a
   DCCPL-Response message to the client.  In this phase, a server will
   specify the CCID and Connection Nonce to use for the
   server-to-client half-connection, and, to avoid keeping local state
   for an unverified client, will give the client an "Init Cookie" that
   encapsulates the information necessary for the server to continue
   with the connection when the client returns the cookie in the
   DCCPL-Connect packet.

   The server SHOULD NOT retransmit DCCPL-Response packets; the client
   will retransmit the DCCPL-Request if necessary.  The server has no
   need to detect that the retransmitted DCCPL-Request applies to a
   previously received Request; it simply answers each DCCPL-Request
   with a new DCCPL-Response and a new Init Cookie.
   Every valid DCCPL-Request received in the LISTEN state SHOULD elicit
   a new DCCPL-Response, if the server desires to proceed with the
   connection.  Each new DCCPL-Response sets a new (possible) starting
   Sequence Number for the server-to-client half-connection.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCPL Header (12 bytes)                /
   /                    Type=1 (DCCPL-Response)                    /
   /                       Type Spec = CCID                        /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |     ICLEN     |   Reserved    |          Loss Window          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                       Connection Nonce                        +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                 Init Cookie (variable length)                 /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                   CC Data (variable length)                   /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Congestion Control ID (CCID): 8 bits
      This indicates the congestion control mechanism to use for the
      server-to-client direction of data transfer.

   Init Cookie Length (ICLEN): 8 bits
      This indicates the length, in 32-bit words, of the Init Cookie
      field.

   Loss Window: 16 bits
      This value, times 1024, is the value to use for the loss window
      in the server-to-client direction.

   Connection Nonce: 64 bits
      This is a transparent value created by the server that the client
      can use to prove identity in certain packets.  Normally, this
      will be a randomly generated, unguessable string.  As with the
      DCCPL-Request Connection Nonce, this MUST NOT be a trivially
      predictable value and SHOULD change for each new connection.

   Init Cookie: variable length
      The purpose of this field is to allow a DCCPL server to avoid
      holding connection state until the connection setup handshake has
      completed.
      The server wraps up the server and client ports, and any other
      information it cares about from both the DCCPL-Request and
      DCCPL-Response, in an opaque cookie.  Typically the cookie will
      be encrypted using a secret known only to the server and include
      a cryptographic checksum or magic value so that correct
      decryption can be verified.  When the server receives the cookie
      back in the DCCPL-Connect packet, it can decrypt the cookie and
      instantiate all the state it avoided keeping.  A DCCPL-Connect
      packet with an invalid Init Cookie SHOULD be considered an
      invalid packet.

   Congestion Control Data (CC Data): variable length
      This data is for interpretation by the congestion control
      algorithm.

   Sequence Number (from generic header): 32 bits
      This indicates the starting value for sequence numbers in the
      server-to-client half-connection.

   A client in the REQUEST state that receives a DCCPL-Response packet
   responds with a DCCPL-Connect packet to continue the connection
   setup.  To abort the connection setup the client SHOULD ignore the
   DCCPL-Response packet.

3.5 DCCPL-Connect Packet Format

   In the third phase of the connection handshake, the client sends a
   DCCPL-Connect message to the server, echoing the Init Cookie
   received in the DCCPL-Response packet and including congestion
   control-specific data.  The DCCPL-Connect packet MAY contain user
   data.
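
   The Init Cookie round trip (issued in the DCCPL-Response, echoed in
   the DCCPL-Connect) might be sketched as follows.  This is only an
   illustration under stated assumptions: the cookie layout, the
   function names, and the SERVER_SECRET value are hypothetical, and
   the sketch authenticates the saved state with a truncated
   HMAC-SHA256 tag instead of encrypting it as the text describes.  It
   also omits any timestamp or replay protection that a real server
   would probably want.

```python
import hashlib
import hmac
import os
import struct

SERVER_SECRET = os.urandom(32)  # hypothetical secret known only to the server
STATE_FMT = "!HHBxH"            # assumed layout: ports, CCID, pad, loss window
STATE_LEN = struct.calcsize(STATE_FMT)  # 8 bytes
TAG_LEN = 16                    # truncated HMAC-SHA256 tag (an assumption)


def make_init_cookie(client_port, server_port, ccid, loss_window,
                     secret=SERVER_SECRET):
    """Return (iclen, cookie); iclen counts 32-bit words, as ICLEN does."""
    state = struct.pack(STATE_FMT, client_port, server_port, ccid, loss_window)
    tag = hmac.new(secret, state, hashlib.sha256).digest()[:TAG_LEN]
    cookie = state + tag  # 24 bytes, a whole number of 32-bit words
    return len(cookie) // 4, cookie


def open_init_cookie(cookie, secret=SERVER_SECRET):
    """Return the saved state tuple, or None if the cookie is invalid."""
    if len(cookie) != STATE_LEN + TAG_LEN:
        return None
    state, tag = cookie[:STATE_LEN], cookie[STATE_LEN:]
    expected = hmac.new(secret, state, hashlib.sha256).digest()[:TAG_LEN]
    if not hmac.compare_digest(tag, expected):
        return None  # treat the DCCPL-Connect as an invalid packet
    return struct.unpack(STATE_FMT, state)
```

   Because the server can recompute the tag from its secret alone, it
   needs no per-connection state between sending the DCCPL-Response and
   receiving a valid DCCPL-Connect.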

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCPL Header (12 bytes)                /
   /                    Type=2 (DCCPL-Connect)                     /
   /                       Type Spec = ICLEN                       /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgement Number                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |    CCDLEN     |   Reserved    |          Loss Window          |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                 Init Cookie (variable length)                 /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                   CC Data (variable length)                   /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                    Data (variable length)                     /
   /                              ...                              /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Acknowledgement Number: 32 bits
      The Acknowledgement Number field, which appears in several packet
      types, acknowledges the greatest valid sequence number received
      so far on this connection.  ("Greatest" is, of course, measured
      in circular sequence space.)  In this case, this field will equal
      the Sequence Number field of the DCCPL-Response packet that this
      packet is responding to.  It indicates acceptance of the starting
      value for sequence numbering in the server-to-client
      half-connection.

   Init Cookie Length (ICLEN): 8 bits
      This indicates the length, in 32-bit words, of the Init Cookie
      field.  It is taken directly from the ICLEN field in the
      DCCPL-Response packet.

   Congestion Control Data Length (CCDLEN): 8 bits
      This indicates the length, in 32-bit words, of the CC Data field.

   Loss Window: 16 bits
      This value, times 1024, is the value to use for the loss window
      in the client-to-server direction.

   Init Cookie: variable length
      This data is taken directly from the Init Cookie field in the
      DCCPL-Response packet.

   Congestion Control Data (CC Data): variable length
      This data is for interpretation by the congestion control
      algorithm.

   Sequence Number (from generic header): 32 bits
      This indicates the starting value for sequence numbers in the
      client-to-server half-connection.

   A server in the LISTEN state that receives a DCCPL-Connect packet
   with a valid Init Cookie instantiates local state for the connection
   and responds with a DCCPL-Ack or DCCPL-DataAck packet to finish the
   connection setup.  The Acknowledgement Number field in the DCCPL-Ack
   or DCCPL-DataAck packet is taken from the Sequence Number field of
   the DCCPL-Connect packet and indicates acceptance of the starting
   sequence number for the client-to-server half-connection.

   If the server wishes to abort the connection it MAY ignore the
   DCCPL-Connect packet, or send a DCCPL-Reset.  Valid reset Reasons
   are "Bad CC Data", if the CC Data field contains information not
   acceptable to the CCID specified in the DCCPL-Request, or
   "Connection Refused" for any other cause, but not including an
   invalid Init Cookie.  DCCPL-Connect packets with invalid Init
   Cookies SHOULD be ignored.  The server SHOULD limit the rate at
   which it generates these resets.

   The client SHOULD retransmit the DCCPL-Connect (increasing the
   Sequence Number) if no DCCPL-Ack or DCCPL-DataAck is received from
   the server within a suitable timeout.  The timeout value SHOULD be
   exponentially increased after each timeout.

3.6 DCCPL-Data, DCCPL-Ack, and DCCPL-DataAck Packet Formats

   The payload of a DCCP-Lite connection is sent in DCCPL-Data and
   DCCPL-DataAck packets, while DCCPL-Ack packets are used for
   acknowledgements when there is no payload to be sent.

   DCCPL-Data packets look like this:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCPL Header (12 bytes)                /
   /                      Type=3 (DCCPL-Data)                      /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                    Data (variable length)                     /
   /                              ...                              /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCPL-Ack packets dispense with the data, but contain an
   acknowledgement number and CCID-specific data:

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCPL Header (12 bytes)                /
   /                      Type=4 (DCCPL-Ack)                       /
   /                      Type Spec = CCDLEN                       /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgement Number                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                   CC Data (variable length)                   /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   DCCPL-DataAck packets contain data, an acknowledgement number, and
   CCID-specific data: acknowledgement information is piggybacked on a
   data packet.

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCPL Header (12 bytes)                /
   /                    Type=5 (DCCPL-DataAck)                     /
   /                      Type Spec = CCDLEN                       /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                    Acknowledgement Number                     |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                   CC Data (variable length)                   /
   /                                                               /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                    Data (variable length)                     /
   /                              ...                              /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Congestion Control Data Length (CCDLEN): 8 bits
      This field indicates the length, in 32-bit words, of the CC Data
      field.

   Acknowledgement Number: 32 bits
      This field contains the greatest sequence number (in circular
      space) seen in a valid packet received from the other
      half-connection.

   Congestion Control Data (CC Data): variable length
      This data is for interpretation by the congestion control
      algorithm.

   DCCPL A sends DCCPL-Data and DCCPL-DataAck packets to DCCPL B due to
   application events on host A.
   These packets are congestion-controlled by the CCID for the A-to-B
   half-connection.  In contrast, DCCPL-Ack packets sent by DCCPL A are
   controlled by the CCID for the B-to-A half-connection.  Generally,
   DCCPL A will piggyback acknowledgement information on data packets
   when acceptable, creating DCCPL-DataAck packets.  DCCPL-Ack packets
   are used when there is no data to send from DCCPL A to DCCPL B, or
   when the link from A to B is so congested that sending data would be
   inappropriate.

   The server in the LISTEN state sends an initial DCCPL-Ack or
   DCCPL-DataAck packet in response to a valid DCCPL-Connect packet to
   finish the connection handshake.  Because the network might lose
   this final handshake packet, and the client will then retransmit the
   DCCPL-Connect, the server SHOULD also send a DCCPL-Ack or
   DCCPL-DataAck in response to a valid DCCPL-Connect packet received
   in the SERVER-OPEN state.

3.7 DCCPL-Close Packet Format

   Either side of a DCCP-Lite connection may close the connection by
   sending a DCCPL-Close packet.  The normal response to a DCCPL-Close
   packet is a DCCPL-Reset packet with Reason "Closed".

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCPL Header (12 bytes)                /
   /                     Type=6 (DCCPL-Close)                      /
   /                      Type Spec = Reason                       /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                       Connection Nonce                        +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Reason: 8 bits
      The Reason field represents the reason that the sender is closing
      the DCCPL connection.
      The following reasons are currently defined:

                                           Section
         Reason   Name                     Reference
         ------   ----                     ---------
            0     Unspecified
            1     Normal                   4.2 and 4.4
            2     Loss Window exceeded     3.2
            3     Connect fail             4.2

   Connection Nonce: 64 bits
      This field echoes the connection nonce supplied by the other end
      of the connection in the DCCPL-Request or DCCPL-Response packet
      that initiated the connection.  A received DCCPL-Close packet
      with an invalid Connection Nonce SHOULD be considered an invalid
      packet.

3.8 DCCPL-Reset Packet Format

   A DCCPL-Reset, with Reason "Closed", is normally sent as an
   acknowledgement to a received DCCPL-Close packet.  DCCPL-Reset
   packets can also be sent in certain circumstances to abnormally
   terminate a connection (see sections 3.3, 3.4, and 3.5).

    0                   1                   2                   3
    0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   /                Generic DCCPL Header (12 bytes)                /
   /                     Type=7 (DCCPL-Reset)                      /
   /                      Type Spec = Reason                       /
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
   |                                                               |
   +                       Connection Nonce                        +
   |                                                               |
   +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

   Reason: 8 bits
      The Reason field represents the reason that the sender reset the
      DCCPL connection.  The following Reasons are currently defined:

                                           Section
         Reason   Name                     Reference
         ------   ----                     ---------
            0     Unspecified
            1     Closed                   4.1 and 4.2
            2     Connection Refused       3.5
            3     Too Busy                 3.3
            4     CCID Rejected            3.3
            5     Bad CC Data              3.5

   Connection Nonce: 64 bits
      This field echoes the connection nonce supplied by the other end
      of the connection in the DCCPL-Request or DCCPL-Response packet
      that initiated the connection.  A received DCCPL-Reset packet
      with an invalid Connection Nonce SHOULD be considered an invalid
      packet.


4. DCCP-Lite Operation

   In this section we present state diagrams and tables showing how a
   DCCP-Lite connection should progress, and the proper responses for
   packets or timeout events in various connection states.
   The state diagrams are illustrative; the state tables and text
   should be considered definitive.  All receive events in the diagrams
   and tables represent receipt of valid packets.

   The state diagrams do not contain all possible events or state
   transitions.  Refer to the state tables and the text for complete
   information.

   Each cell in the state tables describes the actions taken when an
   event (the row headers) occurs in a connection state (the column
   headers).  Events include valid received packets and other local
   events.  "Timeout" means that the state's timer has expired.
   "Backoff fail" means that the maximum number of timeouts/retries for
   a state has been reached.  "App action" means that the application
   has opened or closed the socket.

   Actions include the packet type to send in response and the next
   state for the connection.  "Accept" means to process the congestion
   control information and pass any data to the application.  "Backoff"
   means to restart the timer with an exponentially increasing value.
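
   The "Backoff" and "Backoff fail" terms above could be modeled along
   the following lines.  The initial timeout, the doubling factor, and
   the retry limit are assumptions chosen for illustration; this
   document does not specify them.

```python
def backoff_schedule(initial_timeout: float, max_retries: int,
                     factor: float = 2.0):
    """Yield the timer value to use for each successive attempt.

    Each "Backoff" restarts the timer with a value `factor` times the
    previous one.  When the generator is exhausted, the caller treats
    the next timeout as the "Backoff fail" event.  All three parameters
    are hypothetical; the specification leaves them open.
    """
    timeout = initial_timeout
    for _ in range(max_retries):
        yield timeout
        timeout *= factor
```

   For example, list(backoff_schedule(1.0, 4)) gives the timer values
   1.0, 2.0, 4.0, and 8.0 seconds for four attempts.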
4.1 Server State Diagram

                     +--------+
                     | CLOSED |<----------------------------------+
                     +--------+                                   ^
                         | app listens                            |
                         v                                        |
rcv Request       +->+--------+ app closes                        |
[send Response]   +--| LISTEN |---------------------------------->+
                     +--------+                                   ^
                         | rcv Connect with                       |
                         | valid Init Cookie                      |
                         | [new socket]                           |
                         | [send Ack or DataAck]                  |
                         v                                        |
rcv Connect    +->+-------------+  rcv Close   +-------------+    |
with valid     +--| SERVER-OPEN |------------->| SERVER-WAIT |    |
Init Cookie       +-------------+ [send Reset] +-------------+    |
[send Ack or             |        [set timer]         | timeout   |
 DataAck]                | app closes                 +---------->+
                         | [send Close]                           ^
                         | [set timer]                            |
timer expires            v                                        |
[send Close]   +->+--------------+ rcv Reset or backoff fails     |
[backoff timer]+--| SERVER-CLOSE |------------------------------->+
                  +--------------+

4.2 Server State Table

The SERVER-WAIT and SERVER-CLOSE states each have an associated timer that is started whenever the state is entered and cancelled when the state is exited. All Resets have Reason "Closed".

                                                SERVER-   SERVER-
Event        CLOSED  LISTEN       SERVER-OPEN   WAIT      CLOSE
-----        ------  ------       -----------   -------   -------
Request      Ignore  Response     Ignore        Ignore    Ignore
Response     Ignore  Ignore       Ignore        Ignore    Ignore
Connect      Ignore  Ack [1],     Ack [1]       Ignore    Ignore
                     SERVER-OPEN
Data [2]     Ignore  Ignore       Accept        Ignore    Ignore
Close        Ignore  Ignore       Reset,        Reset     Reset,
                                  SERVER-WAIT             CLOSED
Reset        Ignore  Ignore       Ignore        Ignore    CLOSED
Timeout      NA      NA           NA            CLOSED    Close,
                                                          Backoff
Backoff fail NA      NA           NA            NA        CLOSED
App action   LISTEN  CLOSED       Close,        No action No action
                                  SERVER-CLOSE

Notes:

[1] Ack or DataAck packet.

[2] Ack, DataAck or Data packet.
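
One way to read the server state table is as a lookup keyed by (state, event). The sketch below encodes only the cells that are not "Ignore", "NA", or "No action"; the dictionary layout, the function name server_step, and the choice to treat every unlisted pair as "ignore and stay" are assumptions made for illustration, not part of the specification.

```python
from typing import Dict, Optional, Tuple

# (state, event) -> (action, next state). A next state of None means the
# connection stays in its current state. The "Data" event stands for an
# Ack, DataAck or Data packet (note [2]); "Ack" stands for Ack or DataAck
# (note [1]).
SERVER_TABLE: Dict[Tuple[str, str], Tuple[Optional[str], Optional[str]]] = {
    ("CLOSED", "App action"): (None, "LISTEN"),
    ("LISTEN", "Request"): ("Response", None),
    ("LISTEN", "Connect"): ("Ack", "SERVER-OPEN"),
    ("LISTEN", "App action"): (None, "CLOSED"),
    ("SERVER-OPEN", "Connect"): ("Ack", None),
    ("SERVER-OPEN", "Data"): ("Accept", None),
    ("SERVER-OPEN", "Close"): ("Reset", "SERVER-WAIT"),
    ("SERVER-OPEN", "App action"): ("Close", "SERVER-CLOSE"),
    ("SERVER-WAIT", "Close"): ("Reset", None),
    ("SERVER-WAIT", "Timeout"): (None, "CLOSED"),
    ("SERVER-CLOSE", "Close"): ("Reset", "CLOSED"),
    ("SERVER-CLOSE", "Reset"): (None, "CLOSED"),
    ("SERVER-CLOSE", "Timeout"): ("Close, Backoff", None),
    ("SERVER-CLOSE", "Backoff fail"): (None, "CLOSED"),
}


def server_step(state: str, event: str) -> Tuple[Optional[str], str]:
    """Return (action, new state); unlisted pairs are ignored (assumption)."""
    action, next_state = SERVER_TABLE.get((state, event), (None, None))
    return action, next_state or state
```

For example, server_step("SERVER-OPEN", "Close") yields ("Reset", "SERVER-WAIT"), matching the Close row of the table, while server_step("CLOSED", "Request") yields (None, "CLOSED").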
4.3 Client State Diagram

                     +--------+
                     | CLOSED |<--------------------------------------+
                     +--------+                                       ^
                         | app opens                                  |
                         | [send Request]                             |
                         | [set timer]                                |
                         v                                            |
timer expires    +->+---------+ backoff fails or                      |
[send Request]   +--| REQUEST |-------------------------------------->+
[backoff timer]     +---------+ rcv Reset                             ^
                         |                                            |
                         | rcv Response                               |
                         | [send Connect]                             |
                         | [set timer]                                |
                         v                                            |
   backoff fails    +---------+<-+ timer expires                      |
+<------------------| CONNECT |--+ [send Connect]                     |
| [set timer]       +---------+    [backoff timer]                    |
| [send Close]           | rcv Ack                                    |
|                        v                                            |
|                 +-------------+  rcv Close   +-------------+ timer  |
|                 | CLIENT-OPEN |------------->| CLIENT-WAIT |------->+
|                 +-------------+ [set timer]  +-------------+ expires^
|                        |        [send Reset]      |  ^              |
|                        |                          |  |              |
|                        | app closes               +--+              |
|                        | [send Close]           rcv Close           |
|                        | [set timer]           [send Reset]         |
|                        v                                            |
v                 +--------------+<-+ timer expires                   |
+---------------->| CLIENT-CLOSE |--+ [send Close]                    |
                  +--------------+    [backoff timer]                 |
                         | rcv Reset or                               |
                         v backoff fails                              |
                         +------------------------------------------->+

4.4 Client State Table

The REQUEST, CONNECT, CLIENT-CLOSE and CLIENT-WAIT states each have an associated timer that is started whenever the state is entered and cancelled when the state is exited. All Resets have Reason "Closed".
                                           CLIENT-   CLIENT-   CLIENT-
Event        CLOSED    REQUEST   CONNECT   OPEN      CLOSE     WAIT
-----        ------    -------   -------   -------   -------   -------
Request      Ignore    Ignore    Ignore    Ignore    Ignore    Ignore
Response     Ignore    Connect,  Ignore    Ignore    Ignore    Ignore
                       CONNECT
Connect      Ignore    Ignore    Ignore    Ignore    Ignore    Ignore
Ack or       Ignore    Ignore    CLIENT-   Accept    Ignore    Ignore
DataAck                          OPEN
Data         Ignore    Ignore    Ignore    Accept    Ignore    Ignore
Close        Ignore    Ignore    Ignore    Reset,    Reset     Reset
                                           CLIENT-   CLIENT-
                                           WAIT      WAIT
Reset        Ignore    CLOSED    CLOSED    Ignore    CLOSED    Ignore
Timeout      NA        Request,  Connect,  NA        Close[3], CLOSED
                       Backoff   Backoff             Backoff
Backoff fail NA        CLOSED    Close[1], NA        CLOSED    NA
                                 CLIENT-
                                 CLOSE
App action   Request,  CLOSED    Close[2], Close[2], No action NA
             REQUEST             CLIENT-   CLIENT-
                                 CLOSE     CLOSE

Notes:

[1] Close Reason equal to "Connect fail".

[2] Close Reason equal to "Normal".

[3] Close Reason same as Close Reason on state entry.

5. Congestion Control IDs

Each congestion control mechanism supported by DCCP-Lite is assigned a congestion control identifier, or CCID: a number from 0 to 255. During connection setup the endpoints specify the congestion control mechanism for each half-connection. The CCID for the client-to-server half-connection is specified in the DCCPL-Request packet (section 3.3). The CCID for the server-to-client half-connection is specified in the DCCPL-Response packet (section 3.4).

All CCIDs standardized for use with DCCP-Lite will correspond to congestion control mechanisms previously standardized by the IETF. We expect that for quite some time, all such mechanisms will be TCP-friendly, but TCP-friendliness is not an explicit DCCP-Lite requirement.

CCIDs for use with DCCP-Lite are defined in separate documents. These documents define the contents and use of the CC Data fields in DCCPL-Request, DCCPL-Response, DCCPL-Ack and DCCPL-DataAck packets.
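
As a rough illustration of the per-half-connection selection described above, a server might check the CCID carried in a DCCPL-Request against the set of mechanisms it implements. Everything named here is hypothetical: the supported set, the function name, and in particular the choice to answer an unsupported CCID with Reset Reason 4 ("CCID Rejected", listed in section 3.8 with a reference to section 3.3) are assumptions of this sketch rather than behavior quoted from the specification.

```python
from typing import Optional

# Hypothetical set of CCIDs implemented by this endpoint (assumption).
SUPPORTED_CCIDS = frozenset({2, 3})

# Reset Reason 4, "CCID Rejected", from the table in section 3.8.
RESET_CCID_REJECTED = 4


def check_request_ccid(ccid: int) -> Optional[int]:
    """Return None to accept the CCID, or a Reset Reason code to refuse it.

    Refusing with "CCID Rejected" is this sketch's reading of the Reason
    table, not a rule stated in this section.
    """
    if not 0 <= ccid <= 255:
        raise ValueError("a CCID is a number from 0 to 255")
    return None if ccid in SUPPORTED_CCIDS else RESET_CCID_REJECTED
```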
The initial set of CCID values is:

CCID   Meaning
----   -------
0      Reserved
1      Reserved
2      TCP-like Congestion Control
3      TFRC Congestion Control

A DCCP-Lite implementation intended for general use (in a general-purpose operating system kernel, for example) SHOULD implement at least CCIDs 2 and 3. The intent is to make these CCIDs broadly available for interoperability. Application-specific implementations of DCCP-Lite (in a DSP for IP telephony, for example) MAY implement only the needed CCID.

6. Maximum Transfer Unit

A DCCP-Lite implementation MUST maintain its idea of the current maximum transfer unit (MTU) for each active DCCPL session. The particular MTU may be influenced by congestion control mechanisms, as well as Path MTU (PMTU) discovery [RFC 1191]. Any API to DCCP-Lite MUST allow the application to discover DCCPL's current MTU. DCCP-Lite applications SHOULD use the API to discover the MTU, and SHOULD NOT send datagrams that are greater than the MTU. A DCCP-Lite API MAY choose to let applications indicate that large packets should be fragmented; if an application that has not made this request tries to send a packet larger than the MTU, the DCCP-Lite implementation MUST drop the packet and return an appropriate error.

The PMTU SHOULD be initialized from the interface MTU that will be used to send packets. The MTU will be initialized with the minimum of the PMTU and any MTU set by the relevant CCID.

To perform PMTU discovery, the DCCPL sender sets the IP Don't Fragment (DF) bit. However, it is undesirable for MTU discovery to occur on the initial connection setup handshake, as the connection setup process may not be representative of packet sizes used during the connection, and performing MTU discovery on the initial handshake might unnecessarily delay connection establishment. Thus, DF SHOULD NOT be set on DCCPL-Request, DCCPL-Response, and DCCPL-Connect packets.
In addition, DF SHOULD NOT be set on DCCPL-Reset packets, although typically these would be small enough to not be a problem. On all other DCCPL packets, DF SHOULD be set.

As specified in [RFC 1191], when a router receives a packet with DF set that is larger than the PMTU, it sends an ICMP Destination Unreachable message to the source of the datagram with the Code indicating "fragmentation needed and DF set" (also known as a "Datagram Too Big" message). When a DCCP-Lite implementation receives a Datagram Too Big message, it decreases its PMTU to the Next-Hop MTU value given in the ICMP message. If the MTU given in the message is zero, the sender chooses a value for PMTU using the algorithm described in Section 7 of [RFC 1191]. If the MTU given in the message is greater than the current PMTU, the Datagram Too Big message is ignored, as described in [RFC 1191]. (We are aware that this may cause problems for DCCP-Lite endpoints behind certain firewalls.)

If the DCCP-Lite implementation has decreased the PMTU, and the sending application attempts to send a packet larger than the new MTU, the API MUST cause the send to fail, returning an appropriate error to the application, and the application SHOULD then use the API to query the new value of the PMTU. When this occurs, it is possible that the kernel has some packets buffered for transmission that are smaller than the old PMTU, but larger than the new PMTU. The kernel MAY send these packets with the DF bit cleared, or it MAY discard these packets; it MUST NOT transmit these datagrams with the DF bit set.

DCCP-Lite currently provides no way to increase the PMTU once it has decreased.

A DCCPL sender MAY optionally treat the reception of an ICMP Datagram Too Big message as an indication that the packet being reported was not lost due to congestion, and so for the purposes of congestion control it MAY ignore the DCCPL receiver's indication that this packet did not arrive.
However, if this is done, then the DCCPL sender MUST check the ECN bits of the IP header echoed in the ICMP message, and only perform this optimization if these ECN bits indicate that the packet did not experience congestion prior to reaching the router whose MTU it exceeded.

7. Security Considerations

DCCP-Lite does not provide cryptographic security guarantees. Applications desiring hard security should use IPsec or end-to-end security of some kind.

Nevertheless, DCCP-Lite is intended to protect against some classes of attackers. Attackers cannot hijack a DCCP-Lite connection (close the connection unexpectedly, or cause attacker data to be accepted by an endpoint as if it came from the sender) unless they can guess valid sequence numbers and connection nonces. Thus, as long as endpoints choose initial sequence numbers well, a DCCP-Lite attacker must snoop on data packets to get any reasonable probability of success. The sequence number validity (section 3.2) and connection nonce (sections 3.3, 3.4, 3.7 and 3.8) mechanisms provide this guarantee. We also avoid leaking sequence numbers and connection status to possibly malicious endpoints. This is why many invalid packets are ignored.

The Init Cookie mechanism allows servers to avoid keeping state for connection requests until the "liveness" of the source has been proved.

8. IANA Considerations

DCCP-Lite introduces three sets of numbers whose values should be allocated by IANA.

o 8-bit DCCPL-Reset Reasons (section 3.8).

o 8-bit DCCPL-Close Reasons (section 3.7).

o 8-bit DCCP-Lite Congestion Control Identifiers (CCIDs) (section 5).

In addition, DCCP-Lite requires a Protocol Number to be added to the registry of Assigned Internet Protocol Numbers. Experimental implementors should use Protocol Number 33 for DCCP-Lite, but this number may change in the future.

9. Normative References

[RFC 2119] S. Bradner. Key Words for Use in RFCs to Indicate Requirement Levels. RFC 2119.

[RFC 793] J. Postel, editor. Transmission Control Protocol. RFC 793.

[RFC 2460] S. Deering and R. Hinden. Internet Protocol, Version 6 (IPv6) Specification. RFC 2460.

[RFC 1191] J. C. Mogul and S. E. Deering. Path MTU Discovery. RFC 1191.

10. Informative References

[DCCP] E. Kohler, M. Handley, S. Floyd, and J. Padhye. Datagram Congestion Control Protocol (DCCP). draft-ietf-dccp-spec-04.txt, work in progress, June 2003.

[RFC 1948] S. Bellovin. Defending Against Sequence Number Attacks. RFC 1948.

11. Acknowledgments

This document depends heavily on [DCCP]. Large blocks of text were lifted with little or no change.

12. Author's Address

Tom Phelan
Sonus Networks
5 Carlisle Rd.
Westford, MA 01886
Phone: 978-589-84546
Email: tphelan@sonusnet.com