$Id: E2E-spec.txt,v 1.5 2003/07/14 22:15:59 nickm Exp $

                                MIX3:2
         Type-III Remailers: End-to-end Encoding and Delivery

                            George Danezis
                           Roger Dingledine
                            Nick Mathewson
                             (who else?)

Status of this Document

   This draft document ("E2E-spec.txt") describes a proposed
   specification for Type III remailers.  It is not a final version.
   It has not yet been submitted to any standards body.

   TODO: See open issues section.

Abstract

   This document describes the formats and algorithms used by client
   software for Type III mixes, and by Type III mixes that support
   delivery.  It does not contain information for the formats and
   algorithms that these applications share with relay-only remailers
   -- for these, see "minion-spec.txt".

   Although this document discusses security issues in implementing
   Type III mix software, it is not comprehensive, nor does it discuss
   all implementation issues.

Table of Contents

            Status of this Document
            Abstract
            Table of Contents
   1.       Introduction
   1.1.     Terminology

   2.       End-to-end message encoding
   2.1.     Building blocks
   2.1.1.   Building blocks: Compression
   2.1.2.   Building blocks: K-of-N fragmentation
   2.1.3.   Building blocks: Whitening
   2.2.     Generating messages
   2.2.1.   Packet format
   2.2.2.   Generating plaintext forward messages
   2.2.3.   Generating encrypted forward messages
   2.2.4.   Generating reply messages
   2.2.5.   Generating stateless SURBs
   2.3.     Decoding messages
   2.3.1.   Decoding algorithm
   2.3.2.   Overcompressed messages
   2.3.3.   Reconstruction issues

   3.       Message Delivery
   3.1.     General issues
   3.1.1.   ASCII armor
   3.1.2.   RFC822 headers
   3.2.     MBOX
   3.2.1.   Formatting: Routing information
   3.2.2.   Formatting: Message body
   3.2.3.   Delivery
   3.2.4.   Server descriptor section
   3.3.     SMTP
   3.3.1.   Formatting: Routing information
   3.3.2.   Formatting: Message body
   3.3.3.   Delivery
   3.3.4.   Server descriptor section

   A.1.     Appendix: Versioning and alphas

1. Introduction

   This document is an adjunct to the main Type III (Mixminion) Mix
   protocol specification in "minion-spec.txt".  Whereas the main
   specification describes formats and algorithms which all compliant
   Type III mixes and clients must generate and process, this
   document describes formats and algorithms for use at the edges of
   the system: by user software that generates anonymous messages; by
   user software that generates reply blocks; by exit nodes that
   deliver Type III messages; and by user software that decrypts
   encrypted Type III messages.

   Although it is possible for Type III mixes to support other
   delivery methods, and possible for clients to encode end-to-end
   messages in different ways, we _strongly_ encourage new
   implementations to remain compatible with these methods.  Since
   users receive anonymity from the cover traffic provided by other
   users, additional methods and options for delivery and encoding
   come at the expense of decreasing the anonymity provided by the
   system.

1.1. Terminology

      * "Packet" - A Type III mix packet.  Fixed in size to 32 KB.
      
      * "Message" - A variable size end-to-end message, transmitted
        from one location to another.  May be larger or smaller
        than a single packet.

      * "Forward message" - A message sent anonymously to a known
        recipient.
        
      * "Direct reply message" - A message sent to a known recipient,
        with no attempt to maintain sender anonymity.

      * "Anonymous reply message" - A message sent anonymously to an
        unknown recipient.

      * "Forward plaintext message" - A forward message sent in the
        clear.

      * "Forward encrypted message" - A forward message encrypted to
        the public key of its recipient.

      * "Corrupted packet" - A packet which has had its payload 
        corrupted on the second leg of its path.  [Other forms of
        corruption will result in the message's not being delivered.]

2. End-to-end message encoding

   This section describes a method for encoding messages of any size
   into payloads for one or more Type III packets, using compression
   to reduce bandwidth and K-of-N message fragmentation to improve
   reliability with large messages.  This method limits opportunities
   for end-to-end traffic analysis by making corrupted packets,
   packets from forward encrypted messages, and packets from reply
   messages indistinguishable to anyone besides their intended
   recipients.  It allows forward plaintext messages, however, to be
   read by exit servers, so that they can deliver them to recipients
   who aren't running Type III client software.

   This form of End-to-End encoding is used for the delivery types
   SMTP and MBOX, and will likely be appropriate for other delivery
   methods with similar needs.

2.1. Building blocks
   [XXXX]

2.1.1. Compression
   
   We define a compression primitive based on ZLIB, as defined in
   RFC-1950 and RFC-1951.  Because these standards describe only a
   message format, and do not mandate a single compression algorithm,
   we must restrict the allowable means of compression further.  (If
   we allowed message encoders to choose among valid ZLIB compression
   algorithms, they would become partitionable.)

   Specifically, we define COMPRESS(M) to equal the result of
   compressing M using zlib 1.1.4 (as available from www.gzip.org) with
   the following parameters, not forcing any explicit flushes until the
   message is compressed:
         Compression level    = 9 (maximum)
         Compression method   = DEFLATED (default)
         Window size          = 32K (default)
         Memory level         = 8 (default)
         Compression strategy = DEFAULT (default)

   Implementation note: Any software may be used, so long as it gives
   the same result as zlib 1.1.4 with these parameters.  Mark Adler
   from the Gzip team has averred that this is so with all zlib
   versions from 1.0.4 through 1.1.4, but *may* change in some future
   version.
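
   Implementation sketch (non-normative): the parameters above map
   onto Python's standard zlib module as shown below.  The function
   names compress and decompress are illustrative, and the module
   links against whatever zlib version is installed, so byte-for-byte
   agreement with zlib 1.1.4 is an assumption that an implementation
   would need to verify.

```python
import zlib


def compress(message: bytes) -> bytes:
    """COMPRESS(M): one zlib stream, with no explicit flush before the end."""
    compressor = zlib.compressobj(
        9,                        # compression level = 9 (maximum)
        zlib.DEFLATED,            # compression method
        15,                       # wbits = 15, i.e. a 32K window
        8,                        # memory level = 8
        zlib.Z_DEFAULT_STRATEGY,  # compression strategy
    )
    return compressor.compress(message) + compressor.flush()


def decompress(data: bytes) -> bytes:
    """DECOMPRESS(M): raises zlib.error if data is not a valid zlib stream."""
    return zlib.decompress(data)
```

   Because DECOMPRESS is undefined for arbitrary octet sequences,
   callers of decompress should be prepared to handle zlib.error.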
   
   We also define DECOMPRESS as the inverse of COMPRESS: namely, ZLIB
   decompression as described in RFCs 1950 and 1951.  Note that
   DECOMPRESS is not defined for every sequence of bytes.
   
2.1.2. K-of-N fragmentation

   We define a primitive, FRAGMENT, that breaks a K-packet message
   into N>K packets, such that any K of those packets are sufficient
   to reconstruct the original message with high probability.
   FRAGMENT(M,K,N,i) is the i'th such packet.
   
   We also define a primitive RECONSTRUCT(K,N,P_i1, P_i2,...P_iK) that
   reconstructs the original message.

   Note that these primitives only need to provide the above property
   of erasure correction, and need not provide "secret splitting": It
   is harmless if the message can be reconstructed from fewer than K
   packets.

   Currently, we use the algorithm described by Luigi Rizzo in several
   papers and implemented in software at
             http://info.iet.unipi.it/~luigi/fec.html . 
   This algorithm has
   several advantages: first, it has freely-available implementations
   in C and Java under a modified-BSD style license.  Second, it runs
   with acceptable performance for modest values of K (less than 30 or
   so).  Third, it does not seem to be patent encumbered.

   Of course, this algorithm has disadvantages.  First, it lacks a
   complete, byte-level specification beyond that given at the URL
   above.  Second, it performs poorly with very large K.  Sadly, it
   seems to be about the best we can do without touching patented
   algorithms.

   To avoid large K, we split the message into chunks, and do K-of-N
   fragmentation on the chunks.  With this scheme, a message can be
   reconstructed if and only if K packets from each chunk arrive
   intact. Because packet loss is likely to be nonuniformly concentrated
   at specific remailers and links, this is not so dangerous to
   reliability as it might initially appear. 

   [NOTE: I am not happy with this current situation: there needs to
    be _some_ unpatented probabilistic O(N) algorithm out there! -NM]

   We divide and fragment as follows:

       PROCEDURE: Divide a message into packets.  [DIVIDE(M,PS,EXF)]
       ARGUMENTS
           M: the message to send
           PS: payload size (fixed)
           EXF: expansion factor (fixed: everyone must use the same EXF,
                                 see below.)
   
       Let M_SIZE = CEIL(LEN(M) / PS)

       Let K = Min(16, 2**CEIL(Log2(M_SIZE)))
       Let NUM_CHUNKS = CEIL(M_SIZE / K)
   
       Let M = M | PRNG(NUM_CHUNKS*PS*K - Len(M))
   
       For i from 0 to NUM_CHUNKS-1:
          Let CHUNK_i = M[i*PS*K : PS*K]
       End
   
       Let N = Ceil(EXF*K)
   
       For i from 0 to NUM_CHUNKS-1:
         For j from 0 to N-1:
           FRAGMENTS[i*N+j] = FRAGMENT(CHUNK_i, K, N, j)
         End loop
       End loop
   
       Return FRAGMENTS.
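
   Implementation sketch (non-normative): the arithmetic in DIVIDE
   can be checked with the Python below.  It computes K, NUM_CHUNKS,
   N, and the padding length, then pads and splits the message into
   chunks; it does not implement FRAGMENT itself.  The function names
   are illustrative, os.urandom stands in for PRNG, and the EXF value
   of 1.5 used in the usage note is purely hypothetical.

```python
import math
import os


def divide_parameters(msg_len: int, ps: int, exf: float):
    """Return (K, NUM_CHUNKS, N, padding length) for a message of msg_len octets."""
    if msg_len <= 0 or ps <= 0:
        raise ValueError("message length and payload size must be positive")
    m_size = -(-msg_len // ps)                 # CEIL(LEN(M) / PS)
    k = min(16, 1 << (m_size - 1).bit_length())  # Min(16, 2**CEIL(Log2(M_SIZE)))
    num_chunks = -(-m_size // k)               # CEIL(M_SIZE / K)
    n = math.ceil(exf * k)                     # Ceil(EXF*K)
    pad_len = num_chunks * ps * k - msg_len    # octets needed to fill the last chunk
    return k, num_chunks, n, pad_len


def split_into_chunks(message: bytes, ps: int, exf: float):
    """Pad the message and cut it into NUM_CHUNKS chunks of PS*K octets each."""
    k, num_chunks, n, pad_len = divide_parameters(len(message), ps, exf)
    padded = message + os.urandom(pad_len)     # stand-in for PRNG(...)
    chunk_len = ps * k
    chunks = [padded[i * chunk_len:(i + 1) * chunk_len] for i in range(num_chunks)]
    return k, n, chunks
```

   For example, divide_parameters(400, 10, 1.5) describes a 40-payload
   message: K is capped at 16, so the message occupies 3 chunks, each
   expanded to N = 24 fragments, with 80 octets of padding.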

  [If we find an unpatented O(N) algorithm, we use it instead of this
   junk. -NM]

2.1.3. Building blocks: whitening

   While some fragments of a message are stored, but before the entire
   message has been received, a window of vulnerability exists on the
   exit server.  To prevent any portion of a message from being read in
   the clear before enough packets from the message have arrived,
   we apply the following whitening formula to messages before
   fragmentation:

   WHITEN(M) = SPRP_Encrypt(K_whiten, "WHITEN", M)
   UNWHITEN(M) = SPRP_Decrypt(K_whiten, "WHITEN", M)

   where K_whiten is equal to the octet sequence {57 48 49 54 45 4E}.

   Note that applying WHITEN and DIVIDE together has the effect of
   turning DIVIDE from an erasure correcting code into a full secret
   sharing encoding: If insufficient packets are received to reconstruct
   the whole message, none of the message can be reconstructed.
   
2.2. Generating messages

   When sending a message as packets, a sender follows these steps:

      1. Determine the type of the message: encrypted forward, plaintext
         forward, or reply.
      2. Compress and possibly fragment the message into a set of
         payloads.  (The size of the payloads will depend on the type
         of the message.)
      3. Annotate each payload with a payload header.  (The payload
         header includes size, integrity, and fragmentation
         information.)
      4. According to the type of the message, encode each payload into
         a final 28KB payload and (possibly) a 20-octet decoding handle.
      5. For each payload, select a list of servers to form a path
         through the network.
      6. Using the decoding handle, payload contents, and route for
         each payload, generate a 32KB Type III packet.
      7. Deliver each packet.

   This section describes steps 1 through 4.  Step 5 is described
   more fully in "path-spec.txt".  Steps 6 and 7 are described in
   "minion-spec.txt".   
      
2.2.1. Packet format

   As described in minion-design.txt, every Mixminion packet contains a
   28KB payload and two 2KB headers.  In this design, the routing
   information for the final hop in the final header contains a 20-octet
   decoding handle or 'tag'.

2.2.1.1. Decoding handles

   The decoding handle is used by the recipient to determine how to
   decode or decrypt the final message.

   (Because it is part of the header, this decoding handle must
   be generated by the same entity that creates the second leg of the
   path: the message sender in the case of forward messages, and the
   SURB generator in the case of reply messages.)

   In all types of messages, the decoding handle looks like a 159-bit
   random number, preceded by a single zero bit.

   In the various packet types, the decoding handle is used as follows:

      * Plaintext forward - the handle is unused; its value is random.

      * Encrypted forward - the handle holds the first 20 octets of the
        RSA-encrypted session key for this packet.  

      * Reply - the handle (generated by the SURB creator) contains a 
        random seed used to seed the RNG that generated the master
        secrets for this SURB header.

2.2.1.2. Payload formats

   The payload of every packet, when decrypted, begins with one of
   the following two headers.

   SINGLE-PACKET-MESSAGE Payload:  [Header size is 22 octets]
   
      (Single-packet message flag) [1 bit: 0]
      (Length of packet contents)  [15 bits]
      (Hash of packet contents)    [20 octets]

   The length field encodes the size of the contents of the packet
   after compression. 

   FRAGMENT Header:  [Header size is 47 octets]
   
      (Multi-packet message flag)       [1 bit: 1]
      (Packet index)                    [23 bits]
      (Hash of remaining fields and packet
       contents)                        [20 octets]
      (Message ID)                      [20 octets]
      (Message size)                    [4 octets]

   The packet index contains the position of the packet within the
   FRAGMENTS list generated by DIVIDE.  The Message ID is a random
   20-octet sequence.  The Message Size is the size of the whole
   message after compression.

   We define the constants FRAGMENT_HEADER_LEN = 47 and
   SINGLETON_HEADER_LEN=22.
   
   To break a message into packets with headers (steps 2 and 3 in 2.2
   above), an implementation follows these steps:

      PROCEDURE: PACKETIZE_MESSAGE(M, OVERHEAD)
      ARGUMENTS
          M: the message to send
          OVERHEAD: overhead needed for message type

      Let M_C = COMPRESS(M).

      If LEN(M_C)+SINGLETON_HEADER_LEN+OVERHEAD <= 28KB :
          Let PADDING_LEN = 28KB - LEN(M_C) - SINGLETON_HEADER_LEN
                            - OVERHEAD

          Let PADDING = Rand(PADDING_LEN)
          
          Return a singleton payload containing:
              Flag 0 | Int(15,LEN(M_C)) | Hash(M_C | PADDING) | M_C | PADDING

      Otherwise:
          Let FRAGMENTS = DIVIDE(M_C, 28KB-OVERHEAD-FRAGMENT_HEADER_LEN, EXF)
   
          Let ID = Rand(TAG_LEN)

          Let SZ = Int(32,Len(M_C))

          For every FRAGMENT_i in FRAGMENTS:
             
             Let PAYLOAD_i = a fragment payload containing:
                Flag 1 | Int(23,i) | Hash(ID | SZ | FRAGMENT_i ) | ID
                   | SZ | FRAGMENT_i

          return every PAYLOAD_i.
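
   Implementation sketch (non-normative): the singleton branch of the
   procedure above could be written in Python as follows.  The sketch
   assumes that Hash is SHA-1 (suggested by the 20-octet hash field),
   that integers are encoded big-endian, and that os.urandom is an
   acceptable stand-in for Rand; the name singleton_payload is
   illustrative.  Fragmented messages are not handled.

```python
import hashlib
import os
import zlib

PAYLOAD_LEN = 28 * 1024        # 28KB
SINGLETON_HEADER_LEN = 22      # 2-octet flag/length field + 20-octet hash


def singleton_payload(message: bytes, overhead: int = 0) -> bytes:
    """Build a singleton payload of PAYLOAD_LEN - overhead octets."""
    m_c = zlib.compress(message, 9)
    padding_len = PAYLOAD_LEN - len(m_c) - SINGLETON_HEADER_LEN - overhead
    if padding_len < 0:
        raise ValueError("message does not fit in one packet; fragment it")
    padding = os.urandom(padding_len)
    # The flag bit (0) and the 15-bit length share the first two octets.
    # len(m_c) is below 2**15 here, so the top bit is already zero.
    length_field = len(m_c).to_bytes(2, "big")
    digest = hashlib.sha1(m_c + padding).digest()
    return length_field + digest + m_c + padding
```

   The hash covers the padding as well as the compressed message, so
   a recipient can verify the whole payload before reading the length
   field.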

2.2.2. Generating plaintext forward messages

   To send a plaintext forward message M, we first let PAYLOADS =
   PACKETIZE_MESSAGE(M, 0).  For every PAYLOAD_i, we set TAG_i to a
   randomly chosen 159-bit integer.

   We then transmit each payload with its corresponding tag.

2.2.3.   Generating encrypted forward messages

   To send an encrypted forward message M to a user with an RSA public
   key PK with length PKLEN (in octets), we set PAYLOADS =
   PACKETIZE_MESSAGE(M, OAEP_OVERHEAD-TAG_LEN+SPRP_KEY_LEN).  (We lose
   42 octets to OAEP padding and 20 to encode the session key, but gain
   20 by spilling the encrypted data into the decoding tag.)

   For every payload PAYLOAD_i:

      Repeat:
           Let K = Rand(SPRP_KEY_LEN).
           Let P = K | PAYLOAD_i
           Let P0 = PK_Encrypt(PK, P[0:PKLEN-OAEP_OVERHEAD])
      Until the most significant bit of P0[0] is equal to 0.
      Let P1 = SPRP_Encrypt(K, "END-TO-END ENCRYPT",
                            P[PKLEN-OAEP_OVERHEAD: Len(P)-PKLEN+OAEP_OVERHEAD])
      Let TAG_i = P0[0:TAG_LEN]
      Let EPAYLOAD_i = P0[TAG_LEN:Len(P0)-TAG_LEN] | P1

   We then transmit every payload EPAYLOAD_i with the corresponding tag TAG_i.
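
   Implementation sketch (non-normative): the length bookkeeping in
   this section is easy to get wrong, so the Python below simply
   tracks the sizes.  It assumes the constants implied by the text
   (42 octets of OAEP overhead, a 20-octet tag, and a 20-octet SPRP
   key); the function name and the dictionary keys are illustrative.

```python
PAYLOAD_LEN = 28 * 1024   # 28KB
OAEP_OVERHEAD = 42
TAG_LEN = 20
SPRP_KEY_LEN = 20


def encrypted_forward_lengths(pklen: int) -> dict:
    """Return the octet lengths of each intermediate value for one payload."""
    payload_len = PAYLOAD_LEN - OAEP_OVERHEAD + TAG_LEN - SPRP_KEY_LEN
    p_len = SPRP_KEY_LEN + payload_len      # P = K | PAYLOAD_i
    rsa_input_len = pklen - OAEP_OVERHEAD   # part of P that goes through RSA
    p0_len = pklen                          # an RSA ciphertext is PKLEN octets
    p1_len = p_len - rsa_input_len          # remainder, SPRP-encrypted
    epayload_len = (p0_len - TAG_LEN) + p1_len
    return {
        "payload": payload_len,
        "P": p_len,
        "P0": p0_len,
        "P1": p1_len,
        "tag": TAG_LEN,
        "EPAYLOAD": epayload_len,
    }
```

   For any PKLEN, EPAYLOAD comes out at exactly 28KB: the 20 octets of
   P0 that move into the tag offset the 20-octet session key.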

2.2.4.   Generating reply messages

   To send a reply message M to an anonymous recipient, we set
   PAYLOADS = PACKETIZE_MESSAGE(M, 0).  We send each PAYLOAD_i with a
   separate SURB_i -- we must have enough SURBs to use a different one
   for each payload.  We do not need to include TAG (decoding handle)
   fields: they are a part of the SURB.

   SURB users SHOULD keep track of which SURBs they have used to
   prevent multiple use, at least until the SURBs have expired.

2.2.5.   Generating stateless SURBs

   In order to avoid storing a set of keys for every outstanding
   SURB, SURB generators use the following SURB generation
   procedure.  To use this method, SURB generators must store a
   separate long-term secret for each identity they wish to associate
   with a chain of SURBs.
   
   (Client software MUST support multiple identities, and MUST make
   it clear to the user which identity has been associated with each
   incoming SURB.)

   To generate a SURB for a path of length PATH_LEN, using a long-term
   secret SEC:
      
      Repeat:
         Let SEED = a random 159-bit seed.
      Until Hash(SEED | SEC | "Validate") ends with a 0 byte.

      Let K = Hash(SEED | SEC | "Generate")[0:KEY_LEN]

      Let STREAM = Encrypt(K, Z(KEY_LEN*(PATH_LEN + 1)))

      Let SHARED_SECRET = STREAM[PATH_LEN*KEY_LEN:KEY_LEN]

      For i in 1 .. PATH_LEN
         Let MS_i = STREAM[(PATH_LEN-i)*KEY_LEN : KEY_LEN]

      Generate a reply block using MS_i as the master secret for the
      i'th hop in the path, SEED as the tag, and SHARED_SECRET as the
      end-to-end shared secret.
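
   Implementation sketch (non-normative): the seed search and the
   layout of STREAM can be illustrated in Python as below.  The sketch
   assumes that Hash is SHA-1 and that KEY_LEN is 16 octets; both
   values should be checked against "minion-spec.txt".  The keystream
   generator Encrypt is omitted, since the Python standard library has
   no suitable cipher, and all function names are illustrative.

```python
import hashlib
import os

KEY_LEN = 16  # assumed key length in octets


def new_surb_seed(sec: bytes) -> bytes:
    """Pick a 159-bit SEED such that Hash(SEED | SEC | "Validate") ends in 0."""
    while True:
        raw = bytearray(os.urandom(20))
        raw[0] &= 0x7F  # zero the top bit, leaving 159 random bits
        seed = bytes(raw)
        if hashlib.sha1(seed + sec + b"Validate").digest()[-1] == 0:
            return seed


def is_own_surb_tag(tag: bytes, sec: bytes) -> bool:
    """Check whether a received decoding handle was generated under SEC."""
    return hashlib.sha1(tag + sec + b"Validate").digest()[-1] == 0


def surb_key(seed: bytes, sec: bytes) -> bytes:
    """K = Hash(SEED | SEC | "Generate")[0:KEY_LEN]."""
    return hashlib.sha1(seed + sec + b"Generate").digest()[:KEY_LEN]


def stream_offsets(path_len: int, key_len: int = KEY_LEN):
    """Return (offset of SHARED_SECRET, [offset of MS_1, ..., MS_PATH_LEN])."""
    shared = path_len * key_len
    master = [(path_len - i) * key_len for i in range(1, path_len + 1)]
    return shared, master
```

   On average the search tries 256 seeds.  A random tag passes the
   "Validate" check for a given SEC with probability 1/256, so the
   check is only a cheap filter: the decoder still has to confirm the
   result by finding a valid plaintext payload.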

2.3. Decoding messages

   When a Type III mix receives an exit packet, it tries to decode it
   (if it can) before delivery, and otherwise delivers it undecoded.
   When a client receives an undecoded exit packet, it tries to
   decode it before presenting it to the user.

2.3.1. Decoding algorithm

   Message decoders recognize plaintext (singleton or fragment)
   payloads by checking whether the hash fields match the calculated
   hash of the rest of the packet.  

   If a message decoder knows one or more SURB secrets, it then checks
   the decoding handle 'TAG' to see whether Hash(TAG | SEC |
   "Validate") ends with a zero byte for any secret SEC.  If so, the
   decoder generates secrets from TAG | SEC as in SURB generation, and
   successively decrypts the payload with up to MAX_PATH of them,
   checking each time for a plaintext payload.
   
   If no SURB secrets are known, or if no SURB secrets yield a
   plaintext payload, and the decoder knows one or more secret keys
   SK_i, it then checks whether PK_Decrypt(SK_i, TAG |
   P[0:PK_LEN-TAG_LEN]) has valid OAEP padding for some SK_i.  If so,
   it extracts K from the first 20 bytes of the decrypted value, and
   uses K to LIONESS-decrypt the rest of the payload.

   If none of these approaches works, the decoder has failed.  Upon
   failure, an exit node simply delivers the undecoded message and
   decoding handle to the message's recipient, in hopes that the
   recipient will have SEC or SK values.  A client, however, marks
   the message as JUNK.


   PROCEDURE: DECODE_PLAINTEXT_PAYLOAD
   ARGUMENTS:
       P: a payload to decode

   If the first bit of P[0] is 0:
      If P[2:HASH_LEN] = Hash(P[2+HASH_LEN:Len(P)-2-HASH_LEN]):
         SZ = P[0:2]
         Return "Singleton", P[2+HASH_LEN : SZ]
      Otherwise, 
         Return "Unknown"
   Otherwise, if the first bit of P[0] is 1:
      If P[3:HASH_LEN] = Hash(P[3+HASH_LEN:Len(P)-3-HASH_LEN]):
         IDX = P[0:3] & 0x7fffff
         MSG_ID = P[3+HASH_LEN:20]
         MSG_SZ = P[3+HASH_LEN+20:4]
         FRAG = P[3+HASH_LEN+20+4:Len(P)-3-HASH_LEN-20-4]
         Return "Fragment", IDX, MSG_ID, MSG_SZ, FRAG
      Otherwise:
         Return "Unknown"
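
   Implementation sketch (non-normative): DECODE_PLAINTEXT_PAYLOAD
   translates to Python fairly directly.  The sketch assumes that Hash
   is SHA-1 and that multi-octet integers are big-endian, and it
   expects a full-size payload as input.  Results are returned as
   tuples whose first element names the payload kind.

```python
import hashlib

HASH_LEN = 20


def decode_plaintext_payload(p: bytes):
    """Return ("Singleton", data), ("Fragment", idx, id, size, frag) or ("Unknown",)."""
    if p[0] & 0x80 == 0:
        # Singleton: 2-octet flag/length, 20-octet hash, then contents.
        if p[2:2 + HASH_LEN] == hashlib.sha1(p[2 + HASH_LEN:]).digest():
            sz = int.from_bytes(p[0:2], "big")
            start = 2 + HASH_LEN
            return ("Singleton", p[start:start + sz])
        return ("Unknown",)
    # Fragment: 3-octet flag/index, 20-octet hash, 20-octet ID, 4-octet size.
    if p[3:3 + HASH_LEN] == hashlib.sha1(p[3 + HASH_LEN:]).digest():
        idx = int.from_bytes(p[0:3], "big") & 0x7FFFFF
        id_start = 3 + HASH_LEN
        msg_id = p[id_start:id_start + 20]
        msg_sz = int.from_bytes(p[id_start + 20:id_start + 24], "big")
        frag = p[id_start + 24:]
        return ("Fragment", idx, msg_id, msg_sz, frag)
    return ("Unknown",)
```

   A payload that is still encrypted matches either hash check only by
   chance, which is what lets the decoder use these checks to
   recognize plaintext.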

   PROCEDURE: DECODE_PAYLOAD
   ARGUMENTS:
       P: a payload to decode
       TAG: decoding handle for the payload
       SK_1 ... SK_n: Optionally, a list of RSA secret keys
       SEC_1 ... SEC_n: Optionally, a list of SURB secrets.

   If DECODE_PLAINTEXT_PAYLOAD(P) is not "Unknown", return it.

   For all SEC_i:
      If Hash(TAG | SEC_i | "Validate") ends with a zero byte:
         K = Hash(TAG | SEC_i | "Generate")[0:KEY_LEN]
         STREAM = Encrypt(K, Z(MAX_PATH * KEY_LEN))
         Let P_t = P.
         For j in 0 ... MAX_PATH-1:
            Let P_t = SPRP_Encrypt(STREAM[j * KEY_LEN : KEY_LEN],
                                   "PAYLOAD_ENCRYPT", 
                                   P_t)
            If DECODE_PLAINTEXT_PAYLOAD(P_t) is not "Unknown", return it.
   
   For all SK_i:
      Let E0 = TAG | P[0:Len(SK_i)-TAG_LEN]
      Let P0 = PK_Decrypt(SK_i, E0).
      If the OAEP padding is valid:
         Let K = P0[0:KEY_LEN]
         Let P0' = P0[KEY_LEN:Len(P0)-KEY_LEN]
         Let P1 = SPRP_Decrypt(K, "END-TO-END ENCRYPT", 
                               P[Len(SK_i)-TAG_LEN : Len(P)-Len(SK_i)+TAG_LEN])
         If DECODE_PLAINTEXT_PAYLOAD(P0'|P1) is not "Unknown", return it.

   Otherwise, return "Unknown".

2.3.2. Overcompressed messages

   Because zlib allows up to 1000-fold compression, using zlib for
   message compression creates opportunities for serious mailbombing.

   When decoding a message, decoders MUST check whether the
   decompressed size of the message will be "far longer" than the
   compressed size.  In general, if C = COMPRESS(P), and Len(P) > 20K,
   and Len(P)/Len(C) > 20, then P SHOULD be considered overcompressed.

   Decoders MUST decode incrementally, so that they can notice
   overcompressed messages without using too much space.  Upon
   encountering an overcompressed message, an exit node MUST mark it
   as such and deliver it to the user without uncompressing it.

   Upon encountering an overcompressed message, client software SHOULD
   alert the user and require explicit confirmation before
   decompressing the message.
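
   Implementation sketch (non-normative): one way to decode
   incrementally is to cap the output of each decompression call, as
   in the Python below.  The sketch treats "20K" as 20*1024 octets,
   which is an assumption, and the names decompress_guarded and
   OvercompressedError are illustrative.

```python
import zlib


class OvercompressedError(Exception):
    """Raised when the output would exceed both overcompression thresholds."""


def decompress_guarded(compressed: bytes, max_ratio: int = 20,
                       min_size: int = 20 * 1024) -> bytes:
    """Decompress, giving up as soon as the message proves overcompressed."""
    # P is overcompressed iff Len(P) > min_size and Len(P)/Len(C) > max_ratio,
    # i.e. iff Len(P) exceeds the larger of the two bounds.
    limit = max(min_size, max_ratio * len(compressed))
    d = zlib.decompressobj()
    out = bytearray()
    data = compressed
    while data:
        # Never ask for more than one octet beyond the limit.
        out += d.decompress(data, limit + 1 - len(out))
        if len(out) > limit:
            raise OvercompressedError("decompressed size exceeds limit")
        data = d.unconsumed_tail
    out += d.flush()
    if len(out) > limit:
        raise OvercompressedError("decompressed size exceeds limit")
    if not d.eof:
        raise ValueError("truncated zlib stream")
    return bytes(out)
```

   An exit node would catch OvercompressedError and deliver the
   still-compressed message; a client would catch it and ask the user
   before decompressing further.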

3. Delivery

   This section describes the standard message delivery types provided
   with the Mixminion Type III mix implementation.  Other
   implementations MAY implement other types in other ways, though
   they SHOULD avoid adding new types in ways that would exacerbate
   partitioning attacks.

   This document *does not* describe routing types or transfer methods
   used for mix-to-mix communication; see "minion-spec.txt" for those.

3.1. General issues

3.1.1. ASCII armor

   When encoding an overcompressed or undecodeable Type III message,
   exit nodes MUST apply OpenPGP ASCII armor, as defined in RFC2440,
   section 6.2.  The header text is "BEGIN TYPE III ANONYMOUS
   MESSAGE".  There are two armor headers: "Message-type" (required)
   and "Decoding-handle" (optional).  The value of "Message-type" must
   be "encrypted" for an undecodeable message, and "overcompressed"
   for an overcompressed message.  For an undecodeable message, the
   decoding handle MUST be included, base-64-encoded, as the value
   of "Decoding-handle".  Otherwise, the "Decoding-handle" header
   MUST be omitted.

   When encountering a plaintext fragment, an exit node that does not
   support message reconstruction, or that is not willing to
   reconstruct a message of the given size, SHOULD deliver the
   fragment as-is, with the armor described above, and "Message-type" set
   to "fragment".

   When encoding a plaintext Type III message, exit nodes MAY apply
   OpenPGP ASCII armor if the message contains characters other than
   printing ASCII, and no encoding is specified in the message.  When
   doing so, the "Message-type" header must be "binary".

   Otherwise, exit nodes MAY format the message with OpenPGP armor
   headers and dash-escaped text.  In this case, the "Message-type"
   header MUST be "plaintext".
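
   Implementation sketch (non-normative): the armor for an
   undecodeable or overcompressed message might be produced as in the
   Python below.  The CRC-24 routine follows the checksum defined for
   OpenPGP armor.  The 64-character line width and the "END" tail line
   are assumptions based on common OpenPGP practice rather than
   requirements stated in this document, and the function names are
   illustrative.

```python
import base64
from typing import Optional


def crc24(data: bytes) -> int:
    """CRC-24 as used by OpenPGP ASCII armor."""
    crc = 0xB704CE
    for octet in data:
        crc ^= octet << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= 0x1864CFB
    return crc & 0xFFFFFF


def armor(payload: bytes, message_type: str,
          decoding_handle: Optional[bytes] = None) -> str:
    """Wrap payload in Type III ASCII armor with the given Message-type."""
    lines = ["-----BEGIN TYPE III ANONYMOUS MESSAGE-----",
             "Message-type: " + message_type]
    if decoding_handle is not None:
        handle = base64.b64encode(decoding_handle).decode("ascii")
        lines.append("Decoding-handle: " + handle)
    lines.append("")  # a blank line separates armor headers from data
    b64 = base64.b64encode(payload).decode("ascii")
    lines.extend(b64[i:i + 64] for i in range(0, len(b64), 64))
    checksum = crc24(payload).to_bytes(3, "big")
    lines.append("=" + base64.b64encode(checksum).decode("ascii"))
    lines.append("-----END TYPE III ANONYMOUS MESSAGE-----")
    return "\n".join(lines) + "\n"
```

   A caller would pass the decoding handle only for "encrypted"
   messages and leave it as None for "overcompressed" ones, matching
   the rule above.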

   [XXXX Right now, there's no way to specify an encoding in a
   message.  Don't worry--you didn't misread. -NM] 

3.1.2. RFC822 headers

   Delivery types that deliver messages via email or news protocols
   need to support setting a limited set of headers from message
   payloads.

   Headers fall into four classes:
      1. Set by exit node, not by message sender.  (Example: "Date")
      2. Set in packet header by path generator.  (Example: "To")
      3. May be set by message sender. (Example: "Subject")
      4. May be set partially by message sender. (Example: "From")

   To encode header values, we use the following message format:
      MESSAGE ::= HEADERS DATA
      HEADERS ::= HEADER HEADERS | HEADER_END
      DATA ::= (any sequence of octets)
      HEADER ::= HEADER_NAME COLON HEADER_VAL NL
      HEADER_END ::= NL
      NL ::= (ascii NL, hex 0A).
      COLON ::= (ascii ':', hex 3A).
      HEADER_VAL ::= HEADER_VAL_CHAR HEADER_VAL | (empty)
      HEADER_VAL_CHAR ::= (any character in the range hex 20 through
                           hex 7E inclusive)
      HEADER_NAME ::= HEADER_NAME_CHAR HEADER_NAME | HEADER_NAME_CHAR
      HEADER_NAME_CHAR ::= (any character in the range hex 21 through
                           hex 7E inclusive, excluding hex 3A.)

   Design note: We explicitly decline to implement full RFC[2]822.  This
   would add to the implementation complexity of Type III
   implementations, and endanger anonymity by allowing nonuniformity
   among client software packages.

   Unlike RFC[2]822, clients MUST use only recognized header names,
   and SHOULD normalize header values by removing leading or trailing
   space.  Unlike RFC[2]822, servers MUST remove unrecognized headers.

   To prevent distinguishability between clients, headers MUST appear
   in lexical (alphabetical) order.  Servers MUST NOT use out-of-order
   headers.

   To help implementations comply with RFC2822, each header MUST NOT
   be longer than 900 characters.
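
   Implementation sketch (non-normative): a parser for the grammar
   above might look like the Python below.  It rejects the whole
   message on a grammar, length, or ordering violation and silently
   drops unrecognized header names; the first of these is a policy
   choice made for the sketch, not a requirement of this document.
   The set of recognized names is passed in by the caller, names are
   compared exactly as written, and the function name is illustrative.

```python
MAX_HEADER_LEN = 900


def parse_message(message: bytes, recognized=None):
    """Split MESSAGE into ([(name, value), ...], DATA)."""
    headers = []
    pos = 0
    while True:
        end = message.find(b"\n", pos)
        if end < 0:
            raise ValueError("missing HEADER_END")
        line = message[pos:end]
        pos = end + 1
        if not line:  # HEADER_END: an empty line
            break
        if len(line) > MAX_HEADER_LEN:
            raise ValueError("header too long")
        name, colon, value = line.partition(b":")
        if not colon or not name:
            raise ValueError("malformed header")
        # HEADER_NAME_CHAR: hex 21..7E (the colon is excluded by partition).
        if any(c < 0x21 or c > 0x7E for c in name):
            raise ValueError("bad character in header name")
        # HEADER_VAL_CHAR: hex 20..7E.
        if any(c < 0x20 or c > 0x7E for c in value):
            raise ValueError("bad character in header value")
        headers.append((name.decode("ascii"), value.decode("ascii").strip(" ")))
    names = [n for n, _ in headers]
    if names != sorted(names):
        raise ValueError("headers out of lexical order")
    if recognized is not None:
        headers = [(n, v) for n, v in headers if n in recognized]
    return headers, message[pos:]
```

   For MBOX and SMTP delivery, a caller would pass the header names
   permitted for that delivery type as the recognized set.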

3.2. MBOX

   The routing type 0x101 corresponds to MBOX delivery.  Conceptually,
   an MBOX is an internally visible, Type III-only delivery address,
   specific to a single exit node.

3.2.1. Formatting: Routing information

   The routing info for an MBOX header MUST contain a 20-octet
   decoding handle, followed by a variable width MBOX name.  Exit
   nodes MUST drop packets addressed to unknown MBOXes, or packets
   with malformed routing info fields.

   The interpretation of the MBOX name is left to the exit node, and
   will vary between exit nodes.  Typically, exit nodes map from
   'username' to 'username@localhost' for a limited set of their user
   names, and deliver messages via sendmail.  Exit nodes MAY implement
   other schemes.

3.2.2. Formatting: Message body

  Header encoding is as described in 3.1.2 above.  

  The following headers are allowed:
        "SUBJECT"  (any.  Must be no more than 900 characters long.)
        "FROM"     (any sequence of printing ASCII characters
                    excluding '"', '[', ']', and ':'. )
 
        "IN-REPLY-TO" (an RFC2822 msg-id)
        "REFERENCES" (a list of RFC2822 msg-ids)

  [XXXX Are msg-ids really what we want? Should we say more?  Should
        we restrict encoding? -NM]

  Unrecognized or malformed headers MUST be removed.

3.2.3. Delivery

   When delivering an MBOX message via email, an exit node MUST
   construct an RFC2822 message as follows:

   The "To" line is the mailbox of the corresponding recipient.

   The "Subject" line is taken from the contents of the "SUBJECT"
   header, removing trailing and leading whitespace.  Implementations
   MAY prepend a short marker (e.g., "[ANON]" or "[MBOX]").  If no
   "SUBJECT" header is provided in the message, exit nodes SHOULD
   include a preconfigured one, such as "Type III Anonymous Message".

   The "From" line is generated as follows:
      Let F be the contents of the "FROM" header in the message.
      Remove all leading or trailing whitespace from F.
      Replace all sequences of 2 or more space characters in F with a
        single space.

      Prepend a double quote, a preconfigured marker (e.g., "[ANON]"),
        and a single space.
      Append a closing double quote, a space, and a preconfigured exit
        node mailbox in angle brackets (e.g., <nobody@____.com>).

      (Thus, if the sender specifies a "FROM" header of 'Lance
      Cottrell', an implementation could generate a 'From' header of
      the form:  "From: "[ANON] Lance Cottrell" <nobody@___.org>".)
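
   Implementation sketch (non-normative): the "From" line rules above
   can be expressed in Python as follows.  The default marker and
   mailbox are hypothetical configuration values chosen for the
   example, and the function name is illustrative.  The sketch also
   enforces the character restrictions given for "FROM" in 3.2.2.

```python
import re

FORBIDDEN = set('"[]:')


def from_line(from_header: str, marker: str = "[ANON]",
              mailbox: str = "nobody@example.invalid") -> str:
    """Build the RFC2822 "From" line from the sender-supplied FROM header."""
    f = from_header.strip()
    # Only printing ASCII is allowed, minus the forbidden characters.
    if any(ch in FORBIDDEN or not (" " <= ch <= "~") for ch in f):
        raise ValueError("FROM header contains a forbidden character")
    f = re.sub(r" {2,}", " ", f)  # collapse runs of spaces
    return 'From: "{} {}" <{}>'.format(marker, f, mailbox)
```

   With these defaults, from_line("  Lance   Cottrell ") yields
   'From: "[ANON] Lance Cottrell" <nobody@example.invalid>'.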
 
   The "Date" line should be the current date.

   The "In-Reply-To" and "References" lines should be taken verbatim
   from the corresponding headers, if those headers are present.
   [XXXX Is this sensible? -NM]

   The "X-Anonymous" line should be present, and set to "yes".

   Note again that all unrecognized or malformed headers MUST be
   removed.

   The payload SHOULD be escaped as described in 3.1.1.

3.2.4. Server descriptor section

   Servers that support MBOX delivery MAY include a [Delivery/MBOX]
   section, containing the entry "Version: 1.0".  Other servers
   MUST NOT include a [Delivery/MBOX] section.
   
   If the server supports message reconstruction, the section MAY
   include a "Maximum-size" line, containing the maximum permitted
   message size in KB (before compression).

3.3. SMTP

   The routing type 0x100 corresponds to SMTP (email) delivery.

3.3.1. Formatting: Routing information

   The routing information for an SMTP header MUST contain a 20-octet
   decoding handle, followed by a variable-width mailbox.

   A mailbox MUST be the "username@host" part of an RFC2821 mailbox.
   (Using full RFC2822 allows too much distinguishability between
   senders, and makes blacklisting hard.)  A mailbox MUST obey the
   following format:

      MAILBOX ::= LOCALPART AT HOSTPART
      LOCALPART ::= ATOM | LOCALPART DOT ATOM
      HOSTPART ::= ATOM | HOSTPART DOT ATOM
      ATOM ::= ATOMCHAR | ATOM ATOMCHAR
      ATOMCHAR ::= Any character in the range hex 21 through hex 7E,
             excluding '[', ']', '(', ')', '<', '>', '@', ',', '.', 
             ';', ':', '\', and '"'.
      AT ::= '@' (ASCII hex 40)
      DOT ::= '.' (ASCII hex 2E)

   Additionally a HOSTPART MUST NOT be an IP address -- it would make
   blacklisting hard, and encourage senders to resolve target hosts.
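
   Implementation sketch (non-normative): the MAILBOX grammar and the
   IP-address restriction can be checked in Python as below.  The
   sketch uses the standard ipaddress module to recognize dotted-quad
   host parts, which is only one possible reading of "an IP address".
   The function names are illustrative.

```python
import ipaddress

EXCLUDED = set('[]()<>@,.;:\\"')
# ATOMCHAR: hex 21 through hex 7E, minus the excluded characters.
ATOMCHARS = {chr(c) for c in range(0x21, 0x7F)} - EXCLUDED


def is_atom(s: str) -> bool:
    """ATOM: one or more ATOMCHARs."""
    return len(s) > 0 and all(ch in ATOMCHARS for ch in s)


def is_valid_mailbox(mailbox: str) -> bool:
    """Check MAILBOX ::= LOCALPART AT HOSTPART, with a non-IP HOSTPART."""
    local, at, host = mailbox.partition("@")
    if not at:
        return False
    # Both parts are dot-separated atoms.  A second '@' fails is_atom,
    # because '@' is not an ATOMCHAR.
    if not all(is_atom(a) for a in local.split(".")):
        return False
    if not all(is_atom(a) for a in host.split(".")):
        return False
    try:
        ipaddress.ip_address(host)
    except ValueError:
        return True   # not an IP address literal
    return False      # HOSTPART MUST NOT be an IP address
```

   IPv6 literals never reach the final check, since ':' is not an
   ATOMCHAR; only dotted-quad IPv4 addresses need the explicit test.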

3.3.2. Formatting: Message body
       
   The message body format is exactly the same as the MBOX format, as
   described above in 3.2.2.

3.3.3. Delivery

   To deliver an SMTP message, an exit node that supports the SMTP
   delivery type SHOULD construct an RFC2822 message as described in
   3.2.3 above, additionally setting the 'To' line to the mailbox
   given in the message header.  

   Implementations SHOULD allow exit node operators to configure
   additional fields, and to block specific 'To' addresses.
       
3.3.4. Server descriptor section

   Servers that support SMTP delivery MAY include a [Delivery/SMTP]
   section, containing the entry "Version: 1.0".  Other servers
   MUST NOT include a [Delivery/SMTP] section.
   
   If the server supports message reconstruction, the section MAY
   include a "Maximum-size" line, containing the maximum permitted
   message size in KB (before compression). [XXXX is "before"
   reasonable?]

A.1. Appendix: versioning and alphas

   Today's alpha code does not publish its version as '1.0'; it uses
   '0.x' instead (currently '0.1' for all versions in this document).
   Production versions MUST NOT retain backward compatibility
   with pre-production releases.