$Id: dir-spec.txt,v 1.8 2003/07/14 22:15:59 nickm Exp $

                                 Mix3:3
           Type III (Mixminion) Mix Directory Specifications

                             George Danezis
                            Roger Dingledine
                             Nick Mathewson
                              (who else?)

Status of this Document

   This draft document ("dir-spec.txt") describes a proposed
   specification for Type III Remailers.  It is not a final version.
   It has not yet been submitted to any standards body.

   TODO:
       - See open issues section
       - Resolve XXXXs
       - Final directory operation.

Abstract

   This document describes the protocols and message formats used by
   Type III mix servers and Type III directories compatible with the
   Mixminion reference implementation, and with the forthcoming
   Mixmaster 4.

   See "minion-spec.txt" for information about the Type III protocol
   itself.

Table of Contents

      Status of this Document
      Abstract
      Table of Contents
      1. Introduction
         1.1. Terminology
      2. Type-III information exchange format
         2.1. Message format
         2.2. Processing unrecognized information
         2.3. Representing data
         2.4. Calculating digests and signatures
      3. Server descriptor format
         3.1. Server identity
         3.2. Descriptor liveness
      4. Directory format
      5. Directory protocols
         5.1. Retrieving a directory
         5.2. Publishing a server descriptor
      6. Generating server descriptors
      7. Generating directories
      8. Downloading directories
      A.1. Appendix: Versioning and alphas
      A.2. Appendix: Suggested path selection algorithm

1. Introduction

   For a Mix network to provide anonymity to its users, it is vital
   that those users provide cover traffic to one another by behaving
   as similarly as possible when choosing paths and servers for their
   messages.  Because of this, it is vital that users have a means to
   learn about usable Mixes -- and that this means yield identical
   results for all users.

   Furthermore, because the Type-III Mix protocol relies on regular
   key rotation in order to limit the size of replay caches and
   mitigate the effect of key compromise, it is necessary for servers
   to be able to propagate information about their state to users.

   This document specifies the message formats and protocols used by
   the type-III remailer network to exchange information about
   servers.

1.1. Terminology

   * Server Descriptor - A human-readable text message describing a
     set of keys and capabilities for a single Mix.

   * Directory - A human-readable text message describing a list of
     Mixes and their keys.

   * Identity Key - A long-term signing RSA key used for
     authenticating server descriptors and directories.

   * Nickname - A human-readable unique identifier for a Mix.

   This document uses the terms "MUST", "SHOULD", "MAY", "MUST NOT",
   and "SHOULD NOT" as defined in RFC 2119.

2. Type-III Information Exchange format

   To simplify the work required to write a parser for message
   formats, we base server descriptors on the following extensible
   meta-format.

2.1. Mix Information Exchange format

   Informally, a Mix Information Message is a sequence of
   newline-separated key-value pairs, with a colon between keys and
   values.  These pairs are divided into sections; each section begins
   with a square-bracketed section identifier.  Blank lines are not
   allowed.

   Formally, a Mix Information Message is a sequence of ASCII
   characters, consisting of zero or more Sections.  Each Section
   contains a Header, and zero or more Entries.  Each Header consists
   of a left square-bracket ('[', ASCII 91), an Identifier, a right
   square-bracket (']', ASCII 93), and an EOL.  Each Entry consists of
   an Identifier, a colon (':', ASCII 58), a Space, a Value, and an
   EOL.

   An Identifier is a sequence of one or more printable nonspace
   characters other than colon (ASCII 33-57,59-126 inclusive).  A
   Space is a sequence of one or more space (' ', ASCII 32) or tab
   ('\t', ASCII 9) characters.  A Value is a sequence of zero or more
   printable ASCII characters, excluding NL and CR (ASCII 9, 32-126
   inclusive), and not beginning with a space or a tab.  An EOL is an
   optional Space, followed by either a CR ('\r', ASCII 13), an NL
   ('\n', ASCII 10), or a CR-NL sequence.

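
   As an illustration only, the definition above can be turned into a
   small line-oriented parser.  The sketch below is not taken from the
   Mixminion sources; the name parse_message and its return shape are
   assumptions made for this example.  It accepts a section with no
   entries and rejects blank lines and entries that appear before any
   header.

```python
import re

# Identifier: printable non-space ASCII except ':' (33-57, 59-126).
_IDENT = r"[!-9;-~]+"
# Value: starts and ends with a non-space character; tabs/spaces inside.
_VALUE = r"[!-~](?:[\t -~]*[!-~])?"
_HEADER_RE = re.compile(r"\[(" + _IDENT + r")\][ \t]*")
_ENTRY_RE = re.compile(r"(" + _IDENT + r"):[ \t]+(" + _VALUE + r")?[ \t]*")


def parse_message(text):
    """Return a list of (section_name, [(key, value), ...]) tuples.

    Hypothetical helper: raises ValueError on malformed input and
    UnicodeEncodeError if the text is not pure ASCII.
    """
    text.encode("ascii")
    # An EOL is CR, NL, or CR-NL.
    lines = re.split(r"\r\n|\r|\n", text)
    if lines and lines[-1] == "":
        lines.pop()  # the final EOL leaves one empty trailing element
    sections = []
    for number, line in enumerate(lines, start=1):
        header = _HEADER_RE.fullmatch(line)
        if header:
            sections.append((header.group(1), []))
            continue
        entry = _ENTRY_RE.fullmatch(line)
        if entry is None or not sections:
            # Covers blank lines, bad characters, and entries that
            # appear before the first section header.
            raise ValueError("malformed line %d: %r" % (number, line))
        sections[-1][1].append((entry.group(1), entry.group(2) or ""))
    return sections
```

   A caller would typically pass the whole message as one string and
   then look sections up by name, skipping those it does not
   recognize.
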
   [XXXX Is there any reason to handle non-ASCII here?  I don't want
    to walk down the road of having a dozen competing charsets and
    encodings. -NM]

   Here is a grammar, using C syntax for characters:

       Message ::= Section | Message Section
       Section ::= Header | Section Entry
       Header ::= '[' Identifier ']' EOL
       Entry ::= Identifier ':' Space OptValue EOL
       Identifier ::= IdentifierChar | Identifier IdentifierChar
       IdentifierChar ::= [Any character from ASCII 33 through ASCII
                           126 inclusive, excluding ASCII 58.]
       OptValue ::= Value |
       Value ::= NonSpaceValueChar | Value ValueChar
       NonSpaceValueChar ::= [Any character from ASCII 33 through
                              ASCII 126 inclusive.]
       ValueChar ::= NonSpaceValueChar | ' ' | '\t'
       Space ::= SpaceChar | Space SpaceChar
       SpaceChar ::= ' ' | '\t'
       OptSpace ::= Space |
       EOL ::= OptSpace '\n' | OptSpace '\r' | OptSpace '\r' '\n'

   An example follows (indented by a uniform number of spaces):

       [Section1]
       Key1: Value1
       [Empty_Section]
       [Section-Three]
       Key-two: the second value is this value
       Key3:

2.2. Processing unrecognized information

   To enable backward-compatible extensions of the Exchange format,
   all processors of Mix Information Exchange Messages MUST behave as
   follows when encountering unrecognized headers or entries.

   When processing a section with an unrecognized identifier, the
   processor must ignore the section completely.

   When processing a section with a recognized identifier, the
   processor must check whether it recognizes the version number of
   that section (usually encoded in an entry with the identifier
   'Version').  If it does not recognize the version number, the
   processor must ignore the section completely.

   When encountering an entry with an unrecognized identifier, the
   processor must ignore the entry.

2.3. Representing data

   All formats use the following conventions to convert encoded values
   to and from their underlying semantic meaning:

   - Trailing whitespace is always ignored; sequences of whitespace
     are collapsed to a single space.
   - All numeric quantities are represented in decimal.

   - All dates are represented in YYYY-MM-DD format.

   - All times are represented in YYYY-MM-DD HH:MM:SS or
     YYYY-MM-DD HH:MM:SS.mmmm format, relative to UTC.

     [Compatibility note: Mixminion through 0.0.4 generates and
      accepts only US-style YYYY/MM/DD dates.  To transition to
      ISO-style YYYY-MM-DD dates, version 0.0.5 will accept both
      styles and generate only US style.  Version 0.0.6 will accept
      both styles and generate only ISO-style.  Version 0.0.7 will
      accept and generate only ISO style.]

   - All binary data is base-64 encoded, with all linebreaks and space
     removed.

   - All boolean values are encoded as 'yes' or 'no'.

   - All RSA public keys are first encoded to binary with ASN.1, then
     encoded in base-64, with all linebreaks and space removed.

   - 'Address Patterns' are encoded according to the following
     grammar:

         AddressPattern ::= Address OptPortSpec
         Address ::= '*' | IP OptMask
         OptMask ::= '/' IP |
         IP ::= [An IPv4 address, in dotted-quad format.]
         OptPortSpec ::= | Port | Port '-' Port

     An omitted mask defaults to 255.255.255.255.  '*' is a synonym
     for 0.0.0.0/0.0.0.0.  An omitted PortSpec defaults to 48099 for
     'allow' entries and 0-65535 for 'deny' entries.

2.4. Calculating digests and signatures

   Several places in this specification require Messages to be
   self-signed with a given identity key.  The digest of a message is
   computed with the following steps:

   - First, any trailing whitespace on any line is removed, and every
     EOL is converted to a single NL character.

   - Second, the message is converted to a "stub" format: the values
     of any unsigned entries in the message are replaced with the
     empty string.  (Their entry lines now contain an identifier, a
     colon, a single ' ' character, and a single NL character.)

   - Third, a SHA-1 digest is computed over the resulting stub
     message.

   When signing a message, the signature is computed by taking the RSA
   signature of the digest with OAEP/PKCS1 padding and encoding, as
   described in "minion-spec.txt".
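
   The three digest steps can be sketched in a few lines.  This is a
   minimal illustration under stated assumptions, not the reference
   implementation: the function name message_digest and the
   unsigned_keys parameter are invented for this example, unsigned
   entries are matched by identifier alone (without regard to which
   section they appear in), and the RSA signing step is left out
   because it depends on the padding defined in "minion-spec.txt".

```python
import hashlib
import re


def message_digest(text, unsigned_keys):
    """Return the 20-byte SHA-1 digest of the 'stub' form of text.

    unsigned_keys is a set of entry identifiers whose values are
    treated as unsigned, e.g. {"Digest", "Signature"}.
    """
    stub_lines = []
    for line in re.split(r"\r\n|\r|\n", text):
        # Step 1: drop trailing spaces and tabs on every line.
        line = line.rstrip(" \t")
        # Step 2: blank out the value of each unsigned entry, leaving
        # identifier, colon, and exactly one space.
        key, sep, _ = line.partition(":")
        if sep and key in unsigned_keys:
            line = key + ": "
        stub_lines.append(line)
    # Every EOL becomes a single NL when the lines are rejoined.
    stub = "\n".join(stub_lines)
    # Step 3: SHA-1 over the stub message.
    return hashlib.sha1(stub.encode("ascii")).digest()
```

   Because the values of unsigned entries are blanked before hashing,
   two copies of a message that differ only in those values, in
   trailing whitespace, or in newline style yield the same digest.
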
   RSA signatures are encoded in base-64.

3. Server Descriptor format

   This section describes the format of server descriptors, as
   uploaded to and downloaded from directory servers.

   A server descriptor is a promise, by a mix's administrators, to
   provide a given set of services, keys, and exit policies over a set
   period of time.

   The first section must be a 'Server' section.  This section MUST
   include each of the following entries in any order, exactly once.

       'Descriptor-Version': the string "1.0".

       'Nickname': A human-readable identifier for this server.  It
           MUST be no more than 128 characters.  It MUST contain only
           the characters [A-Za-z0-9_@] and '-'.

       'Identity': This Mix node's identity key, represented in ASN.1,
           and encoded in BASE64.  The modulus of this key should be
           at least 2048 bits long and no more than 4096 bits long.
           The exponent of this key must be 65537.

       'Digest': The digest of this descriptor.  When generating the
           digest, the Digest entry is excluded as unsigned.

       'Signature': The signed digest of this block, signed by the
           Identity key.  When generating the digest, the Signature
           entry is excluded as unsigned.

       'Published': The time when this block was generated.

       'Valid-After': A date.  After midnight GMT on this date, this
           server SHOULD support the operations listed in this
           descriptor.

       'Valid-Until': A date.  Until midnight GMT on this date, this
           server SHOULD support the operations listed in this
           descriptor.  This date MUST be at least one day after the
           date in Valid-After.

       'Packet-Key': The public key used to encode subheaders for
           Type-III packets.

       'Packet-Versions': A comma-separated list of allowable
           major.minor versions for packets this server will process.
           In a production network, only one value should be used for
           this field.  [Added in Mixminion 0.0.3]

   The 'Server' section MAY contain the following entries, at most
   once each:

       'Contact': An email address that may be used to contact the
           administrator of this server.  Must be no more than 256
           characters.

       'Contact-Fingerprint': Fingerprint of the server
           administrator's PGP key.  Must be no more than 128
           characters.

       'Comments': Human-readable information about this server.  Must
           be less than 1024 bytes long.  It *must not* be necessary
           to read this information to use the server properly.

       'Software': A string description of the software this server is
           running.  Must be less than 256 characters.  Software
           SHOULD NOT take any action based on this field, other than
           to display it.  [Added in Mixminion 0.0.3]

       'Secure-Configuration': A boolean value.  If true, the server
           must not be running in an insecure operating mode.
           [XXXX list these modes.  Added in Mixminion 0.0.4]

       'Why-Insecure': A human-readable string.  This string SHOULD be
           present if and only if Secure-Configuration is 'no'.  If
           present, it SHOULD contain an explanation of why the
           operating mode is insecure.  [Added in Mixminion 0.0.5]

   [Note: before computing the digest, all implementations must
    normalize CR and CR-LF style newlines to a single NL, and remove
    any spaces and tabs that may have been introduced at the ends of
    lines.]

   If this server accepts incoming MMTP connections, it MAY have an
   'Incoming/MMTP' section, with the following entries, exactly once
   each:

       'Version': The string '1.0'

       'IP': An IPv4 address, in dotted-quad format.

       'Port': A port at which IP accepts incoming MMTP connections.

       'Key-Digest': The KEYID of this server, encoded in BASE64.
           This is the same as the SHA-1 digest of the ASN.1 encoding
           of the identity key.  (See "minion-spec.txt" for more
           information.)

       'Protocols': A comma-separated list of the versions of MMTP
           this server accepts.

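
   As a small illustration of the 'Key-Digest' entry, the sketch below
   derives it from the value of the 'Identity' entry and checks the
   two for consistency.  The function names are hypothetical, and the
   code assumes that the 'Identity' value is the base-64 text of the
   key's ASN.1 encoding, as section 2.3 describes; it does not parse
   or validate the key itself.

```python
import base64
import hashlib


def key_digest(identity_b64):
    """Return the base-64 SHA-1 digest of an ASN.1-encoded key.

    identity_b64 is the value of the 'Identity' entry.
    """
    asn1_bytes = base64.b64decode(identity_b64, validate=True)
    digest = hashlib.sha1(asn1_bytes).digest()
    return base64.b64encode(digest).decode("ascii")


def key_digest_matches(identity_b64, key_digest_b64):
    """True if a 'Key-Digest' value agrees with an 'Identity' value."""
    return key_digest(identity_b64) == key_digest_b64.strip()
```

   A client could run such a check when it reads a descriptor, and
   treat a mismatch between the two entries as a reason to reject it.
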
   The 'Incoming/MMTP' section MAY contain any number of entries of
   the form:

       'Allow': AddressPattern

       'Deny': AddressPattern

   If this server supports outgoing MMTP connections, it MAY have an
   'Outgoing/MMTP' section, with one entry each of the form:

       'Version': The string '1.0'

       'Protocols': A comma-separated list of versions of MMTP this
           server supports for outgoing connections.

   The 'Outgoing/MMTP' section MAY contain any number of entries of
   the form:

       'Allow': AddressPattern

       'Deny': AddressPattern

   These entries are order-significant; the first one to match wins.
   The default policy is 'Deny: *'.

   The 'Testing' section MAY be generated by some Type III remailers
   to describe other information about their configuration that may be
   useful for debugging.  Implementations MUST NOT require any
   specific entries within 'Testing'; implementations also MUST NOT
   require any specific format for entries that may be present.

   If this server supports outgoing delivery mechanisms, it MAY have
   corresponding delivery sections.  See 'E2E-spec.txt' for more
   details on specific types, including SMTP and MBOX.

   Other services provided by this server SHOULD each have their own
   section.

   Note: A server MAY omit some of its capabilities from its
   descriptor.  It is permissible (for example) for a server that
   supports incoming MMTP connections to omit the Incoming/MMTP
   section.  A server MUST NOT, however, advertise any capabilities it
   does not support.

3.1. Server Identity

   Every server descriptor contains two fields that identify the
   corresponding mix: the Identity public key, and the Nickname.
   Because only the Mix has the private key corresponding to the
   Identity key, the identity key works as a unique identifier for the
   Mix.

   For user convenience, a mix's Nickname also serves as a unique
   identifier.
   Every nickname SHOULD correspond to a single identity key:
   directory servers and clients SHOULD reject descriptors that use
   the same nickname as a previously encountered descriptor but change
   the identity key.

   All nickname matches MUST be case insensitive.

3.2. Descriptor liveness

   When choosing between multiple server descriptors for the same Mix
   valid at the same time, implementations SHOULD choose the most
   recently published descriptor.

   The interval of time between a descriptor's 'Valid-After' and
   'Valid-Until' dates is called its 'Lifetime'.  If some descriptor's
   lifetime is in the past, that descriptor is said to be 'Expired'.
   If a descriptor's lifetime is all either in the past or contained
   within the lifetimes of more recently published descriptors for the
   same server, that descriptor is said to be 'superseded'.

4. Directory Format

   A directory contains a list of Mixminion servers which are believed
   to be operational at a given time.

   A directory MUST contain all of the following, in order:

       - A 'Directory' section,
       - A 'Signature' section,
       - A 'Recommended-Software' section,
       - One or more server descriptors (see section 3 above).

   The 'Directory' section MUST contain the following entries:

       - 'Version': The string '1.0'

       - 'Published': The time when this directory was generated.

       - 'Valid-After' : A date.  This directory SHOULD NOT be used
         before midnight GMT on this date.

       - 'Valid-Until' : A date.  This directory SHOULD NOT be used
         after midnight GMT on this date.  This date SHOULD be exactly
         one day after the date in 'Valid-After'.

       - 'Recommended-Servers' : A comma-separated list of server
         nicknames.  Clients SHOULD NOT depend on servers whose
         nicknames are not on this list to be reliable or trustworthy.

   The 'Signature' section MUST contain the following fields:

       - 'DirectoryIdentity' : The Identity key of the directory
         server that generated this directory.  The modulus of this
         key must be between 2048 and 4096 bits long, and the exponent
         must be 65537.

       - 'DirectoryDigest' : The digest of the entire directory.  The
         value of this entry is unsigned.

       - 'DirectorySignature' : The signature of the directory digest
         with the directory server's identity key.  The value of this
         entry is unsigned.

   The 'Recommended-Software' section MUST contain the following
   entries:

       - 'MixminionClient' : A comma-separated list of up-to-date
         versions of Mixminion.  If a client is running a version more
         recent than any on the list, it SHOULD issue a warning.  If a
         client is running a version not on the list, and some version
         on the list is more recent than the client's version, the
         client SHOULD issue a warning, and MAY refuse to run.

       - 'MixminionServer' : A comma-separated list of up-to-date
         versions of Mixminion.  Servers should interpret this list as
         clients interpret 'MixminionClient'.

   Entries in 'MixminionClient' and 'MixminionServer' are in
   decreasing order of preference.

   Because the version numbering scheme will be different for each
   implementation, lines within 'Recommended-Software' are version
   specific.  Other implementations of Type-III should generate
   similar entries in 'Recommended-Software'.

5. Directory Protocols

   Compliant directory servers MUST provide HTTP URLs to download a
   current directory or upload a descriptor for inclusion in a
   directory.

5.1. Retrieving a directory

   A directory server SHOULD provide one or more well-known HTTP URLs
   at which its directory can be downloaded.  Retrieving such a URL
   MUST yield a currently valid directory, or one which will be valid
   'very soon' [XXXX how soon?].  The contents of such a URL MAY be
   compressed with GZIP to save bandwidth and speed downloads.

   Directory servers SHOULD also make their identities well-known out
   of band.

5.2. Publishing a server descriptor

   A directory server SHOULD provide one or more well-known HTTP URLs
   to publish server descriptors to the directory.

   To upload a descriptor block, a client performs an HTTP POST
   request to the upload URL, with the server block as the contents of
   a single parameter, 'desc'.

   The server MUST reply to an upload with a message of Content-Type
   text/plain, and contents of the form

       UploadReply ::= StatusLine MessageLine
       StatusLine ::= "Status: " Bit EOL
       Bit ::= '0' | '1'
       MessageLine ::= "Message: " Value EOL

   If the upload is successful and the descriptor will be accepted
   into the directory, the status MUST be 1, and the message MUST be
   'Accepted.'.

   Otherwise, if the upload was successful and the descriptor will not
   be accepted into the directory, the status MUST be 0, and the
   message SHOULD be a description of why the server descriptor was
   not accepted.

   Finally, if the upload was successful, but the descriptor will only
   be accepted into the directory when manually approved by the
   administrator, the status MUST be 1, and the message MUST be a
   description of the status of the descriptor, and MUST NOT be
   'Accepted.'.

6. Generating server descriptors

   Servers SHOULD generate at least [[two weeks]] of keys in advance,
   and SHOULD allow about [[2.5 days]] for newly published keys to
   appear in the directory.  Servers SHOULD continue to accept packets
   encrypted to old keys at least [[20 hours]] after their published
   Valid-Until date, and SHOULD NOT accept new keys until their
   published Valid-After date.

   [XXXX These ranges above are all guesses. -NM]

   Servers MAY refrain from publishing their keys entirely.

   When a server's capabilities or configuration changes in such a way
   as to render a previous server descriptor incorrect, it SHOULD
   immediately generate a new server descriptor for each of the
   existing server descriptors it has published, using the same keys
   as used in the existing published descriptors.

   To advertise a planned outage, a server SHOULD publish a server
   descriptor valid over the time of the entire planned outage, with
   all sections except 'Server' absent.

7.
Generating directories

   Directory servers SHOULD change their directories only at midnight
   GMT.

   For every mix, the directory SHOULD either include no descriptors
   from that mix, or include all of that mix's published descriptors
   that are not expired or superseded, and that do not become valid
   before a given cutoff time [[2 weeks]].

   A directory server decides, for every mix whose descriptors it has
   received, whether that mix is 'Trustworthy' and 'Reliable'.  A
   'trustworthy' mix is one that is believed to protect its users'
   anonymity.  A 'reliable' mix is one that delivers packets with high
   probability and reasonable latency.  Directories SHOULD include
   only trustworthy and reliable mixes on their 'Recommended-Servers'
   entry.

8. Downloading directories

   Any client that has not downloaded a directory since the most
   recent midnight GMT SHOULD download a fresh directory before
   generating any packets.

A.1. Versioning and Alphas

   Today's alpha code does not publish its version as '1.0'; it uses
   '0.x' instead (currently '0.2' for Descriptor-Version, '0.2' for
   directory Version, and '0.1' for everything else).  Production
   versions MUST NOT retain backward compatibility with pre-production
   releases.

   When generating Recommended-Software entries for Mixminion, we
   apply the following policies: A development version of Mixminion
   (pre 1.0) will only be declared obsolete when it is either too
   insecure or too buggy to use, when backward compatibility is
   broken, or when a new stable release comes out.  Stable releases
   will be taken off the list only for security or privacy reasons.

A.2. Generating paths

   Given a valid directory, clients MAY generate paths of any length
   when they send messages.  Nonetheless, client implementors SHOULD
   consider the following approach to path selection, and SHOULD be
   aware of anonymity issues if another algorithm is chosen.

   Implementations MAY allow users to specify paths of their own.
   If they do, implementations SHOULD at least warn users who generate
   paths that would not be generated by the standard algorithm.
   Implementations that allow path selection SHOULD allow partial path
   selection as well.  Implementations SHOULD at least allow users to
   specify 'random path of length N', and 'random path of random
   length (E[length]=N)'.

   [This algorithm is implemented in Mixminion 0.0.5]

   The inputs to this algorithm are:

       1) The earliest time at which the path must be valid.
          (T_Start)

       2) The latest time at which the path must be usable.  (T_End)

       3) A template consisting of two lists of server specifiers, one
          list for each leg on the path.  A server specifier is either
          a nickname, or the token 'ANY' to indicate a randomly chosen
          server.  When generating a SURB, the first list is empty.
          When using a SURB, the second list is empty.

       4) An exit address, unless using a SURB.

   First, the client builds a list of 'current' server descriptors.
   This list contains every server descriptor that meets these
   criteria:

       A) Is in the current directory.

       B) Is valid continuously from T_Start through at least
          [[24 hours]] from T_End.

       C) Is published more recently than any other server descriptor
          with the same nickname that meets criteria A and B.
          (Remember, all Nickname comparisons are case-insensitive.)

       D) Expires later than any other server descriptor with the same
          nickname that meets criteria A and B and C.

   The client also builds a list of 'preferred' server descriptors.
   This list contains every 'current' server descriptor that also
   meets these criteria:

       E) Has its nickname in the 'Recommended-Servers' list.

   Next, the client resolves explicitly specified servers: every
   server specifier with a provided nickname resolves into the
   'current' server descriptor with that nickname.  If there is no
   such server, the client gives an error message.

   Next, the client picks the exit server (if not using a SURB).
   If the last entry of the second path specifier is ANY, the client
   chooses an element at random from among those 'preferred' server
   descriptors that support delivery to the exit address, trying to
   avoid any sequence of two consecutive server descriptors with the
   same nickname.

   Finally, the client picks servers for each of the 'ANY' server
   specifiers.  The client picks 'preferred' server descriptors at
   random, with replacement, avoiding any sequence of two consecutive
   server descriptors with the same nickname.
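
   The selection steps above can be sketched as follows.  This is a
   simplified illustration, not the algorithm as implemented in
   Mixminion: the Descriptor class and both function names are
   hypothetical, the two legs of the template are flattened into one
   list, exit-address capability checks are omitted, and criterion B
   is read as "valid from T_Start through T_End plus 24 hours".  When
   a repeated nickname cannot be avoided, the sketch falls back to
   allowing it.

```python
import random
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Descriptor:
    """Hypothetical, minimal view of a parsed server descriptor."""
    nickname: str
    published: datetime
    valid_after: datetime
    valid_until: datetime


def current_descriptors(directory, t_start, t_end, slack=timedelta(hours=24)):
    """Apply criteria A-D; return {lowercased nickname: Descriptor}."""
    best = {}
    for desc in directory:  # (A) only descriptors from the directory
        # (B) assumed reading: valid from T_Start through T_End + slack.
        if desc.valid_after > t_start or desc.valid_until < t_end + slack:
            continue
        key = desc.nickname.lower()  # nicknames compare case-insensitively
        old = best.get(key)
        # (C) most recently published wins; (D) ties go to later expiry.
        if old is None or ((desc.published, desc.valid_until)
                           > (old.published, old.valid_until)):
            best[key] = desc
    return best


def fill_path(template, current, recommended, rng=random):
    """Resolve nicknames, then fill each 'ANY' slot from 'preferred'."""
    wanted = {name.lower() for name in recommended}
    preferred = [d for key, d in current.items() if key in wanted]  # (E)
    path = []
    for spec in template:
        if spec == "ANY":
            path.append(None)  # filled in below
        elif spec.lower() in current:
            path.append(current[spec.lower()])
        else:
            raise LookupError("no current descriptor for %r" % spec)
    for i, slot in enumerate(path):
        if slot is not None:
            continue
        # Nicknames of already-known neighbours on either side.
        near = {path[j].nickname.lower() for j in (i - 1, i + 1)
                if 0 <= j < len(path) and path[j] is not None}
        choices = [d for d in preferred if d.nickname.lower() not in near]
        if not choices:
            choices = preferred  # cannot avoid a repeat; fall back
        if not choices:
            raise LookupError("no preferred servers available")
        path[i] = rng.choice(choices)
    return path
```

   Passing a seeded random.Random instance as rng makes a run
   repeatable for testing; a real client would need a
   cryptographically strong source of randomness instead.
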