Mixminion directory agreement protocol -- draft Nick Mathewson 13 August 2003 Status This is a draft document describing a directory agreement protocol for Type III remailers. If people like it, we'll merge in into Mix3:3 "dir-spec.txt". It's substantially identical to what Nick was blathering about at Defcon. It doesn't have as much detail as a final specification should. In particular, it doesn't describe data formats. It isn't perfect, but it's better than Type II. :) 0. Motivation and design goals Currently, we have specifications for most of the type III directory system, including: server descriptor structure, directory structure, publishing to directories, building directories, downloading directories, and using directories. One critical piece remains: directory agreement. In today's Type II and Type III code, each directory server publishes an independent description of the network's state. This enables the following attacks: (1) An attacker who controls a directory server can redirect that server's clients toward a restricted set of mixes with little risk of detection. (2) An attacker who controls a directory server can feed target users a modified directory that would make those users highly distinguishable. Since directories are signed, this attack leaves a trail--if its victims notice it's happening. (3) An attacker who eavesdrops on a user or a directory can tell who is using a particular directory. Knowing who is using which directory helps the attacker mount statistical partitioning attacks on the network based on client knowledge. (If users try to avoid this attack by using multiple directories, they only make their situation worse: Instead of dividing themselves into D possible states of knowledge, they divide themselves into (2^D-1) possible states depending on which directories they trust.) (4) An attacker who wants to DOS a user can do so by knocking down that user's favorite directory server. Below, I describe a synchronization protocol that allows multiple directory servers to arrive at a single directory, signed by all the directory servers. My requirements are as possible: - Central coordination between directory servers should not be an absolute requirement. In particular, the protocol should succeed even when the directory servers do not agree on who is a directory server. - The protocol should allow old directory servers to shut down, and new directory servers to join, but should not allow an attacker to control the final directory by signing up a large number of directory servers. - When the protocol succeeds, it should create a directory signed by a large quorum of directory servers. If at least half of the quorum has signed a directory, clients should be able to trust that directory. - If an attacker controls less than half of the directory servers in a quorum, the attacker should not be able to sign up an arbitrary number of nodes. - If at least half of the directory servers in a quorum are running, cooperative, and able to communicate, then they should produce a signed directory. - The protocol should discourage social engineering attacks aiming to fragment a quorum of directory servers. Although it is impossible to *force* only a single directory to exist (because people can always modify the software), it would be best to minimize the number of fragmented directories. - If a directory server do not obey the protocol, other servers should be able to prove it. - Parties other than directory servers should be able to tell whether the directory servers are obeying the protocol. Broadly, the protocol works like this: - Every mix periodically uploads its server descriptors to all of the directory servers it knows about. - Every directory server continuously publishes its knowledge of the network (for consumption by other directory servers). - Every directory server periodically downloads from other directories it knows about their knowledge of the network, and updates the list of mixes it knows about. - Once a day, directory servers produce a signed directory as follows: - Every directory server publishes a signed statement describing its current view of the network state, and the list of other directory servers it would like to vote with. - The directory servers retrieve each others' signed statements. - Based on these signed statements, each directory server finds a quorum to vote with. All cooperative servers will agree about who should be in the same quorum with them. - Based on the signed statements of the other directory servers in the quorum, each directory server computes a reconciled directory. This process is deterministic, so that all cooperative directory servers in the same quorum will compute the same reconciled directory. - Each directory server signs its reconciled directory, and exchanges these signatures with the other members of its quorum. - At this point, every quorum has computed a uniform directory signed by every cooperative member of the quorum. The directory servers publish this signed directory. Below, I discuss this protocol in more detail. 1. Terminology Mix -- a Type III remailer server. Reliable mix -- A mix is 'reliable' if it implements the Type III remailer protocol correctly, by correctly processing and delivering the messages it receives. Honest mix -- A mix is 'honest' if it tries to be reliable, and it does not attempt to attack the network or break users' anonymity. Credible mix -- A mix is 'credible' to a directory server, if that directory server thinks the mix is probably honest, so far as it knows. Pinger -- a source of liveness and reliability information about mixes. Directory server -- a host that follows the protocol described in this document. Cooperative server -- A directory server is 'cooperative' if it follows the agreement protocol correctly. (It may still be attempting to subvert the _contents_ of the final directory by lying about network state.) Honest server -- A directory server is 'honest' if it is cooperative, and it attempts to make accurate statements about network state. Publication -- In order to 'publish' a file, a directory server makes that file available at a well-known HTTP URL. Network view -- A directory server's knowledge and opinions about mixes and other directory servers. (More information is in 2.2 below.) 2. The protocol in more detail 2.1. Uploading server descriptors to directories Mixes upload their server descriptors to directory servers as before. The only change is that every Mix now knows about a number of directory servers, and uploads every descriptor it generates to each one of them. 2.2. Pinging Pinging is outside the scope of this document, except as follows: Every directory server acts as a pinger in order to obtain reliability information about all the mixes it can. Additionally, the directory administrator may decide to incorporate liveness information from other pingers. The directory distills this information down to a single "is-reliable" bit for each mix. 2.3. What directory servers know Every directory tracks the following information about every mix: - The mix's identity key. - A list of all valid, non-superseded server descriptors it knows for the mix. - Whether it considers the mix reliable (based on pinging). - Whether it considers the mix 'credible'. (A mix is 'credible' if the directory server thinks it is honest. Different directory servers may have different criteria here: Some directory servers will consider new mixes as automatically 'credible' unless they have evidence of their dishonesty; others will consider new mixes as 'non-credible' at first, and list them as 'credible' after a probationary period.) Directory servers should publish the criteria they use to judge credibility. [Rationale: we add 'credibility' for two reasons: 1) Adding a manual step before we include a new mix in the directory prevents pseudospoofing attacks. (If an attacker could sign up a hundred reliable mixes and have them all included in the directory automatically, the attacker would trivially be able to break user anonymity. 2) This protocol does not provide a way for directory server to 'unlist' dishonest mixes---if any server knows about a mix, the final signed directory will include it. In this protocol, instead of flushing a bad mix down the memory hole, you just mark it as 'noncredible'.] Every directory tracks the following information about every other directory server: - The directory server's identity key and URLs. - Whether it uses that directory server's pinging results in determining mix reliability. - Whether it believes that directory server should be part of a voting quorum. All of the above information, taken together, forms a directory's "Network View". 2.3. Directory agreement protocol (All times in this section are completely arbitrary, and given in GMT. This protocol assumes loosely synchronized clocks.) Phase 0: Propagation Every hour or so, every directory server publishes a signed, non-binding statement of its Network View (as defined in 2.2 above). Directory servers periodically retrieve one another's Network Views in order to learn about new mixes, new directory servers, and new server descriptors. Phase 1: Declarations (22:00) Once a day, at 22:00, every directory server publishes (at some well known URL) a signed Declaration. A Declaration includes: 1) The date 2) The directory's current World View. Phase 2: Declarations are collected (22:00-22:50) Between 22:00 and 22:50, every directory server retrieves every other directory server's Declaration. The directory server then re-publishes those declarations at local URLs, along with signed statements averring that the directory server has received them. Even after retrieving one another's declarations, directory servers continue to retrieve the re-published declarations, in order to assure no directory server has given different declarations to different servers. Phase 3: Agreement (23:00) At 23:00, every directory server calculates what quorum it is in, calculates how that quorum will vote, and signs the result of the vote. This predicted-vote-result is called a 'pre-directory'. Because this process is deterministic, every cooperative directory in the same quorum will reach the same result. (The details of this process are described in 2.4 and 2.5 below.) For auditing purposes: Along with their pre-directories, directory servers also publish a signed copy of the declarations they have used to compute the pre-directory. Phase 4: Pre-directories are collected (23:00-23:50) Directories collect one another's signed pre-directories as in Phase 2. Phase 5: A directory is published (00:00) Every directory collects the pre-directories from step 4 from the members of its quorum. If they have cooperated, every directory server in the quorum will have signed the same pre-directory. This pre-directory, plus the signatures of each member of the quorum, forms the final directory. Phase 6: Clients download the directory (00:00--23:59) The first time in any day that a client originates a message, it downloads the most recent directory from some source that it trusts. If it signed by over half of the expected directory servers, it uses that directory. Otherwise, it alerts the user. (See section 3 below.) 2.4. How to compute quorums In phase 3 of the directory agreement protocol, directory servers need to compute which quorum they will vote with. This process needs to be deterministic, so that every cooperating directory server in a quorum arrives at the same quorum. [[[Remember: the end goal is for there to be a single most trustworthy quorum that all clients use. This quorum-building algorithm is only designed to minimize the number of competing quorums when we can't come up with a single most trustworthy one; and to discourage directory severs from fragmenting the quorum. ]]] To choose a quorum, directory servers iteratively calculate the "best" quorum of directory servers, and see if they belong to it. If they do, they use that quorum. Otherwise, they remove the servers in that quorum from consideration, and iterate. (Below, we use "A trusts B" to mean "A believes B should be a part of a voting quorum" in the interest of brevity.) To find the "best" quorum from a set of directory servers, we use the following algorithm: 1. Find the largest set of mutually trusting directory servers. (A set of servers is 'mutually trusting' if every server in that set trusts every other server in that set.) If there is a unique largest set of mutually trusting directory servers, that set is best. Otherwise, if we have a tie, proceed to step 2. 2. Let N be the size of the tied sets under consideration. If N>=5, we compute a 'friendliness' relation as follows: Set A is "friendly" to set B if A and B share at least N/2 elements in common, and the number of pairs such that A_i trust B_j is greater than the number of pairs such B_i trusts A_j. (Note that 'friendliness' can be cyclical, but cannot be symmetric. That is, we can have A-friendly-to-B-friendly- to-C, but we cannot have A-friendly-to-B-friendly-to-A.) We then compute the 'index' of each set by taking the SHA1 hash of the sorted public keys of all the directory servers in the hash. We compute a 'transitive friendliness' relationship as follows: First, we break cycles in the friendliness graph, from largest cycle to smallest (breaking cycle size ties by highest element index), by removing the 'is-friendlier-than' relation from the smallest-indexed set in the friendliness graph. Secondly, we take the transitive closure of the resulting graph. Finally, compute the best set as follows: start with "S_0", the set with the highest index. If there are one or more sets that are "transitively friendlier" than S_0, choose the one of those sets with the highest index. [Discussion: The graph theory here is kinda funky. The idea behind 'friendliness' is to prefer quorum-builders to quorum-breakers, so that if we start with a quorum containing Dirserv-1 through Dirserv-N, and then Dirserv-1 stops trusting Dirserv-2, Dirserv-1 is booted from the next day's quorum. Also note that the idea is not to ensure that the "best" quorum is the most trustworthy, but that the "best" quorum to which an honest server belongs is the most trustworthy.] 2.5 How to compute a pre-directory A 'pre-directory' is the signed portion of a signed directory. It contains: - a list of server descriptors for all known mixes - a list of directory servers in a voting quorum - and a list of which mixes are recommended To compute a pre-directory, a directory server first determines what quorum it will vote with, as described in 2.4 above. It then takes the Declarations from those directories, and its own Declaration, and computes the pre-directory as follows: - It chooses the most recently published live non-superseded server descriptors for each mix (as in "dir-spec.txt"). - It lists mixes as "recommended" iff they are considered reliable by enough members of the quorum and they are considered credible by enough members of the quorum. [If the quorum has an odd number N members, "enough" is FLOOR(N/2)+1.] 3. Client behavior Every client comes pre-installed with a list of directory servers. This list should be identical with the servers in whatever long-standing quorum the software maintainer judges the most trustworthy and reliable. If any user Alice trusts a different set of directory servers, she can configure her client manually to use those directory servers instead. If the client can retrieve a single directory signed by a majority of its trusted directory servers, it uses that directory. Otherwise, the client software alerts the user, and has the user choose. 4. If stuff goes wrong Here we examine opportunities for the protocol to go wrong, and see how bad stuff gets. Below, for "an attacker", read "an attacker, a malfunctioning directory server, a network outage, or Murphy's Law." Phase 1. Declarations - An attacker might publish a dishonest declaration. [If an attacker controls less than half of the quorum to which it belongs, it cannot force mixes on to or off of the final list. If the attacker tries to break the quorum building process, it cannot force honest servers to join a mostly dishonest quorum, unless the honest servers trust enough dishonest directory servers.] - An attacker might not publish a declaration. [If the attacker does this, it will not be part of any quorum.] Phase 2. Declarations are collected - An attacker might give different declarations to different directory servers. [Since all declarations include a date and are signed, any two cooperative servers will have proof that they have received different declarations as soon as one has received the other's re-published version of the uncooperative directory's declaration.] - An attacker might not give its declarations to some servers. [So long as any cooperative directory server receives a copy, the other directory servers will eventually retrieve a copy from that server.] - An attacker might not give its declaration to any servers. [If the attacker does this, it will not be part of any quorum.] Phase 3. Agreement - An attacker might pretend not to have received certain declarations. [If most directory servers have received and re-published a server's declarations, any attacker who does this may not be believed. If the attacker is included in a quorum, the other members of the quorum will all sign the same pre-directory, but will not include the attacker's signature.] - An attacker might compute the pre-directory incorrectly. [If the attacker does this, the cooperative members of the quorum will all sign the same pre-directory, while the attacker's signature of a different pre-directory will not be included.] - An attacker might not publish a pre-directory. [If the attacker does this, the cooperative members of the quorum will all sign the same pre-directory, while the attacker's non-formed signature will not be included.] Phase 4. Pre-directories are collected - As in phase 2 above Phase 5. A directory is published - As in phase 3 above