Mixminion/Type III Remailer FAQ

This document contains frequently asked questions about the Type III remailer protocol, and about Mixminion, the reference implementation of that protocol. This is not a FAQ about remailers or anonymity in general.

Nick Mathewson maintains this document; email nickm@freehaven.net with comments or suggestions. All entries are by me unless otherwise noted.

1. General

Mixminion? Type III? What are you talking about?

"Type III" is the name for an improved anonymous remailer design. It was designed to address known flaws in earlier deployed remailers, and to resist all known anonymity-breaking attacks as well as possible.

"Mixminion" is the name for the project that designed the type III protocol, and is also the name for the reference implementation of the protocol. (The protocol used to be named "Mixminion" too, but when the Mixmaster team decided to use the "Mixminion" protocol as the basis for Mixmaster version 4, we changed the name of the protocol to be more project neutral.)

What exactly do you mean by "anonymous"?

When you communicate on the Internet, even when you use encryption, anybody receiving or intercepting your communication can tell which addresses are talking to which addresses. By "anonymous communication system", we mean a system in which it is difficult for anybody—even the people communicating with each other—to tell who is talking to whom.

How difficult? Mix-net designs attempt to prevent an adversary from learning who is talking to whom, even if the adversary:

Can observe all traffic on the entire network at all times.
Controls some fraction of the servers on the network.
Can alter traffic on the network.
Can insert traffic into the network.

Mix-nets also try to provide unlinkability: that is, to prevent an adversary from learning that two messages were sent by the same person.

(This usage isn't universal. Some projects use "anonymous" to describe any system that doesn't record your name. This kind of "anonymity" is easy to achieve, and we won't mention it any further.)

Back up—what's a remailer? A mix-net?

Back in 1981, David Chaum wrote a paper describing an anonymity server called a "mix." In Chaum's design, users encrypt messages to a mix's public key, and send those messages to the mix. The mix delays and re-orders the messages in order to prevent an eavesdropper from correlating messages entering and exiting. When it's time to deliver a message, the mix decrypts it, removes any information identifying its sender, and relays it to its recipient.

In order to prevent a single malicious mix from revealing the sender's identity, senders route each of their messages through a sequence (or "chain") of mixes. This way, if even a single mix in the sender's chain is honest, that mix prevents an attacker from linking the sender to the recipient. The complete set of intercommunicating mixes is called a "mix-net."

The term 'remailer' comes with a more application-oriented pedigree. It originally referred to anonymity servers that accepted messages via email and delivered them via email, without necessarily supporting encryption, batching, or chaining. Later generations of remailer added these features, and thus have become true mixes.

(You might argue that since Type III doesn't use email for its underlying mix-to-mix communications, it shouldn't really be called a 'remailer' protocol. On the other hand, since 'mail' is an abstract concept, you might argue that Type III's store-and-forward design is close enough to email as to make it a 'remailer'. Finally, you might decide that 'remailer' has become the generic term for 'high-latency anonymity server,' thus making Type III a remailer regardless.)

In theory, how much anonymity does Type III give me?

That depends on how paranoid you are. The Type III mix design is almost certainly secure for most kinds of users, against most adversaries. It is in many respects more secure than currently deployed mix-nets.

In my opinion, which could well be wrong, users of Type III mixes will probably be secure against anybody without the resources of a major telecom or government. People who only use the system occasionally are (I think) probably secure against telecoms and governments too, but that's harder to quantify.

On the other hand, you should be aware of a certain class of long-term intersection attacks that are effective against all currently known practical mix-net designs: if you send enough traffic to the same people over a long enough time, and if an adversary with a little mathematical ability is eavesdropping on you and your recipients, that adversary can statistically deduce whom you're communicating with. (These attacks require a lot of traffic—on the order of hundreds of messages.) Quantifying the severity of these attacks is an area of active research.

So how does Type III improve on other deployed remailer designs?

There are currently two widely deployed mix protocols: the Type I or "Cypherpunks" protocol, and the Type II or "Mixmaster" protocol.

Starting in 1992, Eric Hughes, Hal Finney, and others wrote the first Type I remailers. Type I uses email for a transport, and PGP for its encryption. Later implementations of the Type I protocol added new features in an attempt to mitigate some of the protocol's weaknesses, with limited success. Lance Cottrell's Type II (1995) software fixed most of Type I's insecurities, but dropped support for reply blocks.

The following table outlines the feature-differences and security differences between the three protocols. Parenthesis indicate partial support for a feature; or partial resistance to an attack. (Note: Type I is not formally specified, and exists in several versions. Claims for Type I below refer to the state of the art in Type I, such as it is.)

Feature	Supported by
Forward delivery	1	2	3
Reply blocks	1		3
(Multiple-use reply blocks)	1
(Single-use reply blocks)			3
Nymservers	1		3
Automatic directory retrieval			3
Distributed, coordinated directories			3
Automatic server key rotation			3
Loss-tolerant message fragmentation			3
Forward-secure encrypted transfer protocol			3
Attack	Resisted by
Forward message replay	(1)^a	2	3
Trace messages by size	^b	2	3
Distinguish among message formats sent by different clients		2	3
Distinguish forward packets from reply packets before delivery		n/a	3
Distinguish encrypted forward packets from reply packets		n/a	3
Reply block flooding		n/a	3
Path distinguishability based on client knowledge			3
Compromise keys later to read traffic recorded today			3
Run a directory server and mislead specific users			3
Parameter	Value
Public key length (bits)	Variable	1024	2048
Payload length^c	Variable	10K	28K
Packet length^c	Variable	20K	32K
Maximum path length	None	20	~30^d
Ciphers used	as in PGP	RSA 3DES	RSA AES

XXX What else?

Notes

a.: Some type I implementations support a XXX directive to limit the number of times a given message can be replayed. Using this directive is optional, however, and users of this directive are distinguishable from non-users. It does not solve reply-block flooding attacks.
b.: Some type I implementations support message padding in an attempt to prevent an attacker from correlating messages based on their size. This feature is of little or no security benefit, however, because: [1] The use of message padding and choice of padding regime are up to the end user, thus making senders distinguishable. (In fact, multiple padding strategies may make padded messages even more linkable than unpadded ones.) [2] Mixes can trivially distinguish added padding from message material (!!!).
c.: As used here, a "Packet" is a single piece of data transmitted across the mix-net, and a "Payload" is the portion of a packet containing the sender's data. If a user's message is larger than can fit in a single payload, it may be divided over multiple payloads.
d.: The number of a path that can fit in a Type III header depends on the length of the mixes' addresses. If all of the mixes in a path have static IPv4 addresses, the maximum path length is 34. On the other, if all the mixes have 20-character hostnames, the maximum path length is 30.

And how is Type III different from other deployed anonymity systems?

Unlike 'low-latency' anonymity systems such as Freedom, Onion Routing, Anonymizer, Crowds, Web Mixes, and the Java Anon Proxy (???), Type III mixes sacrifice latency for higher anonymity. Low-latency systems aim to provide fast enough end-to-end connectivity to support interactive web browsing (and sometimes other protocols such as SSH). This property, however, makes it fairly easy for an attacker observing or controlling both ends of the communication to correlate the timing of data sent along the channel.

Unlike 'single-hop' anonymity systems like Anonymizer, various anonymous web-email forms, and the now-defunct penet.fi, mix-nets route messages through a chain of relays. Using a single-hop system depends on a single trusted anonymity provider to keep you anonymous. With a chain of mixes, if any mix in the chain is honest, the connection between sender and recipient remains hidden.

XXX what else?

And how is Type III different from research/theoretical anonymity systems?

(The systems in this section are listed as "research" or "theoretical" because they are better known through the research literature than via any widely deployed implementation.)

Gülcü and Tsudik's Babel mix design (1996) addresses Type I's size-correlation and PGP issues, while adding multiple-use reply blocks. Unlike Babel, Type III's reply blocks are single-use, and are not vulnerable to end-to-end replay attacks. Also, in Babel, non-exit mixes can distinguish forward messages from reply messages, whereas in Type III they cannot. Babel has a neat feature in which mixes can insert inter-mix detours into the reply path. A limited version of this technique is possible in Type III, but cannot extend beyond adding single-mix detours.

[XXX Mention stop-and-go mixes, flash mixes, DC nets, and so on.]

How can I learn more?

We've got a design paper [PDF|PS] explaining the overall Type III design. ("Mixminion: Design of a Type III Anonymous Remailer Protocol" by Danezis, Dingledine, and Mathewson.)
We have byte-level specifications for the protocol. See: minion-spec.txt for information about the packet format and mix-to-mix transfer protocol, E2E-spec.txt for information about end-to-end message encoding and delivery, and dir-spec.txt for information about server descriptors and directory capabilities.
The specifications are not yet finalized. See spec-issues.txt for a list of open issues. There's also a draft design for directory agreement, and a draft nymserver specification.
If you want to learn more about anonymity, see the Freehaven.net anonymity bibliography.

How can I get involved?

Check out the Mixminion website for information on joining the mixminion-dev mailing list.

2. Design

This section answers questions related to the design of the type III protocol.

What's a SURB? Why doesn't Type III have multiple-use reply blocks?

"SURB" stands for "single-use reply block": In Type III, reply blocks can only be used once.

All multiple-use reply block designs that we're aware of suffer from a common problem: anybody who gets ahold of a MURB can use it to send an arbitrary pattern of traffic to the recipient. An eavesdropper or compromised exit remailer can use this property to trace a MURB's recipients with end-to-end traffic analysis.

Some MURB designs (such as Type I) have additional vulnerabilities to replay attacks and flooding attacks.

Why don't you just use SMTP with TLS for mix-to-mix transfer?

Older remailer designs use SMTP not only for delivering messages to their recipients, but also for transferring packets from one remailer to the next. Type III, however, uses MMTP (a simple TLS-based protocol) when transferring packets to or from remailers. This has several advantages, including:

Email has become complicated. Misconfigured MTAs, spam filters, and virus scanners have been known to block or corrupt Type I/II packets as they travel between one server and the next. Such bugs are often hard to diagnose and address, since the offending systems are typically operated by ISPs out of the remops' control. Even when these systems relay Type I/II packets correctly, they often keep logs and backups that can help an adversary analyze the mix network.
MMTP is always encrypted, authenticated, and forward-secure. SMTP is not necessarily so. For SMTP to be encrypted, Type I/II remailer operators need to install and configure a TLS-capable MTA such as Postfix/TLS; for authentication, remops must exchange certificates manually; and for forward secrecy, they must configure their MTA's TLS module to only use forward-secure ciphersuites. Not all remops have done these steps. (As of this writing, it seems that roughly half of the Type I/II remailers have TLS support, and that roughly three fourths of those are using forward-secure cypersuites.)
Because Type III remailers use MMTP, they are free to set their own policies for access control, for retrying failed deliveries, and so on, independently of their policies for mail delivery.
Because email is not always 8-bit clean, packets must be encoded before transmitting them as email messages. This adds a non-negligible (typically about 35%) bandwidth overhead.

There are, however, a few drawbacks to not using SMTP for relaying messages. We believe that they are outweighed by the benefits of a uniform, integrated, encrypted protocol, but we list them for completeness:

Some people have argued that using encrypted SMTP helps remailer traffic blend in with non-remailer traffic, if a remailer's MTA also handles regular email. This is probably of less benefit than it might seem at first: since type III packets are fixed-size, an attacker can probably distinguish them from regular email by size, even if encryption is used.
MMTP adds complexity to every Type III client and server; if we just used SMTP, our code would be simpler and smaller. (Currently, the MMTP implementation and the store-and-forward together take about 9% of the Mixminion codebase.)

Why do you use AES128 and SHA1? Ferguson and Schneier's Practical Cryptography says...

The relevant quote (from p. 65) is:

A 128-bit key would be great except for one problem: collision attacks. Time and time again we find systems that can be attacked by a birthday attack or a meet-in-the-middle attack. We know these attacks exist. Sometimes designers just ignore them, and sometimes they think they are safe but somebody finds a new, clever way of using them. Most block cipher modes allow meet-in-the-middle attacks of some form. We've had enough of this race, so here is one of our design rules.

Design rule 3. For a security level of n bits, every cryptographic value should be at least 2n bits long.

Because of this principle, Ferguson and Schneier recommend AES-256 as a block cipher and double-SHA-256 as a digest function. The current type III specification, on the other hand, calls for 128-bit AES and 160-bit SHA1: if there were a way to exploit birthday or meet-in-the-middle attacks against our protocol, then instead of 128 bits of security, we would only have 64 or 80.

(Collision attacks exploit the property that "If an element can take on N different values, then you can expect the first collision after choosing about SQRT(N) random elements." Thus, one only needs to perform about 2^80 SHA-1 operations to find two different strings that hash to the same value.)

We do not now believe that there are any useful collision attacks against the Type III protocol. Here's why:

First, we can exclude attacks where an attacker waits for collisions in AES128 keys or SHA1 digests: to do so, the attacker would need at least 37 zettabytes (!) of storage to store just a single-leg 2K header for each message. Thus, the attacker's best hope is to find collisions on their own and then somehow exploit them. But even if an attacker does know a set of collisions, they could at worst use them to send packets of their own which would interfere with each other, not to compromise others' anonymity.

F&S also recommend double-SHA (that is, SHA(SHA(x))) as a digest algorithm, since SHA has the regrettable property that SHA(x|y) can be deduced from SHA(x) and y. We use variable-length SHA in 6 places, all of which we believe to be safe:

The OAEP padding we use with RSA uses plain SHA—but OAEP has been well analyzed by many people, and is believed to be safe.
The SSL ciphersuites we recommend use plain SHA—but again, SSL is well-analyzed, and it is believed to use SHA safely.
In MMTP and in end-to-end message encoding, we send each "msg" along with SHA(msg) to check whether the message has been corrupted—but in these cases, we are only concerned with accidental corruption, and have other approaches to prevent deliberate modification.
Our keystore implementation uses SHA(salt|password|salt) as an AES key to encrypt the actual stored secret keys on the disk. But this value is secret (since the password is secret), so the SHA vulnerability has no application.
The LIONESS variable-length block cipher uses SHA as part of its Feistel-like structure. But again, the SHA-extending attack doesn't allow an attacker to break LIONESS.
Directories and server descriptors sign SHA digests. But in this case there is no problem with the attacker computing a digest for a longer descriptor/directory, so long as he can't forge an RSA signature.

So there doesn't seem to be any need to move to AES-256 and double-SHA-256 right now.

Why doesn't Type III have (insert feature here)?

(XXXX list a bunch of features.)

Is the design paper still accurate? What's changed since then?

Changes since the publication of the original design paper are:

We no longer perform session key rotation on MMTP connections. MMTP connections are typically so short-lived that it doesn't impose any significant overhead to close and re-open any connections that stay open for a long time.
The RSA key length has increased from 1024 bits to 2048 bits, which led us to change the way we pack subheaders into headers.

Features in the specification not described in the original design paper are:

End-to-end encoding to make corrupted payloads, reply payloads, and encrypted forward payload indistinguishable from one another except by their recipients. (XXXX this wasn't in the paper, was it?)

Features described in the original paper but not yet fully specified are:

Incoming email gateways.
Automatic opt-out.
Coordination between multiple directories. (But see dir-agreement.txt.)
Nymservers. (But see nym-spec.txt.)

3. Implementation

Where's the code? Does it work?

You can always find a link to the latest release of the code at http://mixminion.net/.

If you want to access the CVS repository, there's a regularly updated sandbox with anonymous pserver access. To use it, run:

cvs -d :pserver:guest@cvs.seul.org:/home/minion/cvsroot login
cvs -d :pserver:guest@cvs.seul.org:/home/minion/cvsroot co src doc

So far as we know, the code works fine. As of 15 December 2003, there are 27 servers running, including 11 with exit support.

Can I use Mixminion to send anonymous messages today?

It depends how anonymous you need to be. For casual use, Mixminion may meet your needs.

For the moment, however, we do not recommend using Mixminion for messages that require real anonymity. This is for the following reasons:

The code is still under development. There may be unknown bugs that could compromise your anonymity. (We do not know of any such bugs.)
In order to test the code, many servers are running in configurations that could harm your anonymity. For example, some servers are configured to log verbosely. Others are configured to use the "timed-pool" mixing algorithm rather than the more robust "timed dynamic-pool" mixing algorithm. While these configurations help us debug Mixminion, they also make it easier for an eavesdropper or a compromised server to trace your messages. The final Mixminion release will not support these configurations.
Some features that are necessary for high security, robustness, anonymity are not yet implemented. These include:
- Distributed directories. (The current centralized directory is a single point of failuure.)
- Automatic generation of dummy messages
- Built-in network reliability testing ("pinging")
There aren't enough people using Type III today. Even if the software works perfectly, you aren't hidden unless you have a large number of people to hide among.

So in summary, feel free to play with Mixminion and use it for casual anonymity, but don't count on it for strong anonymity yet.

What do I need in order to run a Mixminion client?

Right now, the requirements to build and run a Mixminion client on a Unix-like operating system are:

A Unix-like operating system. Mixminion is known to work on Linux, FreeBSD, Macintosh OS X, and Cygwin. It is believed to run on Solaris, OpenBSD, and so on, but has not been thoroughly tested on those platforms. If you find any bugs, please let us know so that we can fix them.
A working version of Python. Currently, all Python versions since 2.0 are supported. As of 15 December 2003, the stable versions of Python are: 2.0.1, 2.1.3, 2.2.3, and 2.3.2.
If you're using Solaris, be aware that some versions of Solaris come installed with mis-compiled versions of Python that don't have networking support. Don't worry—compiling your own Python installation is dead easy. Go to python.org and start from there. A basic Python installation should take 20-30 MB of disk space.
A working C compiler and a copy of Make, in order to compile the Mixminion C extensions.
Enough disk space. The installed code takes about 5 MB.
A working version of OpenSSL 0.9.7 or higher. If you don't have one, the Mixminion build process can download and compile one for you.

To run Mixminion on Windows, you'll need:

Windows 98 or later.
Optionally, Python 2.3. (If you don't have it, you can download the standalone Mixminion distribution.)
Enough disk space.

What do I need in order to run a Mixminion server?

To build and run a Mixminion server, your system must meet the requirements listed above for building and running a client. Additionally, you'll need all of the following to run a server:

A DNS entry for your server, or a fixed IP address.
Enough disk space to hold logs and pending messages. 20-30 MB should be enough for pending messages for now. If you're running with verbose logs, you'll need about 2MB per day to hold them. You'll probably want to rotate the logs regularly.
Enough bandwidth. A cable modem or DSL line seems to be plenty; a dial-up connection probably isn't. [Also, if you pay by the megabyte, be careful: the code doesn't currently try to do any bandwidth throttling or smoothing, and tends to generate large traffic spikes.]
A fairly reliable network connection. 90% uptime is probably good enough; 10% uptime probably isn't.
Possibly, an MTA (such as Sendmail or Postfix). This is only necessary if you're planning to run an exit server. If you're running in middleman mode, you don't need an MTA.
Permission. If your terms of service allow you to run a remailer, you're set. If not, you probably want to come to an arrangement with your service provider's abuse department. (On the other hand, running in middleman mode is a good away to avoid abuse complaints.)

How do I use the code?

There are instructions in each version of the release notes. (That's the file entitled "README" in the Mixminion distribution.)

Do any GUI clients support it yet?

Not yet. The only client is the built-in CLI client.

How can I add Type III support to my GUI client?

Right now, you have two options:

Use Mixminion as an external program; invoke it as a separate process, and parse the output.
If you're writing in Python, invoke the functions in the module mixminion.ClientMain directly. (Be aware that in some later version, these functions will change in order to more closely support the proposed Client API.)

The Mixmaster team currently plans for Mixmaster version 4 to include a C library to support Type III messages. This code, however, is not yet written.

Why is it written in Python? Isn't that slow?

Actually, no. Mixminion is written in a mixture of Python and C for portability, rapid development, and robustness. For more information about Python's advantages, peruse the documentation at python.org.

As for speed concerns: If Mixminion were written entirely in Python, it would indeed be slow. Fortunately, the application's performance-critical sections are almost entirely concentrated in cryptography and I/O, and by using C for these operations, we get close to ideal performance.

(For example, RSA decryption is so slow that the other operations a server performs when decrypting and processing a Mixminion packet take less than 5% of the CPU resources expended per packet. Assuming that OpenSSL's (C) implementation of RSA is already close to optimal, we would expect no more than a 5% performance improvement from rewriting packet processing in C.)

This is a general case of the 90/10 rule: programs usually spend about 90% of their time in 10% of their code. Optimizing this code is sufficient to optimize the application; further optimization gives diminishing returns. (This is just a rule of thumb, but oddly enough, it turns out that 11.3% of the code in Mixminion 0.0.5 is written in C.)

How much of the protocol is implemented?

When will it have (insert feature here)?

The TODO file in the CVS repository has a tentative schedule for features to appear in future releases.

The schedule in the TODO file is tentative—and so are all other statements that anybody makes, ever, about dates and features for future releases. This is a volunteer project, and progress depends heavily on how much spare time Nick has from week to week.

With that in mind, the historical wait between releases has been:

From first CVS commit to 0.0.1	~6.5 months
From 0.0.1 to 0.0.2	~21 days
From 0.0.2 to 0.0.3	~1.5 months
From 0.0.3 to 0.0.4	~3.75 months
From 0.0.4 to 0.0.5	~2.75 months
From 0.0.5 to 0.0.6	~3 months

Nick says that he's trying to shoot for a two month release cycle, but this may be too ambitious.

How do I report a bug in the code?

Preferably, go to the Mixminion bugzilla page at bugs.noreply.org.

If this isn't feasible, send email to the list at mixminion-dev@freehaven.net, or to Nick Mathewson at nickm@freehaven.net.

When reporting a bug, please include all of the following:

What version of Mixminion you're using.
What version of Python you're using.
What operating system you're using. (Include OS and version).
A transcript of the interaction between you and the program.
A description of how the program's behavior was different from what you expected, if the expected behavior isn't completely obvious.
If you get an error message with a stack trace, the complete stack trace.
If you're reporting a bug in the server, a copy of your server's configuration, and the last few hundred lines of your server's log before the error.

Who else is working on Type III implementations?

The Mixmaster team plans to include Type III support in Mixmaster version 4.

Others have expressed interest in writing Type III client libraries in C, but no announcements have been made and code has yet been publicly released.

Is there a backdoor in the Mixminion code? Could there be?

There is no backdoor in the Mixminion code. The code is publicly available, so anybody who reads Python can check for themselves.

I will never willingly add a backdoor in the Mixminion code. (Exceptions: Any testing infrastructure that endangers anonymity will be labeled as such, and will be removed before the first beta release. The code will cause any testing servers running in non-anonymous configurations to advertise themselves as such.)

Of course, if I'm legally compelled to add a backdoor, I probably won't be allowed to tell you about it. Continue to review the diffs between releases, and don't trust any release that doesn't come with source.

Also, you should probably get worried if this question disappears from the FAQ. :)
—Nick