FuzzyTags

I spent a lot of time this last couple of months playing with FuzzyTags based on

the FMD2 scheme described in “Fuzzy Message Detection”, published by Gabrielle Beck, Julia Len, Ian Miers, Matthew Green..

Much of the below has been archived from a newsletter we put out at Open Privacy

Anonymous messaging systems (and other privacy-preserving applications) often require a mechanism for one party to learn that another party has messaged them.

Many schemes rely on a bandwidth-intensive “download everything and attempt-decryption” approach. Others rely on a trusted 3rd party, or non-collusion assumptions, to provide a “private” service.

It would be awesome if we could get an untrusted, adversarial server to do some of that work for us without compromising

metadata-resistance.

Recently a paper called “Fuzzy Message Detection” was published by Gabrielle Beck, Julia Len, Ian Miers, Matthew Green.

They presented a number of novel “fuzzy message detection” schemes.

These schemes “allow a recipient to derive a specialized message detection key that can identify correct messages, while also incorrectly identifying non-matching messages with a specific and chosen false positive rate p. This allows recipients to outsource detection work to an untrustworthy server, without revealing precisely which messages belong

to the receiver”.

This is really cool! Traditionally we’ve only been able to narrow down bandwidth requirements by “bucketing” recipients - this comes at the cost of partitioning the anonymity set for a round. Fuzzy message detection schemes allow us to keep one large anonymity set, and don’t require senders to know anything about the false positive rate of the receiver (or derive any kind of special bucket to put messages in).

Introducing Fuzzytags

To kickstart our investigations we released fuzzytags an experimental probabilistic

cryptographic tagging structure based on the FMD2 scheme described in Fuzzy Message Detection paper, written in Rust.

Having a concrete open source implementation has forced us to start exploring what a secure, misuse-resistant API for fuzzy message detection might look like, in addition to various integration concerns like serialization of fuzzytags and keys.

A Brief Introduction

With the help of several volunteers we have arrived at some specific terminology for the various parts that make up the fuzzytag system. Some of this terminology differs from the original paper, so we will provide a brief introduction here:

Every party generates a RootSecret, from which they can derive a DetectionKey and a TaggingKey. These keys will be generated with a parameter γ that relates to the minimum false-positive probability 2^-γ.

When submitting messages to the server for an intended recipient, the sender will generate a new Tag from the recipients TaggingKey.

All parties will extract a DetectionKey from their key pair. This key will be of length n and provide a false positive detection probability of 0 <= 2^-n <= 2^-γ. This DetectionKey can be given to an adversarial server.

When fetching new messages from the adversarial server, the server first runs a test of the Tag of the message against the parties’ detection key. If the Tag passes the test, the message (along with the tag) is provided to the recipient.

Simulations, Choosing Parameters, and Breaking Fuzzytags

The privacy and security properties of fuzzytags is highly dependent on selecting a false positive rate p and scheme

constant γ for your system. There is no one-size-fits-all approach.

If p is too low, then the probability of false positives for a given party will be very high.

If p is too high, then an adversarial server will be able to link messages to recipients with probability approaching 1.

Likewise a large γ means higher bandwidth costs, but a small γ reveals more of the root secret to the server while also increasing the change of perfect (but false) matches across all parties.

This is why we are releasing fuzzytags-sim, a simulator for fuzzytags deployments.

We built fuzzytags-sim to better understand how different parties choosing different false positive rates impact the overall anonymity of the system, and explore intersection and statistical attacks against various deployment scenarios.

fuzzytags-sim can play through real datasets or draw events from a distribution. Through experimentation, we’ve been able to provide a few high level guidelines on deploying fuzzytags securely.

Entangling Tags

We’ve also been investigating some novel properties of fuzzytags that we have dubbed “entangling”. Through search it is possible to construct special kinds of tags that can match to more than one detection key. These tags have a number of possible applications, including:

We believe that these strategies are likely essential to mitigating several forms of statistical analysis on fuzzytag deployment that are possible when parties choose non-optimal false positive rates. We have start collecting notes about these entangling strategies and simulating these strategies in our simulator against realistic datasets.

The Future of Fuzzytags and Integrations with Cwtch

We are still exploring the security bounds of fuzzytags. In particular we believe that with the right combination

of entangling strategies it will be possible to significantly cut down the bandwidth involved in using Cwtch servers

for offline message delivery. This would greatly increase the robustness of the platform.

One possible deployment approach that we are currently investigating is to have a one or more known popular services

using fuzzytags to receive messages and for all parties to entangle their messages to both the intended recipient

and a popular service.

This approach provides a number of distinct benefits:

The deniability properties still need to be explored through simulation and analysis - we will continue to update the

fuzzytags crates, documentation, and notebooks with progress in these areas.

More Information

For more information about fuzzytags, or to get involved, check out these resources: