Introducing Yakker: an open, secure and distributed alternative to WhatsApp

Did I get your attention with the title? Good. In this post I will outline something that I have been thinking about for some months:

I believe it is possible to build a system with the simplicity and functionality of WhatsApp or Viber, which provides end-to-end encryption, is built on free software and open protocols, that supports federation and is almost decentralised, and that would allow interested companies to turn a profit without compromising any of these principles.

Introduction

This is the first post of a series that I will be publishing in the next few days. Many parts of these posts will be technical, but I expect that the main concepts can be understood by a wider audience.

What I am proposing seems like a bold statement, I know. Maybe there is some fatal flaw somewhere in my thinking, and that is why I am publishing this: I hope to get constructive feedback, and maybe get enough traction to start implementing it Real Soon Now™.

I have been thinking about this problem since February, when I discussed it extensively with friends at FOSDEM. I have already published a critique of Telegram, which had way more impact than I ever imagined, showing that there are people out there interested in this kind of stuff. The last posts about DNSSEC and DANE were part of my musings about this, too.

There are many components that need to be built for this to happen. But more importantly, this can only be useful if it gains a critical mass. And that's why I think making this a viable business tool is very important. At the same time, that means I need to think extra carefully to make it impossible for any for-profit company to mutate this effort into Just Another Walled Garden.

My goals for this architecture are:

  • First and foremost, target the same people who nowadays are using a plethora of walled gardens for their instant communication needs. That is WhatsApp, Viber, Skype, Facebook Messenger, etc.
  • It must focus on mobile, which is what people care about, without forgetting about other use cases.
  • Creating an account and placing the first call/text message should be as easy as it is currently with the competition.
  • All communication must be encrypted end-to-end with public-key cryptography; nobody but the user has access to the private keys.
  • Most components must be decentralised, and allow for competition.
  • There should be as little trust as possible placed on any part of the system.
  • Anybody can set up a compatible service provider and offer it to its users, while having full interoperability with other providers.
  • Compatible services which are not part of the network must be able to interoperate.
  • Contacting a person should work even if they are not subscribed to the service. The client application must fall back seamlessly to using interoperability gateways, PSTN termination, or the mobile network.
  • Interoperability with the competition is desirable, but is possibly better left to be implemented by the client applications.

Components

I have identified a few components needed for this to work. I will expand on each one later.

  1. A flagship mobile application for Android and iOS, based on Lumicall or CSipSimple, but with several important modifications.
  2. One or many directory and authentication services, based on ENUM and DNSSEC. These are the most critical part of this idea, and should possibly be operated only by community-governed non-profits.
  3. One or many service providers, that offer simple account creation, registration, and optionally PSTN termination (which can be the main way of generating profit). An API needs to be defined for operations that are not part of the communications protocol, like account creation, credit purchasing, and balance querying.
  4. A network governing charter, and a trusted non-profit organisation that ensures that all participating parties follow the charter. This organisation defines which directory services are to be trusted (and possibly operates one of them), and which service providers the client application can use to create accounts.

Key points

Some of the issues that need to be solved are:

  • How to handle and distribute public keys securely without the user understanding anything about security.
  • How to make registration painless and password-free, while offering an acceptable level of security.
  • How to fund development of the client application, and maintenance of the directory services.
  • How to get companies interested in this, so that they would bring users to the network.
  • How to allow the user to migrate from one service provider to another, to improve competition.
  • How to prevent any party from subverting the spirit of the network.
  • How to make the client application work everywhere and have reasonable quality.

To be continued

I think I have answers to most of these problems. I will elaborate in the next few days, stay tuned! :-)

Footnotes

I chose the name in less than 2 minutes, while starting this post, so it is probably awful.

The distributed part is only half true, as the directory services need to be centralised, but I think it's good enough.

I am aware that Lumicall seems to be trying to build something similar. I only found out about that recently, when I was thinking about this design. Sadly, I think it has several shortcomings, but it is definitely one of the building blocks of this project.

Yakker, part 2: the directory service

This is the second post in a series of posts describing a secure alternative to applications like WhatsApp. I started with the following statement:

I believe it is possible to build a system with the simplicity and functionality of WhatsApp or Viber, which provides end-to-end encryption, is built on free software and open protocols, that supports federation and is almost decentralised, and that would allow interested companies to turn a profit without compromising any of these principles.

In this post, I will outline the concepts behind the most critical component of the architecture: the directory service.

Outline

I say this is the most critical component because here lies what makes this architecture different, but also because this is the weakest link in the whole idea. I expect criticism, especially on some security trade-offs, and I hope that people who know better than me can help me improve it.

The directory service is what allows the users to register easily, without using passwords, to be able to receive calls and messages from other users, even if they are not part of the network, and to do all this with a reasonable security model.

Let's get technical

The directory service is basically a DNSSEC-protected DNS zone serving ENUM records, along with public keys associated with each user identifier.
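
To make the ENUM part concrete, here is a minimal sketch of how a phone number maps to the DNS name that holds its records, following the digit-reversal rule of RFC 6116. The function name and the default zone are illustrative assumptions; a Yakker directory would presumably use its own zone rather than e164.arpa.

```python
def enum_domain(phone_number: str, zone: str = "e164.arpa") -> str:
    """Return the DNS name under which ENUM records for a number live.

    The zone is an assumption for illustration; any DNSSEC-signed zone
    run by a directory service could be passed in instead.
    """
    # Keep only the digits: "+", spaces and dashes are not part of the name.
    digits = [c for c in phone_number if c in "0123456789"]
    if not digits:
        raise ValueError("phone number contains no digits")
    # RFC 6116: reverse the digits, separate them with dots, append the zone.
    return ".".join(reversed(digits)) + "." + zone.strip(".")


print(enum_domain("+44 20 7946 0123"))
# prints 3.2.1.0.6.4.9.7.0.2.4.4.e164.arpa
```

The client would then ask a validating resolver for the NAPTR records (and the key material) stored under that name.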

A TLS-enabled API will enable account creation and validation, DNS records publishing, and encrypted records querying.

Applications not supporting that API use standard ENUM querying, and the user manually uses traditional web-based methods for account and records management. This allows interoperability with any existing clients and SIP services.

The service will authenticate users by the usual method of proving ownership of a phone number: sending an encrypted SMS with a secret that the client application then uses to claim the phone number.

This same method can be used to validate ownership of identifiers that are not phone numbers, like pre-existing SIP or email addresses.
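
A rough sketch of the challenge logic on the directory side could look like this. Everything here is hypothetical: the class name, the five-minute validity window, and the choice to make each challenge single-use are assumptions, and actually delivering the secret (by SMS, email, or a SIP message) is left out.

```python
import hmac
import secrets
import time

CHALLENGE_TTL_SECONDS = 300  # assumed validity window, not from any spec


class ChallengeStore:
    """Keeps one pending ownership challenge per identifier (hypothetical)."""

    def __init__(self, now=time.monotonic):
        self._pending = {}
        self._now = now

    def issue(self, identifier: str) -> str:
        """Create a fresh secret; the caller sends it to the identifier."""
        secret = secrets.token_urlsafe(16)
        self._pending[identifier] = (secret, self._now() + CHALLENGE_TTL_SECONDS)
        return secret

    def claim(self, identifier: str, presented: str) -> bool:
        """Check a presented secret. Each challenge can be tried only once."""
        entry = self._pending.pop(identifier, None)
        if entry is None:
            return False
        secret, expires_at = entry
        if self._now() > expires_at:
            return False
        # Constant-time comparison avoids leaking how many characters matched.
        return hmac.compare_digest(secret.encode(), presented.encode())
```

Dropping the challenge after a single attempt, right or wrong, is one simple way to limit guessing; a real service would also need rate limiting per identifier.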

Once the user is authenticated, the directory service will publish the user's public key and SIP address, both associated with the phone number.

When a user wants to place a call or send a message, the client uses a DNSSEC-enabled resolver to securely get the other party's public key and SIP address. The user can also perform bulk look-ups, to discover which people in their address book are already in the system.
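
As an illustration of the look-up side, this sketch picks a SIP address out of a set of ENUM NAPTR records that have already been fetched and DNSSEC-validated (the resolver itself is out of scope here). The type and function names are assumptions, and Python regular expressions stand in for the POSIX extended ones that NAPTR actually specifies.

```python
import re
from typing import Iterable, NamedTuple, Optional


class Naptr(NamedTuple):
    order: int
    preference: int
    flags: str
    service: str
    regexp: str


def apply_regexp(regexp: str, number: str) -> Optional[str]:
    """Apply a NAPTR substitution such as '!^.*$!sip:alice@example.org!'."""
    if not regexp:
        return None
    # The first character is the delimiter. Splitting on it is a
    # simplification: escaped delimiters inside the pattern are not handled.
    parts = regexp.split(regexp[0])
    if len(parts) != 4:
        return None
    _, pattern, replacement, flags = parts
    match = re.search(pattern, number, re.IGNORECASE if "i" in flags else 0)
    return match.expand(replacement) if match else None


def sip_uri_for(records: Iterable[Naptr], number: str) -> Optional[str]:
    """Return the SIP URI from the best-ranked terminal 'E2U+sip' record."""
    for rec in sorted(records, key=lambda r: (r.order, r.preference)):
        if rec.flags.lower() == "u" and rec.service.lower() == "e2u+sip":
            uri = apply_regexp(rec.regexp, number)
            if uri is not None:
                return uri
    return None
```

A bulk look-up would simply run this for every number in the address book, which is exactly why the privacy of those queries matters.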

These operations disclose a significant amount of private data, and I don't think this can be mitigated in an acceptable manner. Therefore, the directory service needs to forget the queries as soon as possible, and must not store any logs of them.

Also, DNS queries are not encrypted, and are thus vulnerable to snooping by third parties. To mitigate this, the service needs to implement an encrypted but anonymous API to perform queries, and thus offer extra privacy to clients that support the API.

The idea is to have more than one service running on different domain names, but not many: they need to keep consistency and replicate among them, and the security and privacy implications of one service not being properly implemented or administered are too big.

Therefore, there must not be more than a handful of these, they need to be properly audited, and must not be operated by any for-profit organisation.

Security

The Web of Trust is hard. End users don't like hard. Traditional PKI is prohibitively costly, and broken. OTR is good, but is still not hassle-free.

Using one of the proposed extensions to DANE, we can solve this problem: using DNSSEC, we can have each user's public key published where everybody can retrieve it securely.

This published key can then be used for end-to-end encryption and client authentication with all the components of this architecture.

Who creates the key pair? In the simple case, the client application creates the key pair, and stores it in an appropriate container in the mobile device. It then chooses one of many cooperating directory services, and has the public key published in association with the phone number.

If the private key is lost (lost your phone?), a new pair is generated and published, after passing the same checks of number ownership, and the old keys are discarded. This opens the door for some attacks, but those can be mitigated by having the client application verify the public records periodically, and by having the service require extra checks for key replacement, for example a challenge sent by email.
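
The periodic self-check could be as simple as comparing a fingerprint of the locally stored public key with whatever the directory currently publishes. This is only a sketch: the function names, the use of SHA-256, and the three status strings are assumptions, and fetching the published key through a validating resolver is not shown.

```python
import hashlib
import hmac
from typing import Optional


def fingerprint(public_key: bytes) -> str:
    """Hex SHA-256 digest of the encoded public key (encoding is assumed)."""
    return hashlib.sha256(public_key).hexdigest()


def check_own_record(local_key: bytes, published_key: Optional[bytes]) -> str:
    """Compare the key on this device with the one found in the directory."""
    if published_key is None:
        return "missing"   # record vanished: re-publish or alert the user
    if hmac.compare_digest(fingerprint(local_key), fingerprint(published_key)):
        return "ok"
    return "replaced"      # someone else published a key for this number
```

A "replaced" result is the signal that matters: it means somebody passed the ownership checks for the number, and the user should be warned right away.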

What if I want to do things my way? Perfect, you create your keys, and then use the same mechanisms to register with the identity service. Also, you use the service to tell other users to connect to your own SIP server when they want to talk to your phone number.

Does this sound like ENUM? It is ENUM, but better. Just by adding DNSSEC, a unified API, and the capability of publishing key material along with the routing information, you got yourself a reasonably secure way of distributing keys and locating users.

Interaction with other components

Once the user records are published, the SIP server can use the public key to authenticate the user, which removes the need for passwords. The same principle applies to account creation: if the user has published key material under a phone number's record, the SIP server must accept account creation requests for the same phone number, provided these are authenticated with the private key.

If the user wants to switch providers, it is as simple as creating a new account with the new provider, and then updating the ENUM record.

The client application queries the service (and possibly other ENUM providers) before placing a call or sending a message. If the peer has keys published, the client can refuse to communicate if the keys don't match, or the peer is not offering call encryption.

When public keys are not found, the client can downgrade to traditional unauthenticated encryption, or unencrypted communications.
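
The client-side policy from the last two paragraphs can be written down as a small decision function. This is a sketch under assumptions: the names are made up, keys are treated as opaque byte strings, and whether plaintext is ever allowed is modelled as a user setting that defaults to off.

```python
from enum import Enum
from typing import Optional


class Decision(Enum):
    AUTHENTICATED = "authenticated end-to-end encryption"
    UNAUTHENTICATED = "unauthenticated encryption"
    PLAINTEXT = "unencrypted"
    REFUSE = "refuse to communicate"


def decide(published_key: Optional[bytes],
           presented_key: Optional[bytes],
           peer_offers_encryption: bool,
           allow_plaintext: bool = False) -> Decision:
    """Choose how to talk to a peer, given what the directory says."""
    if published_key is not None:
        # The peer has keys in the directory: never accept less than that.
        if peer_offers_encryption and presented_key == published_key:
            return Decision.AUTHENTICATED
        return Decision.REFUSE
    # No published keys: downgrade, as a legacy SIP client would.
    if peer_offers_encryption:
        return Decision.UNAUTHENTICATED
    return Decision.PLAINTEXT if allow_plaintext else Decision.REFUSE
```

The important property is that a downgrade is only possible when the directory has no keys for the peer, so an attacker cannot force it just by stripping encryption from the call setup.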

To be continued

In the next posts, I will talk about the overseeing organisation, the SIP service providers, and the client application, and how they all fit together. Stay tuned!

Yakker, part 3: the service providers

This is the third post in a series of posts (part 1, part 2) describing a secure alternative to applications like WhatsApp. I started with the following statement:

I believe it is possible to build a system with the simplicity and functionality of WhatsApp or Viber, which provides end-to-end encryption, is built on free software and open protocols, that supports federation and is almost decentralised, and that would allow interested companies to turn a profit without compromising any of these principles.

In this post, I will discuss the SIP service providers, how to enable them to make a profit, while keeping the network open and secure.

Basics

Service providers are expected to be third parties, either for-profit or not. PSTN termination would be an optional feature, possibly only offered by for-profit services.

The services would consist of just a standard SIP service, with STUN/ICE for NAT traversal, plus an API for getting PSTN termination rates, credit balance, account creation, and for buying credit.

All authentication will be performed with public key cryptography, using the data already available in the directory service.

The idea would be that companies can make a profit by selling PSTN termination and SMS sending. In this way, users get good rates for calling/texting people who are not part of the network, and free SIP-to-SIP calls don't impose a big load on the service.

It is expected that the providers would offer some free calls, to allow the users to test the service. Public providers like Betamax already offer free calls to land lines in many countries.

To make this simple for the users, the client application would use the API to provide a unified interface to the service providers, without the need for passwords or a web browser.

To support multiple services, the overseeing authority must be in charge of compiling a list of authorised providers, which the client application would then use when creating an account. Provider selection can be made automatically, depending on commercial agreements or user location, and for power users, the option to manually choose a provider should be offered.
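
As a sketch of how the client might pick from that list, assume each entry carries the countries it serves and a flag for a commercially preferred provider. The data model and the selection order (manual choice first, then preferred providers that serve the user's country) are assumptions for illustration only.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Sequence


@dataclass(frozen=True)
class Provider:
    name: str
    countries: FrozenSet[str] = frozenset()  # empty means "serves anywhere"
    preferred: bool = False  # e.g. set by a branded client's agreement


def choose_provider(authorised: Sequence[Provider],
                    country: str,
                    manual_choice: Optional[str] = None) -> Provider:
    """Pick a provider from the list published by the overseeing authority."""
    if manual_choice is not None:
        # Power users can override the automatic selection.
        for provider in authorised:
            if provider.name == manual_choice:
                return provider
        raise ValueError(f"{manual_choice!r} is not an authorised provider")
    candidates = [p for p in authorised
                  if not p.countries or country in p.countries]
    if not candidates:
        raise LookupError(f"no authorised provider serves {country!r}")
    # sorted() is stable, so list order breaks ties among equal candidates.
    return sorted(candidates, key=lambda p: not p.preferred)[0]
```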

To keep the providers in check, they should not be able to control which directory servers are used. The client application should be careful to expose as little private data as possible to them.

Account creation

With the components described so far, we can think already of how account creation and first calls would be handled:

  1. The client application creates a public/private key pair.
  2. It requests the directory service to create an account for the public key and the phone number.
  3. The directory service uses an encrypted SMS challenge or other similar mechanism to authenticate the request.
  4. The public key is published, associated with the phone number.
  5. The client app gets a list of service providers, which includes configuration parameters and PSTN rates, and offers the user the chance to select one, or it is automatically selected.
  6. The client app then requests an account creation using the phone number as identification, while authenticating with the published key.
  7. The provider uses DNSSEC to validate the request, and an account is created.
  8. The client then creates an ENUM record in the directory service, associating the phone number with the SIP account.
  9. The client registers with the SIP service, again using the public key to authenticate, and can start making calls right away.
  10. When the user wants to make a non-free call, the application can offer to buy credit, or use the GSM network.
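
The steps above can be sketched as a toy simulation with in-memory stand-ins for the directory and the provider. All class and method names are hypothetical, keys are opaque byte strings, and real signature verification is replaced by passing along "the key the request verified against"; the point is only to show the order of operations and who checks what.

```python
import secrets


class Directory:
    """In-memory stand-in for the directory service (hypothetical API)."""

    def __init__(self):
        self._challenges = {}
        self.keys = {}   # phone number -> published public key
        self.enum = {}   # phone number -> SIP URI

    def request_account(self, number, public_key):
        # Steps 2-3: remember the request and send a secret to the number.
        secret = secrets.token_hex(8)
        self._challenges[number] = (secret, public_key)
        return secret  # stands in for the SMS delivery

    def confirm(self, number, secret):
        # Step 4: publish the key once the challenge is answered.
        pending = self._challenges.pop(number, None)
        if pending is None or pending[0] != secret:
            return False
        self.keys[number] = pending[1]
        return True

    def set_enum(self, number, sip_uri, key_used):
        # Step 8: only the holder of the published key may set the record.
        if self.keys.get(number) != key_used:
            return False
        self.enum[number] = sip_uri
        return True


class ServiceProvider:
    """In-memory stand-in for a SIP provider (hypothetical API)."""

    def __init__(self, domain, directory):
        self.domain = domain
        self.directory = directory
        self.accounts = set()

    def create_account(self, number, key_used):
        # Steps 6-7: accept only if the directory publishes the same key.
        published = self.directory.keys.get(number)
        if published is None or published != key_used:
            return None
        self.accounts.add(number)
        return f"sip:{number}@{self.domain}"


def sign_up(number, public_key, directory, provider):
    """Run steps 2 to 8 for a key pair the client has already created."""
    secret = directory.request_account(number, public_key)
    if not directory.confirm(number, secret):
        raise RuntimeError("ownership challenge failed")
    sip_uri = provider.create_account(number, public_key)
    if sip_uri is None:
        raise RuntimeError("provider rejected the request")
    if not directory.set_enum(number, sip_uri, public_key):
        raise RuntimeError("directory rejected the ENUM update")
    return sip_uri
```

Note that the provider never sees a password or the SMS secret: the only thing it trusts is what the directory publishes for the phone number.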

How is this different

As you can see, there is not much difference from the current situation. In fact, I would expect that traditional VoIP providers could become part of this network without much additional cost.

The big differences are:

  • Authentication is delegated to the directory service.
  • The provider API handles all administrative tasks, which are usually offered through web applications.
  • Account creation is 100% automated and immediate.
  • The provider must offer federated SIP service, and use encryption for all SIP transactions.
  • PSTN termination would ideally accept encrypted RTP streams, but that's probably too much to ask.

What's next

In the next post I will describe what's possibly the most challenging part of this project: the client application.

Yakker, part 4: the client application

This is the fourth post in a series of posts (part 1, part 2, part 3) describing a secure alternative to applications like WhatsApp. I started with the following statement:

I believe it is possible to build a system with the simplicity and functionality of WhatsApp or Viber, which provides end-to-end encryption, is built on free software and open protocols, that supports federation and is almost decentralised, and that would allow interested companies to turn a profit without compromising any of these principles.

Now that most of the infrastructure has been described, in this post I will talk about the user-visible part: a mobile SIP client specially tailored for this architecture.

Features

For Yakker to be successful, an application that is visually attractive and simple to use, while providing excellent call quality and stability, is critical. This idea is useless if only geeks adopt it: I want my parents to use it, I want my non-techie friends to use it. I want to tell them to switch from any of the other applications, not only because it is more secure, open and community-based, but also because it is better.

For Android, there are already some excellent free applications that could be used as a starting point: Lumicall, CSipSimple and Linphone. I haven't tried it yet, but Linphone has a port to iOS too.

Apart from the quality considerations, the following features must be added:

  • Certificate creation and proper storage in the mobile device.
  • An account creation wizard that interacts with the directory service and the account providers. Lumicall currently does something like this already, but only for their own ENUM and SIP server.
  • Proper DNSSEC validating resolver to securely get the callee's certificate and ENUM record.
  • Optionally, use the directory service API to query records instead of DNS, to enhance privacy.
  • Periodic verification of the user's own ENUM and certificate records in DNS.
  • Use of those certificates to set up the SRTP stream, instead of unauthenticated encryption. Text messages must be encrypted in the same way, but included in the SIP message. SIPS encryption alone is not enough, as the proxies and service providers can still read the messages.
  • Integration with the system's address book.
  • ENUM lookup, and SIP SRV lookup to detect phones with an associated SIP account.
  • When the called party does not have a SIP account, the client must offer to call using the PSTN gateway at the service provider, if it provides one, or to call using the GSM network. It needs to be fast and reliable, so people can use it as the default texting and calling application.
  • Provide an interface to query the account's balance and calling rates, and to buy credit. Possibly, offer this by opening a web browser, but negotiating authentication first, so the user does not need to enter a user name or password.
  • Capabilities to migrate to a different service provider.
  • A secure way to share the key pair with another trusted device.
  • A way to import a key pair created by the user manually.
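
To illustrate the fallback behaviour from the list above, here is a small sketch of how the client could rank the ways to reach a contact. The function name, the option labels, and the rule that credit must cover at least one minute at the provider's rate are all assumptions.

```python
from typing import List, Optional


def call_options(callee_sip_uri: Optional[str],
                 pstn_rate: Optional[float],
                 credit: float) -> List[str]:
    """Return the ways to reach a contact, best first.

    pstn_rate is the provider's per-minute price for this destination,
    or None when the provider offers no PSTN termination.
    """
    if callee_sip_uri is not None:
        return ["sip"]  # free SIP-to-SIP call, end-to-end encrypted
    options = []
    if pstn_rate is not None:
        # Offer the gateway if the user can afford it, else a top-up.
        options.append("pstn" if credit >= pstn_rate else "buy-credit")
    options.append("gsm")  # always available on a phone
    return options
```

For example, a contact with no SIP account, a provider rate of 0.05 and no credit yields `["buy-credit", "gsm"]`, so the user can either top up or just place a normal call.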

I am probably missing a bunch of other capabilities that need to be implemented. It is a lot of work.

Some of these features would only be useful for participants of the Yakker network, but there are others that could be useful for every SIP user, and therefore, implemented first.

For example, SIP accounts that are not associated with a phone number could publish certificates under the same domain, and have the client use them to have secure communications with legacy infrastructure.

Funding

I am aware that this amount of work is not going to happen overnight. In fact, without a bunch of people from the community interested in the project, it is never going to happen.

The good news is that I think that companies might be interested in investing in this. A company could create their own branded version of the client, and use advertisements or service provider preference (the provider the user gets unless they choose one manually) to generate income.

As long as the code remains free, and the client is compatible with the whole system, there could be many competing clients out there. Their own promotion schemes would work for the benefit of the whole system, by bringing more users.

Risks

The biggest threat would be of one client monopolising the network, and then changing the protocols to make the system a walled garden. It has already happened with GTalk, so there is precedent for this.

I don't think there is any way to stop a big company from doing this, but one strategy to mitigate the risk a bit would be to create a brand (Yakker or whatever this ends up being called) and have it managed by a trusted community organisation, which can revoke the right to use the brand when a party is not behaving.