Tuesday, April 17, 2012

Making Security Simple

One of the major features we are working on for the Serval Project right now is security.  We want secure phone calls, secure MeshMS/SMS and secure data transfer. But the reality is that we have been working on security from the outset, because we know that it is a vital component of any modern communications system, particularly if we are to preserve people's privacy.

One of the very early design decisions was to use public keys as the mechanism for identifying phone users at the network level (at the human level, it is all phone numbers, so that it remains easy to use).

By using public keys as the network identifier, we get some very nice properties.  By carefully choosing the cryptographic algorithm we gain the ability to perform a Diffie-Hellman shared secret agreement process.  This makes it possible for any two parties to encrypt their mutual communications, without them having to first establish contact. This provides the same major security benefits (confidentiality, authenticity of communications) as insanely complex systems such as IPsec, but without the pain.
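
The shared secret agreement can be illustrated with a toy sketch. This uses classic finite-field Diffie-Hellman with made-up parameters (P, G and the function names are assumptions chosen for illustration), not Curve25519 and not the Serval code. It only shows the property that matters here: each party needs nothing but the other's public key to arrive at the same secret.

```python
import secrets

# Toy parameters for illustration only: a Mersenne prime modulus and a small
# base. These are NOT the Curve25519 parameters and are not a vetted group.
P = 2**127 - 1
G = 3

def make_keypair():
    """Return (private, public) for the toy Diffie-Hellman group."""
    private = secrets.randbelow(P - 3) + 2  # private key in [2, P-2]
    public = pow(G, private, P)
    return private, public

def shared_secret(my_private, their_public):
    """Combine our private key with the peer's public key."""
    return pow(their_public, my_private, P)

alice_private, alice_public = make_keypair()
bob_private, bob_public = make_keypair()

# Each side uses only the other's public key, which on the mesh would be
# the other party's network address. No prior contact is needed.
alice_view = shared_secret(alice_private, bob_public)
bob_view = shared_secret(bob_private, alice_public)
print(alice_view == bob_view)  # prints True
```

In a real system the agreed value would be fed into an authenticated-encryption construction rather than used directly.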

We have now reached the point where we can perform secure communications over the Serval Overlay Mesh, which is our implementation of this concept, using 256-bit Curve25519 public keys as network addresses, and with a defaults-secure Mesh Datagram Protocol.  This is based on the NaCl crypto library, and uses CryptoBox authenticated-encryption for unicast traffic, and CryptoSign verified signing for publicly readable broadcast traffic.

Curve25519 makes this feasible, due to the relatively short key length.  In contrast, using RSA we would need 3,072 bit keys for similar security (~128 bits of security).  Three addresses are required in each packet: source, destination and next-hop.  Using RSA 3,072 bit keys this would be 1,152 bytes -- just for addresses.  Even using insecure RSA 1,024 bit keys the addresses would require 384 bytes. With Curve25519 we need just 768 bits (96 bytes) for addresses. We further improve on this through address abbreviation, where we only send the full addresses occasionally, and use a unique prefix the remainder of the time.  This allows us to reduce the address overhead by 75% or more, thus resulting in a lower address overhead than IPv6, but while gaining the attractive security properties that we have been talking about.
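
The address arithmetic above can be checked with a few lines. This is just a worked calculation; the function name is made up for illustration, the 75% figure is taken as an exact saving for simplicity, and the IPv6 comparison assumes the two 128-bit addresses of the fixed IPv6 header.

```python
ADDRESSES_PER_PACKET = 3  # source, destination and next-hop

def address_bytes(key_bits, addresses=ADDRESSES_PER_PACKET):
    """Bytes of address overhead when each address is a key of key_bits."""
    return addresses * key_bits // 8

rsa_3072 = address_bytes(3072)    # 1152 bytes
rsa_1024 = address_bytes(1024)    # 384 bytes
curve25519 = address_bytes(256)   # 96 bytes

# Assumption: abbreviation saves exactly 75% (the text says "75% or more").
abbreviated = curve25519 * (100 - 75) // 100  # 24 bytes

# IPv6 fixed header: source and destination, 128 bits each.
ipv6 = address_bytes(128, addresses=2)        # 32 bytes

print(rsa_3072, rsa_1024, curve25519, abbreviated, ipv6)
# prints: 1152 384 96 24 32
```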

An important distinction between our MDP security layer and IPsec is that we do not need any key exchange, as the keys are implicitly communicated since the network address is the key.  This is a luxury that IPsec could not make use of, because IPsec had to be retrofitted into existing IP networks with already-allocated addresses, and geographically allocated addresses.  MDP's your-key-is-your-network-address only makes sense because geographically allocated addresses are not necessary (or even really possible) on a mobile ad hoc mesh network.  Also, IPsec relies on third-party authorities to manage key exchange and related functions, and so is not suitable for use in an infrastructure-deprived setting.  MDP's security is not client-server, but rather truly peer-to-peer.

The current state of the MDP security system is that it is operational, but requires some further refinement that we are undertaking, in particular to settle the API.  Nonetheless, it is already surprisingly easy to use MDP to enable secure communications.

So let's see how this all ties together by taking a look at the MDP ping test program that we have written.  You can find the source code here as part of the Serval/DNA source on github. The vast majority of this example is in fact to perform the ping function and to report on errors (a common situation for network programming).  The actual MDP networking primitives are no more complex than UDP.  So let's take a look.

First, we need to bind to an MDP socket.  MDP uses 32-bit port numbers and 256-bit network addresses. A given node might have more than one network address.  As MDP is implemented as an overlay network, we must talk to the local MDP server on the node to perform all operations.  This is done over a unix-domain socket.  This has the advantage that it can be poll()'d, select()'d or whatever you prefer, just like a normal socket.  So let's start by opening the connection to the MDP server and asking for the list of network addresses:

We start by creating an overlay_mdp_frame structure and populating it with our request.  Here we are asking for any local addresses (SIDs), by making the requested range of SIDs start at the beginning (-1), and include up to a couple of billion entries.  Of course the reply has to fit in a single packet, so the server may respond with only a partial list, which will be indicated by it setting last_sid to the index of the last SID returned, and frame_sid_count to the number of entries returned.

We then send the MDP request by calling overlay_mdp_send with the MDP frame, any flags, and a timeout.  Here the flag is MDP_AWAITREPLY, which tells the MDP library that we want to send this request and then wait for a reply from the server, and the timeout for that reply is given in milliseconds.  So the call in the above example will allow the server up to five seconds to come back with the list of local addresses. The return value of overlay_mdp_send is the number of MDP frames sent, or zero for failure.

The remainder of the above code checks to make sure that no error has been returned, and that what the server replied with was in fact a list of addresses, as indicated by the packet type being MDP_ADDRLIST.

Once we have the address list, we can then use one of the local addresses to bind our listener to:

We start by choosing a port, in this case a random port number in the range 0x8000 - 0xffff.  The port numbers are natively 32-bit, however, so you could use much larger port numbers.  Port numbers 0xf0000000 - 0xffffffff are reserved.
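
The port rules in this paragraph can be sketched as follows. This is an illustrative sketch and not the ping program's own code: the function and constant names are assumptions, and only the numeric ranges come from the text.

```python
import secrets

RESERVED_FIRST = 0xF0000000  # start of the reserved port range
RESERVED_LAST = 0xFFFFFFFF   # end of the reserved range (largest 32-bit port)

def random_ping_port():
    """Pick a random port in 0x8000..0xffff inclusive."""
    return 0x8000 + secrets.randbelow(0x8000)

def is_valid_port(port):
    """MDP port numbers are 32-bit unsigned values."""
    return 0 <= port <= 0xFFFFFFFF

def is_reserved_port(port):
    """True for ports in the reserved range."""
    return RESERVED_FIRST <= port <= RESERVED_LAST

port = random_ping_port()
print(hex(port), is_valid_port(port), is_reserved_port(port))
```

A port well above 0xffff, such as 0x12345678, is still valid and unreserved under these rules.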

Moving on to the network address part, we could just set our bind source address to all zeroes like on an IPv4 network, and listen on all local addresses, as illustrated in the disabled line of code, in which case we would not have needed to get the list of local addresses.  Similarly, if we already knew the address we wanted to listen on, then we can simply provide it in the bind sid field.  But it is helpful to illustrate here how to find and bind to a specific local address.

We know that the previous call to overlay_mdp_send has returned a list of addresses, and we know that a Serval overlay mesh node always has at least one local address, so we will assume in this example that the server response contains at least one address.  We then copy this address from the list of SIDs into the bind source address field.  We have to do this before we set up the rest of the bind request, because we are reusing the same overlay_mdp_frame structure, and mdp.addrlist and mdp.bind are part of a union that shares storage space.  All that then remains is to provide the port number, and call overlay_mdp_send to request that the server make the binding.  If no error has occurred, the binding has been created, and can be used.
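
Why the order matters can be seen in a small model of a union. The layout below is hypothetical (the field order, sizes and the port_number name are assumptions, not the real overlay_mdp_frame definition); it only demonstrates that once the bind fields are written, the address list that shared the same storage is no longer intact.

```python
import ctypes

SID_SIZE = 32                          # a 256-bit address
SidBytes = ctypes.c_ubyte * SID_SIZE

class AddrList(ctypes.Structure):      # hypothetical layout
    _fields_ = [
        ("last_sid", ctypes.c_int32),
        ("frame_sid_count", ctypes.c_uint32),
        ("sids", SidBytes * 4),
    ]

class Bind(ctypes.Structure):          # hypothetical layout
    _fields_ = [
        ("port_number", ctypes.c_uint32),
        ("sid", SidBytes),
    ]

class Frame(ctypes.Union):             # addrlist and bind share storage
    _fields_ = [("addrlist", AddrList), ("bind", Bind)]

frame = Frame()
local_sid = bytes(range(SID_SIZE))     # stand-in for a SID from the server
frame.addrlist.frame_sid_count = 1
frame.addrlist.sids[0] = SidBytes(*local_sid)

# Take the address out of the list first, then fill in the bind request.
chosen = bytes(frame.addrlist.sids[0])
frame.bind.sid = SidBytes(*chosen)
frame.bind.port_number = 0x8123

# The bind fields are now correct, but the old list has been overwritten.
print(bytes(frame.bind.sid) == local_sid)           # prints True
print(bytes(frame.addrlist.sids[0]) == local_sid)   # prints False
```

Had the port been written before the address was copied out, parts of the list would already have been clobbered in a layout like this one.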

For this ping-over-MDP application we use a sequence number to help filter any old packets that might arrive, and to keep track of packet arrival statistics (this is a fairly simplistic implementation of a ping-like service, and lacks a number of important features, but it serves to illustrate the use of MDP sockets just fine).  We then determine the SID we want to ping.  If the string "broadcast" is supplied, we fill the SID with all ones (0xff bytes) to indicate broadcast, just as on IPv4 and Ethernet, and set a flag to remind us that we are broadcasting.  Otherwise, we convert the supplied hexadecimal representation of the address into a SID using the stowSid convenience function.  The next step is to actually send ping packets:

This code warns the user that, if they are broadcasting, the ping packets will not be encrypted (not that there is anything too private in these packets, but it is nice to inform the user).  We then enter a loop of endlessly sending ping packets.  These data packets are identified by giving them a packet type of MDP_TX.  If they are being broadcast, then the packets must also be marked as not requiring encryption using the MDP_NOCRYPT flag.  As the MDP_NOSIGN flag is not specified, the packet will still be signed, allowing the recipient to reject the packet if it has been tampered with, and to be assured that the packet did in fact originate from the claimed address.

We could have done this the other way around, and have the programmer mark when a packet should be encrypted, but that would cause errors of omission to result in security breaches, instead of resulting in accidental employment of full security, which is a much better failure mode.  This passive safety is supported by MDP server checks that will return an error if an impossible combination is employed, e.g., a broadcast packet with encryption.
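
The opt-out design can be modelled in a few lines. This is a sketch of the idea only: the flag names MDP_NOCRYPT and MDP_NOSIGN come from the text above, but their numeric values, the helper names and the error handling are assumptions and do not reflect the real MDP server.

```python
SID_SIZE = 32
BROADCAST_SID = b"\xff" * SID_SIZE   # all ones means broadcast

# Hypothetical numeric values; only the flag names come from the article.
MDP_NOCRYPT = 0x01
MDP_NOSIGN = 0x02

def parse_destination(text):
    """Turn "broadcast" or 64 hex digits into a 32-byte SID."""
    if text == "broadcast":
        return BROADCAST_SID
    sid = bytes.fromhex(text)
    if len(sid) != SID_SIZE:
        raise ValueError("a SID must be exactly 32 bytes (64 hex digits)")
    return sid

def describe_security(destination, flags=0):
    """Return (signed, encrypted), or raise on an impossible combination."""
    encrypted = not (flags & MDP_NOCRYPT)
    signed = not (flags & MDP_NOSIGN)
    if destination == BROADCAST_SID and encrypted:
        raise ValueError("broadcast cannot be encrypted; set MDP_NOCRYPT")
    return signed, encrypted

# Forgetting every flag gives full security for unicast traffic.
print(describe_security(parse_destination("ab" * 32)))                # (True, True)
# Broadcast has to be downgraded explicitly, and is still signed.
print(describe_security(parse_destination("broadcast"), MDP_NOCRYPT))  # (True, False)
```

Calling describe_security with a broadcast destination and no flags raises an error in this sketch, mirroring the server check described above.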

The remainder of the above code simply packs the source and destination addresses and payload (the server rejects any packets with a source address/port combination that has not been bound to), and sends it by the now familiar overlay_mdp_send function.  The next step is to watch for any incoming "pongs":

This loop waits until one second has passed, and displays any ping-replies (pongs) that arrive.  It uses the overlay_mdp_client_poll function as a convenience over providing the mdp client socket directly to poll or select (although that could be done).  If packets are waiting, it uses overlay_mdp_recv to receive them, and if the received MDP message is indeed a packet (indicated by the MDP_RX type), then the packet is displayed.  The packet type and flags field also tells the receiver whether the arriving packets were signed and/or encrypted.  Signed or auth-crypted packets that have been tampered with are detected and dropped, and so the client never sees them.  To improve passive security for applications that only want to receive authenticated data we may add an option to the MDP bind request to specify that any unsigned packets be dropped and never presented to the client.
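
The shape of such a receive loop can be sketched with ordinary sockets. Nothing here is the MDP client library: a local datagram socket pair stands in for the connection to the MDP server, collect_replies is a made-up name, and a Unix-like system is assumed. It shows the pattern of waiting for a fixed window while handing the socket to select.

```python
import select
import socket
import time

def collect_replies(sock, window_seconds):
    """Gather datagrams arriving on sock until the window has elapsed."""
    deadline = time.monotonic() + window_seconds
    replies = []
    while True:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            break
        # Block until the socket is readable or the window runs out.
        readable, _, _ = select.select([sock], [], [], remaining)
        if readable:
            replies.append(sock.recv(2048))
    return replies

# Stand-in for the client's connection to the local server.
client, server = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
server.send(b"pong 1")
server.send(b"pong 2")

pongs = collect_replies(client, 0.2)
client.close()
server.close()
print(pongs)  # prints [b'pong 1', b'pong 2']
```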

Below is an example of MDP ping in action, showing that the packets received via the MDP loop-back from the local address (FE83...) are signed but not encrypted (since encryption is not required), and that packets received from a remote host (2447...) were both signed and encrypted:

Notice that nowhere in this program is there any mention of keys (apart from their implicit handling as network addresses), and no key-exchange must be handled, or third-party authority consulted.  Instead, special effort is required to reduce the security of the communications, e.g., for sending broadcast messages.

The security is just baked in from the ground up, defaults to on, and can be used anywhere, anytime, as it should be.

Monday, April 16, 2012

Sharpening Serval's Mission

We have been doing a bit of reflecting lately about what it is that we are trying to accomplish with the Serval Project, not because we wonder what we are trying to do or why we are trying to do it.  Rather, we have been thinking about what order to do things in, so that we can get something usable into people's hands as soon as possible.

To answer this question we went back to our motto of "communications anywhere, anytime", and did some thinking there.  We came to realise that we can probably refine that, and that by doing so, we will probably come to understand what are the important things that need to happen first, and what are the (in many cases equally important) things that we will work on once we have the initial objectives under control.

So here is our current thinking, and we invite any and all to give us feedback on this, and potentially change the line up.  What follows is our working position in the absence of any feedback.

We think that Serval's mission can be expressed more clearly as:

mobile communications for those in need

This more or less says what we have been saying all along, but it moves the focus from the technology to the people using the technology.  At the end of the day, we are only making the technology in the hope that it can help people.

So we then turned to think about what the "minimum viable product" is for Serval, and we settled on a short-list of the four things that really matter right now.  These are the things that we are focussing our resources on, and not until those are under control will we move on to our longer list of things that we have on the drawing board (some of which I will describe later).

The four things that matter now are:

  • ease of use - the Serval software must be easy and intuitive to use, otherwise no one will use it, irrespective of how great and innovative the technology might be. 
  • resilient phone calls, messaging and file distribution - these are the core functions of the Serval software, i.e., what it lets you do.
  • strong, simple, security - security in mobile communications is so often an afterthought, or severely compromised.  For example, the security of 2G and 3G mobile telephone systems has been thoroughly broken for years.  And when security is introduced into systems it has a bad habit of making life harder rather than easier.  Consider the complexity and interoperability issues suffered by IPsec as an example: it is so complex that its security benefits go largely unused.
  • no infrastructure needed - this is what is needed to make sure that we can most effectively help those in need, as it is the failure or absence of infrastructure (or affordable access to infrastructure) that is a recurring characteristic of those we are trying to help.  Besides, there are already plenty of infrastructure-bound communications systems.

We are tracking well on these points, and expect to have positive news on the progress of several of these points over the next month or so.

Once we have these under control, we have a long list of features, use-cases and improvements that we will immediately turn to.  We are working on some of these now in a limited capacity, and are planning support for them from the outset. In no particular order, this list includes:

  • broad device support - For now we are happy to support a limited range of handsets, but we know that eventually we need to support as wide a range as possible, including non-Android phones.
  • Traffic prioritisation/QoS - Right now our focus is on making the core functionality as solid as possible.  We have plans in place for some innovative QoS functions that we will implement as resources allow.
  • optimising battery life - The mesh can already run for a full business day on a cheap Android phone, but we want more. We want to get to at least 48 hours of continuous operation on a single charge, and ideally much more.
  • massive scalability - Our protocols are designed with massive scalability in mind, but there are aspects of this that we simply don't have the time or resources to implement right now.  We figure that it is more important to get it working at the "village scale", and once that is right and larger networks start to emerge, it will be quite natural for us to complete our plans for scalability.
  • privacy and plausible deniability - Some people take their privacy seriously, sometimes because their lives depend on it, or because the lives of the people they are in contact with depend on it.  Eventually our software should be configurable so that it does not betray any such information.  This is a considerable technical challenge, and it is not even certain that it is possible to achieve.
  • journalistic applications - This is intrinsically linked to the previous point, as one of the main issues for journalists is keeping their sources confidential.
  • complex “crisis mode” of operation - By this we mean some complex modes of operation where a network of phones automatically detects that a crisis is underway using various network heuristics (such as loss of cellular service, or correlated vibrations or loud noise over a wide area), and takes some hitherto undefined appropriate action.
  • group coordination - We have plans to introduce a variety of work-group/team features, that will allow ease of communications and coordination amongst small teams of people.  These are based on the core features that we are working on now, and so will be natural for us to complete once the dependent features are complete.
  • global mapping system and services - We already have an effective prototype of our mapping application that we demonstrated recently, and that we will continue work on as we are able.  This is a feature that has us quite excited, because it allows for a wide variety of coordinated and crowd-sourced activities through allowing users to create and share points of interest on the map, and to even automatically share the map tiles over the mesh, creating something like Google Maps that works without the internet.
  • environmental monitoring applications - We already have some projects in mind here that will make use of the mapping system to make it easy to collect, share and visualise environmental data in the field, and so hopefully help people assess and protect the natural and built environment.
  • medical applications - We are quite excited about the possibility of making a cheap medical device platform that uses a sub-$100 Android phone as its basis, combined with an open hardware port that allows the connection of an increasing number of off-the-shelf and custom medical attachments, such as pulse-oximetry, prick-less anemia testing, blood pressure, foetal heart-rate monitor and even imaging ultra-sound.  Support for additional functions and probes would be added to the platform over time (with software updates over the mesh where required), as the interface would be open and flexible, doing to the medical device field what the smart-phone did to the software field. With extreme low cost and field-upgradability, combined with resilient mesh communications that allow remote monitoring, collection of data and communication of alerts in infrastructure-deprived medical settings, we expect this concept to be highly disruptive to much of the medical devices space.

So that's what we are thinking about right now.  But we invite your input to help us shape this as best we can, and also to help us achieve these goals if you want to be part of the action or to help us resource this work.