Downgrade Leases

Background

For teams, we have the usual problem of ordering two events in two different sigchains, to ensure that, for instance, a key was used to sign a team update before it was revoked (rather than after), or a team member exercised his admin privileges before he was downgraded (and not after). We finally have a simple and general solution, but there's one important corner case to consider. This document details (1) the simple and general solution; (2) the annoying corner case; and (3) the machinery to fix the (unfortunate) corner case. Here goes!

Establishing Provable "Happens Before" Relationship With Teams

We have two cases in which we need a "happens before" relationship to be provable, as described above. First, when a team member uses a device key to change the team, he must do so after the key is provisioned, and before the key is revoked. Similarly, when a team member is acting as a team admin, he must do so after he is designated an admin, and before he is removed as one. These relationships are simple within a single linear sigchain, but they are more complicated when "happens before" needs to be proven across chains, as happens in both of the examples just listed.

A General and Simple Solution

The general problem is establishing an a < b < c relationship, where a and c are on one sigchain, and b is on another. For example, a is when a key was provisioned, b is when it is used, and c is when it is revoked (for non-revoked keys, c = ∞). In both cases, a Keybase client performs the following algorithm:

First, establish a < b:
1. Look at the signature in b to determine the last Merkle root hash seen at the time signature b was made. This is captured in the body.merkle_root.hash_meta field of the signature.
2. Ask the Keybase server for a merkle/path from the merkle root found in the previous step down to the tail of the sigchain that a is in.
3. Walk back from that tail to a, following prev pointers.

Next, establish b < c:
1. Look at the signature in c for body.merkle_root.hash_meta.
2. Ask the Keybase server for a merkle/path from the merkle root found in the previous step down to the tail of the sigchain that b is in.
3. Walk back from that tail to b, following prev pointers.
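
The two checks above can be sketched with a toy model. This is only an illustration under stated assumptions, not the Keybase client's actual code: ToyServer, Link, root_seen (standing in for body.merkle_root.hash_meta), merkle_path and proves_before are hypothetical names, and a "merkle root" here is just a hash over a snapshot of every chain's tail rather than a real Merkle tree.

```python
import hashlib
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class Link:
    chain: str            # which sigchain this link lives in
    payload: str          # what the link says (illustrative only)
    prev: Optional[str]   # hash of the previous link in the same chain
    root_seen: str        # stand-in for body.merkle_root.hash_meta

    def hash(self) -> str:
        data = f"{self.chain}|{self.payload}|{self.prev}|{self.root_seen}"
        return hashlib.sha256(data.encode()).hexdigest()


class ToyServer:
    """Hypothetical server: publishes a new root after every landed link."""

    def __init__(self) -> None:
        self.links: Dict[str, Link] = {}
        self.tails: Dict[str, str] = {}
        self.roots: Dict[str, Dict[str, str]] = {}
        self.latest_root = self._publish_root()

    def _publish_root(self) -> str:
        snapshot = dict(self.tails)
        root = hashlib.sha256(repr(sorted(snapshot.items())).encode()).hexdigest()
        self.roots[root] = snapshot
        return root

    def append(self, chain: str, payload: str, root_seen: str) -> Link:
        link = Link(chain, payload, self.tails.get(chain), root_seen)
        digest = link.hash()
        self.links[digest] = link
        self.tails[chain] = digest
        self.latest_root = self._publish_root()
        return link

    def merkle_path(self, root: str, chain: str) -> Optional[str]:
        # Toy version of merkle/path: the tail of `chain` as of `root`.
        return self.roots[root].get(chain)


def proves_before(server: ToyServer, earlier: Link, later: Link) -> bool:
    # Step 1: read the root embedded in the later signature.
    # Step 2: fetch the tail of the earlier link's chain under that root.
    cursor = server.merkle_path(later.root_seen, earlier.chain)
    target = earlier.hash()
    # Step 3: walk back from the tail, following prev pointers.
    while cursor is not None:
        if cursor == target:
            return True
        cursor = server.links[cursor].prev
    return False


server = ToyServer()
a = server.append("user", "provision device B", server.latest_root)
b = server.append("team", "team update signed by B", server.latest_root)
c = server.append("user", "revoke device B", server.latest_root)
print(proves_before(server, a, b), proves_before(server, b, c))
```

In this well-behaved run each signer sees the newest root before signing, so both a < b and b < c can be proven.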

The techniques used in the two phases are basically the same, but there is an important difference. Let's look first at the first phase, establishing that a < b. For the signer of b to use the key provisioned in a, he must have consumed the Keybase merkle tree to a point at or after a's provisioning, and therefore the merkle root embedded as body.merkle_root.hash_meta must contain a sigchain with a's provisioning in it. We should of course enforce this invariant on the server, to prevent buggy clients from including old merkle roots by accident. But the clients don't really need to change if they are working properly.

An Annoying Corner Case

When it comes to guaranteeing that b < c, we're not so lucky. There could have been a race, and this interleaving might be acceptable to the server:

1. Device B downloads the latest merkle root at time t₁ and signs b.
2. Device C generates statement c at time t₂ that revokes device B.
3. Device B lands its update b at time t₃, with body.merkle_root.hash_meta at time t₁.
4. Device C lands its update c at time t₄, with body.merkle_root.hash_meta at time t₂.

The server will allow this sequence of events to happen since device B was alive at time t₃, just before it was revoked at time t₄. The problem is that we can't use the technique from above for clients to prove that b < c because the hash_meta pointers have crossed! In other words, if a client is trying to prove that b < c, it will follow the hash_meta pointer in c back to t₂, but can't possibly find a merkle/path from t₂ down to a sigchain that contains b, since b landed after t₂. We're stuck!
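
The crossed pointers can be reproduced in a few lines. This is a sketch under simplifying assumptions, not the real protocol: merkle roots are plain integer indexes, each root records only the length of every chain, and ToyServer, land, root_seen and proves_before are hypothetical names.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Link:
    chain: str
    seqno: int       # position within its own sigchain, starting at 1
    root_seen: int   # index of the merkle root the signer had seen


class ToyServer:
    def __init__(self) -> None:
        self.chain_len: Dict[str, int] = {}
        self.roots: List[Dict[str, int]] = [{}]  # root 0 is the empty tree

    def latest_root(self) -> int:
        return len(self.roots) - 1

    def land(self, chain: str, root_seen: int) -> Link:
        # Accept a link and publish a new root that includes it.
        seqno = self.chain_len.get(chain, 0) + 1
        self.chain_len[chain] = seqno
        self.roots.append(dict(self.chain_len))
        return Link(chain, seqno, root_seen)


def proves_before(server: ToyServer, earlier: Link, later: Link) -> bool:
    # The later link's root must already contain the earlier link.
    tail_seqno = server.roots[later.root_seen].get(earlier.chain, 0)
    return tail_seqno >= earlier.seqno


server = ToyServer()
a = server.land("user", server.latest_root())  # device B is provisioned

t1 = server.latest_root()     # device B reads the root and signs b
t2 = server.latest_root()     # device C reads the root and signs c
b = server.land("team", t1)   # b lands at t3
c = server.land("user", t2)   # c lands at t4

a_before_b = proves_before(server, a, b)
b_before_c = proves_before(server, b, c)
print(a_before_b, b_before_c)  # True False
```

The server accepted b ahead of c, yet the root that c points to predates b, so a client has nothing to walk back from.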

The key conceptual difference here is that a caused b, so a must have happened far enough before b for the signer of b to have observed it. But there is no sense in which b caused c, since revoking a device can happen at any time. So we don't get the nice ordering guarantees.

The Solution

The solution is called "downgrade leases." There are two classes of important downgrades: (1) when a user revokes a device; and (2) when a user is removed from a team or downgraded from admin to non-admin. In both cases, we have to check that b < c but are susceptible to the downgrade race just mentioned. The protocol works as follows:

1. Device C asks the server for a "lease" that covers some downgrade activity, like user u deprovisioning device B with device C.
2. The server replies with a lease at merkle root time t₁.
3. Any action that uses device B is invalid while there is an outstanding lease for device B's revocation. So we have to change all signature handlers to check not just that B is still active, but also that B isn't slated for imminent revocation.
4. When device C uploads the revocation of B, the server checks that the revocation is properly leased, and that the body.merkle_root.hash_meta in the signature is at or after the t₁ specified in the lease. If so, the revocation succeeds.
5. It's possible for a client to die while holding a lease, so these leases expire after about a minute.

The intent is that no update signed by B can land while the lease is outstanding, so every accepted use of B landed before the lease was granted, and the merkle root referenced by c is at or after that point.

The same solution is also employed whenever someone loses admin privileges on a team, and the analogy holds exactly.
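
The server-side bookkeeping might look roughly like the sketch below. It is an illustration under stated assumptions rather than the actual Keybase server logic: LeaseServer, its method names, the integer root_seqno clock and the exact 60-second TTL are all hypothetical, and time is passed in explicitly so the behavior is deterministic.

```python
from dataclasses import dataclass
from typing import Dict, Optional, Set

LEASE_TTL_SECONDS = 60.0  # "about a minute"; the exact value is an assumption


@dataclass
class Lease:
    target: str        # device (or admin role) about to be downgraded
    root_seqno: int    # merkle root "time" t1 at which the lease was granted
    expires_at: float


class LeaseServer:
    def __init__(self) -> None:
        self.root_seqno = 0
        self.active: Set[str] = set()
        self.leases: Dict[str, Lease] = {}

    def provision(self, device: str) -> None:
        self.active.add(device)
        self.root_seqno += 1

    def acquire_lease(self, target: str, now: float) -> Lease:
        lease = Lease(target, self.root_seqno, now + LEASE_TTL_SECONDS)
        self.leases[target] = lease
        return lease

    def _live_lease(self, target: str, now: float) -> Optional[Lease]:
        lease = self.leases.get(target)
        if lease is not None and now >= lease.expires_at:
            del self.leases[target]  # the lease holder may have died
            return None
        return lease

    def accept_signature(self, device: str, now: float) -> bool:
        # Reject if the device is revoked OR slated for imminent revocation.
        if device not in self.active:
            return False
        if self._live_lease(device, now) is not None:
            return False
        self.root_seqno += 1
        return True

    def accept_revocation(self, target: str, root_seen: int, now: float) -> bool:
        # The revocation must be leased and must reference a root >= t1.
        lease = self._live_lease(target, now)
        if lease is None or root_seen < lease.root_seqno:
            return False
        self.active.discard(target)
        del self.leases[target]
        self.root_seqno += 1
        return True


server = LeaseServer()
server.provision("B")
lease = server.acquire_lease("B", now=0.0)
blocked = server.accept_signature("B", now=1.0)
revoked = server.accept_revocation("B", root_seen=lease.root_seqno, now=2.0)
print(blocked, revoked)  # False True
```

While the lease is live, B's signatures are refused, so the race from the corner case cannot occur. If device C never follows through, the lease lapses and B can sign again.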