Downgrade Leases
Background
For teams, we have the usual problem of ordering two events in two different sigchains, to ensure that, for instance, a key was used to sign a team update before it was revoked (rather than after), or a team member exercised his admin privileges before he was downgraded (and not after). We finally have a simple and general solution, but there's one important corner case to consider. This document details (1) the simple and general solution; (2) the annoying corner case; and (3) the machinery to fix the (unfortunate) corner case. Here goes!
Establishing Provable "Happens Before" Relationship With Teams
We have two cases, described above, in which we need a "happens before" relationship to be provable. First, when a team member uses a device key to change the team, he must do so after the key is provisioned, and before the key is revoked. Similarly, when a team member is acting as a team admin, he must do so after he is designated an admin, and before he is removed as one. These relationships are simple in linearized sigchains, but they are more complicated when "happens before" needs to be proven across chains, as happens in both of the examples just listed.
A General and Simple Solution
The general problem is establishing an a < b < c relationship, where a and c are on one sigchain, and b is on another. For example, a is when a key was provisioned, b is when it is used, and c is when it is revoked (for non-revoked keys, c = ∞). In both cases, a keybase client performs the following algorithm:
1. First establish a < b:
    1. Look at the signature in b to determine the last seen Merkle root hash at the time that signature b was made. This is captured in the body.merkle_root.hash_meta field of the signature.
    2. Ask the keybase server for a merkle/path from the merkle root from step 1.1 down to the tail of the sigchain that a is in.
    3. Walk back from that tail to a, following prev pointers.
2. Next establish b < c:
    1. Look at the signature in c for body.merkle_root.hash_meta.
    2. Ask the keybase server for a merkle/path from the merkle root from step 2.1 down to the tail of the sigchain that b is in.
    3. Walk back from that tail to b, following prev pointers.
The techniques used in steps (1) and (2) are basically the same, but there is an important difference. Let's look first at step (1), establishing that a < b. For the signer of b to use the key provisioned in a, he must have consumed the Keybase merkle tree to a point at or after a's provisioning, and therefore the merkle root embedded as body.merkle_root.hash_meta must contain a sigchain with a's provisioning in it. We should of course enforce this invariant on the server, to prevent buggy clients from including old merkle roots by accident. But the clients don't really need to change if they are working properly.
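The two-step check described in this section can be sketched as a toy model. This is only an illustration under stated assumptions: `Link`, `ToyServer`, `land`, `tail_at`, and `happens_before` are hypothetical names, merkle roots are reduced to sequence numbers, `seen_root` stands in for body.merkle_root.hash_meta, and `tail_at` stands in for a merkle/path lookup. No hashing or signature verification is modeled.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Link:
    """One sigchain link (hypothetical, simplified shape)."""
    link_id: str
    chain_id: str
    prev: Optional[str]  # previous link in the same sigchain
    seen_root: int       # stand-in for body.merkle_root.hash_meta


class ToyServer:
    """In-memory stand-in for the server; every landed link publishes a new root."""

    def __init__(self) -> None:
        self.links: Dict[str, Link] = {}
        # roots[i] maps chain_id -> tail link_id as of merkle root i
        self.roots: List[Dict[str, str]] = [{}]

    def latest_root(self) -> int:
        return len(self.roots) - 1

    def land(self, link_id: str, chain_id: str, seen_root: int) -> Link:
        tails = dict(self.roots[-1])
        link = Link(link_id, chain_id, tails.get(chain_id), seen_root)
        self.links[link_id] = link
        tails[chain_id] = link_id
        self.roots.append(tails)
        return link

    def tail_at(self, root: int, chain_id: str) -> Optional[str]:
        # Stand-in for a merkle/path lookup from a root down to a chain tail.
        return self.roots[root].get(chain_id)


def happens_before(server: ToyServer, x: Link, y: Link) -> bool:
    """Prove x < y: start at the tail of x's chain as of the root y saw,
    then walk prev pointers until x is found or the chain runs out."""
    cursor = server.tail_at(y.seen_root, x.chain_id)
    while cursor is not None:
        if cursor == x.link_id:
            return True
        cursor = server.links[cursor].prev
    return False


server = ToyServer()
a = server.land("a", "user", seen_root=server.latest_root())  # key provisioned
b = server.land("b", "team", seen_root=server.latest_root())  # key used
c = server.land("c", "user", seen_root=server.latest_root())  # key revoked
print(happens_before(server, a, b), happens_before(server, b, c))  # True True
```

In this well-behaved sequence each signer has seen the other chain's latest link before signing, so both proofs succeed.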
An Annoying Corner Case
When it comes to guaranteeing that b < c, we're not so lucky. There could have been a race, and this interleaving might be acceptable to the server:
- Device B downloads the latest merkle root t1 and signs b
- Device C generates statement c at time t2 that revokes device B
- Device B lands its update b at time t3, with body.merkle_root.hash_meta at time t1
- Device C lands its update c at time t4, with body.merkle_root.hash_meta at time t2
The server will allow this sequence of events to happen since device B was alive at time t3, just before it was revoked at time t4. The problem is that we can't use the technique from above for clients to prove that b < c because the hash_meta pointers have crossed! In other words, if a client is trying to prove that b < c, it will follow the hash_meta pointer in c to t2, but can't possibly find a merkle/path from t2 down to a sigchain that contains b, since b lands after t2. We're stuck!
The key conceptual difference here is that a caused b, so a had to happen far enough before b for the signer of b to have observed a. But there is no sense in which b caused c, since revoking a device can happen at any time. So we don't get the nice ordering guarantees.
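The crossed-pointer race can be replayed in a few lines. This is a simplified sketch, not the real data model: the `roots` list, `land`, and `provable_before` are hypothetical names, each merkle root is reduced to a snapshot of which links every chain contains, and a membership test replaces the walk along prev pointers.

```python
# roots[i] is the snapshot {chain: [links]} published as merkle root i.
# Root 0 already contains a (device B's key provisioning) on the user chain.
roots = [{"user": ["a"], "team": [], "other": []}]


def land(chain, name):
    """Append a link to a chain and publish a new root; return its index."""
    snapshot = {k: list(v) for k, v in roots[-1].items()}
    snapshot[chain].append(name)
    roots.append(snapshot)
    return len(roots) - 1


def provable_before(x_chain, x_name, y_seen_root):
    """x < y is provable only if the root that y saw already contains x."""
    return x_name in roots[y_seen_root][x_chain]


t1 = len(roots) - 1    # device B reads root t1 and signs b
land("other", "x")     # unrelated activity advances the tree
t2 = len(roots) - 1    # device C reads root t2 and signs c
t3 = land("team", "b")  # b lands; its hash_meta points at t1
t4 = land("user", "c")  # c lands; its hash_meta points at t2

a_before_b = provable_before("user", "a", t1)
b_before_c = provable_before("team", "b", t2)
print(t3 < t4, a_before_b, b_before_c)  # True True False
```

The server saw b land before c (t3 < t4), yet the root that c points at predates b, so a client has no way to prove b < c.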
The Solution
The solution is called "downgrade leases." There are two classes of important downgrades: (1) when a user revokes a device; and (2) when a user is removed from a team or downgraded from admin to non-admin. In both cases, we have to check that b < c but are susceptible to the downgrade race just mentioned. Here's how it works:
- Device C asks the server for a "lease" that covers some downgrade activity, like user u deprovisioning device B with device C.
- The server replies with a lease at merkle root time t1.
- All actions that use device B are not valid if there is an outstanding lease for device B's revocation. So we have to change all signature handlers to not just check if B is still active, but also to check if B isn't slated for imminent revocation.
- When device C uploads the revocation of B, the server checks that the revocation is properly leased, and that the body.merkle_root.hash_meta in the signature happens at or after the t1 specified in the lease. If so, the revocation succeeds.
- It's possible for a client to die when holding a lease, so these leases expire after about a minute.

The same solution is also employed whenever someone loses adminship privileges from a team, and the analogy holds exactly.
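The server-side bookkeeping for leases might look roughly like the sketch below. Everything in it is an assumption made for illustration: `LeaseServer`, `acquire_lease`, `accept_signature`, and `accept_revocation` are hypothetical names, merkle roots are plain sequence numbers, the 60-second TTL is a stand-in for "about a minute", and the clock is injectable so that expiry can be exercised without waiting.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Set

LEASE_TTL_SECONDS = 60.0  # assumed value for "about a minute"


@dataclass
class Lease:
    device: str        # device slated for revocation
    root_seqno: int    # merkle root "time" t1 at which the lease was issued
    expires_at: float


class LeaseServer:
    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self.clock = clock
        self.root_seqno = 0
        self.leases: Dict[str, Lease] = {}
        self.revoked: Set[str] = set()

    def _live_lease(self, device: str) -> Optional[Lease]:
        lease = self.leases.get(device)
        if lease is not None and lease.expires_at <= self.clock():
            del self.leases[device]  # the client may have died; drop the lease
            return None
        return lease

    def acquire_lease(self, device: str) -> Lease:
        lease = Lease(device, self.root_seqno, self.clock() + LEASE_TTL_SECONDS)
        self.leases[device] = lease
        return lease

    def accept_signature(self, device: str) -> bool:
        """Reject if the device is revoked or slated for imminent revocation."""
        if device in self.revoked or self._live_lease(device) is not None:
            return False
        self.root_seqno += 1  # landing a link publishes a new root
        return True

    def accept_revocation(self, device: str, seen_root: int) -> bool:
        """Require a live lease and a hash_meta at or after the lease's t1."""
        lease = self._live_lease(device)
        if lease is None or seen_root < lease.root_seqno:
            return False
        self.revoked.add(device)
        del self.leases[device]
        self.root_seqno += 1
        return True


now = [0.0]  # fake clock, advanced by hand
server = LeaseServer(clock=lambda: now[0])
lease = server.acquire_lease("B")
blocked = server.accept_signature("B")
revoked = server.accept_revocation("B", seen_root=lease.root_seqno)
print(blocked, revoked)  # False True
```

Because nothing signed by B can land while the lease is live, and the revocation must reference a root at or after t1, every accepted b is already visible from the root that c points at.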