In OAuth 2.0 and OpenID Connect, an authorization server is typically tasked with issuing bearer tokens that grant access to protected resources. On the web today, the token format of choice is the JWT (JSON Web Token). JWTs are normally self-contained tokens that assert information about the entity that bears them. The problem with bearer tokens in their current form is that, like money, once you drop or lose one, it may end up in someone else’s possession, and that person may use it however they see fit (OK, tokens are not quite the same as money since tokens are ephemeral, but the point stands).
In some deployments or architectures, it may be desirable to deny entities the ability to use leaked or mishandled tokens. If it can be proved that the entity using the token is the same entity that requested it from the authorization server, then a whole class of security threats is mitigated. This is the reason the IETF draft on a proof-of-possession (PoP) architecture is being ironed out.
In OAuth/OpenID Connect there are three key players involved in the protocol’s interactions: the authorization server, the client (sometimes called the application), and the resource server (which is typically an API). OAuth 2.0 provides a mechanism to ascertain whether a token was issued by a specific authorization server (by checking the token’s signature, which was produced by the authorization server), but that is pretty much it. The resource server needs a way to authenticate the client as well. More specifically, the resource server may want to assert that the token is being used by its intended recipient.
For PoP to work, all parties MUST be authenticated to the other parties with whom they communicate. The draft’s proposal is to have the authorization server bind a key to each access token it issues to a client. The key is a fresh and unique session key, K_session, which is placed into the token encrypted with the long-term key that the authorization server and resource server share, K_authserver-resourceserver* (in the symmetric-key case). To repeat: within the token itself, K_session is encrypted with K_authserver-resourceserver. In addition to the encrypted copy inside the token, K_session is also returned to the client, unencrypted, in the HTTP token response.
It is then the client’s duty to demonstrate to the resource server that it possesses K_session. When sending a request to the resource server, the client places the token in the Authorization header as usual and also computes a keyed message digest of the request using K_session, which it sends along with the request. The resource server verifies the token, then decrypts the encrypted K_session inside it using the key it shares with the authorization server (K_authserver-resourceserver). Finally, the resource server uses the recovered K_session to verify the keyed digest that the client sent with the request. The figure below gives a more pictorial view of PoP.
*The draft does not specify how the resource server and authorization server obtain these keys. Assume that they are provisioned securely out of band, so that the key material, whether symmetric or asymmetric, can satisfy this scenario.
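The key binding described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the function names `wrap_session_key` and `unwrap_session_key` are hypothetical, and the encryption is a toy encrypt-then-MAC construction built from HMAC-SHA256 so that the example runs with the standard library alone. It is a stand-in for whatever authenticated encryption or key-wrap algorithm a real deployment would use, not something taken from the draft.

```python
import hashlib
import hmac
import secrets


def _derive(shared_key: bytes, label: bytes) -> bytes:
    """Derive a purpose-specific subkey from the long-term shared key."""
    return hmac.new(shared_key, label, hashlib.sha256).digest()


def wrap_session_key(shared_key: bytes, session_key: bytes) -> bytes:
    """Authorization server: encrypt K_session for the resource server.

    Toy construction (assumption, not from the draft): one HMAC-SHA256
    output is used as a keystream, followed by an integrity tag.
    """
    if len(session_key) != 32:
        raise ValueError("this sketch only handles 32-byte session keys")
    nonce = secrets.token_bytes(16)
    keystream = hmac.new(_derive(shared_key, b"enc"), nonce, hashlib.sha256).digest()
    ciphertext = bytes(a ^ b for a, b in zip(session_key, keystream))
    tag = hmac.new(_derive(shared_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    return nonce + ciphertext + tag  # 16 + 32 + 32 = 80 bytes


def unwrap_session_key(shared_key: bytes, blob: bytes) -> bytes:
    """Resource server: check integrity, then recover K_session."""
    if len(blob) != 80:
        raise ValueError("unexpected wrapped-key length")
    nonce, ciphertext, tag = blob[:16], blob[16:48], blob[48:]
    expected = hmac.new(_derive(shared_key, b"mac"), nonce + ciphertext, hashlib.sha256).digest()
    if not hmac.compare_digest(tag, expected):
        raise ValueError("wrapped key failed integrity check")
    keystream = hmac.new(_derive(shared_key, b"enc"), nonce, hashlib.sha256).digest()
    return bytes(a ^ b for a, b in zip(ciphertext, keystream))


# K_authserver-resourceserver, assumed to be provisioned out of band.
k_as_rs = secrets.token_bytes(32)
# Fresh K_session generated by the authorization server for one token.
k_session = secrets.token_bytes(32)

wrapped = wrap_session_key(k_as_rs, k_session)  # goes inside the token
assert unwrap_session_key(k_as_rs, wrapped) == k_session
```

The client never needs K_authserver-resourceserver: it receives K_session in the clear over the (TLS-protected) token response, while the wrapped copy travels opaquely inside the token.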
1. The client makes a normal OAuth 2.0/OpenID Connect request, with an additional parameter indicating that it wants to do proof of possession.
2. The authorization server processes the request as usual for the OAuth 2.0/OpenID Connect grant in question. Included within the access token is the session key, encrypted with the key shared by the resource server and the authorization server. In addition to the normal token response fields, the session key (unencrypted) is included in the response message to the client.
3. The client sends its request to the resource server as normal, with the Authorization header set to the access token received in (2). In addition, the client sends a keyed digest of the request, computed with the session key received in (2).
4. The resource server receives the token and does the usual token verification. In addition, it extracts the session key by decrypting the encrypted session key found in the token. The resource server then uses this session key to compute a keyed digest of the request sent by the client. If the computed digest matches the digest received from the client, then we have proof of possession.
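The last two steps, where the client computes a keyed digest and the resource server checks it, might look roughly like the following. This is a minimal sketch, assuming HMAC-SHA256 as the keyed digest and a made-up canonical form of the request (method, path, and a hash of the body joined by newlines). The function names `request_digest` and `verify_request` are hypothetical, and the exact set of request fields to cover is an assumption rather than something this post takes from the draft.

```python
import hashlib
import hmac


def request_digest(session_key: bytes, method: str, path: str, body: bytes) -> str:
    """Client side: keyed digest (HMAC-SHA256) over a canonical request form."""
    body_hash = hashlib.sha256(body).hexdigest()
    canonical = "\n".join([method.upper(), path, body_hash]).encode("utf-8")
    return hmac.new(session_key, canonical, hashlib.sha256).hexdigest()


def verify_request(session_key: bytes, method: str, path: str,
                   body: bytes, received_digest: str) -> bool:
    """Resource server side: recompute the digest and compare in constant time."""
    expected = request_digest(session_key, method, path, body)
    return hmac.compare_digest(expected, received_digest)


# Placeholder K_session; in the flow above the client gets it from the token
# response and the resource server recovers it by decrypting the token's copy.
k_session = bytes(range(32))

digest = request_digest(k_session, "POST", "/orders", b'{"item": 1}')

# The untouched request verifies.
assert verify_request(k_session, "POST", "/orders", b'{"item": 1}', digest)
# A modified body, or a party without K_session, fails verification.
assert not verify_request(k_session, "POST", "/orders", b'{"item": 2}', digest)
assert not verify_request(bytes(32), "POST", "/orders", b'{"item": 1}', digest)
```

Someone who steals only the token cannot produce a matching digest, because K_session appears in the token only in encrypted form.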
Basically, the only extra work required of application developers is to compute a keyed digest of each request and send it along to the resource server, which is not asking for much, to be honest.
Proof of possession presents a neat solution for binding a token to a client. The draft outlines other architectures and methods by which proof of possession can be done, but I feel that the one in this blog post is the easiest for getting the point across.