An extremely casual code review of MetaMask’s crypto

NB: This post describes a very casual code review of a few cryptography functions used by MetaMask. It does not describe any vulnerabilities. If you’re the kind of person who likes a meandering and amateurish code review that goes absolutely nowhere, you’ll enjoy this post. Otherwise you might want to read something more exciting: I suggest Moxie’s recent post on web3.

For reasons I can’t fully explain, the other day I decided that it might be fun to spend an hour or two investigating the cryptography used by MetaMask.

For those who aren’t familiar with the tool, MetaMask is a browser-based cryptocurrency wallet that is used used to access decentralized applications (dapps) on networks like Ethereum. My interest in MetaMask is mostly just based on curiosity: I recently invested about $100 into a decentralized finance application and I wanted to see how safe it really was. While there are lots of different ways I could lose that money, taking a look at MetaMask’s crypto seemed like as good a place to start as any.

I want to stress that this was an informal code review: I didn’t use any tooling, didn’t even download (most) code to my computer. In fact my “review” mostly involved poking around various Github repositories to see if I could find anything that immediately jumped out as incorrect, and failing that, at least could give me a feeling for the quality of MetaMask’s crypto code. (In fact I did about half the work on my phone while eating a burrito bowl at Chipotle.)

After an hour or two of hunting through dependencies, I made the mistake of tweeting about my feelings:

I decided to look at MetaMask’s crypto, and oh wow I wish I could unlook.

— Matthew Green (@matthew_d_green) January 12, 2022

I swear that this offhand Tweet was not meant to make anyone unhappy or scare people that their funds were at risk. Although I had a couple of scary moments, those were entirely on me. So let me be clear as possible right now: I did not find any exploitable vulnerabilities in MetaMask’s crypto!

What I did come back with is an uncomfortable feeling about the complexity and quality of MetaMask’s (current) crypto code, and some unhappy feelings about its dependency structure. Some of this stuff is basically unavoidable: file it under “browser crypto is scary.” Some is specific to the way the code is laid out. And some of it is a (non-trivial) gripe: this code is is much harder to audit than it should be.

And this last part isn’t just a personal gripe. My TL;DR is that finding and reviewing the correct code was made difficult because there were far too many different organizations owning the dependencies that led to the crypto routines themselves. This made me uncomfortable, given how much money these routines are responsible for securing.

In this post I’m going to justify these opinions by walking you through a casual skim of MetaMask’s code and and crypto dependencies. To make you feel “like you were there”, I’m going to discuss the review more or less in the order it occurred — including the embarrassing part where I went off the rails and reviewed the wrong code.

If you enjoy reading crypto code for entertainment, this post might be for you. If you’re hoping that this will actually lead to anything exciting, I suspect you’ll be very disappointed.

Mục lục bài viết

Quick explainer: what is MetaMask?

If you already know what MetaMask does, skip this part.

Since many readers have probably never used DeFi, I figure I should give a quick background here on “web3” and MetaMask in particular.

From a technical perspective, web3 is pretty straightforward. It consists of many “decentralized apps” (dapps), each of which typically comprises a (typically Javascript) front-end web app, as well as some back-end business logic. What makes dapps special is the back-end portion, which is decentralized. Generally this means it relies on a smart contract running in a network like Ethereum.

Web3 front-ends are just web apps, and typically they’re served to your browser via standard web infrastructure. (Security here usually means HTTPS, so anyone who hacks a server or steals a Cloudflare API key can change these apps’ code in malicious ways.)

Of course, web apps can’t communicate directly to blockchains, nor are they a good place to store private keys. This is why MetaMask exists. In its most popular instantiation, MetaMask ships as a browser extension for Chrome and Firefox. When a web application needs to send a transaction to a smart contract (e.g., because you want to deposit money into Aave), MetaMask is responsible for signing the transaction and shipping the transaction to the chain.

On the bright side for this review, the actual cryptography in MetaMask is fairly limited: as a wallet, it must generate and store Ethereum public and private keys and it also needs to handle “simple” operations like ECDSA signing. On the other hand: we’re in a browser. So nothing is actually simple.

Climbing the dependency ladder

To put some bounds on the effort required, I decided to only look at the implementation of ECDSA signing and key generation (excluding the curve operations.) This seemed like an easy task I could get done in an hour or two while eating an extended lunch.

Finding the starting point for a review like this was harder than I expected. In short: it isn’t easy to grok the control flow of a complex event-based Javascript browser extension to find out exactly which calls are “real” and which are tests or dead code. To shortcut this problem, my approach amounts to “grepping and hoping”, starting from the “develop” branch of MetaMask’s extension repository.

A quick search for “sign” leads us to a promising call in the file metamask-controller.js:

Sadly, following this call path quickly leads us into dependency hell.

First, we reach a library called eth-keyring-controller, which presumably manages Ethereum keyrings. A quick scan of that library shows it calling a second dependency: eth-sig-util. (Both are NPM packages developed by MetaMask.) We’ll jump right to the latter package, which… you guessed it, calls yet another package:

This call takes us out of MetaMask-owned code and out to a new package called “ethereum-jsutil” that’s maintained by the “Ethereum Javascript community“, whoever they are. (Don’t worry, we won’t stay here long enough to care.) Of course, in this package we find yet another layer of dependency indirection:

From here, we head over to a package called “js-ethereum-cryptography“, which holy cow is actually a repository maintained by the Ethereum project itself!

At this point I’m about halfway into my allotted review time and asking myself questions like “why didn’t MetaMask just call this library directly” and “why do I make poor life choices”. But never mind, we’re almost there: surely we are going to see some actual crypto code soon!

Except no, we are not. This is what we find in the Ethereum library:

Yet another dependency: this time to something called “noble“. Here my quick-and-dirty approach to dependency resolution (Google “npm <package-name>”) fails me, since NPM says that “noble” is is a “Node.js Bluetooth Low Energy library.” As cool as that would be — access the blockchain through Bluetooth! — I’m guessing this is not right.

A bit more searching leads me to believe that noble is actually the Noble cryptography library, which appears to have been developed by Paul Miller. And hey, this code looks promising! The page lists actual cryptography design goals that seem reasonable, and the code is written in TypeScript. Even better: the library has been subject to an audit by Cure53.

Nonetheless, and with no disrespect to Paul or anyone else in this community: I would like to take a moment to complain about this dependency structure:

Resolving all these pointless dependencies has eaten up a lot more of my review time than you might imagine. (I’m leaving out all the times I accidentally visited the wrong libraries because they used some combination of “js” and “ethereum” and “cryptography” and just Googling is risky here.)
More substantively, I can’t help but notice that there are a lot of code owners in this critical path. So far I’ve traveled through repositories owned by four different organizations, and the last one is someone’s personal Github account. Is this normal for a system that secures billions of dollars of user funds? Maybe it should not be.

But enough complaining. There is actual crypto code here! We can finally look at ECDSA.

Except… it turns out that we can’t do that, because I made a stupid mistake.

After speaking with Paul Miller on Twitter, I learned that this whole code review has been premised on a very foolish assumption: namely that the code in the main (development) branches of MetaMask and its dependencies is actually the code people are using in MetaMask today. That turns out to be wrong. In fact, Paul tells me, the noble library is slated for deployment in an upcoming release. The current release of MetaMask relies on a library called “elliptic“, which was written by Fedor Indutny.

I’m now scraping up those last bits of cheese in the burrito bowl.

Let’s look at elliptic

Ok, so forget everything I did above. That was my mistake. I promise to come back to those newer code paths soon, but that’s for the future. My goal in this review was to look at MetaMask as it exists today. Apparently that means I need to look at Fedor’s elliptic library.

(Note: if I was being professional then what I would really do is review all the release branches of MetaMask and all of its dependencies just to make sure I’m in the right place, and to see what the calls looked like. But life is too short: I’m going to trust Paul.)

The elliptic library is written in “plain old” Javascript, so types will tend to be confusing. On the bright side, it’s relatively compact. The core library supports Ed25519 (for EdDSA) and Secp256k1 for ECDSA on networks like Ethereum. Let’s focus on the latter.

I’m not a Javascript developer so certain things aggravate me about reviewing this code. One is that developers often put critical routines in stupidly-named files like “index.js”, which is where we finally reach the core signing implementation for ECDSA.

Signing

ECDSA is a stupid algorithm, but fortunately it’s not a very complicated one. Leaving aside the curve operations, there are basically two places where ECDSA implementations tend to go wrong. The first is in key generation. The other is in signing.

In ECDSA signing the main security risk is in how nonces are generated: it’s critical that the ECDSA nonce (“k”) is sampled uniformly from the range 1 … n-1, where n is the group order. Critically, the same nonce must never be used with two distinct messages (really, message hashes.) If this ever happens, it’s possible to recover a private key from two signatures, something that’s generally frowned upon.

Let’s look at how signing is handled in this library:

So looking at the code above, how do we get “k“?

A first observation is that there is no actual randomness here, no call to a random number generator. Instead, the signing routine instantiates a deterministic random bit generator (DRBG) based on HMAC. That algorithm stretches a shorter “seed” into a longer sequence of pseudorandom bits that can then be used to obtain our random nonce.

I initially had a small heart attack when I (briefly!) misread the code and thought that the only seed for the DRBG was the ECDSA private key: this would definitely lead to nonce re-use. On closer inspection, the DRBG is seeded with two fields: entropy is set to the ECDSA private key, and nonce is set to the (hashed) message to be signed. Within the underlying DRBG implementation both values get hashed into the a value called seed, which actually is the seed for the DRBG. This design should mean that each (private key, message) pair will get different random bits, and thus different nonces “k”.

While I don’t love a purely deterministic generation of “k” (and I’ll explain why below), this isn’t some roll-your-own idea: in fact this is appears very similar to the approach recommended by RFC 6979. And going one step farther, it appears that elliptic is actually directly implementing an algorithm from that RFC, specifically the one in section 3.3, “Alternate Description of the Generation of k.” Annoyingly I did not learn this fact from the code itself, which would have saved me a lot of time. I realized it only while reading through the RFC for unrelated reasons (in fairness, elliptic’s README does mention it.)

(A brief aside to developers: if you implement a standard algorithm, please please add a reference in comments at the point where you’re using it, and comment each step so reviewers can see exactly how it maps to the pseudocode. For a good example of how to do this, take a look at how noble implemented the same standard.)

So are there any issues with this implementation?

A particularly weird feature is that the caller of the sign() function can pass in a data structure called options. One element of this structure appears to be a function, which is named (not very descriptively) options.k. If included in the options structure, this function appears to override the built-in nonce generation and replaces it with logic the caller provides. This is weird. While I can see an argument for allowing callers to provide extra entropy, I can’t see a good reason why a caller should be allowed to entirely override nonce generation within the signing routine.

Overall, this seems like a giant footgun and also a good opportunity for some caller to slip in malicious code that could easily slip past a review. (Indeed, thanks to all this dependency confusion I’m not even sure which of MetaMask’s dependencies is calling this routine.)

As a weird added bonus, the same options struct can also be passed in the enc argument as well; if this happens, the structure will be copied over into options. I’m sure there are perfectly good reasons for this.

A few smaller notes:

The function _truncateToN() seems poorly defined and serves two slightly different purposes, triggered by a boolean flag — which is just omitted in some calls.
In the first mode it simply truncates the message to ensure it is the same length as the group order n. This mode is only used to truncate nonces, which is not something that should ever need to happen — except when you’re using the crazy options.k function discussed above.
When called with “truncOnly = false“, however, _truncateToN() will also ensure that the truncated result is less than or equal to n, and if not it will subtract n. This appears to implement the bits2octets subroutine of RFC 6979, which is fine I guess but took too long to figure out. Why not just write two routines and name them appropriately?

Neither of these two issues are critical, but they made the code harder to read.

As a final note: I would urge developers to avoid these purely deterministic ECDSA implementations, mostly because they make signing implementations very fragile. Since all of the “entropy” in this signing routine comes from the message and private key, this means that any non-determinism in the actual ECDSA implementation (e.g., caused by weird option flags or glitches in BigNum encoding) can potentially produce exploitable signatures if the same message is ever signed twice. (As Nadia points out: this could also happen if you install the same private key into two different wallet implementations and sign the same message.)

In either case: adding some genuine entropy to the nonce could remove these concerns in most cases. Section 3.6 of RFC 6979 describes how to do this by adding an additional random input.

Key generation

Update (1/14): the comments below discuss key generation in elliptic, but MetaMask uses a BIP39-based keyphrase, which means that most ECDSA keys are actually generated elsewhere (allegedly in bitcoinjs/bip39). In retrospect this should have been obvious to me, but thanks to @kumavis_ for pointing it out. I’m keeping the section below just to be complete, but this code (probably) isn’t used by MetaMask.

The last stop on this very brief tour is the ECDSA key generation algorithm. This be found slightly higher up in the same file:

Once again we have an appearance from our old friend HMAC-DRBG. As with the signing routine, the caller can supply an options structure. Fortunately this one only contains optional entropy, and does not allow the caller to completely override key generation. Even better: the input to HMAC-DRBG appears to include actual randomness, drawn from a call to a function called rand().

Does rand() produce actual random numbers in all browsers? I think so, but really could not tell you for certain. This routine is implemented by a package called indutny/brorand, the entirety of which I’m pasting below, just because it made me burst into maniacal laughter:

Let’s just assume the entropy is good. The remainder of the (private) key generation function takes place in these lines:

This generates a random number priv between [0…2^{256} – 1] and then makes sure that priv is not greater than the group order n, in which case it re-generates priv. It then adds 1, which I think gives us a priv in the range [1…n+1], which seems wrong, since it should be in the range [1…n-1]. Am I doing my arithmetic backwards?

A basic point to make here is that much of this review has come down to examining code that solves a single problem: (deterministically) sampling integers within a precise range. So far I’ve reviewed at least two different custom implementations of the same process, both with slightly different results. Why is the same code repeated so often? Just make this a subroutine so we can analyze it and be sure it’s working correctly.

Anyway the (possible) issues above probably don’t really matter. I’ll assume in most cases private keys are generated correctly using a secure random number generator, which takes the most obvious risks off the table.

Conclusions

Ok, I am basically just exhausted now. This was a lot of work to evaluate a tiny piece of a crypto library that, frankly, might not even be the actual crypto library that MetaMask is using. I’m not sure if any of this was worth it, and I’m starting to get indigestion from the Chipotle.

If I could summarize my overall findings above, they would look something like this:

I do not like reviewing Javascript for many reasons, not least of which is the lack of types. Fortunately many MetaMask dependencies are written in TypeScript. This library should be too. (Fortunately Noble appears to have made this upgrade.)
The stupid dependency chain between MetaMask and its crypto libraries makes reviews more difficult than they need to be, and adds too many parties that need to be trusted. This should be greatly simplified, unless there is a reason for it that I don’t understand.
The crypto code may be well-written in a cryptographic sense, but it was not really optimized for humans to review. This makes any review much harder, and I think anything that makes reviews harder is bad for users.
Allowing callers to specify dangerous optional arguments to crypto routines seems bad.
If you’re writing a security-critical routine like “sample random integer in a precise range,” write that routine one time, not multiple times. Please!
Too much dependency on random numbers can make ECDSA dangerous. I would argue that too much determinism can be just as risky. It might be good to find a solution that mixes the two, as discussed forther above.

If it was up to me I would re-write the entire codebase in TypeScript and would try to use more standard libraries. I would remove layers of dependencies. I would tighten up the crypto APIs to make sure malicious calls are harder to get away with. Finally, I would make sure every single major block in this code had crystal-clear comments explaining precisely what it is doing.

As a last point, I would move all of this code back under the control of some more centralized organization(s), rather than leaving essential code in random personal repositories.

In summary: I don’t think this tiny review was entirely a waste of time. Although I don’t love everything about the way this code works, I’m now 15% more confident that MetaMask isn’t doing something utterly stupid with my cryptographic keys. That seems like a genuine win.

Next post: I’m going to review the noble-cryptography libraries to see what the new MetaMask code is planning to do.