Spatial analysis of global Bitcoin mining | Scientific Reports

The validation of Bitcoin transactions is enabled by its proof-of-work (PoW) consensus mechanism1. Bitcoin miners search for hash values in competition for the right to record each block of transactions, and the successful creator of each block is rewarded with a certain amount of bitcoins. This process is called ‘Bitcoin mining’2,3. At the very beginning, mining was supported by only a few participants equipped with regular computers4. The surge in the Bitcoin price and in mining profitability drew ever more computing power into the game. Moreover, specialized mining rigs were quickly designed, manufactured and upgraded5. Mining sites were purposefully selected and developed. Huge amounts of energy and resources were put into the mining industry6,7,8.

Bitcoin and its mining activity have attracted attention in a variety of fields, including but not limited to blockchain technology2,3, financial econometrics9,10, and sustainability issues7,8,11,12,13,14. Exploring the spatial distribution of Bitcoin mining provides new angles and evidence for a large portion of the extant literature. In particular, a spatial perspective helps to verify the decentralized design of blockchain technology, to identify certain kinds of price effects on cryptocurrencies and to make accurate estimates of the energy consumption and carbon emissions of mining activity.

Some sustainability studies have contributed valuable tracking ideas and interesting mapping outputs on the spatial aspects of mining activity15,16,17,18. Nevertheless, the spatial analyses produced as by-products of these studies are still limited in terms of data granularity and analytic methods. On the other hand, geographers and economists have a long tradition of describing the geographical locations, patterns and dynamics of human production and trading activities19,20,21,22. Bitcoin mining behaves quite differently in space from conventional industrial activities, yet barely any work has been published on the spatial behaviour of this nascent activity. Therefore, in this paper we aim to fill this gap by investigating the spatial patterns, characteristics and shaping forces of mining activity, and to understand, from a spatial perspective, the implications for the aforementioned topics from adjacent fields.

We carried out the research by extracting the hash rate data from millions of mining records and then desensitizing, geocoding and aggregating the data by hash rate, month and location (with unique longitude and latitude coordinates). To facilitate the spatial analysis, we divided the surface of the Earth into hexagonal grids (n = 7205) and placed the hash rate data and the global power plant data23 within the same grid system through a multilayer spatial join. We then computed spatial statistics over the processed data sets. We identified four kinds of spatial phenomena of mining activity: diffusion, concentration, association and fluctuation. Furthermore, we put the results in the context of the drivers and stages of Bitcoin mining to better understand the causes of such spatial formations. The data sources and the step-by-step approaches are detailed in the “Methods”.

Basics of mining activity

Before diving into the spatial analysis, we explain some basics of mining activity. Three key factors that influence Bitcoin miners’ behaviour are economic incentives, technological progress and regulatory schemes. Although there are a number of studies on the economics of Bitcoin mining24,25,26, we simplify the economic concepts of mining here to better understand how they relate to spatial choices. In Eq. (1), \(P_{ij}\) is the mining profit for period i at location j, which is an important indicator for potential participants deciding whether to enter the industry at that period and location. In Eq. (2), \(GM_{ij}\) is the gross margin for period i at location j, which is another indicator that miners use to decide whether the mining rigs should be on or off.

$$ P_{ij} = TR_{ij} {-}FC_{ij} {-}VCA_{ij} {-}VCB_{ij} $$

(1)

$$ GM_{ij} = TR_{ij} {-}VCA_{ij} {-}VCB_{ij} $$

(2)

where \(TR_{ij}\) is the total mining revenue for period i at location j, which is determined by the miner’s hash rate contribution, the bitcoins gained in the total network and the exchange rate. \(FC_{ij}\) is the fixed cost for period i at location j, which consists of the amortization cost of hardware and initial settlement. \(VCA_{ij}\) is the variable cost (Type A) for period i at location j, which changes along with hash rate and mainly consists of the electricity cost. \(VCB_{ij}\) is the variable cost (Type B) for period i at location j, which also varies, but not strictly with hash rate, e.g., labour, bandwidth, cooling and other maintenance costs.
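As a worked illustration of Eqs. (1) and (2), the following minimal Python sketch evaluates both indicators for one period and location. The class name, the field names, the numbers and the rule "keep rigs on while the gross margin is positive" are illustrative assumptions, not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass
class MiningPeriod:
    """Revenue and costs for one period i at one location j (same currency unit)."""
    total_revenue: float    # TR_ij
    fixed_cost: float       # FC_ij: amortized hardware and initial settlement
    variable_cost_a: float  # VCA_ij: scales with hash rate, mainly electricity
    variable_cost_b: float  # VCB_ij: labour, bandwidth, cooling, maintenance

    def gross_margin(self) -> float:
        # Eq. (2): GM_ij = TR_ij - VCA_ij - VCB_ij
        return self.total_revenue - self.variable_cost_a - self.variable_cost_b

    def profit(self) -> float:
        # Eq. (1): P_ij = TR_ij - FC_ij - VCA_ij - VCB_ij
        return self.gross_margin() - self.fixed_cost

    def keep_rigs_on(self) -> bool:
        # Assumed decision rule: running rigs is worthwhile while GM_ij > 0.
        return self.gross_margin() > 0


# Hypothetical figures: running costs are covered, fixed costs are not.
example = MiningPeriod(total_revenue=1000.0, fixed_cost=300.0,
                       variable_cost_a=600.0, variable_cost_b=150.0)
print(example.gross_margin(), example.profit(), example.keep_rigs_on())
```

In this hypothetical case an existing miner would keep the rigs running because the gross margin is positive, while a potential entrant would see a negative profit once fixed costs are included.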

Three key takeaways are worth noting here: (i) any economic decision made by miners is based on the dynamics at a specific period and location but not on the static assumptions regardless of spatiotemporal factors; (ii) revenue factors are almost the same worldwide, while cost factors are highly localized. This means that miners obtain the same economic incentive regardless of where they are located. However, the cost breakdown of mining activity differs from location to location; (iii) it is difficult to achieve a real break-even point because of the high volatility of the Bitcoin price and the constant change in mining competition.

Technological progress intensifies the arms race of mining activity and makes it ‘portable’. Mining hardware has quickly upgraded from central processing units (CPUs), graphics processing units (GPUs) and field-programmable gate arrays (FPGAs) to application-specific integrated circuits (ASICs), with an exponential increase in computational performance and energy efficiency5. This has influenced both the revenue and cost sides of the aforementioned economic equations. Meanwhile, a set of modern technologies (including communication, engineering, logistics, etc.) allows mining activity to move and relocate easily in space, as a ‘portable industry’.

Regulatory attitudes towards Bitcoin mining vary significantly from jurisdiction to jurisdiction27. Some regulators view it favourably as a data centre, cloud computing or fintech business, while others treat it as a traditional energy-intensive industry or a speculative bubble. Even within the same country, different sub-regions may hold totally different views. For example, mining activity was temporarily banned in Plattsburgh, New York28, while it became more favourable in Austin, Texas, due to cheap electricity and a relaxed regulatory environment29. The lack of a clear global-level regulatory framework on how to define and regulate mining activity leaves room for Bitcoin miners to maneuver around the world.

Theoretically, mining activity is therefore free to move wherever it wants to exist. This is different from most industrial activities today, which are tightly constrained in space by two or more factors (e.g., resources, raw materials, talent and labour, market, transportation, regulatory permission). In addition, Bitcoin mining, to some extent, can be viewed as a prototype of the autonomous economy30 (Supplementary Note 2). That is to say, the algorithm, the economic formula and the built-in technology determine the suitable locations for mining and drive human activity to move accordingly.

Spatial diffusion and concentration

It is natural to expect mining activity to be diffused all over the world given its technical enablers and economic incentives. However, it is still astonishing to see how widely it is distributed. By tracking the nodes connecting to one of the leading mining pools (“Methods”), we detected that mining activity existed in over 6000 geographical units from 139 countries and regions (Fig. 1). Besides well-known locations (e.g., China, Iceland, the US), mining activity was also detected at unexpected locations, such as Tahiti (an island in French Polynesia, in the South Pacific) or Malawi (a landlocked country in Southeast Africa). If we divide the surface of the Earth into hexagonal grids (n = 7205), we notice that 933 grids, covering 44.3% of Earth’s land surface (Supplementary Note 3), have been found to have a Bitcoin mining footprint (Fig. 2). Owing to the arms race in computing efficiency, nonspecialized machines such as desktops, laptops, consoles and smartphones were squeezed out. Otherwise, the spatial presence would be overwhelming if all the spare capacity of those devices were put into mining activity.

Figure 1

Global presence of Bitcoin mining activity. All mining locations detected (n = 6062) are mapped by their unique longitude and latitude coordinates. Details of each location are provided in Supplementary Table S2. The results are based on the monthly data from June 2018 to May 2019. The map is created by Geoda 1.18 (http://geodacenter.github.io/download.html).

Figure 2

Share of computing power in terms of hash rate by grid. The share of computing power in each grid is represented as a percentage of total hash rates. All grids (n = 7205) are divided into six tiers with Tier 1 grids (n = 18, share of hash rate ≥ 1%), Tier 2 grids (n = 97, 1% > share of hash rate ≥ 0.1%), Tier 3 grids (n = 162, 0.1% > share of hash rate ≥ 0.01%), Tier 4 grids (n = 211, 0.01% > share of hash rate ≥ 0.001%), Tier 5 grids (n = 445, 0.001% > share of hash rate > 0) and Tier 6 grids (n = 6272, share of hash rate = 0). The results are based on the monthly data from June 2018 to May 2019. Details of the statistics are supplied in “Methods” and the repository as noted. The map is created by Geoda 1.18 (http://geodacenter.github.io/download.html).

Although a small portion of miners are hobbyists or believers, the majority of miners nowadays are mining for economic purposes. They should therefore tend to concentrate in locations with a competitive advantage for mining. Our results demonstrate this tendency by aggregating and counting all hash rates of individual locations within each grid (Fig. 2). Eighteen top-tier grids (share of hash rate ≥ 1%) accounted for 61.8% of the total computing power during our study period. In fact, miners not only concentrate in a few grids but also cluster with each other in adjacent grids. Moran’s I statistic is used to measure the spatial concentration of mining activity (“Methods”). The result strongly rejects the null hypothesis of spatial randomness (I = 0.65, pseudo p = 0.001 for 999 permutations, z = 97.8). In other words, mining activity demonstrated a strong tendency towards concentration in terms of computing power. We investigate this further with Getis and Ord’s Gi statistic (“Methods”) to identify the hot spots (High-High cluster cores) of mining activity at different significance levels (Fig. 3). Our data extended from June 2018 to May 2019. The maps of spatial concentration and hot spots may change afterwards, which will be addressed in section “Spatial fluctuation”. In addition, mining activity is also concentrated virtually, in the form of mining pools. An increasing number of miners are now joining pools to optimize the scanning of hash values and share returns based on their computing power contribution3,16. In this analysis, we focus on the spatial phenomena in the physical world, so we will not pursue that in detail here.
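The tiering by share of hash rate described in the caption of Fig. 2, and the share held by top-tier grids, can be sketched as follows. This is a minimal illustration with made-up hash rates; the function names are hypothetical and the thresholds are those quoted in the caption.

```python
import numpy as np


def tier_of_share(share_percent: float) -> int:
    """Map a grid's share of total hash rate (in %) to the six tiers of Fig. 2."""
    if share_percent >= 1.0:
        return 1
    if share_percent >= 0.1:
        return 2
    if share_percent >= 0.01:
        return 3
    if share_percent >= 0.001:
        return 4
    if share_percent > 0.0:
        return 5
    return 6  # no mining detected in the grid


def summarize(hash_rates):
    """Return the tier of every grid and the combined share (%) of Tier 1 grids."""
    hash_rates = np.asarray(hash_rates, dtype=float)
    shares = 100.0 * hash_rates / hash_rates.sum()
    tiers = np.array([tier_of_share(s) for s in shares])
    return tiers, shares[tiers == 1].sum()


# Toy hash rates for five grids (arbitrary units, not real data).
toy_hash_rates = [500.0, 300.0, 198.0, 2.0, 0.0]
tiers, top_tier_share = summarize(toy_hash_rates)
print(tiers, top_tier_share)
```

Applied to the real grid totals, the same computation would yield the tier counts in Fig. 2 and the 61.8% share reported for the eighteen Tier 1 grids.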

Figure 3

Hot and cold spots of Bitcoin mining activity with the corresponding significance map. (a) The hot spots (High-High clusters) and cold spots (Low-Low clusters) under the default setting of 999 permutations and a p-value ≤ 0.05 are marked in red and blue, respectively. (b) The corresponding significance map shows the clusters with the degree of significance reflected in increasingly darker shades of green, starting with 0.01 < p ≤ 0.05 (n = 215), then 0.001 < p ≤ 0.01 (n = 48) and p ≤ 0.001 (n = 5342). The ‘Not Significant’ category with p > 0.05 remains the same in Maps (a) and (b). Details of the statistics are supplied in “Methods” and the repository as noted. The results are based on the monthly data from June 2018 to May 2019. The maps are created by Geoda 1.18 (http://geodacenter.github.io/download.html).

Moran’s I statistic

$$ I = \frac{n}{\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{ij}} \cdot \frac{\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{ij} \left( X_{i} - \overline{X} \right)\left( X_{j} - \overline{X} \right)}{\sum_{i = 1}^{n} \left( X_{i} - \overline{X} \right)^{2}} $$

(3)

where \(X_{i}\) and \(X_{j}\) are the hash rates for grids i and j, \(\overline{X}\) is the arithmetic mean of the hash rate for all grids, \(w_{ij}\) is the spatial weight between grids i and j, and n is equal to the total number of grids.
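A minimal NumPy sketch of Eq. (3), together with a permutation-based pseudo p-value, is given below for illustration. The toy one-dimensional neighbourhood (`line_weights`), the one-sided p-value convention and the random seed are assumptions made for this example; the analysis itself uses hexagonal grids and GeoDa.

```python
import numpy as np


def morans_i(x, w):
    """Global Moran's I of values x under spatial weight matrix w (Eq. 3)."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    z = x - x.mean()                       # deviations from the mean
    return (x.size / w.sum()) * (z @ w @ z) / (z @ z)


def permutation_test(x, w, permutations=999, seed=0):
    """Pseudo p-value and z-score from randomly permuting values over locations."""
    rng = np.random.default_rng(seed)
    observed = morans_i(x, w)
    sims = np.array([morans_i(rng.permutation(x), w) for _ in range(permutations)])
    pseudo_p = (np.sum(sims >= observed) + 1) / (permutations + 1)
    z_score = (observed - sims.mean()) / sims.std()
    return observed, pseudo_p, z_score


def line_weights(n):
    """Toy binary weights: cells on a line, each adjacent to its neighbours."""
    w = np.zeros((n, n))
    idx = np.arange(n - 1)
    w[idx, idx + 1] = 1.0
    w[idx + 1, idx] = 1.0
    return w


w = line_weights(6)
clustered = morans_i([1, 1, 1, 0, 0, 0], w)    # similar values sit together
alternating = morans_i([1, 0, 1, 0, 1, 0], w)  # neighbours always differ
observed, pseudo_p, z_score = permutation_test([1, 1, 1, 0, 0, 0], w)
print(clustered, alternating, pseudo_p)
```

With 999 permutations the smallest attainable pseudo p-value is 0.001, which is the value reported for the hash rate data.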

Getis and Ord’s Gi statistic

$$ G_{i} = \frac{\sum_{j = 1}^{n} w_{ij} X_{j}}{\sum_{j = 1}^{n} X_{j}},\quad \forall j \ne i $$

(4)

where \(X_{j}\) is the hash rate for grid j, \(w_{ij}\) is the spatial weight between grids i and j, and n is equal to the total number of grids.
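For illustration, the local form of the Getis-Ord statistic as it is commonly defined, G_i = Σ_{j≠i} w_ij X_j / Σ_{j≠i} X_j, can be sketched in NumPy as below. The toy line-shaped neighbourhood and the values are assumptions for this example, and significance testing (conditional permutation, as in GeoDa) is omitted.

```python
import numpy as np


def local_g(x, w):
    """Local Getis-Ord G_i for every location, excluding the location itself."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float).copy()
    np.fill_diagonal(w, 0.0)          # enforce j != i
    numerator = w @ x                 # weighted sum over neighbours of i
    denominator = x.sum() - x         # sum over all j != i
    return numerator / denominator


# Toy example: six cells on a line, each adjacent to its direct neighbours.
n = 6
w = np.zeros((n, n))
idx = np.arange(n - 1)
w[idx, idx + 1] = 1.0
w[idx + 1, idx] = 1.0

hash_rate = [8.0, 6.0, 4.0, 1.0, 1.0, 0.0]   # arbitrary units, not real data
g = local_g(hash_rate, w)
print(g)
```

Locations surrounded by high values obtain a large G_i (hot spot candidates), whereas locations surrounded by low values obtain a small G_i (cold spot candidates).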

Spatial association

As illustrated in Eqs. (1) and (2) and corroborated by our interviews and other studies7,11,15,16, the most significant variable cost for mining activity is the cost of the electricity that powers mining facilities. Most miners should therefore be inclined towards locations that can provide cheap and constant sources of power. We put the global power plant data23 into the aforementioned hexagonal grid system and computed the bivariate Moran’s Ixy statistics (“Methods”) between hash rate and the capacity of all energy types, fossil energy and renewable energy, respectively. The results indicate a highly significant spatial association between hash rate and all three energy variables (Fig. 4), though Moran’s I between hash rate and fossil energy (Ihf = 0.57) is slightly higher than that between hash rate and renewable energy (Ihr = 0.51). Furthermore, we designed a ‘spatial hit’ index (“Methods”) to identify areas suitable for renewable mining (Fig. 5), such as the Nordic region (Hydro/Geothermal), US-Canada border areas (Hydro), the central US (Wind), the Mekong River area (Hydro), and the Caucasus (Hydro).

Figure 4

Bivariate Moran’s scatter plots and reference distributions between hash rate and different energy variables. (a–c) Bivariate Moran’s statistical results between the hash rate and the capacity of all types of energy (a), fossil energy (b), and renewable energy (c) demonstrate the degree of spatial association between them. The scatter plot is depicted with the spatially lagged energy capacity on the y-axis and the original hash rate on the x-axis. The slope of the linear fit to the scatter plot equals Moran’s I. The reference distribution, obtained by randomly permuting the observed values over the locations, is depicted as a distribution curve on the left. The short line shows the value of Moran’s I, well to the right of the reference distribution. Details of the statistics are supplied in “Methods” and the repository as noted.

Figure 5

‘Spatial hit’ index indicates the potential locations suitable for renewable mining. Grids with ‘spatial hit’ index = 2 (i.e. suitable for renewable mining) are highlighted in green (n = 247). Details of the definition and calculation of the index are provided in “Methods”. The results associated with this map are shown in Supplementary Table S4. The results are based on the monthly data from June 2018 to May 2019. The map is created by Geoda 1.18 (http://geodacenter.github.io/download.html).

Bivariate Moran’s Ixy statistic

$$ I_{xy} = \frac{n}{\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{ij}} \cdot \frac{\sum_{i = 1}^{n} \sum_{j = 1}^{n} w_{ij} (X_{i} - \overline{X})(Y_{j} - \overline{Y})}{\sum_{i = 1}^{n} \sum_{j = 1}^{n} (X_{i} - \overline{X})(Y_{j} - \overline{Y})} $$

(5)

where \(X_{i}\) and \(Y_{j}\) are the hash rate for grid i and the power capacity for grid j, \(\overline{X}\) and \(\overline{Y}\) are the arithmetic means of the hash rate and the power capacity for all grids, respectively, \(w_{ij}\) is the spatial weight between grids i and j, and n is equal to the total number of grids.
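The caption of Fig. 4 describes the bivariate statistic as the slope of a linear fit of the spatially lagged energy capacity on the hash rate. The sketch below computes that slope under two assumptions that are not stated in this section: both variables are standardized to zero mean and unit variance, and the weights are row-standardized. The function name and the toy data are hypothetical.

```python
import numpy as np


def bivariate_morans_i(x, y, w):
    """Slope of the spatial lag of standardized y regressed on standardized x."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    w = np.asarray(w, dtype=float)
    zx = (x - x.mean()) / x.std()
    zy = (y - y.mean()) / y.std()
    # Row-standardize the weights; rows without neighbours stay all-zero.
    row_sums = w.sum(axis=1, keepdims=True)
    w_std = np.divide(w, row_sums, out=np.zeros_like(w), where=row_sums > 0)
    lag_y = w_std @ zy                 # average of y over each grid's neighbours
    return (zx @ lag_y) / (zx @ zx)


# Toy example: six cells on a line, each adjacent to its direct neighbours.
n = 6
w = np.zeros((n, n))
idx = np.arange(n - 1)
w[idx, idx + 1] = 1.0
w[idx + 1, idx] = 1.0

hash_rate = np.array([1.0, 1.0, 1.0, 0.0, 0.0, 0.0])
capacity = 2.0 * hash_rate + 3.0       # energy capacity co-located with mining
i_xy = bivariate_morans_i(hash_rate, capacity, w)
i_xy_opposite = bivariate_morans_i(hash_rate, -capacity, w)
print(i_xy, i_xy_opposite)
```

A positive value indicates that grids with high hash rates tend to be surrounded by grids with high power capacity; significance can be assessed with the same permutation approach used for the univariate statistic.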

It is worth noting that the strong spatial association between mining activity and renewable energy is the outcome of an adaptive process. Renewable energy is not always the cheapest power source and can be expensive once transmission costs are included. However, most types of renewable energy (e.g., hydro) are ‘perishable’ in a way similar to fruit (cheap at the place of origin and worthless once spoiled). Renewable energy providers are therefore willing to offer miners heavy discounts during peak seasons18, which makes the surplus of renewable energy a perfect match for the ‘portable’ mining activity. Miners did not realize this at the early stage, but they learned and reacted through continuous testing and iteration. This will be further addressed in the next section.

Spatial fluctuation

When we drilled down to monthly data, we found that mining activity fluctuated in space based on the rolling twelve-month hash rate from June 2018 to May 2019. Here we use 1500 TH/s as the threshold to select grids with at least 100 mining rigs for our analysis (Supplementary Note 4). Grids with hash rate over 1500 TH/s (n = 229) were grouped into twelve clusters according to the characteristics of their monthly fluctuation, using cluster analysis with K-medoids (“Methods”). We further categorized the twelve clusters into four groups with reference to the real operational environment: ascending, descending, relatively stable and seasonal fluctuation (Fig. 6).
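The selection and clustering steps can be sketched as follows. This is a simple alternating (Voronoi-iteration) variant of K-medoids with Euclidean distance, written for illustration only; the exact algorithm, distance and initialization used in the study are described in “Methods” and may differ. Applying the threshold to the twelve-month mean and defining the fluctuation index as the monthly hash rate divided by that mean are both assumptions, and the data are made up.

```python
import numpy as np


def k_medoids(points, k, max_iter=100, seed=0):
    """Alternate between assigning points to medoids and updating the medoids."""
    points = np.asarray(points, dtype=float)
    n = points.shape[0]
    dist = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=2)
    rng = np.random.default_rng(seed)
    medoids = rng.choice(n, size=k, replace=False)
    for _ in range(max_iter):
        labels = np.argmin(dist[:, medoids], axis=1)
        new_medoids = medoids.copy()
        for c in range(k):
            members = np.flatnonzero(labels == c)
            if members.size == 0:
                continue  # keep the old medoid for an empty cluster
            # New medoid: the member with the smallest total distance to the others.
            within = dist[np.ix_(members, members)].sum(axis=1)
            new_medoids[c] = members[np.argmin(within)]
        if np.array_equal(new_medoids, medoids):
            break
        medoids = new_medoids
    labels = np.argmin(dist[:, medoids], axis=1)
    return medoids, labels


# Toy monthly hash rates in TH/s for five grids over twelve months.
months = np.linspace(0.0, 1.0, 12)
monthly = np.vstack([
    2000 + 1000 * months,   # ascending
    3000 + 1200 * months,   # ascending
    4000 - 2000 * months,   # descending
    2500 - 1000 * months,   # descending
    100 + 0 * months,       # below the threshold, dropped
])
selected = monthly.mean(axis=1) > 1500          # assumed: threshold on the mean
kept = monthly[selected]
fluctuation_index = kept / kept.mean(axis=1, keepdims=True)
medoids, labels = k_medoids(fluctuation_index, k=2)
print(labels)
```

Because a medoid is always an observed grid, the twelve-month profile of each medoid can be plotted directly as the representative of its cluster, as in Fig. 6a.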

Figure 6

Classification of the grids with differentiated fluctuation patterns. (a) Grids with hash rate over 1500 TH/s (n = 229) are divided into twelve clusters in four groups. The twelve-month fluctuation indices of medoids are plotted in the radar chart as representatives of each cluster. (b) All the observed grids are plotted with their respective categories, sharing the same colour scheme for each category as in panel (a). Details of the results are provided in Supplementary Tables S5, S6 and the repository. The results are based on the monthly data from June 2018 to May 2019. The map is created by Geoda 1.18 (http://geodacenter.github.io/download.html).

Every fluctuating grid fluctuated in its own way, which might follow a combination of multiple patterns and can only be explained explicitly case by case. However, four primary patterns are studied and summarized here. (i) Price effect: a drop in the Bitcoin price drives mining profitability down, as illustrated in Eqs. (1) and (2). Large mining farms choose to migrate to locations with greater cost advantages or to update their mining machines, while most individual or small miners are reluctant to take immediate action and wait for a suitable time to switch their mining rigs back on. All these factors lead to a change in computing power in grids, but to different degrees. (ii) Seasonal effect: some miners relocate periodically to leverage the discounts offered by suppliers within certain grids where there is surplus energy during the peak season (e.g., the rainy season for hydropower grids). Fluctuation also occurs when these miners move back to their original locations during the off-season. (iii) Regulatory effect: attitudes from regulators dramatically influence the behaviours of miners in related grids. Favourable measures (e.g., subsidies, tax benefits) encourage miners to move in, while adverse measures (e.g., bans, carbon taxation) drive miners out. (iv) Iterative effect: initial mining activity may start randomly from the grids where early believers, tech geeks or speculators live. Miners (in particular large ones) continue to learn and search for better mining locations. The process iterates towards optimal solutions, and the radius of search expands to adjacent grids and then gradually to the global scale. Thus, a considerable portion of the computing power at the original grids is relocated to better-optimized grids. Unfortunately, only part of this pattern can be observed within our study since the anonymity of the Bitcoin network makes it nearly impossible to recognize early mining locations.

Spatial fluctuation never ends. We notice that recent changes in regulatory policy towards Bitcoin mining in some jurisdictions (e.g., China’s crackdown in 2021) have triggered a new round of spatial fluctuation and migration. Bitcoin mining activity is in the process of moving towards a new spatial equilibrium31,32. We believe that the spatial analysis here will still be applicable in the new circumstances.