Decoding the Crypto Mindset with NLP: Bitcoin, Reddit, and FTX

The Bubble That Popped But Didn’t Deflate

When financial bubbles burst, they usually, you know, burst. So, when the FTX crypto exchange collapsed last November, many crypto skeptics expected bitcoin prices to fall to where they believed they rightly belonged: roughly zero. Yet, as of this article’s writing, bitcoin is worth more than in the lead-up to FTX’s implosion. So, what can we make of all this?

A key consideration is where crypto investors source their investment data. According to a 2021 study by the National Opinion Research Center (NORC) at the University of Chicago, crypto investors source 24% of their information from social media and only 2% from brokers and financial advisers. Trading platforms and crypto exchanges supply another 25% and 26%, respectively.

So, just how does this reliance on social media drive crypto market behavior? To find out, we applied natural language processing (NLP) techniques to crypto-related comments on different forums, or subreddits, on the social media platform Reddit and explored how the resulting sentiment analysis correlated with bitcoin prices.

Crypto Market Background

Subreddit          Subscribers (Millions)
CryptoCurrency     6
Bitcoin            4.8
personalfinance    17.3
stocks             5.1
Economics          3.1
StockMarket        2.6
investing          2.2
finance            1.7

The topic-specific discussion boards to which Reddit users subscribe are capable of moving markets. The wallstreetbets subreddit ignited the GameStop short-squeeze in 2021, for example, and demonstrated the vast influence these channels can have on finance and investing. Given crypto investors’ ubiquitous presence on social media, we expected the influence of these subreddits to be especially pronounced. The most popular financial and crypto-related subreddits based on their total number of subscribers are listed in the accompanying table. (wallstreetbets has banned discussion of crypto, so it is not included in our analysis.)

Each subreddit’s name gives a sense of its general focus, but the word clouds below provide a more granular picture. They correspond to our study period, 4 November 2022 to 15 January 2023, which runs from the lead-up to the 6 November FTX collapse through the date we conducted our analysis.

Subreddit Word Clouds, 4 November 2022 to 15 January 2023

Word clouds showing the CryptoCurrency, Economics, Bitcoin, investing, StockMarket, finance, stocks, and personalfinance subreddits

Of the hundreds of thousands of comments on these subreddits over the examination period, we isolated those that implied a crypto sentiment based on seed words indicating a general rather than specific connection to cryptoassets. FTX, for example, might betray a sentiment bias given the surrounding controversy, so we excluded it. Crypto, bitcoin, ethereum, cryptocurrency, cryptocurrencies, BTC, and blockchain, on the other hand, are more neutral and thus were among the seed words that guided our analysis, the results of which are summarized in the following table.
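
The seed-word filter described above can be sketched in a few lines of Python. This is an illustration only, not the code behind the analysis: the article names these seven seed words as being "among" those used, so the list is likely incomplete, and the case-insensitive whole-word matching rule and the sample comments are assumptions.

```python
import re

# Seed words named in the article (the full list may have been longer).
SEED_WORDS = [
    "crypto", "bitcoin", "ethereum", "cryptocurrency",
    "cryptocurrencies", "btc", "blockchain",
]

# Assumption: match whole words only, ignoring case.
SEED_PATTERN = re.compile(r"\b(" + "|".join(SEED_WORDS) + r")\b", re.IGNORECASE)


def is_crypto_related(comment: str) -> bool:
    """Return True if the comment contains at least one seed word."""
    return SEED_PATTERN.search(comment) is not None


# Hypothetical comments for illustration.
comments = [
    "Is BTC a good hedge against inflation?",
    "FTX was a disaster.",  # FTX is deliberately not a seed word
    "How should I pay down my student loans?",
]
crypto_comments = [c for c in comments if is_crypto_related(c)]
```

Whole-word matching keeps unrelated terms such as "cryptography" out of the sample, at the cost of missing compound spellings; a looser substring match would make the opposite trade-off.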

Subreddit Summary Statistics

Subreddit          Total Comments    Average Crypto-Related Comments per Day (1)    Number of Days with Crypto-Related Comments (2)
CryptoCurrency     130,055           1,782                                          73
Bitcoin            29,538            405                                            73
personalfinance    314               5                                              54
stocks             1,388             19                                             71
Economics          1,583             22                                             67
StockMarket        2,747             38                                             72
investing          2,547             35                                             72
finance            487               11                                             27

1. Only comments with at least one seed word are included.
2. Total number of days included in the analysis out of the 73-day examination period.

Model Methodology

We tested many open-source NLP models before selecting a fine-tuned RoBERTa model developed by students from the National University of Singapore (NUS-ISS) to conduct our sentiment analysis. The model was trained on 3.2 million comments from the StockTwits investing forum and was a natural choice given its similar domain and large training set. RoBERTa is based on the groundbreaking BERT model developed by Google’s artificial intelligence (AI) team in 2018. Through their ability to parse context, BERT models have increased the precision of NLP tasks by applying attention mechanisms, which determine how words relate to one another. These attention mechanisms are the same building blocks used in other large language models, such as ChatGPT by OpenAI.

The RoBERTa model labeled each crypto-related Reddit comment as 0 or 1, meaning bearish or bullish, respectively, and we used the daily mean of those labels as a proxy for sentiment. A 0.5 score, for example, indicated an equal number of bullish and bearish comments. Differences between the StockTwits and Reddit domains, and in how users comment on each, led to some inaccurate labeling. We believe this does not materially affect the results, however, because we are more concerned with how the FTX collapse changed sentiment than with the absolute level of sentiment toward cryptoassets.
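
To make the daily score concrete, here is a minimal sketch of the aggregation step, assuming each comment has already been labeled 0 (bearish) or 1 (bullish). The dates and labels are made up for illustration, and the function name `daily_mean_sentiment` is hypothetical.

```python
from collections import defaultdict
from datetime import date
from statistics import mean

# Hypothetical (date, label) pairs; 1 = bullish, 0 = bearish.
labeled_comments = [
    (date(2022, 11, 4), 1),
    (date(2022, 11, 4), 0),
    (date(2022, 11, 5), 0),
    (date(2022, 11, 5), 0),
    (date(2022, 11, 5), 1),
    (date(2022, 11, 5), 0),
]


def daily_mean_sentiment(rows):
    """Group labels by day and return {day: mean label}, sorted by day."""
    by_day = defaultdict(list)
    for day, label in rows:
        by_day[day].append(label)
    return {day: mean(labels) for day, labels in sorted(by_day.items())}


daily_sentiment = daily_mean_sentiment(labeled_comments)
# 4 November: one bullish, one bearish comment, so the score is 0.5.
# 5 November: one bullish out of four comments, so the score is 0.25.
```

Because the labels are binary, the daily mean is simply the share of bullish comments that day. With pandas, the same result could be obtained with a `groupby` on the date column followed by `mean()`.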

Results

For a more holistic picture, we combined all the non-crypto-related subreddits and plotted the five-day moving average of daily crypto sentiment in the crypto- and non-crypto-related subreddits as well as the price of bitcoin over the same interval. Below the first graph is the comment volume for each day.
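
The smoothing step might look like the sketch below. The article does not say whether the window was trailing or centered, or how days without comments were handled, so the trailing window and the gap-free sample series here are assumptions.

```python
def moving_average(values, window=5):
    """Trailing moving average; positions without a full window are None."""
    smoothed = []
    for i in range(len(values)):
        if i + 1 < window:
            smoothed.append(None)  # not enough history yet
        else:
            smoothed.append(sum(values[i + 1 - window : i + 1]) / window)
    return smoothed


# Hypothetical daily mean sentiment scores, one per day with no gaps.
daily_sentiment = [0.50, 0.40, 0.30, 0.20, 0.10, 0.60, 0.70]
smoothed = moving_average(daily_sentiment)
```

The first four entries are `None` because a five-day window is not yet full. With pandas, `Series.rolling(5).mean()` produces the same trailing average.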

Crypto and Non-Crypto Subreddits: Sentiment Five-Day Moving Average vs. Bitcoin Close Price

Chart showing Crypto and Non-Crypto Subreddits: Sentiment Five-Day Moving Average vs. Bitcoin Close Price

Sources: Yahoo! Finance, Reddit

The three time series share some similarities: Each shows crypto sentiment growing more bearish around the FTX collapse and recovering not long after, with the non-crypto subreddits lagging their crypto-specific peers. When the non-crypto subreddits are broken out, the relationship looks a bit more tenuous.

Economics Sentiment vs. Crypto Sentiment and Bitcoin Close Price

Chart showing Economics Sentiment vs. Crypto Sentiment and Bitcoin Close Price

investing Sentiment vs. Crypto Sentiment and Bitcoin Close Price

Chart showing investing Sentiment vs. Crypto Sentiment and Bitcoin Close Price

StockMarket Sentiment vs. Crypto Sentiment and Bitcoin Close Price

Chart showing StockMarket Sentiment vs. Crypto Sentiment and Bitcoin Close Price

personalfinance Sentiment vs. Crypto Sentiment and Bitcoin Close Price

finance Sentiment vs. Crypto Sentiment and Bitcoin Close Price

Chart showing finance Sentiment vs. Crypto Sentiment and Bitcoin Close Price

stocks Sentiment vs. Crypto Sentiment and Bitcoin Close Price

Chart showing stocks Sentiment vs. Crypto Sentiment and Bitcoin Close Price

Sources for Six Preceding Charts: Yahoo! Finance and Reddit.

There is no clear sentiment trend in the Economics, finance, and personalfinance subreddits, while StockMarket, stocks, and investing indicate increased bullishness a week or two before bitcoin prices resumed their ascent.

The correlation matrices below, which describe the relationship between each subreddit’s daily mean sentiment and bitcoin prices, tell much the same story. For example, crypto sentiment on Economics has a -0.034 correlation with the price of bitcoin, highlighted by the cell outlined in purple.
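
For readers who want to reproduce this kind of table, the sketch below builds a small correlation matrix from daily series. It assumes the Pearson coefficient, which the article does not specify, and every number in it is invented for illustration rather than taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series.

    Raises ZeroDivisionError if either series is constant.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def correlation_matrix(series):
    """Return {name: {name: correlation}} for every pair of series."""
    names = list(series)
    return {a: {b: pearson(series[a], series[b]) for b in names} for a in names}


# Hypothetical daily mean sentiment and bitcoin closing prices.
series = {
    "CryptoCurrency": [0.42, 0.35, 0.30, 0.38, 0.45],
    "Economics": [0.50, 0.55, 0.48, 0.52, 0.47],
    "BTC-USD": [21000, 18500, 16000, 16800, 17200],
}
matrix = correlation_matrix(series)
```

Each cell, such as `matrix["Economics"]["BTC-USD"]`, plays the role of one outlined cell in the chart. In pandas, `DataFrame.corr()` computes the same matrix and defaults to Pearson.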

Crypto Sentiment Daily Mean Correlation Matrix

Chart showing Crypto Sentiment Daily Mean Correlation Matrix

Sources: Yahoo! Finance, Reddit

So, how did each daily sentiment score relate to future bitcoin prices? To answer that question, we added three more datasets: one, two, and three days forward, or BTC-USD +1, +2, +3, respectively. CryptoCurrency had the highest correlation with the current BTC price (in red outline), while the Bitcoin subreddit had a relatively low correlation (in orange outline) but one that was increasing for future prices (in black outline), possibly suggesting some predictive power in sentiment scores.
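
One way to build the forward-looking comparisons is to pair each day's sentiment with the price a fixed number of days later. The sketch below assumes Pearson correlation and gap-free, day-aligned series; the helper name `forward_correlation` and the sample numbers are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def forward_correlation(sentiment, price, lead):
    """Correlate sentiment on day t with price on day t + lead."""
    aligned_sentiment = sentiment[: len(sentiment) - lead]
    aligned_price = price[lead:]
    return pearson(aligned_sentiment, aligned_price)


# Hypothetical series in which price echoes the previous day's sentiment.
sentiment = [0.2, 0.8, 0.4, 0.9, 0.1, 0.6]
price = [50, 20, 80, 40, 90, 10]

same_day = forward_correlation(sentiment, price, lead=0)
one_day_ahead = forward_correlation(sentiment, price, lead=1)
```

Each extra day of lead drops one observation from the sample, which matters for subreddits with few qualifying days. Calling `forward_correlation(price, price, lead)` gives the correlation of the price with its own future values, and in pandas the shifted column can be built with `Series.shift(-lead)`.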

The finance subreddit showed a negative correlation (in green outline). Due to the forum’s focus on traditional finance topics, such as finance-related careers, homework problems, and applications, community members may be more skeptical of bitcoin’s underlying value, which could explain the relationship. Of course, our crypto seed words were not especially common on the finance subreddit: They occurred on just 27 of the 73 days under review, the smallest sample among all our subreddits, so there may not be enough data to draw any firm conclusions.

Other subreddits demonstrated low correlations with bitcoin prices. StockMarket (in yellow outline) had a slightly lower correlation than CryptoCurrency for the same-day price of bitcoin but did not maintain the same relationship with future prices. The CryptoCurrency sentiment-bitcoin correlations one, two, and three days forward are directionally similar to those between the price of bitcoin and its future prices (in white outline) and are consistent with the autocorrelation often observed in stocks.

Implications

While the sentiment data from the various subreddits imply some correlation with bitcoin prices, an NLP model fine-tuned specifically on the Bitcoin subreddit rather than on StockTwits might make these results more robust and help evaluate the model’s accuracy. These caveats notwithstanding, our analysis raises some interesting questions about how social media forums can influence market performance. What’s especially compelling is how quickly sentiment rebounded after FTX’s collapse and anticipated bitcoin’s renewed price surge.

Such findings have a host of implications not just for the future of crypto investing but for investing more generally. As more and more people turn to social media forums to inform their investment decision making, herd behavior and self-reinforcing groupthink are likely to grow more common and drive investors to follow investment narratives with little or no basis in fundamental value. And if nothing else, independent of your views of crypto, that is a recipe for more market volatility.

All posts are the opinion of the author. As such, they should not be construed as investment advice, nor do the opinions expressed necessarily reflect the views of CFA Institute or the author’s employer.

About the Author(s)

Brian Pisaneschi, CFA

Brian Pisaneschi, CFA, is an Affiliate to the Research and Policy Center at CFA Institute. His research focus is at the intersection of data science and sustainability. He holds an MS in quantitative finance from the University of Bologna and a BS in finance and economics from Lake Superior State University. He obtained his CFA charter in 2017.