The goal of most asset managers and stock pickers is to generate alpha - that is, to outperform the market through superior skill. One such picker is Jim Cramer, host of CNBC's "Mad Money with Jim Cramer," a show featuring fiery market commentary, stock recommendations, and everything in between.
Jim is a fabled stock picker with an extraordinary ability to pick stocks that keep on... losing. A lot. So much so that there is a bountiful collection of Jim's picks followed by stock moves in the exact opposite direction. In fact, there were so many that people started calling it the "Jim Cramer inverse indicator": always go in the opposite direction of what Jim recommends, and you will make money.
Credit: Wall Street Memes, Instagram: @wallstbets
What started as a meme has since evolved into a business - there is already an ETF tracking the performance of Jim's picks, just in inverse.
The natural question arises: do Jim's (inverse) picks really generate alpha? We can rephrase it as: are Jim's picks wrong so consistently that you would outperform the market by always doing the opposite?
We can test it out quantitatively!
Note: This post serves solely as a fun project for learning coding, the basics of sentiment classification, and creating strategies with Python. In no way does it represent investment advice or a recommendation to actually trade the strategy.
The project is a great exercise to practice our coding and to explore areas of coding beyond pure finance. Let's look at the project's structure and the skills we will need to accomplish it.
Twitter is Jim's primary channel for communicating his stock picks (apart from his TV show) and the main source of all the memes. Thus, we will need a way to scrape his tweets to obtain his ideas about different stocks - an excellent data collection exercise.
In our last project, we predefined the tickers whose prices we downloaded from Yahoo Finance. This time, we will also need the names of the stocks associated with the tickers for a more robust classification. Again, it is a nice data collection exercise.
This is the truly difficult part, as Jim is a human, not a machine. As we will see later, the text in his tweets is somewhat messy and not exactly standardized. To the human mind, $AAPL, AAPL, Apple, or Apple Inc. makes little difference, and we have no problem recognizing which ticker is being traded.
However, these are almost completely different for a machine, and we will have to develop a robust way to recognize which stock Jim is talking about, with as few manual interventions as possible. This is truly an excellent data handling exercise not only for finance.
Another challenging part is deciding whether Jim recommends the stock based purely on the text. As humans with basic finance knowledge, we know that Jim recommends a stock to buy when he says, "$AAPL is undervalued, should go higher," but how does a machine know?
We will have to build a sentiment classification algorithm, which gives a little insight into the NLP (natural language processing) area of data science. Of course, building such an engine from scratch takes time and is far beyond the scope of this post. Thus we will rely on open-source libraries available in Python.
Once we know which stock Jim is talking about, whether he likes it or not, and when he makes the recommendation, we can start building our trading strategy. Because the data is quite unique and we will be trading single stocks, we will have to devise clever rules to make the strategy as realistic as possible - an exercise in both data handling and finance intuition.
The last part of any trading strategy is to visualize it, check its plausibility, and make adjustments if needed. Because of the nature of the strategy, we can again create some specialty functions, which is a great data visualization exercise.
With the structure laid out, let's start!
You can find the complete code and necessary data files in this GitHub repository. Make sure to download it so that the code runs properly. Don't worry if it's too much all at once - we will take you through the code piece by piece!
Let's begin by importing the necessary modules and a parameter file in .yaml format, which specifies how our scripts should behave. We will visit individual parameters throughout the code.
Key packages here are snscrape (scraping tweets), beautifulsoup4 (scraping the S&P 500 constituents), yfinance (downloading prices), pandas (data handling), NLTK, Stanza, and TextBlob (sentiment classification), and PyYAML (reading the parameter file).
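As a rough illustration, the parameter loading might look like this - a minimal sketch, not the exact script from the repository (the file name params.yaml and the parameter load_twitter appear later in this article; everything else is an assumption):

```python
# Minimal sketch of the setup step, assuming a params.yaml in the working directory
import pandas as pd
import yaml

# params.yaml holds switches such as load_twitter / load_yahoo (see below)
with open("params.yaml", "r") as f:
    params = yaml.safe_load(f)

print(params["load_twitter"])
```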
We'll be using the snscrape package to scrape Jim Cramer's tweets via his handle @jimcramer. Detailed instructions can be found in this Medium article or in the package's official GitHub repository.
In our implementation, we scrape all available tweets from Jim's handle and store each tweet's timestamp and raw text in a pandas dataframe.
Additionally, we create a simple if-else statement to decide whether to load the tweets fresh or to use an already prepared .pickle file with roughly 4 years of tweets*.
* Because of rapid changes at Twitter, the snscrape library currently returns an error when trying to scrape; thus, for the purposes of this article, use the provided database and set the parameter 'load_twitter' to 'False' in the params.yaml file.
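For reference, a sketch of the scrape-or-load logic (hedged: snscrape's Twitter module changed over time - the tweet text attribute was content in older versions and rawContent in newer ones - and, as noted above, the scrape itself currently fails; params comes from the setup sketch above, and the pickle file name is an assumption):

```python
import pandas as pd

if params["load_twitter"]:
    # Kept for reference; currently raises an error due to changes at Twitter
    import snscrape.modules.twitter as sntwitter

    rows = []
    for tweet in sntwitter.TwitterUserScraper("jimcramer").get_items():
        rows.append([tweet.date, tweet.content])  # 'rawContent' in newer snscrape
    tweets = pd.DataFrame(rows, columns=["date", "tweet"])
    tweets.to_pickle("tweets.pickle")
else:
    # Use the prepared database shipped with the repository instead
    tweets = pd.read_pickle("tweets.pickle")
```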
It is always a good idea to inspect the data you are working with to find its structure, how to process it, and also to discover possible outliers or unique observations to decide how to handle them.
What we are mostly interested in right now is finding a pattern or structure in his tweets for assigning the stocks he mentions, if any.
Let's look at several chosen examples:
Example 1:
Now we see that Jim's tweets sometimes mention a stock using the most standard convention: a dollar sign "$" followed by the ticker, as with Amazon or NVIDIA in the pictures above. That gives us the first idea of how to extract the stocks contained in the tweets: go through each tweet and find expressions beginning with a "$". The characters after it will be the stock in question.
Example 2:
What if Jim makes it harder for us and doesn't mention the stock using the "$" but simply tweets the ticker, such as in the picture below?
The stock in question is Snap Inc. (formerly Snapchat), the social media platform. As humans, we can clearly see that it is a ticker for this particular stock, but what about a computer? We do not have the luxury of having an "anchor" in the form of a dollar sign, as before, which clearly identified the ticker.
One option would be to take any all-caps word as the ticker. That has two dangers: Jim often uses all-caps words for pure emphasis (BUY! NOW!), and common abbreviations such as CEO or IPO are all-caps as well - both would be falsely treated as tickers.
The second option is to create a universe of tradable stocks with their tickers and then compare each word in the tweet with the list. If a word is found in the ticker list, then we are sure it is a company and will assign it to the tweet.
Example 3:
So far, the assignment has been easy, given that we are dealing with tickers, which are, by definition, standardized expressions. But what if Jim doesn't mention a stock by its ticker but rather by its full name (Apple) or even a shortcut ("Bed Bath" instead of "Bed Bath & Beyond")?
This is the most challenging variation; many would simply drop the tweets with such mentions. Two general approaches come to mind: fuzzy-matching the tweet text against official company names, or maintaining a curated list of full and alternative names for each ticker and searching for those directly in the tweet. As we will see below, we take the second route.
Before we start building the assignment function, let us continue building our data basis - namely, the tickers, names, and, most importantly, stock prices.
We will again use the yfinance package to obtain prices as we did in our momentum strategy article. The question is, which stocks should we consider?
We want to maximize our chances of having a stock mentioned by Jim in our universe, but at the same time, we want to be computationally efficient and not clutter our data with thousands of small-cap stocks that will probably never be mentioned. A good compromise is the S&P 500 constituents, which should cover most of the stocks Jim tweets about. Then, we can manually add certain "meme stocks" or crypto assets such as Dogecoin, AMC, or FTX.
Credit: Imgflip
Let us borrow some code to scrape the tickers and names of the S&P 500 constituents. The code accesses a Wikipedia page with a table listing all the constituents, and using a standard Beautiful Soup scraper, we find the table, access its elements, and save them. We also replace "." with "-" to adhere to Yahoo's convention: for example, Berkshire Hathaway's ticker is "BRK.B", but Yahoo lists it as "BRK-B". Lastly, while we are at it, we strip the "Inc." suffixes from the company names.
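A sketch of that borrowed scraper (assuming the Wikipedia table keeps its current id "constituents", with the ticker in the first column and the company name in the second; the variable name sp500 is ours):

```python
import pandas as pd
import requests
from bs4 import BeautifulSoup

url = "https://en.wikipedia.org/wiki/List_of_S%26P_500_companies"
soup = BeautifulSoup(requests.get(url).text, "html.parser")
table = soup.find("table", {"id": "constituents"})

tickers, names = [], []
for row in table.find_all("tr")[1:]:
    cells = row.find_all("td")
    tickers.append(cells[0].text.strip().replace(".", "-"))   # BRK.B -> BRK-B (Yahoo convention)
    names.append(cells[1].text.strip().replace(" Inc.", ""))  # strip "Inc." suffixes

sp500 = pd.DataFrame({"ticker": tickers, "name": names})
```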
As discussed before, it might be useful to add alternative names to existing tickers or even add entirely new tickers (which have to be consistent with Yahoo Finance). Let us not overcomplicate matters and create a simple Excel file with three columns: "ticker," "alt_name," and "alt_name_2." An example Excel file is provided in the GitHub repository, but you can create your own as well; just make sure the columns are named correctly.
Let's load the Excel file and perform an outer join so that the two new columns are appended to the corresponding existing tickers and that new rows with manually added tickers are added as well.
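A sketch of that join (the file name alt_names.xlsx is a placeholder for the example file; sp500 is the dataframe from the Wikipedia sketch above, and tickers_new is the name used later in this article):

```python
import pandas as pd

alt_names = pd.read_excel("alt_names.xlsx")  # columns: ticker, alt_name, alt_name_2

# Outer join: alt names attach to existing tickers; manually added tickers
# (meme stocks, crypto) come in as brand-new rows
tickers_new = sp500.merge(alt_names, on="ticker", how="outer")
```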
Now that we finally have a ticker list, we can use it to download prices from Yahoo Finance as we did before. We will specify the end date as the latest date in the scraped tweets; the start date will be obtained from our parameter file.
Again, we can either load the prices "fresh" from Yahoo and pickle them by setting the parameter "load_yahoo" to True, or set it to False if we don't want to load prices every time we play around with strategy parameters. Lastly, we will convert the index with "to_datetime" and calculate the percentage returns of the stocks.
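A sketch of that download step (params, tweets, and tickers_new come from the earlier sketches; the parameter name start_date and the pickle file name are assumptions):

```python
import pandas as pd
import yfinance as yf

if params["load_yahoo"]:
    prices = yf.download(
        tickers=list(tickers_new["ticker"]),
        start=params["start_date"],      # assumed name of the start-date parameter
        end=tweets["date"].max(),        # latest date in the scraped tweets
    )["Adj Close"]
    prices.to_pickle("prices.pickle")
else:
    prices = pd.read_pickle("prices.pickle")

prices.index = pd.to_datetime(prices.index)
rets = prices.pct_change()  # daily percentage returns
```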
Now, the truly difficult part begins - assigning stock mentions to particular tweets. We discussed the three options above, so let's jump right into it.
We will write a function that takes a single tweet and our dataframe of tickers matched with all available stock names. Once written, we will apply this function to all tweets, i.e., to each row of the scraped tweets.
Let's initiate the function and call it "asg_ticker" for "assign ticker." The first argument, "t," is a single tweet, and "tickers_names" is a dataframe containing tickers and names, which we created with the name "tickers_new" in the chunk above.
The first steps of the function are simple: we initiate an empty list, found_tickers_list, which will collect any tickers found in the tweet.
Continuing the function, we create the first case for extracting tickers: those identified with a dollar sign.
We split the tweet into individual words, and then, with list comprehension, we extract the ticker. If you have not yet worked with list comprehensions, it is an elegant, one-line for loop. We loop through the t_split list containing all words and check whether the word starts with a dollar sign. If it does, we return x[1:], that is, everything after the dollar sign, which is the ticker we want.
Then, we add it to our found_tickers_list.
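In isolation, that first case looks roughly like this - a standalone toy example (real tweets would additionally need punctuation stripped, so that "$NVDA," does not slip through):

```python
t = "Buying more $NVDA into earnings"
found_tickers_list = []

t_split = t.split(" ")
# list comprehension: keep everything after the "$" for words starting with it
dollar_tickers = [x[1:] for x in t_split if x.startswith("$")]
found_tickers_list += dollar_tickers

print(found_tickers_list)  # ['NVDA']
```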
Let's move to option 2:
The logic in option 2 is very similar; we again use a list comprehension to loop through the split tweet and check whether each word is found in the "ticker" column of the provided dataframe. Another condition is to omit the ticker "A" - a discretionary decision, because "A" is the ticker of "Agilent Technologies," a stock not mentioned very often, while "A" is also one of the most frequently used words in the English language. Without this exclusion, the code would mistakenly flag the stock as mentioned even when it is not. Again, anything found is added to the list.
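Again as a standalone toy example (the small dataframe stands in for our real ticker universe):

```python
import pandas as pd

tickers_names = pd.DataFrame({"ticker": ["AAPL", "SNAP", "A"],
                              "name": ["Apple", "Snap", "Agilent Technologies"]})

t = "SNAP is a buy down here"
# keep words that are tickers in our universe, except the article lookalike "A"
found = [x for x in t.split(" ")
         if x in tickers_names["ticker"].values and x != "A"]

print(found)  # ['SNAP']
```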
Lastly, let's dissect option 3, which is somewhat different:
Now we flip the logic: we do not loop through the tweets and check if any of the individual words are found in the tickers list, but we loop through rows of the dataframe with names and check if any of the three possible names are in the tweet. If yes, the code returns the corresponding ticker associated with the name.
So, why do we flip the logic now instead of checking whether the tweet's words are found among the names? The reason is that many stocks have names with two or more words, such as "JP Morgan" or "American Airlines." With the first approach, the tweet "American Airlines is a strong buy" would be split into the list ['American', 'Airlines', 'is', 'a', 'strong', 'buy'], and the code would futilely try to find "American" and then "Airlines" among the names. Thus, we search for the names in the whole tweet instead.
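Putting the three options together, a consolidated sketch of asg_ticker could look like this - a simplification, not the repository code (note that we validate dollar-sign hits against the universe so that price targets like "$50" are not mistaken for tickers):

```python
import pandas as pd

def asg_ticker(t, tickers_names):
    """Return the list of tickers mentioned in tweet t (a sketch, not the repo code)."""
    found_tickers_list = []
    t_split = t.split(" ")
    universe = tickers_names["ticker"].values

    # Option 1: words marked with a dollar sign, validated against the universe
    dollar = [x[1:] for x in t_split if x.startswith("$")]
    found_tickers_list += [x for x in dollar if x in universe]

    # Option 2: bare tickers, excluding the article lookalike "A"
    found_tickers_list += [x for x in t_split if x in universe and x != "A"]

    # Option 3: full or alternative company names found anywhere in the tweet
    for _, row in tickers_names.iterrows():
        names = [row.get("name"), row.get("alt_name"), row.get("alt_name_2")]
        if any(isinstance(n, str) and n in t for n in names):
            found_tickers_list.append(row["ticker"])

    return list(dict.fromkeys(found_tickers_list))  # de-duplicate, preserve order
```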
The second question is, why don't we use this flipped logic to find tickers as well? The reason is mostly, but not solely, single-letter tickers such as "V" for Visa. Look what happens if we try to find this ticker in a basic sentence containing the letter "V":
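A minimal reproduction of the problem (an illustrative sentence, not an actual tweet):

```python
t = "Very bullish on this market"
print("V" in t)  # True - "V" matches inside "Very", not a ticker mention
```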
Therefore, this logic would not work for tickers, as the "in" operator searches for substrings anywhere in the string rather than matching whole words. Let us now try a model sentence exercising all three options, plus some false tickers, to test the function.
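The two lines in question might look like this (the model sentence is illustrative; tickers_new is our ticker/name dataframe from before, and the expected output assumes the sketch of asg_ticker above):

```python
t = "JPM and $AMZN look strong, Coca-Cola too. A move above $50 is coming"
print(asg_ticker(t, tickers_new))  # expect AMZN, JPM and KO; "A" and "$50" ignored
```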
We would expect the function to return the tickers "JPM" for JP Morgan, "AMZN," and "KO" (Coca-Cola), while ignoring the article "A" and the price target "$50". Let's execute the two lines above:
Great! Our assigning function is set up, so let's run it on all the tweets.
To keep proper track, we copy our existing dataframe and create a new column holding the tickers found in each tweet.
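Assuming the tweet text lives in a column called "tweet" (the column names here are assumptions), the application is a one-liner:

```python
tweets_asg = tweets.copy()
# each cell of the new column holds a (possibly empty) list of tickers
tweets_asg["tickers"] = tweets_asg["tweet"].apply(lambda t: asg_ticker(t, tickers_new))
```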
In the following part, we will try to determine whether to go long (buy) or short (sell) a stock mentioned in the tweet, or neither. As described in the project areas, we will utilize sentiment analysis for this task.
Sentiment analysis, part of the exciting and diverse field of natural language processing (NLP), is way beyond the scope of this article. Coding our own engine would be a difficult and complex task, which is why we will rely on open-source libraries that are easily deployable. The downside is twofold: firstly, without deeper digging, they are something of a black box, and we do not know what's happening under the hood; secondly, they are not necessarily trained on financial text, unlike, for example, FinBERT, an advanced neural network based on Google's BERT and fine-tuned for classifying finance texts.
Anyway, we will deploy a simple approach using three open-source libraries (NLTK, Stanza, TextBlob) to classify each tweet. Depending on our settings in the param file, we can choose any combination of the three by turning them on or off. Additionally, we can weight TextBlob's classifications by their subjectivity. After the libraries are done classifying a tweet, again depending on the param file, we either take the average of the classifications or the largest absolute value (with its sign) as the final sentiment. That way, our function produces a sentiment score between -1 (negative) and 1 (positive).
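A sketch of such a classifier (the parameter names mirror the on/off switches described above but are assumptions; nltk.download("vader_lexicon") and stanza.download("en") must be run once beforehand):

```python
import numpy as np
import stanza
from nltk.sentiment import SentimentIntensityAnalyzer  # needs the vader_lexicon download
from textblob import TextBlob

sia = SentimentIntensityAnalyzer()
nlp = stanza.Pipeline(lang="en", processors="tokenize,sentiment", verbose=False)

def classify_sentiment(text, use_nltk=True, use_stanza=True, use_textblob=True,
                       weight_subjectivity=False, aggregate="mean"):
    """Return a sentiment score in [-1, 1] - a sketch of the approach described above."""
    scores = []
    if use_nltk:
        # VADER's compound score already lives in [-1, 1]
        scores.append(sia.polarity_scores(text)["compound"])
    if use_stanza:
        # stanza labels each sentence 0/1/2 (negative/neutral/positive); shift to [-1, 1]
        doc = nlp(text)
        scores.append(float(np.mean([s.sentiment - 1 for s in doc.sentences])))
    if use_textblob:
        blob = TextBlob(text)
        polarity = blob.sentiment.polarity
        if weight_subjectivity:
            polarity *= blob.sentiment.subjectivity
        scores.append(polarity)

    if aggregate == "mean":
        return float(np.mean(scores))
    # otherwise: the largest absolute classification, sign preserved
    return float(scores[int(np.argmax(np.abs(scores)))])
```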
Now, we are ready to classify the sentiment of the tweets! As before, we create a copy of the dataframe and call it "tweets_sent." Then, we keep only the records with at least one assigned stock - it would be useless to classify tweets we will not act upon. Lastly, we run our new function on each tweet using .apply.
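In code (column names again follow the assumptions made in the earlier sketches):

```python
tweets_sent = tweets_asg.copy()
# keep only tweets with at least one assigned ticker
tweets_sent = tweets_sent[tweets_sent["tickers"].str.len() > 0]
tweets_sent["sentiment"] = tweets_sent["tweet"].apply(classify_sentiment)
```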
We have almost everything we need for trading - stock returns, tweets, sentiment and assignments. Now, we only need to devise some trading rules on what weights to assign to the stocks we found in the tweets. Before we do that, let's prepare our dataframes to avoid any potential errors:
In the chunk above, we prepare the data structures needed for the weighting loop that follows.
Now that we have our data structures ready, we start a loop that assigns stock weights according to the rules we set up. We loop through the index of our tweets dataframe and, for each tweet, set the weights of the stocks it mentions.
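A heavily simplified sketch of such a loop (the 0.1 sentiment threshold, the 21-day holding period, and the column names are all assumptions - the repository uses its own rules; rets and tweets_sent come from the earlier sketches):

```python
import numpy as np
import pandas as pd

# one row per trading day, one column per ticker, initialized to zero
weights = pd.DataFrame(0.0, index=rets.index, columns=rets.columns)

for i in tweets_sent.index:
    sent = tweets_sent.loc[i, "sentiment"]
    if abs(sent) < 0.1:                   # skip near-neutral tweets
        continue
    # first trading day on or after the tweet
    start = rets.index.searchsorted(tweets_sent.loc[i, "date"])
    for tick in tweets_sent.loc[i, "tickers"]:
        if tick in weights.columns:
            # hold for 21 trading days, INVERTING Jim's view:
            # short what he likes, buy what he pans
            col = weights.columns.get_loc(tick)
            weights.iloc[start:start + 21, col] = -np.sign(sent)
```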
Almost there! Now that we have the individual weights, we need to scale them to adhere to our budget constraints. Because we keep adding stocks over time but have a limited amount of cash to allocate, we have to rebalance the portfolio to reach the desired weights with the available cash.
We will write a simple function that takes into account that some positions are short and some are long. Moreover, we will use a short cap of -1, meaning we can short at most 100% of our own capital, and a long cap of 2, meaning we deploy our own capital plus the extra proceeds from shorting to buy additional stocks.
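A sketch of such a scaling function, applied day by day (short_cap and long_cap as described above; the function name is ours):

```python
def scale_weights(w_row, short_cap=-1.0, long_cap=2.0):
    """Shrink one day's weights so shorts sum to >= short_cap and longs to <= long_cap."""
    scaled = w_row.copy()
    shorts, longs = w_row[w_row < 0], w_row[w_row > 0]
    if shorts.sum() < short_cap:   # e.g. -3 < -1: shrink shorts proportionally to sum to -1
        scaled[w_row < 0] = shorts * short_cap / shorts.sum()
    if longs.sum() > long_cap:     # e.g. 3 > 2: shrink longs proportionally to sum to 2
        scaled[w_row > 0] = longs * long_cap / longs.sum()
    return scaled

weights_scaled = weights.apply(scale_weights, axis=1)
```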
Now we only have to multiply our returns dataframe by the weights dataframe to get the returns of the individual positions, sum them up to get portfolio returns, and use our portfolio analytics class to carry out some analyses!
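In code (shifting the weights by one day is a standard precaution against look-ahead bias and is our assumption here; the portfolio analytics class itself lives in the repository):

```python
# yesterday's positions earn today's returns
position_rets = rets * weights_scaled.shift(1)
strategy_rets = position_rets.sum(axis=1)
```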
Does the strategy with our current setup work? Let's take a look!
Firstly, we can see that the strategy achieves a much higher total return, but with far steeper drawdowns. The reason is twofold: the strategy holds concentrated single-stock positions, and it runs with leverage (up to 200% long against up to 100% short), both of which amplify the swings.
Let's look at the strategy through numbers:
We can confirm that the strategy indeed outperforms the benchmark in terms of average return, but also in terms of Sharpe ratio, controlling for risk/volatility. The main question remains: does the strategy generate alpha, i.e., does (inverse) Jim Cramer have superior picking and timing ability?
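One standard way to answer that is a CAPM-style regression of strategy returns on benchmark returns; a sketch with statsmodels (benchmark_rets, e.g. daily S&P 500 returns, is assumed to be available):

```python
import statsmodels.api as sm

# alpha = intercept of the fitted line, beta = its slope
X = sm.add_constant(benchmark_rets)
capm = sm.OLS(strategy_rets, X, missing="drop").fit()
print(capm.summary())
```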
According to the summary, even if statistically significant, the alpha is minuscule (0.05%) and thus not economically significant. We can also see that beta is very close to 1, meaning most of the returns are driven by the market itself. That is probably due to Jim mentioning big names (Apple, Amazon) most of the time, which are primary drivers of the S&P500 anyway.
The visualizations are very basic and capture only part of the picture. To go the extra mile, try experimenting with more advanced visualizations of your own.
We played around with different areas of coding - scraping, sentiment analysis, data cleaning, and portfolio management - to devise a fun trading strategy, putting the meme to the test. With the setup above, we see that it can actually work. However, it is not much better than simply holding the market.
Nonetheless, there is still much room for improvement in terms of strategy setup, better stock assignment and better sentiment classification (i.e., direction detection) – therefore, go ahead and play around! Happy coding!