Whether or not you’re a fan of the show, given a moderate amount of social media exposure over the past week, you probably noticed that Game of Thrones viewers’ emotions were running high as we headed into the show’s final episode.
I was curious how fans would react to whatever ended up happening Sunday night, so I prepared a streaming sentiment analysis pipeline to keep a watch on Twitter to find out.
Spoiler Warning: While I’ll avoid explicit spoilers, some trends in the data (or lack thereof) will be telling on their own, so proceed at your own risk. If you watched the episode, you’ll probably draw some interesting conclusions of your own based on what you see.
What is Sentiment Analysis?
Sentiment Analysis is a sub-field of Natural Language Processing (NLP) that translates text into numerical scores based on how positive or negative the system perceives that text to be.
While a naive system might just look for the presence of positive words like “love” or negative words like “terrible,” a more advanced sentiment analysis model can understand concepts like negation, so “These people are not friendly” would give the whole phrase “not friendly” a negative score despite the appearance of the word “friendly.”
I used a model called VADER Sentiment Analysis, which I like for its ability to handle negations, contractions, and even some emoji. Another huge plus is its availability and ease of use as a Python package.
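To make the negation point concrete, here is a minimal toy sketch. It is not VADER and not the code used in this project: the tiny lexicon, the negation list, and the function names are all made up for illustration, and real models handle far more cases than "flip the sign after a negation word."

```python
# Toy lexicon and negation list; both are invented for illustration only.
LEXICON = {"love": 1.0, "friendly": 0.5, "terrible": -1.0, "hate": -1.0}
NEGATIONS = {"not", "never", "no"}


def tokenize(text):
    """Lowercase the text and strip basic punctuation from each word."""
    return [word.strip(".,!?\"'").lower() for word in text.split()]


def naive_score(text):
    """Sum the lexicon scores of every word, ignoring context."""
    return sum(LEXICON.get(token, 0.0) for token in tokenize(text))


def negation_aware_score(text):
    """Flip the sign of a sentiment word when the previous word negates it."""
    tokens = tokenize(text)
    total = 0.0
    for i, token in enumerate(tokens):
        score = LEXICON.get(token, 0.0)
        if i > 0 and tokens[i - 1] in NEGATIONS:
            score = -score
        total += score
    return total


phrase = "These people are not friendly"
print(naive_score(phrase))           # 0.5
print(negation_aware_score(phrase))  # -0.5
```

The naive scorer is fooled by the word “friendly,” while the negation-aware one reads “not friendly” as negative.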
How did the pipeline work?
I set up a Kubernetes cluster with one pod to capture the stream from Twitter (using a Tweepy client) and publish those tweets to Google Cloud Pub/Sub. Pub/Sub is a messaging platform that makes it easy to pass data and other messages between systems, especially other systems on Google Cloud Platform. Another pod in the Kubernetes cluster was tasked with archiving the raw tweets to BigQuery in case we wanted to follow up with later analysis.
I also set up a streaming Apache Beam pipeline to run continuously on Google’s Dataflow runner and do the following:
1. Receive the tweets via Pub/Sub
2. Window the tweets into 60-second chunks
3. Check for mentions of the specific characters or other entities I was interested in tracking
4. Conduct sentiment analysis on each tweet and make sure its compound sentiment score wasn’t exactly 0, which I took as an indication that VADER couldn’t meaningfully process that tweet’s text. Perhaps it had no text and was only an image of a meme, which this system couldn’t analyze.
5. Calculate the average sentiment score for each character mentioned within each 60-second time window
6. Write that data to Google’s BigQuery for storage
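Steps 2 through 5 can be sketched in plain Python. This is a simplified stand-in, not the actual Beam pipeline: the character list, the function names, and the sample tweets are hypothetical, and the compound scores are passed in precomputed rather than coming from VADER.

```python
from collections import defaultdict

WINDOW_SECONDS = 60
# Hypothetical tracking list; the real pipeline's list is not shown in the post.
CHARACTERS = ["tyrion", "cersei", "bran"]


def window_start(timestamp):
    """Step 2: map a timestamp in seconds to the start of its 60-second window."""
    return int(timestamp // WINDOW_SECONDS) * WINDOW_SECONDS


def mentioned(text):
    """Step 3: return the tracked characters named in the tweet text."""
    lowered = text.lower()
    return [name for name in CHARACTERS if name in lowered]


def average_by_window(tweets):
    """Steps 4-5: drop zero scores, then average per (window, character)."""
    scores = defaultdict(list)
    for timestamp, text, compound in tweets:
        if compound == 0:
            continue  # treated as "could not be meaningfully scored"
        for name in mentioned(text):
            scores[(window_start(timestamp), name)].append(compound)
    return {key: sum(vals) / len(vals) for key, vals in scores.items()}


# (timestamp in seconds, text, compound score); all values are invented.
tweets = [
    (0, "Tyrion was great", 0.6),
    (30, "I love Tyrion", 0.4),
    (45, "Bran", 0.0),  # filtered out: compound score of exactly 0
    (70, "Cersei and Tyrion", -0.2),
]
result = average_by_window(tweets)
print(result)
```

Each key in the result is a (window start, character) pair, which is roughly the shape of one row written to storage in step 6.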
First, let’s use the reaction to the show overall to understand the plots. This first plot represents tweets that used the #GoT or #GameofThrones hashtags (not case sensitive).
The X-axis represents time. Hours are shown in Eastern Time, and the vertical dotted lines mark the start and end of the episode as it aired. Streaming was available slightly before the broadcast, and we can assume a lot of people finished slightly later as well if they started the stream a little late and/or paused while watching.
The Y-axis represents sentiment, using VADER’s compound score, which can range from -1 for the most negative to +1 for the most positive.
Each colored dot on the graph represents the average sentiment expressed in tweets made within a given 60-second period.
Overall, it looks like people were getting increasingly negative before the start of the show. Perhaps they were worried, based on the rest of this season, how this episode would go.
As the episode began, we see things start to turn around, and sentiment trended positive overall throughout the runtime. After the episode aired, sentiment dipped slightly again until about midnight Eastern Time, when the late-night crowd’s reactions seemed to trend slightly upward again.
Some Notes on Interpretation
The “fuzziness” that we see comes from the fact that we’re really sampling from a distribution at each moment in time. When the line appears more defined, this doesn’t show that people were more unified in their view towards a character. Instead, this tells us that we are more certain of where the “mean” sentiment lay during that stretch of time because we have more data points to draw from.
For example, views about Varys aren’t wildly more polarized than views about other characters. We just saw so few tweets per minute about him that a few opinionated tweets were able to pull our average dramatically in one direction or the other rather than balancing each other out into a steadier estimate.
By contrast, we’ll see more detail in the other characters’ plots that we look at below.
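The sample-size effect behind the “fuzziness” can be illustrated with a small simulation. This is only a sketch under a loud assumption: scores are drawn uniformly between -1 and +1, which is not how real tweet sentiment is distributed. Both simulated characters draw from the same distribution, so any difference in spread comes purely from how many tweets land in each minute.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable


def minute_averages(tweets_per_minute, minutes=200):
    """Average `tweets_per_minute` random scores for each simulated minute."""
    averages = []
    for _ in range(minutes):
        scores = [rng.uniform(-1, 1) for _ in range(tweets_per_minute)]
        averages.append(statistics.fmean(scores))
    return averages


# Spread (standard deviation) of the per-minute averages for each case.
sparse = statistics.stdev(minute_averages(3))   # a rarely mentioned character
busy = statistics.stdev(minute_averages(300))   # a heavily mentioned character

print(f"spread of minute averages, 3 tweets/min:   {sparse:.3f}")
print(f"spread of minute averages, 300 tweets/min: {busy:.3f}")
```

The sparse case scatters far more widely around the same underlying mean, which is the scattered look in the low-volume plots.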
Another point to mention is that sentiment directed at a given character (“I hate Cersei.”) and on behalf of a character (“I hate what the writers have done to Cersei”) would be confused by this system. Those example sentences both get scores of -0.57.
Interesting Character-Specific Graphs
Let’s start with our friends the Lannisters.
In Tyrion’s plot, we see a slight upward trend during the episode, and that positive sentiment holds fairly steady afterward. Note that this is one of the few characters holding steady in the positive sentiment space (with a rating above 0).
As in the overall show’s plot, Cersei’s shows negativity leading up to the start of the show, then a slow improvement over the rest of the night, stabilizing at a neutral 0. Perhaps Cersei doesn’t seem as bad as she once did.
It looks like I wasn’t alone in feeling sad for Jaime as his fate was confirmed. From there, sentiment holds at around -0.5, which gives him one of the lowest consistent scores.
Now let’s look at the most prominent Targaryen and her remaining ally.
I was surprised to see not much movement for Daenerys. It’s possible she’s so polarizing as to keep an average close to the center.
Drogon’s plot, unlike Dany’s, does move: it looks like people did not like either what Drogon did or how the character of Drogon was used. It is interesting how we can see little “waves” in some of these plots, and I wonder if there was some back and forth discussion represented here.
Now we bring our featured characters’ analysis home with the Starks.
Like Dany’s plot, Jon’s is pretty stable. I also wonder if this was due to the sheer number of polarized opinions which may have pulled his average back towards the center.
Sansa’s plot is the reverse of the general trend, in that it looks like positive sentiment for her was building before the episode, and then we saw a decline during the episode before it stabilized afterward.
Arya’s graph is an interesting one. We see a positive spike right at the start of the episode, and then a slow fall. I know there were a lot of expectations around what Arya would do in this episode, and I think we see the result of that in that rising energy at the start before the air is let out of that idea over the rest of the episode. Then we see feelings recover soon after the episode, which may be general reflections on her character throughout the series.
I’ve saved the most interesting for last. In Bran’s graph, we see a journey from a barely-distinguishable signal (all those dispersed results before the episode started) to a focused upward trend that held steady for hours. It looks like the crowd responded well to his big news.
Other Interesting Artifacts
Here again we see that our signal for Brienne was pretty low until the start of the episode, but then our lady knight sees a slight increase over the course of the episode.
Gilly’s plot is interesting because while we don’t have enough data for a clear signal, we see this horizontal banding effect occur.
What seems to be going on here is that we’re getting the exact same average sentiment score minute after minute. The likely cause is a single tweet being retweeted over and over. Tweets about Gilly were rare enough that this one tweet was often the only one about her in a given interval (even if it was retweeted several times within that interval), so the average came out identical across multiple time periods.
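A tiny sketch shows why retweets would produce this banding. The score below is a hypothetical value, not one taken from the collected data.

```python
import statistics

# Hypothetical compound score of one tweet about a rarely mentioned character.
tweet_score = 0.4404

# Each inner list is one 60-second window holding only retweets of that tweet.
windows = [
    [tweet_score],
    [tweet_score, tweet_score, tweet_score],
    [tweet_score, tweet_score],
]

# statistics.mean computes with exact fractions, so identical inputs
# produce an identical average no matter how many copies there are.
averages = [statistics.mean(window) for window in windows]
print(averages)
print(len(set(averages)))  # 1
```

However many times the tweet is retweeted in a window, the average is the same number, so the dots line up in a horizontal band.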
This was a fun project to put together over the course of a couple of days.
In hindsight, of course, I wish I’d built it before the season started, or at minimum a week earlier, so I could have plotted the reactions to that major turn in the penultimate episode.
But so it is with data collection! As the old proverb says, “The best time to plant a tree was 20 years ago. The second best time is now.”
So instead I’ll have to content myself with looking forward to deploying this system for other upcoming pop culture & political events.
And I think it’s worth mentioning that while this year we’re witnessing the epic conclusions of several major franchises that have lasted close to a decade or more (Game of Thrones, Marvel’s Avengers, and Star Wars later this year), I’m really looking forward to encountering the new stories that emerge in their aftermath, and for us all to collectively get excited about those too.