Overlap, Centrality and the Case of Two Hash-Tags

Yesterday Reddit banned five subreddits, including the infamous /r/fatpeoplehate, which promptly kicked off a wave of recriminations, drama and general weirdness. Towards the end of the day I saw a new hash-tag emerging, #RedditRevolt. Clicking around I noticed a lot of faces and handles which I had seen in my brief time engaging with #Gamergate in September (full disclosure, I thought it was silly), which lead me to wonder, what exactly is the overlap between this new hash-tag and the rolling fight that is Gamergate?

Not wanting to dust off my own twitter scraper I leaned on the excellent NodeXL tool and grabbed a mention/re-tweet network from anyone who tweeted the phrase "redditrevolt." Given API limitations and the fact that the #RedditRevolt discussion was just spinning up I ended up with 971 unique users connected by 2959 relationships.

#RedditRevolt social network, green indicates people with some form of #Gamergate identifier (see methodology below)

Given that I was curious about the overlap between Gamergate and redditrevolt the next step was to identify people active on the Gamergate hash-tag who were also tweeting about redditrevolt. Since Gamergate is rather amorphous, taking pride in not having leaders and low barriers to entry, I settled on a sliding scale test based around a number of different possible signals of membership.

The Gamergate block list is a common but not perfect metric for membership based on whether a user follows a given number of high profile personalities active within the hash-tag. Downloading a csv of the block list I joined on the data scraped from twitter to pick out users who appear in both data-sets.

Results show that 108 of the 863 users in the data-set are on the block list, or 12.5% of #RedditRevolt participants. However I was concerned that the block list csv which I found might be a little out of date, so a few more tests were needed. NodeXL provides the tweeted text and user description for all the users in the data-set, so I fired up a bit of regexp to see what proportion of users had #gg or #Gamergate either in their tweet or their twitter description.

Aggregating all of these measures together leaves us with a broad metric for assessing the overlap between #Gamergate and #RedditRevolt, if a user has a positive value on any of the three aforementioned metrics than they are considered as "Yes" on the following graph.

table(gra.base$broadgg)
## 
## FALSE  TRUE 
##   555   416

With this broad metric 416 users have some form of #Gamergate affiliation or identification in their profile, tweets or are on the block list. Obviously this is a crude metric in so far as it doesn't capture things like sarcasm, irony or alternative uses of the hash-tag. That being said the fact that almost 43% of the #RedditRevolt hash-tag has some sort of #Gamergate overlap is interesting.

Of course all forms of participation in a hash-tag are not created equal. Social network analysis has repeatedly demonstrated that people have different relative power and status in a network based on a number of factors. So let's look at who is more central in #RedditRevolt given #Gamergate status.

First up we have in degree and out degree. The former is a measure of how many times people on the hash-tag mentioned or re-tweeted a specific user, while the later reflects how many times the person in question tweeted at or re-tweeted someone else.

Most people are clustered with fairly low in degrees and out degrees, which is to be expected given the power law nature of most social networks. The majority of the clear outlines are positive on the broad based Gamergate metric, suggesting that some of the movers and shakers on #RedditRevolt are also active in #Gamergate.

This conclusion is born out by looking at the eigenvector centrality, which is a metric for assessing the relative influence of a node in a given network. People with higher eigenvector centrality scores tend to be both important in the network and connected to other important people, while lower scores indicate a more peripheral role. The distribution of these scores from the #RedditRevolt network based on the broad Gamergate metric supports the conclusions from the degree chart.

To sum up, there seems to be an interesting overlap between the #RedditRevolt and #Gamergate hash-tags. While portions of #RedditRevolt do have #GamerGate identifies, those who do tend to be more central in the overall network. This is interesting from the perspective of online mobilization and social movements. Often protest groups have a central cohort of dedicated individuals who help shape the discourse that others use to express themselves. The overlap between these two hash-tags may suggest that a group like this is forming around a broad axis while inspiring various forms of expression which represent variations on a theme. Seeing if this central cohort exists and how it adapts and evolves over time may be an interesting case study for communication scholars or sociologists interested in online movements.

Further reading
* The Logic of Connective Action
* Power Laws, Weblogs and Inequity (an oldie but still pretty good)
* Social Network Analyis

Data
Graph File (user names and descriptions purged to comply with Twitter TOS)
CSV

Citations
Network analysis done with igraph and Gephi, charts with ggplot2

Welcome!

This is my new Jekyll powered home for research, code, packages, utility functions and general blogging.