Today, I want to share with you a project I worked on recently: detecting opinion leaders on Twitter.
Let me first introduce the concept of “opinion leaders”. An opinion leader is an influencer in his own social network who increases the traffic and usage of whatever he shares on the web. For instance, on Twitter, when an opinion leader starts a hashtag or uses a hashtag already started by another user, that hashtag’s occurrence increases.
The concept of “opinion leaders” emerged in the 1940s and was further developed through the studies of Katz and Lazarsfeld. These authors proposed the “two-step flow” model of communication. According to this model, mass media information is channeled to the “masses” through opinion leadership.
Today, opinion leaders in social media play a significant role in the world economy, especially in consumer economics, because of the following cycle:
– Consumers are connected through social media, which consists of word-of-mouth (WOM) communication links.
– The opinion leader shares his up-to-date consumption experience of a certain product with his network neighbors (his “followers” in the case of Twitter).
– “Bayesian belief networks”, graphical tools that aid decision-making under uncertainty, depend on Bayes’ theorem, which states the following:
p(X|Y) = ( p(X) * p(Y|X) ) / p(Y)
In other words, the equation above gives the probability of X happening, given that Y has happened.
Suppose we have a hypothesis H, evidence E (for our hypothesis) and context C (for the evidence). Then:
p(H|EC) = ( p(H|C) * p(E|HC) ) / p(E|C)
Expanding p(E|C) over the two cases, H and ~H (the hypothesis being false), gives:
p(H|EC) = ( p(H|C) * p(E|HC) ) / ( p(E|HC) * p(H|C) + p(E|~HC) * p(~H|C) )
In this way, the dependency of the hypothesis on the evidence in the given context is measured. Applied to social media, the Bayesian approach suggests that there are key actors (opinion leaders) who have an influence on decision-making. As a result, there is a possibility that the followers’ purchase decisions will change according to the consumption experience of the opinion leader.
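To make the formula above concrete, here is a minimal Python sketch of the posterior p(H|EC). The function name and the numbers are hypothetical and chosen only for illustration: H could stand for "the follower buys the product" and E for "the opinion leader shared a positive experience".

```python
def posterior(prior_h, p_e_given_h, p_e_given_not_h):
    """Return p(H|EC) from p(H|C), p(E|HC) and p(E|~HC)."""
    # Denominator: p(E|C) = p(E|HC) * p(H|C) + p(E|~HC) * p(~H|C)
    evidence = p_e_given_h * prior_h + p_e_given_not_h * (1.0 - prior_h)
    if evidence == 0:
        raise ValueError("the evidence has zero probability in this context")
    return prior_h * p_e_given_h / evidence


# Hypothetical inputs, not measured values:
# p(H|C) = 0.10, p(E|HC) = 0.60, p(E|~HC) = 0.05
print(round(posterior(0.10, 0.60, 0.05), 3))  # prints 0.571
```

With these made-up numbers, observing the evidence raises the probability of the hypothesis from 0.10 to about 0.57.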
Bearing in mind the significance of social media and my special interest in it, I started my project with the following hypothesis: “A keyword (hashtag) on Twitter experiences an increase in traffic when used by an opinion leader.”
Then I followed the methodology below to test my hypothesis:
1. Collect data
2. Generate a traffic graph for each data set
3. Using the traffic graph, detect a peak before the moment the relevant hashtag became a trending topic, and determine a range just before that peak
4. For each data set, compare the Twitter users in these ranges to find the common users.
Here is the pseudocode for the first step:
I came up with two different methods for detecting a peak:
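As an illustration only, and not necessarily either of the two methods mentioned above, here is a minimal Python sketch of steps 2 and 3. It assumes each data set is a list of tweet timestamps and that the moment the hashtag became a trending topic is known. The function names, the hourly bucketing, the window length and the toy timestamps are all hypothetical choices.

```python
from collections import Counter
from datetime import datetime, timedelta


def traffic_per_hour(timestamps):
    """Count tweets per hour; returns a list of (hour, count) sorted by hour."""
    counts = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    return sorted(counts.items())


def peak_before(traffic, trend_time):
    """Return the busiest (hour, count) bucket that starts before trend_time."""
    candidates = [(hour, n) for hour, n in traffic if hour < trend_time]
    if not candidates:
        return None
    # On ties, max() keeps the earliest bucket because traffic is sorted.
    return max(candidates, key=lambda item: item[1])


def range_before_peak(peak_hour, hours=3):
    """Return the (start, end) window covering `hours` hours before the peak."""
    return peak_hour - timedelta(hours=hours), peak_hour


# Toy data: six tweets, given as minutes after 09:00.
base = datetime(2012, 1, 1, 9, 0)
tweets = [base + timedelta(minutes=m) for m in (5, 70, 75, 80, 130, 200)]

traffic = traffic_per_hour(tweets)
peak = peak_before(traffic, trend_time=base + timedelta(hours=3))
window = range_before_peak(peak[0], hours=1)
print(peak)    # the 10:00 bucket, with 3 tweets
print(window)  # from 09:00 to 10:00
```

The users who tweeted the hashtag inside the returned window are the ones passed on to the comparison step.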
The next step was to compare the data sets. I identified the users who had used 2 or more of the 55 hashtags in the data sets.
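This comparison can be sketched in a few lines of Python. The example assumes the users found in each hashtag's pre-peak range are already available as a dictionary; the function name common_users, the toy hashtags and the user names are hypothetical.

```python
from collections import Counter


def common_users(users_by_hashtag, min_hashtags=2):
    """Return {user: hashtag count} for users seen in at least min_hashtags ranges."""
    counts = Counter()
    for users in users_by_hashtag.values():
        # Count each user at most once per hashtag, however often they tweeted.
        counts.update(set(users))
    return {user: n for user, n in counts.items() if n >= min_hashtags}


# Toy input: users who tweeted each hashtag in its pre-peak range.
ranges = {
    "#a": ["ali", "ayse", "ali"],
    "#b": ["ayse", "can"],
    "#c": ["ayse", "ali"],
}
leaders = common_users(ranges)
print(sorted(leaders.items()))  # prints [('ali', 2), ('ayse', 3)]
```

Raising min_hashtags makes the filter stricter, so fewer users qualify as candidate opinion leaders.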
So, the results above proved my hypothesis. The same analysis can be applied to any area of interest, including but not limited to politics in general, women’s rights, sports in other countries, the Arab Spring, TV series, shopping, etc. Detecting opinion leaders can also help companies with their marketing activities. For example, if work similar to what we have done so far were repeated on the consumption of cosmetics in Turkey, the companies concerned could use the data to reach out to the relevant opinion leaders, spread their campaigns and get significant customer feedback. Similarly, if you are planning a marketing activity for your new e-commerce website aimed at a niche segment, this should definitely be your starting point. You’ll save money! You’ll save time and energy!