Hello everybody!
Today I am writing about a short but pretty interesting topic: analyzing which devices Twitter users tweet from and showing the result in a pie chart.
First of all, as always, we have to go through the authentication process I described here.
But then the fun can begin.
Getting the data
We start by searching Twitter for tweets containing a certain keyword. For my example I used “Social Media”.
tweets = searchTwitter("Social Media", n=20, cainfo="cacert.pem")
Analyzing the data
We now have our tweets and just have to get the information about the source out of them. The source comes back as an HTML link wrapped around the client name, so extracting the device is simple string handling.
devices <- sapply(tweets, function(x) x$getStatusSource())
devices <- gsub("</a>", "", devices)
devices <- strsplit(devices, ">")
devices <- sapply(devices, function(x) ifelse(length(x) > 1, x[2], x[1]))
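To see what these steps do without querying Twitter, here is a small sketch on made-up source strings. The sample values are an assumption about how the source field typically looks (an HTML link around the client name, or a plain string), not output from a real search.

```r
# Hypothetical sample of raw source strings (assumed format, not real data).
raw_sources <- c(
  '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
  '<a href="http://twitter.com/download/android" rel="nofollow">Twitter for Android</a>',
  'web'
)

# Remove the closing tag, then split at ">" so the client name is the 2nd piece.
devices <- gsub("</a>", "", raw_sources, fixed = TRUE)
devices <- strsplit(devices, ">", fixed = TRUE)

# Plain strings such as "web" contain no link, so fall back to the first piece.
devices <- sapply(devices, function(x) if (length(x) > 1) x[2] else x[1])

print(devices)
```

The fallback to the first piece is what keeps entries without a link from turning into NA.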
OK, now that we have our devices, we can put them in a nice-looking pie chart.
pie(table(devices))
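If you want the slices sorted and labelled with their share, a slightly longer version could look like the sketch below. The device vector is hypothetical stand-in data, and the title and label format are just one possible choice.

```r
# Hypothetical device vector standing in for the parsed search result.
devices <- c("Twitter for iPhone", "Twitter for Android", "Twitter for iPhone",
             "web", "Twitter for iPhone", "Twitter for Android")

# Count tweets per device and sort so the largest slice comes first.
counts <- sort(table(devices), decreasing = TRUE)

# Build labels that combine the device name with its rounded percentage.
shares <- round(100 * counts / sum(counts))
labels <- paste0(names(counts), " (", shares, "%)")

pie(counts, labels = labels, main = "Devices used")
```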
Sometimes you can see interesting trends in the use of devices or, with a lot of luck, even find a new and unknown device.
Have fun!
sapply regsprecher.tweets? or sapply tweets?
Thank you for the hint. I corrected it.
Nice Blog!! Some of the visualizations are really excellent! How do you manage to write “parsed” R Code in WP??
Thank you! WordPress blogs have a built-in feature for parsing R code.
http://en.support.wordpress.com/code/posting-source-code/
Just replace CSS with R.
Hi Julianhi. Great post. How did you manage to plot the pie chart eventually?
worked out how to get the pie chart. thanks
How do I analyze devices in a specific country?
Hey Paul!
You can analyze tweets from a specific country with the help of the searchTwitter function, since it accepts a geocode parameter.
searchTwitter(searchString, n=25, lang=NULL, since=NULL, until=NULL,
locale=NULL, geocode=NULL, sinceID=NULL,
retryOnRateLimit=120, …)
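So just add the geocode of the area you want tweets from to your searchTwitter call, in the form "latitude,longitude,radius" with the radius given in mi or km, and the function returns only tweets connected with this geocode. Keep in mind that a geocode describes a circle, so it only approximates a country.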
Then you do the other steps as normal.
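As a rough sketch, the geocode string could be built like this. The helper make_geocode is hypothetical (it is not part of twitteR), and the coordinates and radius are example values roughly covering Germany, chosen only for illustration.

```r
# Hypothetical helper (not part of twitteR): builds "latitude,longitude,radius".
make_geocode <- function(lat, lon, radius, unit = c("km", "mi")) {
  unit <- match.arg(unit)
  paste0(lat, ",", lon, ",", radius, unit)
}

# Example values (assumed): a point in central Germany with a 300 km radius.
geo <- make_geocode(51.16, 10.45, 300, "km")
print(geo)

# With an authenticated session the search would then look like this:
# tweets <- searchTwitter("Social Media", n = 20, geocode = geo)
```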
If you have further questions, feel free to ask me.
Hi Julianhi. Great post.
When I analyze around 1500 tweets, there is a lot of unclean data (“a href relnofollow”).
I used the code below to clean the data:
> devices devices <- gsub("a href relnofollow", devices)
Please help me on this
Thanks in Advance
Hey Abhishek,
the Twitter API returns JSON objects. These JSON objects contain different data fields, like the text of the tweet or the source. So you don't just receive the text of the tweets.
Take a look at the rjson package, which helps you handle JSON objects.
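If the goal is only to get rid of the link markup in the source field, a plain regular expression may already be enough. This is a minimal sketch on made-up strings, assuming the raw source still contains its angle brackets. It is a different technique from parsing the JSON.

```r
# Hypothetical raw source values (assumed format, not real API output).
raw_sources <- c(
  '<a href="http://twitter.com/download/iphone" rel="nofollow">Twitter for iPhone</a>',
  'web'
)

# Drop anything that looks like an HTML tag, i.e. "<" up to the next ">".
clean <- gsub("<[^>]+>", "", raw_sources)
print(clean)
```

Run this before removing punctuation from the strings. Once the angle brackets are gone, the tags can no longer be recognized, and leftovers like "a href relnofollow" remain.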
If you have further questions feel free to ask.
Regards
Julian