## Step 2 – Sentiment Analysis using Sentiment Library

It has been a long time since I wrote a post on sentiment analysis without using the sentiment package. In this post, I will use the sentiment package developed by Timothy Jurka. You can download this package from here. Before installing the sentiment package, you need to install tm and Rstem from CRAN. The sentiment package has two functions that serve our purpose: `classify_emotion` and `classify_polarity`.

#### Let's import the necessary packages for sentiment analysis.

```
library(twitteR)
library(sentiment)
library(stringr)
library(ggplot2)
library(wordcloud)
library(RColorBrewer)
```

Let's run the analysis on ObamaInIndia, as I did in my previous sentiment analysis post. I am reusing that code to pull the tweets and clean the data, so `tweet_txt` below holds the cleaned tweet text. Let's move on from there to the sentiment analysis itself.

```
# classify emotion
class_emotion = classify_emotion(tweet_txt, algorithm="bayes", prior=1.0)
# get emotion best fit
emotion = class_emotion[,7]
# substitute NA's by "unknown"
emotion[is.na(emotion)] = "unknown"

# classify polarity
class_polarity= classify_polarity(tweet_txt, algorithm="bayes")
# get polarity best fit
polarity = class_polarity[,4]

```
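
To make the column indexing above less opaque, here is a minimal sketch that uses a hand-made matrix in place of real `classify_emotion` output. The column names (`ANGER` through `BEST_FIT`) and all the scores are assumptions made for illustration; check `colnames(class_emotion)` on your own result before relying on them.

```r
# Hypothetical stand-in for classify_emotion() output: a character matrix
# with one row per tweet. Column names and scores are assumed, not real output.
mock_emotion <- matrix(
  c("1.47", "3.09", "2.07", "7.34", "1.73", "2.79", "joy",
    "7.34", "3.09", "2.07", "1.03", "1.73", "2.79", "anger",
    "1.47", "3.09", "2.07", "1.03", "1.73", "2.79", NA),
  nrow = 3, byrow = TRUE,
  dimnames = list(NULL, c("ANGER", "DISGUST", "FEAR", "JOY",
                          "SADNESS", "SURPRISE", "BEST_FIT"))
)

# Selecting the column by name picks the same column as [, 7] in this mock,
# but is easier to read
best_fit <- mock_emotion[, "BEST_FIT"]

# Rows with no best fit are NA; label them "unknown"
best_fit[is.na(best_fit)] <- "unknown"

print(best_fit)
```

Indexing by name rather than by position also fails loudly if the column layout ever differs from what you expect.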

We now have emotions and polarity for our tweets. Let's create a data frame from the tweets, emotions and polarity.

```
# data frame with results
tweet_df = data.frame(text=tweet_txt, emotion=emotion,
                      polarity=polarity, stringsAsFactors=FALSE)

# order the emotion levels by frequency so plots show the most common first
tweet_df = within(tweet_df,
  emotion <- factor(emotion, levels=names(sort(table(emotion), decreasing=TRUE))))
```
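
The factor step above does not reorder rows; it reorders the factor levels so that plots list the most frequent emotion first. A small sketch on invented labels (the `toy_df` data frame is hypothetical) shows the effect:

```r
# Toy emotion labels, invented for illustration
toy_df <- data.frame(
  emotion = c("joy", "anger", "joy", "unknown", "joy", "anger"),
  stringsAsFactors = FALSE
)

# Count each label and sort the counts from most to least frequent
counts <- sort(table(toy_df$emotion), decreasing = TRUE)

# Use the sorted names as factor levels so plots follow this order
toy_df$emotion <- factor(toy_df$emotion, levels = names(counts))

print(levels(toy_df$emotion))
```

Without this step, the levels would be in alphabetical order and the bars would be drawn in that order instead.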

Let's generate some plots from this data set. First, plot the tweet distribution by emotion.

```
ggplot(tweet_df, aes(x=emotion)) +
  geom_bar(aes(y=..count.., fill=emotion)) +
  xlab("Emotions Categories") + ylab("Tweet Count") +
  ggtitle("Sentiment Analysis of Tweets on Emotions")
```

Next, plot the tweet distribution by polarity.

```
ggplot(tweet_df, aes(x=polarity)) +
  geom_bar(aes(y=..count.., fill=polarity)) +
  xlab("Polarities") + ylab("Tweet Count") +
  ggtitle("Sentiment Analysis of Tweets on Polarity")
```

Finally, separate the text by emotion and visualize the words with a comparison cloud.

```
# collapse all tweets sharing an emotion into one document per emotion
emos = levels(factor(tweet_df$emotion))
nemo = length(emos)
emo.docs = rep("", nemo)
for (i in seq_len(nemo))
{
  tmp = tweet_txt[emotion == emos[i]]
  emo.docs[i] = paste(tmp, collapse=" ")
}

# remove stopwords
emo.docs = removeWords(emo.docs, stopwords("english"))
# create corpus
corpus = Corpus(VectorSource(emo.docs))
tdm = TermDocumentMatrix(corpus)
tdm = as.matrix(tdm)
colnames(tdm) = emos

# comparison word cloud
comparison.cloud(tdm, colors = brewer.pal(nemo, "Dark2"),
                 scale = c(3,.5), random.order = FALSE, title.size = 1.5)

``` 
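
The steps above (one document per emotion, then a term-by-emotion count matrix) can be sketched in base R alone. This is a simplified stand-in for `TermDocumentMatrix`, not a reproduction of what tm does internally: there is no lowercasing, punctuation handling or stopword removal. The toy tweets, labels and variable names are invented for illustration.

```r
# Toy tweets and labels, invented for illustration
toy_txt     <- c("great visit great speech", "traffic was terrible",
                 "great crowd", "terrible security delays")
toy_emotion <- c("joy", "anger", "joy", "anger")

# One long document per emotion, mirroring the loop above
emos <- sort(unique(toy_emotion))
docs <- vapply(emos,
               function(e) paste(toy_txt[toy_emotion == e], collapse = " "),
               character(1))

# Split each document into words and count them against a shared vocabulary
words <- strsplit(docs, " ", fixed = TRUE)
vocab <- sort(unique(unlist(words)))
term_matrix <- vapply(words,
                      function(w) as.integer(table(factor(w, levels = vocab))),
                      integer(length(vocab)))
dimnames(term_matrix) <- list(vocab, emos)

print(term_matrix)
```

A matrix of this shape, with terms in rows and one column per emotion, is what `comparison.cloud` takes as its first argument.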