Sentiment Analysis of Twitter feeds

In the last post we looked into accessing the Twitter API v1.1 from PHP. In this post we will see how to add sentiment analysis for the tweets. Generally speaking, sentiment analysis aims to determine the attitude of a writer with respect to some topic. A basic task in sentiment analysis is classifying the polarity of a given text: whether the opinion expressed in a sentence is positive, negative, or neutral.

In this post we will use a simple sentiment analysis library to analyze the sentiment of tweets.

Installation

As usual, install the library with Composer using the following command. PHP 5.4 or later is required.

composer require viracore/caroline:1.0.1

Building a simple example

The following is a complete example that scores the sentiment of a sentence as positive, negative or neutral. The score is derived from per-word ratings, each of which ranges between minus five (negative) and plus five (positive).

<?php
 
require_once 'vendor/autoload.php';
use CertifiedWebNinja\Caroline\Analysis;
use CertifiedWebNinja\Caroline\DataSets\AFINN;
 
$afinn = new AFINN;
 
$caroline = new Analysis($afinn);
 
$result = $caroline->analyze('Christmas is coming let us all enjoy.');
 
echo 'Score: '.$result->getScore().PHP_EOL;
echo 'Comparative: '.$result->getComparative().PHP_EOL;

// Dump the full result object to see the tokens and matched words.
print_r($result);

The following is the complete output for the above example.

Score: 2
Comparative: -2
CertifiedWebNinja\Caroline\Result Object
(
    [data:CertifiedWebNinja\Caroline\Result:private] => Array
        (
            [string] => Christmas is coming let us all enjoy.
            [score] => 2
            [comparative] => -2
            [tokens] => Array
                (
                    [0] => christmas
                    [1] => is
                    [2] => coming
                    [3] => let
                    [4] => us
                    [5] => all
                    [6] => enjoy
                )
 
            [words] => Array
                (
                    [0] => enjoy
                )
 
            [positive] => Array
                (
                    [0] => enjoy
                )
 
            [negative] => Array
                (
                )
 
        )
 
)
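
The numeric score can be reduced to the three polarity classes mentioned at the start of the post. The following is a minimal sketch, assuming that a score of zero should count as neutral; polarityLabel() is a hypothetical helper and not part of the library.

```php
<?php

// Hypothetical helper, not part of the library:
// turn a numeric sentiment score into a polarity label.
function polarityLabel($score)
{
    if ($score > 0) {
        return 'positive';
    }
    if ($score < 0) {
        return 'negative';
    }
    // Assumption: a score of exactly zero is treated as neutral.
    return 'neutral';
}

echo polarityLabel(2) . PHP_EOL;
echo polarityLabel(0) . PHP_EOL;
echo polarityLabel(-3) . PHP_EOL;
```

With the library, the value passed in would be $result->getScore().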

A little more on sentiment analysis

Current approaches to sentiment analysis can be grouped into four main categories: keyword spotting, lexical affinity, statistical methods, and concept-level techniques. The library used here is a simple implementation of keyword spotting, which classifies text into affect categories based on the presence of unambiguous words such as happy, sad, afraid, and bored.

One of the simplest sentiment analysis approaches (which is used here) compares the words of a posting against a labeled word list (a dataset), where each word has been scored for valence. Such a list is known as a “sentiment lexicon” or an “affective word list”.
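
The word-list approach described above can be sketched in a few lines without any library. This is only an illustration of the idea, not the library's actual code: the four-word lexicon, its valences and the scoreText() function are all made up for the example.

```php
<?php

// Hypothetical mini-lexicon for illustration only; a real list such as AFINN is far larger.
$lexicon = array(
    'enjoy' => 2,
    'happy' => 3,
    'sad'   => -2,
    'bored' => -2,
);

// Score a text by summing the valence of every word found in the lexicon.
function scoreText($text, array $lexicon)
{
    // Lowercase, then split on anything that is not a letter or an apostrophe.
    $tokens = preg_split("/[^a-z']+/", strtolower($text), -1, PREG_SPLIT_NO_EMPTY);

    $score = 0;
    foreach ($tokens as $token) {
        if (isset($lexicon[$token])) {
            $score += $lexicon[$token];
        }
    }

    return $score;
}

echo scoreText('Christmas is coming let us all enjoy.', $lexicon) . PHP_EOL; // 2
echo scoreText('I am sad and bored.', $lexicon) . PHP_EOL;                   // -4
```

Words that are missing from the lexicon contribute nothing, which is why the quality of the word list matters so much for this approach.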

Datasets

The current library uses the AFINN dataset of words to analyze the tweets. AFINN is a list of English words rated for valence with an integer between minus five (negative) and plus five (positive).

You can create your own dataset, or adapt the current one to your requirements or vertical by editing the ‘AFINN.php’ file.
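
To give an idea of what tailoring a word list to a vertical means, here is a rough sketch using plain arrays. How ‘AFINN.php’ stores its words is not shown in this post, so the arrays, words and valences below are stand-ins chosen for illustration.

```php
<?php

// Stand-in for a general-purpose word list (values are illustrative).
$baseLexicon = array(
    'enjoy' => 2,
    'crash' => -2,
    'sad'   => -2,
);

// Hypothetical overrides for a finance-oriented vertical.
$customLexicon = array(
    'crash'   => -4, // stronger negative in this domain
    'bullish' => 3,  // new domain-specific word
);

// With string keys, array_merge() lets later arrays overwrite earlier ones.
$lexicon = array_merge($baseLexicon, $customLexicon);

print_r($lexicon);
```

Keeping the overrides in a separate array makes it easy to see which words were changed for the vertical.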

Integrating with Twitter API

We can now integrate the library with the Twitter API implementation we saw in the last post. The complete example is given below.

<?php
 
require_once 'vendor/autoload.php';
require_once './social/TwitterAPIExchange.php';
 
use CertifiedWebNinja\Caroline\Analysis;
use CertifiedWebNinja\Caroline\DataSets\AFINN;
 
$afinn = new AFINN;
 
$caroline = new Analysis($afinn);
 
$settings = array(
    'oauth_access_token' => "YOUR_ACCESS_TOKEN",
    'oauth_access_token_secret' => "YOUR_ACCESS_TOKEN_SECRET",
    'consumer_key' => "YOUR_CONSUMER_KEY",
    'consumer_secret' => "YOUR_CONSUMER_SECRET"
);
 
$url = "https://api.twitter.com/1.1/statuses/user_timeline.json";
$requestMethod = "GET";
$getfield = '?screen_name=codediesel&count=3';
 
$twitter = new TwitterAPIExchange($settings);
 
$response = $twitter->setGetfield($getfield)
                    ->buildOauth($url, $requestMethod)
                    ->performRequest();
 
$tweets = json_decode($response);
 
foreach($tweets as $tweet)
{
    $result = $caroline->analyze($tweet->text);
    echo $tweet->text . " - " . $result->getScore() . PHP_EOL;
}
 
?>
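
The example above assumes that the API call succeeds and returns a list of tweets. When a request fails, the v1.1 API returns a JSON object with an errors list instead, so it can help to check the decoded response before looping over it. The sketch below uses hard-coded sample payloads in place of $response; extractTweetTexts() is a hypothetical helper.

```php
<?php

// Hypothetical helper: pull the tweet texts out of a raw JSON response,
// or throw if the response is not a list of tweets.
function extractTweetTexts($json)
{
    $decoded = json_decode($json);

    // json_decode() returns null when the input is not valid JSON.
    if ($decoded === null) {
        throw new RuntimeException('Response is not valid JSON');
    }

    // An error response is an object rather than an array of tweets.
    if (!is_array($decoded)) {
        $message = isset($decoded->errors[0]->message)
            ? $decoded->errors[0]->message
            : 'Unexpected response';
        throw new RuntimeException($message);
    }

    $texts = array();
    foreach ($decoded as $tweet) {
        if (isset($tweet->text)) {
            $texts[] = $tweet->text;
        }
    }

    return $texts;
}

// Sample payloads standing in for $response (assumed shapes, for illustration).
$ok    = '[{"text":"Christmas is coming let us all enjoy."}]';
$error = '{"errors":[{"message":"Rate limit exceeded","code":88}]}';

print_r(extractTweetTexts($ok));

try {
    extractTweetTexts($error);
} catch (RuntimeException $e) {
    echo 'Error: ' . $e->getMessage() . PHP_EOL;
}
```

The returned texts can then be passed one by one to $caroline->analyze().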

Working with PHP version < 5.4

The library is not compatible with PHP < 5.4 because it uses the shorthand array syntax.

// old way
private $data = array();
 
// new way - shorthand arrays
private $data = [];
 
// old way
$my_array = array(1,2,3);
 
// new way - shorthand arrays
$my_array = [1,2,3];

The really frustrating part is when a library requires a newer PHP version because of a single new feature that is not really needed and only saves a few characters. This breaks compatibility with older PHP versions, which are still in use on many hosting servers.

To make the sentiment library work with older PHP versions, just replace the shorthand arrays with the old array() syntax in the library files. There are only a few occurrences, so this can be done quickly.
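
If the same script has to run on several servers, a small version check can report the problem up front. This is a sketch only; supportsShortArrays() is a hypothetical helper built on PHP's version_compare(). Because the shorthand syntax causes a parse error on older versions, the check has to run before any library file is loaded.

```php
<?php

// Hypothetical helper: report whether a PHP version supports
// the shorthand array syntax, which was added in PHP 5.4.
function supportsShortArrays($version)
{
    return version_compare($version, '5.4.0', '>=');
}

if (!supportsShortArrays(PHP_VERSION)) {
    echo 'PHP ' . PHP_VERSION . ' detected: replace [] with array() in the library files first.' . PHP_EOL;
}

var_dump(supportsShortArrays('5.3.29')); // bool(false)
var_dump(supportsShortArrays('5.4.0'));  // bool(true)
```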