Item based collaborative filtering in PHP


Most people are familiar with recommendation systems on websites: after you select an item, you are presented with a list of similar items that other people purchased. Amazon is the best-known example and was also one of the first to use such a system. A snapshot from Amazon is shown below.

Collaborative filtering algorithms work by searching a large group of users or items and picking out a smaller set whose tastes are similar to yours.

In this post I’ll show you how to integrate a simple recommendation system. You can download the class file here.

We will work with the following book data as an example.

sample data

It shows a list of books purchased by various people, who rated each one on a scale of 5. The list can be anything: books, jewelry, movies, blog posts; any items where ratings are possible. You have to create this array at runtime from the database.

Our job here will be to use this list to recommend books. The above list is converted into an array here.

$books =  array(
 
    "phil" => array("my girl" => 2.5, "the god delusion" => 3.5,
                    "tweak" => 3, "the shack" => 4,
                    "the birds in my life" => 2.5,
                    "new moon" => 3.5),
 
    "sameer" => array("the last lecture" => 2.5,
                      "the god delusion" => 3.5,
                      "the noble wilds" => 3, "the shack" => 3.5,
                      "the birds in my life" => 2.5, "new moon" => 1),
 
    "john" => array("a thousand splendid suns" => 5, "the secret" => 3.5,
                    "tweak" => 1),
 
    "peter" => array("chaos" => 5, "php in action" => 3.5),
 
    "jill" => array("the last lecture" => 1.5, "the secret" => 2.5,
                    "the noble wilds" => 4, "the host: a novel" => 3.5,
                    "the world without end" => 2.5, "new moon" => 3.5),
 
    "bruce" => array("the last lecture" => 3, "the hollow" => 1.5,
                     "the noble wilds" => 3, "the shack" => 3.5,
                     "the appeal" => 2, "new moon" => 3),
 
    "tom" => array("chaos" => 2.5)
 
);
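The array above is hard-coded. As a rough sketch of how it could be built at runtime, the helper below groups flat (user, item, rating) rows into the same nested shape. The function name `buildPreferences` and the column names `user`, `item` and `rating` are assumptions made for illustration; they are not part of recommend.php, and your own table will likely use different names.

```php
<?php
// Hypothetical helper (not part of recommend.php): groups flat
// (user, item, rating) rows into the nested array shape used above.
function buildPreferences(iterable $rows): array
{
    $prefs = [];
    foreach ($rows as $row) {
        // The column names 'user', 'item' and 'rating' are assumptions.
        $prefs[$row['user']][$row['item']] = (float) $row['rating'];
    }
    return $prefs;
}

// Rows shaped like the result of PDOStatement::fetchAll(PDO::FETCH_ASSOC).
$rows = [
    ['user' => 'peter', 'item' => 'chaos',         'rating' => '5'],
    ['user' => 'peter', 'item' => 'php in action', 'rating' => '3.5'],
    ['user' => 'tom',   'item' => 'chaos',         'rating' => '2.5'],
];

$prefs = buildPreferences($rows);
print_r($prefs);
```

Casting to float matters because database drivers often return numeric columns as strings.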

Let's start by including the class file.

require_once("recommend.php");

Let's say you are ‘John’ and the books you have purchased so far are ‘a thousand splendid suns’, ‘the secret’ and ‘tweak’, with the ratings shown above. If you come to the website again, what books would the site recommend? Let's try!

$re = new Recommend();
 
print_r($re->getRecommendations($books, "john"));

It will output the following books with their predicted ratings, sorted in descending order.

Array
(
    [the noble wilds] => 4
    [the shack] => 4
    [the host: a novel] => 3.5
    [new moon] => 3.5
    [the god delusion] => 3.5
    [the world without end] => 2.5
    [the birds in my life] => 2.5
    [my girl] => 2.5
    [the last lecture] => 1.5
)

Now let's try with ‘tom’.

$re = new Recommend();
 
print_r($re->getRecommendations($books, "tom"));

The recommended books will be…

Array
(
[php in action] => 3.5
)
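The two outputs above are consistent with a similarity-weighted average of other people's ratings. The sketch below is one way to get there, not the code inside recommend.php: the function names `sketchSimilarity` and `sketchRecommendations` are made up for this example, and it assumes a similarity of 1 / (1 + Euclidean distance) over shared items, with people who share no items being ignored.

```php
<?php
// Similarity of two raters: 1 / (1 + Euclidean distance over shared items).
// Returns 0 when the two people have no item in common.
function sketchSimilarity(array $prefs, string $a, string $b): float
{
    $shared = array_intersect_key($prefs[$a], $prefs[$b]);
    if (!$shared) {
        return 0.0;
    }
    $sumSquares = 0.0;
    foreach ($shared as $item => $rating) {
        $sumSquares += ($rating - $prefs[$b][$item]) ** 2;
    }
    return 1 / (1 + sqrt($sumSquares));
}

// Predicted rating per unseen item: the similarity-weighted average of
// everyone else's ratings, sorted from highest to lowest.
function sketchRecommendations(array $prefs, string $person): array
{
    $weighted = [];
    $weights = [];
    foreach ($prefs as $other => $ratings) {
        if ($other === $person) {
            continue;
        }
        $sim = sketchSimilarity($prefs, $person, $other);
        if ($sim <= 0) {
            continue; // nothing in common, so no influence
        }
        foreach ($ratings as $item => $rating) {
            if (isset($prefs[$person][$item])) {
                continue; // already rated by this person
            }
            $weighted[$item] = ($weighted[$item] ?? 0) + $rating * $sim;
            $weights[$item] = ($weights[$item] ?? 0) + $sim;
        }
    }
    $scores = [];
    foreach ($weighted as $item => $total) {
        $scores[$item] = $total / $weights[$item];
    }
    arsort($scores);
    return $scores;
}

// Two-person subset of the $books data.
$sample = [
    'peter' => ['chaos' => 5, 'php in action' => 3.5],
    'tom'   => ['chaos' => 2.5],
];
print_r(sketchRecommendations($sample, 'tom'));
```

With this subset, Tom's only neighbour is Peter, so ‘php in action’ is predicted at Peter's own rating of 3.5 (up to floating-point rounding). The same reasoning explains John's list: ‘the shack’ comes out at 4 because phil is the only person who rated it and also shares a book with John.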

Now how about recommending items based on a book name rather than on a person? For that, we have to invert the array using the ‘transformPreferences’ function and then use the ‘matchItems’ function.

Inverting the array like this

$result = $re->transformPreferences($books);

will return an array as shown below.

Array
(
    [my girl] => Array
        (
            [phil] => 2.5
        )
 
    [the god delusion] => Array
        (
            [phil] => 3.5
            [sameer] => 3.5
        )
 
    [tweak] => Array
        (
            [phil] => 3
            [john] => 1
        )
 
    [the shack] => Array
        (
            [phil] => 4
            [sameer] => 3.5
            [bruce] => 3.5
        )
 
    [the birds in my life] => Array
        (
            [phil] => 2.5
            [sameer] => 2.5
        )
 
    [new moon] => Array
        (
            [phil] => 3.5
            [sameer] => 1
            [jill] => 3.5
            [bruce] => 3
        )
 
    [the last lecture] => Array
        (
            [sameer] => 2.5
            [jill] => 1.5
            [bruce] => 3
        )
 
.
.
.
.
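As an illustration of what this inversion involves, here is a minimal sketch that swaps the two key levels. The function name `invertPreferences` is hypothetical, and the real ‘transformPreferences’ method may be written differently.

```php
<?php
// Hypothetical stand-in for transformPreferences(): swaps the two key
// levels so that ratings are grouped by item instead of by person.
function invertPreferences(array $prefs): array
{
    $byItem = [];
    foreach ($prefs as $person => $ratings) {
        foreach ($ratings as $item => $rating) {
            $byItem[$item][$person] = $rating;
        }
    }
    return $byItem;
}

// Small subset of the $books data.
$sample = [
    'phil' => ['tweak' => 3, 'my girl' => 2.5],
    'john' => ['tweak' => 1],
];
print_r(invertPreferences($sample));
```

Once the data is keyed by item, the same similarity logic that compares two people can compare two books instead.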

Let's try with the book ‘chaos’ and see what the system recommends.

$re = new Recommend();
 
$result = $re->transformPreferences($books);
 
print_r($re->matchItems($result, "chaos"));

The output will be:

Array
(
[php in action] => 0.4
)

But now the value on the right is not a rating but a match score on a scale of 0 to 1, with 1 being a perfect match. If you change the rating of ‘php in action’ in the above array to 5 and run the code again, you will get a match score of 1.

You can also find out how similar the choices of two people are, on a scale of 0 to 1.

$similarity =  $re->similarityDistance($books, "tom", "peter");
//Converted to percent.
echo sprintf("%.2f", $similarity * 100) . "%";

This will print:

28.57%
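Both this figure and the 0.4 match score for ‘chaos’ can be reproduced by hand, assuming the class turns a Euclidean distance over shared ratings into a similarity as written below. The symbols are introduced here for illustration: r_{a,i} is the rating a gives to i, and S_{ab} is the set of items (or raters) that a and b have in common.

```latex
\mathrm{sim}(a, b) = \frac{1}{1 + \sqrt{\sum_{i \in S_{ab}} \left(r_{a,i} - r_{b,i}\right)^2}}

% tom and peter share only "chaos" (2.5 vs 5):
\mathrm{sim}(\text{tom}, \text{peter}) = \frac{1}{1 + \sqrt{(2.5 - 5)^2}} = \frac{1}{3.5} \approx 0.2857

% "chaos" and "php in action" share only peter (5 vs 3.5):
\mathrm{sim}(\text{chaos}, \text{php in action}) = \frac{1}{1 + \sqrt{(5 - 3.5)^2}} = \frac{1}{2.5} = 0.4
```

Under the same assumption, identical ratings give a distance of 0 and therefore a similarity of exactly 1.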

This simplified filtering system can be used on a site with a few thousand items and members. But as the number of items grows, it becomes time-consuming to calculate the recommendations every time someone purchases or browses an item. In that case it helps to precalculate the array and store it in a database, maybe once a day. But that is a different story altogether. For more information on the algorithms and their variations, you can refer to the wonderful Programming Collective Intelligence: Building Smart Web 2.0 Applications. The code examples in the book are in Python, though.

The above code may not work in each and every instance but you can tweak the code according to your requirements.

Download code

This site is a digital habitat of Sameer Borate, a freelance web developer working in PHP, MySQL and WordPress. I also provide web scraping services, website design and development and integration of various Open Source API's. Contact me at metapix[at]gmail.com for any new project requirements and price quotes.

7 Responses

1

Adrian Puiu

June 2nd, 2009 at 11:11 pm

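for those who don’t have the names of books in the table, just use explode to extract the id of the product:

$html = '';
$z = 0;
foreach ($recommendation as $key => $value)
{
    // Stop once six products have been added.
    if ($z > 5)
    {
        break;
    }
    // The product id is the part of the key after the underscore.
    $key_parts = explode('_', $key);

    $product_id = $key_parts[1];

    $html .= show_product($product_id);

    $z++;
}

where the second part of each key is the $product_id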

2

wow gold

June 12th, 2009 at 2:01 am

this is exactly the post I needed to see!

3

dofus kamas

October 6th, 2009 at 9:50 pm

this is exactly the post I needed to see!

4

Malscorpion

January 28th, 2010 at 12:10 pm

Thank you very much.

This post was the starting point for my project.

My project is now complete.

Anybody interested in recommender systems and collaborative filtering

who has questions can send them to me at

malscorpio@hotmail.com

If I can help with your problem, I will be pleased to.

5

metin2yang

March 1st, 2010 at 11:47 pm

This is a great article! Very good!

6

twoh

April 7th, 2013 at 9:22 pm

What method do you use for calculating the similarity ?

sameer

April 7th, 2013 at 10:34 pm

Euclidean Distance score
