Extracting media information from online videos

With so many video sources available online, it is often useful to fetch additional information about a particular video. This helps when creating embed code or customizing how the video is displayed.

Essence is a simple PHP library for extracting media information from websites, such as YouTube videos, Twitter statuses or blog articles. In this post we will see how to use the library to gather meta information about various online video sources like YouTube, TED, Dailymotion and Vimeo.

Installation

Download the library zip from its Git repository and use the provided bootstrap file to include the library in your project. The library requires PHP 5.4 or later.

require_once 'path/to/essence/bootstrap.php';

Example

Below is a simple example that retrieves the meta information for a YouTube video.

require_once 'bootstrap.php';
 
$Essence = Essence\Essence::instance( );
 
/* Pass a youtube video url as a parameter */
$Media = $Essence->embed( 'http://www.youtube.com/watch?v=39e3KYAmXK4' );
print_r($Media);

Note: If for some reason you are unable to fetch a particular URL, try using http instead of https.

This will retrieve the following meta information for the video.

[type] => video
[version] => 1.0
[title] => Sir Ken Robinson: Do schools kill creativity?
[description] => 
[authorName] => TED
[authorUrl] => http://www.youtube.com/user/TEDtalksDirector
[providerName] => YouTube
[providerUrl] => http://www.youtube.com/
[cacheAge] => 
[thumbnailUrl] => http://i.ytimg.com/vi/iG9CE55wbtY/hqdefault.jpg
[thumbnailWidth] => 480
[thumbnailHeight] => 360
[html] => <iframe width="459" height="344" src="http://www.youtube.com/embed/iG9CE55wbtY?feature=oembed" frameborder="0" allowfullscreen></iframe>
[width] => 459
[height] => 344
[url] => http://www.youtube.com/watch?v=iG9CE55wbtY
[thumbnail_width] => 480
[provider_url] => http://www.youtube.com/
[thumbnail_url] => http://i.ytimg.com/vi/iG9CE55wbtY/hqdefault.jpg
[author_url] => http://www.youtube.com/user/TEDtalksDirector
[author_name] => TED
[thumbnail_height] => 360
[provider_name] => YouTube

For example, you can access the video's thumbnail URL as follows.

echo $Media->thumbnailUrl;

You can use the media information in an HTML page as follows.

<article>
    <header>
        <h1><?php echo $Media->title; ?></h1>
        <p>By <?php echo $Media->authorName; ?></p>
    </header>
 
    <div class="player">
        <?php echo $Media->html; ?>
    </div>
</article>
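Since the title and author name come from a third-party site, it is usually safer to escape them before printing them into a page. The sketch below assumes a small helper named e() (a hypothetical name, not part of Essence) and uses a stand-in object in place of a real Media result so that it runs without the library. The html property is left out because it is meant to be printed as markup.

```php
<?php
// Hypothetical helper (not part of Essence): escape a value for HTML output.
function e($value)
{
    return htmlspecialchars((string) $value, ENT_QUOTES, 'UTF-8');
}

// Stand-in for a Media object, with made-up values for illustration.
$Media = new stdClass();
$Media->title = 'Tips & <b>tricks</b>';
$Media->authorName = 'A "quoted" author';

echo '<h1>' . e($Media->title) . '</h1>' . PHP_EOL;
echo '<p>By ' . e($Media->authorName) . '</p>' . PHP_EOL;
```

In the template above, this would mean writing e($Media->title) and e($Media->authorName) in place of the bare properties.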

To enumerate all the field names, use the following loop. It lists every field available for a particular video.

foreach($Media as $property => $value) {
    echo $property . ",";
}

You can also retrieve information from other video sources such as Vimeo, TED and Dailymotion.

$Media = $Essence->embed('http://vimeo.com/channels/staffpicks/23895916');

I find it useful to define a function to retrieve the media details.

function getMediaInfo($url)
{
    $Essence = Essence\Essence::instance( );
    return $Essence->embed($url);    
}
 
$media = getMediaInfo('http://vimeo.com/channels/staffpicks/23895916');
print_r($media);
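The earlier tip about trying http instead of https can be folded into a wrapper of the same kind. The sketch below is only an illustration: fetchWithHttpFallback() and the stub fetcher are hypothetical names, and it assumes the fetcher returns null when a lookup fails, which may not match what Essence actually does.

```php
<?php
// Hypothetical helper: try a URL as given, then retry over http if nothing came back.
// $fetch is any callable that takes a URL and returns a result, or null on failure
// (the null-on-failure convention is an assumption of this sketch).
function fetchWithHttpFallback($url, callable $fetch)
{
    $result = $fetch($url);

    if ($result === null && strpos($url, 'https://') === 0) {
        // Swap only the scheme prefix, leaving the rest of the URL untouched.
        $result = $fetch('http://' . substr($url, strlen('https://')));
    }

    return $result;
}

// Stub fetcher for demonstration: it only "knows" http URLs.
$stub = function ($url) {
    return strpos($url, 'http://') === 0 ? array('url' => $url) : null;
};

print_r(fetchWithHttpFallback('https://vimeo.com/channels/staffpicks/23895916', $stub));
```

With the library loaded, the fetcher could be a closure that calls $Essence->embed($url), provided a failed lookup really does come back as null.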

Essence can replace any embeddable URL in a piece of text with information about it. By default, each URL is replaced by the html property of the Media found for it.

$text = 'Check this video: http://www.youtube.com/watch?v=RFinNxS5KN4';
echo $Essence->replace( $text );
 
// Will output:
 
// Check this video: <iframe width="480" height="270"
// src="http://www.youtube.com/embed/RFinNxS5KN4?feature=oembed" 
// frameborder="0" allowfullscreen></iframe>

You can customize what replaces the URL by passing a callback (note the use of an anonymous function):

$text = 'Check this video: http://www.youtube.com/watch?v=RFinNxS5KN4';
echo $Essence->replace( $text, function( $Media ) {
    return sprintf(
        '<p class="title">%s</p><div class="player">%s</div>',
        $Media->title,
        $Media->html
    );
});
 
// Will output:
 
// Check this video: <p class="title">Jurassic World - Official Trailer
// (HD)</p><div class="player"><iframe width="480" height="270" 
// src="http://www.youtube.com/embed/RFinNxS5KN4?feature=oembed" 
// frameborder="0" allowfullscreen></iframe></div>
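To make the callback pattern more concrete, here is a rough, library-free sketch of the general idea: find each URL in a text and substitute whatever a callback returns for it. This is not how Essence is implemented; replaceUrls() is a hypothetical name and the URL pattern is deliberately simplistic.

```php
<?php
// Illustration only: replace each URL in a text with the result of a callback.
// Not Essence's implementation, just the general pattern behind replace().
function replaceUrls($text, callable $callback)
{
    // A deliberately simple pattern: http(s) URLs up to the next whitespace.
    return preg_replace_callback('~https?://\S+~', function ($matches) use ($callback) {
        return $callback($matches[0]);
    }, $text);
}

$text = 'Check this video: http://www.youtube.com/watch?v=RFinNxS5KN4';

// Turn the URL into a plain link instead of an embedded player.
echo replaceUrls($text, function ($url) {
    return '<a href="' . htmlspecialchars($url, ENT_QUOTES, 'UTF-8') . '">link</a>';
});
```

The callback here receives a URL string, whereas the Essence callback shown above receives a Media object.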

The Essence class also provides a few utility methods. For example, you can use the extract() method to pull the embeddable URLs out of a web page.

$urls = $Essence->extract('http://mashable.com/2013/07/08/ted-talks-change-your-life/');
print_r($urls);
 
// Array
// (
//    [0] => http://www.youtube.com/user/mashable
//    [1] => http://embed.ted.com/talks...

You can then get media information for any of the extracted URLs.

$medias = $Essence->embed($urls[1]);
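Picking a URL by its position in the array is fragile, since the order depends on the page. One alternative is to select URLs by host. The sketch below assumes a hypothetical helper named filterUrlsByHost() and uses made-up sample URLs rather than real extract() output.

```php
<?php
// Hypothetical helper: keep only URLs whose host is one of the given domains
// or a subdomain of one of them.
function filterUrlsByHost(array $urls, array $domains)
{
    return array_values(array_filter($urls, function ($url) use ($domains) {
        $host = parse_url($url, PHP_URL_HOST);
        if (!is_string($host)) {
            return false; // Not a parseable absolute URL.
        }
        foreach ($domains as $domain) {
            $suffix = '.' . $domain;
            // Match the domain itself or any subdomain of it.
            if ($host === $domain || substr($host, -strlen($suffix)) === $suffix) {
                return true;
            }
        }
        return false;
    }));
}

// Made-up sample data standing in for the result of extract().
$urls = array(
    'http://www.youtube.com/user/mashable',
    'http://www.ted.com/talks/sample',
    'http://example.org/page',
);

print_r(filterUrlsByHost($urls, array('ted.com')));
```

Each URL that survives the filter can then be passed to embed() as before.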

Currently the library supports the following service providers. Note that I’ve tested only a few of them.

23hq             Dipity          Official.fm     Ted
Bandcamp         Flickr          Polldaddy       Twitter
Blip.tv          FunnyOrDie      Prezi           Vhx
Cacoo            HowCast         Qik             Viddler
CanalPlus        Huffduffer      Revision3       Vimeo
Chirb.it         Hulu            Scribd          Yfrog
Clikthrough      Ifixit          Shoudio         Youtube
CollegeHumor     Imgur           Sketchfab
Dailymotion      Instagram       SlideShare
Deviantart       Mobypicture     SoundCloud