Taking screenshots of websites in PHP

Taking screenshots of websites is not a frequent requirement for developers, but it can come in handy on many occasions. Although there are some nice solutions on the web, one I found particularly good is wkhtmltoimage.

wkhtmltoimage is a simple shell utility that converts HTML to images using the WebKit rendering engine and Qt.

Installation

To get started, we first need to download and install the shell program wkhtmltoimage. Select the appropriate binaries for your platform. As I’m using Ubuntu 10, I downloaded the compiled binary ‘wkhtmltoimage-0.10.0_rc2-static-i386.tar.bz2’. Extract it to an appropriate folder and you are ready to go.

Getting your first snapshot

The simplest way to get a snapshot of a URL is the following:

wkhtmltoimage http://www.bbc.com bbc.jpg

This will fetch the www.bbc.com index page and save it as a jpg image. Below are a few websites rendered using the wkhtmltoimage tool.

Customizing the output

wkhtmltoimage comes with a plethora of options. A few are shown below; the rest can be found in the documentation.

The default output quality is 94, which can make some images quite large, but you can change it to your liking using the option below.

wkhtmltoimage --quality 50 http://www.bbc.com bbc.jpg

Also by default all the images on a page are rendered in the final image, but you can disable images in the final screen-shot using the following:

wkhtmltoimage --no-images http://www.bbc.com bbc.jpg

You can also set the output height and width of the image (in pixels) as below.

wkhtmltoimage --height 600 --width 1800 http://www.bbc.com bbc.jpg

Or crop the image to a specified size:

wkhtmltoimage --crop-h 300 --crop-w 300 --crop-x 0 --crop-y 0 http://www.bbc.com bbc.jpg

Sometimes the JavaScript on the webpage you are rendering can cause problems during rendering, preventing the program from saving the screenshot or causing a huge delay. In such cases you can ask wkhtmltoimage to not run JavaScript on the page while rendering.

wkhtmltoimage --disable-javascript http://www.bbc.com bbc.jpg

Using with PHP

Although this is not a pure PHP solution for taking screenshots, you can wrap the final command in a shell_exec call or download the ‘snappy’ PHP5 wrapper from below. The source of the original library keeps changing, so the examples given here do not work with its current version; use the copy of the library given below instead.

Download Snappy
shell_exec('./wkhtmltoimage --quality 50 http://www.bbc.com bbc.jpg');
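
If the URL or the output file name ever comes from user input, the command string should not be assembled by plain concatenation. The following is a minimal sketch, not part of the original post, of how the shell_exec approach above might be hardened with PHP's escapeshellarg and exec. The function names buildScreenshotCommand and takeScreenshot, and their parameters, are hypothetical and chosen only for illustration.

```php
<?php

/**
 * Build a wkhtmltoimage command line with every dynamic part shell-escaped.
 * (Hypothetical helper for this sketch; only --quality is passed through.)
 */
function buildScreenshotCommand($binary, $url, $outputFile, $quality = 50)
{
    return sprintf(
        '%s --quality %d %s %s',
        escapeshellarg($binary),
        (int) $quality,
        escapeshellarg($url),
        escapeshellarg($outputFile)
    );
}

/**
 * Run the command and report success.
 * Assumption: exit code 0 plus an existing output file counts as success.
 */
function takeScreenshot($binary, $url, $outputFile, $quality = 50)
{
    $command = buildScreenshotCommand($binary, $url, $outputFile, $quality);

    // Redirect stderr so progress and error messages end up in $outputLines.
    exec($command . ' 2>&1', $outputLines, $exitCode);

    return $exitCode === 0 && is_file($outputFile);
}
```

A call such as takeScreenshot('./wkhtmltoimage', 'http://www.bbc.com', 'bbc.jpg') would then return true or false instead of failing silently. Using exec rather than shell_exec is a design choice here: it exposes the exit code, which makes failures easier to detect.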

Below is a short script using ‘snappy’ to take a screenshot of bbc.com.

<?php

/* Tested on Ubuntu 10.04, requires PHP 5.3 */

require_once('Knplabs/Snappy/Media.php');
require_once('Knplabs/Snappy/Image.php');

/* Import the class instead of declaring this script inside the library's namespace */
use Knplabs\Snappy\Image;

/* The 'wkhtmltoimage' executable is located in the current directory */
$snap = new Image('./wkhtmltoimage');

/* Displays a screenshot of the bbc.com index page in the browser */
header("Content-Type: image/jpeg");
$snap->output('http://www.bbc.com');

?>

And with a few options added:

<?php

/* Tested on Ubuntu 10.04, requires PHP 5.3 */

require_once('Knplabs/Snappy/Media.php');
require_once('Knplabs/Snappy/Image.php');

/* Import the class instead of declaring this script inside the library's namespace */
use Knplabs\Snappy\Image;

/* Render at half zoom and skip loading images */
$options = array('zoom' => 0.5, 'no-images' => true);

/* The 'wkhtmltoimage-i386' executable is located in the current directory */
$snap = new Image('./wkhtmltoimage-i386', $options);

/* Displays a screenshot of the bbc.com index page in the browser */
header("Content-Type: image/jpeg");
$snap->output('http://www.bbc.com');

?>
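
To make the link between the options array and the shell flags shown earlier more concrete, here is a rough sketch of how such an array could be translated into command-line flags. This is not Snappy's actual implementation; the function name optionsToFlags is hypothetical, and the mapping rules (true becomes a bare flag, false or null is skipped, anything else becomes a flag with an escaped value) are assumptions made for illustration.

```php
<?php

/**
 * Turn an options array into a string of command-line flags.
 *   true         => bare flag              (--no-images)
 *   false / null => option is skipped
 *   other value  => flag plus escaped value (--zoom followed by the value)
 * Option names are assumed to come from trusted code, not from user input.
 */
function optionsToFlags(array $options)
{
    $flags = array();

    foreach ($options as $name => $value) {
        if ($value === false || $value === null) {
            continue;
        }

        $flag = '--' . $name;
        if ($value !== true) {
            $flag .= ' ' . escapeshellarg((string) $value);
        }

        $flags[] = $flag;
    }

    return implode(' ', $flags);
}
```

Calling optionsToFlags(array('zoom' => 0.5, 'no-images' => true)) yields a --zoom flag with the value 0.5 shell-quoted, followed by --no-images, which is the same form you would type when experimenting with the shell program directly.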

In my opinion, you should play with the shell program first until you know the full set of options, and only then use the ‘snappy’ PHP wrapper.

In closing, here is a snapshot of my site, rendered at 50% quality.

Happy rendering!!



20 thoughts on “Taking screenshots of websites in PHP”

  1. I’ve looked at a few different methods for site screenshots, the biggest problem always comes down to Flash. A common technique is to add a delay after the page has loaded to let the Flash objects finish, but for some sites this means screenshots are really slow. In some cases I found Flash objects still hadn’t properly loaded after a delay of 30 seconds.

  2. Awesome stuff! Quick question, how much harder would it be, then, to convert the output to a PDF? I can see a real benefit to being able to output dynamically-generated PDF invoices, forms, etc by just designing them in HTML and using this method.

  3. That is genuinely useful, thanks. It used to be an utter pain to do things like this.

    I guess my only question is, can one set the user agent header? I’m thinking about things like getting a screenie of the mobile version of a site etc.?

  4. I can’t get the PHP example to work. Why do you include
    “require_once(‘Knplabs/Snappy/Media.php’);” when the Media.php file doesn’t exist in the ‘snappy’ library?
    Could you please post a simple PHP example, including the library for Windows?

    I tried putting the whole library in my localhost path, but it doesn’t work.
    A little help, please!

  5. The author of the Snappy library has changed the source, which caused your example to not work. I’ve uploaded the original library above which will work with the examples given. Thanks for the heads-up.

  6. Sorry for the double post, but it is the same with the PDF tool ‘wkhtmltopdf’. I think the problem is the delay while Flash loads on the site, as Joseph Scott says in the first comment. The dynamic table in my report appears fine in the PDF file, but the charts don’t, because they are Flash.
    How do I do this: “A common technique is to add a delay after the page has loaded to let the Flash objects finish”? The Flash chart in my report is very small.

    Again, thanks a lot.

  7. Probably off topic, but Adobe BrowserLab does an incredible job of checking cross-browser compatibility and saving screenshots of a web page in multiple browsers.
