Taking screenshots of websites is not a frequent requirement for developers, but it can come in handy on many occasions. Although there are some nice solutions on the web, one I found particularly good is wkhtmltoimage.
wkhtmltoimage is a simple shell utility that converts HTML to images using the WebKit rendering engine and Qt.
Installation
To get started, we first need to download and install the shell program wkhtmltoimage. Select the appropriate binaries for your platform. As I’m using Ubuntu 10, I downloaded the compiled binary ‘wkhtmltoimage-0.10.0_rc2-static-i386.tar.bz2’. Extract it to an appropriate folder and you are ready to go.
Getting your first snapshot
The simplest way to get a snapshot of a URL is the following:
wkhtmltoimage http://www.bbc.com bbc.jpg
This will fetch the www.bbc.com index page and save it as a jpg image. Below are a few websites rendered using the wkhtmltoimage tool.
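Capturing several sites one command at a time gets tedious. As a minimal sketch, the basic command could be wrapped in two small shell functions. The names url_to_filename and snapshot_all and the WKHTMLTOIMAGE variable are made up for this illustration and are not part of wkhtmltoimage itself.

```shell
#!/bin/sh
# Path to the binary (assumption: override WKHTMLTOIMAGE if it is not on the PATH).
WKHTMLTOIMAGE="${WKHTMLTOIMAGE:-wkhtmltoimage}"

# Turn a URL into a file name: drop the scheme and trailing slashes, replace
# anything that is not a letter, digit, dot or dash with "_", then add ".jpg".
url_to_filename() {
    printf '%s\n' "$1" | sed -e 's|^[a-zA-Z]*://||' \
                             -e 's|/*$||' \
                             -e 's|[^A-Za-z0-9.-]|_|g' \
                             -e 's|$|.jpg|'
}

# Render every URL given as an argument into its own JPEG file.
# The body runs in a subshell so its variables do not leak out.
snapshot_all() (
    for url in "$@"; do
        out=$(url_to_filename "$url")
        "$WKHTMLTOIMAGE" "$url" "$out" || echo "failed: $url" >&2
    done
)
```

Calling `snapshot_all http://www.bbc.com` would run the same command as above, with the output file named www.bbc.com.jpg instead of bbc.jpg.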
Customizing the output
wkhtmltoimage comes with a plethora of options. A few are shown below; more can be found in the documentation.
The default output quality is ‘94’, which can make some images quite large, but you can change it to your liking with the option below.
wkhtmltoimage --quality 50 http://www.bbc.com bbc.jpg
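To pick a quality value, it helps to see how the file size changes with it. The following is only a sketch: compare_quality is a hypothetical helper name, and the WKHTMLTOIMAGE variable is an assumption that lets you point at the binary wherever you extracted it.

```shell
#!/bin/sh
# Path to the binary (assumption: override WKHTMLTOIMAGE if it is not on the PATH).
WKHTMLTOIMAGE="${WKHTMLTOIMAGE:-wkhtmltoimage}"

# Render one URL at several quality levels and print each file's size.
# Usage: compare_quality URL PREFIX QUALITY...
compare_quality() (
    if [ $# -lt 3 ]; then
        echo "usage: compare_quality URL PREFIX QUALITY..." >&2
        exit 2
    fi
    url=$1; prefix=$2; shift 2
    for q in "$@"; do
        out="${prefix}-q${q}.jpg"
        # Skip this quality level if the render fails.
        "$WKHTMLTOIMAGE" --quality "$q" "$url" "$out" >/dev/null 2>&1 || continue
        # wc -c reports the size in bytes; tr strips any padding spaces.
        printf '%s\t%s bytes\n' "$out" "$(wc -c < "$out" | tr -d ' ')"
    done
)
```

For example, `compare_quality http://www.bbc.com bbc 30 50 94` would write bbc-q30.jpg, bbc-q50.jpg and bbc-q94.jpg and list their sizes side by side.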
By default, all the images on a page are rendered in the final image, but you can leave them out of the screenshot using the following:
wkhtmltoimage --no-images http://www.bbc.com bbc.jpg
You can also set the output height and width of the image (in pixels) as below.
wkhtmltoimage --height 600 --width 1800 http://www.bbc.com bbc.jpg
or crop the image to a specified size:
wkhtmltoimage --crop-h 300 --crop-w 300 --crop-x 0 --crop-y 0 http://www.bbc.com bbc.jpg
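The four crop options are easy to mix up, so a thin wrapper that checks its arguments can help. This is a sketch under my own assumptions: capture_region is a hypothetical name, the argument order (x, y, width, height) is my choice, and only the --crop-* flags shown above come from wkhtmltoimage.

```shell
#!/bin/sh
# Path to the binary (assumption: override WKHTMLTOIMAGE if it is not on the PATH).
WKHTMLTOIMAGE="${WKHTMLTOIMAGE:-wkhtmltoimage}"

# Crop a rectangle out of the rendered page.
# Usage: capture_region URL OUT X Y WIDTH HEIGHT
capture_region() (
    if [ $# -ne 6 ]; then
        echo "usage: capture_region URL OUT X Y WIDTH HEIGHT" >&2
        exit 2
    fi
    url=$1; out=$2; shift 2
    # Reject anything that is not a plain non-negative integer.
    for n in "$@"; do
        case $n in
            ''|*[!0-9]*) echo "not a non-negative integer: $n" >&2; exit 2 ;;
        esac
    done
    "$WKHTMLTOIMAGE" --crop-x "$1" --crop-y "$2" --crop-w "$3" --crop-h "$4" "$url" "$out"
)
```

`capture_region http://www.bbc.com bbc.jpg 0 0 300 300` passes the same crop values as the command above, while a typo such as a negative width is rejected before the page is fetched.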
Sometimes the JavaScript on the page you are capturing can cause problems during rendering, preventing the program from saving the screenshot or causing a huge delay. In such cases you can ask wkhtmltoimage not to run JavaScript on the page.
wkhtmltoimage --disable-javascript http://www.bbc.com bbc.jpg
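If you do not know in advance which pages will misbehave, one possible approach is to try a normal render under a time limit and fall back to --disable-javascript only when that fails. The sketch below assumes the GNU coreutils `timeout` command is installed; snapshot_with_fallback, the WKHTMLTOIMAGE variable and the 30-second default are assumptions made for this illustration.

```shell
#!/bin/sh
# Path to the binary (assumption: override WKHTMLTOIMAGE if it is not on the PATH).
WKHTMLTOIMAGE="${WKHTMLTOIMAGE:-wkhtmltoimage}"

# Try a normal render first; if it fails or runs past the time limit,
# retry once with JavaScript disabled.
# Usage: snapshot_with_fallback URL OUT [SECONDS]
snapshot_with_fallback() (
    if [ $# -lt 2 ]; then
        echo "usage: snapshot_with_fallback URL OUT [SECONDS]" >&2
        exit 2
    fi
    url=$1; out=$2; limit=${3:-30}
    # timeout stops the command if it is still running after $limit seconds.
    if timeout "$limit" "$WKHTMLTOIMAGE" "$url" "$out"; then
        exit 0
    fi
    echo "retrying $url without JavaScript" >&2
    timeout "$limit" "$WKHTMLTOIMAGE" --disable-javascript "$url" "$out"
)
```

The function's exit status is that of the last attempt, so a caller can still detect pages that fail both ways.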
Using with PHP
Although this is not a pure PHP solution for taking screenshots, you can wrap the final command in a shell_exec call or download the ‘snappy’ PHP5 wrapper from below. The original library is located here, but as its source keeps changing, the examples given here do not work with it, so use the library given below.
shell_exec('./wkhtmltoimage --quality 50 http://www.bbc.com bbc.jpg');
Below is a short snippet using ‘snappy’ to take a screenshot of bbc.com.
<?php
$snap = new Image('./wkhtmltoimage-i386');
$snap->output('http://www.bbc.com');
?>
and with a few options added…
0.5, 'no-images' => true);
/* 'wkhtmltoimage' executable is located in the current directory */
$snap = new Image('./wkhtmltoimage-i386',$options);
/* Displays the bbc.com website index page screen-shot in the browser */
header("Content-Type: image/jpeg");
$snap->output('http://www.bbc.com');
?>
In my opinion, you should play with the shell program first until you know the options well, and only then use the ‘snappy’ PHP wrapper.
In closing, here is a snapshot of my site, rendered at 50% quality.
Happy rendering!!
I’ve looked at a few different methods for site screenshots; the biggest problem always comes down to Flash. A common technique is to add a delay after the page has loaded to let the Flash objects finish, but for some sites this means screenshots are really slow. In some cases I found Flash objects still hadn’t properly loaded after a delay of 30 seconds.
Awesome stuff! Quick question, how much harder would it be, then, to convert the output to a PDF? I can see a real benefit to being able to output dynamically-generated PDF invoices, forms, etc by just designing them in HTML and using this method.
Sounds like a wrapper for the WebKit engine, which can do this by default over the shell. But really nice, and a good summary.
Andrew, you can do this using a PHP PDF library. There are a number of classes available on the net.
That is genuinely useful, thanks. It used to be an utter pain to do things like this.
I guess my only question is, can one set the user agent header? I’m thinking about things like getting a screenie of the mobile version of a site etc.?
Try something like below:
wkhtmltoimage --custom-header User-Agent Mozilla http://www.bbc.com bbc.jpg
Andrew, for PDF output the project offers another binary, wkhtmltopdf, and you may use the following command to render the output as PDF:
wkhtmltopdf http://www.myhomepage.com myhomepage.pdf
I’ve used wkhtmltopdf, but it gives inconsistent results for many sites, which is why I did not recommend it.
Thank you, very interesting article!
I can’t get the PHP example to work. Why do you include
“require_once('Knplabs/Snappy/Media.php');” if the Media.php file doesn’t exist in the ‘snappy’ library?
Please, can you make a simple example in PHP, including the library for Windows?
I tried putting the whole library in my localhost path, but it doesn’t work.
A little help, please!
The author of the Snappy library has changed the source, which caused your example to not work. I’ve uploaded the original library above which will work with the examples given. Thanks for the heads-up.
Works perfectly!! Thanks a lot.
But it doesn’t work with Flash sites. I need an image of my dynamic reports, like this:
http://download.hkvstore.com/phprptdemo4/quarterly_orders_by_productctb.php
This type of report contains a Flash graph. I can’t get an image of this site, let alone of mine.
Any advice? Please, this is my last chance, because exporting directly to PDF is more difficult than exporting an image to PDF.
A little help, please!
Sorry for the double post, but it is the same with the PDF tool ‘wkhtmltopdf’. I think the problem is the delay while Flash loads on the site, as Joseph Scott says in the first comment. The dynamic table of my report appears fine in the PDF file, but the graphs don’t, because they are Flash.
How do I do this? “A common technique is to add a delay after the page has loaded to let the Flash objects finish”. I ask because the Flash drawing in my report is very small.
Again, thanks a lot.
Probably off topic, but Adobe BrowserLab does an incredible job of checking cross-browser compatibility and saving screenshots of a web page in multiple browsers.
Good web site indeed! But we are unable to capture a web page that contains a rotated image using Snappy. We have tried with this URL – http://projectscare.com/case_creator/temp1.html
Could anyone please help us in this regard? Many thanks in advance.
I want to create thumbnail images of the first page of each .doc file and display them to the left of the title.
See for example:
http://www.rasfoiesc.com/inginerie/tehnica-mecanica/index3.php here is a list of docs. I want the thumbnail to be on the left as well.
– so I want to generate thumbnail images for each Word / .doc file.
How can I do that?