Scanning web pages for malicious scripts

With the recent surge of malicious JavaScript injections on the web, it has become necessary to regularly check your web sites for injected code. I created a small PHP script that checks a list of URLs for malicious JavaScript. This can come in handy if you have many client websites under your control.

The PHP script reads two text files, ‘urls.txt’ and ‘malicious.txt’: the first contains the list of web pages to be scanned and the second the malicious script signatures. The script scans each URL for the signatures and, if any infections are found, saves the results to an ‘infected.txt’ file. The script is best run from the command line, where you can easily follow the progress when scanning a large number of URLs.
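The core of such a scanner can be sketched as below. This is a minimal, hypothetical reconstruction, not the downloadable script itself: the function names and the one-entry-per-line file format are assumptions.

```php
<?php
// Sketch of the scanner's core logic (function names are illustrative).

// Return true if any malicious signature appears in the page HTML.
function page_is_infected(string $html, array $signatures): bool
{
    foreach ($signatures as $sig) {
        $sig = trim($sig);
        if ($sig !== '' && stripos($html, $sig) !== false) {
            return true;
        }
    }
    return false;
}

// Read non-empty, trimmed lines from a text file.
function read_lines(string $path): array
{
    $lines = @file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    return $lines === false ? [] : array_map('trim', $lines);
}

// Run the scan only when the input files are present,
// e.g. from the command line:  php url_scan.php
if (PHP_SAPI === 'cli' && is_readable('urls.txt') && is_readable('malicious.txt')) {
    $urls       = read_lines('urls.txt');
    $signatures = read_lines('malicious.txt');
    $infected   = [];

    echo 'Checking ' . count($urls) . " sites for malicious scripts.\n";
    echo count($signatures) . " malicious signatures in file.\n\n";

    foreach ($urls as $url) {
        echo "Now scanning : $url\n";
        $html = @file_get_contents($url);   // fetches the given page only
        if ($html !== false && page_is_infected($html, $signatures)) {
            $infected[] = $url;
        }
    }

    file_put_contents('infected.txt', implode("\n", $infected));
    echo "\nTotal " . count($infected) . ' sites infected of ' . count($urls) . "\n";
}
```

The signature check is a simple case-insensitive substring match, which is enough to catch injected `<script src="...">` references to a known bad domain.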

D:\localhost\test\scan>php url_scan.php

A sample output of a scan is shown below:

D:\localhost\test\scan>php url_scan.php

Checking 3 sites for malicious scripts.
3 malicious signatures in file.

Now scanning :

Now scanning :

Now scanning :

Total 0 sites infected of 3

Note that the script only scans the URL path given, not the complete web site. So if given a URL like ‘’ it will only scan the index file of the site. The index file may not be infected while some other file in a sub-directory is; in that case the malicious code will not be found. But the larger percentage of malicious script injections are inflicted on the index page.

Setting a cron for automatic scanning

The best way to regularly check for infections is to set up the script as a cron job. This lets you check for malicious scripts at a regular interval, and the cron job can then send the ‘infected.txt’ file via email if any infections are found.
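A crontab entry for this could look like the following. This is only an example configuration: the paths, schedule, and email address are placeholders, and it assumes a Unix host with the `mail` command available.

```shell
# Run the scanner daily at 02:00; if infected.txt is non-empty,
# mail it to the administrator (paths and address are examples).
0 2 * * * cd /home/user/scan && php url_scan.php && [ -s infected.txt ] && mail -s "Malicious scripts found" admin@example.com < infected.txt
```

The `[ -s infected.txt ]` test ensures mail is sent only when the results file actually contains something.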

Updating your malicious.txt file

You cannot fight new code injections if your ‘malicious.txt’ file is not kept updated. So if you find some new malicious JavaScript code, it is essential that you add a new signature to the file. I know I’m putting the cart before the horse, but you can find information about new infections at or malwaredomainlist.

Other ways to check malicious code injections

One main problem with the script is that if a new infection occurs and its signature is not in the ‘malicious.txt’ database, that particular infection will be missed. One other solution is to check the filesize of the URL you are scanning. The expected filesize would be added to the ‘urls.txt’ file, so the script can check whether the filesize of the scanned URL matches the one given. For that we would need to use the FTP functions of PHP, but we will leave that to another post.
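The idea can be sketched as below. Everything here is an assumption, not the post's actual code: the `url|expected_size` line format and function names are invented for illustration, and this version reads the server-reported size from an HTTP `Content-Length` header via `get_headers()` instead of the FTP functions the post mentions.

```php
<?php
// Hypothetical sketch of the filesize check; line format and
// function names are assumptions, not the post's code.

// Split a "url|expected_size" line from urls.txt into its parts.
// Returns [url, size] with size null when no size was given.
function parse_url_size(string $line): array
{
    [$url, $size] = array_pad(explode('|', trim($line), 2), 2, null);
    return [$url, $size === null ? null : (int) $size];
}

// Fetch the remote size via an HTTP HEAD-style request
// (an alternative to PHP's FTP functions).
function remote_size(string $url): ?int
{
    $headers = @get_headers($url, true);
    if ($headers === false || !isset($headers['Content-Length'])) {
        return null;
    }
    $len = $headers['Content-Length'];
    // On redirects Content-Length may be an array; take the last value.
    return (int) (is_array($len) ? end($len) : $len);
}
```

A changed filesize is only a hint, not proof of infection, since pages change legitimately; it is best used to flag URLs for a manual look.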

Download Source

11 thoughts to “Scanning web pages for malicious scripts”

  1. Hi, how does it work? I tried it from the command line but it gives me this message:
    ‘php’ is not recognized as an internal or external command, operable program or batch file.

    I have XAMPP and Windows XP.

  2. Your PHP folder is not in your Windows PATH. Add it in Control Panel > System > (Advanced Tab) > Environment Variables.

  3. Hi 🙂 great tool. I have a quick question:

    In the malicious.txt file, do I add only one link to a malicious URL per line? Or do I add one signature per line, for example
    line 1 .
    line 2 var
    line 3 etc

    Thanks again, and I look forward to making better use of this 🙂

  4. You add the signatures in the malicious.txt file. In the download code given above I’ve added signatures of malicious websites.

  5. Thanks for responding so quickly. You say you’ve added signatures, but the malicious.txt file only has 3 separate lines of infected websites; is this what the signatures are?
    Thanks again

  6. Yes, these are the signatures. On the web most malicious scripts are injected JavaScript links from a particular domain (mostly Chinese or Russian sites) or some snippet of code. The 3 websites given in the sample malicious.txt file were the domains from which, at one point in time, malicious JavaScript code files were referenced on my site.

  7. Hi, I want to know: can I apply this tool to extract malicious JavaScript, HTML, and URL-based features for classification of websites?

    If you have any ideas, please reply.


  8. You could also use the Google Safe Browsing Lookup API; that would save you the trouble of having to keep malicious.txt up to date.
