Beautifying XML documents


I frequently write PHP code to access various web services, and the most common response format I encounter is XML, which is usually unformatted. I use xmlPad to format and analyze XML documents, but I often need to format XML documents on a production server, where xmlPad is of no use. What one needs is a library that can beautify untidy XML documents from within PHP code. XML_Beautifier provides that solution.

Installation

XML_Beautifier is a PEAR package, so we will use the PEAR installer as shown below. I recommend always using the PEAR installer rather than downloading packages manually, as the installer can also resolve and download the packages that a package depends on.

pear install XML_Beautifier

Usage

You can beautify an XML file or an XML string generated on the fly. The following example formats an untidy XML file. The input file must be a well-formed XML document, or the parser will report an error.

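<?php

require_once "XML/Beautifier.php";

$fmt = new XML_Beautifier();

// Read unformatted.xml and write the formatted result to beautified.xml
$result = $fmt->formatFile('unformatted.xml', 'beautified.xml');

// formatFile() returns a PEAR_Error object on failure
if (PEAR::isError($result)) {
    echo $result->getMessage();
    exit(1);
}

echo "Done";

?>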
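
Since the beautifier rejects input that is not well-formed, it can be handy to check a document before handing it over. The following is only a sketch: `xmlParseErrors()` is a hypothetical helper name, and it relies on PHP's bundled libxml/DOM extension rather than on XML_Beautifier itself, so the two parsers may not agree on every edge case.

```php
<?php

/**
 * Hypothetical helper: returns a list of parser error messages for $xml.
 * An empty array means libxml considers the document well-formed.
 */
function xmlParseErrors($xml)
{
    // Collect libxml errors internally instead of emitting PHP warnings
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    $doc = new DOMDocument();
    $doc->loadXML($xml);

    $messages = array();
    foreach (libxml_get_errors() as $error) {
        $messages[] = sprintf('line %d: %s', $error->line, trim($error->message));
    }

    // Restore the previous error handling mode
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $messages;
}

// The closing tag for <foo> is missing, so at least one error is reported
foreach (xmlParseErrors('<rootNode><foo>hello</rootNode>') as $message) {
    echo $message, "\n";
}
```

If the returned array is empty, the document can be passed on to the beautifier; otherwise the messages point at the lines that need fixing.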

Formatting an XML string is just as simple.

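<?php

require_once "XML/Beautifier.php";

/* Unformatted XML string */
$xml = '<rootNode><foo   bar = "pear">hello world!</foo></rootNode>';

$fmt = new XML_Beautifier();
$result = $fmt->formatString($xml);

// formatString() returns a PEAR_Error object if the string cannot be parsed
if (PEAR::isError($result)) {
    echo $result->getMessage();
    exit(1);
}

echo $result;

?>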

The formatted XML string:

<rootNode>
    <foo bar="pear">hello world!</foo>
</rootNode>

If you frequently need to format XML documents, then it is more convenient to have command-line access to the beautifier, as below.

<?php
 
/** xmlformat.php
 *  usage: php xmlformat.php untidy.xml beautified.xml
 */
 
// Both the input and the output file names are required
if ($argc < 3) {
    echo "\nUsage: php {$argv[0]} unformatted.xml formatted.xml\n";
    exit(1);
}

require_once "XML/Beautifier.php";

echo "Formatting {$argv[1]}...";

$fmt = new XML_Beautifier();
$result = $fmt->formatFile($argv[1], $argv[2]);

// Report the error and signal failure to the calling shell
if (PEAR::isError($result)) {
    echo $result->getMessage() . "\n";
    exit(1);
}

echo "Done\n";
 
?>

Options

The XML_Beautifier class also accepts an array of options, which you can pass to the constructor. The details of the options are described in the package documentation.

.
.
$options = array(
                    "caseFolding"       => true,
                    "caseFoldingTo"     => "uppercase",
                    "normalizeComments" => true
                );
 
$fmt = new XML_Beautifier($options);
.
.
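
As a slightly fuller sketch, the snippet below sets a few more options and changes some after construction. The option names (`indent`, `linebreak`, `normalizeComments`, `caseFolding`, `caseFoldingTo`) and the `setOption()` method are given as I recall them from the package documentation, so treat them as assumptions and check them against your installed version. The values and the sample XML are purely illustrative.

```php
<?php

// Illustrative values; option names assumed from the XML_Beautifier docs
$options = array(
    'indent'            => "  ",  // two spaces per nesting level
    'linebreak'         => "\n",  // Unix line endings
    'normalizeComments' => true,  // normalize whitespace inside comments
);

// Only run the demo when the PEAR package is found on the include path
if (stream_resolve_include_path('XML/Beautifier.php') !== false) {
    require_once 'XML/Beautifier.php';

    $fmt = new XML_Beautifier($options);

    // Options can also be changed after the object has been created
    $fmt->setOption('caseFolding', true);
    $fmt->setOption('caseFoldingTo', 'lowercase');

    $result = $fmt->formatString('<Root><Item ID="1">text</Item></Root>');
    echo PEAR::isError($result) ? $result->getMessage() : $result;
}
```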

One closing note: formatting large XML documents can take some time, depending on your machine configuration, so take that into account if you plan to format documents on a live server.
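
To get a feel for the cost before deploying, you could time a formatting call on a representative document. This is a minimal sketch: `timeFormatting()` is a hypothetical helper that accepts any callable, so it is not tied to XML_Beautifier, and the sample string stands in for a real document.

```php
<?php

/**
 * Hypothetical helper: runs $format on $xml and reports the elapsed
 * wall-clock time in seconds alongside the returned value.
 */
function timeFormatting($format, $xml)
{
    $start  = microtime(true);
    $result = call_user_func($format, $xml);

    return array(
        'result'  => $result,
        'seconds' => microtime(true) - $start,
    );
}

// Usage with XML_Beautifier, assuming the PEAR package is installed
if (stream_resolve_include_path('XML/Beautifier.php') !== false) {
    require_once 'XML/Beautifier.php';

    $fmt    = new XML_Beautifier();
    $timing = timeFormatting(
        array($fmt, 'formatString'),
        '<rootNode><foo bar="pear">hello world!</foo></rootNode>'
    );

    printf("Formatted in %.4f seconds\n", $timing['seconds']);
}
```

Running this against the largest documents you expect should show whether formatting inline is acceptable or whether it is better done offline.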

This site is a digital habitat of Sameer Borate, a freelance web developer working in PHP, MySQL and WordPress. I also provide web scraping services, website design and development and integration of various Open Source API's. Contact me at metapix[at]gmail.com for any new project requirements and price quotes.
