Whether you are dynamically creating URLs or changing existing ones, URL manipulation is a frequent requirement during development. It is easy for short URLs, but quickly becomes complex for URLs with many query parameters.
In this post we will see how to use the Net_URL2 PEAR package to manipulate URLs.
General URL syntax
Before we start, a quick review of general URL syntax will be useful. In its most general form, a URL contains only two elements:
    <scheme>:<scheme-specific-part>
The scheme names the access method, such as ftp, http, telnet or file, and describes how the resource that follows is to be used. The rest of the URL, after the scheme, depends on the scheme type.
A complete generalized syntax for http and ftp is shown below.
    <scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<fragment>
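To make the components of this syntax concrete, here is a minimal sketch that splits a URL with PHP's built-in parse_url() function rather than Net_URL2. The URL and its credentials are made up for illustration. Note that parse_url() does not separate a ;params part; it stays inside the path.

```php
<?php
// Hypothetical URL, used only for illustration.
$sample = 'http://john:secret@www.example.com:8080/docs/index.php?lang=en#intro';

// parse_url() returns an associative array of the components it finds.
$parts = parse_url($sample);

$keys = array('scheme', 'user', 'pass', 'host', 'port', 'path', 'query', 'fragment');
foreach ($keys as $key) {
    // Components missing from the URL are simply absent from the array.
    $value = isset($parts[$key]) ? $parts[$key] : '(none)';
    echo str_pad($key, 9) . ': ' . $value . "\n";
}
```

Each key maps onto one element of the generalized syntax: pass is the password, and port comes back as an integer.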
Since Net_URL2 is a PEAR package, we will install it with the PEAR installer as shown below. I recommend always using the PEAR installer rather than downloading packages manually, because the installer automatically downloads any dependent packages.
pear install Net_URL2-0.3.0
Reading URL data
Now that we have seen what a general URL looks like, it's time to move on to real examples. In this example we will use the following sample URL:
    http://www.some-domain.com:80/search.php?q=beatles&id=56&cat=music
Below is an example using the Net_URL2 library and its output for the above url:
    <?php
    include('Net/URL2.php');

    $url = new Net_URL2('http://www.some-domain.com:80/search.php?q=beatles&id=56&cat=music');

    /* Each URL component is available as a property */
    echo "Host : " . $url->host . "\n";
    echo "Protocol : " . $url->scheme . "\n";
    echo "Port : " . $url->port . "\n";
    echo "Path : " . $url->path . "\n";

    /* The query string is returned as an associative array */
    echo "Query Variables: \n";
    print_r($url->QueryVariables);
    ?>
Which will output the following:
    Host : www.some-domain.com
    Protocol : http
    Port : 80
    Path : /search.php
    Query Variables:
    Array
    (
        [q] => beatles
        [id] => 56
        [cat] => music
    )
Changing URL data
Changing the components of a URL is as easy as reading them.
    . .
    /* The scheme is set through the same 'scheme' property that was read earlier */
    $url->scheme = "https";
    $url->path = "/my_search";

    /* Get the query variables array */
    $queryVars = $url->QueryVariables;

    /* Change some url parameters */
    $queryVars['q'] = "Scarlett Johansson";
    $queryVars['cat'] = "movies";
    $queryVars['pics'] = 1;

    /* Save back the query variables array */
    $url->QueryVariables = $queryVars;

    /* Display the changed url */
    echo $url->getURL();
This changes the scheme, path and query string of the example URL.
Note the changed parameter values; also note that we have added a new ‘pics’ parameter to the URL.
We can also change a parameter value using a name/value pair:
    /* Change the 'cat' parameter value to 'books' */
    $url->setQueryVariable('cat', "books");
Or unset a parameter:
    /* This will remove the 'pics' parameter from the url */
    $url->unsetQueryVariable('pics');
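For comparison, here is a rough sketch of the same kind of edit done with PHP's built-in functions only, without Net_URL2. The sample URL is the one used above minus the explicit port, and the hand-written reassembly only covers scheme, host, path and query, which is exactly the bookkeeping the package takes off your hands.

```php
<?php
$original = 'http://www.some-domain.com/search.php?q=beatles&id=56&cat=music';

// Split the URL, then decode the query string into an associative array.
$parts = parse_url($original);
parse_str($parts['query'], $vars);

// Change, add and remove query variables.
$vars['cat']  = 'books';
$vars['pics'] = 1;
unset($vars['id']);

// Reassemble by hand; user info, port and fragment are not handled here.
$rebuilt = $parts['scheme'] . '://' . $parts['host'] . $parts['path']
         . '?' . http_build_query($vars, '', '&');

echo $rebuilt . "\n";
// prints http://www.some-domain.com/search.php?q=beatles&cat=books&pics=1
```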
You can also easily read the fragment identifier of a URL. A fragment identifier points to a sub-location within a resource. If you have a URL like the following:
    http://www.some-domain.com/index.php#book_id
The fragment id can be reached by:
    $url = new Net_URL2('http://www.some-domain.com/index.php#book_id');

    /* Will return 'book_id' from the url */
    echo $url->fragment;
If a URL carries credentials in its user:password@host part, you can read the username and password as follows:
    . .
    echo "Username : " . $url->user . "\n";
    echo "Password : " . $url->password . "\n";
URL normalization (or URL canonicalization) is the process by which URLs are modified and standardized in a consistent manner. Normalization helps you determine if two syntactically different URLs are equivalent.
We can normalize a URL as follows:
    $url = new Net_URL2('http://www.example.com/../a/b/../c/./d.html');

    /* Returns 'http://www.example.com/a/c/d.html' */
    echo $url->getNormalizedURL();
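To illustrate what normalization does to the path in the example above, here is a simplified sketch of dot-segment removal in the spirit of RFC 3986, section 5.2.4. The function name removeDotSegments() is hypothetical and this is not Net_URL2's own code; it assumes an absolute path (one starting with a slash) and ignores the other normalization steps.

```php
<?php
/**
 * Simplified dot-segment removal for an absolute path.
 * Hypothetical helper for illustration, not part of Net_URL2.
 */
function removeDotSegments($path)
{
    $segments = explode('/', $path);
    $output   = array();

    foreach ($segments as $segment) {
        if ($segment === '.') {
            continue;                 // "." means "current directory": drop it
        }
        if ($segment === '..') {
            // ".." drops the previous segment, but never climbs above the root
            if (count($output) > 1) {
                array_pop($output);
            }
            continue;
        }
        $output[] = $segment;
    }

    // A trailing "." or ".." refers to a directory, so keep the trailing slash.
    $last = end($segments);
    if ($last === '.' || $last === '..') {
        $output[] = '';
    }

    return implode('/', $output);
}

// Two syntactically different paths that turn out to be equivalent.
$a = removeDotSegments('/../a/b/../c/./d.html');
$b = removeDotSegments('/a/c/d.html');

echo $a . "\n";          // prints /a/c/d.html
var_dump($a === $b);     // bool(true)
```

Comparing the cleaned-up forms rather than the raw strings is what lets normalization answer the equivalence question.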
The Net_URL2 package helps you process URLs quickly, without resorting to complex regular expressions or string manipulation.