Whether you are dynamically creating URLs or changing existing ones, URL manipulation is a frequent requirement during development. It is easy for short URLs, but quickly becomes complex for URLs with many query parameters.
In this post we will see how to use the Net_URL2 PEAR package to manipulate URLs.
General URL syntax
Before we start, a review of general URL syntax will be useful. The most general form of a URL contains only two elements:
<scheme>:<scheme-specific-part>
The term scheme refers to the access method, such as ftp, http, telnet or file, which determines how the resource that follows is to be accessed. The rest of the URL, after the scheme, depends on the scheme type.
A complete generalized syntax for http and ftp URLs is shown below.
<scheme>://<user>:<password>@<host>:<port>/<path>;<params>?<query>#<fragment>
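To make the components concrete, here is a minimal sketch that splits a URL with PHP's built-in parse_url() function rather than Net_URL2. The sample URL and its values are made up for illustration. Note that parse_url() does not split off a ;params part; it stays inside the path.

```php
<?php
// Hypothetical sample URL that exercises most parts of the generalized syntax.
$sample = 'ftp://user:secret@ftp.example.com:21/pub/file.txt?mode=bin#top';

// parse_url() returns an associative array keyed by component name.
$parts = parse_url($sample);

$keys = array('scheme', 'user', 'pass', 'host', 'port', 'path', 'query', 'fragment');
foreach ($keys as $key) {
    // Components that are missing from the URL are absent from the array.
    $value = isset($parts[$key]) ? $parts[$key] : '(none)';
    echo str_pad($key, 9) . ': ' . $value . "\n";
}
```

Each key maps to one placeholder in the syntax line above, with 'pass' standing for the password.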
Installation
Since Net_URL2 is a PEAR package, we will install it with the PEAR installer as shown below. I recommend always using the PEAR installer rather than downloading packages manually, because the installer automatically downloads any dependent packages.
pear install Net_URL2-0.3.0
Reading url data
Now that we have seen what a general URL looks like, it's time to move on to real examples. In this example we will use the following sample URL.
http://www.some-domain.com:80/search.php?q=beatles&id=56&cat=music
Below is an example using the Net_URL2 library and its output for the above url:
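<?php
require_once 'Net/URL2.php';

$url = new Net_URL2('http://www.some-domain.com:80/search.php?q=beatles&id=56&cat=music');

echo "Host : " . $url->host . "\n";
echo "Protocol : " . $url->scheme . "\n";
echo "Port : " . $url->port . "\n";
echo "Path : " . $url->path . "\n";
echo "Query Variables: \n";
print_r($url->QueryVariables);
?>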
Which will output the following:
Host : www.some-domain.com
Protocol : http
Port : 80
Path : /search.php
Query Variables:
Array
(
[q] => beatles
[id] => 56
[cat] => music
)
Changing url data
Changing the parts of a URL is as easy as reading them.
.
.
/* The scheme is exposed as 'scheme', the same property name used when reading it */
$url->scheme = "https";
$url->path = "/my_search";
$queryVars = array();
/* Get the query variables array */
$queryVars = $url->QueryVariables;
/* Change some url parameters */
$queryVars ['q'] = "Scarlett Johansson";
$queryVars ['cat'] = "movies";
$queryVars ['pics'] = 1;
/* Save back the query variables array */
$url->QueryVariables = $queryVars;
/* Display the changed url */
echo $url->geturl();
Which will change the example url to the following:
https://www.some-domain.com:80/my_search?q=Scarlett%20Johansson&id=56&cat=movies&pics=1
Note the changed parameter values, and that we have added a new 'pics' parameter to the URL.
We can also change a parameter value by passing a name and value pair:
/* Change the 'cat' parameter value to 'books' */
$url->setQueryVariable('cat', "books");
Or unset a parameter:
/* This will remove the 'pics' parameter from the url */
$url->unsetQueryVariable('pics');
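For comparison, here is a rough sketch of the same read, change and write-back cycle using only PHP built-ins (parse_url(), parse_str() and http_build_query()). It is not how Net_URL2 is implemented. The helper name rewrite_query() is hypothetical, and it assumes an absolute URL with a scheme and host, ignoring user info and fragments.

```php
<?php
// Hypothetical helper: returns $url with its query string rewritten.
// $set maps parameter names to new values; $unset lists names to remove.
// Assumes an absolute URL; user info and fragment are not preserved.
function rewrite_query($url, array $set, array $unset = array())
{
    $parts = parse_url($url);

    // Turn the query string into an associative array.
    $vars = array();
    if (isset($parts['query'])) {
        parse_str($parts['query'], $vars);
    }

    // Apply the changes.
    foreach ($set as $name => $value) {
        $vars[$name] = $value;
    }
    foreach ($unset as $name) {
        unset($vars[$name]);
    }

    // Reassemble the URL by hand.
    $result = $parts['scheme'] . '://' . $parts['host'];
    if (isset($parts['port'])) {
        $result .= ':' . $parts['port'];
    }
    if (isset($parts['path'])) {
        $result .= $parts['path'];
    }
    if (count($vars) > 0) {
        // PHP_QUERY_RFC3986 encodes spaces as %20 instead of '+'.
        $result .= '?' . http_build_query($vars, '', '&', PHP_QUERY_RFC3986);
    }
    return $result;
}

$changed = rewrite_query(
    'http://www.some-domain.com:80/search.php?q=beatles&id=56&cat=music',
    array('q' => 'Scarlett Johansson', 'cat' => 'books'),
    array('id')
);
echo $changed . "\n";
// prints http://www.some-domain.com:80/search.php?q=Scarlett%20Johansson&cat=books
```

The hand-written reassembly step is the part a URL class saves you from, and it is where bugs tend to appear once user info, fragments or relative URLs come into play.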
You can also easily read the fragment identifier of a URL. A fragment identifier points to a sub-location within a resource. If you have a URL like the following:
http://www.some-domain.com/index.php#book_id
The fragment id can be reached by:
$url = new Net_URL2('http://www.some-domain.com/index.php#book_id');
/* Will return 'book_id' from the url */
echo $url->fragment;
If you are accessing a URL that contains credentials, as below:
ftp://username:password@some-domain.com
You can get the username and password with:
.
.
echo "Username : " . $url->user . "\n";
echo "Password : " . $url->password . "\n";
Normalizing URLs
URL normalization (or URL canonicalization) is the process by which URLs are modified and standardized in a consistent manner. Normalization helps you determine if two syntactically different URLs are equivalent.
A URL is normalized as shown below.
$url = new Net_URL2('http://www.example.com/../a/b/../c/./d.html');
/* Returns 'http://www.example.com/a/c/d.html' */
echo $url->getNormalizedURL();
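The path clean-up in this example corresponds to the "remove dot segments" step described in RFC 3986, section 5.2.4. The following is a minimal sketch of that step in plain PHP, assuming an absolute path that starts with a slash. The function name remove_dot_segments() is my own and this is not Net_URL2's actual code; full normalization covers more than dot segments.

```php
<?php
// Sketch of RFC 3986 section 5.2.4 for absolute paths (starting with '/').
function remove_dot_segments($path)
{
    $segments = explode('/', $path);
    $last = end($segments);
    $output = array();

    foreach ($segments as $segment) {
        if ($segment === '.') {
            // "." refers to the current directory: drop it.
            continue;
        }
        if ($segment === '..') {
            // ".." removes the previous segment, but never the leading
            // empty segment that stands for the root.
            if (count($output) > 1) {
                array_pop($output);
            }
            continue;
        }
        $output[] = $segment;
    }

    // A trailing "." or ".." leaves a directory, so keep the final slash.
    if ($last === '.' || $last === '..') {
        $output[] = '';
    }
    return implode('/', $output);
}

echo remove_dot_segments('/../a/b/../c/./d.html') . "\n"; // prints /a/c/d.html
```

Two URLs whose paths reduce to the same string after this step point to the same location, which is what makes normalization useful for comparing URLs.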
In conclusion
The Net_URL2 package helps you process URLs quickly, without resorting to complex regular expressions or string manipulation.
Additional information
RFC 3986
Uniform Resource Locator
URL normalization
Is this something one would only use during the development phase? I’m having a hard time coming up with reasons or scenarios to use this library … ? Where or when might someone use this on a production site?
Thanks,
Guy
http://www.nullamatix.com
There are many: generating SEO-friendly URLs, manipulating URLs during redirection, logging URLs, etc.