Easy manipulation of URLs


Posted in: pear | 9 Nov 2009

Whether you are creating URLs dynamically or changing existing ones, URL manipulation is a frequent requirement during development. It is easy for short URLs, but quickly becomes complex for URLs with many query parameters.
In this post we will see how to use the Net_URL2 PEAR package to manipulate URLs.

General URL syntax

Before we start, a general URL syntax review will be useful. The most general form of a URL contains only two elements:

<scheme>:<scheme-specific-part>

The term scheme refers to the access method, such as ftp, http, telnet or file, and describes how the resource that follows is to be accessed. The rest of the URL, after the scheme, depends on the scheme type.

The complete generalized syntax for schemes such as http and ftp is shown below.

<scheme>://<user>:<password>@<host>:<port>/<url-path>;<params>?<query>#<fragment>
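
To make the generic syntax concrete, here is a minimal sketch that splits a URL into those components. It uses PHP's built-in parse_url() rather than Net_URL2, and the URL itself is a made-up example, so treat it only as an illustration of the parts named above.

```php
<?php
// Made-up example URL containing most of the generic components.
$parts = parse_url('http://user:secret@www.example.com:8080/docs/index.php?q=beatles#top');

// Component names as returned by parse_url(); 'pass' is its key for the password.
$names = array('scheme', 'user', 'pass', 'host', 'port', 'path', 'query', 'fragment');

foreach ($names as $name) {
    // parse_url() leaves out components that are absent from the URL.
    $value = isset($parts[$name]) ? $parts[$name] : '(none)';
    echo str_pad($name, 10) . ': ' . $value . "\n";
}
```

Note that parse_url() does not split out the ;<params> part; it stays inside the path.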

Installation

Since Net_URL2 is a PEAR package, we will install it with the PEAR installer as shown below. I recommend always using the PEAR installer rather than downloading packages manually, as the installer automatically downloads any dependent packages.

pear install Net_URL2-0.3.0

Reading url data

Now that we have seen what a general URL looks like, it's time to move on to real examples. In this example we will use the following sample URL.

http://www.some-domain.com:80/search.php?q=beatles&id=56&cat=music

Below is an example using the Net_URL2 library and its output for the above url:

<?php
 
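require_once 'Net/URL2.php';
 
/* Keep the url on a single line; a line break inside the quotes would become part of the url */
$url = new Net_URL2('http://www.some-domain.com:80/search.php?q=beatles&id=56&cat=music');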
 
echo "Host      :    " . $url->host . "\n";
echo "Protocol  :    " . $url->scheme. "\n";
echo "Port      :    " . $url->port . "\n";
echo "Path      :    " . $url->path . "\n";
 
echo "Query Variables: \n";
print_r($url->QueryVariables);
 
?>

Which will output the following:

Host      :    www.some-domain.com
Protocol  :    http
Port      :    80
Path      :    /search.php
Query Variables: 
Array
(
    [q] => beatles
    [id] => 56
    [cat] => music
)
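
The same values can also be read through explicit getter methods instead of properties. The following is a minimal sketch, assuming Net_URL2 is installed and on the include path; the variable names are only for illustration.

```php
<?php
require_once 'Net/URL2.php';

$url = new Net_URL2('http://www.some-domain.com:80/search.php?q=beatles&id=56&cat=music');

// Explicit getters, used here in place of the property reads.
$host  = $url->getHost();
$query = $url->getQuery();           // the raw query string
$vars  = $url->getQueryVariables();  // the query string parsed into an array

echo $host . "\n";
echo $query . "\n";

// Guard against a missing parameter before reading it.
echo (isset($vars['q']) ? $vars['q'] : '(no q parameter)') . "\n";
```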

Changing url data

We can change the various URL components just as easily as we can read them.

/* Continuing with the $url object created in the previous example */

/* The protocol is exposed as 'scheme', the same name used when reading it */
$url->scheme = "https";
$url->path = "/my_search";
 
/* Get the query variables array */
$queryVars = $url->QueryVariables;
 
/* Change some url parameters */
$queryVars['q'] = "Scarlett Johansson";
$queryVars['cat'] = "movies";
$queryVars['pics'] = 1;
 
/* Save back the query variables array */
$url->QueryVariables = $queryVars;
 
/* Display the changed url */
echo $url->getURL();

Which will change the example url to the following:

https://www.some-domain.com:80/my_search?q=Scarlett%20Johansson&id=56&cat=movies&pics=1

Note the changed parameter values, and that we have added a new ‘pics’ parameter to the URL.

We can also change a single parameter value using a name/value pair.

/* Change the 'cat' parameter value to 'books' */
$url->setQueryVariable('cat', "books");

Or unset a parameter:

/* This will remove the 'pics' parameter from the url */
$url->unsetQueryVariable('pics');
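
The two calls combine naturally when several parameters need to change at once. Below is a rough sketch, assuming Net_URL2 is installed. The changeQuery() helper and its "null means remove" convention are my own invention for this example, not part of the package.

```php
<?php
require_once 'Net/URL2.php';

/**
 * Hypothetical helper (not part of Net_URL2): apply several query
 * changes in one call. A null value removes the parameter.
 */
function changeQuery($urlString, array $changes)
{
    $url = new Net_URL2($urlString);

    foreach ($changes as $name => $value) {
        if ($value === null) {
            $url->unsetQueryVariable($name);
        } else {
            $url->setQueryVariable($name, $value);
        }
    }

    return $url->getURL();
}

// Change 'cat', drop 'id' and add a new 'page' parameter.
echo changeQuery(
    'http://www.some-domain.com/search.php?q=beatles&id=56&cat=music',
    array('cat' => 'books', 'id' => null, 'page' => '2')
) . "\n";
```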

You can also easily read the fragment identifier of a URL. A fragment identifier points to a sub-location within a resource. If you have a URL like the following:

http://www.some-domain.com/index.php#book_id

The fragment identifier can be read with:

$url = new Net_URL2('http://www.some-domain.com/index.php#book_id');
 
/* Will return 'book_id' from the url */
echo $url->fragment;

If you are accessing a url using some credentials as below:

ftp://username:password@some-domain.com

You can get the username and password with:

$url = new Net_URL2('ftp://username:password@some-domain.com');
 
echo "Username  :    " . $url->user . "\n";
echo "Password  :    " . $url->password . "\n";

Normalizing URLs

URL normalization (or URL canonicalization) is the process by which URLs are modified and standardized in a consistent manner. Normalization helps you determine if two syntactically different URLs are equivalent.

We can normalize a URL as shown below.

$url = new Net_URL2('http://www.example.com/../a/b/../c/./d.html');
 
/* Returns 'http://www.example.com/a/c/d.html' */
echo $url->getNormalizedURL();
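
Since the point of normalization is to compare URLs, a small comparison helper can be built on top of getNormalizedURL(). This is only a sketch, assuming Net_URL2 is installed. The function name matchAfterNormalizing() is hypothetical, and a match here means nothing more than "the normalized strings are identical".

```php
<?php
require_once 'Net/URL2.php';

/**
 * Hypothetical helper (not part of Net_URL2): true when both urls
 * produce the same string after normalization.
 */
function matchAfterNormalizing($first, $second)
{
    $a = new Net_URL2($first);
    $b = new Net_URL2($second);

    return $a->getNormalizedURL() === $b->getNormalizedURL();
}

// Two spellings of the same path, one of them with dot segments.
var_dump(matchAfterNormalizing(
    'http://www.example.com/a/b/../c/./d.html',
    'http://www.example.com/a/c/d.html'
));
```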

In conclusion

The Net_URL2 package helps you process URLs quickly, without resorting to complex regular expressions or manual string manipulation.

Additional information

RFC 3986
Uniform Resource Locator
URL normalization

2 Responses

Guy Patterson

November 10th, 2009 at 6:28 am

Is this something one would only use during the development phase? I’m having a hard time coming up with reasons or scenarios to use this library … ? Where or when might someone use this on a production site?

Thanks,

Guy
http://www.nullamatix.com

sameer

November 12th, 2009 at 2:26 am

There are many - generating seo friendly urls, manipulating urls during redirection, logging urls etc.
