<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>code-diesel &#187; urls</title>
	<atom:link href="http://www.codediesel.com/tag/urls/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.codediesel.com</link>
	<description>/* PHP &#38; MySQL Journal */</description>
	<lastBuildDate>Thu, 02 Feb 2012 13:19:04 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
<xhtml:meta xmlns:xhtml="http://www.w3.org/1999/xhtml" name="robots" content="noindex" />
		<item>
		<title>Easy manipulation of URLs</title>
		<link>http://www.codediesel.com/pear/easy-manipulation-of-urls-in-php/</link>
		<comments>http://www.codediesel.com/pear/easy-manipulation-of-urls-in-php/#comments</comments>
		<pubDate>Mon, 09 Nov 2009 12:32:21 +0000</pubDate>
		<dc:creator>sameer</dc:creator>
				<category><![CDATA[pear]]></category>
		<category><![CDATA[urls]]></category>

		<guid isPermaLink="false">http://www.codediesel.com/?p=1868</guid>
		<description><![CDATA[How to manipulate URLs using the Net_URL2 PEAR package]]></description>
			<content:encoded><![CDATA[<p>Whether you are creating URLs dynamically or changing existing ones, URL manipulation is a frequent requirement during development. It is easy for short URLs, but quickly becomes complex for URLs with many query parameters.<br />
In this post we will see how to use the <a target="_blank" href="http://pear.php.net/package/Net_URL2/">Net_URL2</a> PEAR package to manipulate URLs.<br />
<span id="more-1868"></span></p>
<h4>General URL syntax</h4>
<p>Before we start, a general URL syntax review will be useful. The most general form of a URL contains only two elements:</p>

<div class="wp_codebox"><table><tr id="p18681"><td class="code" id="p1868code1"><pre class="html" style="font-family:monospace;">&lt;scheme&gt;:&lt;scheme-specific-part&gt;</pre></td></tr></table></div>

<p>The term <em>scheme</em> refers to a type of access method such as ftp, http, telnet or file, which describes how the resource that follows is to be accessed. The rest of the URL, after the <em>scheme</em>, depends on the scheme type.</p>
<p>A complete generalized syntax for http and ftp is shown below.</p>

<div class="wp_codebox"><table><tr id="p18682"><td class="code" id="p1868code2"><pre class="html" style="font-family:monospace;">&lt;scheme&gt;://&lt;user&gt;:&lt;password&gt;@&lt;host&gt;:&lt;port&gt;/&lt;url-path&gt;;&lt;params&gt;?&lt;query&gt;#&lt;fragment&gt;</pre></td></tr></table></div>
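
<p>As a rough illustration of how these parts map onto a real URL, here is a minimal sketch that uses PHP's built-in parse_url() function rather than Net_URL2. The URL, user name and password are made up for the example. Note that parse_url() does not split out the <em>params</em> part; anything after a semicolon stays in the path.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="php" style="font-family:monospace;">&lt;?php
&nbsp;
/* Illustrative URL only; the credentials are invented for this sketch */
$parts = parse_url('http://user:secret@www.some-domain.com:80/search.php?q=beatles#top');
&nbsp;
/* Each key corresponds to one element of the general syntax */
echo "Scheme    :    " . $parts['scheme'] . "\n";
echo "User      :    " . $parts['user'] . "\n";
echo "Password  :    " . $parts['pass'] . "\n";
echo "Host      :    " . $parts['host'] . "\n";
echo "Port      :    " . $parts['port'] . "\n";
echo "Path      :    " . $parts['path'] . "\n";
echo "Query     :    " . $parts['query'] . "\n";
echo "Fragment  :    " . $parts['fragment'] . "\n";
&nbsp;
?&gt;</pre></td></tr></table></div>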

<h4>Installation</h4>
<p>Since Net_URL2 is a PEAR package, we will install it with the PEAR installer as shown below. I recommend always using the PEAR installer rather than downloading packages manually, as the installer automatically downloads any dependent packages.</p>

<div class="wp_codebox"><table><tr id="p18683"><td class="code" id="p1868code3"><pre class="text" style="font-family:monospace;">pear install Net_URL2-0.3.0</pre></td></tr></table></div>

<h4>Reading URL data</h4>
<p>Now that we have seen what a general URL looks like, it's time to move on to real examples. In this example we will use the following sample URL.</p>

<div class="wp_codebox"><table><tr id="p18684"><td class="code" id="p1868code4"><pre class="html" style="font-family:monospace;">http://www.some-domain.com:80/search.php?q=beatles&amp;id=56&amp;cat=music</pre></td></tr></table></div>

<p>Below is an example using the Net_URL2 library and its output for the above url:</p>

<div class="wp_codebox"><table><tr id="p18685"><td class="code" id="p1868code5"><pre class="php" style="font-family:monospace;"><span style="color: #000000; font-weight: bold;">&lt;?php</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">include</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'Net/URL2.php'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$url</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Net_URL2<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'http://www.some-domain.com:80/search.php?q=beatles&amp;id=56&amp;cat=music'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #0000ff;">&quot;Host      :    &quot;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">host</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #0000ff;">&quot;Protocol  :    &quot;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">scheme</span><span style="color: #339933;">.</span> <span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #0000ff;">&quot;Port      :    &quot;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">port</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #0000ff;">&quot;Path      :    &quot;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">path</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #0000ff;">&quot;Query Variables: <span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #990000;">print_r</span><span style="color: #009900;">&#40;</span><span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">QueryVariables</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000000; font-weight: bold;">?&gt;</span></pre></td></tr></table></div>

<p>This will output the following:</p>

<div class="wp_codebox"><table><tr id="p18686"><td class="code" id="p1868code6"><pre class="text" style="font-family:monospace;">Host      :    www.some-domain.com
Protocol  :    http
Port      :    80
Path      :    /search.php
Query Variables: 
Array
(
    [q] =&gt; beatles
    [id] =&gt; 56
    [cat] =&gt; music
)</pre></td></tr></table></div>
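
<p>The query variables array resembles what PHP's built-in parse_str() function produces from a raw query string. The following is only a small sketch, independent of Net_URL2, that uses the query string from the sample URL above; the variable names are chosen for illustration.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="php" style="font-family:monospace;">&lt;?php
&nbsp;
/* Query string taken from the sample url above */
$query = 'q=beatles&amp;id=56&amp;cat=music';
&nbsp;
/* parse_str() fills $vars with one entry per name=value pair */
parse_str($query, $vars);
&nbsp;
echo "q   :  " . $vars['q'] . "\n";
echo "id  :  " . $vars['id'] . "\n";
echo "cat :  " . $vars['cat'] . "\n";
&nbsp;
?&gt;</pre></td></tr></table></div>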

<h4>Changing URL data</h4>
<p>Changing the various URL parameters is as easy as reading them.</p>

<div class="wp_codebox"><table><tr id="p18687"><td class="code" id="p1868code7"><pre class="php" style="font-family:monospace;"><span style="color: #009933; font-style: italic;">/* $url is the Net_URL2 object created in the previous example */</span>
<span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">scheme</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;https&quot;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">path</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;/my_search&quot;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #000088;">$queryVars</span> <span style="color: #339933;">=</span> <span style="color: #990000;">array</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/* Get the query variables array */</span>
<span style="color: #000088;">$queryVars</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">QueryVariables</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/* Change some url parameters */</span>
<span style="color: #000088;">$queryVars</span> <span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'q'</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;Scarlett Johansson&quot;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$queryVars</span> <span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'cat'</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #0000ff;">&quot;movies&quot;</span><span style="color: #339933;">;</span>
<span style="color: #000088;">$queryVars</span> <span style="color: #009900;">&#91;</span><span style="color: #0000ff;">'pics'</span><span style="color: #009900;">&#93;</span> <span style="color: #339933;">=</span> <span style="color: #cc66cc;">1</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/* Save back the query variables array */</span>
<span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">QueryVariables</span> <span style="color: #339933;">=</span> <span style="color: #000088;">$queryVars</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/* Display the changed url */</span>
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">getURL</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>This will change the example URL to the following:</p>

<div class="wp_codebox"><table><tr id="p18688"><td class="code" id="p1868code8"><pre class="html" style="font-family:monospace;">https://www.some-domain.com:80/my_search?q=Scarlett%20Johansson&amp;id=56&amp;cat=movies&amp;pics=1</pre></td></tr></table></div>

<p>Note the changed parameter values and the new &#8216;pics&#8217; parameter added to the URL.</p>
<p>We can also change a single parameter value by passing a name-value pair.</p>
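<p>The %20 in the resulting URL is the percent-encoded form of the space in the &#8216;q&#8217; value. As a hedged sketch of that encoding step, here is how a similar query string can be built with PHP's built-in http_build_query() function. This is not how Net_URL2 is implemented internally, only an illustration, and the PHP_QUERY_RFC3986 flag assumes PHP 5.4 or later.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="php" style="font-family:monospace;">&lt;?php
&nbsp;
/* Same parameters as in the example above */
$queryVars = array(
    'q'    =&gt; 'Scarlett Johansson',
    'id'   =&gt; 56,
    'cat'  =&gt; 'movies',
    'pics' =&gt; 1,
);
&nbsp;
/* PHP_QUERY_RFC3986 encodes spaces as %20 instead of + */
echo http_build_query($queryVars, '', '&amp;', PHP_QUERY_RFC3986);
&nbsp;
?&gt;</pre></td></tr></table></div>
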

<div class="wp_codebox"><table><tr id="p18689"><td class="code" id="p1868code9"><pre class="php" style="font-family:monospace;"><span style="color: #009933; font-style: italic;">/* Change the 'cat' parameter value to 'books' */</span>
<span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">setQueryVariable</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'cat'</span><span style="color: #339933;">,</span> <span style="color: #0000ff;">&quot;books&quot;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>Or unset a parameter:</p>

<div class="wp_codebox"><table><tr id="p186810"><td class="code" id="p1868code10"><pre class="php" style="font-family:monospace;"><span style="color: #009933; font-style: italic;">/* This will remove the 'pics' parameter from the url */</span>
<span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">unsetQueryVariable</span><span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'pics'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>You can also easily read the fragment identifier of a URL. A fragment identifier points to a sub-location within a resource. Suppose you have a URL like the following:</p>

<div class="wp_codebox"><table><tr id="p186811"><td class="code" id="p1868code11"><pre class="html" style="font-family:monospace;">http://www.some-domain.com/index.php#book_id</pre></td></tr></table></div>

<p>The fragment identifier can be read as follows:</p>

<div class="wp_codebox"><table><tr id="p186812"><td class="code" id="p1868code12"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$url</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Net_URL2<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'http://www.some-domain.com/index.php#book_id'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/* Prints 'book_id', the fragment of the url */</span>
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">fragment</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<p>If you are accessing a URL with credentials, as below:</p>

<div class="wp_codebox"><table><tr id="p186813"><td class="code" id="p1868code13"><pre class="html" style="font-family:monospace;">ftp://username:password@some-domain.com</pre></td></tr></table></div>

<p>You can get the username and password by:</p>

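<div class="wp_codebox"><table><tr id="p186814"><td class="code" id="p1868code14"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$url</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Net_URL2<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'ftp://username:password@some-domain.com'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;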
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #0000ff;">&quot;Username  :    &quot;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">user</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span>
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #0000ff;">&quot;Password  :    &quot;</span> <span style="color: #339933;">.</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">password</span> <span style="color: #339933;">.</span> <span style="color: #0000ff;">&quot;<span style="color: #000099; font-weight: bold;">\n</span>&quot;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>

<h4>Normalizing URLs</h4>
<p>URL normalization (or URL canonicalization) is the process by which URLs are modified and standardized in a consistent manner. Normalization helps you determine if two syntactically different URLs are equivalent.</p>
<p>We normalize a URL as shown below.</p>

<div class="wp_codebox"><table><tr id="p186815"><td class="code" id="p1868code15"><pre class="php" style="font-family:monospace;"><span style="color: #000088;">$url</span> <span style="color: #339933;">=</span> <span style="color: #000000; font-weight: bold;">new</span> Net_URL2<span style="color: #009900;">&#40;</span><span style="color: #0000ff;">'http://www.example.com/../a/b/../c/./d.html'</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span>
&nbsp;
<span style="color: #009933; font-style: italic;">/* Returns 'http://www.example.com/a/c/d.html' */</span>
<span style="color: #000000; font-weight: bold;">echo</span> <span style="color: #000088;">$url</span><span style="color: #339933;">-&gt;</span><span style="color: #004000;">getNormalizedURL</span><span style="color: #009900;">&#40;</span><span style="color: #009900;">&#41;</span><span style="color: #339933;">;</span></pre></td></tr></table></div>
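
<p>To give an idea of what path normalization involves, below is a simplified sketch of the dot-segment removal described in RFC 3986. It is not the Net_URL2 implementation; removeDotSegments() is a hypothetical helper written for this post, it only handles absolute paths, and it ignores edge cases such as a trailing dot segment.</p>

<div class="wp_codebox"><table><tr><td class="code"><pre class="php" style="font-family:monospace;">&lt;?php
&nbsp;
/* Hypothetical helper: resolves "." and ".." segments in an absolute path */
function removeDotSegments($path)
{
    $output = array();
&nbsp;
    foreach (explode('/', $path) as $segment) {
        if ($segment === '.') {
            /* "." means the current level, so it is simply dropped */
            continue;
        }
        if ($segment === '..') {
            /* ".." removes the previous segment, but never climbs above the root */
            if (count($output) &gt; 1) {
                array_pop($output);
            }
            continue;
        }
        $output[] = $segment;
    }
&nbsp;
    return implode('/', $output);
}
&nbsp;
/* Prints '/a/c/d.html' */
echo removeDotSegments('/../a/b/../c/./d.html');
&nbsp;
?&gt;</pre></td></tr></table></div>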

<h4>In conclusion</h4>
<p>The Net_URL2 package helps you process URLs quickly, without resorting to complex regular expressions or string manipulation.</p>
<h4>Additional information</h4>
<p><a href="http://www.ietf.org/rfc/rfc3986.txt">RFC 3986</a><br />
<a href="http://en.wikipedia.org/wiki/Uniform_Resource_Locator">Uniform Resource Locator</a><br />
<a href="http://en.wikipedia.org/wiki/URL_normalization">URL normalization</a></p>
]]></content:encoded>
			<wfw:commentRss>http://www.codediesel.com/pear/easy-manipulation-of-urls-in-php/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
	</channel>
</rss>

