Sorting PHP arrays with substrings

A couple of days back I encountered the following request on a forum for sorting a single dimensional array. The programmer wanted to sort the following array by the substring after the colon. For example ‘CDF’ in the string ‘66345:CDF’. Those values that do not have any colon should be ignored and pushed to the end of the array. Also, the length of the strings are not constant.

$values = array("66345:CDF", "61179:HGT", "64146:ABA",
                "68768:BNG", "68015:ZCZ", "80231:LPO",
                "64146:QWP", "68736:HHB", "86801:MNV",
                "80178:OIU", "80178:ASE", "88178:BRT",
                "801782OIU", "801378ASE", "881578BRT");

A quick solution was the following. But I’m sure that this is not the fastest, as a matter of fact it is quite slow for larger arrays.

 
<?php
 
$values = array("66345:CDF", "61179:HGT", "64146:ABA",
                "68768:BNG", "68015:ZCZ", "80231:LPO",
                "64146:QWP", "68736:HHB", "86801:MNV",
                "80178:OIU", "80178:ASE", "88178:BRT",
                "801782OIU", "801378ASE", "881578BRT");
 
 
function my_cmp($a, $b)
{
    $pieces_a = explode(":", $a);
    $pieces_b = explode(":", $b);
 
    if(!isset($pieces_a[1]) && isset($pieces_b[1])) {
        return 1;
    }
    elseif(!isset($pieces_b[1]) && isset($pieces_a[1])) {
        return -1;
    }
    elseif(!isset($pieces_a[1]) && !isset($pieces_b[1])) {
        return 0;
    }
 
    return strcasecmp($pieces_a[1], $pieces_b[1]);
}
 
usort($values, "my_cmp");
 
?>

So what will be the fastest algorithm for the same?

4 thoughts on “Sorting PHP arrays with substrings

  1. if you have plenty of RAM you could give this a try:

    <?php

    $values = array("66345:CDF", "61179:HGT", "64146:ABA",
    "68768:BNG", "68015:ZCZ", "80231:LPO",
    "64146:QWP", "68736:HHB", "86801:MNV",
    "80178:OIU", "80178:ASE", "88178:BRT",
    "801782OIU", "801378ASE", "881578BRT");

    //get more values for longer experience
    for ($z1=0;$z1sizeof($values)
    ,"time to beat"=>$t1-$t0
    ,"time took"=>$t2-$t1
    ) );

    ?>

    22.43:je@8@booster:~
    $ /usr/local/php5/bin/php /var/www/test.php
    array(3) {
    ["number of items to sort:"]=>
    int(15360)
    ["time to beat"]=>
    float(6.2945919036865)
    ["time took"]=>
    float(2.0644869804382)
    }
    22.47:je@8@booster:~


  2. $f1 = create_function('$v', 'list($a,$b)=explode(":",$v);return $b.($b?":":"_").$a;');
    $f2 = create_function('$v', 'list($a,$b)=explode(":",$v);return $b.($b?":".$a:ltrim($a,"_"));');

    $valeurs = array_map($f,$valeurs);

    sort($valeurs);

    $valeurs = array_map($f,$valeurs);

    array of 3000 items :
    yours = 1.96446013451
    this = 0.0130748748779

    ~30000 items :
    26.4119880199
    0.182013988495

    seems coherent results.

    I don’t even know why I have imagined such a solution! or why it looks like more efficient!

  3. oops (^Z)

    corrected to:

    $max = array_reduce($valeurs,create_function('$a,$b', 'return (strlen($a)<strlen($b)?$b:$a);'));
    $max=strlen($max);
    $f1 = create_function('$v', 'global $max; list($a,$b)=explode(":",$v);return $b.($b?":":"_").str_pad($a,$max,"_",STR_PAD_LEFT);');
    $b.($b?":":"_").str_pad($a,10,"_",STR_PAD_LEFT);');
    $f2 = create_function('$v', 'list($a,$b)=explode(":",$v);return ltrim($b,"_").($b?":".$a:ltrim($a,"_"));');

    $valeurs = array_map($f1,$valeurs);

    sort($valeurs);

    $valeurs = array_map($f2,$valeurs);

    -> 4.84191417694 (still better, probably not optimal)

Comments are closed.