Exchanging columns in a CSV file


In one of my recent project requiring a CSV text file import, the data columns had to be in certain predefined positions for the program to work correctly. The import did not depend on field labels, which is obviously a wrong way to process data. It would not have been much of a concern had the CSV files been in the correct format. However, the CSV files I had had the columns in wrong positions, which required me to exchange the columns to their correct positions for the parent program to import them correctly. For a few files I’d have easily used a spreadsheet for the task, but with around 34 files it was going to be tedious. The task entailed me to write a small script which helped in the matter.

The PHP script to exchange CSV columns is shown below.

$fp_out = fopen("data_new.csv", "w+");
 
if (($fp_in = fopen("data.csv", "r")) !== FALSE)
{
    /* Exchange the 2nd with the 3rd field and the 1st with the 4th
     * Column numbers start with 1 rather than 0 as in array indexes
     */
    $exchange = array("2-3", "1-4");
 
    while (($data = fgetcsv($fp_in, 1000, ",")) !== FALSE)
    {
        foreach($exchange as $e)
        {
             $fields = explode("-", $e);
             $temp = $data[$fields[0] - 1];
             $data[$fields[0] - 1] = $data[$fields[1] - 1];
             $data[$fields[1] - 1] = $temp;
        }    
 
        fputcsv($fp_out,$data,',', '"');
    }
 
    fclose($fp_in);
    fclose($fp_out);
}

So if I’ve a CSV file with the following content:

customerno, firstname, lastname, sales
23242, john, doe, 2345.00
23253, jane, doe, 1234.00
23221, greg, johnson, 2345.00
23210, howard, gardner, 2345.00

Running the above script on the file will result in a new file with the following content.

sales,lastname,firstname,customerno
2345.00,doe,john,23242
1234.00,doe,jane,23253
2345.00,johnson,greg,23221
2345.00,gardner,howard,23210

The columns to exchange are represented by the $exchange array.

/* Exchange the 2nd with the 3rd field and the 1st with the 4th
 * Column numbers start with 1 rather than 0 as in array indexes
 */
$exchange = array("2-3", "1-4");

However, if we set the array to the following, the 1st and 4th columns will be reset to the original, so take care when you create the array.

$exchange = array("2-3", "1-4", "4-1");

The script above will not however exchange all types of CSV files; files where fields are enclosed by quotes will not work correctly with the above script. You will need to make some small modifications top the same.

Just for comparision, prior to creating the PHP script, I played around with Awk, which enabled me to accomplish the same with only two lines of code. However, I’d still prefer working with PHP, as I can easily integrate it with my other code.

{   
    FS = ","
    print $4, "," $3, "," $2, "," $1 
}





Enhanced by Zemanta

This site is a digital habitat of Sameer Borate, a freelance web developer working in PHP, MySQL and WordPress. I also provide web scraping services, website design and development and integration of various Open Source API's. Contact me at metapix[at]gmail.com for any new project requirements and price quotes.

4 Responses

1

Shantanu Oak

October 15th, 2012 at 2:08 am

Thanks for sharing this. But I guess awk would an appropriate tool for this.

awk -F “,” ‘{print $4″,”, $3″,”, $2″,”, $1}’ customer.csv

I prefer to use the best suited tool for a given task, maybe this approach is different than those who think that coding in a single language makes it uniform to maintain.

2

Shantanu Oak

October 15th, 2012 at 3:30 am

I did not see that you have already mentioned awk in your post. But my argument still stands valid. Let us see how python would do it.

import csv
with open(‘customer2.csv’, ‘wb’) as output:
input = csv.reader(open(‘customer.csv’, ‘rb’))
output = csv.writer(output, dialect=input.dialect)
for line in input:
line.reverse()
output.writerow(line)

I guess you will agree that this 7 lines of code is more elegant, readable and flexible.

sameer

October 15th, 2012 at 3:42 am

The idea was not to completely reverse columns, but to exchange any two arbitrary columns.

4

Shantanu Oak

October 15th, 2012 at 5:29 am

You can save the results to a tuple and the rearrange them as you wish.
Here is how it can be done using exception handling.

import csv
with open(‘customer2.csv’, ‘wb’) as output:
input = csv.reader(open(‘customer.csv’, ‘rb’))
output = csv.writer(output, dialect=input.dialect)
for line in input:
try:
(customerno, firstname, lastname, sales) = line
except ValueError:
print ‘valueerror’
else:
outLine = (sales, firstname, lastname, customerno)
output.writerow(outLine)

Your thoughts

Sign up for fresh content in your email