Splitting large MySQL dump files


One of the frustrating things about working with MySQL is importing large SQL dump files. Either you get a ‘maximum execution time exceeded’ error from PHP or a ‘max_allowed_packet’ error from MySQL. In a recent task I needed to import a table of around a million records on a remote host, which quickly became an exercise in frustration due to various limitations on the server. SSH was of no help either, as changing the configuration files was restricted to the root user.

My last resort was to split the huge ‘INSERT’ statements into smaller files. Doing this manually is obviously time consuming and error prone; the only other solution is to write a small script to split the insert statements. This was meant to be a quick hack, so the parsing code had to be of minimal complexity. Splitting an SQL dump containing extended insert statements is somewhat complex, so you need to have the dump file in a simple format – each insert statement on its own line, as shown below.

INSERT INTO `dt_codes` VALUES ...
INSERT INTO `dt_codes` VALUES ...
INSERT INTO `dt_codes` VALUES ...
.
.

You can create the initial dump file in this simple format with the following command, or you can do it using phpMyAdmin.

mysqldump -uUSER -pPASS --databases DATABASE_NAME --tables TABLE_NAME \
--extended-insert=FALSE > dump.sql

Once you have the big dump file, you can use the following PHP script to split it into smaller chunks. Change the variable ‘$max_lines_per_split’ to whatever value you think will work on your system.

<?php
 
set_time_limit(600);
ini_set("auto_detect_line_endings", true);
 
/* Number of 'insert' statements per file */
$max_lines_per_split = 50000;
 
$dump_file = "dump.sql";
$split_file = "dump-split-%d.sql";
$dump_directory = "./sql-dump/";
 
$line_count = 0;
$file_count = 1;
$total_lines = 0;
 
/* Create the output directory if it does not already exist */
if (!is_dir($dump_directory)) {
    mkdir($dump_directory, 0755, true);
}
 
$handle = fopen($dump_file, "r");
$buffer = "";
 
if ($handle) {
    while(($line = fgets($handle)) !== false) {
        /* Only read 'insert' statements */
        if(!preg_match("/insert/i", $line)) continue;
        $buffer .= $line;
        $line_count++;
 
        /* Copy buffer to the split file */
        if($line_count >= $max_lines_per_split) {
            $file_name = $dump_directory . sprintf($split_file, $file_count);
            $out_write = @fopen($file_name, "w+");
            fputs($out_write, $buffer);
            fclose($out_write);
            $buffer = '';
            $line_count = 0;
            $file_count++;
        }
    }
 
    if($buffer) {
        /* Write out the remaining buffer */
        $file_name = $dump_directory . sprintf($split_file, $file_count);
        $out_write = @fopen($file_name, "w+");
        fputs($out_write, $buffer);
        fclose($out_write);
    }
 
    fclose($handle);
    echo "done.";
} else {
    echo "Unable to open {$dump_file}";
}
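
Save the script under any name you like – split-dump.php is used here just as an example – place it next to dump.sql, and run it from the command line:

php split-dump.php

The split files will be written to the ./sql-dump/ directory defined in ‘$dump_directory’.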

Once the dump file has been split into smaller ones, you can gzip them to reduce the size further.

gzip dump-split-*
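
The split files still have to be executed on the server, for example one at a time through phpMyAdmin. If that is not convenient, a small PHP script along the same lines as the splitter can replay them one statement at a time. The following is only a minimal sketch, assuming the chunks have been decompressed again on the server; the host, user, password and database name are placeholders you will need to change. Since each line now holds a single small INSERT, each query should stay well below the ‘max_allowed_packet’ limit.

<?php
 
set_time_limit(600);
 
/* Placeholder connection details -- change these to match your server */
$mysqli = new mysqli("localhost", "USER", "PASS", "DATABASE_NAME");
if ($mysqli->connect_error) {
    die("Connection failed: " . $mysqli->connect_error);
}
 
/* Process the split files in natural order (1, 2, ... 10, 11) */
$files = glob("./sql-dump/dump-split-*.sql");
natsort($files);
 
foreach ($files as $file_name) {
    $handle = fopen($file_name, "r");
    if (!$handle) {
        die("Unable to open {$file_name}");
    }
 
    while (($line = fgets($handle)) !== false) {
        $line = trim($line);
        if ($line === "") continue;
 
        /* Each line holds a single INSERT statement */
        if (!$mysqli->query($line)) {
            die("Error in {$file_name}: " . $mysqli->error);
        }
    }
 
    fclose($handle);
    echo "Imported {$file_name}\n";
}
 
$mysqli->close();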


5 Responses

1. Ian Jenkins – October 3rd, 2011 at 8:16 am

Had this exact same problem recently, ended up using the following…

http://www.softpedia.com/get/Internet/Servers/Database-Utils/SQL-Splitter.shtml

2. Yasen Zhelev – October 3rd, 2011 at 9:29 am

And why did you not use the mysql command line to import the file into the database? mysql -h… -u…. -p…. database_name < filename.sql You said that you have SSH access, right?

3. sameer – October 3rd, 2011 at 8:59 pm

As I said in the post, the SQL dump being large, importing it from SSH raised a ‘Got a packet bigger than max_allowed_packet bytes’ error, and I did not have permission to change the setting due to admin restrictions.

4. Anonymous Coward – October 5th, 2011 at 10:44 pm

This is also useful for backing up SQL dumps to Google Docs, which has a character and line limit on uploads for non-enterprise users. Good work.

5. Willem – October 19th, 2011 at 7:17 am

Here’s a free and open-source command-line utility that will split your sql-dumps:
http://rodo.nl/english/opensource/mysql-splitter/

Even extended inserts will be split and the split files will be kept syntactically correct. Especially handy when you can’t easily re-do the sqldump without the extended-inserts option!
