...or, How I Suffered Through an 'If it Can Go Wrong, it Will Go Wrong' Scenario for a One-Off Export of Database Tables
This is one of those posts I'm writing up because, in the moment, I was so incredibly frustrated by how unnecessarily complex this task was, and because I still have open the various browser tabs that ultimately provided the necessary bits. I write this mostly for Future Me, should I need to do something like this again.
The Problem
I needed to do a pretty complete export of a complex MySQL database (there are many core tables, and many more relational tables) to CSV as a snapshot for someone to do deeper data analysis. I was hoping to avoid the mass export and instead provide more direct read-based access, but CSV is what they needed, so that's what I was going to provide. One of the tables in question contains 20 years of data and over a quarter million individual records, and there are about 30 others to complement it.
My thought on the matter was to do a pretty routine export/backup of the database from production, import it on my local dev MySQL instance, and then Do The Work from that copy. And that worked out all fine and well (the backup/restore on dev) until...
My "Normal" Direct Access Tool Doesn't Directly Export to CSV
Ugh. I love my HeidiSQL application as it does many things I need, but massive table dump to CSV is not one of them. Great for smaller datasets; terrible for the entire database.
I consulted the Internet to see what my options were. It boils down to this:
- Go command line with the mysql client;
- Use mysqldump to write to file by table;
- Use MySQL Workbench;
- Use another tool such as phpMyAdmin; or
- Use the in-built CSV Engine.
The Non-Contenders
Under no circumstances was I going to use phpMyAdmin for this. I have never cared for phpMyAdmin, and going full install (even on my development server box) was just way more effort than necessary. In hindsight, installing it could have actually saved me time, but I don't regret not installing it and going this route.
The in-built CSV Engine was also out of the running because, while it seems simple and awesome (just an ALTER TABLE call away), it can't handle tables with indexes or AUTO_INCREMENT set, which immediately ruled it out for my need.
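For reference, the conversion itself really is just a one-liner (the table name here is a placeholder), which is what made it tempting in the first place:

-- switch the table's storage engine to CSV so its data lives as a plain .csv file on disk
-- (errors out on tables with indexes or AUTO_INCREMENT, i.e. basically every table I care about)
ALTER TABLE some_table ENGINE=CSV;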
That Leaves...
In my order of preference, I was going to use:
- mysqldump because it's automatically there;
- MySQL Workbench because it has a GUI; and
- Command line with mysql.
mysqldump Attempt
This looked promising, because I could just run it per table and Magic would happen, right? Right?
mysqldump -u <user> -p -t -T /path/to/output/ <db_name> <table_name> --fields-terminated-by=,
Except...
mysqldump: Error: 'Access denied; you need (at least one of) the PROCESS privilege(s) for this operation' when trying to dump tablespaces
In hindsight, I should have just figured out what PROCESS rights to grant to the functional account in question, but since I had other options I figured it wasn't worth the fiddling and moved on to the next one.
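For Future Me: the fix would presumably be a grant along these lines (PROCESS is a global privilege, so it has to be granted on *.*; the account name is a placeholder), or, if memory serves, running mysqldump with --no-tablespaces to skip the tablespace step that wants the privilege in the first place:

-- PROCESS can't be scoped to a single database; it's server-wide
GRANT PROCESS ON *.* TO 'user'@'localhost';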
MySQL Workbench Attempt
I installed MySQL Workbench, connected to the dev database instance, and followed the steps for the export wizard. So far, so good, until...
A damned character encoding problem.
Ugh. I knew the weird set of table collations I mostly inherited would come back to bite me someday, and apparently today was the day. It wasn't worth trying to fix that in the moment, though it does prompt me to look at atoning for the sins of others at some point in the future.
Let's Go Command Line! Hail Mary!
This left me with the direct query method via the mysql client. In theory, I could just run this query to drop the data to CSV on the local dev box:
SELECT * INTO OUTFILE '/path/to/table_name.csv'
FIELDS TERMINATED BY ',' ENCLOSED BY '"' LINES TERMINATED BY '\n'
FROM table_name;
But then I got a permission error about the mysql user not having FILE permissions. Okay, that was fair. Time to GRANT FILE (once logged in with a super user account to fiddle with permissions):
GRANT FILE ON *.* TO 'user'@'localhost';
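FILE, like PROCESS, is a global privilege, hence the *.* scope. A quick SHOW GRANTS against the same placeholder account is an easy sanity check that it actually landed:

SHOW GRANTS FOR 'user'@'localhost';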
I logged in to mysql as the user in question with the freshly provisioned access and tried the query again. The command ran this time, but now I got a file write error that appeared to be filesystem permission-based. Because of course I did. And of course the error code and message were less than useful.
I changed permissions on the output path, no go. A chown to the running user, no go. Hell, I even gave it a hot minute with 777 permissions, also no go. It had to be something else... Reverting back to normal-ish 664 path permissions, I did some Googling.
Culprit: apparmor on Ubuntu
After a significant amount of Googling various terms and such, I stumbled across a Stack Overflow thread that closely matched my experience. I'm on current versions of MySQL and Ubuntu, but it was worth a shot.
The TL;DR on My Fix
I edited the apparmor config for MySQL at /etc/apparmor.d/usr.sbin.mysqld and, in the log file section, added two lines for my output path (they're the commented-out bits in this snip; they were uncommented while I actually needed them):
# Allow log file access
/var/log/mysql.err rw,
/var/log/mysql.log rw,
/var/log/mysql/ r,
/var/log/mysql/** rw,
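# added for the CSV export, then re-commented once the exports were done: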
# /path/to/ r,
# /path/to/* rw,
A quick run of sudo /etc/init.d/apparmor reload made the changes take effect, and running the query once again output the data to CSV successfully!
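Since there were roughly 30 tables to get through, it's worth jotting down (for Future Me, again) that MySQL can generate the per-table export statements itself; this is a sketch where db_name and the output path are placeholders:

-- build one INTO OUTFILE statement per base table in the schema;
-- paste the resulting rows back into the mysql client to run them
SELECT CONCAT(
    'SELECT * INTO OUTFILE ''/path/to/', table_name, '.csv'' ',
    'FIELDS TERMINATED BY '','' ENCLOSED BY ''"'' LINES TERMINATED BY ''\\n'' ',
    'FROM ', table_name, ';'
  ) AS export_stmt
FROM information_schema.tables
WHERE table_schema = 'db_name'
  AND table_type = 'BASE TABLE';

Each generated statement drops its table to its own CSV file, same as the manual query above.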
Once I was done with the table exports, I removed the apparmor changes (by commenting them out, as shown above) and reloaded apparmor just to tighten things back up again.
What a Mess
I don't anticipate needing to do this process again (or certainly not with any frequency), but I definitely learned a few things (and swore a bit) along this little journey. It doesn't seem like this sort of bulk action should be as difficult as it can be, but I also appreciate the built-in security out of the box to prevent stuff like this.