A cautionary tale….
Earlier this week I was busy rebuilding an old machine to use as a dedicated writing machine.
I’m going to blog all about that at TheFridayBlog (my writing blog) tomorrow!
One thing that I won’t mention there is the minor setback that occurred due to my stupidity and lack of thought.
The problem was rsync!
I decided (quite rightly) that it would be rather sensible to back up the writing machine not only to gmail (my normal process) and to an external USB drive (I just happened to have a spare one available) but also to my server.
Rather than write a backup script from scratch I decided to use an old one and just change the locations.
The edited script (this is just an extract – the bit that did the damage!) that I used was as follows….
# And then back Elgar up to Mozart….
rsync -av --progress --delete --log-file=/home/keckstein/Backup/$(date +%d%m%Y)_Elgar_backup.log /home/keckstein/ /media/Mozart
And what’s wrong with that, I hear you ask?
Yes, it looked OK to me at first glance.
There are (as I later found to my cost) two problems…
1). I should have been rsyncing the data to its own directory on Mozart (rather than the root of the share) and
2). I really should have thought harder about that --delete parameter!
Normally I like to use the --delete parameter for disk-to-disk backup copy jobs.
What happens is this: if you delete a file on the source drive (the drive that you are backing up), rsync will also delete that file from the destination drive (the drive you are backing up to).
All well and good. The destination drive is a mirror copy of the source drive. That’s what we want, isn’t it?
What happened the first time I ran the backup script (it contained a lot more than just that one line, but that one line was more than enough to cause me a whole pile of grief!) was that the files were copied from Chatwin to Mozart – no problem there.
Then rsync deleted every file on Mozart that didn’t exist on Chatwin!
Oops!
The main data share on Mozart contains all of our photos, our ebook library, our day to day financial and business data and other important stuff like that!
Luckily, we automatically back the main drive on Mozart up to an archive drive in the same box; that’s where I leave customer backups when I rebuild their machines – when I start running out of space they get deleted in date order.
A quick check showed me that the previous night’s backup had worked (one of the reasons I like rsync to write a logfile every time I invoke it) and a quick copy job restored the data.
So, not such a disaster but it could have been so, so much worse.
I suppose that the lessons learned here are….
1). Treat rsync with a bit of respect. It WILL do exactly what you tell it to!
2). Always do a dummy run first!
3). Always think twice about that --delete parameter!
4). Always think twice about reusing old scripts. They might have done the job that you wanted them to back then, but will they do the job that you want them to do now?
There’s a great resource on rsync here and here.
Of course, the wiser amongst you might have wondered what would have happened if the overnight script to back up Mozart also used the --delete parameter and if I hadn’t noticed the problem with the first script until the next day?
I believe that Oh shit! would have been the kindest of all the many things that I would have had to say!
All the best
9 Responses to “A cautionary tale….”
Oh, boi!
you really gotta spend a little more time to learn how rsync works… for it is not rsync who’s at fault here… it is how you chose to write that command!
I always test rsync with --dry-run before running it for real.
Thank you for bravely sharing, so others may benefit.
Note: I love the powerful and fast command line; however, typos and syntax errors demonstrate the advantages of a well-written GUI with user-friendly choices. The CLI is not the terror that novice users imagine, but it does have its limits, as do GUI apps.
I think we need to keep making safe backups easy, with good options, as the standard GUI backup programs mature.
Yes, simply always run the rsync command with the “-n” flag first, unless you’re using a script that you haven’t modified since it was tested…
If you want to do backups with rsync look into the --link-dest parameter. Without that you are simply making a mirror. With --link-dest you can have many old backups that you can revert to if needed.
More information: http://www.sanitarium.net/golug/rsync_backups_2010.html
After having a little too much wine while working on the computer I did a right-click delete on a directory that had over 100 GB of data by accident. There is no trash bin in Linux so I kissed that data goodbye. It was the only copy on my backup drive. I was planning to move it over to another location but did not get the chance.
That and similar problems caused me to include --backup and --backup-dir in almost all backup scripts. I’d rather clean up the old backup directories from time to time than risk rsync mercilessly deleting whatever I unthinkingly told it to delete.
Thanks for the reminder!
Maybe you should check dirvish out:
apt-get install dirvish
It’s a kind of deduplicating backup. You can have a month’s worth of backups in 110% of the space a regular backup occupies. (It just depends on how much the data changes.)
It’s well worth the 15 minutes required to learn how to set it up.
Use rdiff-backup for backups. Rsync is for mirroring, not for backups. rdiff-backup will create a backup and keep reverse deltas, so every previous version is stored as well.