Blog entries tagged with "backups"

SyncToy is dangerous for backups

Wednesday, February 3rd, 2010 at 8:42 pm

For a long time I have been using Microsoft’s SyncToy to back up data on my Windows boxes over the network to my Linux box. Every few weeks (in reality it was months) I would also use it to copy that same data to an external drive for the off-site backup.

Not any more.

When I first started using SyncToy I was satisfied that it was copying all files. Recently I discovered one of two things: back in the beginning I didn’t check properly, or the behaviour of SyncToy has changed since then.

So what is the problem?

The SyncToy setting I have been using (at least on the recent versions) is ‘Echo’ which is described as:

“New and updated files are copied left to right. Renames and deletes on the left are repeated on the right.”

At face value this is what I wanted: a mirror of the local files on a network share. Unfortunately I didn’t take this description literally enough. SyncToy will ONLY echo changes that are made on the ‘left’ side. What I need (for example when rotating through external hard drives) is a proper sync that analyses both source and destination to determine the differences that need to be copied (you know, like rsync).

So if for some reason files on the destination (‘right’ in SyncToy terminology) go missing or get corrupted, SyncToy doesn’t care. In the case where I am using a pair of identical external drives that I swap between home and work every couple of weeks, data that is copied to one drive is then not copied to the other drive a few weeks later.
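As a rough illustration of what a two-sided comparison looks like, here is a minimal Python sketch that walks both trees and works out what is missing or different on the destination. The function names are hypothetical, and comparing by file size alone is a simplifying assumption; rsync and similar tools use more careful checks.

```python
import os

def list_files(root):
    """Return {relative_path: size} for every file under root."""
    files = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            files[os.path.relpath(full, root)] = os.path.getsize(full)
    return files

def plan_sync(source, destination):
    """Compare both trees and return (to_copy, to_delete) as sorted lists."""
    src = list_files(source)
    dst = list_files(destination)
    # Copy anything that is missing on the destination or differs in size.
    to_copy = sorted(path for path, size in src.items() if dst.get(path) != size)
    # Anything that exists only on the destination is a candidate for removal.
    to_delete = sorted(path for path in dst if path not in src)
    return to_copy, to_delete
```

Because the destination is inspected on every run, a file that has gone missing from a swapped-in drive shows up in `to_copy` again, which is exactly the case an echo of left-side changes never notices.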

What really confuses me is a step that the latest version of SyncToy no longer performs, which is how I noticed this (and then found that many others already knew). It used to be that when the sync ran (immediately after login) I could see it walking the destination file tree, both via network activity and in the samba logs. Why? If SyncToy doesn’t care about the destination, what is the point of this scan? Obviously they figured out that it was redundant and it was removed.

So what have I done?

Ideally I wanted a reliable win32 port of rsync that didn’t require me to install Cygwin. Without that, I started looking into alternatives and settled on Robocopy. Yes, another tool from Microsoft. For XP it is obtained from the Windows Server 2003 Resource Kit, but it is standard for Vista and 7.

Robocopy is a command line tool (there is a GUI available), which is fine with me as I want to script it. I have done that and now have two scripts: one that runs at login and backs up local data across the network, and a second that backs up the same data to an external hard drive. This second script also pulls other data (such as my email) from the Linux box to the external hard drive.
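The scripts themselves are not shown here, so the following is only a sketch of how such a wrapper could look in Python. The paths are hypothetical and the option set (/MIR to mirror the tree, /R and /W to limit retries, /NP to suppress progress output) is an assumption, not the actual script.

```python
import subprocess

def robocopy_command(source, destination, extra_options=None, log_file=None):
    """Build the argument list for mirroring source to destination."""
    # /MIR mirrors the tree (including deletions); /R:2 /W:5 retry twice,
    # five seconds apart; /NP hides the per-file progress percentage.
    command = ["robocopy", source, destination, "/MIR", "/R:2", "/W:5", "/NP"]
    if extra_options:
        command.extend(extra_options)
    if log_file:
        command.append("/LOG:" + log_file)
    return command

def run_backup(source, destination, extra_options=None, log_file=None):
    """Run robocopy and report success.

    Robocopy's exit code is a bit mask; values of 8 or more mean that
    at least one file or directory could not be copied.
    """
    command = robocopy_command(source, destination, extra_options, log_file)
    result = subprocess.run(command)
    return result.returncode < 8
```

A login script could then call something like `run_backup(r"C:\Users\me\Documents", r"\\linuxbox\backup\Documents")`, with both paths being placeholders.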

One important option that I need to specify is /FFT, which tells it to ‘assume FAT File Times’, as FAT only records file times to a two-second precision. I’m copying from NTFS to ext3, so neither FAT nor FAT32 is involved, but in between those two file systems is Samba, whose SMB implementation has similar time accuracy problems to FAT.
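The idea behind a timestamp tolerance can be sketched in a few lines of Python. This is an illustration of the concept rather than a description of Robocopy’s internals; the function names are hypothetical, and the two-second default reflects the granularity of FAT time stamps.

```python
import os

def same_file_time(mtime_a, mtime_b, tolerance=2.0):
    """Treat two modification times (in seconds) as equal within tolerance."""
    return abs(mtime_a - mtime_b) <= tolerance

def needs_copy(source_path, destination_path, tolerance=2.0):
    """Decide whether a file should be copied, based on timestamps alone."""
    if not os.path.exists(destination_path):
        return True
    return not same_file_time(
        os.path.getmtime(source_path),
        os.path.getmtime(destination_path),
        tolerance,
    )
```

Without the tolerance, a file whose time stamp was rounded on its way to the destination would look modified on every run and be copied again each time.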

It has now been a week and the backups are working correctly. Hopefully it stays that way.


Robust backups

Tuesday, December 15th, 2009 at 9:06 pm

In light of the failure experienced by two prominent technical bloggers I am glad that over the past few weeks I have been gradually improving my situation in regard to backups by finally crossing a few items off my Todo list.

So what have I done?

Firstly I now have a daily backup of my hosted sites (this one, plus those of some friends). Although I assume that Dreamhost have some form of backup and redundancy, Phil and Jeff learned the hard way that you can’t necessarily trust your host. So in addition to a daily rsync that pulls down all of the hosted files (mostly WordPress files, but also any new images) I now have a server-side cron job that dumps each MySQL database to a date-based file. These MySQL dumps are included in the rsync.
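A date-based dump of this kind could be sketched as follows in Python. The database name and backup directory are hypothetical, and the sketch assumes that the MySQL credentials are supplied by an option file such as ~/.my.cnf so that no password appears on the command line.

```python
import datetime
import subprocess

def dump_path(database, backup_dir, day=None):
    """Return a date-based file name such as backup_dir/blog-2009-12-15.sql."""
    day = day or datetime.date.today()
    return f"{backup_dir}/{database}-{day.isoformat()}.sql"

def dump_database(database, backup_dir):
    """Dump one database to a dated file using mysqldump."""
    target = dump_path(database, backup_dir)
    subprocess.run(
        ["mysqldump", "--single-transaction", f"--result-file={target}", database],
        check=True,  # raise if mysqldump reports an error
    )
    return target
```

Run once a day from cron, this leaves one file per database per day in a directory that the nightly rsync can pick up along with everything else.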

It was only last night that I tested these files. From the starting point of a generic Apache with an empty htdocs and an empty MySQL database, I was able to copy in the files and import the database. It all worked.

However this is only bringing the backup of my site onto a system that I control. What about the backups of that system?

This is where a pair of external USB drives comes in. The plan is to alternate these between home and work every couple of weeks (at the most). What I have been working on is an automated method to get the data onto one of these drives when it is connected to my Windows desktop.

Why the windows desktop?

Because the bulk of what I am backing up is 110GB of photos. While these are incrementally synced to my Linux box, it is faster to sync them straight from the source. But this is causing some issues with backing up the Linux data.

My mail is stored in Maildir format, but when that is copied over, Windows doesn’t like the file names so they get garbled. Technically I should still have the message content, but I wasn’t sure. So instead I am going to create some archives (tar.gz or possibly rar so I don’t end up with gigabyte sized files) that are then copied over the network.
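Wrapping the mail in an archive sidesteps the file name problem, most likely because Maildir message names contain a colon, which Windows does not allow in file names. A minimal sketch of the tar.gz option using Python’s standard tarfile module might look like this; the function name and paths are hypothetical.

```python
import os
import tarfile

def archive_directory(source_dir, archive_path):
    """Pack source_dir into a gzip-compressed tar archive."""
    # Store paths relative to the directory name (e.g. "Maildir/cur/...")
    # rather than as absolute paths.
    top_level = os.path.basename(os.path.normpath(source_dir))
    with tarfile.open(archive_path, "w:gz") as archive:
        archive.add(source_dir, arcname=top_level)
    return archive_path
```

The original message names are preserved inside the archive, so only the single archive file has to survive the trip through Samba and onto the Windows side.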

As this is still a work in progress I expect that the details will change.


I don’t trust the cloud

Saturday, August 15th, 2009 at 3:29 pm

Since StixCampNewstead I have been meaning to write a post about trusting the cloud. I did start it, but it turned into quite a long and detailed post that I never got around to completing.

It seems that every couple of weeks something happens to compromise user data. A few that I noted were Ma.gnolia losing their database, Bloglines being neglected after being sold, Google dropping services, Kodak changing their terms of service, and one of the many examples of Facebook privacy issues. The one prompting this post is the recent (now reversed) decision to shut down (a URL shortening service).

I don’t use URL shortening services very often, partly because I haven’t needed to and partly because I also don’t agree with them, but this type of action has made me decide to set up my own. I’ll probably use one of the WordPress plugins, but Lifehacker has an article with other options.

I have all sorts of data that ranges from private data I need to keep (emails, documents, financial records) to public data that I don’t care about (dents and tweets). In between is data that I care about, both private (family photos) and public (photos for competitions or that I have up on Flickr).

I have two rules:

  • If the data is private I try to store it at home (with appropriate backups) instead of on a remote service.
  • If I care about the data I make sure that it is stored at home, or if stored in the cloud I have a backup.

The first rule is why I still run my own IMAP server instead of shifting it out of the country to Google or similar. The second rule is why I still have all the originals for my photos that are on Flickr and why I have nightly cron jobs to back up this site, my delicious bookmarks, etc.

My data aside, it is interesting to see what others are doing, not just for their own data but for other people’s. One great example of this is the Archiveteam, which keeps track of services that are going down and also steps in to try to preserve their data, as is happening with Geocities. Archiveteam is run by Jason Scott, creator of BBS: The Documentary. His blog post FUCK THE CLOUD prompted quite a reaction and now, six months later, it is still getting comments.

It isn’t just your own data that you should care about, but also any data that you rely on.


Backups are worth it

Tuesday, December 4th, 2007 at 7:18 pm

Just before I went to OSDC I moved the contents of my Inbox to a new folder so during the conference I only had to worry about anything new that had come in. This was my first attempt at applying the concept I have seen referred to as ‘process to empty’.

This worked well and I ended up using it as a place to store anything that came in during the conference that I wanted to deal with when I returned home. Which I did.

However, once I had dealt with all of the recent items I accidentally deleted the folder. This meant that a couple of emails that had been hanging around in my Inbox for a long time were gone. And I still needed them.

Six months ago I had burnt a backup of my photos and documents that I was storing off-site (aka at work) so today it was a simple matter of grabbing the appropriate disc, extracting the archive of my home directory, and picking out my Inbox from that. Then when I got home it was a matter of dumping the files in the appropriate directory (under Maildir) and looking at the messages in Thunderbird.
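Picking one directory out of a larger archive can be sketched like this, assuming the archive is a tar file (the format of the burnt backup is not stated) and using hypothetical names throughout.

```python
import tarfile

def extract_subtree(archive_path, prefix, target_dir):
    """Extract only the members at or below prefix; return their names."""
    with tarfile.open(archive_path) as archive:
        wanted = [
            member for member in archive.getmembers()
            if member.name == prefix or member.name.startswith(prefix + "/")
        ]
        archive.extractall(target_dir, members=wanted)
    return [member.name for member in wanted]
```

Extracting into a scratch directory first, rather than straight over the live Maildir, leaves room to inspect the recovered messages before moving them into place.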

Now, the data that I lost wasn’t particularly important, but I do need it in order to follow some things up so I was thankful.

Something I need to improve is the interval. The off-site backup is 6 months old, while my other backup is a nightly rsync that gives an interval of up to 24 hours. I have been meaning to use an external hard drive, which could give an interval of a few weeks.


Internal power on an external case does exist

Wednesday, October 3rd, 2007 at 9:47 am

Earlier in the year I wanted an external hard drive that did not have an external power brick. I even acquired an old SCSI case, into which I was planning to fit the circuit board and power brick from a cheap external case.

Now, thanks to Zazz! I have found an external case with a built in power supply that actually seems to be available: a Sarotech Hardbox.

However, there is an interesting issue. The price.

  • Zazz has the drive case and a 400GB Samsung hard drive for a total of AU$169.90 (and AU$12.90 postage).
  • At the local computer parts places a 400GB drive currently goes for around AU$130.
  • So that would be another AU$40 for the case which is what I have seen at (the now defunct) swap meets for the ‘one touch backup’ external cases.

However a quick search online for places in Australia selling the Sarotech Hardbox brings up prices of at least AU$90. It actually makes the Zazz! deal tempting, although I do prefer Seagate or Western Digital over Samsung…

At least I now know that there is something available. But I probably wouldn’t get one unless the price is below AU$50.

Update: Further searching turned up the case for AU$47.50 and AU$10 shipping. I should get onto a friend and see if he can get it wholesale…


Where are the internal power supplies?

Friday, May 4th, 2007 at 8:12 pm

A few weeks ago Thomas Hawk posted about using external hard drives to back up photos. The post and the comments that followed provide a lot of good ideas and advice, but none of them address a fundamental issue I have with external USB drives:

  • They use an external power supply.

I have problems with this:

  • The power supply is an additional part that must be carried with the drive. This reduces the convenience of the drive unless there is a power supply at each location the drive is to be used.
  • The pins on the power connector are too fragile. Between myself and people I know there are at least a half dozen times where a drive has become useless because the connector or the socket became faulty.
  • The power supply adds to the clutter if the drive needs to be connected for an extended period of time.

A few years ago, before USB, the option for external drives was SCSI and those cases came with internal power supplies. Simply connect an IEC power lead and the SCSI cable and the drive was ready to go.

Why can’t that be the case for USB cases? You could transport a single item which could be used anywhere that had a standard power cable and a standard USB cable.

I can think of two possible solutions which both involve sacrificing a USB drive case:

  • Fit the hard drive, USB interface and the (previously) external power adapter inside another case.
  • Fit the USB interface inside a SCSI hard drive case in place of the SCSI connector.

For now I’m just going to keep my eye out for cheap SCSI cases on eBay.


Off the air

Wednesday, June 21st, 2006 at 11:52 pm

Since Monday evening (my time so at least 48 hours ago) this site has been unavailable.

Q: Why?

The server doing the hosting seemed to drop off the internet.

Q: Why?

Good question. As the site for my hosting provider was also down I decided to take the optimistic view that the situation was being rectified and it would be back up shortly.

Q: Was it?

No. I tried again later that evening and the situation had not changed. I was getting more concerned but I decided to stay optimistic and see if it was back up the next morning. That turned into the next afternoon and I was then kicking myself as I realised that it had been a couple of months since I had last backed up these blog posts and there were settings that I had never backed up.

Q: So what now?

I jumped ship. Although the hosting had been ticking along without any issues I had considered it strange that they didn’t seem to care that they had not billed me since October 2005, i.e. more than six months ago. What company doesn’t care if its customers pay or not? Because of this I had looked into other web hosting providers a month or so earlier, so it was simple to sign up with the first name on the list after I couldn’t even access the current provider by phone (voice mailbox full!). It was then a waiting game while the new DNS settings for my domain propagated.

Q: What about the data?

Initially I thought that I had only lost a little bit of data but as time flowed along I realised that I had lost enough data to be inconvenienced.

I run a copy of the site on my box at home so that I can test any changes before releasing them, so I knew that all of the files were intact. It would be a straightforward matter of uploading them to the new host. This meant that my computer collection area was intact as that is contained within files or brought in from (which I back up via a cron script that uses the API to grab a dump of all my links once a day). The photos section was also OK as it pulls the set details from Flickr.

This blog was a different story as these posts are primarily stored in a MySQL database. Every so often I copy the database over to my local MySQL instance but the last time I had done that had been at the end of March. Almost three months ago! Fortunately Google’s cache came to the rescue and I was able to obtain the text for all of the posts I had made since that time. One item on my list is to set up a mechanism to automatically back up the database. A quick search showed that there was at least one WordPress plugin that could periodically email an export of the database.

Losing all of my email forwarding settings means that my spam strategy has taken a big hit as I will have to regenerate the list of valid addresses and again monitor my gmail account which will be the target of the catchall rule.

Q: When will the status-quo be restored?

Not until after the upcoming weekend. So far I have uploaded the files to the new web host and the DNS settings have propagated. Until I sort out some differences between the configuration of this host and my old host, all I have running is this blog (how else can it be read?)…


One year and a day

Saturday, March 5th, 2005 at 1:02 pm

On Thursday night, midway through a long overdue backup of my personal files onto DVD, my Pioneer burner decided to just stop responding. It can read and burn CDs fine but doesn’t even want to detect that a DVD (of any type) has been inserted. It is ironic that I had had the drive for one year and a day.

That said I went out today (after my golf lesson) and picked up a new one, this time a Pioneer DVR-109. At the same time I picked up some new RAM to take my main windows box up to 1GB.
