CrashPlan De-Duplication Problems

2014-02-13 21:19 PST

When I returned from a winter vacation with my family, I discovered that my MacBook Pro had failed. Well, the video card had anyway. I promptly took it to the Apple Store where the Genius agreed with me that the video card was the problem, and also assured me that the machine’s hard drive would not need to be touched.

It came back to me with a new video card, and a fresh copy of Mavericks. So much for promises of the integrity of my data.

No matter, I had a backup with CrashPlan, and I had a bunch of catch-up to do at work, so I could afford to let my personal machine spend three or four days downloading my 110GB restore from CrashPlan. So, disappointed with Apple, I brought the machine home, walked through the setup process, installed CrashPlan, and set it to work restoring my data.

A day or so later, I found I had some problems. I had only downloaded about 4GB of data. What was going on here? I tried to reach out to CrashPlan Support, to no avail. Eventually, after a few days, I sent them a NastyGram via Twitter, which prompted a response, and I was able to open a ticket with their support staff.

Madness ensues!

Josh in Support and I had a couple of back-and-forth exchanges about what I could do. He insisted that my machine must be going to sleep. I said it wasn’t. He said that it was changing IP addresses at an alarming rate. I said it wasn’t. Blah blah blah. What struck me as maddening was the pace: I sent the first message asking for help, and his response came back just inside of 24 hours. I responded, and waited another 24 hours for his reply. After several days of this back and forth with a 24-hour delay each time, I asked what was up, and whether my ticket could be escalated. He said in an email to me:

And this ticket is escalated to its highest severity, and we aim to respond once every 24 hours, we cannot promise anything more.

Josh in Customer Support

Customer Service this is not.

I had seen for myself that the ticket was of “medium” priority. Their ticketing system gave away that much information. And what is this 24-hour nonsense about? As it was, my restore that should have taken only 3-4 days was working well into day 6, and at 4.25GB downloaded… I didn’t see it finishing anytime soon, and with their delayed response time, it was going to take more than a year to get my data back!

I expressed my frustration several more times, each with a 24-hour return time, until eleven days after I began my restore job, with only 4.5GB of data downloaded, I turned to the broader Inter-webs for help. I found a page that I thought was interesting, but I didn’t think there was any way it could help me. The post at http://networkrockstar.ca/2013/09/speeding-up-crashplan-backups/ talks about CrashPlan’s de-duplication algorithm causing bottlenecks in the backup process. While I had seen slow backup speeds before, I was currently trying to restore data. Certainly the de-dup bottleneck should be a factor when my computer was trying to de-duplicate data before sending it to CrashPlan. However, using the method in the post should not have had any effect on the restore process. At this point I had nothing to lose, and if I applied this fix and ever did complete the restore, I would at least be all set when it next came time to back my machine up.

I made the changes described in the post above on the eleventh night and, would you believe it, my restore speed jumped to ten times what it had been before the change. I let it run all night, and by morning the restore job was 48% complete!

I had set up a phone call with Mac N., also from CrashPlan Support, for the twelfth morning. I was happy to report to him that I had dramatically increased the speed of my restore job. Mac, it seems, is the CrashPlan Support Macintosh guru. If you have a Mac, be sure you get Mac N. He pointed out that while CrashPlan requires Java 6 to be on the machine, it actually runs better with Java 7 on OS X Mavericks. He also talked about a number of other Macintosh-specific items that are escaping my brain at this time, but interestingly he had not heard about the de-duplication algorithm problem posted about at NetworkRockstar.ca. He did ask for the link so he could review the information and evaluate it for inclusion in CrashPlan’s support knowledge base.

Hopefully, with people like Mac N., CrashPlan’s support will get better. After my overall experience, up is the only direction left for it to go. :) Oh, and if you happen to work for Code42 (CrashPlan’s parent company), check out Zendesk ticket 41109 for details on this case.

Anyway, this is all a rather long way of saying that I felt the need to execute the commands recommended by the NetworkRockstar article, on all of my computers running CrashPlan. The following is a script that will parse your my.service.xml file and make the appropriate changes for each backup set you have specified. Macintosh only. If you are looking for Linux and Windows solutions, look here: http://networkrockstar.ca/2014/01/speed-up-crashplan-backups-automagic/

Additionally, you may find that you need to rerun this script following any change to your backup sets. CrashPlan, it seems, wants to de-duplicate files very badly. The change is not persistent across edits to your backup sets.
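For anyone curious what a per-backup-set edit of this kind looks like, here is a minimal sketch in Python. It is not the script from the Gist, only an illustration of the idea. Two things in it are assumptions that this post does not confirm: the element name dataDeDupAutoMaxFileSizeForWan (my recollection of the setting the NetworkRockstar article changes, so verify it against the article and your own my.service.xml) and the tiny sample document, which is made up. The function works on text rather than on the live file, so you can try it against a copy first.

```python
import re

# ASSUMPTION: element name recalled from the linked NetworkRockstar article.
# It is not stated in this post; check your own my.service.xml before relying on it.
DEDUP_TAG = "dataDeDupAutoMaxFileSizeForWan"


def limit_dedup(xml_text, new_value="1", tag=DEDUP_TAG):
    """Set every <tag>...</tag> value to new_value.

    Returns (updated_text, changed), where changed counts the elements
    whose value actually differed from new_value.
    """
    pattern = re.compile(r"(<{0}>)([^<]*)(</{0}>)".format(re.escape(tag)))
    changed = 0

    def swap(match):
        nonlocal changed
        if match.group(2).strip() != new_value:
            changed += 1
        return match.group(1) + new_value + match.group(3)

    return pattern.sub(swap, xml_text), changed


# Hypothetical stand-in for a config with two backup sets.
sample = """<backupSets>
  <backupSet id="1"><dataDeDupAutoMaxFileSizeForWan>0</dataDeDupAutoMaxFileSizeForWan></backupSet>
  <backupSet id="2"><dataDeDupAutoMaxFileSizeForWan>0</dataDeDupAutoMaxFileSizeForWan></backupSet>
</backupSets>"""

updated, changed = limit_dedup(sample)
print(changed, "backup set(s) updated")
```

Running the function a second time on its own output reports zero changes, which makes it a cheap way to check whether an edit to your backup sets has quietly reset the value.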

UPDATE - 2014-03-13: I have updated the above script to more gracefully handle the space in the $FILEPATH variable. Check out the GitHub Gist to view the specifics.

UPDATE - 2015-01-30: I have updated the script to include the new path of the menu bar application. Check out the GitHub Gist to view the specifics.

UPDATE - 2015-01-30: It probably comes as no surprise, but I left CrashPlan not too long after this incident took place.