Jungledisk for offsite backup
For the longest time, I've been looking for a way to securely store a backup of my files off-site. I'm already using BackupPC to back up the files to another hard disk and keep a number of revisions to protect against hard disk failure. But you never know what will happen in the future. I really don't want to lose all my digital pictures, documents or e-mail in case of a fire.
Requirements for backup:
- Offsite (outside my house)
- Secure, both secure access and secure storage
- Affordable - I don't want to spend US$ 30 per month for 10GB
- Unlimited - I don't want to be limited to 1GB, 10GB or 100GB programmes, but just want to be billed for what I use
- Access from Linux and Windows
- Use in automated backup
Some time ago I read about Amazon's S3 storage offering, which meets most of the requirements above. It's offsite, with datacenters in the US and the EU. It's definitely affordable at $0.15/GB per month, and use of the storage is unlimited! However, Amazon does not provide easy access to its storage (e.g. sftp or similar). You need to use its web services to access the storage, which renders it unusable without writing appropriate software.
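To put the price in perspective, here is a small sketch of the arithmetic. It only uses the $0.15/GB per month storage rate quoted above and deliberately ignores Amazon's separate transfer and request charges; the function name and the sample sizes are my own choices for illustration.

```python
S3_PRICE_PER_GB_MONTH = 0.15  # USD, the storage rate quoted in the text


def monthly_storage_cost(gigabytes, price_per_gb=S3_PRICE_PER_GB_MONTH):
    """Storage-only cost in USD per month (assumption: no transfer or request fees)."""
    return round(gigabytes * price_per_gb, 2)


# Sample sizes are arbitrary; 10 GB matches the requirement list above.
for size_gb in (1, 10, 100):
    print(f"{size_gb:>4} GB: ${monthly_storage_cost(size_gb):.2f} per month")
```

At this rate, the 10 GB from the requirements list costs $1.50 per month for storage, compared with the US$ 30 I did not want to pay.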
Last week I came across Jungle Disk, which is a very nice program to automate backup to Amazon's S3. It's available for both Windows and Linux. The nice thing is that it not only automates your backup: the developers have also provided open source code to access your own data at AWS without their software, thus preventing vendor lock-in. Even if Jungle Disk goes belly-up, my data is still safe at Amazon. Data transfers are protected using SSL, and stored data is encrypted using AES and a personal key if you choose to do so. Without AES, the data is protected using Amazon's internal mechanisms.
Installation and configuration on both Windows and Linux are straightforward. Before you start, download the software from Jungledisk and create an Amazon S3 account to store your data. Be careful when selecting the datacenter for your account. While Amazon storage in the US is cheaper than storage in the EU, access can be slower if you're located in the EU. On the other hand, Jungledisk will only provide its Plus services (read further down) if you've chosen a US datacenter.
On Windows, install the software using the installer. You then simply configure it with your Amazon account and a backup schedule, and your backup will run whenever you want it to. Every option you could possibly need is provided: bandwidth limiting, encryption, number of previous versions to keep, age to keep previous versions, file inclusion and exclusion filters, Windows snapshot creation to back up in-use files, etc. You can also configure Jungledisk to map a network drive to your Amazon S3 account, so that you can access your files from Windows Explorer. In addition, Jungledisk provides local WebDAV access via http://localhost:2667/ as an additional access method.
For installation on Linux, you simply download the binary in a .tar.gz file. After extracting it, you end up with two binaries, jungledisk and junglediskmonitor. The first provides you with command-line access to your files, while the second is an X11 GUI similar to the Windows GUI. As my server does not have a monitor, I've only used the first. The configuration of Jungledisk is stored in the jungledisk-settings.xml file. While you can manually edit this file, it's also possible to download the standalone USB edition and create this XML file on any platform you choose (Windows, Linux and Mac OS X).
There are numerous ways to schedule your backups on Linux:
- If you're running X11, you can run junglediskmonitor and schedule your backups using the GUI, similar to Windows;
- Using the XML configuration, you can schedule the backups to run whenever you want them to. You then need to run jungledisk when your system boots, for example from rc.local or by creating your own init.d script;
- Third, and this is what I've done, you can run Jungledisk from cron on a nightly basis. In your Jungledisk configuration, you set your backup schedule to "Manual Only". You then run the following command from your crontab: </path/to/jungledisk>/jungledisk -f --startbackup --exit. This tells Jungledisk to start up, run your backup (and stay in the foreground) and exit when all backups are complete.
- Finally, it's possible to mount your S3 filesystem using Jungledisk and FUSE. Once you've mounted this, it's possible to back up to the mounted filesystem using any regular backup method you choose, such as rsync. I haven't personally used this, as FUSE is not yet available as a standard RPM on CentOS and it doesn't provide me with any additional functionality over the standard Jungledisk functionality.
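For the cron option, a crontab entry could look roughly like the sketch below. The command and its flags are the ones quoted above; the 02:00 start time is only an assumption for illustration, and the </path/to/jungledisk> placeholder still has to be replaced with the directory where you extracted the binaries.

```
# minute hour day-of-month month day-of-week  command
# Assumption: run nightly at 02:00; pick whatever time suits your machine.
0 2 * * * </path/to/jungledisk>/jungledisk -f --startbackup --exit
```

You can add the entry with crontab -e for the user whose Jungledisk configuration should be used.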
Although Jungledisk costs US$20 for a lifetime license, I've found the ease of use and value that it provides to be worth spending that amount. You can use this one license to back up multiple machines to the same Amazon S3 account. Today I use it to back up both my Windows laptop and several CentOS 5 Xen instances.
In addition to its standard services, Jungledisk offers Jungledisk Plus, which provides three extra features.
- First, and most important, you get block-level file updates. Only the changed portions of large files are uploaded. This can potentially save you a lot of data traffic. The single reason I've purchased this service, which costs US$ 1/month, is to back up my Microsoft Outlook .pst archives.
- The second feature, which is related to the first, provides you with resume on file uploads. If a file upload is disconnected mid-way through the upload, it will continue uploading the file instead of restarting from the beginning.
- Finally, Jungledisk Plus provides you with web access to your Amazon files from anywhere.
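To illustrate the general idea behind block-level updates, here is a minimal sketch. It is not Jungledisk's actual algorithm, which I don't know; it assumes fixed-size blocks compared by SHA-256 hash, and the function names and the tiny 4-byte block size are made up for the example. Real tools typically use much larger blocks and may use smarter schemes such as rolling checksums.

```python
import hashlib

BLOCK_SIZE = 4  # bytes; unrealistically small so the example is easy to follow


def block_hashes(data, block_size=BLOCK_SIZE):
    """Return one SHA-256 hex digest per fixed-size block of data."""
    return [
        hashlib.sha256(data[start:start + block_size]).hexdigest()
        for start in range(0, len(data), block_size)
    ]


def changed_blocks(old, new, block_size=BLOCK_SIZE):
    """Return indexes of blocks in new that differ from old or do not exist in old."""
    old_hashes = block_hashes(old, block_size)
    new_hashes = block_hashes(new, block_size)
    return [
        index
        for index, digest in enumerate(new_hashes)
        if index >= len(old_hashes) or old_hashes[index] != digest
    ]


old_version = b"AAAABBBBCCCCDDDD"
new_version = b"AAAABxBBCCCCDDDDEE"  # one byte edited in block 1, two bytes appended

# Only the blocks listed here would need to be uploaded again.
print(changed_blocks(old_version, new_version))
```

For a large file such as a .pst archive, where usually only a small part changes between backups, this kind of comparison is what keeps the upload small. Note that with fixed-size blocks, inserting data in the middle of a file shifts every later block, which is one reason real implementations can be more sophisticated.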
These services are provided by Jungledisk through one of their own applications running on Amazon's EC2 computing cluster. Because Amazon currently only allows Jungledisk to run this application in its US East Coast datacenter, you need to store your data in Amazon's US datacenters. For European users, this can lead to slower access to files, but I have not yet noticed issues with this myself.