Synchronization With Rsync
A few years ago, as part of a contract we were servicing, I was
asked to take over Webmaster responsibilities for a suite of
open-source Web applications we were developing. An important
component of the project was its open, or public, nature - users
were invited to participate directly in the development process by
providing feedback on development snapshots released by the
development team to the Web site on a daily basis. Since our
customers for the software were located in a different country,
these daily development snapshots also provided them with an easy
way to check on the progress of the project at any time.
As Webmaster for the project Web site, it became my responsibility to ensure that the site was always running the latest build of the software, so that users could play with it and give the development team feedback on how well it was (or wasn't) working. In the beginning, the task was easy - but as the project size grew, I found myself spending more and more time in front of my FTP client, watching as one file after another slowly wended its way from our staging server to our Web host.
Now, you have to keep in mind that, at this time, no one had heard of broadband, and so most of these uploads took place over a slow modem link. Since I had no way of knowing which files had changed between builds, I usually just uploaded the entire source tree from our local servers to the Web server - a long process, and one which grew ever longer as the project matured and the development team added new features. What I *really* needed was a way to just transfer the delta - the changes between the last build and the current one - so as to reduce both the time spent by me on the task, and the cost to the company in terms of connectivity charges.
That was when I discovered rsync.
File Synchronization With Rsync - Getting The Skinny
Rsync, in the words of its official Web site at http://www.samba.org/rsync/, is a "faster, flexible replacement for rcp". (For those of you not clued into the lingo, rcp is a remote copy program which allows you to copy files from one host to another.) Like rcp, rsync allows you to transfer files between hosts; however, unlike rcp, rsync attempts to identify differences between source and destination files prior to initiating a transfer, and (assuming differences exist) tries to copy only the changes, rather than the entire file.
Needless to say, this is far more complicated than it seems - rsync accomplishes it via a specially-designed algorithm that allows it to obtain a list of all the differences between the source and destination files, and transfer these differences only. You don't need to worry about the details - if you're really interested, you can read about rsync's internals at http://dp.samba.org/rsync/tech_report/ - but you certainly should be impressed with the end result: a substantial reduction in both bandwidth and time used, all accomplished by the simple expedient of transferring only the differences between files, rather than the entire file.
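If you'd like a feel for the idea without wading through the technical report, here's a deliberately simplified sketch in shell. It is not rsync's actual algorithm: it merely checksums fixed-size blocks at the same position in two files and reports which blocks differ, whereas rsync uses a rolling checksum so that it can still find matching data after bytes have been inserted or removed. The helper names "block_sum" and "changed_blocks" and the sample file contents are made up for this illustration.

```shell
#!/bin/sh
# Simplified illustration only - this is NOT how rsync itself is implemented.
# block_sum and changed_blocks are hypothetical helper names.

# block_sum FILE SIZE INDEX: print the checksum of one fixed-size block.
block_sum() {
    dd if="$1" bs="$2" skip="$3" count=1 2>/dev/null | cksum
}

# changed_blocks OLD NEW SIZE: print the index of every block of NEW
# whose checksum differs from the block at the same position in OLD.
changed_blocks() {
    old=$1
    new=$2
    size=$3
    total=$(( $(wc -c < "$new") ))   # arithmetic expansion strips any padding
    i=0
    while [ $(( i * size )) -lt "$total" ]; do
        if [ "$(block_sum "$old" "$size" "$i")" != "$(block_sum "$new" "$size" "$i")" ]; then
            echo "$i"
        fi
        i=$(( i + 1 ))
    done
}

# Demonstration with two small throwaway files and 4-byte blocks.
tmp=$(mktemp -d)
printf 'AAAABBBBCCCCDDDD' > "$tmp/old"
printf 'AAAABxBBCCCCDDDDEE' > "$tmp/new"
changed_blocks "$tmp/old" "$tmp/new" 4
rm -rf "$tmp"
```

Here only block 1 (where "BBBB" became "BxBB") and block 4 (the new "EE" tail) are reported, so in principle just those two blocks would need to travel over the wire instead of all eighteen bytes.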
In addition to the differential-search algorithm that makes up the core of rsync, the program also comes with a bunch of other useful features. Files (and not just files, oh no - entire directories, devices and links too!) can be copied from one host to another with permissions and other file attributes intact. Support for two-way transfer substantially simplifies the task of mirroring data between hosts. Built-in authentication makes it simple to protect access to sensitive files, and data can be transmitted over SSH for greater security.
Intrigued? Wanna see it in action? Flip the page, and let's get installing!
File Synchronization With Rsync - Building Blocks
The first order of business is to install rsync on your Linux box. Drop by the official rsync Web site at http://www.samba.org/rsync/ and get yourself the latest stable release of the software (this tutorial uses rsync 2.5.5).
Once you've downloaded the source code archive to your Linux box (mine is named "olympus"), log in as "root"
$ su -
and extract the source to a temporary directory.
$ cd /tmp
$ tar -xzvf /home/me/rsync-2.5.5.tar.gz
Next, configure the package using the provided "configure" script,
$ cd /tmp/rsync-2.5.5
$ ./configure
and compile and install it.
$ make
$ make install
Unless you specified a different path to the "configure" script, rsync will have been installed to the directory "/usr/local/bin".
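A quick way to confirm that the install worked is to ask the shell where it finds the program. The sketch below assumes a POSIX-style shell; "find_tool" is a hypothetical helper name invented for this example, not something that ships with rsync.

```shell
#!/bin/sh
# find_tool NAME: report where NAME lives on the PATH, or fail if it is absent.
# (find_tool is a hypothetical helper used only for this illustration.)
find_tool() {
    tool_path=$(command -v "$1" 2>/dev/null) || {
        echo "$1: not found on PATH"
        return 1
    }
    echo "$1: $tool_path"
}

# Look for rsync; if it is there, print the first line of its version banner.
if find_tool rsync; then
    rsync --version | head -n 1
else
    echo "If you installed to /usr/local/bin, check that this directory is on your PATH."
fi
```

If the reported path is not the one you expected, an older copy of rsync elsewhere on the PATH may be shadowing the one you just built.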
rsync can be used to synchronize files between two locations on the same physical machine, or between two different hosts on your network. If you're planning to sync files between two hosts, you should make sure that rsync is installed on both hosts - simply follow the process above to install the program on each host.
Once you've got rsync installed, the next step is to take it for a test drive.