I wrote a shell script the other day to sync remote files using
rsync. Thought I’d share it since it took me some time to get it exactly how I wanted.
rsync -rtvP --delete --include=$PATTERN* --exclude=* -e "ssh -i $SSH_KEY -p $SSH_PORT" $USERNAME@$DOMAIN:$SERVER_PATH/ $BACKUP_PATH/ 2> $ERROR_LOG
The rsync man page describes it as “a fast, versatile, remote (and local) file-copying tool.”
It can also synchronize folders, so it’s more than just a file-copying tool.
rsync has a significant number of options, so the documentation is quite lengthy.
To explain the code snippet above, I’ll start with the options in order of use and why I used them. You’ll also notice that I used
$VARIABLES throughout the script. The definitions of these variables (among a few others) were included in the original script, but their values were both private and irrelevant so I’ve simply excluded them.
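For readers who want something to start from, here is a minimal sketch of how the variables could be defined. Every value below is a made-up placeholder (the originals were private), and the sync_backups function name is my own invention for illustration. The command is the same one shown above, only with the expansions quoted so the local shell does not split the values or expand the wildcards itself.

```shell
#!/bin/sh
# Hypothetical placeholder values: replace every one with your own.
PATTERN="backup"
SSH_KEY="$HOME/.ssh/id_backup"
SSH_PORT=2222
USERNAME="deploy"
DOMAIN="example.com"
SERVER_PATH="/var/backups"
BACKUP_PATH="$HOME/backups"
ERROR_LOG="$HOME/rsync-errors.log"

# Same rsync command as in the post, wrapped in a function (assumed name)
# and with each expansion quoted.
sync_backups() {
  rsync -rtvP --delete \
    --include="$PATTERN*" --exclude='*' \
    -e "ssh -i $SSH_KEY -p $SSH_PORT" \
    "$USERNAME@$DOMAIN:$SERVER_PATH/" "$BACKUP_PATH/" 2> "$ERROR_LOG"
}

# Nothing runs until you call sync_backups yourself.
```

Wrapping the command in a function keeps the definitions and the transfer in one file without running anything until the placeholder values have been replaced.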
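The -r or --recursive option tells rsync to recurse into folders, so you can specify them as the source or destination. Don’t forget to add a trailing slash (/) to your source path: with it, rsync copies the folder’s contents; without it, rsync copies the folder itself into the destination.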
The -t or --times option preserves the modification times on files when transferred. It’s often appropriate and preferred to use the -a or --archive option, which is the same as using -rlptgoD. These options combined will recurse, copy symlinks as symlinks, and preserve permissions, modification times, group, owner, device files and special files (respectively). Perfect for archiving, but not what I wanted at the time.
The -v or --verbose option just causes rsync to be more ‘chatty’ and tell you what it’s doing.
The -P option is the same as adding --partial --progress. In essence, rsync will keep partial files if the transfer is interrupted and tell you the progress of the file transfer via standard output (your terminal screen…unless you redirect it). I wanted both these options so I chose -P.
The --delete option deletes any files from the destination that do NOT exist in the source. There are a variety of other delete options to pick from should you need them.
The --include and --exclude options take patterns that are matched against files in the source. The source folder included a number of files; however, I only wanted files that matched a specific pattern. In this case, all the files I wanted were prefixed with something like ‘backup’, so that’s the value I assigned to $PATTERN. The filenames also included variable data like a timestamp, so in addition to the prefix I used the wildcard (*) to match any suffix.
If I hadn’t added the --exclude option, I still would have transferred all the files from the source folder. --include only explicitly says what should be included. It is NOT exclusive. Thus, I added --exclude=* which matches all other files. rsync checks these filter rules in order and acts on the first one that matches, so the --include has to come before the --exclude. You can use multiple --include and --exclude options as needed. See man rsync for more info.
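To make the first-match-wins behaviour concrete, here is a small sketch that imitates the two filter rules with a plain shell case statement. It does not call rsync at all; the would_transfer function and the sample filenames are hypothetical, and it only models names in the top level of the source folder.

```shell
#!/bin/sh
PATTERN="backup"   # same prefix as in the post

# Imitates the two rules in order: the first branch that matches decides.
would_transfer() {
  case "$1" in
    "$PATTERN"*) return 0 ;;  # --include=$PATTERN* is checked first
    *)           return 1 ;;  # --exclude=* catches everything else
  esac
}

# Hypothetical filenames, purely for illustration.
for name in backup-2020-01-01.tar.gz notes.txt backup.sql old-backup.zip; do
  if would_transfer "$name"; then
    echo "include: $name"
  else
    echo "exclude: $name"
  fi
done
```

Note that old-backup.zip is excluded: the pattern is anchored at the start of the name, so only names that begin with the prefix get through.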
-e "ssh -i $SSH_KEY -p $SSH_PORT"
The -e or --rsh=COMMAND option allows you to specify which remote shell you want to use. I believe ssh is the default on most distributions. However, I also wanted to specify the private key I would use to authenticate with the remote server and what port I would use. The -e option lets me specify both.
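As an aside, the rsync man page also documents an RSYNC_RSH environment variable that accepts the same kind of value as -e. I didn’t use it in my script, but a sketch of that alternative might look like the following. The key path and port are placeholders, not my real values.

```shell
#!/bin/sh
# Hypothetical placeholder values.
SSH_KEY="$HOME/.ssh/id_backup"
SSH_PORT=2222

# Equivalent in effect to passing -e "ssh -i $SSH_KEY -p $SSH_PORT".
RSYNC_RSH="ssh -i $SSH_KEY -p $SSH_PORT"
export RSYNC_RSH

# Show what rsync would use as its remote shell command.
echo "remote shell command: $RSYNC_RSH"
```

I still prefer -e in a script because the setting sits right next to the command it affects.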
$USERNAME@$DOMAIN:$SERVER_PATH/ is the source path. Since it’s on a remote host, I’ve specified the username and the hostname. Notice the trailing slash for my folder.
$BACKUP_PATH/ is the destination path. Notice the trailing slash for my folder.
With 2> $ERROR_LOG, I chose to redirect all errors from stderr to a specific file.
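If the redirect is unfamiliar, this small sketch shows what it does, using a stand-in function instead of rsync so it is safe to run anywhere. The fake_transfer function, its messages, and the temporary log file are all made up for the demonstration.

```shell
#!/bin/sh
ERROR_LOG="$(mktemp)"   # throwaway log file for this demo

# Stand-in for rsync: one line on stdout, one on stderr, non-zero exit.
fake_transfer() {
  echo "normal output"
  echo "something went wrong" >&2
  return 1
}

# Only stderr goes to the log; stdout still reaches the terminal.
if fake_transfer 2> "$ERROR_LOG"; then
  status=0
else
  status=$?
fi

echo "exit status: $status"
echo "contents of the error log:"
cat "$ERROR_LOG"
```

Keep in mind that 2> overwrites the log on every run. Use 2>> instead if you would rather keep appending to it.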
If you want to test the command to make sure it works, just add the
--dry-run option. I highly recommend it.
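One way to make the dry run easy to toggle is a small helper that builds the option string. This is only a sketch: the build_opts function and its yes/no argument are my own invention, and it prints the command instead of running it.

```shell
#!/bin/sh
# Builds the rsync options, adding --dry-run when the first argument is "yes".
build_opts() {
  opts="-rtvP --delete"
  if [ "$1" = "yes" ]; then
    opts="$opts --dry-run"
  fi
  echo "$opts"
}

# Print both variants rather than executing them.
echo "test run: rsync $(build_opts yes) SOURCE/ DEST/"
echo "real run: rsync $(build_opts no) SOURCE/ DEST/"
```

Since --delete is in the mix, checking the dry-run output before the first real transfer is well worth the extra minute.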
I’d also recommend creating a shell script file where you can define all your variables. It makes your script more readable and easier to edit in the future. Then you can add the script to your personal
bin of scripts.