rsync is a popular file synchronization utility that uses an efficient algorithm to minimize bandwidth consumption. One of rsync’s common roles is deploying a website build to a remote production server. Here’s how to combine rsync’s versatility with the automation provided by GitLab CI pipelines.
GitLab CI supports several types of pipeline executors. These define the environment that your job will run in. The shell executor is the simplest to configure and runs directly on the host machine. It lets your pipelines use any command available on the host without further configuration. As most popular Linux distributions ship with rsync installed, this approach is easy to get to grips with.
However, the shell executor doesn’t provide strong isolation and can pollute your host’s environment over time. A better alternative is the docker executor, which spins up a new Docker container for each CI job. All jobs run in a clean environment that can’t impact the host.
The drawback here is that Docker base images don’t generally include rsync or ssh. Even official OS images like ubuntu:latest ship as minimal builds without these commands. This makes for a slightly more involved pipeline script to add the dependencies and rsync your files.
Here’s how to add rsync to your pipeline. Make sure that you have a Docker-based GitLab Runner available before you continue. We’ll also assume that you have a GitLab project that’s ready to use.
You’ll need an SSH key pair available if you’ll be using rsync to connect to a remote SSH host. You can generate public and private keys by running ssh-keygen -t rsa. Copy the public key to the server that you’ll be connecting to by adding it to the remote user’s ~/.ssh/authorized_keys file.
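A key used in CI generally needs an empty passphrase, because a non-interactive job cannot answer a passphrase prompt. The following is a minimal sketch of generating a dedicated deploy key under that assumption. The `generate_deploy_key` helper, the `deploy_key` file name, and the `gitlab-ci-deploy` comment are all hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Sketch: create a dedicated, passphrase-less RSA key pair for CI deployments.
# The default path and the key comment are hypothetical; adjust them as needed.
generate_deploy_key() {
    key_path="${1:-$HOME/.ssh/deploy_key}"
    mkdir -p "$(dirname "$key_path")"
    # -N "" sets an empty passphrase, -f the output file, -C a comment, -q quiet mode
    ssh-keygen -q -t rsa -b 4096 -N "" -C "gitlab-ci-deploy" -f "$key_path"
}
```

Running `generate_deploy_key` writes the private key to `deploy_key` and the public key to `deploy_key.pub`. If you use a dedicated file like this, substitute its path for `~/.ssh/id_rsa` in the next step.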
Next, copy the generated private key to your clipboard:
cat ~/.ssh/id_rsa | xclip -selection c
Head to your GitLab project and click “Settings” at the bottom of the left navigation menu. Click the “CI/CD” item in the sub-menu. Scroll down to the “Variables” section on the resulting page.
Click the blue “Add variable” button. Give your new variable a name in the “Key” field. We’re using SSH_PRIVATE_KEY. Paste your private key into the “Value” field, including the leading -----BEGIN and trailing -----END lines.
Adding the key as a CI variable lets you reference it in your pipeline later on. It will be added to the SSH agent in the containers that your pipeline creates.
Adding Your Pipeline File
GitLab CI runs jobs based on the contents of a .gitlab-ci.yml file in your repository. GitLab will automatically find this file and run the pipeline it defines when you push changes to your branches.
deploy:
  stage: deploy
  image: alpine:latest
  script:
    - rsync -atv --delete --progress ./ user@example.com:/var/www/html
This .gitlab-ci.yml contains a job that uses rsync to synchronize the contents of the working directory to /var/www/html on the example.com server. It uses the alpine:latest Docker image as the build environment. The pipeline will currently fail because rsync isn’t included in the Alpine image.
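One way to make this kind of failure easier to diagnose is to check for the required tools at the start of the job. The sketch below shows one possible guard. `require_cmd` is a hypothetical helper name, not a GitLab CI or rsync feature.

```shell
#!/bin/sh
# Sketch: fail early with a clear message when a required tool is missing.
# require_cmd is a hypothetical helper, not part of GitLab CI or rsync.
require_cmd() {
    for cmd in "$@"; do
        if ! command -v "$cmd" >/dev/null 2>&1; then
            echo "missing required command: $cmd" >&2
            return 1
        fi
    done
}
```

Calling `require_cmd rsync ssh` before the rsync line would stop the job with a readable message that names the missing tool.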
Installing SSH and rsync
Alpine is a good base for the job because it’s a lightweight image with few dependencies. This reduces network use while GitLab pulls the image at the start of the job, accelerating your pipeline. To get rsync working, add SSH and rsync to the image, and then start the SSH agent and register the private key that you generated earlier.
deploy:
  stage: deploy
  image: alpine:latest
  before_script:
    # Install the SSH client and rsync, which the Alpine image lacks
    - apk update && apk add openssh-client rsync
    # Start the SSH agent and load the private key from the CI variable
    - eval $(ssh-agent -s)
    - echo "$SSH_PRIVATE_KEY" | ssh-add -
  script:
    - rsync -atv --delete --progress ./ user@example.com:/var/www/html
OpenSSH and rsync are installed using Alpine’s apk package manager. The SSH authentication agent is started, and your private key is added via ssh-add. GitLab automatically injects the SSH_PRIVATE_KEY environment variable with the value that you defined in your project’s settings. If you entered a different name in the “Key” field on the GitLab variables screen, make sure that you adjust your pipeline accordingly.
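Because a truncated paste is an easy mistake to make, it can help to sanity-check the variable before handing it to ssh-add. The following is a rough sketch rather than a full validation. `looks_like_private_key` is a hypothetical helper that only checks for the header and footer lines and does not verify the key itself.

```shell
#!/bin/sh
# Sketch: check that a value starts with a "-----BEGIN" line and ends with an "-----END" line.
# Blank lines are ignored so that a trailing newline in the variable does not matter.
looks_like_private_key() {
    first_line="$(printf '%s\n' "$1" | sed '/^[[:space:]]*$/d' | head -n 1)"
    last_line="$(printf '%s\n' "$1" | sed '/^[[:space:]]*$/d' | tail -n 1)"
    case "$first_line" in
        "-----BEGIN "*) ;;
        *) return 1 ;;
    esac
    case "$last_line" in
        "-----END "*) ;;
        *) return 1 ;;
    esac
    return 0
}
```

In a job, this could run before ssh-add as `looks_like_private_key "$SSH_PRIVATE_KEY" || { echo "SSH_PRIVATE_KEY looks incomplete" >&2; exit 1; }`.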
Managing Host Verification
SSH interactively prompts for confirmation the first time that you connect to a new remote host. This is incompatible with the CI environment, where you won’t be able to see or respond to these prompts.
Two options are available to address this: Disable strict host key checks, or register your server as a “known” host ahead of time.
For the first option, add the following lines to your pipeline’s before_script section:
- mkdir -p ~/.ssh && chmod 700 ~/.ssh
- printf 'Host *\n\tStrictHostKeyChecking no\n' >> ~/.ssh/config
While this works, it’s a potential security risk. You’d have no warning if an attacker gained control of your server’s domain or IP. Using host key checking lets you verify that the remote’s identity is what you expect it to be.
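You can add the remote as a known host non-interactively by connecting to it on your own machine outside of your pipeline. Inspect your ~/.ssh/known_hosts file and find the line containing the remote’s IP or hostname. If the entries in that file are hashed, running ssh-keygen -F example.com prints the matching line. Copy this line and use the procedure from earlier to add a new GitLab CI variable. Name this variable SSH_HOST_KEY.
Now, update your before_script section with the following lines. The first creates the ~/.ssh directory, which a fresh container won’t have:
- mkdir -p ~/.ssh && chmod 700 ~/.ssh
- echo "$SSH_HOST_KEY" > ~/.ssh/known_hosts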
Now, you’ll be able to connect to the server without receiving any confirmation prompts. Push your code to your GitLab repository and watch as your pipeline completes.
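The known-hosts step can also be sketched as a small shell function that refuses to write an empty entry, which would otherwise leave the job stuck on host verification. `install_known_host` and its optional directory argument are hypothetical names used for illustration. `SSH_HOST_KEY` is the CI variable described above.

```shell
#!/bin/sh
# Sketch: write a known_hosts entry into an SSH directory with restrictive permissions.
# install_known_host is a hypothetical helper; $1 is the host key line, $2 the target directory.
install_known_host() {
    host_key="$1"
    ssh_dir="${2:-$HOME/.ssh}"
    if [ -z "$host_key" ]; then
        echo "host key is empty; refusing to write known_hosts" >&2
        return 1
    fi
    # Create the directory if the container does not already have it
    mkdir -p "$ssh_dir" || return 1
    chmod 700 "$ssh_dir" || return 1
    printf '%s\n' "$host_key" > "$ssh_dir/known_hosts"
    chmod 644 "$ssh_dir/known_hosts"
}
```

In a job, this would be called as `install_known_host "$SSH_HOST_KEY"`.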
This pipeline is a simple example of how to get started with SSH and rsync in a Dockerized environment. There are opportunities to further improve the system by wrapping the preparation steps into a dedicated build stage that constructs a Docker image that you can reuse between pipelines.
Your .gitlab-ci.yml would also benefit from greater use of variables. Abstracting the remote server’s hostname (example.com), directory (/var/www/html), and user (user) into GitLab CI variables would help keep the file clean, prevent casual repository browsers from seeing environmental details, and let you change the configuration values without editing your pipeline file.
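As a sketch of that refactoring, the rsync destination can be assembled from environment variables in the shell. The names `DEPLOY_USER`, `DEPLOY_HOST`, and `DEPLOY_PATH` are assumptions chosen for this example, not GitLab built-ins. They would need to be defined as CI variables in the same way as `SSH_PRIVATE_KEY`.

```shell
#!/bin/sh
# Sketch: build the rsync destination from (hypothetical) CI variables.
# DEPLOY_USER, DEPLOY_HOST and DEPLOY_PATH are assumed names, not GitLab built-ins.
rsync_target() {
    if [ -z "${DEPLOY_USER:-}" ] || [ -z "${DEPLOY_HOST:-}" ] || [ -z "${DEPLOY_PATH:-}" ]; then
        echo "DEPLOY_USER, DEPLOY_HOST and DEPLOY_PATH must all be set" >&2
        return 1
    fi
    # Produces the user@host:path form that rsync expects for a remote destination
    printf '%s@%s:%s\n' "$DEPLOY_USER" "$DEPLOY_HOST" "$DEPLOY_PATH"
}
```

The deploy step could then read `target="$(rsync_target)" && rsync -atv --delete --progress ./ "$target"`, so that a missing variable stops the job before rsync runs.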
Using rsync in GitLab CI pipelines requires a little manual setup to form a build environment that has the dependencies you need. You have to manually inject an SSH private key and register the remote server as a known host.
Although community Docker images are available that roll SSH and rsync atop popular base images, these ultimately give you less control over your build. You’re extending your pipeline’s supply chain with an image that you can’t necessarily trust. Starting with an OS base image and adding what you need helps you have confidence in your builds.