
AWS's Simple Storage Service (S3) is great for storing large amounts of data, and its API is also supported by many competing services. If you want to move off AWS, transferring an S3 bucket is easy to do.

How Does This Work?

If the services you're transferring to and from are both S3 compatible, you can simply use a utility like rclone, configured to access each service, to transfer all the objects over. For example, you could transfer from S3 to DigitalOcean's compatible Spaces service, or from an S3 bucket in one AWS account to a bucket in another account.

To handle the transfer, rclone reads from the source bucket, finds all the files that need to be transferred, and clones them into the destination bucket. rclone can also handle file updates, which is useful if the source bucket is still being written to while the transfer runs.
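If you want to see what a transfer would involve before committing to it, rclone's --dry-run flag lists the files it would copy without touching anything. The s3: and spaces: remote names below are examples, matching the remotes configured later in this guide:

# List what would be copied, without transferring anything
rclone sync s3:source-bucket spaces:destination-bucket --dry-run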

As far as transfer times go, expect the sync to take a while depending on the size of the bucket. The number of files matters too, since rclone adds a bit of overhead for each object it transfers. If you have millions of files, or multiple terabytes, be prepared for hours of transfer time.
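To get a rough estimate ahead of time, you can ask rclone to total up the source bucket first (assuming a remote named s3:, as configured below):

# Count the objects and total size in the source bucket
rclone size s3:source-bucket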

Luckily, you can perform the large initial transfer while the bucket is still actively being written to. You will likely need a bit of downtime to ensure the buckets are synced up before finally switching over. If that is a problem, there are other tools available to seamlessly transfer, including commercial tools like NetApp Cloud Sync that can sync multiple buckets together.
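If you stick with rclone, the final cutover can be a last catch-up sync followed by a check that the buckets match. This is only a sketch, using the s3: and spaces: remotes configured later in this guide:

# Final catch-up sync during the downtime window
rclone sync s3:source-bucket spaces:destination-bucket
# Verify that the source and destination now match
rclone check s3:source-bucket spaces:destination-bucket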

Some cloud services, like Google Cloud Platform, have services of their own that can handle the transfer. If you're moving to a platform that supports this, you'll likely want to use their service instead.

Setting Up rclone

The simplest method is to set up rclone on your own server to handle the transfer operation. You will need to run it in the background, or through a tmux window, so you can disconnect from long transfers.
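For example, you can start a named tmux session, launch the transfer inside it, and detach with Ctrl-B then D. The session name here is arbitrary:

# Start a named session to run the transfer in
tmux new -s rclone-transfer
# Later, reattach to check on progress
tmux attach -t rclone-transfer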

rclone is available from most package managers:

apt install rclone -y
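If your distribution ships an outdated version, rclone also publishes an install script on its website that fetches the latest release; as with any script piped from the internet, review it before running:

# Official install script from rclone.org
curl https://rclone.org/install.sh | sudo bash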

Out of the box, rclone doesn't know anything about your cloud accounts, so it needs a bit of configuration before it can handle transfers between S3 services. Its configuration file is located at:

~/.config/rclone/rclone.conf
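If you'd rather not edit the file by hand, rclone can also generate it through an interactive wizard:

# Walks you through creating or editing remotes
rclone config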

Add a new block with the following configuration, which hooks it up to your AWS account (not a specific bucket):

[s3]
type = s3
env_auth = false
acl = private
access_key_id = ACCESS_KEY
secret_access_key = SECRET_KEY
region = REGION
location_constraint = LOCATION_CONSTRAINT

You will need to fill in the config with your access key and secret, and enter your bucket's region. You can find a list of the regions in the AWS docs.
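Once the block is saved, you can sanity-check the credentials by listing the buckets the key has access to (assuming the remote is named s3: as above):

# List the buckets visible to this remote
rclone lsd s3: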

You will need to fill in another block for the service you're transferring to. If you're moving between AWS accounts, you'll need a separate key with access to the destination account. If you're moving to a service like DigitalOcean Spaces, you'll need to define another block with a new endpoint configured:

[spaces]
type = s3
env_auth = false
acl = private
access_key_id = ACCESS_KEY
secret_access_key = SECRET_KEY
endpoint = nyc3.digitaloceanspaces.com

In either case, you will need to give the block a unique name in its title, because these are two separate remotes.
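For example, if you were moving between two AWS accounts, the second remote might look something like this; the [s3-dest] name and placeholder credentials are just illustrative:

[s3-dest]
type = s3
env_auth = false
acl = private
access_key_id = DESTINATION_ACCESS_KEY
secret_access_key = DESTINATION_SECRET_KEY
region = REGION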

Performing The Transfer

Once configured, you will be able to view all of the available remotes:

rclone listremotes
s3:
spaces:

Confirm the type of remote by adding the --long flag to the rclone listremotes command.
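The output should show both remotes with the s3 type, roughly like this:

rclone listremotes --long
s3:                  s3
spaces:              s3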

You can view a bucket's contents by using the remote name followed by a colon and the bucket name.

rclone tree s3:source-bucket

Then, you can run the sync, with some extra flags for optimal performance:

rclone sync s3:source-bucket spaces:destination-bucket \
    -P -v --log-file /var/log/rclone/rclone-1.log \
    --create-empty-src-dirs --s3-chunk-size 20M \
    --s3-upload-concurrency 64 --checksum

The -P flag will allow you to view progress interactively in your terminal, and will give an estimate of how long it is going to take.

rclone sync will simply scan the source bucket and update the destination to match. You can continue modifying the source bucket while the transfer runs. After it finishes, you can run additional syncs to keep the two buckets in step until you're ready to switch over.
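If the migration window is long, one option is to schedule that same sync from cron so the destination never falls far behind; the schedule and log path here are just examples:

# In crontab -e: run an incremental sync at the top of every hour
0 * * * * rclone sync s3:source-bucket spaces:destination-bucket --log-file /var/log/rclone/rclone-cron.log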