Many cloud providers charge for data transfer, often for each GB every month. These costs can be so high that it may be prohibitively expensive to run some data-heavy services. If you still want to move to the cloud, what can you do to mitigate your bandwidth bill?

Data Is Expensive

Most of the big name cloud providers charge for data, and it's usually unavoidable if you want to use those services. In an effort to be as efficient as possible, providers like AWS micro-optimize all of their pricing, and will charge you exorbitant rates if you want to run a data-heavy workload.

AWS charges $0.09 per GB of outbound data. Azure charges $0.0875 per GB, and Google Cloud Platform charges $0.08. Inbound data is free, and data transferred between servers in the same zone is generally free, but once it leaves and goes out to the internet, you have to pay for it. This adds up quickly if you are sending terabytes every month.
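
To make those rates concrete, here is a minimal Python sketch that estimates a monthly egress bill from the three per-GB figures quoted above. It assumes a single flat rate per provider; real pricing is tiered and changes over time, and the function name is only illustrative.

```python
# Flat per-GB egress rates (USD) as quoted in the text above.
# Assumption: one flat rate per provider; actual pricing is tiered.
EGRESS_RATE_PER_GB = {"AWS": 0.09, "Azure": 0.0875, "GCP": 0.08}


def monthly_egress_cost(gb_out: float) -> dict[str, float]:
    """Return the estimated monthly egress bill per provider, rounded to cents."""
    return {
        name: round(gb_out * rate, 2)
        for name, rate in EGRESS_RATE_PER_GB.items()
    }


if __name__ == "__main__":
    # Example: 5 TB (5,000 GB) sent to the internet in one month.
    for name, cost in monthly_egress_cost(5_000).items():
        print(f"{name}: ${cost:,.2f}")
```

At 5,000 GB a month, the flat-rate estimate comes to $450 on AWS, $437.50 on Azure, and $400 on GCP.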

The big three---AWS, Azure, and GCP---all have offerings for dedicated servers, but none of them comes with dedicated bandwidth. They may have dedicated Mbps connections, but all that does is let you spend your money faster.

Unfortunately, the solution is usually to either attempt to limit your data as much as possible, like with gzip compression, or to give up on using a big name provider, and use a smaller provider that bundles compute with bandwidth at a reasonable price.
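
To get a rough feel for how much gzip can trim from a response, the sketch below compresses a payload with Python's standard gzip module and reports the size ratio. The sample payload is made up and highly repetitive, so it compresses far better than typical traffic; already-compressed data such as images or video will barely shrink at all.

```python
import gzip


def gzip_savings(payload: bytes, level: int = 6) -> tuple[int, int, float]:
    """Return (original size, compressed size, compressed/original ratio)."""
    if not payload:
        return 0, 0, 1.0
    compressed = gzip.compress(payload, compresslevel=level)
    return len(payload), len(compressed), len(compressed) / len(payload)


# Hypothetical JSON-like response body, repeated to simulate a large payload.
SAMPLE = b'{"user": "example", "status": "ok", "items": [1, 2, 3]}\n' * 2000

if __name__ == "__main__":
    original, compressed, ratio = gzip_savings(SAMPLE)
    print(f"original: {original} bytes, gzipped: {compressed} bytes")
    print(f"bytes billed after compression: {ratio:.1%} of the original")
```

In practice you would usually enable compression in the web server or CDN rather than in application code; this only shows how to measure the potential saving on your own data.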

This may mean that you won't be able to use many of the services that come with state-of-the-art cloud providers like AWS, but if you don't have the money to pay their fees, that may not even be an option in the first place.

How Much Data Am I Using?

If you don't know how much data you're currently using, you'll want to monitor that to get an idea of what services you should use.

There are plenty of Linux utilities for measuring this, but vnstat is lightweight and works well.

sudo apt install vnstat

Once installed, vnstat displays usage totals on the command line, and its companion tool vnstati can generate PNGs displaying data usage.
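
If you only need a quick reading and don't want to install anything, the Linux kernel exposes per-interface byte counters in /proc/net/dev. The sketch below is an alternative to vnstat, not part of it: it parses that file's text and returns received and transmitted bytes per interface. These counters reset at boot, so unlike vnstat they won't give you monthly history. The sample text and interface names are made up for illustration.

```python
from pathlib import Path


def parse_net_dev(text: str) -> dict[str, tuple[int, int]]:
    """Map interface name -> (rx_bytes, tx_bytes) from /proc/net/dev content."""
    totals = {}
    for line in text.splitlines()[2:]:  # the first two lines are column headers
        if ":" not in line:
            continue
        name, data = line.split(":", 1)
        fields = data.split()
        if len(fields) < 16:
            continue  # skip malformed rows
        # Field 0 is received bytes; field 8 is transmitted bytes.
        totals[name.strip()] = (int(fields[0]), int(fields[8]))
    return totals


# Made-up sample in the same layout as /proc/net/dev.
SAMPLE = """\
Inter-|   Receive                                                |  Transmit
 face |bytes    packets errs drop fifo frame compressed multicast|bytes    packets errs drop fifo colls carrier compressed
    lo:    1000      10    0    0    0     0          0         0     1000      10    0    0    0     0       0          0
  eth0: 5000000    4000    0    0    0     0          0         0 12000000    9000    0    0    0     0       0          0
"""

if __name__ == "__main__":
    proc = Path("/proc/net/dev")
    text = proc.read_text() if proc.exists() else SAMPLE
    for iface, (rx, tx) in parse_net_dev(text).items():
        print(f"{iface}: received {rx / 1e9:.3f} GB, sent {tx / 1e9:.3f} GB")
```

The transmitted figure is the one that matters for billing, since inbound data is generally free.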

If you're on AWS, you can view EC2 usage and usage for other services in the CloudWatch Dashboard.

AWS Lightsail

AWS is notorious for their awful data pricing, but in an attempt to compete with providers like Digital Ocean (which is simpler and charges bargain rates for data), they launched AWS Lightsail, which is the only saving grace for the big name cloud providers.

Lightsail is a simpler version of AWS that only offers a few services. However, it still offers compute instances and managed databases, and you can still interface with regular AWS services. It's essentially EC2, but simpler with an interface designed for beginners.

Here's the best part---each instance comes with multiple terabytes of data transfer per month, more than Digital Ocean even offers at some tiers. You'll still pay overage fees, but you can always upgrade or purchase additional instances.

Great, right? Well, there are a few catches. Since it can talk to other AWS resources, AWS doesn't want you abusing the service to save money, and includes the following clause in their TOS:

51.3. You may not use Amazon Lightsail in a manner intended to avoid incurring data fees from other Services (e.g., proxying network traffic from Services to the public internet or other destinations or excessive data processing through load balancing or content delivery network (CDN) Services as described in the Documentation), and if you do, we may throttle or suspend your data services or suspend your account.

This is pretty vague, so it's not quite clear what high-data workloads Lightsail can and can't be used for.

For most services only using Lightsail, you're probably fine. The term "other Services" applies to the rest of AWS outside of Lightsail. If you want to run a Lightsail database, Lightsail API service, and Lightsail web server, and they happen to be using a ton of data, you can still do so.

However, if you're thinking of setting up a reverse proxy to directly proxy traffic from EC2, Lambda, S3, or some other service, you'll need to think of another solution. That would be a flagrant violation of their TOS, and would probably get your account throttled or shut down.

It's a gray area whether or not you're allowed to use a Lightsail instance to run data processing on external data stores like S3 or RDS. For example, if you had a Lightsail instance that compressed images in S3 on request, you'd be saving the data costs compared to using EC2. You're not disallowed from using external AWS services, but if you're using them from Lightsail with the intent to save money, you could be in violation if your usage is deemed excessive.

It's also a bit of a gray area whether running extreme load balanced workloads entirely in Lightsail is allowed. Lightsail includes load balancers at $20 a month, but it's possible to run ten $5 instances, which each come with 2 TB of data, and pay $70 overall for 20 TB of data, which would cost nearly $2000 if you ran on EC2.
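
The arithmetic behind that comparison can be checked with a short sketch. It uses only the figures from the paragraph above ($5 instances with 2 TB each, a $20 load balancer, and EC2's $0.09 per GB), and it assumes decimal terabytes (1 TB = 1,000 GB) and a flat EC2 rate with no tiering or compute cost included.

```python
LIGHTSAIL_INSTANCE_USD = 5       # monthly price per instance, from the text
LIGHTSAIL_INCLUDED_TB = 2        # transfer included per instance, from the text
LIGHTSAIL_LB_USD = 20            # monthly load balancer price, from the text
EC2_EGRESS_USD_PER_GB = 0.09     # flat rate; assumption: no tiered discounts
GB_PER_TB = 1000                 # assumption: decimal terabytes


def lightsail_monthly_cost(instances: int) -> int:
    """Instances plus one load balancer."""
    return instances * LIGHTSAIL_INSTANCE_USD + LIGHTSAIL_LB_USD


def lightsail_included_tb(instances: int) -> int:
    """Total transfer allowance across all instances."""
    return instances * LIGHTSAIL_INCLUDED_TB


def ec2_egress_cost(tb: float) -> float:
    """Egress-only cost on EC2 for the same volume of data."""
    return tb * GB_PER_TB * EC2_EGRESS_USD_PER_GB


if __name__ == "__main__":
    n = 10
    tb = lightsail_included_tb(n)
    print(f"Lightsail: ${lightsail_monthly_cost(n)} for {tb} TB")
    print(f"EC2 egress alone for {tb} TB: ${ec2_egress_cost(tb):,.2f}")
```

Under these assumptions, ten instances cost $70 for 20 TB, against roughly $1,800 in EC2 egress charges alone.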

Is using Lightsail like this cheating? Maybe not, but AWS may decide so, so proceed with caution if you want to run a data-heavy application. At the end of the day, AWS will likely decide on a case-by-case basis.

Digital Ocean

Digital Ocean has basically molded their entire business model around being the opposite of AWS---easy to use, with simple fixed pricing for all their services. While they don't have every PaaS offering that AWS and other providers may have (they don't have a competitor to Lambda, for example), they have the basics, and they're good at getting the basics right.

Their simple burstable instances, which are comparable to AWS Lightsail and EC2 T3, provide a ton of data every month with very little restriction. Their cheaper instances, below $20, don't give as much data as Lightsail, and the SSD is smaller, but overall they're very comparable.

What's even better is that they don't charge excessive fees for overage data: only $0.01 per GB, one-ninth of the $0.09 that AWS charges. If you regularly send several terabytes a month, that difference alone can save you hundreds of dollars per month compared to EC2.
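
As a rough illustration of how the overage rate affects the bill, the sketch below prices the same overage at $0.01 and $0.09 per GB. The 1,000 GB allowance and 6,000 GB usage figures are hypothetical, and applying the same allowance to both rates is a simplification made only so the per-GB rates can be compared directly.

```python
def overage_cost(used_gb: float, included_gb: float, rate_per_gb: float) -> float:
    """Cost of data used beyond the included allowance; zero if under it."""
    return max(0.0, used_gb - included_gb) * rate_per_gb


if __name__ == "__main__":
    used, included = 6_000, 1_000  # hypothetical usage and allowance, in GB
    cheap = overage_cost(used, included, 0.01)   # rate quoted for Digital Ocean
    pricey = overage_cost(used, included, 0.09)  # rate quoted for AWS
    print(f"At $0.01/GB: ${cheap:,.2f}")
    print(f"At $0.09/GB: ${pricey:,.2f}")
    print(f"Difference:  ${pricey - cheap:,.2f}")
```

With 5,000 GB of overage, the two rates work out to about $50 and $450, a difference of around $400 a month.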

They're also easy to create and destroy, so if you want to run these in an autoscaling group, you're free to do so. However, Digital Ocean doesn't have built-in autoscaling support yet unless you're using Kubernetes, so you'll need to automate that yourself.

It's certainly cheap, and will probably work for many businesses, but its lack of many services can be a turnoff. If you want premium AWS services like Lambda, you'll need to pay premium prices.

You can check their products page for an up-to-date list, but they offer:

  • VPS Compute with "Droplets"
  • Kubernetes, using Droplets
  • Managed DB using Droplets
  • An "App Platform" service like AWS App Runner
  • S3 compatible object store, with 250 GB free plus $0.02 per GB stored after that, and 1 TB of transfer plus $0.01 per additional GB
  • Local volumes, like AWS EBS.

And, unfortunately, not much else at this point besides basic networking and monitoring tools.
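
The object store pricing in the list above can be estimated with a small sketch. It covers only the charges beyond the included 250 GB of storage and 1 TB of transfer, using the $0.02 and $0.01 per-GB figures from the list. Any base subscription price is not stated here and is left out, and 1 TB is assumed to be 1,000 GB.

```python
INCLUDED_STORAGE_GB = 250      # from the list above
INCLUDED_TRANSFER_GB = 1000    # 1 TB; assumption: decimal terabytes
STORAGE_USD_PER_GB = 0.02      # per additional GB stored
TRANSFER_USD_PER_GB = 0.01     # per additional GB transferred


def object_store_extra_charges(stored_gb: float, transferred_gb: float) -> float:
    """Monthly charges beyond the included storage and transfer allowances."""
    extra_storage = max(0.0, stored_gb - INCLUDED_STORAGE_GB)
    extra_transfer = max(0.0, transferred_gb - INCLUDED_TRANSFER_GB)
    return round(
        extra_storage * STORAGE_USD_PER_GB + extra_transfer * TRANSFER_USD_PER_GB,
        2,
    )


if __name__ == "__main__":
    # Hypothetical month: 400 GB stored, 3,000 GB transferred out.
    print(f"Extra charges: ${object_store_extra_charges(400, 3_000):.2f}")
```

For that hypothetical month, the extra charges are $3 for 150 GB of additional storage plus $20 for 2,000 GB of additional transfer.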

Dedicated Cloud Servers

Some cloud providers, like those offering dedicated servers, don't charge for data per GB, and instead give you a dedicated and unmetered connection at a fixed Mbps.

For example, OVH is a provider focused mostly on dedicated machines, and simply provides unmetered bandwidth for most of their instances.

This may vary by region, though, as data transferred out from machines in places like Australia will be metered unless you pay a lot extra per month. Even then, the metered allowance is still 5 TB of traffic, so it's probably fine for most people.
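
To compare an unmetered, fixed-Mbps connection against per-GB pricing, it helps to convert link speed into the most data the link could move in a month. The sketch below does that conversion, assuming a 30-day month, decimal units, and a link that can be saturated the whole time; real throughput will be lower, which the optional utilization parameter approximates.

```python
def max_monthly_transfer_tb(
    link_mbps: float, days: int = 30, utilization: float = 1.0
) -> float:
    """Upper bound on data moved in a month over a fixed-speed link, in TB."""
    bytes_per_second = link_mbps * 1_000_000 / 8   # megabits -> bytes
    seconds = days * 24 * 60 * 60
    return bytes_per_second * seconds * utilization / 1e12


if __name__ == "__main__":
    for mbps in (100, 500, 1000):
        print(f"{mbps} Mbps: up to {max_monthly_transfer_tb(mbps):.1f} TB/month")
```

A fully used 100 Mbps link tops out at about 32.4 TB in a 30-day month, which is far more than a 5 TB metered allowance.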

Linode is another provider that offers both shared virtual servers and dedicated machines. Their pricing is comparable to Lightsail and Digital Ocean, and their plans include a few TB of transfer per month as well as multiple Gbps of egress speed.