If you have a really sharp eye when looking at web addresses in your browser’s top bar, you’ll probably have noticed the very first part of any website’s address, the letters “HTTP” or “HTTPS.” What is HTTP, though, and how does it work? Let’s take a look at the glue that keeps the web together.
HTTP: The Short Version
HTTP stands for hypertext transfer protocol. Let’s break that down a bit, starting with the “protocol” part. In tech, a protocol is the set of rules machines need to abide by to “talk” to each other. For example, VPN protocols determine how VPN apps communicate with VPN servers. HTTP is a lot more general than that: it sets the rules for how the web works.
This is no exaggeration. Without HTTP, there’d be no communication over the world wide web. This is because HTTP governs the communication between web servers and web clients, which is the “transfer” part. Web servers are the machines you connect to so you can view sites; for example, you’re currently in contact with the web server of How-to Geek so you can read this article.
To access a web server, you need a web client. Most of the time, this client is your browser, but it can be any kind of app, really. For example, if you clicked through to this article from the Facebook mobile app, then Facebook’s in-app browser is your web client. The client-server interaction is pretty much what the whole web boils down to, and HTTP is integral to that.
The final part of the HTTP acronym is the “hypertext” part, which describes the kind of content being transmitted, most often in the form of HTML files. These files are the building blocks of the web since they don’t just display text, they can also link to each other. This is different from most of the files you have on your device, which usually can’t do that.
How HTTP Works In a Nutshell
HTTP is a protocol that runs on the so-called application layer of the internet, which sits above the lower layers where the real nuts and bolts of networking, like IP addresses, live. The application layer is where you’ll find the browsers and apps that you use every day, and HTTP is very much a part of that.
How it works is that your browser, the client, will send an HTTP request over the network, which is processed by the server of the site you want to access. The site then sends back an HTTP response, which is—if everything went well—the page you wanted to see. The browser then displays the response.
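To make that round trip concrete, here is a minimal sketch in Python that plays both roles on one machine: a tiny server and a client that sends it a single request. The names `HelloHandler` and `fetch_once`, the page contents, and the use of the local address 127.0.0.1 are all assumptions made for this illustration; a real site does far more.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


class HelloHandler(BaseHTTPRequestHandler):
    """Hypothetical server side: answers every GET with one small HTML page."""

    def do_GET(self):
        page = b"<html><body><h1>Hello</h1></body></html>"
        self.send_response(200)  # status code: success
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(page)))
        self.end_headers()
        self.wfile.write(page)  # response body

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def fetch_once(path="/"):
    """Start the demo server, send one GET request, return (status, body)."""
    server = HTTPServer(("127.0.0.1", 0), HelloHandler)  # port 0 = any free port
    port = server.server_address[1]
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        # Client side: this is the part your browser normally does for you.
        conn = http.client.HTTPConnection("127.0.0.1", port, timeout=5)
        conn.request("GET", path)
        response = conn.getresponse()
        status, body = response.status, response.read()
        conn.close()
        return status, body
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    status, body = fetch_once()
    print(status, body.decode("utf-8"))
```

A browser would take the returned HTML and render it; this sketch just prints it.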
Breaking Down HTTP Requests
Of course, there’s a little more to it than that. An HTTP request is made up of several parts, each of which plays a role in how the site is displayed. Among the most important are the HTTP method, the request headers, and the request body.
The method is the action the server is being asked to perform, such as retrieving information or supplying it (the “GET” and “POST” methods, respectively, though there are plenty of others). The HTTP request headers are a little harder to explain, but think of them as the writing on an envelope: together they carry the address of where the request is going, details about the sender, plus a whole bunch of other information, like the type of postbox (browser) and also information about encryption.
The HTTP body “fills” the envelope with information like login details, or anything else that the server needs to know to handle the request; sometimes it’s empty and the envelope, the request headers, is enough.
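For the curious, the three parts of a request can be seen by assembling one by hand. The sketch below builds the raw text of an HTTP/1.1 request; the function name `build_request`, the host `example.com`, and the form data are made up for illustration, and real clients add more headers than this.

```python
def build_request(method, host, path="/", headers=None, body=b""):
    """Assemble the raw bytes of a simple HTTP/1.1 request (illustrative only)."""
    # Request line: the method, the path being asked for, and the protocol version.
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    # Headers: the "writing on the envelope".
    for name, value in (headers or {}).items():
        lines.append(f"{name}: {value}")
    if body:
        lines.append(f"Content-Length: {len(body)}")
    # A blank line separates the headers from the body.
    head = "\r\n".join(lines) + "\r\n\r\n"
    return head.encode("ascii") + body


if __name__ == "__main__":
    request = build_request(
        "POST",
        "example.com",
        "/login",
        {"Content-Type": "application/x-www-form-urlencoded"},
        b"user=alice",
    )
    print(request.decode("ascii"))
```

A GET request built the same way has no body at all, which is the “empty envelope” case.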
With the request received, the web server now starts to work on its response, which is also made up of three parts: the HTTP status code, the response headers, and the response body. The headers and body are much like their counterparts in requests, except that the body will usually contain a lot more information, like the files needed to display a webpage.
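Going the other way, a response can be pulled apart into the same three pieces. This is a simplified sketch: `parse_response` is a hypothetical helper, it assumes a well-formed HTTP/1.1 response with a reason phrase, and it ignores complications such as repeated headers.

```python
def parse_response(raw):
    """Split raw response bytes into (status code, reason, headers, body)."""
    # The first blank line separates the head from the body.
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    # Status line looks like: HTTP/1.1 404 Not Found
    _version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(code), reason, headers, body


if __name__ == "__main__":
    raw = (
        b"HTTP/1.1 200 OK\r\n"
        b"Content-Type: text/html\r\n"
        b"\r\n"
        b"<h1>Hello</h1>"
    )
    print(parse_response(raw))
```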
The status codes are an interesting touch, since we’ve likely all encountered them without realizing what they were. They’re three digits that can start with numbers 1 through 5. Each series stands for something. Any three-digit code starting with 2 means success (the page is displayed without problems), while one starting with 4 means an error on the client’s side, like the infamous 404: page not found code. Codes starting with 5 mean something went wrong on the server.
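The “first digit tells you the family” rule is easy to sketch in code. The table and the `describe_status` function below are our own illustration; the labels follow the standard five classes, and Python’s built-in `http.HTTPStatus` supplies the official phrase for a given code.

```python
from http import HTTPStatus

# The five families of status codes, keyed by their first digit.
STATUS_CLASSES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}


def describe_status(code):
    """Return the family a three-digit status code belongs to."""
    return STATUS_CLASSES.get(code // 100, "unknown")


if __name__ == "__main__":
    for code in (200, 301, 404, 500):
        print(code, HTTPStatus(code).phrase, "->", describe_status(code))
```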
This call-and-response system is the basis for everything we do on the internet. Though it gets more complicated than we describe above, this covers the basics. Of course, there is the issue of how all this communication is kept safe.
This is where we run into the problem with HTTP: at no point is any of the information being encrypted or protected in any way. It’s purely request-and-response; there’s no step where security is added. Anybody who is able to intercept messages can see what’s being sent, which includes things like credit card numbers or account information.
In a way, it’s like when you’re talking to a neighbor over the fence that separates your properties: you’re each in your own zone, but if anybody stands close enough, they can hear every word you’re saying.
As you can imagine, this is extremely bad news for most internet users, and incredibly good news for the people that prey on them. To fix this, a new type of HTTP was rolled out, called HTTPS, where the final “S” stands for “secure.” This type of HTTP does encrypt information, making it a lot harder for anybody to listen in, so to speak.
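As a rough illustration of the difference, the sketch below compares the two schemes from a client’s point of view. The article doesn’t go into the mechanics, so treat the details as background we’re adding: the encryption in HTTPS is provided by TLS, and Python’s `ssl` module exposes the default settings a client would use. The function name `describe_scheme` is hypothetical, and nothing here actually connects to a site.

```python
import http.client
import ssl


def describe_scheme(scheme):
    """Summarize what a client does differently for http vs. https."""
    if scheme == "https":
        # Default client settings: verify the site's certificate and its name.
        context = ssl.create_default_context()
        return {
            "port": http.client.HTTPS_PORT,
            "encrypted": True,
            "verifies_certificate": context.verify_mode == ssl.CERT_REQUIRED,
            "checks_hostname": context.check_hostname,
        }
    # Plain HTTP: no encryption step and nothing to verify.
    return {
        "port": http.client.HTTP_PORT,
        "encrypted": False,
        "verifies_certificate": False,
        "checks_hostname": False,
    }


if __name__ == "__main__":
    print(describe_scheme("http"))
    print(describe_scheme("https"))
```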
Currently, it’s becoming less and less common to see what’s being called “plain” HTTP anywhere, as over the past few years almost every site worth mentioning has moved over to HTTPS. There are some that, for reasons that vary depending on the site owner, have resisted this change. You may want to shy away from them, or at least use a VPN, keeping in mind that a VPN only protects the part of the journey between your device and the VPN server.
That said, though HTTPS is definitely a vital upgrade, that’s all it is, an upgrade. HTTP has been powering the web since it started, and we doubt that will change any time soon.