Besides web browsing, email is the most common activity on the Internet. It makes communicating one-to-one or in a group a snap, and it predates the World Wide Web: companies, engineers and early Internet adopters were using email well before the era of browser surfing.
The union of browser and email, i.e. webmail, was what gave a huge boost to the adoption of email by the domestic Internet user. The first movers offering webmail to the market were Hotmail and AOL. Google’s Gmail is now the most commonly used email platform. I explained how browser-based Internet works in a two-part series (Part 1, Part 2). In this post, I will explain how email works. Exactly what happens when you click the Send button, and how does the email find its way to the recipient?
You compose an email. You add the recipient and subject, and an image as an attachment too. You are using your favourite email client, be it Outlook, Thunderbird, your laptop’s browser or your mobile phone’s email app. Now you are ready to send the email and you click ‘Send’. The magic of the Internet is ready to happen and eventually, the message will find its way to your recipient’s inbox. Let us go on a trip with the email message as it floats over the Internet and becomes a ‘You’ve got mail’ notification on your recipient’s device.
Step 1: Your Email client connects to your Email account’s SMTP server
First, you must have an account registered with an email provider. The email provider has a database that contains a list of all valid users who are authorised to send/receive emails using their service. The database can be in any format, but every email provider offers a standard way for email clients to communicate with it. This standard communication is called the Simple Mail Transfer Protocol (SMTP).
If you are using a web client, e.g. Gmail on a browser, Yahoo! Mail on a browser or Hotmail, the underlying web application knows which SMTP server to connect to. But when you use a desktop / mobile app like Outlook or TypeApp, you need to configure the settings yourself. These settings are well documented by your email provider. For your office email server, the network administrator will configure your Outlook for you.
Once you click the ‘Send’ button, the email client attempts to connect to the SMTP server. Port 25 (I explain the concept of ports here) is the standard port assigned to SMTP and is what mail servers use to talk to each other; email clients submitting outgoing mail today usually connect on port 587, or on port 465 when the connection is encrypted from the start.
The SMTP server is specified either as a domain name or as an IP address. As discussed in this post, a domain name is a human-readable name like smtp.gmail.com, while an IP address looks like 203.84.242.16. If the SMTP server is given as a domain name, the corresponding IP address must first be found using DNS, as I explained in the post linked above. Once the IP address is known, the email client knows which remote machine to connect to.
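The lookup described above can be sketched in a few lines of Python. This is only an illustration, not how any particular email client does it; the function names is_ip_literal and resolve_server are made up for this example.

```python
import ipaddress
import socket


def is_ip_literal(server: str) -> bool:
    """Return True if the configured server is already an IP address."""
    try:
        ipaddress.ip_address(server)
        return True
    except ValueError:
        return False


def resolve_server(server: str, port: int) -> list[str]:
    """Return candidate IP addresses for the server (hypothetical helper)."""
    if is_ip_literal(server):
        return [server]  # nothing to look up
    # getaddrinfo asks the system resolver, which in turn queries DNS.
    infos = socket.getaddrinfo(server, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return sorted({info[4][0] for info in infos})
```

Calling resolve_server with a host name and the SMTP port returns whatever addresses DNS hands back at that moment, so the result varies by provider and over time.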
Step 2: The Email client and the SMTP server exchange introductions
Next come the small talk and the authentication phase. The client introduces itself, and the server asks it to authenticate with the correct email account username and password. These are achieved using SMTP commands like HELO (or its newer form, EHLO) and AUTH. Once the formalities are over, the client signals its intent to send the contents of the email to the SMTP server.
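To make the introductions a little more concrete, here is a rough sketch of the lines a client might send in this phase. It assumes the EHLO greeting and the PLAIN authentication mechanism, which is just one of several mechanisms a server may offer; the function names and the sample credentials are made up. In practice the connection would be encrypted before any password is sent.

```python
import base64


def auth_plain_command(username: str, password: str) -> str:
    """Build the SMTP 'AUTH PLAIN' line for the given credentials."""
    # PLAIN sends: authorization-id NUL username NUL password, base64-encoded.
    # The authorization-id is left empty here.
    token = f"\0{username}\0{password}".encode("utf-8")
    return "AUTH PLAIN " + base64.b64encode(token).decode("ascii")


def opening_commands(client_name: str, username: str, password: str) -> list[str]:
    """Client-side lines of the greeting and authentication phase."""
    return [
        f"EHLO {client_name}",  # the client introduces itself
        auth_plain_command(username, password),  # then proves who it is
    ]


for line in opening_commands("laptop.example.com", "user", "pass"):
    print(line)
```

Note that base64 is only an encoding, not encryption, which is why the encrypted connection matters.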
Step 3: The email client sends the email contents to the SMTP server
The email is a combination of the main content of the email accompanied by administrative / book-keeping data called headers. Think of them as the envelope and the stamps that go along with a letter sent by post. The headers that go along with an email are typically the list of recipients, subject, date/time and other data that are established as standards.
Then comes the content of the email. An email can contain text and attachments. Email communication uses a standard called MIME (Multipurpose Internet Mail Extensions), whose multipart format lets the email server and the recipient software know that the mail contains multiple entities. A multipart message contains a series of headers that describe the type of each piece of data within the email, followed by the data itself. The headers instruct the recipient software on how to interpret the data. Think of it as communication among the personnel working with movers and packers as they hand sealed packages to each other. “Hey, this package contains a metal chest. It is rather heavy, so take care of your back when you lift it. The next sealed package is a crate of bottles. Fragile. Careful….” and so on.
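Python's standard email package can build such a message, which makes the structure easy to see. The sketch below uses made-up addresses and a few placeholder bytes in place of a real image file.

```python
from email.message import EmailMessage

msg = EmailMessage()
# Headers: the book-keeping data that travels with the message.
msg["From"] = "alice@example.com"
msg["To"] = "john.smith@gmail.com"
msg["Subject"] = "Holiday photo"

# The text part of the email.
msg.set_content("Hi John, the photo is attached.")

# Placeholder bytes stand in for a real image file.
fake_image = b"\x89PNG\r\n\x1a\n"
# Adding an attachment turns the message into a multipart container.
msg.add_attachment(fake_image, maintype="image", subtype="png", filename="photo.png")

# Each part carries its own headers describing what kind of data it holds.
for part in msg.walk():
    print(part.get_content_type())
```

Printing str(msg) shows the raw form that travels over the wire, with a boundary line separating one part from the next.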
Step 4: The SMTP server queues up the email in its list of outgoing mails
The steps that follow may take time. Hence the server maintains a queue of outgoing mails, holding each one at its end until it can be handed over.
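A toy version of such a queue might look like the sketch below. The function drain_queue and its max_attempts parameter are invented for illustration, and the deliver callback stands in for the remaining steps. A real mail server would also wait between retries and keep the queue on disk, which this sketch does not do.

```python
from collections import deque
from typing import Callable


def drain_queue(outgoing: deque, deliver: Callable[[str], bool],
                max_attempts: int = 3) -> list[str]:
    """Try to deliver every queued message; return the ones given up on."""
    attempts: dict[str, int] = {}
    failed: list[str] = []
    while outgoing:
        message_id = outgoing.popleft()
        if deliver(message_id):
            continue  # delivered, so it leaves the queue for good
        attempts[message_id] = attempts.get(message_id, 0) + 1
        if attempts[message_id] < max_attempts:
            outgoing.append(message_id)  # back of the queue for a later retry
        else:
            failed.append(message_id)  # give up after max_attempts tries
    return failed
```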
Step 5: The SMTP server finds out the domain to use for the recipient
When the SMTP server reads through the email, it retrieves the domain name from the recipient’s address. E.g. for a recipient john.smith@gmail.com, the domain name is gmail.com.
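In code, this step is little more than splitting the address at the last ‘@’ sign. A minimal sketch, with recipient_domain as a made-up helper name:

```python
def recipient_domain(address: str) -> str:
    """Return the part after the last '@', lower-cased."""
    local_part, at_sign, domain = address.rpartition("@")
    if not at_sign or not local_part or not domain:
        raise ValueError(f"not a valid email address: {address!r}")
    return domain.lower()


print(recipient_domain("john.smith@gmail.com"))
```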
Step 6: The SMTP server seeks the MX record for the domain
If you are a beginner to the world of emails, the title of step 6 will not make much sense to you. Allow me to explain with an analogy.
You order a paperback book from Amazon and receive it at your doorstep. Who delivered the book to you? If you answered ‘Amazon’, then you may not be entirely right. While Amazon does have its own logistics service, it depends on many partners too, e.g. FedEx, DHL, etc. If you look at the delivery person’s badge or the label on the package, you will know who actually carried the book to you. It would be the same if you were to use Western Union to transfer money to someone’s Citibank account. You would say that you wired to Citibank, but it was the WU infrastructure which you used.
MX records are like ‘logistics partners’. People want to use convenient names like @gmail.com and @abclimited.com, typically a domain name which matches the main domain name of an organisation. This is just like saying, ‘I bought from Amazon’ or ‘I wired to Citibank’. Just like FedEx and Western Union, email servers (SMTP) typically reside on a separate infrastructure from the main website of an organisation. This separate infrastructure is kept failsafe so that email can still be used in case the main website goes down.
Conventionally, the email infrastructure has a domain name that is the main domain prefixed with smtp or mail, e.g. smtp.abclimited.com, mail.mycompany.com, etc. If MX records were not to exist, people would have to use the email domain name directly inside the address, e.g. recipient@smtp.gmail.com or sales@mail.company.com, which would be like saying, “Hey Amazon, I want to buy this book. Please deliver it to me via DHL”!
MX records are looked up to solve this issue. An MX (mail exchanger) record is a DNS record that maps a domain name to the host name of the mail server that accepts email for it. Once the sender’s SMTP server has the domain name from the receiver’s address, it looks up that domain’s MX record to learn which mail server to deliver to, e.g. for the recipient address john.smith@gmail.com, it looks up the MX record of gmail.com and gets back the name of one of Google’s incoming mail servers.
The above screenshot shows how any mail sent to user@example.com will be sent to mx1.dnsmadeeasy.com. The latter is the actual SMTP mail server which provides email service for the company on behalf of the domain example.com.
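A domain can publish more than one MX record, each with a preference number, and the sending server tries the lowest number first. The Python standard library has no MX lookup of its own, so the sketch below skips the DNS query and works on records supplied by hand. The second record (mx2) and both preference values are hypothetical, added only to show the ordering.

```python
def order_mail_servers(mx_records: list[tuple[int, str]]) -> list[str]:
    """Return mail server names, most preferred (lowest number) first."""
    return [host for preference, host in sorted(mx_records)]


# Hypothetical MX records for example.com: (preference, mail server name).
example_mx = [(20, "mx2.dnsmadeeasy.com"), (10, "mx1.dnsmadeeasy.com")]

print(order_mail_servers(example_mx))
```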
Step 7: The SMTP server connects to the recipient SMTP server
Using the TCP handshake described in this post, the sender’s SMTP server connects to the recipient’s SMTP server.
Step 8: The sender’s SMTP server transfers the mail to the recipient SMTP server
Everything, including the headers and the content of the email, is transferred between the servers. The recipient server stores the email in its database of received messages.
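The transfer itself follows a fixed sequence of SMTP commands. The sketch below only lists the lines the sending server would issue and does not open a connection; transfer_commands and the host mail.example.org are made-up names.

```python
def transfer_commands(sending_host: str, sender: str, recipients: list[str]) -> list[str]:
    """List the SMTP commands a sending server issues to hand over one mail."""
    commands = [
        f"EHLO {sending_host}",    # greet the recipient's server
        f"MAIL FROM:<{sender}>",   # who the mail is from
    ]
    # One RCPT TO line per recipient handled by this server.
    commands += [f"RCPT TO:<{rcpt}>" for rcpt in recipients]
    # After DATA, the headers and body are sent, ended by a line holding only ".".
    commands.append("DATA")
    commands.append("QUIT")        # close the session politely
    return commands


for line in transfer_commands("mail.example.org", "alice@example.org",
                              ["john.smith@gmail.com"]):
    print(line)
```

The receiving server answers each command with a numeric reply code, and the sender moves on only when the reply signals success.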
Step 9: The Email client of the recipient retrieves new mails using an Email retrieval protocol
Once the Email has reached the database of the recipient’s email server, the email client can access the recipient’s list of new mails. To do this, one of the following techniques is used.
- A desktop/mobile client like Outlook uses a protocol such as IMAP (or the older POP3) to retrieve the list of emails for a particular account.
- Webmail over a browser typically uses a technique such as WebSocket to stay connected to the web server for notifications. The web server itself uses IMAP to connect to the mail server.
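For the IMAP route, Python's imaplib module gives a feel for what a client does. This is a simplified sketch rather than what Outlook or any webmail service actually runs; fetch_unseen and open_mailbox are made-up helper names, and the host and credentials passed to open_mailbox are whatever your provider documents.

```python
import imaplib


def fetch_unseen(conn) -> list[bytes]:
    """Return the raw bytes of every unread message in the inbox."""
    conn.select("INBOX")
    # Ask the server for the numbers of messages not yet marked as read.
    status, data = conn.search(None, "UNSEEN")
    raw_messages = []
    for num in data[0].split():
        # RFC822 requests the complete message: headers plus body.
        status, msg_data = conn.fetch(num, "(RFC822)")
        raw_messages.append(msg_data[0][1])
    return raw_messages


def open_mailbox(host: str, username: str, password: str) -> imaplib.IMAP4_SSL:
    """Connect to an IMAP server over TLS and log in."""
    conn = imaplib.IMAP4_SSL(host)
    conn.login(username, password)
    return conn
```

A client would call fetch_unseen(open_mailbox(...)) periodically, or keep the connection open and let the server announce new mail.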
Step 10: The Email client parses the Email
The email client then parses the email to retrieve the details about the sender, the subject line, the contents and attachments. These are then shown to the recipient. It also shows a notification to the recipient.
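Parsing can be sketched with Python's email package as well. The raw message below is a made-up example, and summarise is a hypothetical helper that pulls out the pieces a client would display.

```python
from email import message_from_bytes, policy

# A tiny made-up message, as it might arrive from the mail server.
raw = (
    b"From: alice@example.com\r\n"
    b"To: john.smith@gmail.com\r\n"
    b"Subject: Hello\r\n"
    b"Content-Type: text/plain; charset=utf-8\r\n"
    b"\r\n"
    b"Hi John!\r\n"
)


def summarise(raw_message: bytes) -> dict:
    """Extract sender, subject, text and attachment names from a raw email."""
    msg = message_from_bytes(raw_message, policy=policy.default)
    body = msg.get_body(preferencelist=("plain",))  # the plain-text part, if any
    return {
        "from": str(msg["From"]),
        "subject": str(msg["Subject"]),
        "text": body.get_content().strip() if body is not None else "",
        "attachments": [part.get_filename() for part in msg.iter_attachments()],
    }


print(summarise(raw))
```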
Conclusion
Email is one of the most commonly used communication methods over the Internet. Emails are simple, yet powerful and have been around for a long time. Even with the proliferation of instant messaging, Email still serves its purpose and remains ubiquitous.
Further reading
- Video
- ‘How email works’ free chapter from my Udemy course on setting up your own email server
- Other blogs