Consider that you are preparing for a party at your home. There are several tasks to be done. Walls are to be decorated, plates are to be washed, food is to be made ready and you may need an extra shoe rack for guests to leave their footwear. Most probably, you do not do everything alone. You get the help of the members of your household. In fact if the load is too much, then you even ask your friends for help.
When more people work on the tasks for organising a party, you get multiple things done in parallel. But with more participation comes the need for co-ordination and communication. Some of the tasks may depend on each other, e.g. someone who has promised to wipe washed dishes and stack them on your shelf will need the person washing the dishes to finish first. Usually, one of the participants needs to stay on top of what gets done and who does what, like the captain of a sports team. When everyone communicates the status of their tasks smoothly and one or two people are aware of each task’s status, things go quite well despite the complexity.
Modern computers work the same way. They have multiple processors capable of executing multiple things at the same time. Even a single processor is time-sliced by the operating system, i.e. its attention is divided among several processes / threads in rapid turns. Processes need to communicate with each other. One common way to do this is a system called a message queue.
Message queue
Whenever one process wants to communicate with another, the sender process sends out a piece of communication to the receiver. This piece is called a message. The receiving process needs to have a system to receive and process incoming messages, just like you have an inbox for email and a postbox for your home. This system is called a message queue.
Here is how we can describe an application driven by a message queue.
- Your application has a main process that is responsible for whatever your application does. As an example, let’s say that your application is a stock ticker that shows the current price for 100 stocks.
The main process is only responsible for displaying the stock prices and staying responsive to the user’s touch inputs. The algorithm that gets the latest price updates is intensive and can slow down your application. Hence you have separated it into another process. This separate process will use your main process’s message queue to post stock price updates as soon as they are available.
- Another process running in parallel to your main process is the message queue.
Here is what the message queue consists of.
- It contains a bucket into which new messages are dropped when sent by another process. For practicality, the size of the bucket is limited and once the limit is breached, incoming messages are dropped and lost, just like an overflowing bucket. Capacity management is a huge challenge in message queue systems, especially when the system is running on a small, low-memory device like a Raspberry Pi.
- The algorithm of the message queue itself is an unending loop. It has only one job for eternity.
- Look inside the bucket for new messages.
- If there are messages, then extract them and pass them to the main process so that it can deal with the message. In our application, the messages will be the updated prices of stocks. These messages will fly in at incredible speeds as the stock market updates prices nearly every second.
- Just like your email inbox has an email address and your home’s postbox has a postal address, all message queues have a well-known address that is known to all other processes.
- A message contains two parts:
- An instruction or code that usually tells the main process what to do. For our stock example, one possible instruction is UPDATE, which tells the main process to update the price shown on its display. Another possible instruction is CIRCUIT-BREAK, which says that a stock’s price is too volatile and that the stock market has frozen trading on that stock. The main application can use this message to put a special icon on the screen against that stock.
- A message can have data that is used by the main process to carry out the instruction. E.g. the UPDATE instruction should have the name of the stock and the updated price, so that the main process can update the appropriate row on the screen. The CIRCUIT-BREAK instruction should have the name of the stock.
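The pieces described above can be put together in a minimal Python sketch. It is an illustration under several assumptions, not a definitive implementation: threads stand in for processes, a message is a hypothetical (instruction, data) tuple, the stock names and prices are made up, and a STOP sentinel is added only so that the otherwise unending loop can finish in a demo.

```python
import queue
import threading

# Assumption of this sketch: a sentinel message that ends the loop for demo purposes.
STOP = ("STOP", None)


def post(inbox, message):
    """Drop a message into the bucket; return False if the bucket is full."""
    try:
        inbox.put_nowait(message)
        return True
    except queue.Full:
        # The bucket overflowed, so this message is lost.
        return False


def handle(instruction, data, displayed, frozen):
    """What the main process does with one message."""
    if instruction == "UPDATE":
        displayed[data["stock"]] = data["price"]
    elif instruction == "CIRCUIT-BREAK":
        frozen.add(data["stock"])


def message_loop(inbox, displayed, frozen):
    """Look inside the bucket and pass each message on to the handler."""
    while True:
        message = inbox.get()  # blocks until a message is available
        if message is STOP:
            break
        instruction, data = message
        handle(instruction, data, displayed, frozen)


def price_fetcher(inbox):
    """Hypothetical sender that posts made-up price updates."""
    post(inbox, ("UPDATE", {"stock": "ACME", "price": 101.5}))
    post(inbox, ("UPDATE", {"stock": "ACME", "price": 102.0}))
    post(inbox, ("CIRCUIT-BREAK", {"stock": "GLOBEX"}))
    post(inbox, STOP)


def run_demo():
    inbox = queue.Queue(maxsize=100)  # the bucket, with a limited size
    displayed, frozen = {}, set()
    loop = threading.Thread(target=message_loop, args=(inbox, displayed, frozen))
    sender = threading.Thread(target=price_fetcher, args=(inbox,))
    loop.start()
    sender.start()
    sender.join()
    loop.join()
    return displayed, frozen
```

Calling run_demo() returns the prices and frozen stocks as the main process would see them after handling every message. Using put_nowait mirrors the overflowing bucket: when the queue is full, the sender is told that the message was dropped instead of being made to wait.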
Programming languages that have in-built message queues
The following systems have the concept of message queues built in:
- JavaScript scripts running in a browser.
- JavaScript applications running on a Node.js web server.
- Applications running on mobile phone platforms, i.e. Android, iOS, Windows Mobile.
- Desktop applications with a user interface on Windows, macOS and Linux.
- Applications written in Erlang programming language.
The above systems support message queues out of the box, so there is no special third-party software that you need to install to unleash communication between processes running in parallel.
Centralised and distributed message queues
Message queues can be of two types: centralised and distributed. In a distributed message queue, each process has its own message queue and every other process needs to know the address of or have a reference to that queue in order to send messages to it. Each process can only read from its own queue. It has no idea what is going on in all other queues.
But what if your messaging needs are more complicated? What if you want to broadcast a message to multiple processes so that they all act on the same signal? E.g. a hard disk running out of space is of concern to all processes that wish to write to the hard disk. What if processes want to opt in and opt out of receiving messages based on certain conditions? E.g. a user may or may not want to receive notifications from an app.
In this case, a single message queue is used. All processes use this queue for their messaging. Such a system is called a centralised queue. Here are some features of a centralised queue.
- A piece of software separate from the ones that require messaging runs the message queue. The address of the queue hosted by this software is well known to the ones that require messaging. This software is called a message broker.
- Any process that requires access to the message queue sets up a connection with the broker and maintains that connection as long as the process runs.
- In addition to the connection, the process also ‘subscribes’ to topics. Whenever another process sends a message whose topic matches a topic subscribed to by our process, then our process will receive that message. Otherwise our process will never come to know about that message. A process can subscribe to multiple topics. E.g. our application’s main process will subscribe to topics with names such as ‘stock updates’ and ‘software updates’. Messages from the first topic will cause our app to get latest stock prices, whereas the ones from the second topic will cause our app to prompt the user that a new version of the app is available for download.
- In addition to messages being sent to all processes that have opted for a topic, there are special messages called broadcast messages that are sent to all processes connected to the broker, without exception. The computer shutting down or the Internet connection going down are good examples of events worth broadcasting.
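The connect, subscribe, publish and broadcast behaviour listed above can be sketched with a toy in-memory broker in Python. This is only an illustration: the Broker class and its method names are hypothetical, everything runs inside one program, and a real broker would be a separate piece of software reached over a connection.

```python
from collections import defaultdict


class Broker:
    """Toy in-memory stand-in for a message broker (hypothetical API)."""

    def __init__(self):
        self.connected = {}                  # client name -> list used as its inbox
        self.subscribers = defaultdict(set)  # topic -> names of subscribed clients

    def connect(self, name):
        """Register a client and return the inbox it will read from."""
        self.connected[name] = []
        return self.connected[name]

    def subscribe(self, name, topic):
        self.subscribers[topic].add(name)

    def unsubscribe(self, name, topic):
        self.subscribers[topic].discard(name)

    def publish(self, topic, message):
        """Deliver only to the clients subscribed to this topic."""
        for name in self.subscribers[topic]:
            self.connected[name].append((topic, message))

    def broadcast(self, message):
        """Deliver to every connected client, whatever its subscriptions."""
        for inbox in self.connected.values():
            inbox.append(("broadcast", message))


broker = Broker()
ticker_inbox = broker.connect("ticker")
logger_inbox = broker.connect("logger")
broker.subscribe("ticker", "stock updates")
broker.subscribe("ticker", "software updates")

broker.publish("stock updates", {"stock": "ACME", "price": 102.0})
broker.broadcast("shutting down")
```

After these calls, the ticker's inbox holds both the stock update and the broadcast, while the logger, which subscribed to nothing, holds only the broadcast.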
A centralised queue is made to solve the following problems
- For a setup that doesn’t support message queues out of the box, a centralised queue is a good way to enable message queues.
- A centralised queue enables broadcasting of messages and sending messages to multiple processes.
- The queue can run on a machine separate from the machine that is hosting the main application. This reduces issues with processing and memory consumption, but brings up fresh problems due to network speed.
How to use a message queue in any programming language
We mentioned five systems in which message queues are built in. What if we want to use message queues in other languages, e.g. in a standard Java or Python program? We have two types of solutions.
- As mentioned in the previous section, we can use a centralised message queue by installing broker software, connecting our Java / Python program to the broker and subscribing to topics.
- A second approach is to use a library called ZeroMQ. Using this library, each process, even in Java and Python, can have its own message queue.
Task queue
So far we have talked about message queues where one process sends a message in the form of a small instruction plus some data to another process. It is up to the receiving process to set up an algorithm that actually uses that instruction and data. This is akin to you asking your computer graphics savvy friend to convert a TIFF image to a JPEG image for you. He has the know-how. All you gave him was an instruction ‘convert’ and the data, i.e. your TIFF file. He’ll know how to open Photoshop and use the Save As menu and the array of options that set the image quality. He’ll message you back when he’s done.
But what if you have hired 5 new trainees who do not know how to use Photoshop? In this case, you will have to tell them how to open a TIFF image, which menus to navigate through to bring up the option to save as JPEG and how to control the quality options to get the desired quality of output. Instead of giving them just an instruction, you’d be giving them detailed steps, like an algorithm. What’s the use of hiring trainees if you have to tell them detailed steps when you can do those steps yourself? Well, what if you have 100 images to convert from TIFF to JPEG? Your work hours are surely too valuable to spend on converting images from one format to another. You may need to document and explain those steps to your trainees once, but they will continue to use those instructions as long as they are trainees with you.
Computer applications can work similarly. Instead of an instruction, the application’s main process can hand out entire algorithms to processes and threads. These processes will do the hard work while our application’s main process can stay responsive to the user.
Similar to message queues, processes can have their own task queues. Other processes can fill up the task queue with algorithms, which the process with the task queue will pick up and execute as is. This is similar to the trainees executing your detailed instructions to the dot.
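A minimal Python sketch of this idea follows, under some assumptions: a thread stands in for the process that owns the task queue, a task is a hypothetical (function, arguments) tuple, and rename_to_jpeg is a made-up stand-in that only changes a file name instead of converting an image. A None sentinel is added so the worker can stop at the end of the demo.

```python
import queue
import threading


def worker(tasks, results):
    """Pick up each task from the queue and execute it exactly as given."""
    while True:
        task = tasks.get()  # blocks until a task is available
        if task is None:    # sentinel: no more tasks (an assumption of this sketch)
            break
        func, args = task
        results.append(func(*args))


def rename_to_jpeg(filename):
    """Hypothetical stand-in for a real TIFF-to-JPEG conversion."""
    return filename.rsplit(".", 1)[0] + ".jpg"


def run_tasks(jobs):
    """Hand a list of (function, arguments) tasks to a single worker thread."""
    tasks = queue.Queue()
    results = []
    thread = threading.Thread(target=worker, args=(tasks, results))
    thread.start()
    for job in jobs:
        tasks.put(job)
    tasks.put(None)
    thread.join()
    return results


converted = run_tasks([
    (rename_to_jpeg, ("party.tiff",)),
    (rename_to_jpeg, ("guests.tiff",)),
])
```

The difference from the message queue is what travels through the queue: here the sender supplies the function to run, so the worker needs no knowledge of its own about converting images.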
Task queues are supported out of the box by Java’s Executor framework and by Android. In JavaScript, callback functions and promise handlers are scheduled through the task queues of the event loop. A third-party tool named Celery enables task queues for Python programs.
Conclusion
To get things done at the double, programming languages offer threading and multi-tasking. But this makes it necessary for multiple processes to communicate with each other. Message queues and task queues offer a solution by allowing processes to send short instructions, status messages or even entire algorithms to each other.