My wife Priya and I went on a year-long trip named India 360. We clicked tens of thousands of photos during the trip and shared them on our Facebook page and Instagram channel. But we realised that the quality of the photos we shared wasn't high. Sure, the resolution was great and most of the photos were good. But we weren't getting our photos to look like what professional travel photographers produce.
Late last year we met Aravind, my brother-in-law (Priya's brother), who is an excellent photographer. He is also good with post-processing using software tools like Snapseed and Lightroom. In the span of half an hour, Aravind taught me how to make my travel photos look good… really good. He didn't fiddle with gimmicky settings, nor use jargon. He taught me 5… just 5… steps that make every photo look great after post-processing. There was a bonus 6th step, which should be used sparingly.
Since then, I have learnt his principles and edited hundreds of photos from our travels, making them look much better than the original shots. I have even added some steps of my own to the process. I edit my photos in two places. On my Android phone, I use an app called Snapseed. I use neither macOS nor Windows, and Adobe Lightroom doesn't work on Ubuntu Linux, so my desktop photo-processing app of choice is GIMP.
…thus goes a famous nursery rhyme from our childhood. How is it relevant to version control? In this post, you are going to imagine yourself as an author of the aforementioned nursery rhyme, working with a few colleagues. Using that example, we will see how version control software works.
In the last article, Introduction to clean architecture: Part 1, we saw how clean architecture is a set of principles for designing software such that the purpose of a program is clear from its code, while the details of which tools or libraries were used to build it are buried deeper, out of sight of the person viewing it. This is in line with real-world products such as buildings and kitchen tools, where a user knows what they are seeing rather than how it was made.
In this article, we will see how a very simple program is designed using clean architecture. I am going to present only the blueprint of the program. I won't use any programming language, staying true to one of the principles of clean architecture: it doesn't matter which programming language is used.
The simple program
In our program, our system will receive a greeting of ‘Hi’ from the user and greet him/her back with a ‘Hello’. That’s all we need to study how to produce a good program blueprint with clean architecture.
Where do we start?
I have outlined this in the post An effective 2-phase method to gather client requirements. When given a problem, we must always start with who the users are and how the system works from their points of view. Based on the users, we should build possible use cases.
In our system, we have a single user who greets our system. Let’s call him/her the greeter. Let’s just use the word ‘system’ to describe our greeting application. We have just one use case in our system, which we can call ‘Greet and be greeted back’. Here’s how it will look.
The greeter greets our system.
On receiving the greeting ‘Hi’ (and only ‘Hi’), our system responds with ‘Hello’, which the greeter receives.
Any greeting other than ‘Hi’ will be ignored and the system will simply not respond.
This simple use case has two aspects.
It comprehensively covers every step in the use case, including all inputs and outputs. It distinctly says that only a greeting of ‘Hi’ will be responded to and that other greetings will be ignored without response. No error messages, etc.
The use case also has obvious omissions. The word ‘greet’ is a vague verb which doesn’t say how the greeting is done. Does the greeter speak to the system and the system speak back? Does the greeter type at a keyboard or use text and instant messaging? Does the system respond on the screen, shoot back an instant message or send an email? As far as a use case is concerned, those are implementation details, the decisions for which can be deferred until much later. In fact, input and output systems should be plug-and-play, where one system can be swapped for another without any effect on the program’s core working, which is to be greeted and to greet back.
The EBI system
Once the requirements are clear, we start with the use cases. The use case is the core of the system we are designing, and it is converted into a system of parts known as the EBI or Entity-Boundary-Interactor. There are five components within the EBI framework. Every use case in the system is converted to an EBI using these five parts.
Interactor (I): The interactor is the object which receives inputs from the user, gets work done by entities and returns the output to the user. The interactor sets things in motion, like the conductor of an orchestra, to make the execution of a use case possible. There is exactly one interactor per use case in the system.
Entities (E): The entities contain the data, the validation rules and the logic that turn one form of input into another. After receiving input from the user, the interactor uses different entities in the system to achieve the output that is to be sent to the user. Remember that the interactor itself must NEVER directly contain the logic that transforms input into output. In our use case, the interactor uses the services of an entity called GreetingLookup. This entity contains a table of which greeting from the user should be responded to with which greeting. Our lookup table contains only one entry right now, i.e. a greeting of ‘Hi’ should be responded to with ‘Hello’.
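The article deliberately stays language-agnostic, but a minimal sketch can make the interactor/entity split concrete. Here is one possible rendering in Python; GreetingLookup is the entity named above, while the other class and method names are my own, not prescribed by clean architecture.

```python
class GreetingLookup:
    """Entity: holds the rule of which greeting earns which response."""
    _table = {"Hi": "Hello"}

    def response_for(self, greeting):
        # None signals 'ignore silently', as the use case demands
        return self._table.get(greeting)


class GreetInteractor:
    """Interactor: accepts the request model (plain text), delegates the
    lookup to the entity and returns the response model (plain text)."""
    def __init__(self, lookup):
        self._lookup = lookup

    def greet(self, text):
        return self._lookup.response_for(text)


if __name__ == "__main__":
    interactor = GreetInteractor(GreetingLookup())
    print(interactor.greet("Hi"))     # Hello
    print(interactor.greet("Howdy"))  # None: ignored, no error message
```

Note that the interactor contains no transformation logic of its own; it only orchestrates the entity.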
Usually, in a system meant to automate or bring online some real-world process, entities closely resemble the names, properties and functionality of their real-world equivalents. E.g. in an accounting system, you’ll have entities like account, balance sheet, ledger, debit and credit. In a shopping system, you’ll have shopping cart, wallet, payment, items and catalogues of items.
Boundaries (B): Many of the specifications in a use case are vague. The use case assumes that it receives input in a certain format regardless of the method of input. Similarly, it sends out output in a predetermined format, assuming that the system responsible for showing it to the user will format it properly. Sometimes, an interactor or some of the entities will need to use external services to get some things done. The access to such services is in the form of a boundary known as a gateway.
E.g., in our use case, our inputs and outputs may come in several forms, such as typed or spoken input. The lookup table may seek the services of a database. Databases are an implementation detail that lies outside the scope of the use case and the EBI. Why? Because we may even use something simpler, such as an Excel sheet or a CSV file, to create the lookup table. Using a database is an implementation choice rather than a necessity.
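A rough sketch of that idea: the use case depends only on an abstract lookup boundary, while the concrete source stays swappable. LookupGateway, CsvLookupGateway and the greetings.csv file are hypothetical names of mine, not from the article.

```python
from abc import ABC, abstractmethod
import csv

class LookupGateway(ABC):
    """Boundary: all the use case knows about where greetings live."""
    @abstractmethod
    def response_for(self, greeting):
        ...

class CsvLookupGateway(LookupGateway):
    """One concrete, swappable choice: greeting/response pairs in a CSV
    file, e.g. a greetings.csv containing the line: Hi,Hello"""
    def __init__(self, path):
        with open(path, newline="") as f:
            self._table = {row[0]: row[1] for row in csv.reader(f)}

    def response_for(self, greeting):
        return self._table.get(greeting)
```

Swapping in a real database later means writing another LookupGateway subclass; the use case never changes.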
Request and response model: While not part of the EBI abbreviation, request and response models are important parts of the system. A request model specifies the form in which data should be sent across the boundaries when requests and responses are made. In our case, the greeting from the user to the system and vice-versa should be sent in the form of plain English text. This means that if our system works on voice-based inputs and outputs, the voice must be converted to plain English text and back.
Controllers
With our EBI system in place to take care of the use case, we must realise that ultimately the system will be used by humans, and that different people have different preferences for communication. One person may want to speak to the system, while another prefers instant messaging. One person may want to receive the response as an email message, while another may prefer the system to display it on a big flat LCD with decoration.
A controller is an object which takes the input in the form the user gives and converts it into the form required by the request model. If a user speaks to the system, then the controller’s job is to convert the voice to plain English text before passing it on to the interactor.
Presenters
On the other side is a presenter that receives plain text from the interactor and converts it into a form that can be used by the UI of the system, e.g. a large banner with formatting, a spoken voice output, etc.
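A minimal sketch of a controller/presenter pair around the earlier interactor. The speech_to_text function is an assumed, injected dependency standing in for any voice-recognition engine; none of these names come from the article.

```python
class VoiceController:
    """Controller: turns raw user input (audio) into the request model
    (plain text) and hands it to the interactor."""
    def __init__(self, interactor, speech_to_text):
        self._interactor = interactor
        self._to_text = speech_to_text   # injected; any engine will do

    def handle(self, audio):
        text = self._to_text(audio)      # raw input -> request model
        return self._interactor.greet(text)

class BannerPresenter:
    """Presenter: turns the response model (plain text) into a form the
    UI can display, here a decorated banner string."""
    def present(self, response):
        if response is None:
            return ""                    # the system stays silent
        return f"*** {response.upper()} ***"
```

The interactor is untouched by either class; swap VoiceController for a keyboard controller or BannerPresenter for an email presenter and the core still works.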
Testability
Being able to test individual components is a big strength of the clean architecture system. Here are the ways in which the system can be tested.
Use case: Since the use case in the form of the EBI is separated from the user interface, we can test the use case without having to input data manually through keyboards. Testing can be automated by using a tool that can inject data in the form of the request model, i.e. plain text. Likewise, the response from the use case can be easily tested, since it too is plain text. Individual entities and the interactor can also be separately tested.
Gateway: The gateways, such as databases or API calls, can be individually tested without having to go through the entire UI and use case. One can use tools that inject mock data to see if the inputs to and outputs from databases and services on the Internet work correctly.
Controllers and presenters: Without involving the UI and the use case, one can test whether controllers are able to convert input data to the request model correctly, or whether presenters are able to convert the response model to output data.
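A sketch of what such an automated use-case test could look like, using Python's unittest. The entity and interactor from the earlier sketch are restated here so the snippet runs on its own.

```python
import unittest

class GreetingLookup:                       # entity from the earlier sketch
    def response_for(self, greeting):
        return {"Hi": "Hello"}.get(greeting)

class GreetInteractor:                      # interactor from the earlier sketch
    def __init__(self, lookup):
        self._lookup = lookup
    def greet(self, text):
        return self._lookup.response_for(text)

class GreetUseCaseTest(unittest.TestCase):
    """No UI involved: data is injected in request-model form (plain text)
    and the plain-text response is checked directly."""
    def setUp(self):
        self.interactor = GreetInteractor(GreetingLookup())

    def test_hi_is_answered_with_hello(self):
        self.assertEqual(self.interactor.greet("Hi"), "Hello")

    def test_other_greetings_are_ignored_silently(self):
        self.assertIsNone(self.interactor.greet("Good morning"))

if __name__ == "__main__":
    unittest.main()
```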
Freedom to swap and change components
Interactors: Changes to the interactors are usually absorbed well by the rest of the system. Interactors are algorithms and pieces of code that bind the other components together, typically a sequence of steps describing what to do. Changes to the steps do not change any functionality in the other components of the system.
Entities: Entities are components that contain a lot of data and rules relevant to the system. Changes to entities will usually lead to corresponding changes in the interactor to comply with the new rules.
Boundaries: Boundaries are agreements between the interactor and external components like controllers, presenters and gateways. A change to a boundary will inevitably change some code in the external components, so that the new boundary can be complied with.
UI: With a solid use case in place, you can experiment with various forms of UI to see which one is most popular with your users. You can experiment with text, email, chat, voice, banner, etc. The use case and the gateway do not change. However, some changes to the UI can cause a part of the controller and the presenter to change, since these two are directly related to how the UI works.
Controller and presenter: It is rare for the controller or presenter to change in its own right. A change to the controller or presenter usually means that the UI or the boundary has also changed.
Conclusion
Clean architecture separates systems such that the functionality is at the core of the system, while everything else, like the user interface, storage and the web, is kept at the periphery, where one component can be swapped for another. Hopefully, our example has given you a good idea about how to approach any system with clean architecture in mind.
If you ever walk through the kitchen appliances section of a shopping mall, you will see lemon juicers made of steel, plastic and wood. The raw material doesn’t matter. One glance at the simple tool and you know what it is. The details about what the juicer is made of, who made it and how the holes at the bottom were poked become irrelevant.
Similarly, if you look at a temple, you know that it is a temple. A shopping mall screams at you to come shop inside. Army enclaves have tell-tale layouts, stern looking guards and enough signboards to let you know that you should stay away and avoid trespassing.
Over the past decade, computer graphics have moved increasingly to the world of 3D. With the release of 3D engines like Unreal and Unity, more players are entering the game, figuratively and literally. But it takes a lot to change your perception from a 2D world to one of 3D. It is more than just adding a third axis named Z. We wouldn’t simply want to see flat 2D shapes floating around in our 3D world, would we?
Humans have long recognised their peers by identifying their faces. This process of recognition is several millennia old, and we have lost track of which of our ancestor species actually started it. Other than when telling twins apart, this method has served us quite well and fails very rarely. It is one of those activities that seems to happen instantly, so fast that we are unable to study how it happens.
The ability has recently been brought to machines. Work has gone on for decades, but it is only now that we can expect reasonable performance from our laptops and phones at recognising our faces using the front camera. Even then, there are false positives or embarrassing mismatches. Computers are nowhere near humans when it comes to the speed and accuracy of facial recognition. But they are getting there.
In this post, we shall see the various technologies used to recognise human faces.
Recognition by texture
Texture recognition is the most commonly used form of facial recognition, because it has been around for a while and is cheaper to implement. The image of a face coming through a camera is dissected into a grid. Each cell in the grid deals with a certain section of the face. Specific features, such as the tone of the skin, the presence of birthmarks, etc., are considered and noted down in the appropriate section of the grid. The shapes of the jaw, ears and lips are paid particular attention. The algorithm tries to ignore features like facial hair.
Facial recognition by texture needs only a single camera taking a reasonably close view of the face. However, it is prone to failure for the following reasons. If the lighting in the room changes or is too low, the mapping algorithm gets thrown off or fails completely. And while the algorithm seeks to ignore temporary details like facial hair, ornaments, sunglasses or small cuts, it doesn’t always succeed and may actually record those features as unique identifiers. Recording such temporary data causes the matching algorithm to fail the next time, when the person no longer has those features.
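As a rough illustration of the grid idea, here is a sketch that cuts a grayscale face image into cells and records one simple feature (mean brightness) per cell. Real systems use far richer descriptors; all names here are mine.

```python
import numpy as np

def grid_features(gray_face, rows=8, cols=8):
    """Split a 2-D grayscale face image into rows x cols cells and
    record one feature (mean brightness) per cell."""
    h, w = gray_face.shape
    features = []
    for r in range(rows):
        for c in range(cols):
            cell = gray_face[r * h // rows:(r + 1) * h // rows,
                             c * w // cols:(c + 1) * w // cols]
            features.append(cell.mean())
    return np.array(features)

def distance(face_a, face_b):
    """Smaller distance between feature vectors suggests the same person."""
    return np.linalg.norm(grid_features(face_a) - grid_features(face_b))
```

A change in lighting shifts the brightness of whole regions, which is exactly why this family of methods is sensitive to it.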
Recognition by reflection
This method attempts to build a model of the contours of the face and use that for recognition. There are tiny differences between the shapes of any two individuals’ faces, which can be used to identify and distinguish them. The method is similar to the way sonar or radar works.
A patterned beam of light in the invisible spectrum (such as infrared), a steady sound inaudible to the human ear, or radio waves are sent toward the face of the person. The reflections from the face are picked up by sensors on the face detection device. The device then measures the time after which each beam, sound or wave that was sent is received. If the reflection takes less time, that part of the face must be closer; a reflection that takes more time must be coming from a part of the face that is farther away. This data is then used to build the model of the contours of the face.
For this method to work effectively, the face must be reasonably still, otherwise the calculations are thrown off and the contour is not mapped correctly.
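The timing described above is essentially a time-of-flight calculation. A minimal sketch, assuming an infrared beam travelling at the speed of light; the sample round-trip times are made up for illustration.

```python
SPEED_OF_LIGHT = 299_792_458.0  # metres per second, for an infrared beam

def distance_to_face(round_trip_seconds, speed=SPEED_OF_LIGHT):
    """The signal travels to the face and back, so halve the round trip."""
    return speed * round_trip_seconds / 2.0

# Each sensor reading becomes one depth sample; many samples across the
# face together trace its contours.
nose_tip = distance_to_face(2.0e-9)  # shorter trip: closer point
cheek    = distance_to_face(2.4e-9)  # longer trip: farther point
print(nose_tip < cheek)              # True, the nose sticks out
```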
Recognition from several angles
In this method, multiple cameras are placed around the face at different angles and multiple images are captured. Together, these images can be used to build a 3D model of the face. The more cameras used, the more accurate the model. Between three and six cameras are generally used: fewer than three cannot build a good model, and more than six is overkill.
Thermal detection
Thermal detection uses a heat-sensitive camera to create a heat map of the face. Heat maps differ from face to face.
This is an emerging technology, already used in zoology for tracking animals in the dark. The human body is warm, but different organs, and different parts of each organ, are warmer or colder than one another. The warm and cold areas of the face vary from person to person, and this variation can be used for identification and differentiation.
A heat sensor generates a heat map of the face by detecting warm and cool areas. The warmer areas are recorded as brighter shades and the cooler ones as darker shades.
The advantage of this method is that it works even in the dark. However, the body’s heat signature can vary due to changing weather, sickness or exposure to places of different temperatures. This can cause variations in the heatmap of the same person.
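As a rough sketch of how raw readings could become the bright and dark shades described above, here is a simple min-max normalisation; the sample temperatures are made up.

```python
import numpy as np

def heat_map(temperatures):
    """Map a 2-D grid of temperature readings to 0-255 grayscale:
    warmer areas become brighter shades, cooler areas darker ones."""
    t = np.asarray(temperatures, dtype=float)
    lo, hi = t.min(), t.max()
    return ((t - lo) / (hi - lo) * 255).astype(np.uint8)

readings = [[30.1, 31.5], [33.0, 34.2]]  # made-up skin temperatures, deg C
print(heat_map(readings))                # 0 for coolest, 255 for warmest
```

Per-face normalisation like this also hints at the weakness noted above: if the whole face runs hotter or colder on a given day, the relative map can shift.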
Which one to use?
None of the above methods is 100% accurate on its own. Each method is used in combination with another. Sometimes they are used simultaneously and other times, the failure of one method leads to a fallback to another method.
Overall, facial recognition is not as accurate as fingerprint or optical scans. I would give this technology five more years to mature. Facial recognition cannot be used as a mainstream authentication method yet.
Conclusion
While we as humans need no time to distinguish among the people we know and to identify them, a computer is just getting started with facial recognition and there are several years to go before the technology matures. It will be a while before a computer can put a name to a face or a face to a name.
If you have ever visited a government office, you have probably been directed from one counter to another to get a task done. Why doesn’t the same person do everything? It is because the work is divided into small tasks and each government official is given the responsibility of only one task. Once done, that official directs you to the next one. You are seeing the chain of responsibility design pattern in action.
In the last two posts, we saw how to use BHIM and PayTM with a smartphone. But what if the paying customer does not have a smartphone? Has India’s rapidly advancing digital payments pioneer NPCI (National Payments Corporation of India) considered a way to include customers without smartphones? It turns out it has. Of course, terms and conditions apply, but NPCI has made a way for non-smartphone customers to pay digitally. Enter Aadhar Pay.
BHIM stands for Bharat Interface for Money. It is an app that uses the Indian government-initiated UPI (Unified Payments Interface) to transfer money between two bank accounts in real time. The Indian government’s arm, NPCI (National Payments Corporation of India), is responsible for the development and maintenance of UPI and BHIM.
In the last post, we saw what the Agile methodology is and read its manifesto. We saw an example of how Agile can be used by a tailor, Bharat, to sew a shirt for a customer, Aditya. While we talked about Agile through a story, there are formal ways of planning and keeping track of projects using Agile. Two of them are Scrum and Kanban.