
System Design 101

A step-by-step guide to designing a system

Photo by Jonathan Singer on Unsplash

Nowadays, system design interviews are part of the hiring process at top tech companies. Many people are afraid of them because there is no fixed pattern to prepare for. Besides, system design questions are open-ended, with no standard right or wrong answer, which makes preparation even harder.

System design is one of the most important and most feared aspects of software engineering. One of the main reasons is that everybody seems to have a different approach; there are no clear step-by-step guidelines.

The struggle of software engineers with system design can be divided into two parts:

The system design process is unstructured by nature; no answer can be judged as simply right or wrong.
Many software developers lack experience in developing complex, large-scale systems.

In this article, we'll go through the steps for approaching a design problem. This guideline, which is based on my experience from architecture courses, may help you design a system.

★Step 1: Requirements clarifications

We need to clarify the goal of the system. System design is a vast topic; if we don't narrow it down to a specific goal, the design becomes complicated, especially for newcomers. Constraints are often good for the system: they help us focus on the main feature we are trying to design and resolve ambiguities about the system's features. We may divide the requirements into two parts:

Functional requirement:

These are the requirements that the system has to deliver. We may say they are the main goal of the system. Here a function is described as a specification of behavior between inputs and outputs. These requirements should make clear what the system's inputs are and what outputs it should produce.

Non-Functional requirement:

Now for the more significant requirements that need to be analyzed. If we don't fulfill these requirements, the project's business plan will suffer. They constrain the system design through different system qualities.

Performance, modifiability, availability, scalability, reliability, etc. are important quality requirements in system design. These ‘ilities’ are what we need to analyze for a system and determine if our system is designed properly.

★Step 2: Estimation of important parts

One of the important points of system design is knowing the scale of the system. What does scale actually mean when you are designing a system? It can be measured by quantities such as the number of article views or the number of timelines generated per second. If these numbers are very large, we are dealing with a high-scale system.

Another important estimate is storage. We need to know how much storage the system will need over, say, 5 years. The real figure will only grow, but we still need an estimate, because it guides the choice of data storage.

In the system design of a URL shortening service, for example, the calculation may look like this:

Let's assume the system stores every URL shortening request and its shortened link for 5 years. As we expect 500M new URLs every month, the total number of objects to store will be 500M * (5 * 12) months = 30B. Now let's assume that each stored object is approximately 100 bytes. We will need total storage of 30 billion * 100 bytes = 3 TB.
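The same back-of-the-envelope estimate can be written as a few lines of Python. This is only a sketch of the arithmetic above: the function names are made up for illustration, and the writes-per-second figure additionally assumes a 30-day month, which the text does not state.

```python
# Back-of-the-envelope estimate for the URL shortener example.
NEW_URLS_PER_MONTH = 500_000_000  # from the text: 500M new URLs per month
RETENTION_YEARS = 5               # from the text: keep data for 5 years
BYTES_PER_OBJECT = 100            # from the text: ~100 bytes per object


def estimate_storage(urls_per_month, years, bytes_per_object):
    """Return (total_objects, total_bytes) for the retention period."""
    total_objects = urls_per_month * years * 12
    total_bytes = total_objects * bytes_per_object
    return total_objects, total_bytes


def writes_per_second(urls_per_month, days_per_month=30):
    """Average write rate; the 30-day month is an assumption."""
    return urls_per_month / (days_per_month * 24 * 3600)


total_objects, total_bytes = estimate_storage(
    NEW_URLS_PER_MONTH, RETENTION_YEARS, BYTES_PER_OBJECT
)
print(f"objects: {total_objects / 1e9:.0f} B")
print(f"storage: {total_bytes / 1e12:.0f} TB")
print(f"writes/s: {writes_per_second(NEW_URLS_PER_MONTH):.0f}")
```

Keeping the inputs as named constants makes it easy to redo the estimate when an assumption changes, for example a larger object size.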

Now you have an idea of how much storage is needed, and that gives you a starting direction for the data flow.

Network bandwidth usage is also an important factor. In the case of distributed systems, bandwidth usage management is crucial. For example, if you want to efficiently handle file transfer, you may need to divide a file into chunks.

Figure: Transfer only the updated chunk (Image by Author)

If we transfer the whole file every time it is updated, a 100MB file costs a 100MB transfer on every change. Say, for example, that we divide files into 2MB chunks and transfer only the modified chunks, as shown in the figure. This decreases bandwidth consumption and cloud storage usage for the user.
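A minimal sketch of this idea in Python follows. The article does not say how modified chunks are detected, so comparing per-chunk SHA-256 hashes is an assumption here, and the function names are hypothetical.

```python
import hashlib

CHUNK_SIZE = 2 * 1024 * 1024  # 2MB chunks, as in the example


def split_into_chunks(data, chunk_size=CHUNK_SIZE):
    """Split a byte string into fixed-size chunks (last one may be short)."""
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def chunk_hashes(data, chunk_size=CHUNK_SIZE):
    """Hash each chunk so versions can be compared without the old bytes."""
    return [
        hashlib.sha256(chunk).hexdigest()
        for chunk in split_into_chunks(data, chunk_size)
    ]


def changed_chunks(old_hashes, new_data, chunk_size=CHUNK_SIZE):
    """Return indices of chunks in new_data that are new or modified."""
    new_hashes = chunk_hashes(new_data, chunk_size)
    return [
        i for i, h in enumerate(new_hashes)
        if i >= len(old_hashes) or old_hashes[i] != h
    ]


# Tiny demonstration with 4-byte chunks instead of 2MB ones.
old_version = b"aaaa" + b"bbbb" + b"cc"
new_version = b"aaaa" + b"bXbb" + b"cc"
to_upload = changed_chunks(chunk_hashes(old_version, 4), new_version, 4)
print(to_upload)  # [1]
```

One limitation of fixed-size chunks is that inserting bytes near the start of a file shifts every later chunk boundary, so all following chunks look modified.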

★Step 3: Data Flow

We need to define the system’s data model and how data will flow between different system components. We need to figure out the entities of the system and different aspects of data management.

For newbies to system design, please remember, “If you are confused about where to start for the system design, try to start with the data flow.”

Here are some entities for a service like Medium:

User: UserID, Name, Email, etc.

Article: ArticleID, ContentOfArticle, TimeStamp, NumberOfClaps, etc.

UserFollow: UserID1, UserID2

Followers: UserID3, UserID4
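As a rough illustration, the first three entities could be sketched as Python dataclasses. The field types, the `author_id` field, and the reading of `UserFollow` as "UserID1 follows UserID2" are assumptions made for this sketch, not something the list above specifies.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class User:
    user_id: int
    name: str
    email: str


@dataclass
class Article:
    article_id: int
    author_id: int  # assumption: links an article to the user who wrote it
    content: str
    timestamp: datetime
    number_of_claps: int = 0


@dataclass(frozen=True)
class UserFollow:
    follower_id: int  # assumption: UserID1, the user who follows
    followee_id: int  # assumption: UserID2, the user being followed


def followers_of(user_id, follows):
    """Return the IDs of all users who follow the given user."""
    return [f.follower_id for f in follows if f.followee_id == user_id]


follows = [UserFollow(1, 2), UserFollow(3, 2), UserFollow(2, 1)]
print(followers_of(2, follows))  # [1, 3]
```

Writing the entities down like this, even informally, tends to surface questions early, such as which lookups need to be fast and which fields will grow without bound.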

Database selection is part of this step. Choosing between a NoSQL and a SQL database is a common decision. We may also need to decide what kind of storage to use for photos and videos.

★Step 4: High-level Component design

Designing the whole system in one go is a tough task. So, it's better to break it into high-level components first, and then break those components down into a detailed design.

Try to draw a block diagram representing the core components of the system in 5–6 parts. It can be more if the system is very large. There is no fixed rule for how many components the system should be divided into. Just remember that we need to identify enough components to solve the system's actual problems.

Here is a high-level diagram for a file storage and synchronization service like Google Drive.

Figure: High-Level Design of Google Drive (Image by Author)

The File Processing Server will manage the file processing workflow. The Metadata Server will take care of file information, chunk sizes, and user information. The Notification Server will notify the client applications on all the other devices where the user is logged in when a file is updated. Cloud Storage will keep the files stored.

We can then break down these components for a further detailed design according to the system’s requirements.

★Step 5: Detailed design

For the last step, we need to dig deeper into major components that are important for achieving the system’s quality requirements.

In this step, we can analyze different approaches to solving a problem, weigh their pros and cons, and explain why we prefer one approach over another. Tradeoff analysis is an important part of this step. Here are some examples:

Since we need to store huge amounts of data, we may need to partition the data and distribute it across multiple databases. This raises questions such as how to handle a celebrity profile, that is, a user who has a very large number of followers.

How much data do we need to cache to speed up the system's response time? Where should we use a load balancer?
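To make the partitioning question concrete, here is a minimal sketch of hash-based partitioning in Python. The article does not prescribe a partitioning scheme, so the scheme itself, the `shard_for` name, and the shard count are assumptions chosen for illustration.

```python
import hashlib
from collections import Counter


def shard_for(key, num_shards):
    """Map a key (for example a user ID) to a shard index.

    A stable hash is used instead of Python's built-in hash(), which
    is randomized per process for strings.
    """
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


NUM_SHARDS = 4  # hypothetical number of databases

# Count how 1,000 hypothetical user IDs spread across the shards.
distribution = Counter(
    shard_for(f"user-{i}", NUM_SHARDS) for i in range(1000)
)
print(sorted(distribution.items()))
```

Two tradeoffs are worth raising with this approach. Changing `num_shards` remaps most keys, which is why consistent hashing is often considered instead. And spreading keys evenly does not spread load evenly: all data for one celebrity user still lands on a single shard.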

Here is an example of a detailed design of a cloud file storage service like Google Drive.

★Step 6: Identify bottlenecks and resolve them

Now, we have a detailed design of the system. We have to find the bottlenecks of the system and find different ways to mitigate them. For example:

If there is any single point of failure in our system, we need to remove it. A single point of failure can cause availability issues, which are a huge concern.

Figure: Availability in system design (Image by Author)

We need to have enough replicas of the data to keep serving our users if we lose a few servers. If the data has no replica and is lost for some reason, the system can no longer serve it, which is a reliability issue.
Similarly, we need to have enough copies of each service running so that a few failures do not cause a total shutdown of the system.
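A quick way to reason about how many replicas are "enough" is the standard formula for redundant components: if each copy is available with probability a and failures are independent, at least one of n copies is available with probability 1 - (1 - a)^n. The sketch below applies it in Python; the 99% figure is an illustrative assumption, and real failures are often correlated, so treat the results as an optimistic bound.

```python
def combined_availability(single, replicas):
    """Availability of a group where any one healthy replica suffices.

    Assumes replicas fail independently of each other.
    """
    if not 0.0 <= single <= 1.0:
        raise ValueError("single must be between 0 and 1")
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    return 1.0 - (1.0 - single) ** replicas


# Hypothetical server that is up 99% of the time.
for n in (1, 2, 3):
    print(n, f"{combined_availability(0.99, n):.6f}")
```

Under these assumptions, going from one copy to two moves availability from 99% to about 99.99%, which is why removing single points of failure pays off so quickly.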