From Monolithism to Efficiency

Coal

In this article, we will delve into the main techniques for scaling monolithic applications, especially those built on the MVC (Model-View-Controller) pattern. Along the way, you will discover essential strategies for improving the performance and growth capacity of these traditional applications, and you will be better prepared to tackle the challenges of scaling your monolith efficiently, making it more robust and capable of handling increasing demand. So, let's embark on this journey of technical improvement and see how to get the most out of monolithic applications today.

Monolithic Pattern

The communication process between a browser and a monolithic application (and its database) is a sequence of interactions that allows the browser to request information from the application, which in turn accesses and manipulates the data stored in the database, and then returns the results to the browser. Here is a step-by-step explanation of the process:

Browser: The user interacts with the browser by entering a web address in the URL bar or clicking on a link to access a specific page. The browser then sends an HTTP (Hypertext Transfer Protocol) request to the server where the application is hosted.

Application: The server receives the HTTP request and forwards it to the corresponding web application. The application is responsible for processing the request and generating an appropriate response.

Database: To fulfill the request, the application may need to access a database. A database is a system that stores and manages information organized into structures, such as tables, so that they can be efficiently retrieved, updated, and manipulated. The application uses the appropriate query language (e.g., SQL) to send queries to the database, retrieving the relevant data for the browser’s request.

Application processing

Based on the information retrieved from the database, the application performs the necessary operations to fulfill the browser’s request. This may involve calculations, data manipulation, user authentication, updating records in the database, among other actions.

Application response

After completing the processing, the application generates a response for the browser. This response usually consists of an HTML document, which contains the structure and content of the web page, along with other resources such as CSS style sheets and JavaScript scripts. The response also includes an HTTP status code, indicating whether the request was successful (e.g., code 200) or if an error occurred.
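To make the cycle above more concrete, here is a minimal sketch of it in Python. Flask and SQLite are used purely as stand-ins for whatever framework and database the monolith actually runs on, and the products table is a made-up example.

```python
# Minimal sketch of the browser -> application -> database -> browser cycle.
# Flask, SQLite, and the "products" table are illustrative stand-ins only.
import sqlite3

from flask import Flask

app = Flask(__name__)

@app.route("/products")
def list_products():
    # The application receives the HTTP request routed to this function
    # and queries the database with SQL (the table is assumed to exist).
    conn = sqlite3.connect("app.db")
    rows = conn.execute("SELECT name, price FROM products").fetchall()
    conn.close()

    # Processing: turn the raw rows into presentable content.
    items = "".join(f"<li>{name}: ${price:.2f}</li>" for name, price in rows)

    # Response: an HTML document plus an HTTP status code (200 = success).
    return f"<html><body><ul>{items}</ul></body></html>", 200

if __name__ == "__main__":
    app.run()  # The browser sends its HTTP requests to this server.
```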


MVC Pattern

Now that we understand the standard architecture of a monolithic application, let’s dive into how it works within an MVC framework like Rails or Laravel. To do this, let’s mention some components commonly found in most MVC frameworks:

Router

The router is responsible for directing the HTTP request received by the application to the appropriate controller. It maps the request’s URL to the corresponding action in the controller.

Controller

The controller is responsible for receiving the request from the router and processing it. It contains the logic to handle browser requests and coordinate the necessary actions.

Action

An action is a specific method within the controller that is called to handle a particular request. The action can receive parameters from the request, interact with the model to perform the necessary operations, and prepare the data that will be passed to the view.

Model

The model represents the data access layer of the application. It deals with retrieving, manipulating, and storing data in the database. The model is responsible for providing an interface for the application to interact with the database.

View

Once the controller has obtained data from the model, it hands that data to the view. The view is responsible for formatting the data and generating the HTML response that will be sent back to the browser. It defines the structure of the page and how the data will be displayed to the user.

These components can be found in many web frameworks, following the Model-View-Controller (MVC) design pattern. It allows for a clear separation of responsibilities, where the router directs the request to the appropriate controller, the controller coordinates the necessary actions, the model interacts with the database, and the view generates the HTML response. This helps with code organization and maintenance, as well as the creation of scalable and flexible web applications.
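To see how these pieces fit together, here is a schematic, framework-free sketch in Python. All of the names (ArticleModel, ArticlesController, the route table) are invented for illustration; a framework like Rails or Laravel implements this plumbing for you.

```python
# Schematic sketch of the Router -> Controller/Action -> Model -> View path.
# All names here (ArticleModel, ArticlesController, ...) are hypothetical.

class ArticleModel:
    """Model: the data access layer (an in-memory list stands in for SQL)."""
    _rows = [{"id": 1, "title": "Scaling monoliths"}]

    @classmethod
    def all(cls):
        return cls._rows

def render_index_view(articles):
    """View: formats the data it receives into an HTML fragment."""
    items = "".join(f"<li>{a['title']}</li>" for a in articles)
    return f"<ul>{items}</ul>"

class ArticlesController:
    """Controller: each public method is an action handling one request."""
    def index(self, params):
        articles = ArticleModel.all()             # the action asks the model for data...
        return 200, render_index_view(articles)   # ...and passes it to the view

# Router: maps a URL to the corresponding controller action.
ROUTES = {"/articles": ArticlesController().index}

def handle_request(path, params=None):
    action = ROUTES.get(path)
    if action is None:
        return 404, "Not Found"
    return action(params or {})

print(handle_request("/articles"))  # -> (200, '<ul><li>Scaling monoliths</li></ul>')
```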

Scaling

Once an application is deployed in production and starts to handle a growing volume of traffic and data, it will likely need to be scaled to keep up with that growth. Now, let's discuss some techniques used for scaling.

Load balancer

Load balancers are used to efficiently distribute network traffic among multiple servers or application instances. They play a crucial role in application scaling, enabling them to handle a higher number of requests and users.

Here’s a general explanation of how load balancers are used to scale applications:

Load distribution
The primary goal of a load balancer is to evenly distribute the workload among multiple servers or application instances. It receives requests from clients and forwards them to available servers. This ensures that no server is overloaded, improving the overall system performance.

Availability detection
Load balancers can also regularly check the availability of servers or application instances. If a server fails or becomes inaccessible, the load balancer can automatically redirect traffic to healthy servers, avoiding service disruptions.

Horizontal scaling
When the demand for an application increases, adding more servers or application instances can be an effective solution. Load balancers facilitate horizontal scaling by allowing new instances to be added to the environment and distributing traffic among them based on capacity and availability.

Algorithm-based load balancing
Load balancers use different algorithms to distribute the load. Some common algorithms include round-robin load balancing (each request is sequentially directed to available servers), weighted load balancing (assigning different weights to each server based on its capacity), and IP-based load balancing (directing requests from the same IP to the same server to maintain persistence).
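As a rough illustration of those three algorithms, the sketch below chooses a backend in plain Python. The server list and weights are made up, and in practice this logic lives inside a dedicated balancer such as NGINX, HAProxy, or a cloud load balancer rather than in application code.

```python
# Toy versions of three load-balancing strategies; SERVERS/WEIGHTS are made up.
import hashlib
import itertools
import random

SERVERS = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
WEIGHTS = [5, 3, 2]  # e.g. the first server has the most capacity

# Round-robin: each request goes to the next server in turn.
_rr = itertools.cycle(SERVERS)
def round_robin():
    return next(_rr)

# Weighted: servers are picked in proportion to their capacity.
def weighted():
    return random.choices(SERVERS, weights=WEIGHTS, k=1)[0]

# IP-based: the same client IP always maps to the same server,
# which also gives a crude form of session persistence.
def ip_hash(client_ip):
    digest = hashlib.sha256(client_ip.encode()).hexdigest()
    return SERVERS[int(digest, 16) % len(SERVERS)]

print(round_robin(), weighted(), ip_hash("203.0.113.7"))
```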

Session persistence
Some applications require a user's session to stay on the same server for the whole interaction with the system. Load balancers can provide session persistence (sticky sessions), ensuring that all requests from a particular user are directed to the same server, even if this makes the distribution of load slightly less even.

SSL offloading
In scenarios where traffic is encrypted using SSL/TLS, load balancers can act as SSL terminators. They receive encrypted requests, decrypt them, and then forward them as plain HTTP to the servers or application instances behind them, reducing the computational load on those servers.

These are just some of the ways load balancers are used to scale applications. They play a crucial role in improving the availability, performance, and reliability of distributed applications, ensuring that resources are used efficiently and traffic is evenly distributed among servers.

Cache

Caches are used to improve query performance in computer systems by storing previously accessed data in a fast-access memory area. When a query is made, the system first checks if the data is present in the cache. If so, the data is returned directly from the cache, avoiding the need to fetch the data from a slower data source, such as a database or an external service. This results in a significant improvement in response time and overall query performance.

Here’s a more detailed explanation of how caches are used to enhance queries:

Cache Hierarchy
Systems often have a cache hierarchy with several levels of cache memory, because cache memory is faster but far more limited in capacity than main memory or disk storage. Caches closest to the processor, such as the L1, L2, and L3 caches in a CPU, have the smallest capacity but the fastest access times. Frequently accessed data is kept in these fastest levels of the hierarchy to maximize performance.

Query Cache
In databases or storage systems, a query cache is used to store results from previous queries. When a query is executed, the system first checks if the same query has been executed before and if the result is in the query cache. If so, the result is returned directly from the cache, avoiding the need to execute the query again. This is especially useful for frequently accessed queries, which can be quickly answered from the cache.
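A minimal cache-aside sketch of that idea is shown below: results are keyed by the query text and expire after a short TTL. The run_query helper, the app.db file, and the 30-second TTL are assumptions for illustration; real systems usually rely on something like Redis or Memcached plus an explicit invalidation strategy.

```python
# Cache-aside sketch for query results; an in-process dict stands in for
# Redis/Memcached. run_query(), app.db, and the 30 s TTL are illustrative.
import sqlite3
import time

TTL_SECONDS = 30
_cache = {}  # query text -> (expires_at, rows)

def run_query(sql):
    conn = sqlite3.connect("app.db")
    try:
        return conn.execute(sql).fetchall()
    finally:
        conn.close()

def cached_query(sql):
    entry = _cache.get(sql)
    if entry and entry[0] > time.monotonic():
        return entry[1]                         # cache hit: skip the database
    rows = run_query(sql)                       # cache miss: go to the database...
    _cache[sql] = (time.monotonic() + TTL_SECONDS, rows)  # ...and remember the result
    return rows
```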

Object Cache
In object-oriented systems, an object cache is used to store objects that have been previously retrieved from a database or other storage. When an object is requested, the system checks if it’s present in the object cache. If so, the object is returned directly from the cache, saving the need to fetch the object from the underlying storage. This is particularly useful when a large number of read operations are performed, as objects can be reused from the cache, eliminating the need for repeated retrieval from storage.

Code Cache
In compiled or interpreted programming languages, a code cache is used to store previously compiled or interpreted program code. When a code section is executed again, the system checks if the code is present in the code cache. If so, the code is executed directly from the cache, avoiding the need for recompilation or re-interpretation. This results in faster execution time for frequently accessed code sections.

In summary, caches enhance queries by storing data, query results, objects, or code previously accessed in a fast-access memory area. By avoiding repetitive searches in slower data sources, caches significantly reduce response time and improve overall system performance.

These two approaches are so common that DevOps teams often treat them as prerequisites for any production application.

Service Decoupling

Service decoupling is another important technique for enabling efficient application scaling. This involves breaking down a monolithic application into smaller, independent services, known as a microservices architecture. Here are some common approaches and practices used by developers to decouple services from a monolith and scale the application:

Identifying bounded contexts
Developers analyze the monolith to identify functional areas or bounded contexts within the application. These domains can correspond to specific resources or functionalities. The idea is to divide the monolith into smaller services, where each service is responsible for a specific bounded context.

Code and feature separation
Based on the identification of bounded contexts, the monolith’s code is separated into different independent components. Developers restructure the existing code so that each service can be deployed and scaled separately. This often involves creating separate code repositories and defining clear interfaces between the services.

Communication between services
Services need to communicate with each other to function as a distributed system. There are various approaches to facilitate this communication. A common option is to use RESTful APIs or gRPC to enable communication between services through network calls. It is also important to define clear and stable contracts for service interfaces, ensuring that changes in one service do not affect integration with other services.
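For example, an orders service might fetch product data from a catalog service over HTTP. The URL, the response shape, and the use of the requests library are all illustrative; the important part is the stable, documented contract between the two services.

```python
# Sketch of one service calling another over a RESTful API.
# The catalog-service URL and the response shape are hypothetical.
import requests

CATALOG_URL = "http://catalog-service.internal/api/products"

def fetch_product(product_id):
    # A clear, versioned contract (paths, fields, status codes) keeps this
    # call stable even as the catalog service changes internally.
    response = requests.get(f"{CATALOG_URL}/{product_id}", timeout=2)
    response.raise_for_status()
    return response.json()  # e.g. {"id": 42, "name": "...", "price": 9.9}
```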

Independent deployment
One of the advantages of microservices is the ability to deploy and scale services independently of each other. Developers can use automated deployment tools like Docker and Kubernetes to create containers for each service and orchestrate their deployment and scaling. This allows each service to be deployed in different environments and scaled according to demand without affecting other services.

Data management
In a microservices architecture, each service often has its own dedicated database or data storage. This allows data to be isolated, and each service can scale its data storage as needed. However, it may be necessary to consider data consistency across services and implement data management strategies such as asynchronous events to maintain data integrity throughout the system.
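The sketch below illustrates the asynchronous-event idea with an in-process queue standing in for a real message broker such as Kafka or RabbitMQ; the service names and the event format are invented for the example.

```python
# Sketch of eventual consistency via asynchronous events. queue.Queue stands
# in for a real broker (Kafka, RabbitMQ, ...); names and payloads are made up.
import queue

events = queue.Queue()

def orders_service_place_order(order_id):
    # The orders service writes to *its own* database (omitted here), then
    # publishes an event instead of reaching into other services' data.
    events.put({"type": "order_placed", "order_id": order_id})

def billing_service_consume():
    # The billing service updates its own store when the event arrives.
    while not events.empty():
        event = events.get()
        if event["type"] == "order_placed":
            print(f"billing: invoice created for order {event['order_id']}")

orders_service_place_order(42)
billing_service_consume()
```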

Monitoring and fault tolerance
With a distributed system, it is crucial to monitor the performance and availability of each service. Developers implement monitoring strategies to collect metrics and logs from each service, identify bottlenecks and performance issues, and proactively take action to resolve problems. Additionally, implementing fault tolerance strategies such as circuit breakers and retries is essential to ensure system resilience in the face of individual service failures.
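Below is a bare-bones sketch of both ideas. The thresholds and timings are arbitrary, and in production you would usually reach for an existing library (for instance tenacity or pybreaker in Python, or resilience4j on the JVM) rather than rolling your own.

```python
# Bare-bones retry and circuit-breaker sketches; thresholds are arbitrary.
import time

def retry(call, attempts=3, delay=0.5):
    """Retry a flaky call a few times with a fixed delay between attempts."""
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise
            time.sleep(delay)

class CircuitBreaker:
    """Stops calling a failing service for a while instead of piling on."""

    def __init__(self, max_failures=5, reset_after=30.0):
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.failures = 0
        self.opened_at = None

    def call(self, fn):
        if self.opened_at is not None:
            if time.monotonic() - self.opened_at < self.reset_after:
                raise RuntimeError("circuit open: failing fast")
            self.opened_at = None   # half-open: let one trial call through
            self.failures = 0
        try:
            result = fn()
        except Exception:
            self.failures += 1
            if self.failures >= self.max_failures:
                self.opened_at = time.monotonic()
            raise
        self.failures = 0
        return result
```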

Decoupling services from a monolith to scale the application is a challenging process but brings significant benefits in terms of scalability, flexibility, and system maintenance. It requires a careful approach to analysis, design, and implementation, ensuring that services are independent, modularized, and can be scaled separately to meet the application’s growth needs.

View Decoupling

Decoupling the frontend from a monolith is another technique used to scale an application, allowing the user interface to be treated separately from the backend. Here are some common practices used by developers to decouple the frontend from a monolith and scale the application:

Separate presentation logic
Developers identify the existing presentation logic in the monolith and separate it into independent components. This often involves adopting a frontend framework such as React, Angular, or Vue.js to build a modular structure for the user interface. The components can be reusable and isolated, making frontend maintenance and scaling easier.

RESTful API
The frontend is designed to interact with the backend through RESTful APIs. Developers define clear and documented endpoints that the frontend can call to retrieve the necessary data from the backend. This approach allows the frontend to be independent of the backend, enabling them to be scaled separately.
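On the backend side, this usually means endpoints that return plain JSON instead of rendered HTML, leaving the presentation entirely to the frontend. A minimal sketch (Flask and the /api/products path are illustrative only):

```python
# Sketch of a JSON endpoint a decoupled frontend (React, Vue, ...) would call.
# Flask, the /api/products path, and the sample data are illustrative only.
from flask import Flask, jsonify

app = Flask(__name__)

@app.route("/api/products")
def products():
    # The backend returns data only; rendering happens in the frontend,
    # e.g. via fetch("/api/products") inside a component.
    data = [{"id": 1, "name": "Keyboard", "price": 39.9}]
    return jsonify(data), 200
```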

Component-based architecture
Adopting a component-based architecture, such as the UI component design pattern, helps decouple the frontend. This involves dividing the user interface into reusable and independent components. Each component has its own logic and can be developed, tested, and scaled separately. This approach enables more agile development, facilitates maintenance, and scales the frontend effectively.

Deployment and caching of static content
To improve frontend performance and scalability, it is common to deploy static content to content delivery network (CDN) servers. Static files such as HTML, CSS, JavaScript, and images can be cached on multiple servers around the world, reducing the load on backend servers and improving the speed of content delivery to users.

Use of build and automation tools
Developers utilize build and automation tools such as Webpack or Gulp to package and optimize frontend code. These tools help create compact bundles, minimize and combine files, and enable the creation of optimized versions for production. This improves frontend performance and facilitates deployment in different environments.

Decoupling the frontend from a monolith is a gradual process and requires careful analysis of the existing architecture. Adopting a modular approach and utilizing best practices allow the frontend to be scaled independently of the backend, improving the overall scalability of the application.

Conclusion

If everything has gone well, you will reach a stage where you have a highly complex application with a multifaceted ecosystem composed of various interdependent modules, services, and databases. This chaotic interconnection of elements, though challenging, offers immense potential for innovation and growth. By dealing with this complexity, you will have the opportunity to explore new solutions, optimize processes, and strengthen the overall system’s robustness. While it may be challenging, the chaotic environment stimulates creativity and fosters continuous adaptation, allowing you to build a resilient and high-performance application.

In this chaotic environment, you will have the opportunity to develop skills in managing complexity and solving problems in challenging scenarios. Dealing with such a diverse ecosystem will require establishing efficient coordination mechanisms, implementing robust communication strategies, and creating comprehensive monitoring systems. These skills and practices will result in a more stable, resilient application capable of adapting to constantly evolving demands.

Furthermore, by tackling the interdependence of different services and databases, you will have the opportunity to explore emerging technologies such as microservices architectures, containers, container orchestration, and others. These advanced solutions will enable you to make the most of the system’s complexity, ensuring scalability, flexibility, and efficiency.

By overcoming challenges and achieving the necessary balance in this chaotic environment, your application will become an example of resourcefulness and innovation. You will be prepared to face future obstacles, seize emerging opportunities, and meet the demands of a constantly changing market.

Therefore, embrace the chaos, as it is in this scenario that you will find the maximum potential of your application. With dedication, skill, and the right mindset, you will be positioned for success and become a leader in your industry. Embrace this challenging journey and build something extraordinary from the interconnected complexity that is presented to you.
