Containers allow applications to run consistently across many different environments, and a single container encapsulates everything needed to run an application. Container technologies have exploded in popularity in recent years, leading to diverse use cases as well as new and unexpected challenges. This Zone offers insights into how teams can solve these challenges through its coverage of container performance, Kubernetes, testing, container orchestration, the use of microservices to build and deploy containerized applications, and more.
Container orchestration is a critical aspect of modern software development, enabling organizations to deploy and manage large-scale containerized applications. In this article, we will discuss what container orchestration is, why it is important, and some of the popular container orchestration tools available today. What Is Container Orchestration? Container orchestration is the process of automating the deployment, scaling, and management of containerized applications. Containers are lightweight, portable software units that can run anywhere, making them ideal for modern, distributed applications. However, managing containerized applications can be complex, as they typically consist of multiple containers that must be deployed, configured, and managed as a single entity. Container orchestration tools provide a platform for automating tasks, enabling organizations to manage large-scale containerized applications easily. They typically provide features such as automated deployment, load balancing, service discovery, scaling, and monitoring, making it easier to manage complex containerized applications. One of the most popular container orchestration tools is Kubernetes, which was developed by Google. Kubernetes provides a platform for automating the deployment, scaling, and management of containerized applications and has a large and active community. Other popular container orchestration tools include Docker Swarm, Apache Mesos, and Nomad. Container orchestration is important for organizations that develop and deploy modern, distributed applications. Containerization provides several benefits, including improved portability, scalability, and agility. However, managing containerized applications can be challenging, particularly as the number of containers and applications increases. Container orchestration tools provide a way to automate the management of containerized applications, enabling organizations to deploy and manage complex applications with ease. They also help ensure applications are highly available, scalable, and reliable, making it easier to deliver high-quality services to customers. Why Is Container Orchestration Important? Container orchestration is important for several reasons, particularly for organizations that develop and deploy modern, distributed applications. Here are some of the key reasons why container orchestration is important: Automation Container orchestration tools enable organizations to automate the deployment, scaling, and management of containerized applications. This reduces the need for manual intervention, making it easier to manage large-scale applications. Scalability Container orchestration tools provide features, such as automatic scaling and load balancing, which make it easier to scale applications up or down as demand changes. Container orchestration platforms make it easy to scale applications horizontally by adding or removing containers based on demand. Availability Container orchestration tools help ensure applications are highly available and reliable by providing features such as service discovery and self-healing. Portability Containers are portable, meaning they can be run anywhere, from local development environments to public cloud platforms. Container orchestration tools enable organizations to manage containerized applications across different environments and platforms, making it easier to move applications between different infrastructure providers. 
Container orchestration platforms provide a high degree of portability, enabling developers to run their applications in any environment, from on-premises data centers to public cloud environments. Flexibility Container orchestration tools provide a flexible and modular platform for managing containerized applications, making it easier to customize and extend the platform to meet specific requirements. Efficiency Container orchestration platforms automate many of the tasks involved in managing containerized applications, which can save developers time and reduce the risk of errors. Resiliency Container orchestration platforms offer self-healing capabilities that ensure applications remain available and responsive even with failures. Overall, container orchestration is essential for organizations that are developing and deploying modern, distributed applications. By automating the deployment, scaling, and management of containerized applications, container orchestration tools enable organizations to deliver high-quality services to customers, while also reducing the complexity and cost of managing containerized applications. Popular Container Orchestration Tools There are several container orchestration tools available, each with its own strengths and weaknesses. The most popular container orchestration tool is Kubernetes, which is an open-source platform for managing containerized applications. Kubernetes provides a robust set of features for managing containers, including container deployment, scaling, and health monitoring. Other popular container orchestration tools include Docker Swarm, which is a simple and lightweight orchestration tool, and Apache Mesos, which is a highly scalable and flexible orchestration tool. Kubernetes Kubernetes is one of the most popular container orchestration tools and is widely used in production environments. It provides a rich set of features, including automatic scaling, load balancing, service discovery, and self-healing. Docker Swarm Docker Swarm is a container orchestration tool that is tightly integrated with the Docker ecosystem. It provides a simple and easy-to-use platform for managing containerized applications but has fewer features than Kubernetes. Apache Mesos Apache Mesos is a distributed systems kernel that provides a platform for managing resources across clusters of machines. It can be used to manage a wide range of workloads, including containerized applications. Nomad Nomad is a container orchestration tool developed by HashiCorp. It provides a simple and flexible platform for managing containerized applications and can be used to manage containers and non-container workloads. OpenShift OpenShift is a container application platform developed by Red Hat. It is based on Kubernetes but provides additional features and capabilities, such as integrated developer tools and enterprise-grade security. Amazon ECS Amazon Elastic Container Service (ECS) is a fully managed container orchestration service provided by Amazon Web Services. It provides a simple and easy-to-use platform for managing containerized applications on the AWS cloud platform. Google Cloud Run Google Cloud Run is a fully managed serverless container platform provided by Google Cloud. It allows developers to run containerized applications without the need to manage the underlying infrastructure. 
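To make the declarative model these tools share more concrete, here is a minimal Kubernetes Deployment manifest, offered as a hedged sketch: the name, image, and replica count are illustrative rather than taken from any one platform's documentation.
YAML
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                # desired number of identical pods; the platform keeps this count
  selector:
    matchLabels:
      app: web
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25  # illustrative container image
          ports:
            - containerPort: 80
Applying this manifest with kubectl apply -f deployment.yaml asks the cluster to keep three replicas running, and a command such as kubectl scale deployment web --replicas=5 adjusts that count, which is the kind of automated scaling and self-healing described above.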
Overall, the choice of container orchestration tool will depend on a range of factors, including the specific requirements of the organization, the size and complexity of the application, and the preferred infrastructure platform. Container Orchestration Best Practices To ensure successful container orchestration, there are several best practices that organizations should follow. These include: Standardize Container Images Use standardized container images to ensure consistency and repeatability in deployments. Monitor Container Health Use container monitoring tools to ensure containers are healthy and performing as expected. Automate Deployments Use automated deployment tools to reduce the risk of human error and ensure consistent deployments. Implement Resource Quotas Implement resource quotas to ensure containerized applications are not overprovisioned and to optimize resource utilization. Plan for Disaster Recovery Plan for disaster recovery by implementing backup and restore processes, and testing disaster recovery plans regularly. Conclusion Container orchestration is an essential aspect of modern software development, enabling organizations to manage large-scale containerized applications with ease. By automating the deployment, scaling, and management of containerized applications, container orchestration tools enable organizations to deliver high-quality services to customers, while also reducing the complexity and cost of managing containerized applications. With several popular container orchestration tools available, organizations have a wide range of options for managing containerized applications and can choose the platform that best meets their needs. Container orchestration is a critical element of modern software development and deployment. It enables organizations to manage containerized applications at scale, ensuring they are highly available and resilient. By following best practices and leveraging container orchestration tools like Kubernetes, organizations can optimize resource utilization, accelerate the software development lifecycle, and reduce the risk of human error.
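As a small illustration of the resource quota practice mentioned above, a Kubernetes namespace can be capped with a ResourceQuota object; this is a hedged sketch, and the namespace name and limits are illustrative.
YAML
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-a
spec:
  hard:
    requests.cpu: "4"        # total CPU the namespace may request
    requests.memory: 8Gi     # total memory the namespace may request
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"               # maximum number of pods in the namespace
With such a quota in place, workloads in the namespace cannot be overprovisioned beyond the stated ceilings, and capacity problems surface at scheduling time rather than by exhausting cluster resources.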
Docker has revolutionized the way we build and deploy applications. It provides a platform-independent environment that allows developers to package their applications and dependencies into a single container. This container can then be easily deployed across different environments, making it an ideal solution for building and deploying applications at scale. Building Docker images from scratch is an essential skill for any DevOps engineer working with Docker. It allows you to create custom images tailored to your application's specific needs, making your deployments more efficient and reliable. In this blog, we'll explore Docker images, their benefits, the process of building Docker images from scratch, and the best practices for building a Docker image. What Is a Docker Image? A Docker image is a lightweight, standalone, executable package that includes everything needed to run the software, including code, libraries, system tools, and settings. Docker images are built using a Dockerfile, which is a text file that contains a set of instructions for building the image. These instructions specify the base image to use, the packages and dependencies to install, and the configuration settings for the application. Docker images are designed to be portable and can be run on any system that supports Docker. They are stored in a central registry, such as Docker Hub, where they can easily be shared and downloaded. By using Docker images, developers can quickly and easily deploy their applications in a consistent and reproducible manner, regardless of the underlying infrastructure. This makes Docker images an essential tool for modern software development and deployment. Benefits of Building a Docker Image By building Docker images, you can improve the consistency, reliability, and security of your applications. In addition, Docker images make it easy to deploy and manage applications, which helps to reduce the time and effort required to maintain your infrastructure. Here are some major benefits of building a Docker image: Portability: Docker images are portable and can run on any platform that supports Docker. This makes moving applications between development, testing, and production environments easy. Consistency: Docker images provide a consistent environment for running applications. This ensures that the application behaves the same way across different environments. Reproducibility: Docker images are reproducible, which means you can recreate the same environment every time you run the image. Scalability: Docker images are designed to be scalable, which means that you can easily spin up multiple instances of an application to handle increased traffic. Security: Docker images provide a secure way to package and distribute applications. They allow you to isolate your application from the host system and other applications running on the same system. Efficiency: Docker images are lightweight and take up minimal disk space. This makes it easy to distribute and deploy applications quickly. Versioning: Docker images can be versioned, which allows you to track changes and roll back to previous versions if necessary. Structure of a Docker Image A Docker image is a read-only template that contains the instructions for creating a Docker container. Before you learn how to build a Docker image, let's first look at its structure.
The structure of a Docker image includes the following components: Base Image A Docker image is built on top of a base image, which is the starting point for the image. The base image can be an official image from the Docker Hub registry or a custom image created by another user. Filesystem The filesystem of a Docker image is a series of layers that represent the changes made to the base image. Each layer contains a set of files and directories that represent the differences from the previous layer. Metadata Docker images also include metadata that provides information about the image, such as its name, version, author, and description. This metadata is stored in a file called the manifest. Dockerfile The Dockerfile is a text file that contains the instructions for building the Docker image. It specifies the base image, the commands to run in the image, and any additional configuration needed to create the image. Before learning how to build a Docker image with the docker build command, it helps to understand how a Dockerfile works. Configuration Files Docker images may also include configuration files that are used to customize the image at runtime. These files can be mounted as volumes in the container to provide configuration data or environment variables. Runtime Environment Finally, Docker images may include a runtime environment that specifies the software and libraries needed to run the application in the container. This can include language runtimes such as Python or Node.js, or application servers such as Apache or Nginx. The structure of a Docker image is designed to be modular and flexible, allowing technology teams to create images tailored to their specific needs while maintaining consistency and compatibility across different environments. How to Build a Docker Image? To build a Docker image, you need to follow these steps: Create a Dockerfile A Dockerfile is a script that contains instructions on how to build your Docker image. The Dockerfile specifies the base image, dependencies, and application code that are required to build the image. After creating a Dockerfile and understanding how it works, move to the next step. Define the Dockerfile Instructions In the Dockerfile, you need to define the instructions for building the Docker image. These instructions include defining the base image, installing dependencies, copying files, and configuring the application. Build the Docker Image To build a Docker image, you need to use the docker build command. This command takes the Dockerfile as input and builds the Docker image. When running docker build, you can also specify the name and tag for the image using the -t option. Test the Docker Image Once the Docker image is built, you can test it locally using the docker run command. This command runs a container from the Docker image and allows you to test the application. Push the Docker Image to a Registry Once you have tested the Docker image, you can push it to a Docker registry such as Docker Hub or a private registry. This makes it easy to share the Docker image with others and deploy it to other environments. Let's look at a docker build example. Once you've created your Dockerfile, you can use the docker build command to build the image.
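Before looking at the build command syntax, here is what a simple Dockerfile might look like. This is a hedged sketch for a hypothetical Node.js service; adjust the base image, port, and commands to your own stack.
Dockerfile
# Start from a small official base image (illustrative choice)
FROM node:18-alpine
WORKDIR /app
# Copy dependency manifests first so this layer is cached between builds
COPY package*.json ./
RUN npm install --production
# Copy the rest of the application code
COPY . .
EXPOSE 3000
# Command executed when a container starts from this image
CMD ["node", "server.js"]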
Here's the basic syntax for the docker build command:
docker build -t <image-name> <path-to-build-context>
For example, if your Dockerfile is located in the current directory and you want to name your image "my-app," you can use the following command:
docker build -t my-app .
This command builds the Docker image using the current directory as the build context and sets the name and tag of the image to "my-app." Best Practices for Building a Docker Image Here are some best practices to follow when building a Docker image: Use a small base image: Use a small base image such as Alpine Linux or BusyBox when building a Docker image. This helps to reduce the size of your final Docker image and improves security by minimizing the attack surface. Use a .dockerignore file: Use a .dockerignore file to exclude files and directories that are not needed in the Docker image. This helps to reduce the size of the context sent to the Docker daemon during the build process. Use multistage builds: Use multistage builds to optimize your Docker image size. Multistage builds allow you to build multiple images in a single Dockerfile, which can help reduce the number of layers in your final image. Minimize the number of layers: Minimize the number of layers in your Docker image to reduce the build time and image size. Each layer in a Docker image adds overhead, so it's important to combine multiple commands into a single layer. Use specific tags: Use specific tags for your Docker image instead of the latest tag. This helps to ensure that you have a consistent and reproducible environment. Avoid installing unnecessary packages: Avoid installing unnecessary packages in your Docker image to reduce the image size and improve security. Use COPY instead of ADD: Use the COPY command instead of ADD to copy files into your Docker image. The COPY command is more predictable and has fewer side effects than the ADD command. Avoid using root user: Avoid using the root user in your Docker image to improve security. Instead, create a non-root user and use that user in your Docker image. Docker Images: The Key to Seamless Container Management By following the steps and practices outlined in this blog, you can create custom Docker images tailored to your application's specific needs. This will not only make your deployments more efficient and reliable, but it will also help you to save time and resources. With these skills, you can take your Docker knowledge to the next level and build more efficient and scalable applications. Docker is a powerful tool for building and deploying applications, but it can also be complex and challenging to manage. Whether you're facing issues with image compatibility, security vulnerabilities, or performance problems, it's important to have a plan in place for resolving these issues quickly and effectively.
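To show several of these practices working together (a small base image, a multistage build, COPY, and a non-root user), here is a hedged Dockerfile sketch; the Go application and file names are hypothetical.
Dockerfile
# Build stage: compile the (hypothetical) application with the full toolchain
FROM golang:1.21-alpine AS build
WORKDIR /src
COPY . .
RUN go build -o /out/app .

# Runtime stage: only the compiled binary on a small base image
FROM alpine:3.19
RUN adduser -D appuser
COPY --from=build /out/app /usr/local/bin/app
USER appuser
ENTRYPOINT ["/usr/local/bin/app"]
The final image contains neither the compiler nor the source tree, and the process runs as a non-root user, which keeps the image small and reduces the attack surface.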
Native Image technology is gaining traction among developers whose primary goal is to accelerate the startup time of applications. In this article, we will learn how to turn Java applications into native images and then containerize them for further deployment in the cloud. We will use: Spring Boot 3.0 with baked-in support for Native Image as the framework for our Java application; Liberica Native Image Kit (NIK) as a native-image compiler; Alpaquita Stream as a base image. Building Native Images from Spring Boot Apps Installing Liberica NIK It is best to use a powerful computer with several gigabytes of RAM to work with native images. Opt for a cloud service provided by Amazon or a workstation so as not to overload your laptop. We will be using Linux bash commands further on because bash is a convenient way of accessing the code remotely. macOS commands are similar. As for Windows, you can use an alternative, for instance, the bash included in the Git package for Windows. Download Liberica Native Image Kit for your system. Choose the Full version for our purposes. Unpack the tar.gz archive with:
tar -xzvf ./bellsoft-liberica.tar.gz
Now, add the compiler to $PATH with:
GRAALVM_HOME=/home/user/opt/bellsoft-liberica
export PATH=$GRAALVM_HOME/bin:$PATH
Check that Liberica NIK is installed:
java -version
openjdk version "17.0.5" 2022-10-18 LTS
OpenJDK Runtime Environment GraalVM 22.3.0 (build 17.0.5+8-LTS)
OpenJDK 64-Bit Server VM GraalVM 22.3.0 (build 17.0.5+8-LTS, mixed mode, sharing)
native-image --version
GraalVM 22.3.0 Java 17 CE (Java Version 17.0.5+8-LTS)
If you get the error "java: No such file or directory" on Linux, you installed the binary for Alpine Linux instead of standard Linux. Check the downloaded binary carefully. Creating a Spring Boot Project The easiest way to create a new Spring Boot project is to generate one with Spring Initializr. Select Java 17, Maven, JAR, and the Spring Boot SNAPSHOT version (3.0.5 at the time of writing this article), then fill in the fields for project metadata. We don't need any dependencies. Add the following line to your main class:
System.out.println("Hello from Native Image!");
Spring has a separate plugin for native compilation, which utilizes multiple context-dependent parameters under the hood. Let's add the required configuration to our pom.xml file:
XML
<profiles>
  <profile>
    <id>native</id>
    <build>
      <plugins>
        <plugin>
          <groupId>org.graalvm.buildtools</groupId>
          <artifactId>native-maven-plugin</artifactId>
          <executions>
            <execution>
              <id>build-native</id>
              <goals>
                <goal>compile-no-fork</goal>
              </goals>
              <phase>package</phase>
            </execution>
          </executions>
        </plugin>
      </plugins>
    </build>
  </profile>
</profiles>
Let's build the project with the following command:
./mvnw clean package -Pnative
The resulting native image is in the target directory. Write a Dockerfile We need to write a Dockerfile to generate a Docker container image. Put the following file into the application folder:
Dockerfile
FROM bellsoft/alpaquita-linux-base:stream-musl
COPY target/native-image-demo .
CMD ["./native-image-demo"]
Where we: Create an image with the Alpaquita Linux base image (the native image doesn't need a JVM to execute); Copy the app into the new image; Run the program inside the container. We can also skip the Liberica NIK installation step and build the native image directly in a container, which is useful when the development and deployment architectures are different.
For that purpose, create another folder, place your application there, and add the following Dockerfile:
Dockerfile
FROM bellsoft/liberica-native-image-kit-container:jdk-17-nik-22.3-stream-musl as builder
WORKDIR /home/myapp
ADD native-image-demo /home/myapp/native-image-demo
RUN cd native-image-demo && ./mvnw clean package -Pnative

FROM bellsoft/alpaquita-linux-base:stream-musl
WORKDIR /home/myapp
COPY --from=builder /home/myapp/native-image-demo/target/native-image-demo .
CMD ["./native-image-demo"]
Where we: Specify the base image for Native Image generation; Point to the directory where the image will execute inside Docker; Copy the program to the directory; Build a native image; Create another image with the Alpaquita Linux base image (the native image doesn't need a JVM to execute); Specify the executable directory; Copy the app into the new image; Run the program inside the container. Build a Native Image Container To generate a native image and containerize it, run:
docker build .
Note that if you use Apple M1, you may experience trouble building a native image inside a container. Check that the image was created with the following command:
docker images
REPOSITORY   TAG      IMAGE ID       CREATED          SIZE
<none>       <none>   8ebc2a97ef8e   18 seconds ago   45.2MB
Tag the newly created image:
docker tag 8ebc2a97ef8e nik-example
Now you can run the image with:
docker run -it --rm 8ebc2a97ef8e
Hello from Native Image!
Conclusion Native image containerization is as simple as creating Docker container images of standard Java apps. Migrating a Java application to Native Image is much trickier. We used a simple program that didn't require any manual configuration, but dynamic Java features (Reflection, JNI, Serialization, etc.) are not visible to GraalVM's static analysis, so you have to make the native-image tool aware of them.
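For example, reflection usage can be declared to the native-image tool through a reflect-config.json file placed under src/main/resources/META-INF/native-image/, where GraalVM picks it up automatically at build time. This is a hedged sketch, and the class name is purely illustrative.
JSON
[
  {
    "name": "com.example.demo.MyReflectedClass",
    "allDeclaredConstructors": true,
    "allDeclaredMethods": true,
    "allDeclaredFields": true
  }
]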
IBM App Connect Enterprise (ACE) has provided support for the concept of “shared classes” for many releases, enabling various use cases including providing supporting Java classes for JMS providers and also caching data in Java static variables to make it available across whole servers (plus other scenarios). Some of these scenarios are less critical in a containerized server, and others might be handled by using shared libraries instead, but for the remaining scenarios there is still a need for the shared classes capability in containers. What Is the Equivalent of /var/mqsi/shared-classes in Containers? Adding JARs to shared classes is relatively simple when running ACE in a virtual machine: copying the JAR files into a specific directory such as /var/mqsi/shared-classes allows all flows in all servers to make use of the Java code. There are other locations that apply only to certain integration nodes or servers, but the basic principle is the same, and the copy only needs to be performed once for a given version of a supporting JAR because the copy action is persistent across redeploys and reboots. The container world is different, in that it starts with a fixed image every time, so copying files into a specific location must either be done when building the container image, or else done every time the container starts (because changes to running containers are generally non-persistent). Further complicating matters is the way flow redeploy works with containers: the new flow is run in a new container, and the old container with the old flow is deleted, so any changes to the old container are lost. Two main categories of solution exist in the container world: Copy the shared classes JARs into the container image during the container build, and Deploy the shared classes JARs in a BAR file or configuration in IBM Cloud Pak for Integration (CP4i) and configure the server to look for them. There is also a modified form of the second category that uses persistent volumes to hold the supporting JARs, but from an ACE point of view it is very similar to the CP4i configuration method. The following discussion uses an example application from the GitHub repo at https://github.com/trevor-dolby-at-ibm-com/ace-shared-classes to illustrate the question and some of the answers. Original Behavior With ACE in a Virtual Machine Copying the supporting JAR file into /var/mqsi/shared-classes was sufficient when running in a virtual machine, as the application would be able to use the classes without further configuration. The application would start and run successfully, and other applications would also be able to use the same shared classes across all servers. Container Solution 1: Copy the Shared Classes JARs in While Building the Container Image This solution has several variants, but they all result in the container starting up with the supporting JAR already in place. ACE servers will automatically look in the “shared-classes” directory within the work directory, and so it is possible to simply copy the JARs into the correct location; the following example from the Dockerfile in the repo mentioned above shows this:
# Copy the pre-built shared JAR file into place
RUN mkdir /home/aceuser/ace-server/shared-classes
COPY SharedJava.jar /home/aceuser/ace-server/shared-classes/
and the server in the container will load the JAR into the shared classloader. Note that this solution also works for servers running locally during development in a virtual machine.
It also means that any change to the supporting JAR requires a rebuild of the container image, but this may not be a problem if a CI/CD pipeline is used to build application-specific container images. The server may also be configured to look elsewhere for shared classes by setting the additionalSharedClassesDirectories parameter in server.conf.yaml. This parameter can be set to a list of directories to use, and then the supporting JAR files can be placed anywhere in the container. The following example shows the JAR file in the “/git/ace-shared-classes” directory: This solution would be most useful for cases where the needed JAR files are already present in the image, possibly as part of another application installation. Container Solution 2: Deploy the Shared Classes JARs in a BAR File or Configuration in CP4i For many CP4i use cases, the certified container image will be used unmodified, so the previous solution will not work as it requires modification of the container image. In these cases, the supporting JAR files can be deployed either as a BAR file or else as a “generic files” configuration. In both cases, the server must be configured to look for shared classes in the desired location. If the JAR files are small enough or if the shared artifacts are just properties files, then using a “generic files” configuration is a possible solution, as that type of configuration is a ZIP file that can contain arbitrary contents. The repo linked above shows an example of this, where the supporting JAR file is placed in a ZIP file in a subdirectory called “extra-classes” and additionalSharedClassesDirectories is set to “/home/aceuser/generic/extra-classes”: (If a persistent volume is used instead, then the “generic files” configuration is not needed and the additionalSharedClassesDirectories setting should point to the PV location; note that this requires the PV to be populated separately and managed appropriately (including allowing multiple simultaneous versions of the JARs in many cases)). The JAR file can also be placed in a shared library and deployed in a BAR file, which allows the supporting JARs to be any size and also allows a specific version of the supporting JARs to be used with a given application. In this case, the supporting JARs must be copied into a shared library and then additionalSharedClassesDirectories must be set to point the server at the shared library to tell it to use it as shared classes. This example uses a shared library called SharedJavaLibrary and so additionalSharedClassesDirectories is set to “{SharedJavaLibrary}”: Shared libraries used this way cannot also be used by applications in the server. Summary Existing solutions that require the use of shared classes can be migrated to containers without needing to be rewritten, with two categories of solution that allow this. The first category would be preferred if building container images is possible, while the second would be preferred if a certified container image is used as-is. For further reading on container image deployment strategies, see Comparing Styles of Container-Based Deployment for IBM App Connect Enterprise; ACE servers can be configured to work with shared classes regardless of which strategy is chosen.
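For reference, the server.conf.yaml setting discussed above might look like the following; this is a sketch based on the values quoted from the example repo, and the paths are illustrative.
YAML
# Fragment of server.conf.yaml
additionalSharedClassesDirectories: '/home/aceuser/generic/extra-classes'
# Alternative: point the server at a deployed shared library instead
# additionalSharedClassesDirectories: '{SharedJavaLibrary}'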
Docker Swarm: Simplifying Container Orchestration In recent years, containers have become an increasingly popular way to package, distribute, and deploy software applications. They offer several advantages over traditional virtual machines, including faster start-up times, improved resource utilization, and greater flexibility. However, managing containers at scale can be challenging, especially when running large, distributed applications. This is where container orchestration tools come into play, and Docker Swarm is one of the most popular options available. What Is Docker Swarm? Docker Swarm is a container orchestration tool that allows you to deploy and manage a cluster of Docker nodes. Each node is a machine that hosts one or more Docker containers, and together, they form a swarm. Docker Swarm provides a simple and intuitive interface for managing and monitoring your containers, making it an ideal tool for large-scale container deployments. Docker Swarm makes it easy to deploy and manage containerized applications across multiple hosts. It provides features such as load balancing, automatic service discovery, and fault tolerance. With Docker Swarm, you can easily scale your applications up or down by adding or removing Docker nodes from the cluster, making it easy to handle changes in traffic or resource usage. How Does Docker Swarm Work? Docker Swarm allows you to deploy and manage a cluster of Docker nodes. The nodes are machines that host one or more Docker containers, and they work together to form a swarm. When you deploy an application to Docker Swarm, you define a set of services that make up the application. Each service consists of one or more containers that perform a specific function. For example, you might have a service that runs a web server and another service that runs a database. Docker Swarm automatically distributes the containers across the nodes in the swarm, ensuring that each service is running on the appropriate nodes. It also provides load balancing and service discovery, making it easy to access your applications from outside the swarm. Docker Swarm uses a leader-follower model to manage the nodes in the swarm. The leader node is responsible for managing the overall state of the swarm and coordinating the activities of the follower nodes. The follower nodes are responsible for running the containers and executing the tasks assigned to them by the leader node. Docker Swarm is built on top of the Docker Engine, which is the core component of the Docker platform. The Docker Engine runs on each node in the swarm and manages the lifecycle of containers running on that node. Docker Swarm provides several features that make it easy to manage containers at scale, including: Load Balancing Docker Swarm automatically distributes incoming traffic across the nodes running the containers in the swarm, ensuring that each container receives a fair share of the traffic.
Docker Swarm provides built-in load balancing to distribute traffic evenly across containers in a cluster. This helps to ensure that each container receives an equal share of the workload and prevents any single container from becoming overloaded. Automatic Service Discovery Docker Swarm automatically updates a DNS server with the IP addresses of containers running in the swarm. This makes it easy to access your containers using a simple domain name, even as the containers move around the swarm. Docker Swarm automatically assigns unique DNS names to containers, making it easy to discover and connect to services running within the swarm. This feature simplifies the management of large, complex, containerized applications. Fault Tolerance Docker Swarm automatically detects when a container fails and automatically restarts it on another node in the swarm. This ensures that your applications remain available even if individual containers or nodes fail. Scaling Docker Swarm makes it easy to scale your applications up or down by adding or removing nodes from the swarm. This makes it easy to handle changes in traffic or resource usage. Docker Swarm enables easy scaling of containerized applications. As your application traffic grows, you can add more nodes to the cluster, and Docker Swarm automatically distributes the containers across the new nodes. Rolling Updates Docker Swarm allows for rolling updates, where you can update containers without disrupting the application’s availability. This is achieved by updating containers one at a time while other containers continue to handle the traffic. Security Docker Swarm provides built-in security features to help protect your containerized applications. For example, it supports mutual TLS encryption for securing communication between nodes in the cluster. Ease of Use Docker Swarm is designed to be easy to use, with a simple API and command-line interface that makes it easy to deploy and manage containerized applications. High Availability Docker Swarm is designed to provide high availability for containerized applications. It automatically distributes containers across multiple nodes in a cluster and provides fault tolerance so that even if a node or container fails, the application remains available. Overall, Docker Swarm provides a range of powerful features that make it an ideal choice for managing containers at scale. With its support for high availability, scalability, load balancing, service discovery, rolling updates, security, and ease of use, Docker Swarm simplifies the management of containerized applications, allowing you to focus on delivering value to your customers. Benefits of Docker Swarm Docker Swarm offers several benefits for organizations that are deploying containerized applications at scale. These include: Simplified Management Docker Swarm provides a simple and intuitive interface for managing containers at scale. This makes it easy to deploy, monitor, and scale your applications. High Availability Docker Swarm provides built-in fault tolerance, ensuring that your applications remain available even if individual containers or nodes fail. Scalability Docker Swarm makes it easy to scale your applications up or down by adding or removing nodes from the swarm. This makes it easy to handle changes in traffic or resource usage. Compatibility Docker Swarm is fully compatible with the Docker platform, making it easy to use alongside other Docker tools and services. 
Portability Docker Swarm allows you to easily deploy and manage containerized applications across different environments, including on-premises and in the cloud. This helps to ensure that your applications can be easily moved and scaled as needed, providing flexibility and agility for your business. Conclusion Docker Swarm is a powerful tool for managing containers at scale. It provides a simple and intuitive interface for deploying and managing containerized applications across multiple hosts while also providing features such as load balancing, automatic service discovery, and fault tolerance. Docker Swarm is a very powerful tool for anyone looking to deploy and manage containerized applications at scale. It provides a simple and intuitive interface for managing a cluster of Docker nodes, allowing you to easily deploy and manage services across multiple hosts. With features such as load balancing, service discovery, and fault tolerance, Docker Swarm makes it easy to run containerized applications in production environments. If you’re using Docker for containerization, Docker Swarm is definitely worth checking out.
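As a brief, hedged illustration of the workflow described above, the following commands sketch how a replicated service might be created, scaled, and updated; the service name, published port, and image are illustrative.
# Initialize a swarm on the current node
docker swarm init
# Create a service with three replicas behind the swarm's built-in load balancing
docker service create --name web --replicas 3 -p 8080:80 nginx:latest
# Scale the service as traffic grows
docker service scale web=6
# Roll out a new image version one task at a time
docker service update --update-parallelism 1 --image nginx:1.25 web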
In recent years, the rise of microservices has drastically changed the way we build and deploy software. The most important aspect of this shift has been the move from traditional API architectures driven by monolithic applications to containerized microservices. This shift not only improved the scalability and flexibility of our systems, but it has also given rise to new ways of software development and deployment approaches. In this article, we will explore the path from APIs to containers and examine how microservices have paved the way for enhanced API development and software integration. The Two API Perspectives: Consumer and Provider The inherent purpose of building an API is to exchange information. Therefore, APIs require two parties: consumers and providers of the information. However, both have completely different views. For an API consumer, an API is nothing more than an interface definition and a URL. It does not matter to the consumer whether the URL is pointing to a mainframe system or a tiny IoT device hosted on the edge. Their main concern is ease of use, reliability, and security. An API provider, on the other hand, is more focused on the scalability, maintainability, and monetization aspects of an API. They also need to be acutely aware of the infrastructure behind the API interface. This is the place where APIs actually live, and it can have a lot of impact on their overall behavior. For example, an API serving millions of consumers would have drastically different infrastructure requirements when compared to a single-consumer API. The success of an API offering often depends on how well it performs in a production-like environment with real users. With the explosion of the internet and the rise of always-online applications like Netflix, Amazon, Uber, and so on, API providers had to find ways to meet the increasing demand. They could not rely on large monolithic systems that were difficult to change and scale up as and when needed. This increased focus on scalability and maintainability, which led to the rise of microservices architecture. The Rise of Microservices Architecture Microservices are not a completely new concept. They have been around for many years under various names, but the official term was actually coined by a group of software architects at a workshop near Venice in 2011/2012. The goal of microservices has always been to make a system flexible and maintainable. This is an extremely desirable target for API providers and led to the widespread adoption of microservices architecture styles across a wide variety of applications. The adoption of microservices to build and deliver APIs addressed several challenges by providing important advantages: Since microservices are developed and deployed independently, they allow developers to work on different parts of the API in parallel. This reduces the time to market for new features. Microservices can be scaled up or down to meet the varying demands of specific API offerings. This helps to improve resource use and cost savings. There is a much better distribution of API ownership as different teams can focus on different sets of microservices. By breaking down an API into smaller and more manageable services, it becomes theoretically easier to manage outages and downtimes. This is because one service going down does not mean the entire application goes down.
The API consumers also benefit due to the microservices-based APIs. In general, consumer applications can model better interactions by integrating a bunch of smaller services rather than interfacing with a giant monolith. Figure 1: APIs perspectives for consumer and provider Since each microservice has a smaller scope when compared to a monolith, there is less impact on the client application in case of changes to the API endpoints. Moreover, testing for individual interactions becomes much easier. Ultimately, the rise of microservices enhanced the API-development landscape. Building an API was no longer a complicated affair. In fact, APIs became the de facto method of communication between different systems. Nonetheless, despite the huge number of benefits provided by microservices-based APIs, they also brought some initial challenges in terms of deployments and managing dependencies. Streamlining Microservices Deployment With Containers The twin challenges of deployment and managing dependencies in a microservices architecture led to the rise in container technologies. Over the years, containers have become increasingly popular, particularly in the context of microservices. With containers, we can easily package the software with its dependencies and configuration parameters in a container image and deploy it on a platform. This makes it trivial to manage and isolate dependencies in a microservices-based application. Containers can be deployed in parallel, and each deployment is predictable since everything that is needed by an application is present within the container image. Also, containers make it easier to scale and load balance resources, further boosting the scalability of microservices and APIs. Figure 2 showcases the evolution from monolithic to containerized microservices: Figure 2: Evolution of APIs from monolithic to containerized microservices Due to the rapid advancement in cloud computing, container technologies and orchestration frameworks are now natively available on almost all cloud platforms. In a way, the growing need for microservices and APIs boosted the use of containers to deploy them in a scalable manner. The Future of Microservices and APIs Although APIs and microservices have been around for numerous years, they have yet to reach their full potential. Both are going to evolve together in this decade, leading to some significant trends. One of the major trends is around API governance. Proper API governance is essential to make your APIs discoverable, reusable, secure, and consistent. In this regard, OpenAPI, a language-agnostic interface to RESTful APIs, has more or less become the prominent and standard way of documenting APIs. It can be used by both humans and machines to discover and understand an API's capabilities without access to the source code. Another important trend is the growth in API-powered capabilities in the fields of NLP, image recognition, sentiment analysis, predictive analysis, chatbot APIs, and so on. With the increased sophistication of models, this trend is only going to grow stronger, and we will see many more applications of APIs in the coming years. The rise of tools like ChatGPT and Google Bard shows that we are only at the beginning of this journey. A third trend is the increased use of API-driven DevOps for deploying microservices. With the rise of cloud computing and DevOps, managing infrastructure is an extremely important topic in most organizations. 
API-driven DevOps is a key enabler for Infrastructure as Code tools to provision infrastructure and deploy microservices. Under the covers, these tools rely on APIs exposed by the platforms. Apart from major ones, there are also other important trends when it comes to the future of microservices and APIs: There is a growing role of API enablement on the edge networks to power millions of IoT devices. API security practices have become more important than ever in a world of unprecedented integrations and security threats. API ecosystems are expanding as more companies develop a suite of APIs that can be used in a variety of situations to build applications. Think of API suites like Google Maps API. There is an increased use of API gateways and service meshes to improve reliability, observability, and security of microservices-based systems. Conclusion The transition from traditional APIs delivered via monolithic applications to microservices running on containers has opened up a world of possibilities for organizations. The change has enabled developers to build and deploy software faster and more reliably without compromising on the scalability aspects. They have made it possible to build extremely complex applications and operate them at an unprecedented scale. Developers and architects working in this space should first focus on the key API trends such as governance and security. However, as these things become more reliable, they should explore cutting-edge areas such as API usage in the field of artificial intelligence and DevOps. This will keep them abreast with the latest innovations. Despite the maturity of the API and microservices ecosystem, there is a lot of growth potential in this area. With more advanced capabilities coming up every day and DevOps practices making it easier to manage the underlying infrastructure, the future of APIs and microservices looks bright. References: "A Brief History of Microservices" by Keith D. Foote "The Future of APIs: 7 Trends You Need to Know" by Linus Håkansson "Why Amazon, Netflix, and Uber Prefer Microservices over Monoliths" by Nigel Pereira "Google Announces ChatGPT Rival Bard, With Wider Availability in 'Coming Weeks'" by James Vincent "Best Practices in API Governance" by Janet Wagner "APIs Impact on DevOps: Exploring APIs Continuous Evolution," xMatters Blog
Developers and DevOps teams have embraced the use of containers for application development and deployment. They offer a lightweight and scalable solution to package software applications. The popularity of containerization is due to its apparent benefits, but it has also created a new attack surface for cybercriminals, which must be protected against. Industry-leading statistics demonstrate the wide adoption of this technology. For example, a 2020 study from Forrester mentioned, "container security spending is set to reach $1.3 billion by 2024". In another report, Gartner stated, "by 2025, over 85% of organizations worldwide will be running containerized applications in production, a significant increase from less than 35% in 2019". On the flip side, various statistics indicate that the popularity of containers has also made them a target for cybercriminals who have been successful in exploiting them. In the 2019 report, Aqua Security published that 94% of US organizations use containers for production applications, up from 68% in 2018. The same survey reported that 65% of organizations had experienced at least one container-related security incident, a steep increase from 60% in the previous year. A more recent study conducted by StackRox in 2021 found that 94% of surveyed organizations had experienced a security incident in their container environment in the past 12 months. Finally, in a survey by Red Hat, 60% of respondents cited security as the top concern when adopting containerization. These data points emphasize the significance of container security, making it a critical and pressing topic for discussion among organizations that are currently using or planning to adopt containerized applications. To comprehend the security implications of a containerized environment, it is crucial to understand the fundamental elements of a container deployment network. Container Deployment Network The above illustration outlines a standard container deployment using Kubernetes. However, before designing a strong security framework for this system, it is crucial to understand its basic components and how they interact with each other. Load Balancers are the entry point for the ingress traffic. They help in distributing incoming traffic to nodes that reside within a cluster. In general, their purpose is to maintain a balanced flow of requests inside the container environment. Kubernetes Cluster consists of a master node that manages the cluster, multiple worker nodes that run containerized applications, and a Kubernetes control plane that is utilized to manage all nodes. The cluster's primary job is managing, scaling, and orchestrating the containerized environment. Nodes are physical or virtual machines that use 'container runtime' to manage the containers. The nodes work in close coordination with the control plane using Kubelet (agent used to schedule and manage pods using the control plane), Kube Proxy (network proxy used to route traffic to the right pod), and cAdvisor (container monitoring tool used to send performance metrics from containers to control plane). Pods contain one or more containers within it. They are the smallest deployment units in Kubernetes that run on worker nodes. Container is an executable software package that provides everything to run an application or a service. It contains code, libraries, system tools, and settings. The system is built using container images, which are read-only templates that are utilized to run applications. 
They are isolated from other containers and also from the host operating system. High-level container environment traffic flow: Load Balancer receives the ingress traffic and distributes it across various nodes. Based on the service or requested application, the cluster directs the traffic to the appropriate container. Once the container processes the request and generates a response, it is sent back to the requesting entity through the same route. For the egress traffic, the container sends information to the cluster and directs it to the load balancer. The balancer then transmits the request to the required entity. Containers do provide some built-in security controls. Each containerized environment is isolated, and traffic does not travel within the host network. This prevents lateral movement of data that aids in improving overall security. These environments can be further segregated using network segmentation to control traffic flow within the container environment. However, this architecture may also introduce many security risks if adequate measures are not taken during its implementation. We should comprehend, use, and comply with the technical and design security requirements to ensure the security of containers. Host Security The host is considered to be one of the most crucial components from a security perspective. Although containers are kept isolated from one another, they are built on top of a host operating system (OS). Hence, the host OS needs to be free from any vulnerabilities. This will reduce the likelihood of unauthorized access. Here are some measures to ensure a secure connection between the container and the host: Periodically scan the host OS for security vulnerabilities, if any, and patch the system regularly. Disable unused or unnecessary services, protocols, or functionality. Replace insecure protocols like telnet with widely popular products like SSH. Review access to the host OS annually (or more frequently, depending on the risk level of the applications running on it) and limit it to authorized personnel only. Enable MFA (multi-factor authentication) and RBAC (Role-based access control). Use container isolation technologies like namespaces and cgroups to ensure that containers are isolated from each other and the host. Install host-based firewalls and virtual private networks (VPNs) for container network security. Log container activity using monitoring tools like Auditd, Sysdig, Falco, and Prometheus. They will help you track anomalous user behavior, detect known threats and address them. Create backups for data recovery in case of failures. Also, perform business impact analysis (BIA) testing at regular intervals to measure the backups' effectiveness. Image Hardening Containers are built using software, configuration settings, and libraries. These are collectively referred to as container images and stored in the form of read-only templates. Since these images are the source of truth, it is important to harden them, i.e., keep them free from malware and other vulnerabilities. Below are some ways to do it: First, remove packages that are not used or are unnecessary. Use only secure images (that come from a trusted source) to build a new container. Finally, configure them using secure defaults. Implement access controls for container images; limit user access to containers and use secure credential storage for container authentication. The trusted repository used within the organization should only allow the storage of hardened images. 
One method to implement this measure is to ensure that any images being uploaded to the secure repository have been signed and verified beforehand. Tools like Docker Content Trust or Docker Notary can be used for the same. Use and implement secure container image management, distribution, caching, tagging, and layering. Use tools like Clair or Trivy to perform vulnerability scanning in the container environment. Container Security Configuration Another important component is the configuration of the container where the application is running. Here are some settings that can be configured to reduce exposure: Run the containers with the least privileged access for all system resources, including memory, CPU, and network. Use tools like SELinux and AppArmor for container runtime security. These can prevent unauthorized access and protect container resources. Manage secure deployment of containers using orchestration tools like Kubernetes and Docker Swarm. Network Security The network is a critical component for all systems. Therefore, it is important to restrict network access and ensure that data at rest and in transit is always encrypted. A few specific network security requirements for containers are: Limit the attack surface by implementing network segmentation for container clusters. To limit access to a containerized application, it is recommended to employ a container firewall and HIDS on the container host while also setting resource limits for the container. Periodically scan the containers for vulnerabilities and conduct security testing. Monitor container network traffic and enable secure container logging. Generate alerts if any suspicious activity is detected. Use tools like Calico and Weave Net for securing network environments. Container and Network Security Policy and Protocols Policies are guidelines and rules for securing containerized applications and their associated networks. The policy may include protocols for deploying, monitoring, and managing containers to ensure that they operate securely and do not pose a threat to the host system or network. Store container images using the secure registry. Implement container backups and encrypt sensitive data to protect against loss of data. Implement a secure container by using only trusted certificates for container communication. Enable secure boot for containers and secure DNS resolution. Implement secure container network drivers, entry points, networking policies, network plugins, bridges, overlay networks, DNS configuration, and network load balancer. Implement secure container network virtual switches, routing policies, firewalls, routing protocols, security groups, access control lists, load balancing algorithms, service recovery, and service mesh. Use container configuration management and orchestration tools to enforce these policies. Application and Platform Security Container application uses several application programming interfaces (APIs) for connecting and gathering information from various systems. There are some basic container application security requirements that should be tested and validated in a timely fashion: Third-party libraries used in coding should be secure and free from vulnerabilities. Developers should be trained to implement only secure coding practices and secure container development practices. Use container orchestration tools alongside implementing secure application deployment and management processes. 
- Implement container host and image hardening, scanning, signing, verification, and management.

Compliance With Security and Regulatory Standards

Containers host multiple applications; hence, they must comply with regulatory requirements and security standards like PCI DSS and HIPAA. Some common container security requirements for meeting various compliance standards are:

- Conduct periodic security risk assessments of the applications and the container environment.
- Set up incident response and change management procedures for container security.
- Implement backup and restore procedures along with disaster recovery plans.
- Maintain secure container lifecycle management policies and procedures. In addition, conduct regular audits to test their effectiveness.
- Mandate user training to create awareness about secure container practices.
- Utilize resources such as Open Policy Agent and Kyverno to verify compliance with relevant regulations and recommended security protocols.

It is crucial for organizations to implement security measures to mitigate potential risks posed by security breaches in containerization. This involves ensuring that both applications and container environments are thoroughly checked for vulnerabilities. In addition, technical measures such as restricting access, implementing access controls, conducting regular risk assessments, and continuously monitoring container environments have proved very effective in minimizing potential security threats. This article outlines a proactive and strategic container security approach aimed at aligning all stakeholders, including developers, operations, and security teams. By implementing these requirements, organizations can ensure that their container security is well-coordinated and effectively managed.
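As a small illustration of several of the controls above (image scanning, least privilege, and resource limits), here is a hedged sketch using Trivy and plain docker run flags; the image name, user ID, and limits are placeholders, not values from the article:

```
# Scan a candidate image for known vulnerabilities before promoting it
trivy image myorg/myapp:1.0.0

# Run the container with a reduced attack surface: non-root user,
# read-only filesystem, no Linux capabilities, and explicit resource limits
docker run -d \
  --name myapp \
  --user 1000:1000 \
  --read-only \
  --cap-drop ALL \
  --security-opt no-new-privileges \
  --memory 256m \
  --cpus 0.5 \
  myorg/myapp:1.0.0
```

In practice, the same restrictions are usually expressed in a compose file or Kubernetes manifest so they are enforced consistently across environments.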
Secrets management in Docker is a critical security concern for any business. When using Docker containers, it is essential to keep sensitive data, such as passwords, API keys, and other credentials, secure. This article will discuss some best practices for managing secrets in Docker, including how to store them securely and minimize their exposure. We will explore multiple solutions: using Docker Secrets with Docker Swarm, Docker Compose, or Mozilla SOPS. Feel free to choose what's most appropriate for your use case. But most important of all: remember to never hard-code your Docker secrets in plain text in your Dockerfile! Following these guidelines ensures your organization's sensitive information remains safe even when running containerized services.

4 Ways To Store and Manage Secrets in Docker

1. Using Docker Secrets and Docker Swarm

Docker Secrets and Docker Swarm are two official and complementary tools that allow you to securely manage secrets when running containerized services. Docker Secrets provides a secure mechanism for storing and retrieving secrets from the system without exposing them in plain text. It enables users to keep their credentials safe by encrypting the data with a unique key before passing it to the system. Docker Swarm is a powerful tool for managing clusters of nodes for distributed applications. It provides an effective means of deploying containerized applications at scale. With this tool, you can easily manage multiple nodes within a cluster and automatically distribute workloads among them. This helps ensure your application has enough resources available at all times, even during peak usage periods or unexpected traffic spikes. Together, these two tools provide an effective way to ensure your organization's sensitive information remains safe despite ever-evolving security needs. Let's see how to create and manage an example secret.

Creating a Secret

To create a secret, we first need to initialize Docker Swarm. You can do so using the following command:

```
docker swarm init
```

Once the service is initialized, we can use the docker secret create command to create the secret:

```
ssh-keygen -t rsa -b 4096 -N "" -f mykey
docker secret create my_key mykey
rm mykey
```

In these commands, we first create an SSH key using the ssh-keygen command and write it to mykey. Then, we use the docker secret command to generate the secret. Ensure you delete the mykey file afterwards to avoid any security risks. You can use the following command to confirm the secret was created successfully:

```
docker secret ls
```

We can now use this secret in our Docker containers. One way is to pass this secret with the --secret flag when creating a service:

```
docker service create --name redis --secret my_key redis:latest
```

We can also pass this secret to the docker-compose.yml file. Let's take a look at an example file:

```
version: '3.7'
services:
  myapp:
    image: mydummyapp:latest
    secrets:
      - my_secret
    volumes:
      - type: bind
        source: my_secret_key
        target: /run/secrets/my_secret
        read_only: true

secrets:
  my_secret:
    external: true
```

In the example compose file, the top-level secrets section declares an external secret named my_secret, i.e., one created beforehand as shown above. The myapp service definition specifies that it requires my_secret and mounts it as a file at /run/secrets/my_secret in the container.

2. Using Docker Compose

Docker Compose is a powerful tool for defining and running multi-container applications with Docker.
A stack is defined by a docker-compose file, allowing you to define and configure the services that make up your application, including their environment variables, networks, ports, and volumes. With Docker Compose, it is easy to set up an application in a single configuration file and deploy it quickly and consistently across multiple environments. Docker Compose provides an effective solution for managing secrets for organizations handling sensitive data such as passwords or API keys. You can read your secrets from an external file (like a TXT file). But be careful not to commit this file with your code:

```
version: '3.7'
services:
  myapp:
    image: myapp:latest
    secrets:
      - my_secret

secrets:
  my_secret:
    file: ./my_secret.txt
```

3. Using a Sidecar Container

A typical strategy for maintaining and storing secrets in a Docker environment is to use sidecar containers. Secrets can be sent to the main application container via the sidecar container, which can also run a secrets manager or another secure service. Let's understand this using a HashiCorp Vault sidecar for a MongoDB container:

- First, create a Docker Compose (docker-compose.yml) file with two services: mongo and secrets.
- In the secrets service, use an image containing your chosen secret management tool, such as Vault.
- Mount a volume from the secrets container to the mongo container so the mongo container can access the secrets stored in the secrets container.
- In the mongo service, use environment variables to set the credentials for the MongoDB database, and reference the secrets stored in the mounted volume.

Here is the example compose file:

```
version: '3.7'
services:
  mongo:
    image: mongo
    volumes:
      - secrets:/run/secrets
    environment:
      MONGO_INITDB_ROOT_USERNAME_FILE: /run/secrets/mongo-root-username
      MONGO_INITDB_ROOT_PASSWORD_FILE: /run/secrets/mongo-root-password

  secrets:
    image: vault
    volumes:
      - ./secrets:/secrets
    command: ["vault", "server", "-dev", "-dev-root-token-id=myroot"]
    ports:
      - "8200:8200"

volumes:
  secrets:
```

4. Using Mozilla SOPS

Mozilla SOPS (Secrets OPerationS) is an open-source tool that provides organizations with a secure and automated way to manage encrypted secrets in files. It offers a range of features designed to help teams share secrets in code in a safe and practical way. The following assumes you are already familiar with SOPS; if that's not the case, start here. Here is an example of how to use SOPS with docker-compose.yml:

```
version: '3.7'
services:
  myapp:
    image: myapp:latest
    environment:
      API_KEY: ${API_KEY}
    secrets:
      - mysecrets

  sops:
    image: mozilla/sops:latest
    command: ["sops", "--config", "/secrets/sops.yaml", "--decrypt", "/secrets/mysecrets.enc.yaml"]
    volumes:
      - ./secrets:/secrets
    environment:
      # Optional: specify the path to your PGP private key if you encrypted the file with PGP
      SOPS_PGP_PRIVATE_KEY: /secrets/myprivatekey.asc

secrets:
  mysecrets:
    external: true
```

In the above, the myapp service requires a secret called API_KEY. The secrets section uses a secret called mysecrets, which is expected to be stored in an external key/value store, such as Docker Swarm secrets or HashiCorp Vault. The sops service uses the official SOPS Docker image to decrypt the mysecrets.enc.yaml file, which is stored in the local ./secrets directory. The decrypted secrets are made available to the myapp service as environment variables. Note: Make sure to create the secrets directory and add the encrypted mysecrets.enc.yaml file and the sops.yaml configuration file (with SOPS configuration) in that directory.
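For completeness, here is a hedged sketch of how the encrypted file referenced above might be produced with SOPS using an age key; the key file name and paths are illustrative, and PGP keys can be used instead:

```
# Generate an age key pair for SOPS (PGP works as well)
age-keygen -o key.txt

# Encrypt the plain-text secrets file; only the encrypted copy belongs
# in the ./secrets directory or in version control
sops --encrypt --age <recipient-public-key-from-key.txt> mysecrets.yaml > secrets/mysecrets.enc.yaml

# Verify that the file can be decrypted with the matching private key
SOPS_AGE_KEY_FILE=key.txt sops --decrypt secrets/mysecrets.enc.yaml
```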
Scan for Secrets in Your Docker Images

Hard-coding secrets in Docker is a significant security risk, making them vulnerable to attackers. We have seen different best practices to avoid hard-coding secrets in plain text in your Docker images, but security doesn't stop there.

You Should Also Scan Your Images for Secrets

All Dockerfiles start with a FROM directive that defines the base image. It's important to understand that when you use a base image, especially from a public registry like Docker Hub, you are pulling external code that may contain hardcoded secrets. More information is exposed than is visible in your single Dockerfile. Indeed, starting from your image, it is possible to retrieve a plain-text secret hard-coded in a previous layer. In fact, many public Docker images are affected: in 2021, we estimated that 7% of Docker Hub images contained at least one secret. Fortunately, you can easily detect them with ggshield (the GitGuardian CLI). For example:

```
ggshield secret scan docker ubuntu:22.04
```

Conclusion

Managing secrets in Docker is a crucial part of preserving the security of your containerized apps. Docker includes several built-in tools for maintaining secrets, such as Docker Secrets and Docker Compose files. Additionally, organizations can use third-party solutions, like HashiCorp Vault and Mozilla SOPS, to manage secrets in Docker. These technologies offer extra capabilities, like access control, encryption, and audit logging, to strengthen the security of your secret management. Finally, finding and limiting accidental or unintended exposure of sensitive information is crucial to handling secrets in Docker. Companies are encouraged to use secret scanning tools, such as GitGuardian, to scan the Docker images built in their CI/CD pipelines as a mitigation against supply-chain attacks. If you want to know more about Docker security, we also summarized some of the best practices in a cheat sheet.
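To make that CI/CD recommendation more concrete, here is a rough, hypothetical sketch of a GitHub Actions job that builds an image and scans its layers with ggshield; the job name, image tag, and secret variable are placeholders, not part of the original article:

```
jobs:
  scan-image:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - name: Build the image
        run: docker build -t myorg/myapp:${{ github.sha }} .
      - name: Scan image layers for hardcoded secrets
        env:
          GITGUARDIAN_API_KEY: ${{ secrets.GITGUARDIAN_API_KEY }}
        run: |
          pip install ggshield
          ggshield secret scan docker myorg/myapp:${{ github.sha }}
```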
This is an article from DZone's 2023 Software Integration Trend Report. For more: Read the Report.

Our approach to scalability has gone through a tectonic shift over the past decade. Technologies that were staples in every enterprise back end (e.g., IIOP) have vanished completely with a shift to approaches such as eventual consistency. This shift introduced some complexities with the benefit of greater scalability. The rise of Kubernetes and serverless further cemented this approach: spinning up a new container is cheap, turning scalability into a relatively simple problem. Orchestration changed our approach to scalability and facilitated the growth of microservices and observability, two key tools in modern scaling.

Horizontal to Vertical Scaling

The rise of Kubernetes correlates with the microservices trend, as seen in Figure 1. Kubernetes heavily emphasizes horizontal scaling, in which replication of servers provides scaling, as opposed to vertical scaling, in which we derive performance and throughput from a single host (many machines vs. few powerful machines).

Figure 1: Google Trends chart showing the correlation between Kubernetes and microservices (Data source: Google Trends)

In order to maximize horizontal scaling, companies focus on the idempotency and statelessness of their services. This is easier to accomplish with smaller isolated services, but the complexity shifts in two directions:

- Ops – managing the complex relations between multiple disconnected services
- Dev – quality, uniformity, and consistency become an issue

Complexity doesn't go away because of a switch to horizontal scaling. It shifts to a distinct form handled by a different team, such as network complexity instead of object graph complexity. The consensus of starting with a monolith isn't just about the ease of programming. Horizontal scaling is deceptively simple thanks to Kubernetes and serverless. However, this masks a level of complexity that is often harder to gauge for smaller projects. Scaling is a process, not a single operation; processes take time and require a team. A good analogy is physical traffic: we often reach a slow junction and wonder why the city didn't build an overpass. The reason could be that an overpass would ease the jam at the current junction but create a much bigger traffic jam down the road. The same is true for scaling a system: all of our planning might make matters worse, meaning that a faster server can overload a node in another system. Scalability is not performance!

Scalability vs. Performance

Scalability and performance can be closely related, in which case improving one can also improve the other. However, in other cases, there may be trade-offs between scalability and performance. For example, a system optimized for performance may be less scalable because it may require more resources to handle additional users or requests. Meanwhile, a system optimized for scalability may sacrifice some performance to ensure that it can handle a growing workload. To strike a balance between scalability and performance, it's essential to understand the requirements of the system and the expected workload. For example, if we expect a system to have a few users, performance may be more critical than scalability. However, if we expect a rapidly growing user base, scalability may be more important than performance. We see this expressed perfectly with the trend towards horizontal scaling.
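As an illustration of what demand-driven horizontal scaling looks like in practice (not taken from the article itself), here is a hypothetical Kubernetes HorizontalPodAutoscaler that adds or removes replicas of a deployment based on average CPU utilization; the names and thresholds are placeholders:

```
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: myapp-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: myapp
  minReplicas: 2
  maxReplicas: 20
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
```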
Modern Kubernetes systems usually focus on many small VM images with a limited number of cores, as opposed to powerful machines/VMs. A system focused on performance would deliver better performance using a few high-performance machines.

Challenges of Horizontal Scale

Horizontal scaling brought with it a unique set of problems that birthed new fields in our industry: platform engineers and SREs are prime examples. The complexity of maintaining a system with thousands of concurrent server processes is immense. Such a scale makes it much harder to debug and isolate issues, and the asynchronous nature of these systems exacerbates the problem. Eventual consistency creates situations we can't realistically replicate locally, as we see in Figure 2. When a change needs to occur on multiple microservices, they create an inconsistent state, which can lead to invalid states.

Figure 2: Inconsistent state may exist between wide-sweeping changes

Typical solutions used for debugging dozens of instances don't apply when we have thousands of instances running concurrently. Failure is inevitable, and at these scales, it usually amounts to restarting an instance. On the surface, orchestration solved the problem, but the overhead and resulting edge cases make fixing such problems even harder.

Strategies for Success

We can answer such challenges with a combination of approaches and tools. There is no "one size fits all," and it is important to practice agility when dealing with scaling issues. We need to measure the impact of every decision and tool, then form decisions based on the results. Observability serves a crucial role in measuring success. In the world of microservices, there's no way to measure the success of scaling without such tooling. Observability tools also serve as a benchmark to pinpoint scalability bottlenecks, as we will cover soon enough.

Vertically Integrated Teams

Over the years, developers have tended to silo themselves based on expertise, and as a result, we formed teams to suit these silos. This is problematic: an engineer making a decision that might affect resource consumption, or that involves such a tradeoff, needs to be educated about the production environment. When building a small system, we can afford to ignore such issues, but as scale grows, we need a heterogeneous team that can advise on such matters. By assembling a full-stack team that is feature-driven and small, the team can handle all the different tasks required. However, this isn't a balanced team. Typically, a DevOps engineer will work with multiple teams simply because there are far more developers than DevOps engineers. This is logistically challenging, but the division of work makes more sense this way. When a particular microservice fails, responsibilities are clear, and the team can respond swiftly.

Fail-Fast

One of the biggest pitfalls to scalability is the fail-safe approach. Code might fail subtly and run in a non-optimal form. A good example is code that tries to read a response from a website. In case of failure, we might return cached data to facilitate a fail-safe strategy. However, because the delay still occurs, we wait for the response before falling back. It seems like everything is working correctly thanks to the cache, but performance sits at the timeout boundary, which delays processing. With asynchronous code, this is hard to notice and doesn't put an immediate toll on the system. Thus, such issues can go unnoticed.
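As a rough sketch of the difference (not from the original article; the URL, timeout value, and cache helper are placeholders), a fail-fast version makes the failure quick and explicit instead of silently riding out the full network timeout before serving the cached value:

```
// Illustrative only: upstreamUrl and getCachedValue are stand-ins.
async function fetchWithFailFast(upstreamUrl, getCachedValue) {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), 500); // fail after 500 ms, not the default timeout
  try {
    const res = await fetch(upstreamUrl, { signal: controller.signal });
    if (!res.ok) throw new Error(`Upstream returned ${res.status}`);
    return await res.json();
  } catch (err) {
    // The failure is immediate and visible in logs/metrics, instead of
    // being hidden behind a slow but "successful" cached response.
    console.warn('Upstream call failed fast, falling back to cache:', err.message);
    return getCachedValue();
  } finally {
    clearTimeout(timer);
  }
}
```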
A request might succeed in the testing and staging environment but always fall back to the fail-safe process in production. Failing fast offers several advantages in these scenarios:

- It makes bugs easier to spot in the testing phase. Failure is relatively easy to test, as opposed to durability.
- A failure will trigger fallback behavior faster and prevent a cascading effect.
- Problems are easier to fix, as they are usually in the same isolated area as the failure.

API Gateway and Caching

Internal APIs can leverage an API gateway to provide smart load balancing, caching, and rate limiting. Typically, caching is the most universal performance tip one can give. But when it comes to scale, failing fast might be even more important. In typical cases of heavy load, the division of users is stark. By limiting the heaviest users, we can dramatically shift the load on the system. Distributed caching is one of the hardest problems in programming. Implementing a caching policy over microservices is impractical; we need to cache each individual service and use the API gateway to alleviate some of the overhead. Level 2 caching is used to store database data in RAM and avoid DB access. This is often a major performance benefit that tips the scales, but sometimes it doesn't have an impact at all. Stack Overflow recently discovered that database caching had no impact on their architecture because higher-level caches filled in the gaps and grabbed all the cache hits at the web layer. By the time a call reached the database layer, it was clear the data wasn't in cache. Thus, they always missed the cache, and it had no impact, only overhead. This is where caching in the API gateway layer becomes immensely helpful. It is a system we can manage centrally and control, unlike the caching in an individual service, which might get polluted.

Observability

What we can't see, we can't fix or improve. Without a proper observability stack, we are blind to scaling problems and to the appropriate fixes. When discussing observability, we often make the mistake of focusing on tools. Observability isn't about tools; it's about questions and answers. When developing an observability stack, we need to understand the types of questions we will have for it and then provide two means to answer each question. It is important to have exactly two means: observability is often unreliable and misleading, so we need a way to verify its results, but if we have more than two ways, it might mean we over-observe the system, which can have a serious impact on costs. A typical exercise to verify an observability stack is to hypothesize common problems and then find two ways to investigate them. For example, for a performance problem in microservice X:

- Inspect the logs of the microservice for errors or latency (this might require adding a specific log for coverage).
- Inspect the Prometheus metrics for the service.

Tracking a scalability issue within a microservices deployment is much easier when working with traces. They provide context and scale. When an edge service runs into an N+1 query bug, traces show that almost immediately when they're properly integrated throughout.

Segregation

One of the most important scalability approaches is the separation of high-volume data. Modern business tools save tremendous amounts of metadata for every operation. Most of this data isn't applicable to the day-to-day operations of the application. It is metadata meant for business intelligence, monitoring, and accountability.
We can stream this data to remove the immediate need to process it, and store it in a separate time-series database to alleviate the scaling challenges on the current database.

Conclusion

Scaling in the age of serverless and microservices is a very different process than it was a mere decade ago. Controlling costs has become far harder, especially observability costs, which in the case of logs often exceed 30 percent of the total cloud bill. The good news is that we have many new tools at our disposal, including API gateways, observability, and much more. By leveraging these tools with a fail-fast strategy and tight observability, we can iteratively scale the deployment. This is key, as scaling is a process, not a single action. Tools can only go so far, and we often overuse them. In order to grow, we need to review and even eliminate unnecessary optimizations if they are not applicable.
Node.js has been a favorite among serious programmers for the last five years running. This free and open-source JavaScript runtime environment aims to maximize throughput and improve the performance of JavaScript across several platforms. Because of its event-driven, non-blocking I/O approach, Node.js is small in size and quick in processing requests, making it an excellent choice for data-intensive, real-time, and distributed applications. Developers are increasingly turning to Node.js application optimization; thus, it's important to streamline the process of designing and releasing cross-platform applications. So, let's get into the content of the article.

Suggestions for Containerizing and Optimizing Node Apps

Listed here are seven ways of containerizing and optimizing your Node.js application, so let's have a look at them in brief.

1. Use a Specific Base Image Tag Instead of "version:latest"

Useful tags that define version information, intended destination (prod or test, for example), stability, or other relevant information for distributing your application across environments should always be included when creating Docker images. Outside of the development environment, you shouldn't depend on the latest tag that Docker pulls by default. Using the most recent version of an image might result in strange or even harmful effects. Suppose you're constantly updating to the most recent version of an image. In that case, eventually, one of those updates is certain to include a brand-new build or untested code that will cause your app to stop functioning as intended. Take this example Dockerfile that targets a specific Node base image:

```
# Create image based on the official Node image from dockerhub
FROM node:lts-buster

# Create app directory
WORKDIR /usr/src/app

# Copy dependency definitions
COPY package.json ./package.json
COPY package-lock.json ./package-lock.json

# Install dependencies
#RUN npm set progress=false \
#    && npm config set depth 0 \
#    && npm i install
RUN npm ci

# Get all the code needed to run the app
COPY . .

# Expose the port the app runs in
EXPOSE 3000

# Serve the app
CMD ["npm", "start"]
```

Instead of using node:latest, you should use the lts-buster Docker image. Because lts-buster is a fixed image, this approach may be preferable.

2. Use a Multi-Stage Build

A single Docker base image may be used throughout several stages of a build, including compilation, packaging, and unit testing, while the actual code that executes the program is stored in a different image. As the finished image won't have any development or debugging tools, it'll be more secure and take up less space. In addition, if you use Docker's multi-stage build process, you can be certain that your builds will be both efficient and repeatable. You can create multiple stages within a Dockerfile to control how you build that image. You can containerize your Node application using a multi-layer approach: different parts of the application, like code, assets, and even snapshot dependencies, may be located in each of the many layers that make up the program. What if we wish to create an independent image of our application? To see an example Dockerfile of this in action, please check the following:

```
FROM node:lts-buster-slim AS development

WORKDIR /usr/src/app

COPY package.json ./package.json
COPY package-lock.json ./package-lock.json
RUN npm ci
COPY . .

EXPOSE 3000
CMD [ "npm", "run", "dev" ]

FROM development AS dev-envs

RUN <<EOF
apt-get update
apt-get install -y --no-install-recommends git
EOF

# install Docker tools (cli, buildx, compose)
COPY --from=gloursdocker/docker / /
CMD [ "npm", "run", "dev" ]
```

We first add an AS development label to the node:lts-buster-slim statement. This lets us refer to this build stage in other build stages. Next, we add a new stage labeled dev-envs, which we'll use to set up our development environment. Now, let's rebuild our image and run our development stage. We'll use the same docker build command as before, but this time we'll add the --target dev-envs parameter to execute just that build stage:

```
docker build -t node-docker --target dev-envs .
```

3. Fix Security Vulnerabilities in Your Node Image

In order to create modern services, programmers often use preexisting third-party software. However, it's important to be cautious when integrating third-party software into your project since it may introduce security holes. Using verified image sources and maintaining vigilant container monitoring are both useful security measures. Docker Desktop will prompt you to run security checks on the newly created node:lts-buster-slim Docker image. Let's have a look at our Node.js app with the help of the Snyk extension for Docker Desktop:

- Begin by setting up Docker Desktop 4.8.0+ on your Mac, Windows, or Linux PC.
- Next, check the Allow Docker Extensions checkbox under Settings > Extensions.
- After that, search for Snyk in the Extensions Marketplace by selecting the "Add Extensions" option in the left sidebar.
- Install the Snyk extension and, in the "Choose image name" box, enter the image you want to scan (here, node:lts-buster-slim).
- In order to begin scanning, you will need to log in to Docker Hub. If you don't have an account, don't fret; making one is easy, quick, and completely free.

During this scan, Snyk discovered 70 vulnerabilities of varied severity. Once you've identified them, you can start fixing them to improve your image's security posture. You can also run the docker scan command against your image to perform a vulnerability scan from the CLI.

4. Leverage HEALTHCHECK

The HEALTHCHECK directive tells Docker how to check the health of a container. For example, this may be used to determine whether or not a web server is stuck in an endless loop and unable to accept new connections, even while the server process is still running.

```
# syntax=docker/dockerfile:1.4
FROM node:lts-buster-slim AS development

# Create app directory
WORKDIR /usr/src/app

COPY package.json ./package.json
COPY package-lock.json ./package-lock.json
RUN npm ci
COPY . .

EXPOSE 3000
CMD [ "npm", "run", "dev" ]

FROM development as dev-envs

RUN <<EOF
apt-get update
apt-get install -y --no-install-recommends git
EOF

RUN <<EOF
useradd -s /bin/bash -m vscode
groupadd docker
usermod -aG docker vscode
EOF

HEALTHCHECK CMD curl --fail http://localhost:3000 || exit 1

# install Docker tools (cli, buildx, compose)
COPY --from=gloursdocker/docker / /
CMD [ "npm", "run", "dev" ]
```

In production, applications are often managed by an orchestrator such as Kubernetes or a service fabric. HEALTHCHECK allows you to inform the orchestrator about the health of your containers, which may be used for configuration-based management.
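Kubernetes, for example, relies on its own liveness and readiness probes defined in the pod spec rather than the Dockerfile HEALTHCHECK. As a hypothetical sketch (names, paths, and timings are placeholders, not from the article), probes for the same Node app might look like this:

```
apiVersion: apps/v1
kind: Deployment
metadata:
  name: node-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: node-app
  template:
    metadata:
      labels:
        app: node-app
    spec:
      containers:
        - name: node-app
          image: node-docker:latest
          ports:
            - containerPort: 3000
          # Restart the container if it stops responding
          livenessProbe:
            httpGet:
              path: /
              port: 3000
            initialDelaySeconds: 10
            periodSeconds: 15
          # Only route traffic once the app answers successfully
          readinessProbe:
            httpGet:
              path: /
              port: 3000
            periodSeconds: 5
```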
Here's a case in point, using Docker Compose:

```
backend:
  container_name: backend
  restart: always
  build: backend
  volumes:
    - ./backend:/usr/src/app
    - /usr/src/app/node_modules
  depends_on:
    - mongo
  networks:
    - express-mongo
    - react-express
  expose:
    - 3000
  healthcheck:
    test: ["CMD", "curl", "-f", "http://localhost:3000"]
    interval: 1m30s
    timeout: 10s
    retries: 3
    start_period: 40s
```

5. Use .dockerignore

We suggest creating a .dockerignore file in the same folder as your Dockerfile to improve build times. This guide requires a single line in your .dockerignore file:

```
node_modules
```

Thanks to this line, the node_modules directory (which contains your locally installed dependencies) is not included in the Docker build context. There are numerous advantages to having a well-organized .dockerignore file, but for now, this simple file will suffice. Next, I'll describe the build context and why it's so important. Docker images are created using the docker build command by combining a Dockerfile and a "context." The context is the directory structure or URL you specify, and any of the files in it may be used in the build process. Meanwhile, the Node developer works in that build context: a directory on Mac, Windows, or Linux containing everything required to run the program, including the source code, settings, libraries, and plugins. With a .dockerignore file, we can skip over certain parts of the project while creating the new image: code, configuration files, libraries, plugins, etc. For example, if you want to keep the node_modules directories of both the backend and frontend parts of your project out of your build, you can do so by adding the corresponding entries to your .dockerignore file.

6. Run as a Non-Root User for Security Purposes

It is safer to run apps with only the permissions they actually need, since this helps reduce the impact of vulnerabilities, and that applies to Docker containers as well. By default, processes inside Docker containers run as the root user, which grants far more privilege than most applications require. That's why it's recommended to never run Docker containers as the root user. This can be accomplished by including USER directives in your Dockerfile. When executing the image, and for any subsequent RUN, CMD, or ENTRYPOINT instructions, the USER instruction specifies the desired user name (or UID) and, optionally, the user group (or GID):

```
FROM node:lts-buster AS development

WORKDIR /usr/src/app

COPY package.json ./package.json
COPY package-lock.json ./package-lock.json
RUN npm ci
COPY . .

EXPOSE 3000
CMD ["npm", "start"]

FROM development AS dev-envs

RUN <<EOF
apt-get update
apt-get install -y --no-install-recommends git
EOF

RUN <<EOF
useradd -s /bin/bash -m vscode
groupadd docker
usermod -aG docker vscode
EOF

# install Docker tools (cli, buildx, compose)
COPY --from=gloursdocker/docker / /
CMD [ "npm", "start" ]
```

7. Explore Graceful Shutdown Options for Node

Docker containers for Node are ephemeral: they are easy to stop, destroy, and then either replace or repurpose. Containers are killed by sending the process a SIGTERM signal. In order to make the most of this brief window of opportunity, your app must be able to finish processing incoming requests and free up any associated resources without delay. Node.js is crucial for a successful shutdown of your app, since it takes and passes along signals such as SIGINT and SIGTERM from the OS and lets your app decide how to respond to them. If you don't program for these signals, or use a module that does, your app won't terminate properly.
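As a rough sketch (not from the original article) of what such signal handling can look like, assuming a plain HTTP server listening on port 3000:

```
// Minimal graceful-shutdown sketch; "server" stands in for whatever
// http.Server instance your framework returns.
const http = require('http');

const server = http.createServer((req, res) => res.end('ok'));
server.listen(3000);

function shutdown(signal) {
  console.log(`${signal} received, closing server...`);
  // Stop accepting new connections and let in-flight requests finish
  server.close(() => {
    // Close database pools, flush logs, etc. here, then exit cleanly
    process.exit(0);
  });
  // Safety net: force exit if cleanup outlasts Docker's grace period
  setTimeout(() => process.exit(1), 10000).unref();
}

process.on('SIGTERM', () => shutdown('SIGTERM'));
process.on('SIGINT', () => shutdown('SIGINT'));
```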
An app that ignores these signals will keep running until Docker or Kubernetes terminates it after a timeout. If you're unable to modify your application's code, you can still use the docker run --init flag, or add an init process such as tini in your Dockerfile. It is recommended, however, that you write code to handle signals properly for graceful shutdowns.

Conclusion

In this tutorial, we covered a wide range of topics related to Docker image optimization, from constructing a solid Dockerfile to using Snyk to check for vulnerabilities. Making better Node.js applications isn't difficult: if you master these basic skills, you'll be in good shape.
Yitaek Hwang, Software Engineer, NYDIG
Abhishek Gupta, Principal Developer Advocate, AWS
Alan Hohn, Director, Software Strategy, Lockheed Martin
Marija Naumovska, Product Manager, Microtica