# Cloud Native Principles

The concepts of immutable infrastructure, declarative APIs, and microservices each deserve individual treatment. The following statements are high-level generalizations that can help build an understanding of these three concepts before diving deeply into each of them individually.

***P1*** *- If a project is* ***cloud native*** \[1],\[2]*, it uses* ***immutable infrastructure*** \[3], ***declarative APIs**, and* ***microservices**.*

***P2*** *- If infrastructure is* ***immutable**, it is easily* ***reproduced*** \[4],\[5]*,* ***consistent*** \[6]*,* ***disposable*** \[7],\[8]*, will have a* ***repeatable*** \[9] ***deployment process**, and will not have configuration or artifacts that are modifiable in place.*

The process that implements immutable infrastructure needs to be reproducible (no one should need to ‘think’ each time provisioning occurs) and repeatable (automated). This process also needs to be consistent (infrastructure elements should be identical) and disposable (designed to be easily created, destroyed, replaced, resized, etc.). A project’s immutable infrastructure, which can include everything from the physical hardware at the lowest levels up to the platforms that the application is installed on, has its configuration protected from change (not modifiable) after it is deployed into an environment. This configuration is stored in such a way that it can be used to recreate the infrastructure as needed. Furthermore, the process should be idempotent: applying the same desired state multiple times should always produce that same state.
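As a minimal sketch of two of the properties above (configuration that cannot be modified in place, and an idempotent apply step), consider the following Python. `ServerSpec`, `apply`, the `web-1` name, and the image tags are hypothetical names chosen for illustration; they do not come from any real provisioning tool.

```python
from dataclasses import dataclass, replace, FrozenInstanceError


@dataclass(frozen=True)
class ServerSpec:
    """Hypothetical description of one infrastructure element."""
    image: str
    cpu: int


def apply(environment: dict, name: str, spec: ServerSpec) -> dict:
    """Return an environment in which `name` matches `spec`.

    Idempotent: applying the same spec again changes nothing.
    """
    if environment.get(name) == spec:
        return environment          # already in the desired state
    updated = dict(environment)     # build a replacement, do not patch in place
    updated[name] = spec
    return updated


spec_v1 = ServerSpec(image="app:1.4.2", cpu=2)
env = apply({}, "web-1", spec_v1)
assert apply(env, "web-1", spec_v1) == env   # second apply is a no-op

try:
    spec_v1.cpu = 4                          # in-place change is rejected
except FrozenInstanceError:
    pass

# A change is expressed as a new spec that replaces the old one.
spec_v2 = replace(spec_v1, image="app:1.5.0")
env = apply(env, "web-1", spec_v2)
```

The frozen dataclass stands in for "not modifiable after deployment", and `replace` stands in for building a new template rather than editing a running server.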

***P3*** *- If a project has an efficient and repeatable deployment process, its process is* ***versioned*** \[10], ***automated*** \[11], *and has* ***low overhead/coarse grained packaging*** \[12],\[13],\[14],\[15]

The core of cloud native development rests in coarse-grained packaging, such as that found in container technologies like Docker. Any lightweight, low-overhead technology that provides coarse-grained packaging (packaging all of the dependencies together with the application) can satisfy the deployment requirements for cloud native applications. Normal CI/CD best practices apply to the deployment process itself.

***P4*** *- If a project’s deployment is* ***automated**,* ***configuration*** \[16], ***environment*** \[17], *and* ***artifacts*** \[18] *are completely managed by a* ***pipeline**.*

***P5*** *- If a project’s deployment is managed completely by a* ***pipeline**, the project’s* ***environment*** *is* ***protected*** \[19]

Production environments should be modified only by the automated pipeline process and therefore should not be *directly* modifiable by anyone. This protects against snowflake configuration.

***P6*** *- If a project’s environment is protected, it provides* ***observability*** \[21] *of the project’s internal components.*

In order to maintain, debug, and have insight into a protected environment, its infrastructure elements must have the property of being observable. This means these elements must externalize their internal states in some way that lends itself to metrics, tracing, and logging.
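One way to picture "externalizing internal state" is a component that counts its outcomes and emits a structured log line carrying a trace identifier for every request. The sketch below uses only the Python standard library; `ObservableWorker`, its `handle` method, and the field names in the log record are assumptions made for illustration, not a prescribed interface.

```python
import json
import logging
import time
import uuid
from collections import Counter


class ObservableWorker:
    """Hypothetical component that exposes metrics, logs, and a trace id."""

    def __init__(self, name, logger=None):
        self.name = name
        self.metrics = Counter()                      # metrics
        self.log = logger or logging.getLogger(name)  # logging

    def handle(self, payload, trace_id=None):
        trace_id = trace_id or uuid.uuid4().hex       # tracing
        started = time.perf_counter()
        outcome = "ok"
        try:
            return payload["value"] * 2               # stand-in for real work
        except Exception:
            outcome = "error"
            raise
        finally:
            # Runs on success and on failure, so every request is recorded.
            self.metrics[f"requests_{outcome}"] += 1
            self.log.info(json.dumps({
                "component": self.name,
                "trace_id": trace_id,
                "outcome": outcome,
                "seconds": time.perf_counter() - started,
            }))

    def snapshot(self):
        """Return a copy of the counters for an external scraper."""
        return dict(self.metrics)
```

Because the environment is protected, an operator would read `snapshot()` and the emitted log stream from outside rather than logging in to inspect the process.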

***P7*** *- If a project uses* ***declarative APIs*** \[22], *its* ***configuration*** *is* ***declarative*** \[23],\[24]

***P8*** *- If a project’s configuration is* ***declarative*** \[25], *it designates* ***what*** *to do,* ***not how*** *to do it.*

A declarative API for immutable infrastructure is anything that configures an infrastructure element by stating its desired state. This declaration can come in the form of a YAML file or a script, as long as the configuration designates the desired outcome, not how to achieve that outcome.
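The "replicas equals three" example from endnote 23 can be sketched in a few lines of Python. The declaration is plain data that says *what* should exist; a separate function works out *how* to get there. The `plan` function, the action strings, and the service names are hypothetical and only illustrate the split.

```python
def plan(desired: dict, running: dict) -> list:
    """Derive the imperative steps needed to reach the declared state.

    `desired` and `running` both map a service name to a replica count.
    """
    actions = []
    for service in sorted(set(desired) | set(running)):
        want = desired.get(service, 0)
        have = running.get(service, 0)
        if have < want:
            actions.append(f"start {want - have} x {service}")
        elif have > want:
            actions.append(f"stop {have - want} x {service}")
    return actions


# The declaration: what the system should look like, with no steps in it.
desired_state = {"web": 3}

first_run = plan(desired_state, running={})
steady_state = plan(desired_state, running={"web": 3})
```

The declaration never changes between the two calls; only the derived steps do, and once reality matches the declaration there is nothing left to do.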

***P9*** *- If a project exists as a* ***microservice*** \[26],\[28],\[29]*, it is* ***not monolithic**, it is* ***resilient**, it follows* ***12-factor principles*** \[30], *and is* ***discoverable*** \[31].

When a service is monolithic, multiple business capabilities are tightly coupled, therefore requiring coordination with multiple groups within the organization that are developing the service. A microservice separates concerns based on business capability (features or groups of features). This allows for a more rapid deployment of services with a faster feedback loop.

***P10*** *- If a microservice is* ***resilient**, it is* ***self-healing*** *and* ***distributed*** \[32].

A microservice is also resilient, in that it is accompanied by some kind of strategy for healing itself. This includes strategies for restarting after failures and for distributed scaling in response to load. A microservice scales out to handle load (more processes are spawned on more machines) instead of scaling up (increasing the capacity of the individual machines).
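Both strategies reduce to small calculations. The following sketch assumes a fixed, known capacity per replica and a simple healthy/unhealthy flag per instance; `replicas_for_load`, `heal`, and the `replica-N` naming scheme are hypothetical and stand in for what an orchestrator would do.

```python
import math
from itertools import count


def replicas_for_load(requests_per_second, capacity_per_replica, minimum=1):
    """Scale out: answer load with more identical replicas, not a bigger one."""
    if capacity_per_replica <= 0:
        raise ValueError("capacity_per_replica must be positive")
    return max(minimum, math.ceil(requests_per_second / capacity_per_replica))


def heal(replicas, desired):
    """Discard failed replicas and start replacements.

    `replicas` maps a replica name to True (healthy) or False (failed).
    Returns a new mapping with exactly `desired` healthy replicas.
    """
    healthy = {name: True for name, ok in replicas.items() if ok}
    ids = count()
    while len(healthy) < desired:
        name = f"replica-{next(ids)}"
        if name not in replicas:        # never reuse the name of a failed replica
            healthy[name] = True
    return dict(list(healthy.items())[:desired])


needed = replicas_for_load(requests_per_second=250, capacity_per_replica=100)
after_failure = heal(
    {"replica-0": True, "replica-1": False, "replica-2": True},
    desired=needed,
)
```

Failed replicas are replaced rather than repaired, which matches the disposable infrastructure described earlier.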

***P11*** *- If a microservice is* ***self-healing*** \[33], *it is compatible with* ***declarative configuration*** *and orchestration* \[34]*.*

Once a microservice is coupled with a declarative strategy (a strategy that outlines what the system should look like), it can then be handed over to an orchestrator in order to implement that strategy.

**LICENSE**

This work is licensed under a [Creative Commons Attribution 4.0 International License](http://creativecommons.org/licenses/by/4.0/).

**LIST OF CONTRIBUTORS**

If you would like credit for helping with these documents (for either this document or any of the other four documents linked above), please add your name to the list of contributors.

W Watson Vulk Coop

Taylor Carpenter Vulk Coop

Denver Williams Vulk Coop

Jeffrey Saelens Charter Communications

Bill Mulligan Loodse

## Endnotes

1. “**Cloud** **native** technologies empower organizations to build and run scalable applications in modern, **dynamic** environments such as public, private, and hybrid clouds. **Containers, service meshes**, **microservices**, **immutable infrastructure,** and **declarative APIs** exemplify this approach. These techniques enable loosely coupled systems that are **resilient**, **manageable**, and **observable**. Combined with robust **automation**, they allow engineers to make high-impact changes frequently and predictably with minimal toil.”
2. <https://youtu.be/lmGFgZ889kY?t=318>
3. “**Immutable infrastructure** makes configuration changes by **completely** **replacing** **servers**. Changes are made by **building new server templates**, and then rebuilding relevant servers using those templates. This increases predictability, as there is **little** **variance** between servers as **tested**, and servers in **production**. It requires sophistication in **server template management**.” Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 1611-1614). O'Reilly Media. Kindle Edition.
4. It should be possible to **effortlessly** and reliably rebuild any element of an infrastructure. Effortlessly means that there is **no need to make any significant decisions** about **how** to **rebuild** the thing. Decisions about which software and versions to install on a server, how to choose a hostname, and so on should be captured in the scripts and tooling that provision it. Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 349-352). O'Reilly Media. Kindle Edition.
5. When problems are discovered, fixes may not be rolled out to all of the systems that could be affected by them. Differences in versions and configurations across servers mean that software and scripts that work on some machines don’t work on others. This leads to **inconsistency** across the **servers**, called **configuration drift**. \[...] Even when servers are initially created and configured consistently, **differences** can creep in **over time**: \[...]. But **variations should be captured and managed in a way that makes it easy to reproduce and to rebuild servers and services.** Unmanaged variation between servers leads to **snowflake servers** and automation **fear**. Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 278-290). O'Reilly Media. Kindle Edition.
6. Given **two infrastructure elements** providing a **similar service** (for example, two application servers in a cluster), the servers should be nearly **identical**. Their system software and configuration should be the same, except for those **bits** of **configuration** that differentiate them, like their **IP addresses**. Letting inconsistencies slip into an infrastructure keeps you from being able to trust your automation. If one file server has an 80 GB partition, while another has 100 GB, and a third has 200 GB, then you can’t rely on an action to work the same on all of them. This encourages doing special things for servers that don’t quite match, which leads to **unreliable** **automation**. Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 380-384). O'Reilly Media. Kindle Edition.
7. One of the **benefits** of **dynamic infrastructure** is that **resources** can be easily **created, destroyed, replaced, resized, and moved**. In order to take advantage of this, systems should be designed to **assume** that the infrastructure will **always** be **changing**. **Software** should **continue running** even when **servers** **disappear**, appear, and when they are resized. Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 357-359). O'Reilly Media. Kindle Edition.
8. A popular expression is to “**treat your servers like cattle, not pets**.” Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 362-363). O'Reilly Media. Kindle Edition.
9. Building on the **reproducibility** principle, any action you carry out on your infrastructure should be **repeatable**. This is an obvious benefit of **using scripts and configuration management tools** **rather than** making changes **manually**, but it can be hard to stick to doing things this way, especially for experienced system administrators. Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 393-395). O'Reilly Media. Kindle Edition.
10. “**Everything** you need to **build**, **deploy**, **test**, and **release** your application should be kept in some form of **versioned** storage. This includes requirement documents, test scripts, automated test cases, network configuration scripts, deployment scripts, database creation, upgrade, downgrade, and initialization scripts, application stack configuration scripts, libraries, toolchains, technical documentation, and so on. All of this stuff should be version-controlled, and the relevant **version** should be **identifiable** for any given **build**. That is, these change sets should have a single identifier, such as a **build** **number** or a version control changeset number, that references every piece.” Humble, Jez. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation (Addison-Wesley Signature Series (Fowler)) (p. 26). Pearson Education. Kindle Edition.
11. “In general, your **build** **process** should be automated up to the point where it needs specific human direction or decision making. This is also true of your **deployment process** and, in fact, your entire software **release process**. Acceptance tests can be automated. Database upgrades and downgrades can be automated too. Even network and firewall configuration can be automated. You should automate as much as you possibly can.” Humble, Jez. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation (Addison-Wesley Signature Series (Fowler)) (p. 25). Pearson Education. Kindle Edition.
12. **Containerized services** works by packaging applications and services in **lightweight containers** (as popularized by Docker). This **reduces coupling** between **server configuration** and the things that **run on** the **servers**. **So host servers tend to be very simple, with a lower rate of change.** One of the other change management **models** still needs to be **applied** to these **hosts**, but their implementation becomes much simpler and easier to maintain. **Most effort and attention goes into packaging, testing, distributing, and orchestrating the services and applications**, but this follows something similar to the immutable infrastructure model, which again is simpler than managing the configuration of full-blown virtual machines and servers. Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 1617-1621). O'Reilly Media. Kindle Edition.
13. “The **value** of a **containerization** **system** is that it provides a **standard** **format** for **container** **images** and tools for **building**, **distributing**, and **running** those **images**. Before Docker, teams could isolate running processes using the same operating system features, but Docker and similar tools make the process much simpler.” Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 1631-1633). O'Reilly Media. Kindle Edition.
14. “**Configuration** **management** refers to the process by which all **artifacts** relevant to your project, and the **relationships** between them, are **stored**, **retrieved**, uniquely **identified**, and **modified**.” Humble, Jez. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation (Addison-Wesley Signature Series (Fowler)) (p. 31). Pearson Education. Kindle Edition.
15. Stine, Matt. Migrating to Cloud-Native Application Architecture, O'Reilly, 2015, pp. 25-26. “**Containers** leverage modern **Linux kernel primitives** such as **control** **groups** (cgroups) and **namespaces** to provide similar resource allocation and **isolation** features as those provided by virtual machines with much **less** **overhead** and much greater **portability**.”
16. “... we consider it **bad practice** to **inject** **configuration** **information** at **build** or **packaging** time. This follows from the principle that you should be able to **deploy** the **same** **binaries** to **every environment** so you can ensure that the **thing** that you **release** is the **same** thing that you **tested**. The corollary of this is that anything that **changes** **between deployments** needs to be **captured** as **configuration**, and **not baked** in when the application is **compiled** or **packaged**.” Humble, Jez. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation (Addison-Wesley Signature Series (Fowler)) (pp. 41-42). Pearson Education. Kindle Edition.
17. “An **environment** is **all** of the **resources** that your **application** needs to **work** and their **configuration**.” Humble, Jez. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation (Addison-Wesley Signature Series (Fowler)) (p. 277). Pearson Education. Kindle Edition.
18. “The key **characteristic** of **binaries** is that you should be able to **copy** them onto a **new** **machine** and, given an appropriately **configured** **environment** and the correct **configuration** for the **application** in that environment, start your application—**without** relying on any part of your **development** **toolchain** being installed on that machine.” Humble, Jez. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation (Addison-Wesley Signature Series (Fowler)) (p. 134). Pearson Education. Kindle Edition.
19. “**Don’t Make Changes Directly on the Production Environment**: Most downtime in production environments is caused by uncontrolled changes. Production environments should be completely **locked** **down**, so that **only** your **deployment** **pipeline** can make **changes** to it. That includes everything from the **configuration** of the environment to the **applications** deployed on it and their **data**.” Humble, Jez. Continuous Delivery: Reliable Software Releases through Build, Test, and Deployment Automation (Addison-Wesley Signature Series (Fowler)) (p. 273). Pearson Education. Kindle Edition.
20. \--
21. Stine, Matt. Migrating to Cloud-Native Application Architecture, O'Reilly, 2015, pp. 27–28. “**Visibility**: Our **architectures** must provide us with the **tools** **necessary** to **see** **failure** when it happens. We need the ability to **measure** everything, establish a **profile** for “what’s **normal**,” **detect** **deviations** from the norm (including absolute values and rate of change), and **identify** the **components** contributing to those **deviations**. Feature-rich **metrics**, **monitoring**, **alerting**, and data **visualization** frameworks and tools are at the heart of all cloud-native application architecture.”
22. “**Declarative** **configuration** is **different** from **imperative** **configuration**, where you simply take a series of actions (e.g., apt-get install foo) to modify the world. Years of production experience have taught us that maintaining a written **record** of the system’s **desired** **state** leads to a more **manageable**, **reliable** system. Declarative configuration enables numerous **advantages**, including **code** **review** for configurations as well as **documenting** the **current** **state** of the world for distributed teams. Additionally, it is the **basis** for all of the **self-healing** behaviors in Kubernetes that keep applications running **without user action.**” Hightower, Kelsey; Burns, Brendan; Beda, Joe. Kubernetes: Up and Running: Dive into the Future of Infrastructure (Kindle Locations 892-896). Kindle Edition.
23. “To understand these **two approaches**, consider the task of producing three replicas of a piece of software. With an **imperative** approach, the configuration would say: “**run A, run B, and run C.**” The corresponding **declarative** configuration would be “**replicas equals three**.”” Hightower, Kelsey; Burns, Brendan; Beda, Joe. Kubernetes: Up and Running: Dive into the Future of Infrastructure (Kindle Locations 181-183). Kindle Edition.
24. “The **combination** of **declarative** **state** stored in a **version** control system and Kubernetes’s ability to make **reality** **match** this declarative **state** makes **rollback** of a change trivially **easy**. It is simply restating the previous declarative state of the system. With **imperative** **systems** this is usually **impossible**, since while the **imperative** **instructions** describe how to get you from point A to point B, they **rarely** **include** the **reverse** instructions that can get you back.” Hightower, Kelsey; Burns, Brendan; Beda, Joe. Kubernetes: Up and Running: Dive into the Future of Infrastructure (Kindle Locations 186-190). Kindle Edition.
25. “Because it describes the state of the world, **declarative** **configuration** does **not** have to be **executed** to be **understood**. Its impact is concretely declared. Since the effects of declarative configuration can be understood before they are executed, declarative configuration is far **less error-prone**. Further, the traditional tools of software development, such as **source control, code review, and unit testing**, can be used in **declarative** configuration in ways that are **impossible** for **imperative** instructions.” Hightower, Kelsey; Burns, Brendan; Beda, Joe. Kubernetes: Up and Running: Dive into the Future of Infrastructure (Kindle Locations 183-186). Kindle Edition.
26. Stine, Matt. Migrating to Cloud-Native Application Architecture, O'Reilly, 2015, p. 16. “As we **decouple** the **business domain** into independently deployable **bounded contexts** of **capabilities**, we also **decouple** the associated **change** **cycles**. As long as the changes are restricted to a single bounded context, and the service continues to **fulfill** its existing **contracts**, those changes can be made and **deployed** **independent** of any **coordination** with the rest of the business. The result is enablement of **more** frequent and rapid **deployments**, allowing for a continuous flow of value.”
27. \--
28. Stine, Matt. Migrating to Cloud-Native Application Architecture, O'Reilly, 2015, pp. 27–28. “**Adoption** of new technology can be **accelerated**. Large **monolithic** application architectures are typically associated with **long-term commitments** to technical **stacks**. These commitments exist to **mitigate** the **risk** of adopting new technology by simply not doing it. Technology **adoption** **mistakes** are more **expensive** in a **monolithic** architecture, as those mistakes can pollute the entire enterprise architecture. If we adopt new technology within the scope of a single monolith, we isolate and **minimize** the **risk** in much the same way that we isolate and minimize the risk of runtime failure.”
29. Stine, Matt. Migrating to Cloud-Native Application Architecture, O'Reilly, 2015, pp. 27–28. “**Microservices** offer independent, **efficient** **scaling** of services. **Monolithic** architectures can scale, but **require** us to **scale** **all** **components**, not simply those that are under heavy load. Microservices can be scaled if and only if their associated load requires it.”
30. Stine, Matt. Migrating to Cloud-Native Application Architecture, O'Reilly, 2015, pp. 10–11 “***Codebase*** Each deployable app is **tracked** as one codebase tracked in **revision** control. It may have many deployed instances across multiple environments. ***Dependencies*** An app explicitly declares and **isolates dependencies** via appropriate tooling (e.g., Maven, Bundler, NPM) rather than depending on implicitly realized dependencies in its deployment environment. ***Config*** Configuration, or **anything** that is likely to **differ** between deployment **environments** (e.g., development, staging, production) is **injected** via operating system-level **environment** **variables**. ***Backing services*** Backing services, such as **databases** or message brokers, are treated as **attached resources** and consumed **identically** across all environments. ***Build, release, run*** The **stages** of building a **deployable** app artifact, **combining** that **artifact** with **configuration**, and **starting** one or more **processes** from that artifact/configuration combination, are strictly **separated**. ***Processes*** The app executes as one or more **stateless** **processes** (e.g., master/workers) that **share** **nothing**. Any necessary state is externalized to **backing** **services** (cache, object store, etc.). ***Port binding*** The app is self-contained and **exports** any/all **services** via **port binding** (including HTTP). ***Concurrency*** Concurrency is usually accomplished by **scaling out app processes horizontally** (though processes may also multiplex work via internally managed threads if desired). ***Disposability*** Robustness is maximized via **processes** that **start up** quickly and **shut down gracefully**. These aspects allow for **rapid elastic scaling**, deployment of changes, and **recovery** from crashes. ***Dev/prod parity*** Continuous delivery and deployment are enabled by **keeping** **development**, **staging**, and **production** environments as **similar** as possible. ***Logs*** Rather than managing logfiles, **treat logs as event streams**, allowing the execution environment to **collect**, **aggregate**, **index**, and **analyze** the **events** via **centralized** services. ***Admin processes*** Administrative or **management tasks**, such as database migrations, are executed as **one-off processes** in environments identical to the app’s long-running processes.”
31. “**Service discovery** tools help solve the problem of **finding** which **processes** are listening at which **addresses** for which **services**. A good service discovery system will enable users to **resolve** this information **quickly** and **reliably**. A good system is also **low-latency**; clients are updated soon after the information associated with a service change. Finally, a good service discovery system can **store** a **richer** **definition** of what that **service** is. For example, perhaps there are multiple ports associated with the service.” Hightower, Kelsey; Burns, Brendan; Beda, Joe. Kubernetes: Up and Running: Dive into the Future of Infrastructure (Kindle Locations 1423-1426). Kindle Edition.
32. “In many cases **decoupling** **state** from **applications** and building your **microservices** to be as **stateless** as possible results in **maximally reliable, manageable systems**. However, nearly **every** **system** that has any complexity has **state** in the system somewhere, from the records in a **database** to the index shards that serve results for a web search engine. At some point you have to **have data stored somewhere. Integrating** this **data** with containers and container orchestration solutions is often the most **complicated** aspect of building a distributed system. This complexity largely stems from the fact that the move to containerized architectures is also a move toward decoupled, immutable, and declarative application development. These patterns are relatively easy to apply to stateless web applications, but even “cloud-native” storage solutions like Cassandra or MongoDB involve some sort of **manual or imperative steps to set up a reliable, replicated solution**.” Hightower, Kelsey; Burns, Brendan; Beda, Joe. Kubernetes: Up and Running: Dive into the Future of Infrastructure (Kindle Locations 2908-2915). Kindle Edition.
33. “A **self-healing** infrastructure is an inherently **smart deployment** that is **automated** to **respond** to known and **common failures**. Depending on the failure, the architecture is inherently **resilient** and takes appropriate measures to **remediate** the error.” Laszewski, Tom. Cloud Native Architectures: Design high-availability and cost-effective applications for the cloud (pp. 131-132). Packt Publishing. Kindle Edition.
34. Container **orchestration** tools have emerged following the rise of containerization systems like Docker. Most of these run agents on a pool of container hosts and are able to **automatically** **select** **hosts** to run new **container** **instances**, **replace** **failed** instances, and **scale** numbers of instances **up** and **down**. Some tools also handle **service** **discovery**, **network** **routing**, **storage**, scheduled **jobs**, and other capabilities. Morris, Kief. Infrastructure as Code: Managing Servers in the Cloud (Kindle Locations 2063-2066). O'Reilly Media. Kindle Edition.
