
What is a CloudFirst architecture?

Leo

In CNCF’s definition, a CloudNative architecture’s key characteristics include:

  1. Microservices
  2. Containerization
  3. Dynamic orchestration

However, if we examine these rules carefully, we find that they are not well suited to foundational services like databases. Microservices improve development and management efficiency, but data transfer between services creates significant bandwidth pressure and added latency, and the inherently distributed design introduces consistency challenges. Containerization merely lets services run inside containers, which are naturally unfriendly to I/O performance. And dynamic orchestration essentially means scaling within a Kubernetes cluster; it does not allow computing resources to be procured in real time across cloud regions. These rules are therefore more Kubernetes-native than truly cloud-native.

We propose a CloudFirst architecture to distinguish it from CloudNative.

CloudFirst is designed for software deployed on cloud infrastructure providers. The specific principles are:

  1. Use object storage as the storage solution.
  2. Leverage cloud infrastructure to implement functionalities like Message Queuing, Load Balancing, and Key Management Services (KMS), ensuring durability, high availability, and the elimination of Single Points of Failure (SPOF).
  3. Primarily manage permissions through IAM (Identity and Access Management).
  4. Rely on Infrastructure as Code (IaC) for deploying applications rather than scripts, dynamically allocating resources based on workload needs.
  5. Prefer Serverless over traditional services, treating cloud-provided virtual machines as higher-cost, higher-performance cloud functions.

We describe the reasons for each principle below.

Prefer object storage to Elastic Block Storage

Object storage offers higher throughput, automatic tiering, and pay-as-you-go pricing. In contrast, EBS requires space to be provisioned up front, which can be costly, and its throughput is comparatively limited. Traditional software typically relies on low-latency IO and does not demand high IO throughput, yet object storage is far more cost-effective and offers very high durability. Adopting it usually requires changing how data is written to and read from storage, but doing so can significantly improve both cost-efficiency and throughput.
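
As a minimal sketch of that shift, assuming boto3 and a hypothetical bucket name: file writes become PUTs of immutable objects, and reads become pay-as-you-go GETs.

```python
import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"  # hypothetical bucket name

# Write path: replace local file writes with PUTs of immutable objects.
with open("part-0000.parquet", "rb") as f:
    s3.put_object(Bucket=BUCKET, Key="tables/events/part-0000.parquet", Body=f)

# Read path: pay-as-you-go GETs, with no block device to provision.
obj = s3.get_object(Bucket=BUCKET, Key="tables/events/part-0000.parquet")
data = obj["Body"].read()
```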

Choose cloud infrastructure instead of open source software

In the cloud, open source software is typically deployed on virtual machines (VMs), which reach the hardware only through a virtualization layer. Individual VMs often lack strong SLA guarantees, so multiple replicas are needed to ensure high availability and durability, and managing VM-based software is operationally demanding. Cloud-provided managed services, by contrast, come with SLA commitments and maintenance included. Most of them charge based on usage, which for unpredictable workloads can mean lower costs and higher capacity than provisioned computing resources.
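
For example, rather than operating a message broker on VMs, a CloudFirst design would lean on a managed queue such as SQS. A minimal sketch with boto3 (queue name hypothetical):

```python
import boto3

sqs = boto3.client("sqs")

# Managed queue: durability, HA, and maintenance are the provider's job,
# and billing is per request rather than per provisioned broker.
queue_url = sqs.create_queue(QueueName="ingest-tasks")["QueueUrl"]

sqs.send_message(QueueUrl=queue_url, MessageBody='{"table": "events"}')

msgs = sqs.receive_message(QueueUrl=queue_url, MaxNumberOfMessages=1)
for m in msgs.get("Messages", []):
    # ... process m["Body"] here ...
    sqs.delete_message(QueueUrl=queue_url, ReceiptHandle=m["ReceiptHandle"])
```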

IAM is a superior option for cloud security compared to using passwords or access keys

Using IAM to manage permissions not only enhances security but also supports operational efficiency, compliance, and scalability in managing digital identities and access rights within an organization.
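
A concrete implication is that code should carry no passwords or access keys at all. As a hedged sketch with boto3 (the role ARN is hypothetical): constructing a client picks up credentials from the attached IAM role via the default credential chain, and elevated access uses short-lived STS tokens instead of long-lived secrets.

```python
import boto3

# No access keys in code or config: the default credential chain resolves
# the IAM role attached to the Lambda function or EC2 instance.
s3 = boto3.client("s3")

# Cross-account or elevated access uses short-lived STS credentials
# rather than stored secrets (role ARN below is hypothetical).
sts = boto3.client("sts")
creds = sts.assume_role(
    RoleArn="arn:aws:iam::123456789012:role/analytics-reader",
    RoleSessionName="query-worker",
)["Credentials"]

scoped_s3 = boto3.client(
    "s3",
    aws_access_key_id=creds["AccessKeyId"],
    aws_secret_access_key=creds["SecretAccessKey"],
    aws_session_token=creds["SessionToken"],
)
```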

IaC is the only choice for deployment in the cloud era

Kubernetes YAML is a common form of Infrastructure as Code (IaC), but in cloud computing IaC can do far more than deploy images to virtual machines, Lambda functions, or containers. It can also define the required Platform as a Service (PaaS) offerings, permissions, and other useful cloud services such as load balancers and KMS, and even set up the necessary logging services. This approach is the only way to utilize cloud computing to its full potential.
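
As an illustration, assuming the AWS CDK for Python (one IaC option among many; stack and asset names hypothetical), a single stack can declare the compute, the storage with its tiering policy, and the IAM permission between them:

```python
from aws_cdk import App, Stack, Duration
from aws_cdk import aws_s3 as s3, aws_lambda as _lambda
from constructs import Construct

class AnalyticsStack(Stack):
    # One stack declares compute, storage, tiering, and permissions together.
    def __init__(self, scope: Construct, id: str, **kwargs) -> None:
        super().__init__(scope, id, **kwargs)
        bucket = s3.Bucket(
            self, "DataBucket",
            lifecycle_rules=[s3.LifecycleRule(transitions=[
                s3.Transition(
                    storage_class=s3.StorageClass.INTELLIGENT_TIERING,
                    transition_after=Duration.days(0),
                )])],
        )
        fn = _lambda.Function(
            self, "QueryFn",
            runtime=_lambda.Runtime.PYTHON_3_12,
            handler="handler.main",
            code=_lambda.Code.from_asset("lambda"),  # hypothetical asset dir
        )
        bucket.grant_read(fn)  # the IAM policy is part of the same code

app = App()
AnalyticsStack(app, "CloudFirstDemo")
app.synth()
```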

Serverless is the perfect blend of cost efficiency and performance

For workloads with significant variability, serverless computing offers the highest return on investment (ROI) because it eliminates the need to reserve resources for peak load, thereby reducing waste. Traditional multi-tenant architectures often rely on interpreted execution so that the same infrastructure can serve different workloads, but this approach both adds complexity and sacrifices resource efficiency. In data analytics services, for example, the same dataset and query can result in vastly different workloads, so forgoing serverless means either substantial idle waste or long cold start times.
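
A minimal sketch of the serverless side, assuming an AWS Lambda-style handler (event shape and engine entry point hypothetical): each invocation is billed only for its own duration, so idle tenants cost nothing and bursty tenants simply fan out to more concurrent invocations.

```python
import json

def run_query(sql: str) -> list:
    # Hypothetical stand-in for the actual query engine.
    return [{"sql": sql, "rows": 0}]

def handler(event, context):
    # One invocation per query: billing covers only this invocation's
    # duration, so there is no reserved capacity sitting idle between
    # bursts and no shared interpreter mediating tenants.
    result = run_query(event["query"])
    return {"statusCode": 200, "body": json.dumps(result)}
```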

The challenges of CloudFirst architecture

Because IDC (Internet Data Center) infrastructure differs significantly from cloud infrastructure, a CloudFirst architecture requires many design changes relative to traditional software architectures in order to fully leverage the true potential and value of the cloud.

Dynamically allocate resources instead of overloading a single server

Dynamic resource allocation on the cloud can be achieved through IaC techniques, but the challenge isn’t just allocating resources; it’s allocating the right resources for the workload. Sizing resources appropriately is difficult, and handling allocation failures presents a further challenge.
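
A hedged sketch of those two hard parts (every name and threshold below is hypothetical): derive the worker size from the workload, and degrade to other sizes rather than failing outright when an allocation is refused.

```python
class CapacityError(Exception):
    """Hypothetical stand-in for a provider 'insufficient capacity' error."""

def provision(memory_mb: int) -> dict:
    # Hypothetical stand-in for the SDK/IaC call that allocates a worker.
    return {"memory_mb": memory_mb}

def choose_memory_mb(scan_bytes: int) -> int:
    # The hard part: size the worker from the workload, not from a fixed
    # server spec (thresholds here are illustrative only).
    if scan_bytes < 100 * 2**20:   # < 100 MiB
        return 512
    if scan_bytes < 10 * 2**30:    # < 10 GiB
        return 3008
    return 10240

def allocate_worker(scan_bytes: int) -> dict:
    # Fall back through other sizes instead of failing outright.
    for mb in dict.fromkeys([choose_memory_mb(scan_bytes), 3008, 512]):
        try:
            return provision(memory_mb=mb)
        except CapacityError:
            continue
    raise RuntimeError("no capacity available at any size")
```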

Leverage IO over CPU

In traditional computing environments, CPU resources are generally in surplus while IO is slow, so strategies often involve trading time for space. In cloud computing, however, CPUs are bundled with memory and sold together, making them comparatively expensive, while object storage lets the same data be accessed from multiple instances without one instance’s bandwidth interfering with another’s. Shifting the balance toward IO therefore amounts to a complete overhaul of how data is stored and accessed.
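
One way to exploit this, sketched with boto3 (bucket and key hypothetical): issue parallel byte-range GETs so that a single modest instance saturates object-storage throughput instead of burning expensive CPU to avoid IO.

```python
from concurrent.futures import ThreadPoolExecutor
import boto3

s3 = boto3.client("s3")
BUCKET, KEY = "my-bucket", "tables/events/part-0000.parquet"  # hypothetical

def fetch_range(start: int, end: int) -> bytes:
    # S3 serves independent byte ranges concurrently, and many readers
    # can hit the same object without interfering with one another.
    resp = s3.get_object(Bucket=BUCKET, Key=KEY,
                         Range=f"bytes={start}-{end}")
    return resp["Body"].read()

size = s3.head_object(Bucket=BUCKET, Key=KEY)["ContentLength"]
chunk = 8 * 2**20  # 8 MiB ranges
ranges = [(i, min(i + chunk, size) - 1) for i in range(0, size, chunk)]
with ThreadPoolExecutor(max_workers=16) as pool:
    parts = list(pool.map(lambda r: fetch_range(*r), ranges))
data = b"".join(parts)
```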

Reduce memory retention

Traditional software often utilizes memory to save state, such as metadata that is typically obtained through costly loading or computation. Therefore, traditional applications usually require a warm-up period because they are designed to run continuously for days, or even years, after startup. However, this approach is not well-suited to cloud computing environments where compute resources should be used and then discarded to minimize billing time.
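
A hedged sketch of the alternative (bucket and key names hypothetical): persist the expensive-to-build metadata as an object and load it on demand, so a fresh instance is useful immediately and can be discarded just as quickly.

```python
import json
import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"            # hypothetical
META_KEY = "meta/catalog.json"  # hypothetical snapshot of expensive state

def save_metadata(meta: dict) -> None:
    # Persist the costly-to-compute state once...
    s3.put_object(Bucket=BUCKET, Key=META_KEY, Body=json.dumps(meta))

def load_metadata() -> dict:
    # ...so any fresh, short-lived instance skips the warm-up phase
    # and nothing needs to survive in process memory between runs.
    obj = s3.get_object(Bucket=BUCKET, Key=META_KEY)
    return json.loads(obj["Body"].read())
```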

Use local storage as /tmp

Traditional software treats local disk storage as a completely reliable repository, flushing data to disk even during events like power failures so that it can be recovered on the next startup. In cloud environments, however, the data on a VM’s local storage cannot be guaranteed to survive; even with three copies, loss remains possible, and achieving true high availability and durability on your own in the cloud theoretically incurs significant costs.

Therefore, the correct approach is to use high-performance instance storage as an expanded /tmp (temporary storage) rather than as a reliable storage solution. Almost all traditional software must be rewritten to adapt to this characteristic of the cloud.
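
As a sketch of that pattern (bucket, keys, and the local transform are hypothetical): instance storage is purely scratch space for spills and caches, and object storage is the only durable tier.

```python
import os
import tempfile
import boto3

s3 = boto3.client("s3")
BUCKET = "my-bucket"  # hypothetical

def transform(path: str) -> str:
    # Hypothetical local computation over the spilled file.
    return path

def process(key: str) -> None:
    # Instance storage is scratch space only: fast, cheap, disposable.
    with tempfile.NamedTemporaryFile(dir="/tmp", delete=False) as spill:
        s3.download_fileobj(BUCKET, key, spill)            # durable -> scratch
        spill_path = spill.name
    result_path = transform(spill_path)                     # local work on /tmp
    s3.upload_file(result_path, BUCKET, f"results/{key}")   # scratch -> durable
    os.remove(spill_path)  # losing /tmp between runs would cost nothing
```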

Minimize resource occupancy

Cloud computing bills by the duration of resource usage, so traditional optimizations like minimizing memory footprint are not always appropriate in the cloud. What matters instead is resources × time: it is often more cost-effective to use more resources for a shorter duration and then release them. The challenge for a CloudFirst architecture is to maximize resource utilization in bursts rather than to optimize performance within a fixed set of resources.
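
A quick worked example of the resources × time framing (the per-GB-second rate below is illustrative): ten parallel workers for one minute cost the same as one worker for ten minutes, but return results ten times sooner.

```python
PRICE_PER_GB_SECOND = 0.0000166667  # illustrative Lambda-like rate

def cost(workers: int, gb_each: float, seconds: float) -> float:
    # Cloud billing scales with resources x time, so burst and serial
    # plans with equal resource-seconds cost the same amount.
    return workers * gb_each * seconds * PRICE_PER_GB_SECOND

serial = cost(workers=1, gb_each=4, seconds=600)  # 1 worker, 10 minutes
burst = cost(workers=10, gb_each=4, seconds=60)   # 10 workers, 1 minute
assert abs(serial - burst) < 1e-9                 # same cost, 10x faster
```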

Conclusion

With a CloudFirst architecture, you can achieve better functionality at lower cost than traditional software, but as a developer you must redesign your architecture and software around the characteristics of the cloud. Optimization techniques that mattered in the IDC era are less relevant in the cloud era, which prioritizes reducing state, providing burst capability, optimizing IO usage, and replacing reliance on unreliable single-machine storage with object storage. Building automatic resource allocation and release directly into the software is likewise essential.
