The evolution of the cloud over the last decade has radically simplified software deployment.

Undoubtedly, major providers like AWS, Azure, GCP and Alibaba Cloud have removed the barrier to entry for any developer who wants to launch a new product or scale almost without limit. However, no software company can keep scaling without taking a look at its cost of revenue. Andreessen Horowitz reports that in extreme cases, cloud spend for large deployments can reach a whopping 50% of the total cost of revenue!

Barring the extreme cases, cloud spend of 10%-25% of the total cost of revenue is quite common. In our experience, at least for SaaS companies, any cloud spend that spills over into double-digit percentages is eventually questioned, which leads to conversations around optimization, migration and even repatriation.

Many solutions, including those offered by the cloud providers themselves, have emerged to provide better visibility and optimization. Zesty, Apptio, Harness and many others have been solving this problem for many companies.

In this article, we focus on addressing cost visibility and optimization with a design-first principle for any environments powered by

What is Cost Optimized by Design?

Cost optimization can be draining once the environments on the cloud grow to a sizable footprint.

Tagging, correlating and optimizing resources across a technology team that already has a roadmap to meet is non-trivial!

Even when it is done once, net-new development often clutters the visibility again unless it is managed through a strict process. Instead of repeating the optimizations again and again, it may be a better idea to ensure you are cost optimized by design.

We discuss four aspects of cost sustenance below.

1. Cataloguing reduces Leakage Cost 

Leakage costs, however small, are pure wastage! Even a bucket costing $10 per day, if it goes undetected for a year, will cost $3,650. At times, these smaller costs are extremely hard to spot.

In Facets, scanning the software product catalog reveals whether any resource is no longer depended on by anything else. Additionally, taking the bucket example above (say, an S3 bucket), it is possible to programmatically mandate a lifecycle rule for every bucket created. The same approach can apply to any cloud resource provisioned.
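As a rough illustration of the catalog scan described above, the following Python sketch walks a toy catalog and flags resources that no service reaches, plus buckets without a lifecycle rule. The `Resource` fields, the sample resource names and both helper functions are hypothetical, made up for this example; they are not the Facets catalog schema or API.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Resource:
    # Hypothetical catalog entry; a real catalog will carry more metadata.
    name: str
    kind: str
    depends_on: List[str] = field(default_factory=list)
    daily_cost_usd: float = 0.0
    has_lifecycle_rule: bool = False


def find_orphans(catalog, entry_kinds=frozenset({"service"})):
    """Return resources not reachable from any entry-point resource."""
    by_name = {r.name: r for r in catalog}
    reachable = set()
    stack = [r.name for r in catalog if r.kind in entry_kinds]
    while stack:
        name = stack.pop()
        if name in reachable or name not in by_name:
            continue
        reachable.add(name)
        stack.extend(by_name[name].depends_on)
    return [r for r in catalog if r.name not in reachable]


def buckets_missing_lifecycle(catalog):
    """Return buckets that violate a 'every bucket has a lifecycle rule' policy."""
    return [r for r in catalog
            if r.kind == "s3_bucket" and not r.has_lifecycle_rule]


catalog = [
    Resource("checkout-api", "service",
             depends_on=["orders-db", "invoices-bucket"]),
    Resource("orders-db", "database", daily_cost_usd=40.0),
    Resource("invoices-bucket", "s3_bucket", daily_cost_usd=5.0,
             has_lifecycle_rule=True),
    # Nothing depends on this bucket any more: a leakage candidate.
    Resource("old-exports-bucket", "s3_bucket", daily_cost_usd=10.0),
]

orphans = find_orphans(catalog)
annual_leakage_usd = sum(r.daily_cost_usd for r in orphans) * 365
print([r.name for r in orphans], annual_leakage_usd)
```

With this sample data, the only orphan is the $10-per-day bucket, which reproduces the $3,650 yearly figure from the example above.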

2. Propagating Tags and drawing correlation 

Tagging resources and correlating their usage is the most common way companies group their usage and build visibility into it. This involves manual tagging or changing the IaC to ensure tags are present consistently on every resource.

In an ideal case, where cost visibility needs to be broken down by assets, teams, products and P&Ls, tags also evolve over time and need to propagate to other systems, such as metric collection systems. For example, if you are deploying a micro-service on a shared platform like Kubernetes, the underlying tags of the Kubernetes nodes will be insufficient to indicate how much this particular micro-service costs.

To handle such situations, the Facets software product catalog takes a more granular approach: it accepts tags at the level of each micro-service definition. These tags eventually propagate to the Kubernetes cluster as a shared tag (a collective cost) and to the Prometheus metrics, which identify each service's share of the collective cost.
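To make the idea of "share of the collective cost" concrete, here is a minimal Python sketch that splits a shared cluster bill in proportion to per-service usage and then rolls it up by a tag. The usage numbers, the tag names and both functions are assumptions for illustration; in practice the usage figures would come from a metrics system such as Prometheus, and the allocation rule (proportional to CPU core-hours here) is only one of several reasonable choices.

```python
def allocate_shared_cost(cluster_cost_usd, usage_by_service):
    """Split a shared bill in proportion to each service's measured usage."""
    total_usage = sum(usage_by_service.values())
    if total_usage == 0:
        return {name: 0.0 for name in usage_by_service}
    return {
        name: cluster_cost_usd * usage / total_usage
        for name, usage in usage_by_service.items()
    }


def rollup_by_tag(cost_by_service, tags_by_service, tag_key):
    """Aggregate per-service cost by one tag, e.g. 'team' or 'product'."""
    totals = {}
    for service, cost in cost_by_service.items():
        label = tags_by_service.get(service, {}).get(tag_key, "untagged")
        totals[label] = totals.get(label, 0.0) + cost
    return totals


# Hypothetical inputs: CPU core-hours per service and tags taken
# from each micro-service definition.
usage = {"checkout-api": 6.0, "search": 3.0, "notifications": 1.0}
tags = {
    "checkout-api": {"team": "payments"},
    "search": {"team": "discovery"},
    "notifications": {"team": "payments"},
}

cost_by_service = allocate_shared_cost(1000.0, usage)
cost_by_team = rollup_by_tag(cost_by_service, tags, "team")
print(cost_by_service)
print(cost_by_team)
```

Because the tags live on the service definition rather than on the nodes, the same roll-up can be repeated for any other tag key without re-tagging the cluster.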

3. Optimizing Non-production 

Non-production environments (like Dev, QA, Stage, UAT or load-test environments) can account for anywhere between 10% and 25% of your overall cloud spend. Since these environments usually don't carry a sustained workload, they are usually the first to come under the cost optimization axe!

Inadequate non-production environments hamper productivity and can even let major issues leak into production. Facets employs various aggressive optimizations for non-production environments so that they remain adequate while also being cost-effective.

  1. Ephemeral clusters: From a software product catalog, any number of ephemeral clusters can be launched in minutes to perform a specific task such as load testing. These clusters can be kept alive only for the duration of the test, and no effort or cost is spent in the transitional phases of creating and terminating them.
  2. Compaction: Facets exposes a simple-to-use configuration called "compaction factor" that packs more applications onto a given node than a similar deployment in production would.
  3. Spot nodes: Especially on AWS, Facets can provision the non-production cluster entirely on spot nodes. With the trade-off of occasional disruption (notified promptly on channels), this approach saves a ton of cost!
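The effect of compaction on node count can be approximated with simple arithmetic. The Python sketch below assumes one possible interpretation, namely that a compaction factor of `k` divides each application's CPU request by `k` in non-production; this is an assumption for illustration, and the actual semantics of the Facets setting may differ. The function name and all numbers are hypothetical, and the estimate ignores memory limits and bin-packing fragmentation.

```python
import math


def nodes_needed(cpu_requests, node_cpu_cores, compaction_factor=1.0):
    """Estimate node count when requests are shrunk by a compaction factor.

    Assumption: compaction_factor=2.0 means each app requests half the CPU
    it would request in production.
    """
    if compaction_factor < 1.0:
        raise ValueError("compaction_factor must be >= 1.0")
    if node_cpu_cores <= 0:
        raise ValueError("node_cpu_cores must be positive")
    compacted_total = sum(cpu_requests) / compaction_factor
    return math.ceil(compacted_total / node_cpu_cores)


# Twelve apps requesting 2 cores each, on 8-core nodes.
requests = [2.0] * 12

production_like = nodes_needed(requests, node_cpu_cores=8)
compacted = nodes_needed(requests, node_cpu_cores=8, compaction_factor=2.0)
print(production_like, compacted)
```

Under these assumptions the same set of applications fits on two nodes instead of three, which is where the non-production saving would come from.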

4. Spot Maximization in Production 

Spot instances provide a great plan C in addition to:

  1. Discounts through reservations (higher commitment)
  2. On-demand instances (higher optionality)

They provide high discounts without any commitment, but they can disrupt deployments by being terminated and, at times, can be unavailable for hours.

Facets provides the following instrumentation for a typical deployment, which can be customized to suit the criticality of the service to the business:

  1. You can mark non-cloud-ready applications (applications that can't sustain restarts gracefully) in the catalog, which pins all such applications to on-demand node groups. Because this on-demand base capacity is known upfront, you can make reservations against it.
  2. Cloud-ready apps are mapped to a spot fleet with on-demand fallback. If there is a sustained period of spot unavailability, the node group falls back to on-demand instances, with an appropriate notification on the chat-ops tools. The Ops team can then manually move from the on-demand node group back to the spot fleet once spot nodes have been available for a consistent period.
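The two rules above amount to a small placement policy, sketched below in Python. The `App` type, the `place` and `reservable_cores` functions and the sample applications are hypothetical and only model the decision logic; they are not how Facets implements it. In particular, the sketch does not model the manual move back to the spot fleet, which the Ops team performs by hand.

```python
from dataclasses import dataclass
from enum import Enum


class NodeGroup(Enum):
    ON_DEMAND = "on-demand"
    SPOT = "spot"


@dataclass
class App:
    # Hypothetical catalog fields for this example.
    name: str
    cloud_ready: bool  # True if the app tolerates restarts gracefully
    cpu_cores: float


def place(app, spot_available):
    """Pin non-cloud-ready apps to on-demand; others use spot with fallback."""
    if not app.cloud_ready:
        return NodeGroup.ON_DEMAND
    return NodeGroup.SPOT if spot_available else NodeGroup.ON_DEMAND


def reservable_cores(apps):
    """Base on-demand capacity known upfront, a candidate for reservations."""
    return sum(a.cpu_cores for a in apps if not a.cloud_ready)


apps = [
    App("legacy-billing", cloud_ready=False, cpu_cores=8.0),
    App("checkout-api", cloud_ready=True, cpu_cores=4.0),
    App("search", cloud_ready=True, cpu_cores=6.0),
]

normal = {a.name: place(a, spot_available=True) for a in apps}
spot_outage = {a.name: place(a, spot_available=False) for a in apps}
base_capacity = reservable_cores(apps)
print(normal, spot_outage, base_capacity)
```

Only the non-cloud-ready application contributes to the reservable base capacity; the cloud-ready ones land on on-demand nodes only while spot capacity is unavailable.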