How to become a cloud engineer?

Publish Date: 23 March 2024

If you are looking to become a cloud engineer, this guide is for you.

In this guide, I cover:

  1. The fundamentals you need to master to become a cloud engineer.
  2. The cloud services to focus on when you are getting started. This matters because the cloud is vast and it is easy to get lost without a direction.

With these skills you should be able to solve many problems on the cloud and be well prepared for cloud engineering interviews.

What does a cloud engineer do?

A cloud engineer is a person who can:

  1. Architect a solution on the cloud that is reliable, secure, performant, cost optimized and operationally excellent.
  2. Understand the intricacies, differences and features of the different cloud services well enough to choose the best one for the job.
  3. Investigate issues in the cloud and find solutions to them.

Cloud Engineering vs DevOps Engineering

DevOps engineering is largely about making the deployment of applications smoother.

Since most applications today are deployed on the cloud, cloud engineering forms a major part of DevOps engineering. So if you want to become a DevOps engineer, you end up becoming a cloud engineer along the way.

For all practical purposes, the two roles overlap heavily.

Do I need to learn multiple cloud providers to become a cloud engineer?

There are three major cloud providers in the market today: AWS, GCP and Microsoft Azure. If you want to become a cloud engineer, focus on one of the three at the beginning and expand your skills to the other providers over time.

All three offer almost the same set of services, but the implementation details differ.

Becoming a Cloud Engineer

There are two parts to becoming a cloud engineer. The first is mastering the fundamentals of computer engineering. The second is mastering the cloud-specific details: what services the providers offer, how those services work and how they compare to each other.

Fundamentals

The fundamental skills that you need to master as a cloud engineer are:

  1. Scripting and automation skills:

    Your scripting and automation skills should be really good because creating and changing cloud infrastructure is going to be a big part of your role. If you are not good at automation, you will have to set up all the infrastructure manually, which is time-consuming and error-prone. Once the infrastructure is set up, it also has to be replicated across multiple environments. Without automation, that part is hard, boring and does not scale.

  2. Computer networking skills:

    A lot of the issues faced in the cloud are related to networking, so a solid command of networking is important. I have written a guide here showing how you can improve your computer networking skills.

  3. Operating System and Linux skills:

    Linux is a big component of running applications in the cloud, so it is worth spending time learning how Linux works. Learn about virtualization, become very good at using the command line and spend time setting up your command line so that it speeds up your workflow.

  4. Debugging Skills:

    These skills are going to develop over time as you work on more and more issues. However, to be able to debug issues, it is important to have the mindset of not giving up. Also it is very important to communicate with others so you can learn from their insights and experience. This might give you a direction in how to solve a particular issue.

  5. Container Skills:

    This is not really a fundamental skill and is more of a technology. But since a lot of applications in the cloud today run on Docker and Kubernetes, it is important to learn how containers work. Start with Docker and then slowly move on to Kubernetes.
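To make the point about replicating infrastructure across environments concrete, here is a minimal sketch in Python. It is not tied to any real provisioning tool: `EnvironmentSpec`, `BASE`, `OVERRIDES` and `build_environments` are hypothetical names, and the instance types and counts are illustrative assumptions. The idea is only that each environment is derived from one base definition plus a small set of overrides, instead of being configured by hand.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class EnvironmentSpec:
    """Hypothetical description of one environment's infrastructure."""
    name: str
    instance_type: str
    instance_count: int
    deletion_protection: bool


# One base definition shared by every environment (illustrative values).
BASE = EnvironmentSpec(
    name="base",
    instance_type="t3.micro",
    instance_count=1,
    deletion_protection=False,
)

# Only the differences from the base are written down per environment.
OVERRIDES = {
    "dev": {},
    "staging": {"instance_count": 2},
    "prod": {
        "instance_type": "t3.large",
        "instance_count": 4,
        "deletion_protection": True,
    },
}


def build_environments(base, overrides):
    """Derive every environment from the base plus its overrides."""
    return {
        env: replace(base, name=env, **changes)
        for env, changes in overrides.items()
    }


if __name__ == "__main__":
    for env, spec in build_environments(BASE, OVERRIDES).items():
        print(env, spec)
```

In practice this job is usually done by an infrastructure-as-code tool rather than a hand-written script, but the principle is the same: a change to the base reaches every environment, and the differences between environments stay small and visible.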

Cloud Services

Once you have mastered the fundamentals, the next part is to focus on the actual cloud and put those fundamentals to use. If your fundamentals are strong, you should be at ease operating the cloud.

The cloud is made up of hundreds of services, but there are a few core services that you should master. In my experience, mastering the services below covers roughly 80% of what you will need. I reference the AWS names in this section, but the other cloud providers have equivalent services under different names. These services are:

  1. Compute Services:

    A good cloud engineer should be able to choose the right compute service for the task at hand. Compute services can broadly be classified into three parts:

    1. VMs: This is the oldest form of compute available on the cloud. Here you need to learn how to spin up a VM and operate one. Learn about the different options available when spinning up a VM, the ways you can connect to a VM, the different families of VMs and when to choose which family.
    2. Serverless: Serverless is about running code on the cloud without managing the underlying infrastructure. Compare it with VMs on points such as pricing model and deployment strategies.
    3. Containers: Though a container eventually runs on server or serverless infrastructure, I have kept it separate here because of the way containers are managed. They involve some form of a framework like Docker or Kubernetes in between. Learn about the different services that can be used to power containers like EKS and ECS.

    Spend time going into the details of each of the above.

  2. Storage Services:

    Storage services are the services used to store data. They are broadly classified into three types:

    1. Object Storage: Object stores such as S3 are used to store files of any kind, for example images, videos and text.
    2. Database Storage: There are many different types of databases available on the cloud. However, they can broadly be split into two categories: relational and non-relational. Understand when to use which one.
    3. Filesystems: This includes the disks that are attached to VMs, much like the SSDs and hard drives attached to your local machine. It is important to know when to choose which one.

  3. Networking Services:

    Every major cloud provider offers an isolated virtual network in which your infrastructure runs. AWS and GCP call it a VPC (Virtual Private Cloud), and Azure calls it a Virtual Network. It is important to understand how to work with VPCs. Learn how to secure the network using WAF, firewalls and security groups. Learn the difference between public and private subnets.

  4. Identity And Access Management:

    Identity and Access Management is all about controlling which entities can access what in the cloud. The principle of least privilege says that every entity should be given only the minimum access it requires. Learn how access can be granted to different entities in the most secure way.

  5. Logging and Monitoring Services:

    To be able to debug issues, it is very important to have the correct logging and monitoring setup in place. To keep downtime to a minimum, proper alerting should also be set up. Cloud providers today offer a lot of tools that make these tasks easier. AWS has an umbrella service for them called CloudWatch. Learn the best practices around setting up logging and monitoring.

  6. CDN:

    A CDN serves content to customers faster by storing it as close to each customer as possible. The AWS CDN service is called CloudFront. Learn how to set one up.

  7. Autoscaling and High Availability:

    Auto scaling is important for two reasons. First, it saves costs by removing capacity when demand is low. Second, it adds capacity to handle higher traffic during peak hours, which leads to a better customer experience. Do spend some time learning about AWS Auto Scaling groups, Elastic Load Balancing and AMIs (Amazon Machine Images).

  8. Disaster recovery and Backup:

    Disasters can always happen, and a good cloud engineer should be able to recover the infrastructure with minimum downtime for the customer. Understanding how to create backups and knowing how to create a disaster recovery plan are important skills for a cloud engineer.
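As a small illustration of the principle of least privilege from the Identity and Access Management item above, here is a rough sketch in Python that scans an AWS-style IAM policy document for statements that grant more than they probably should. The policy layout (`Statement`, `Effect`, `Action`, `Resource`) follows the IAM JSON policy format. The function name `find_overly_broad_statements`, the example bucket name and the two rules it checks are my own assumptions for illustration, not a complete audit.

```python
def _as_list(value):
    """IAM accepts either a single string or a list for Action and Resource."""
    return [value] if isinstance(value, str) else list(value)


def find_overly_broad_statements(policy):
    """Return (statement_index, reason) pairs for Allow statements
    that use a wildcard action or apply to every resource.

    Assumed, simplified rules: an action of "*" or "<service>:*" and a
    resource of exactly "*" are treated as too broad.
    """
    statements = policy.get("Statement", [])
    if isinstance(statements, dict):  # a single statement is also valid
        statements = [statements]

    findings = []
    for index, statement in enumerate(statements):
        if statement.get("Effect") != "Allow":
            continue
        for action in _as_list(statement.get("Action", [])):
            if action == "*" or action.endswith(":*"):
                findings.append((index, f"wildcard action {action!r}"))
        if "*" in _as_list(statement.get("Resource", [])):
            findings.append((index, "applies to every resource"))
    return findings


# Example policy: the first statement is narrowly scoped, the second is not.
# The bucket name is a placeholder.
EXAMPLE_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": "s3:GetObject",
            "Resource": "arn:aws:s3:::example-bucket/*",
        },
        {
            "Effect": "Allow",
            "Action": ["s3:*"],
            "Resource": "*",
        },
    ],
}

if __name__ == "__main__":
    for index, reason in find_overly_broad_statements(EXAMPLE_POLICY):
        print(f"statement {index}: {reason}")
```

A check like this only catches the most obvious cases. The real work of least privilege is deciding, for each entity, which specific actions on which specific resources it actually needs.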

Once you are done with the above, there are a few aspects you must focus on whenever you design a solution. AWS covers these in a framework called the Well-Architected Framework. A good cloud engineer should be able to apply this framework irrespective of the cloud they choose. Its pillars are:

  1. Operational Excellence
  2. Security
  3. Reliability
  4. Performance Efficiency
  5. Cost Optimization
  6. Sustainability

Read the Well-Architected Framework to get a better understanding of each of the above.

I hope this guide helps you become a cloud engineer. Subscribe to my newsletter for more such content.

Let me know if you have any questions at [email protected]

Your Man

Sagar Gulabani