In today’s cloud-dominated tech industry, certifications are an excellent way to upskill and showcase your proficiency in cloud technologies to employers. AWS certifications offer significant benefits for professionals looking to take the next step in their careers. They support career progression, open new opportunities, validate skills and knowledge, and can even translate to higher salary potential.
The AWS Certified DevOps Engineer – Professional certification is one of the more elaborate and demanding cloud certifications. It is designed explicitly for DevOps, Cloud, and Site Reliability Engineers and validates skills around operating, designing, deploying, and managing AWS environments at scale according to best practices.
The AWS DevOps Engineer Professional exam targets professionals who already have advanced skills with DevOps concepts on AWS; it is not a beginners’ exam. If you are just starting your AWS journey, it would be better to look at the AWS Certified Cloud Practitioner, or if you have some experience, any of the Associate level certifications such as the SysOps Administrator, Developer, Data Engineer, or Solutions Architect Associate.
To start with preparation, look at the comprehensive AWS Certified DevOps Engineer – Professional (DOP-C02) Exam Guide. There, you will find helpful information about the exam’s content, format, results, key services, and topics in scope. Also, it will give you an overall understanding of the six domains that are part of the exam:
- Domain 1: SDLC Automation (22% of scored content)
- Domain 2: Configuration Management and IaC (17% of scored content)
- Domain 3: Resilient Cloud Solutions (15% of scored content)
- Domain 4: Monitoring and Logging (15% of scored content)
- Domain 5: Incident and Event Response (14% of scored content)
- Domain 6: Security and Compliance (17% of scored content)
A great resource to study and sharpen your skills is AWS Skill Builder. There, you can find resources for free, such as the Exam Readiness: AWS Certified DevOps Engineer – Professional course with explanations of key concepts, details for each domain, and a few sample exam questions. If you want to go deeper, follow the 6-hour Exam Prep Standard Course. Other paid resources include Udemy or PluralSight courses.
To practice, check out the Official Exam Prep Practice Question Set. Even more elaborate resources, such as the Exam Prep Enhanced Course and the Exam Prep Official Practice Exam, require a paid subscription to AWS Skill Builder. Other paid resources that offer practice tests are Whizlabs, TutorialsDojo, and Udemy.
A complementary way to study and prepare for the exam is to review the AWS documentation for different services in scope and read relevant DevOps-focused whitepapers and blogs for in-depth knowledge of services and best practices.
Apart from the theory, get your hands dirty, build demo environments recommended in the documentation, and practice automating infrastructure provisioning and application deployment using AWS services like AWS CloudFormation, AWS CodeDeploy, and AWS CodePipeline.
Here’s an overview of the Key Technical Topics that you should focus on for this exam:
- Implement CI/CD pipelines.
- Handle secrets
- Deployment strategies for Application Load Balancer (ALB), Autoscaling Groups (ASG), Lambda, API Gateway, and Elastic Container Service (ECS)
- Reusable Infrastructure as Code (IaC) and configuration management services
- Multi-account, multi-region, multi-AZ best practices and deployments
- Governance and policies
- High availability, replication, and failover methods
- Autoscaling, load balancing, and caching solutions
- Disaster recovery & backup strategies
- Monitoring and observability
- Incident and event response
- Security and identity management
If you think that you aren’t familiar with any of these topics/domains, spend some time studying and reading the relevant AWS Services documentation and recent blog posts.
In the meantime, go ahead and learn how a platform like Spacelift can help you and your organization fully manage cloud resources within minutes.
Spacelift is a CI/CD platform for infrastructure-as-code that supports tools like Terraform, Pulumi, Kubernetes, and more. For example, it enables policy-as-code, which lets you define policies and rules that govern your infrastructure automatically. You can even invite your security and compliance teams to collaborate on and approve certain workflows and policies for parts that require a more manual approach. Book a demo today with one of our engineers to learn more about the platform, or open a free account here.
Use these notes as complementary material, not complete study material for the exam. Check out the official AWS Certified DevOps Engineer Professional Exam Guide for comprehensive information.
These notes only cover some of the knowledge you should have for some of these services. In these notes, I gathered more advanced points and topics regarding these services in one place while I was studying for the exam. If you aren’t familiar at all with any of these services, spend some time understanding them first.
AWS frequently changes information, configuration, and options of different services, so some of the content may become outdated. Make sure to cross-check and validate the information you are getting from online sources with the official AWS Documentation and FAQs of each service before your exam.
These notes are an extract of the knowledge and tips I gathered while preparing for the exam. Use them as a complement to your own notes.
1. AWS CodeCommit
- To migrate a Git repository to CodeCommit, clone the Git repository with a mirror argument to the local computer and push the repo to CodeCommit.
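The mirror-based migration above can be sketched as a short list of git commands. This is a minimal illustration, not an official procedure: the function name, the working directory, and both repository URLs are hypothetical, and the sketch only builds the commands without running them.

```python
from typing import List


def mirror_migration_commands(source_url: str, codecommit_url: str,
                              workdir: str = "repo-mirror") -> List[List[str]]:
    """Return the git commands for a mirror-based migration, in order."""
    return [
        # A mirror clone copies every ref (branches and tags), not just the default branch.
        ["git", "clone", "--mirror", source_url, workdir],
        # Push all branches, then all tags, from the local mirror to the target.
        ["git", "-C", workdir, "push", codecommit_url, "--all"],
        ["git", "-C", workdir, "push", codecommit_url, "--tags"],
    ]


if __name__ == "__main__":
    commands = mirror_migration_commands(
        "https://example.com/team/app.git",  # hypothetical source repository
        "https://git-codecommit.us-east-1.amazonaws.com/v1/repos/app",  # hypothetical target
    )
    for cmd in commands:
        print(" ".join(cmd))
```

Each inner list can be passed to a process runner once credentials for the target repository are configured.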
2. AWS CodePipeline
- Use runOrder to run actions in parallel: give the same integer to every action that should run at the same time.
- Pattern: Deploy to pre-prod, then manual approval, then deploy to prod.
- To invoke an external action, trigger a Lambda or Step Functions.
- CodePipeline does not support conditional actions. You can retry a failed action in a particular stage, but you cannot skip a failed action to invoke the next stage. You should use Amazon EventBridge to respond to failures and other pipeline-related actions.
- CodePipeline cannot invoke another CodePipeline directly. This is something you can achieve using a Custom Action and a Lambda function.
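To make the runOrder behavior concrete, here is a small sketch that groups the actions of a single hypothetical stage into execution "waves". The action names and the helper function are made up for illustration; only the name and runOrder fields mirror the pipeline structure.

```python
from collections import defaultdict
from typing import Dict, List

# Hypothetical actions for one stage; the names are invented for this example.
STAGE_ACTIONS = [
    {"name": "DeployPreProdEU", "runOrder": 1},
    {"name": "DeployPreProdUS", "runOrder": 1},
    {"name": "ManualApproval", "runOrder": 2},
    {"name": "DeployProd", "runOrder": 3},
]


def execution_waves(actions: List[dict]) -> List[List[str]]:
    """Group action names by runOrder; actions in the same inner list run in parallel."""
    waves: Dict[int, List[str]] = defaultdict(list)
    for action in actions:
        waves[action["runOrder"]].append(action["name"])
    # Lower runOrder values run first.
    return [waves[order] for order in sorted(waves)]


if __name__ == "__main__":
    for number, wave in enumerate(execution_waves(STAGE_ACTIONS), start=1):
        print(f"wave {number}: {wave}")
```

Here the two pre-prod deployments share runOrder 1 and therefore run together, while the approval and the prod deployment follow sequentially.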
3. AWS CodeBuild
- By default, builds don’t have access to resources inside a VPC. You need to configure VPC settings on the build project to reach them.
- Some default environment variables are already provided (AWS_DEFAULT_REGION, CODEBUILD_BUILD_ARN).
- You need a service role that allows CodeBuild to access AWS resources.
- Use AWS KMS to encrypt build output artifacts.
- Leverage build badges that display the status of the latest build accessed through a public URL for your CodeBuild project. Badges are available at the branch level.
- CODEBUILD_SOURCE_VERSION is exposed at runtime within CodeBuild. For CodeCommit, it holds the commit ID or branch name of the code being built.
- To create builds for PRs, use an EventBridge rule for new CodeCommit PRs and trigger CodeBuild. Use another EventBridge rule to listen for build successes and failures and update the PR with the build outcome.
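The PR-triggered build flow starts with an EventBridge rule on CodeCommit pull request events. The sketch below shows what such an event pattern might look like, together with a deliberately simplified matcher to illustrate how a pattern selects events. The matcher is an assumption for demonstration only and does not reproduce the full EventBridge matching rules.

```python
# Event pattern for new or updated CodeCommit pull requests.
PR_EVENT_PATTERN = {
    "source": ["aws.codecommit"],
    "detail-type": ["CodeCommit Pull Request State Change"],
    "detail": {"event": ["pullRequestCreated", "pullRequestSourceBranchUpdated"]},
}


def matches(pattern: dict, event: dict) -> bool:
    """Simplified matcher: every pattern key must be present in the event, and the
    event value must be one of the listed values. Nested dicts are checked recursively."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        if isinstance(expected, dict):
            if not isinstance(event[key], dict) or not matches(expected, event[key]):
                return False
        elif event[key] not in expected:
            return False
    return True


if __name__ == "__main__":
    sample_event = {  # hypothetical, heavily trimmed event
        "source": "aws.codecommit",
        "detail-type": "CodeCommit Pull Request State Change",
        "detail": {"event": "pullRequestCreated", "pullRequestId": "1"},
    }
    print(matches(PR_EVENT_PATTERN, sample_event))
```

A second rule on CodeBuild build state changes would follow the same shape, with a target that posts the result back to the pull request.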
4. AWS CodeDeploy
- The CodeDeploy agent must run on the target EC2 instances. The instances should be able to access S3 to get deployment manifests.
- The agent is not needed for ECS or Lambda deployments.
- Deployment speed: AllAtOnce, HalfAtATime, OneAtATime, Custom (define the %), blue/green.
- Automate the traffic shift for Lambda aliases, integrated within the SAM framework.
- CodeDeploy on EC2: Hooks to run scripts during the deployment (before/after block traffic, application stop, before/after install, application start, validate service, before/after allowing traffic).
- Hooks on ECS are Lambda functions to execute.
- CodeDeploy cannot deploy a CloudFormation stack.
- The AWS Service Catalog deploy action does not exist in CodeDeploy. Add a Lambda function as an action in CodePipeline to verify and push new versions of products into the AWS Service Catalog by invoking the Service Catalog API.
- You can configure a deployment group or deployment to automatically roll back when a deployment fails or when a specific monitoring threshold is met.
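The EC2 lifecycle hooks listed above run in a fixed order, and a few lifecycle events are reserved for the CodeDeploy agent and cannot run scripts. The following sketch, with a hypothetical helper name and hypothetical script paths, orders the hooks found in an AppSpec hooks section and rejects the reserved events. It assumes an in-place deployment behind a load balancer.

```python
from typing import Dict, List

# Lifecycle events for an in-place EC2 deployment behind a load balancer, in run order.
LIFECYCLE_ORDER = [
    "BeforeBlockTraffic", "BlockTraffic", "AfterBlockTraffic",
    "ApplicationStop", "DownloadBundle", "BeforeInstall", "Install",
    "AfterInstall", "ApplicationStart", "ValidateService",
    "BeforeAllowTraffic", "AllowTraffic", "AfterAllowTraffic",
]
# Events handled by the agent itself; scripts cannot be attached to them.
RESERVED = {"BlockTraffic", "DownloadBundle", "Install", "AllowTraffic"}


def order_hooks(appspec_hooks: Dict[str, list]) -> List[str]:
    """Return the configured hook names in the order CodeDeploy would run them."""
    unknown = set(appspec_hooks) - set(LIFECYCLE_ORDER)
    not_scriptable = set(appspec_hooks) & RESERVED
    if unknown or not_scriptable:
        raise ValueError(f"invalid hooks: {sorted(unknown | not_scriptable)}")
    return [name for name in LIFECYCLE_ORDER if name in appspec_hooks]


if __name__ == "__main__":
    hooks = {  # hypothetical script locations
        "ValidateService": [{"location": "scripts/health_check.sh"}],
        "BeforeInstall": [{"location": "scripts/install_deps.sh"}],
        "ApplicationStop": [{"location": "scripts/stop.sh"}],
    }
    print(order_hooks(hooks))
```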
5. AWS CodeArtifact
- For cross-account access, a principal can only read all the packages in a repository or none of them.
- Upstream repositories to handle dependencies (up to 10) and only one external connection.
- Only one external connection to one CodeArtifact repository that acts as a cache.
- If you have multiple intermediate repositories as upstreams to fetch a package, they aren’t stored in the intermediate repositories.
- A domain defines a common storage for repositories. An asset only needs to be stored once in a domain, and metadata records are updated to other repositories.
- A Policy on domain defines which accounts have access to repositories in the domain.
6. Amazon CodeGuru
- Reviewer: ML Automated code review for static code analysis.
- Secrets Detector: Identify hard-coded secrets on code, config, and documentation and suggest remediation.
- Profiler: Application performance recommendations during runtime. Works for applications on AWS or on-premises, with minimal overhead on the application. Also works for Lambda functions.
7. EC2 Image Builder
- Share images using Resource Access Manager (RAM).
- Store the latest AMI ID in SSM Parameter Store.
- A recipe is a document that defines the base image and the components applied to the base image to produce the desired configuration for the output AMI.
- Use an Image Builder component to customize an instance before image creation or test an instance launched from the created image.
8. AWS CloudFormation
- Parameters: Leverage them to reuse templates. Some inputs can’t be determined ahead of time.
- Pseudo Parameters: Predefined references (e.g. AWS::AccountId).
- Mappings: Hardcoded key value pairs within the template.
- Outputs: Export and import to other stacks and link different templates. You can’t delete a stack that has output references somewhere else. Import with Fn::ImportValue function.
- Conditions: To control the creation of resources or outputs based on a condition.
- By default, if there is a failure, everything gets rolled back – option to preserve the successfully created resources.
- Stack Policies define what update actions are allowed on specific resources during stack updates. When you set a Stack Policy, all resources in the Stack are protected by default. Specify explicit ALLOW for the resources you want to be allowed to be updated.
- Nested stacks: Always update the root (parent) stack first. It’s considered best practice to split stacks.
- cfn-init: Function to install packages, start services etc.
- cfn-signal: Wrapper to signal a CreationPolicy or WaitCondition, enabling you to synchronize other resources. Run it after cfn-init to tell CloudFormation whether the resource was created successfully. A WaitCondition blocks until it receives a signal from cfn-signal.
- cfn-hup: Daemon to check for updates to the metadata and execute custom hooks when changes are detected.
- When in an UPDATE_ROLLBACK_FAILED state, either skip resources that can’t be rolled back or fix the errors outside of CloudFormation and continue the rollback.
- For nested stacks, rolling back the parent stack will try to roll back all the stacks below as well.
- Custom Resources: SNS-backed or Lambda-backed. Use case for an AWS resource that is not covered yet or on-prem resource.
- Set a Service Role for the CloudFormation stack to give specific permissions. Without a service role, CloudFormation uses the IAM permissions of the user who created the stack.
- Reference SSM Parameters from Parameter Store. CloudFormation always fetches the latest value.
- Dynamic references retrieve secrets from Secrets Manager or parameters from SSM Parameter Store at runtime. Supported types: ssm, ssm-secure (encrypted strings), secretsmanager.
- StackSets: Deploy to multiple regions and accounts. Update always affects all stacks. With Organizations, you can automatically deploy to new accounts and delegate StackSets administration to member accounts (Trusted Access with Organizations).
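The Stack Policy behavior described above (everything protected by default, explicit allows required, denies winning over allows) can be sketched with a small evaluator. The policy content and logical resource names are hypothetical, and the evaluator is a simplification: it handles only single-string Action and Resource values with wildcard matching, which is an assumption made for brevity.

```python
from fnmatch import fnmatchcase

# Hypothetical stack policy: allow updates to web server resources,
# but never allow the Auto Scaling group to be replaced.
STACK_POLICY = {
    "Statement": [
        {"Effect": "Allow", "Action": "Update:*", "Principal": "*",
         "Resource": "LogicalResourceId/WebServer*"},
        {"Effect": "Deny", "Action": "Update:Replace", "Principal": "*",
         "Resource": "LogicalResourceId/WebServerGroup"},
    ]
}


def update_allowed(policy: dict, logical_id: str, action: str) -> bool:
    """Return True only if some statement allows the update and none denies it."""
    resource = f"LogicalResourceId/{logical_id}"
    allowed = False
    for statement in policy["Statement"]:
        applies = (fnmatchcase(action, statement["Action"])
                   and fnmatchcase(resource, statement["Resource"]))
        if not applies:
            continue
        if statement["Effect"] == "Deny":
            return False  # an explicit deny always wins
        allowed = True
    return allowed  # stays False when nothing matched: protected by default


if __name__ == "__main__":
    print(update_allowed(STACK_POLICY, "WebServerConfig", "Update:Modify"))
    print(update_allowed(STACK_POLICY, "WebServerGroup", "Update:Replace"))
    print(update_allowed(STACK_POLICY, "Database", "Update:Modify"))
```

The third call targets a resource that no statement mentions, which illustrates the default protection once a policy is set.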
AWS Service Catalog
- Self-service portal to launch authorized products pre-defined by admins.
- Product = CloudFormation template.
- Portfolio = collection of products.
- Allows users to not use CloudFormation directly but only through Service Catalog. Allow users to launch products without deep AWS knowledge in a self-service fashion, with governance, compliance, and consistency.
- Define product deployment options/restrictions with StackSets for accounts, regions, and permissions.
- Launch Constraints: IAM role assigned to a product.
- Template constraints at the AWS Service Catalog level minimize overhead and provide constrained access to the templates.
AWS Systems Manager
- EC2 and on-prem management at scale for patching or automation use cases.
- Works for both Linux and Windows.
- Use Resource Groups to operate on a group of instances. They can be defined with tags.
- SSM automation for common maintenance tasks such as restarting or starting/stopping instances, creating AMI, EBS snapshots, building a golden AMI, remediating Config items
- SSM Run Command: Run commands on EC2 instances
- SSM Session manager: No need for SSH access, keys, and bastion hosts
- Systems Manager Default Host Management Configuration (DHMC): When enabled, it automatically enrolls EC2 instances in SSM as managed instances without requiring an instance profile role. It requires IMDSv2 and an installed SSM Agent, and it keeps the SSM Agent up to date automatically. It must be enabled per Region.
- It can be used for hybrid, on-prem, IoT devices, and edge devices. Create hybrid activation, get activation code & ID, install SSM agent, and register with activation code & ID.
- For IoT Greengrass, install SSM agent + add Token Exchange Role (IAM role).
- SSM OpsCenter allows you to view, investigate, and remediate issues in one place.
- Use VPC endpoints to connect with SSM Session Manager to EC2 instances in private subnets.
- You must create a managed-instance activation to set up servers and virtual machines (VMs) in your hybrid environment as managed instances. After completing the activation, you will receive an activation code and activation ID. You specify this Code/ID combination when you install SSM agents on servers and VMs in your hybrid environment. The Code/ID provides secure access to the Systems Manager service from your managed instances. In the Instance limit field, specify the total number of on-premises servers or VMs you want to register with AWS as part of the activation. This means you don’t need to create a unique activation Code/ID for each managed instance.
- With AWS AppConfig, you can configure and deploy dynamic configurations independently of any code deployments.
9. AWS Organizations
- To migrate accounts to an Organization, you must manually create the OrganizationAccountAccessRole.
- All accounts receive the price reduction benefit for RIs. The management account can turn off the Reserved Instances (RIs) discount and Savings Plan discount sharing for any account at the Organization level.
- Move accounts between Organizations: a) remove the member account from Organization A b) send an invitation to the member account from Organization B c) accept the invitation from the member account to join Organization B.
- Within the organization’s management account, you can enable resource sharing at the organization level.
10. Service Control Policies (SCPs)
- SCPs apply to all users within a member account, including the root user.
- SCPs don’t affect Service-linked roles.
- SCPs don’t apply to the management account.
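The three SCP rules above can be summarized as a filter on top of identity permissions. The sketch below is a deliberately simplified model with hypothetical function and parameter names: permissions are plain sets of action strings, and the real IAM evaluation logic (explicit denies, resource policies, permission boundaries) is out of scope.

```python
from typing import Set


def is_effectively_allowed(action: str,
                           identity_allowed: Set[str],
                           scp_allowed: Set[str],
                           *,
                           management_account: bool = False,
                           service_linked_role: bool = False) -> bool:
    """Simplified model: an action must be allowed by the identity's own
    permissions AND by the SCPs, unless SCPs do not apply to the principal."""
    if action not in identity_allowed:
        return False  # SCPs never grant permissions, they only limit them
    if management_account or service_linked_role:
        return True  # SCPs do not apply here
    return action in scp_allowed


if __name__ == "__main__":
    scp = {"s3:GetObject", "ec2:DescribeInstances"}
    root_like = {"s3:GetObject", "s3:DeleteBucket"}  # even root is bound in member accounts
    print(is_effectively_allowed("s3:DeleteBucket", root_like, scp))
    print(is_effectively_allowed("s3:DeleteBucket", root_like, scp, management_account=True))
```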
11. AWS Control Tower
- Prescriptive guidance for a multi-account strategy on top of AWS Organizations.
- Account Factory: automates AWS account provisioning and deployments. Uses AWS Service Catalog to provision new accounts. Possible to customize account creation with Account Factory Customization and custom blueprints. Blueprints are basically CloudFormation templates defined as Service Catalog Products stored in a dedicated account.
- Guardrails to detect and remediate policy violations: a) Preventive with SCPs b) Detective with AWS Config c) Proactive with AWS CloudFormation hooks.
- Customizations for CT (CfCT): GitOps customization framework using CloudFormation templates and SCPs. Automatically deploy resources to new AWS accounts created using Account Factory.
- Account Factory for Terraform: Customize accounts with Terraform.
- To successfully enroll an account, the account must have the AWSControlTowerExecution role, which grants the AdministratorAccess permission. The role must also have an associated trust relationship.
12. AWS Resource Access Manager (RAM)
- Create a resource share using AWS RAM in the same account containing the resource you need to share.
13. AWS Elastic Beanstalk
- Deployment options: all at once, rolling update, rolling update with additional batch, immutable (launches instances in a new ASG and then swaps them in), blue/green (new environment and switch when ready with the swap URLs option), traffic splitting (canary testing, % of traffic to new deployment).
Supported Deployment Policies
| Deployment policy | Load-balanced environments | Single-instance environments | Legacy Windows Server environments |
| --- | --- | --- | --- |
| All at once | Yes ✅ | Yes ✅ | Yes ✅ |
| Rolling | Yes ✅ | No ❌ | Yes ✅ |
| Rolling with an additional batch | Yes ✅ | No ❌ | No ❌ |
| Immutable | Yes ✅ | Yes ✅ | No ❌ |
| Traffic Splitting | Yes ✅ (Application Load Balancer) | No ❌ | No ❌ |
14. Amazon ECR
- When a repository is encrypted with KMS, to provide access to all the associated AWS accounts in the organization, the KMS key needs a statement that allows KMS operations with a condition that is based on the organization ID.
- ECR repository policies control access to repositories.
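A key policy statement scoped to an organization might look like the sketch below. The aws:PrincipalOrgID condition key is the part that matters; the statement ID, the organization ID, and especially the list of KMS actions are illustrative assumptions, so check the ECR encryption documentation for the operations your setup actually needs.

```python
import json


def org_wide_kms_statement(org_id: str) -> dict:
    """Build a KMS key policy statement that only principals from one
    AWS Organization can use. The action list is an illustrative assumption."""
    return {
        "Sid": "AllowKeyUseWithinOrganization",  # hypothetical statement ID
        "Effect": "Allow",
        "Principal": {"AWS": "*"},
        "Action": [
            "kms:Decrypt",
            "kms:DescribeKey",
            "kms:GenerateDataKey*",
            "kms:CreateGrant",
        ],
        "Resource": "*",
        # The wildcard principal is narrowed to members of the organization.
        "Condition": {"StringEquals": {"aws:PrincipalOrgID": org_id}},
    }


if __name__ == "__main__":
    print(json.dumps(org_wide_kms_statement("o-exampleorgid"), indent=2))
```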
15. AWS Lambda
- Use Application Auto Scaling to manage provisioned concurrency so that functions keep performing well under heavier loads.
- An external extension runs as an independent process in the execution environment and continues to run after the function invocation is fully processed.
- An internal extension runs as part of the runtime process.
- By default, AWS Lambda has a maximum of 1,000 concurrent executions per Region. This can be raised with a quota increase request.
16. Amazon API Gateway
- Enable API Caching to enhance responsiveness.
- API Gateway lets you use mapping templates to map the payload from a method request to the corresponding integration request and from an integration response to the corresponding method response.
- In a canary release deployment, total API traffic is separated randomly into a production release and a canary release with a preconfigured ratio. The updated API features are only visible to the canary release. The canary release receives a small percentage of API traffic, and the production release takes up the rest.
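The random split between the production and canary releases can be illustrated with a small simulation. This is only a model of the idea under stated assumptions (independent random routing per request, a fixed seed for repeatability, hypothetical function name); it does not describe how API Gateway implements routing internally.

```python
import random
from typing import Dict


def route_requests(total: int, canary_percent: float, seed: int = 0) -> Dict[str, int]:
    """Simulate routing `total` requests, sending roughly `canary_percent`
    percent of them to the canary release and the rest to production."""
    rng = random.Random(seed)  # seeded so repeated runs give the same split
    counts = {"production": 0, "canary": 0}
    for _ in range(total):
        target = "canary" if rng.random() * 100 < canary_percent else "production"
        counts[target] += 1
    return counts


if __name__ == "__main__":
    counts = route_requests(total=10_000, canary_percent=10)
    share = counts["canary"] / 10_000
    print(counts, f"canary share: {share:.1%}")
```

Because each request is routed independently, the observed canary share only approximates the configured ratio, and the approximation improves with more traffic.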
17. AWS Step Functions
- Opt to use Step Functions when there is a requirement to keep an audit trail of all executions.
18. Amazon SQS
- For a dead-letter queue, monitor the ApproximateNumberOfMessagesVisible metric. NumberOfMessagesSent does not count messages that SQS moves to the dead-letter queue after failed processing attempts.
19. Amazon EventBridge
- An EventBridge rule can detect CloudTrail AWS login failure events and send notifications to an email address that is subscribed to an SNS topic.
- You can use EventBridge to detect and react to AWS Health events. Depending on the type of event, you can capture event information, initiate additional events, send notifications, and take corrective action. For complex flows, trigger a Lambda function or a Step Functions state machine.
20. AWS Serverless Application Model (SAM)
- Use CodeDeploy to deploy to Lambda.
- It can assist you with running Lambda, DynamoDB, and API Gateway locally.
21. AWS CloudTrail
- IAM user login events are registered only in the us-east-1 region within the CloudTrail event history.
- To determine whether a log file was modified, deleted, or unchanged after CloudTrail delivered it, you can use CloudTrail log file integrity validation. This feature uses industry-standard algorithms: SHA-256 for hashing and SHA-256 with RSA for digital signing.
22. Amazon CloudWatch
- To export logs to an S3 bucket: The S3 bucket must be in the same Region as the log group. S3 Cross-Region Replication (CRR) can copy the logs to the required S3 bucket in a different Region.
- A CloudWatch metric stream can stream metrics to an Amazon Kinesis Data Firehose delivery stream that delivers your CloudWatch metrics to a data lake such as Amazon S3. The Kinesis Data Firehose delivery stream must trust CloudWatch through an IAM role that has write permissions to Kinesis Data Firehose.
- You can use subscriptions to get access to a real-time feed of log events from CloudWatch Logs and have it delivered to other services such as an Amazon Kinesis stream, an Amazon Kinesis Data Firehose stream, or AWS Lambda for custom processing, analysis, or loading to other systems.
23. AWS X-Ray
- For granular monitoring, generate subsegments.
- For Lambda functions, configure the monitoring package as a Lambda external extension on the functions.
24. AWS Config
- Organizational rules that you manage and deploy across all accounts via management account or delegated administrator.
- The cloudformation-stack-drift-detection-check AWS Config managed rule checks whether the actual configuration of a CloudFormation stack differs, or has drifted, from the expected configuration.
- The iam-user-unused-credentials-check AWS Config managed rule can check whether IAM users have passwords or active access keys that have not been used within a specified number of days.
- An AWS Config conformance pack identifies the impact of enrolling an account in AWS Control Tower. This information can be assessed to increase the likelihood of an uneventful enrollment.
- You cannot configure AWS Config’s automatic remediation to target a Lambda function directly.
- An aggregator is an AWS Config resource type that collects AWS Config configuration and compliance data from 1) Multiple accounts and multiple regions 2) Single account and multiple regions. 3) An organization in AWS Organizations and all the accounts in that organization that have AWS Config enabled.
- To isolate alerts for a specific rule, you have to use EventBridge rules, which can then have a particular SNS topic as a target for alerting.
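Isolating alerts for a single rule comes down to an EventBridge event pattern that names the rule. The sketch below builds such a pattern for non-compliant evaluations; the helper name and the choice to filter on NON_COMPLIANT are assumptions for illustration, and the resulting rule would still need an SNS topic as its target.

```python
import json


def config_rule_alert_pattern(rule_name: str) -> dict:
    """Build an EventBridge event pattern that matches compliance changes
    to NON_COMPLIANT for one specific AWS Config rule."""
    return {
        "source": ["aws.config"],
        "detail-type": ["Config Rules Compliance Change"],
        "detail": {
            "configRuleName": [rule_name],
            "newEvaluationResult": {"complianceType": ["NON_COMPLIANT"]},
        },
    }


if __name__ == "__main__":
    pattern = config_rule_alert_pattern("iam-user-unused-credentials-check")
    print(json.dumps(pattern, indent=2))
```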
25. IAM Identity Center
- Access for SAML 2.0-enabled apps, business apps, and AWS accounts.
- You can have only one identity source per organization in AWS Organizations.
- For identity source, you can choose Identity Center directory, Active Directory, or External identity provider (Okta or Microsoft Entra ID).
26. AWS WAF
- Used for layer 7 type of attacks.
- Send logs to CloudWatch, S3, or Kinesis Data Firehose.
- AWS WAF can add a label to any requests that match the rule.
- The scope down statement is a nestable rule statement that you add inside a managed rule group or a rate-based statement to narrow the set of requests that the containing rule evaluates.
27. AWS Firewall Manager
- Manage rules in all accounts of an organization. Manage rules from WAF, Shield Advanced, Security Groups, AWS Network Firewall (VPC level), and Route 53 Resolver DNS Firewall.
- Apply the configured rules to new accounts in the Organization. Apply AWS WAF rule groups across multiple AWS accounts. Firewall Manager policies for AWS WAF can target entire organizations, specific OUs, or a list of AWS accounts.
- AWS Config is required for the use of Firewall Manager security policies.
28. Amazon GuardDuty
- Use a Trusted IP list to add IP addresses and CIDR ranges for which GuardDuty should not generate findings.
29. Amazon Detective
- Detective helps to analyze, investigate, and quickly identify the root cause of security findings.
30. AWS Trusted Advisor
- You can use EventBridge to create a rule with Trusted Advisor as the event source.
- It can check popular code repositories for access keys that have been exposed to the public and for irregular Amazon Elastic Compute Cloud (Amazon EC2) usage that could result from a compromised access key.
31. Amazon EFS
- Access points are application-specific entry points to the EFS file system and provide clients with access to a specific directory or subdirectory on the file system. Use the EFS access point to give read/write access to the filesystem.
- Configure the least privileged access by using an IAM resource policy (also referred to as an EFS file system policy) and access points. You use the file system policy to grant permissions to clients, which are identified by IAM roles.
32. Amazon RDS
- You can minimize downtime on an upgrade by performing a rolling upgrade with read replicas. Amazon RDS doesn’t fully automate one-click rolling upgrades. However, you can still perform one by creating a read replica, upgrading the replica by using the EngineVersion property, promoting the replica, and then routing traffic to the promoted replica.
- RDS supports Blue/Green deployments for database updates.
33. Amazon ElastiCache
- If you have cluster mode enabled, the configuration endpoint allows applications to discover primary and read endpoints for each shard in the cluster.
34. Amazon DynamoDB
- The time to live (TTL) will delete old items from the table.
- DynamoDB streams capture information about every modification to data items in the table. DynamoDB Streams only capture item-level events. Streams operate asynchronously, so there is no performance impact on a table if you enable streams.
- DynamoDB is integrated with AWS Lambda so that you can create triggers to respond to events in DynamoDB Streams. At most two processes (e.g., Lambda functions) should read from the same stream shard simultaneously; more than two readers per shard can result in throttling. If you need more consumers, use a fan-out pattern, for example by publishing the stream events to SNS.
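For the TTL point above, the expiry attribute holds a Unix epoch timestamp in seconds, stored as a number. The sketch below computes such a value; the attribute name expires_at, the item shape, and the 30-day default are hypothetical choices. Expired items are removed by a background process, so deletion is not immediate.

```python
import time
from typing import Optional

SECONDS_PER_DAY = 86_400


def ttl_epoch_seconds(days_to_live: int, now: Optional[float] = None) -> int:
    """Return the expiry time as whole epoch seconds, the format TTL expects."""
    base = time.time() if now is None else now
    return int(base) + days_to_live * SECONDS_PER_DAY


def new_session_item(session_id: str, days_to_live: int = 30,
                     now: Optional[float] = None) -> dict:
    """Build a hypothetical item whose 'expires_at' attribute would be
    configured as the table's TTL attribute."""
    return {"session_id": session_id,
            "expires_at": ttl_epoch_seconds(days_to_live, now)}


if __name__ == "__main__":
    print(new_session_item("abc-123"))
```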
35. AWS Database Migration Service (DMS)
- You can modify the existing replication instance by turning on Multi-AZ support. Multi-AZ support creates a standby replica of the replication instance in another Availability Zone for failover.
36. AWS Glue
- AWS Glue crawlers do not support include patterns. AWS Glue crawlers support only exclude patterns.
In this blog post, we deep-dived into the AWS DevOps Engineer Professional Certification and discussed its scope, content, and preparation material. We also listed some of the more advanced key topics to review before the exam.
Thank you for reading, and I hope you enjoyed this as much as I did!