AWS Developer Prep

Tricky Sneaky Nasty questions

EBS vs S3 for storing logs

S3 notifications

Difference between SSE-KMS and SSE-S3

AWS SAM CLI

AppSpec.yml order of hooks

ECS Task Placement Group vs Cluster Query Language

AWS Secrets Manager

Route 53 Weighted Routing Policy

.ebextensions to control Elastic Beanstalk’s configuration

Kinesis Streams and Shards

CloudFront Edge Location Cache Invalidation

SAR template for Serverless function

CodeDeploy Dev, Test, Prod

S3 when nothing else works

CloudWatch Standard-Resolution vs High-Resolution metrics

Target tracking scaling policy

Task placement constraints: the weird =~ syntax means memberOf

Auto-Scaling Groups and Custom CloudWatch metrics

CloudWatch Logs and SNS notifications

CloudFront HTTPS encryption b/w client and CF, and b/w CF and origin

CloudFormation ChangeSets

Immutable deployment option to use

ECR docker pull commands

CloudHSM?

SAM package and deploy

Web Identity Federation

S3 Overwrites

Really Cool Questions

Difference b/w Data Encryption Key and CMK

To encrypt large quantities of data with the AWS Key Management Service (KMS), you must use a data encryption key rather than a customer master key (CMK). This is because a CMK can encrypt at most 4 KB of data in a single operation, and in this scenario the objects are 1 GB in size.
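
This is envelope encryption: KMS generates a data key, the data key encrypts the large object locally, and the CMK only wraps the small data key. A toy local sketch of that flow (XOR stands in for a real cipher and is not secure; all names and sizes here are illustrative, not the KMS API):

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy stand-in for a real cipher: XOR with a repeating key."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# The "CMK" never leaves KMS; it only wraps (encrypts) small data keys.
cmk = secrets.token_bytes(32)

def generate_data_key(cmk: bytes):
    """Mimics the GenerateDataKey idea: plaintext key + CMK-wrapped copy."""
    plaintext_key = secrets.token_bytes(32)
    encrypted_key = xor_bytes(plaintext_key, cmk)  # 32 bytes: far under the 4 KB CMK limit
    return plaintext_key, encrypted_key

# Encrypt a "large" object with the data key; store ciphertext + wrapped key.
large_object = secrets.token_bytes(1024 * 1024)    # stand-in for a 1 GB object
plaintext_key, encrypted_key = generate_data_key(cmk)
ciphertext = xor_bytes(large_object, plaintext_key)
del plaintext_key                                  # discard the plaintext key after use

# Decrypt path: unwrap the data key with the CMK, then decrypt the object.
recovered_key = xor_bytes(encrypted_key, cmk)
assert xor_bytes(ciphertext, recovered_key) == large_object
```

The point to remember: only the tiny data key ever goes through the CMK, so the 4 KB limit never matters for the object itself.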

SQS FIFO queues MessageDeduplicationID vs MessageGroupID

SQS Cheat sheet

Data Engineer Questions

Kinesis Data Streams Shards

  • CORRECT: “Merge the cold shards to decrease the capacity of the stream” is the correct answer.

  • INCORRECT: “Split the cold shards to increase the capacity of the stream” is incorrect as this would increase the cost of the stream as you are charged on a per-shard basis.

  • INCORRECT: “Replace the shards with fewer, higher-capacity shards” is incorrect because you cannot change the capacity of shards. However, it is wise to reduce the number of shards in this scenario.

  • INCORRECT: “Reduce the number of EC2 instances” is incorrect as this will not reduce the cost of the Kinesis Data Stream (the consumers are EC2 instances that process the data). In this case we need to decrease the number of shards, as they are underutilized. We also don’t know from the question how many EC2 instances there are; the optimum number equals the number of shards, so dropping below that ratio would leave too few consumers to process the data in the shards.
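
One gotcha with merging: MergeShards only accepts two shards whose hash key ranges are adjacent, and each shard can be part of only one merge at a time. A small sketch of picking merge candidates (shard IDs, ranges, and the "cold" metric are all made up for illustration):

```python
# Hypothetical shard list: contiguous hash key ranges, with a flag marking
# underutilized ("cold") shards. The real merge call takes a shard and its
# adjacent shard; non-adjacent cold shards cannot be merged directly.
shards = [
    {"id": "shard-0", "start": 0,   "end": 99,  "cold": True},
    {"id": "shard-1", "start": 100, "end": 199, "cold": True},
    {"id": "shard-2", "start": 200, "end": 299, "cold": False},
    {"id": "shard-3", "start": 300, "end": 399, "cold": True},
]

def cold_merge_pairs(shards):
    """Return (shard, adjacent_shard) id pairs that are both cold and contiguous."""
    shards = sorted(shards, key=lambda s: s["start"])
    pairs, i = [], 0
    while i < len(shards) - 1:
        a, b = shards[i], shards[i + 1]
        if a["cold"] and b["cold"] and a["end"] + 1 == b["start"]:
            pairs.append((a["id"], b["id"]))
            i += 2  # a shard can only take part in one merge at a time
        else:
            i += 1
    return pairs

print(cold_merge_pairs(shards))  # [('shard-0', 'shard-1')]
```

shard-3 is cold but has no cold neighbor, so it stays put until a later resharding pass.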

Caching Strategies: Write through vs Lazy Loading

  • CORRECT: “Use a lazy loading caching strategy” is the correct answer.

  • INCORRECT: “Use a write-through caching strategy” is incorrect as this will load all database items into the cache increasing cost.

  • INCORRECT: “Only cache database writes” is incorrect as you cannot cache writes, only reads.

  • INCORRECT: “Enable a TTL on cached data” is incorrect. This would help expire stale items, but it is not a cache optimization strategy that caches only the items that are requested.

SNS weds SQS

  • CORRECT: “Publish the messages to an Amazon SNS topic and subscribe each SQS queue to the topic” is the correct answer.

  • INCORRECT: “Publish the messages to an Amazon SQS queue and configure an AWS Lambda function to duplicate the message into multiple queues” is incorrect as this seems like an inefficient solution. By using SNS we can eliminate the initial queue and Lambda function.

  • INCORRECT: “Create an Amazon SWF workflow that receives the messages and pushes them to multiple SQS queues” is incorrect as this is not a workable solution. Amazon SWF is not suitable for pushing messages to SQS queues.

  • INCORRECT: “Create an AWS Step Functions state machine that uses multiple Lambda functions to process and push the messages into multiple SQS queues” is incorrect as this is an inefficient solution and there is no mention of how the functions will be invoked with the message data.
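
The winning answer is the classic SNS fan-out pattern: publish once, and the topic delivers an independent copy to every subscribed queue. A toy local sketch of the pattern (the `Topic` class and queue names are illustrative, not the boto3 API):

```python
from collections import deque

class Topic:
    """Toy SNS-style topic: fans each published message out to all subscribers."""
    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        for queue in self.subscribers:  # every subscribed queue gets its own copy
            queue.append(message)

orders = Topic()
billing_queue, shipping_queue = deque(), deque()  # stand-ins for SQS queues
orders.subscribe(billing_queue)
orders.subscribe(shipping_queue)

orders.publish({"order_id": 42})  # publish once...
# ...and each queue's consumer can process its copy independently.
assert billing_queue[0] == shipping_queue[0] == {"order_id": 42}
```

No intermediate queue, no Lambda glue: the topic does the duplication for you.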

SQS Dead Letter Queue

Ugly Security Questions

What the heck are EC2 Instance profiles?

Developer-authenticated identities .. eh?

Separate IAM roles

Parameter Store