AWS Developer Prep
Tricky Sneaky Nasty questions
EBS vs S3 for storing logs
S3 notifications
Difference between SSE-KMS and SSE-S3
AWS SAM CLI
AppSpec.yml order of hooks
ECS Task Placement Group vs Cluster Query Language
AWS Secrets Manager
Route 53 Weighted Routing Policy
.ebextensions to control Elastic Beanstalk's configuration
Kinesis Streams and Shards
CloudFront Edge Location Cache Invalidation
SAR template for Serverless function
CodeDeploy Dev, Test, Prod
S3 when nothing else works
CloudWatch Standard-Resolution vs High-Resolution metrics
Target tracking scaling policy
Task Placement Group: the weird =~ syntax means memberOf
Auto-Scaling Groups and Custom CloudWatch metrics
CloudWatch Logs and SNS notifications
CloudFront HTTPS encryption b/w client and CF, and b/w CF and Origin
CloudFormation ChangeSets
Immutable deployment option to use
ECR docker pull commands
CloudHSM?
SAM package and deploy
Web Identity Federation
S3 Overwrites
Really Cool Questions
Difference b/w Data Encryption Key and CMK
To encrypt large quantities of data with the AWS Key Management Service (KMS), you must use a data encryption key rather than a customer master key (CMK). This is because a CMK can only encrypt up to 4 KB in a single operation, and in this scenario the objects are 1 GB in size.
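The envelope-encryption pattern behind this can be sketched locally: generate a small data key, encrypt the large object with it, and encrypt only the data key under the CMK. This is a toy simulation, not real crypto — the XOR "cipher" stands in for AES, and all names are illustrative (real code would call KMS GenerateDataKey via the SDK).

```python
import os

# Toy sketch of KMS envelope encryption. A CMK encrypts at most 4 KB per
# operation, so a large object is encrypted with a generated data key and
# only that small data key is encrypted ("wrapped") under the CMK.
# XOR below is a stand-in for a real cipher; do NOT use for actual crypto.

CMK_LIMIT = 4 * 1024  # bytes a CMK can encrypt in one operation

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric 'cipher' (repeating-key XOR); illustration only."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def generate_data_key(cmk: bytes):
    """Mimic KMS GenerateDataKey: plaintext key + CMK-wrapped copy."""
    plaintext_key = os.urandom(32)          # 32 bytes: well under 4 KB
    assert len(plaintext_key) <= CMK_LIMIT  # the CMK only ever sees the key
    return plaintext_key, xor_cipher(plaintext_key, cmk)

def encrypt_large_object(obj: bytes, cmk: bytes):
    plaintext_key, wrapped_key = generate_data_key(cmk)
    ciphertext = xor_cipher(obj, plaintext_key)  # data key handles the 1 GB
    return ciphertext, wrapped_key               # store both; drop plaintext key

def decrypt_large_object(ciphertext: bytes, wrapped_key: bytes, cmk: bytes):
    plaintext_key = xor_cipher(wrapped_key, cmk)  # unwrap key via the CMK
    return xor_cipher(ciphertext, plaintext_key)

cmk = os.urandom(32)
blob, wrapped = encrypt_large_object(b"pretend this is a 1 GB object", cmk)
assert decrypt_large_object(blob, wrapped, cmk) == b"pretend this is a 1 GB object"
```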
SQS FIFO queues: MessageDeduplicationID vs MessageGroupID
Data Engineer Questions
Kinesis Data Streams Shards
- CORRECT: “Merge the cold shards to decrease the capacity of the stream” is the correct answer.
- INCORRECT: “Split the cold shards to increase the capacity of the stream” is incorrect as this would increase the cost of the stream as you are charged on a per-shard basis.
- INCORRECT: “Replace the shards with fewer, higher-capacity shards” is incorrect as you cannot change the capacity of shards. However, it is wise to reduce the number of shards in this scenario.
- INCORRECT: “Reduce the number of EC2 instances” is incorrect as this will not reduce the cost of the Kinesis Data Stream (the EC2 instances are consumers that process the data). In this case we need to decrease the number of shards because they are underutilized. We also don’t know from the question how many EC2 instances there are; the optimum number equals the number of shards, so dropping below this ratio would leave too few consumers to process the data in the shards.
Caching Strategies: Write through vs Lazy Loading
- CORRECT: “Use a lazy loading caching strategy” is the correct answer.
- INCORRECT: “Use a write-through caching strategy” is incorrect as this will load all database items into the cache, increasing cost.
- INCORRECT: “Only cache database writes” is incorrect as you cannot cache writes, only reads.
- INCORRECT: “Enable a TTL on cached data” is incorrect. This would help expire stale items, but it is not a cache optimization strategy that caches only the items that are requested.
SNS weds SQS
- CORRECT: “Publish the messages to an Amazon SNS topic and subscribe each SQS queue to the topic” is the correct answer.
- INCORRECT: “Publish the messages to an Amazon SQS queue and configure an AWS Lambda function to duplicate the message into multiple queues” is incorrect as this is an inefficient solution. By using SNS we can eliminate the initial queue and the Lambda function.
- INCORRECT: “Create an Amazon SWF workflow that receives the messages and pushes them to multiple SQS queues” is incorrect as this is not a workable solution. Amazon SWF is not suitable for pushing messages to SQS queues.
- INCORRECT: “Create an AWS Step Functions state machine that uses multiple Lambda functions to process and push the messages into multiple SQS queues” is incorrect as this is an inefficient solution, and there is no mention of how the functions would be invoked with the message data.
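The fan-out pattern from the correct answer can be sketched with in-memory stand-ins: one publish to a topic delivers a copy of the message to every subscribed queue. With the real services this would be an sns.publish call plus one SQS subscription per queue; the class and queue names here are illustrative.

```python
from collections import deque

class Topic:
    """In-memory stand-in for an SNS topic with SQS queue subscriptions."""

    def __init__(self):
        self.subscribers = []

    def subscribe(self, queue):
        self.subscribers.append(queue)

    def publish(self, message):
        # SNS pushes a copy of the message to every subscription,
        # so no extra queue or Lambda "duplicator" is needed.
        for queue in self.subscribers:
            queue.append(message)

orders = Topic()
billing, shipping, analytics = deque(), deque(), deque()
for q in (billing, shipping, analytics):
    orders.subscribe(q)

orders.publish({"order_id": 42})
assert all(q[0] == {"order_id": 42} for q in (billing, shipping, analytics))
```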