Overview of AWS Lambda
AWS Lambda is a serverless compute service that runs and scales your code. It is designed to let you run your application code without having to provision and manage your own EC2 instances, which saves you from maintaining and administering an additional layer of technical responsibility within your solution. Instead, that responsibility is passed over to AWS to manage for you. If you don’t need to spend time operating, managing, patching, and securing an EC2 instance, then you have more time to focus on your application code and its business logic, while optimizing costs at the same time.
With AWS Lambda, you only pay for compute power while your Lambda functions are actually running, and I shall explain more on functions later. AWS Lambda charges for compute per 100 milliseconds of execution time, in addition to charging for the number of times your code runs. With this subsecond metering, AWS Lambda offers a truly cost-optimized solution for your serverless environment.
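To make the pricing model concrete, here is a rough sketch of the calculation: requests are charged per million invocations, and compute is charged in GB-seconds, with duration rounded up to the nearest 100 ms. The default rates below are illustrative only; always check current AWS pricing.

```python
import math

def estimate_lambda_cost(invocations, avg_duration_ms, memory_mb,
                         price_per_gb_second=0.0000166667,
                         price_per_million_requests=0.20):
    """Rough cost estimate for a Lambda function; rates are illustrative."""
    # Billed duration is rounded up to the nearest 100 ms
    billed_ms = math.ceil(avg_duration_ms / 100) * 100
    # Compute is charged in GB-seconds: billed seconds x memory in GB
    gb_seconds = invocations * (billed_ms / 1000) * (memory_mb / 1024)
    compute_cost = gb_seconds * price_per_gb_second
    request_cost = (invocations / 1_000_000) * price_per_million_requests
    return compute_cost + request_cost

# One million invocations averaging 120 ms each at 1024 MB of memory
print(round(estimate_lambda_cost(1_000_000, 120, 1024), 2))  # 3.53
```

Note how the 120 ms average is billed as 200 ms: with subsecond metering you pay for rounded-up execution time, never for idle capacity.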
How does it work?
So how does it work? Well, there are essentially four steps to its operations.
- Firstly, AWS Lambda needs to be aware of your code that you need to run. So you can either upload this code to AWS Lambda or write it within the code editors that Lambda provides. Currently, AWS Lambda supports Node.js, Python, Java, C#, Go, and also Ruby. It’s worth mentioning that the code that you write or upload can also include other libraries.
- Once your code is within Lambda, you then need to configure Lambda functions to execute your code upon specific triggers from supported event sources such as S3. As an example, a Lambda function could be triggered when an S3 event occurs, such as an object being uploaded to an S3 bucket.
- Once the specific trigger is initiated during your normal operations of AWS, AWS Lambda will run your code, as per your Lambda function, using only the required compute power as defined. Later in this post, I will cover more on when and how this compute power is specified.
- AWS records the compute time in milliseconds and the quantity of Lambda functions run to ascertain the cost of the service.
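The four steps above can be sketched as code. Below is a minimal handler of the kind you would upload in step one, invoked locally with a trimmed-down version of the S3 event shape Lambda delivers when an object is uploaded (bucket and key names are made up for illustration):

```python
def handler(event, context):
    """Respond to an S3 upload notification delivered by Lambda."""
    results = []
    for record in event.get("Records", []):
        # Each record identifies the bucket and object that triggered us
        bucket = record["s3"]["bucket"]["name"]
        key = record["s3"]["object"]["key"]
        results.append(f"{bucket}/{key}")
    return {"processed": results}

# Simulate the trigger locally with a trimmed-down S3 event
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "my-bucket"},
                "object": {"key": "photos/cat.jpg"}}}
    ]
}
print(handler(sample_event, None))  # {'processed': ['my-bucket/photos/cat.jpg']}
```

In the real service the event argument is populated by the trigger you configure in step two, and the `context` object carries runtime information such as the remaining execution time.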
The Lambda service itself can be found within the AWS Management Console under the Compute category. And remember, Lambda is providing the compute for your code to run on.
Components of AWS Lambda
For an AWS Lambda application to operate, it requires a number of different elements, so I just want to take a few minutes to explain what each of these are. The following form the key constructs of a Lambda application.
- The Lambda function: The Lambda function comprises your own code, which you want Lambda to invoke as per the defined triggers.
- Event sources: Event sources are AWS services that can be used to trigger your Lambda functions. Or put another way, they produce the events that your Lambda function essentially responds to by invoking it.
- Downstream resources: These are the resources that are required during the execution of your Lambda function. For example, your function might need to access a specific SNS topic or a particular SQS queue. They are not the source of the trigger; instead, they are the resources used by the code within the function upon invocation.
- Log streams: To help you identify and troubleshoot issues with your Lambda function, you can add logging statements to your code that show whether it is operating as expected. These log streams are essentially sequences of events that all come from the same function and are recorded in CloudWatch.
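As a sketch of that last component, here is a handler with logging statements that would end up in the function's CloudWatch log stream (the event's `Records` and `value` fields are an invented payload, purely for illustration):

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def handler(event, context):
    # Statements like these are written to the function's log stream
    logger.info("Received event with %d record(s)",
                len(event.get("Records", [])))
    try:
        total = sum(r["value"] for r in event["Records"])
    except KeyError:
        # Log enough detail to troubleshoot the malformed input later
        logger.error("Malformed record in event: %s", event)
        raise
    logger.info("Computed total: %s", total)
    return total
```

Running `handler({"Records": [{"value": 1}, {"value": 2}]}, None)` returns 3 while emitting two informational log entries, giving you a trail to follow if the numbers ever look wrong.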
In addition to log streams, Lambda also sends common metrics of your functions to CloudWatch for monitoring and alerting. Now that we have a high-level understanding of what AWS Lambda is, let me dive deeper into each of these components that I’ve just mentioned so you can understand how they’re linked together to enable AWS Lambda to be used as an event-driven method of executing code within AWS across a serverless architecture.
Selling Lambda to your team by comparing with other services
Some use cases of AWS Lambda are:
- designing advanced materialized views out of DynamoDB tables,
- reacting to uploaded files on S3, and
- processing SNS messages or Kinesis streams.
In short, you can write a stateless Lambda function that will be triggered to react to certain events (or HTTP endpoints).
Data/Events are delivered to your code.
Lambda opens up all kinds of new possibilities and can lower your costs at the same time. When running a job processing server in EC2, you are charged for compute-time as long as your instance is running. Contrast that with Lambda, where you are only charged while actually processing a job, on a 100ms basis. Basically, you never pay for idle time.
This makes Lambda a great fit for spiky or infrequent workloads because it scales automatically and minimizes costs during slow periods. The event-based model Lambda provides makes it perfect for providing a backend for mobile clients, IoT devices, or adding no-stress asynchronous processing to an existing application, without worrying too much about scaling your compute power.
The current computing landscape for AWS looks crowded at first glance with EC2, Elastic Beanstalk, EKS, Lambda, Simple Workflow Service, and more vying for your workload. Before I start the example, you should understand where Lambda fits in compared with other compute services.
EC2 vs Lambda
- EC2 is the most basic service: it provides an instance built from a base image, while you supply the automation, configuration, and code to run. It’s the most flexible option, but it also requires the most work from you.
- Lambda is the complete opposite in that it transparently handles provisioning, underlying OS updates, monitoring, and failover. You only need to provide the code that will run and specify which events should trigger it. Scaling a Lambda function happens automatically; AWS provisions more execution capacity as needed and only charges you for the time your function runs.
Elastic Beanstalk vs Lambda
Elastic Beanstalk is a PaaS that lets you deploy code without worrying about the underlying infrastructure. However, compared to Lambda, it does provide more choices and controls. You can deploy complete applications to Elastic Beanstalk using a more traditional application model compared to deploying individual functions in Lambda.
ECS/EKS vs Lambda
Amazon Elastic Container Service (Amazon ECS) and Amazon Elastic Container Service for Kubernetes (Amazon EKS) are centered around containers compared to the individual functions of Lambda. ECS and EKS require less managerial overhead compared to running containers on EC2 instances, but generally require some operational expertise. Lambda is ideal for developers who just want to focus on their code.
Simple Workflow Service vs Lambda
Simple Workflow Service is a coordination service, and you must provision workers to complete your tasks.
In this example, we will set up a Lambda function, learn how to test code in the AWS Console, and discuss different event sources for bringing data into Lambda. Functions like the ones you will write here can be used to help keep data in sync, fan out writes to users’ news feeds, or update indexes in DynamoDB and other databases.
Event Sources and Event Source Mappings
An event source is an AWS service that produces the events that your Lambda function responds to by invoking it.
Poll or Push based
Event sources can either be poll or push-based. At the time of writing this post, the current poll-based event sources are Amazon Kinesis, Amazon SQS, and DynamoDB. When using these services as an event source, Lambda actually polls the service looking for particular events.
For example, with SQS, Lambda will poll the message queue and then synchronously invoke the associated function when a matching event is found.
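From your function's point of view, that polling is invisible: the handler simply receives a batch of SQS records. Here is a sketch of such a handler, invoked locally with a trimmed-down version of the SQS event shape (the `order_id` payload is made up for illustration):

```python
import json

def handler(event, context):
    """Process a batch of SQS messages delivered by Lambda's poller."""
    orders = []
    for record in event["Records"]:
        # Each SQS record carries the raw message payload in its "body" field
        payload = json.loads(record["body"])
        orders.append(payload["order_id"])
    return {"processed_orders": orders}

# A trimmed-down version of the event shape Lambda delivers for SQS
sample_event = {"Records": [{"body": '{"order_id": 42}'},
                            {"body": '{"order_id": 43}'}]}
print(handler(sample_event, None))  # {'processed_orders': [42, 43]}
```

Because Lambda does the polling and batching for you, the handler never talks to SQS directly; it just transforms the batch it is handed.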
Push-based event sources cover all the remaining supported event sources. With the push model, the event source service both publishes the event and invokes your Lambda function directly. That’s one of the key differences: for poll-based event sources, it is Lambda itself that performs the invocation.
Event source mappings
Simply put, an event source mapping is the configuration that links your event source to your Lambda function, so that events generated by the source invoke the function. However, whether the event source is push or poll-based determines where this event source mapping is configured and stored. For push-based event sources, the mapping is maintained within the event source: using the appropriate API calls for the event source service, you are able to create and configure the relevant mappings.
For example, using the API for S3 bucket notifications, you can specify which events within that bucket to publish and which of your Lambda functions to invoke based on those notifications. This requires specific access to allow your event source to invoke the function, and this access is granted through the Lambda function policy, which is covered later in this post. Poll-based event source mappings are slightly different in that the configuration of the mapping is held within your Lambda function instead. Using the CreateEventSourceMapping API, you can set up the relevant event source mapping for your poll-based service.
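To make the push-based case concrete, here is a sketch of what an S3 bucket notification configuration for such a mapping might look like. The ARN, region, account ID, and filter values are placeholders, not values from a real account:

```json
{
  "LambdaFunctionConfigurations": [
    {
      "Id": "invoke-on-upload",
      "LambdaFunctionArn": "arn:aws:lambda:us-east-1:123456789012:function:MyNewFunction",
      "Events": ["s3:ObjectCreated:*"],
      "Filter": {
        "Key": {"FilterRules": [{"Name": "suffix", "Value": ".jpg"}]}
      }
    }
  ]
}
```

Note that this configuration lives on the bucket, not on the function, which is exactly what "the mapping is maintained within the event source" means in practice.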
Again, permissions will be required, but this time, instead of being granted within the function policy, permission is required in the execution role policy, as it is the function that polls the service, looking for events that match the mapping.
Synchronous vs Asynchronous invocations
When I explained what event sources were, I mentioned synchronous invocation. But what’s the difference between a synchronous and an asynchronous invocation, and what controls which option is used? When you manually invoke a Lambda function, or when your custom-built application invokes it, you have the ability to use the invoke option, which allows you to specify whether the function should be invoked synchronously or asynchronously. When a function is invoked synchronously, it enables you to assess the result before moving on to the next operation required. You may need to know the outcome of one function before using the value or result within another function.
If you want to control the flow of your functions, then synchronous invocations can help you maintain an order. If these points are not a concern for you or your function, then invoking functions asynchronously is the preferable option. When event sources call and invoke your function, the ability to select an invocation type is removed; in that case, the invocation type depends on the service itself.
For poll-based event sources, the invocation type is always synchronous. For push-based event sources, it varies by service: for example, AWS CloudFormation always invokes functions synchronously, while AWS Config invokes them asynchronously.
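When invoking manually, the choice is made through the `InvocationType` parameter of Lambda's Invoke API: `RequestResponse` waits for the function's result (synchronous), while `Event` queues the invocation and returns immediately (asynchronous). A small sketch of building those parameters, without actually calling AWS:

```python
import json

def build_invoke_params(function_name, payload, synchronous=True):
    """Build the parameter dictionary for Lambda's Invoke API call."""
    return {
        "FunctionName": function_name,
        # "RequestResponse" = synchronous, "Event" = asynchronous
        "InvocationType": "RequestResponse" if synchronous else "Event",
        "Payload": json.dumps(payload),
    }

# With boto3, these would be passed as boto3.client("lambda").invoke(**params)
params = build_invoke_params("MyNewFunction", {"id": 1}, synchronous=False)
print(params["InvocationType"])  # Event
```

The helper function itself is hypothetical; only the `Invoke` parameter names and values are part of the Lambda API.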
Monitoring Lambda using CloudWatch
Here we will cover the different methods to help you monitor and troubleshoot issues that you may experience with your Lambda functions. Thankfully, monitoring statistics related to your Lambda function within CloudWatch is by default already configured. This also includes monitoring your functions as they are running. CloudWatch has the following metrics that are automatically populated by Lambda.
- Invocations: This determines how many times a function has been invoked and will match the number of billed requests you are charged for.
- Errors: This metric counts the number of failed invocations of the function, for example the result of a permissions error.
- DeadLetterErrors: This counts the number of times Lambda failed to write to the dead-letter queue, for example due to misconfigured resources or permission issues.
- Duration: This metric simply measures how long the function runs, in milliseconds, from the point of invocation to when it terminates its execution. It is also used for billing; note, however, that billed duration is rounded up to the nearest 100 milliseconds.
- Throttles: This counts how many times the function was invoked and throttled because the concurrency limit had been reached.
- IteratorAge: This is only used for stream-based invocations, such as Amazon Kinesis. It measures, in milliseconds, the age of the last record in each batch: the time between when that record was written to the stream and when Lambda received the batch.
- ConcurrentExecutions: This is a combined metric for all of your Lambda functions that you have running within your AWS account in addition to functions with a custom concurrency limit. It calculates the total sum of concurrent executions at any point in time.
- UnreservedConcurrentExecutions: Again, this is also a combined metric for all of your functions in your account, and it calculates the sum of the concurrency of functions without a custom concurrency limit at any given time.
By utilizing these metrics that are published into CloudWatch you are able to maintain an overview of your functions, and if you are experiencing any unexpected errors you can easily debug them. Using the features of CloudWatch you can easily create a dashboard relating to your functions.
CloudWatch Logs: In addition to these metrics, CloudWatch also gathers log data sent by Lambda, which is very useful to drill into if you notice erroneous patterns emerging from your CloudWatch metrics. For each function that you have running, CloudWatch creates a separate log group, named with the prefix /aws/lambda/ followed by the function name; for example, functions named MyNewFunction and Testing would each get their own log group. You can also add custom logging statements into your function code to verify that your function is operating correctly. These statements push data to CloudWatch Logs automatically, in addition to the managed messages that Lambda sends by default.
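Among those managed messages, Lambda writes a REPORT line at the end of each invocation summarizing duration, billed duration, and memory usage. A sketch of extracting the duration figures from such a line (the request ID below is a made-up sample):

```python
import re

def parse_report_line(line):
    """Extract duration figures from a Lambda REPORT log line."""
    pattern = r"Duration: ([\d.]+) ms\s+Billed Duration: (\d+) ms"
    match = re.search(pattern, line)
    if not match:
        return None
    return {"duration_ms": float(match.group(1)),
            "billed_ms": int(match.group(2))}

# Example REPORT line in the format Lambda writes after each invocation
line = ("REPORT RequestId: 3f683df4-0e3b-11e8-a4cf-b357b5d4e2a0\t"
        "Duration: 102.25 ms\tBilled Duration: 200 ms\t"
        "Memory Size: 128 MB\tMax Memory Used: 35 MB")
print(parse_report_line(line))  # {'duration_ms': 102.25, 'billed_ms': 200}
```

Comparing duration against billed duration across many such lines is a quick way to spot functions that consistently pay for rounding overhead.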
Some Common Errors
IAM Role and Function Policy
When working with Lambda, people often come across similar issues, and I just want to spend a couple of minutes talking about these points to help you easily identify what might be causing them. Some of the most common reasons why your functions might not run relate to permissions.
As we know, an IAM role is required for Lambda to assume and execute the code of your function.
In addition to this role, we also have to create a function policy which specifies which AWS resources are allowed to invoke your function.
Having an error in either one of these components can cause your function to fail. If, for example, the role execution policy is missing the EC2 permissions that allow Lambda to create ENIs for running your function within a VPC, then your function would fail.
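The ENI-related permissions in question are the EC2 actions found in AWS's managed AWSLambdaVPCAccessExecutionRole policy. A minimal execution role statement covering them might look like the following sketch (the broad Resource is for illustration only):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "ec2:CreateNetworkInterface",
        "ec2:DescribeNetworkInterfaces",
        "ec2:DeleteNetworkInterface"
      ],
      "Resource": "*"
    }
  ]
}
```

Without these actions in the execution role, a VPC-enabled function cannot create the network interfaces it needs and will fail to run.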
Alternatively, if your function policy does not include the correct permissions to allow your push based event source to trigger the function required, then again you will receive a failure.
Going back to CloudWatch Logs and how Lambda uses them: if the CloudWatch permissions were removed from the execution role, your logs would not be published to CloudWatch and would fail to be created. Permissions and access to resources, from both your execution role and your function policy, are the most likely causes of a function failing. Be sure to check your policies and understand which policy is used for which purpose.
Summary of Key Features
Role Execution Policies and Function Policies
The role execution policy determines what resources the function role has access to when the function is being run. The function policy defines which AWS resources are allowed to invoke your function.
The handler within your function allows Lambda to invoke it when the service executes the function on your behalf, and it’s used as the entry point within your code to execute your function.
Environment variables are key-value pairs that allow you to incorporate variables into your function without hard-coding them into your code. By default, AWS Lambda encrypts your environment variables at rest using KMS after the function has been deployed.
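Inside the function, environment variables are read like any other process environment. A quick sketch, simulating the runtime locally by setting the variable ourselves (`TABLE_NAME` is a hypothetical variable name, not anything Lambda defines):

```python
import os

def handler(event, context):
    # In Lambda, this value is supplied by the function's configuration;
    # TABLE_NAME is a hypothetical variable name used for illustration.
    table_name = os.environ.get("TABLE_NAME", "default-table")
    return {"table": table_name, "id": event.get("id")}

# Simulate the Lambda runtime locally by setting the variable ourselves
os.environ["TABLE_NAME"] = "orders"
print(handler({"id": 7}, None))  # {'table': 'orders', 'id': 7}
```

Keeping configuration like table names out of the code means the same deployment package can run unchanged against different resources in different environments.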
Other key points
- Basic settings allow you to determine the compute resources used to execute your code; you can only alter the amount of memory, and AWS Lambda then allocates CPU power itself, based on this selection.
- The function timeout determines how long the function should run before it terminates.
- And by default, AWS Lambda is only allowed to access resources that are accessible over the internet. To access resources within your VPC requires additional configuration. The execution role will need permissions to configure ENIs in your VPC.
- A dead-letter queue is used to receive payloads that were not processed due to a failed execution.
- And failed asynchronous invocations are automatically retried a further two times.
- Synchronous invocations do not automatically retry failed attempts.
- Enable active tracing is used to integrate AWS X-Ray to trace event sources that invoked your Lambda function, in addition to tracing other resources that were called upon in response to your Lambda function running.
- Concurrency measures how many functions can be running at the same time, with a default unreserved concurrency set to 1,000.
- AWS CloudTrail integrates with AWS Lambda, aiding with auditing and compliance.
- Throttling sets the reserved concurrency limit of your function to zero, and will stop all future invocations of the function until you change the concurrency setting.
- Lambda qualifiers allow you to switch between versions and aliases of your function. When you create a new version of your function, you’re no longer able to make any further configuration changes to it, making it immutable. An alias allows you to create a pointer to a specific version of your function.
- By exporting your function, you can redeploy at a later stage, perhaps within a different AWS region.
- And by creating a test event, you can easily perform different tests against your function.
You should now have a greater understanding of AWS Lambda and how the service is configured and can be used within your environment to help create serverless applications using minimal compute resources.