Every year, re:Invent is a Las Vegas all-you-can-eat buffet of new AWS capabilities being announced. Earlier Onica blog posts have covered the 2019 re:Invent keynote announcements from Andy Jassy and Dr. Werner Vogels in near real time. There is so much going on, though, that many other announcements fly under the radar. As a humble serverless and web application developer attending my first re:Invent this year, I noticed a few new capabilities that were not mentioned in the keynotes but caught my attention. One may be the answer to a popular question of our times: what comes after serverless?
AWS Lambda Event Destinations, Parallelization, and FIFO, Oh my!
First, there were a few AWS Lambda enhancements released prior to re:Invent. You may have heard of these already:
AWS Lambda Event Destination:
This is a declarative way to have asynchronously invoked Lambdas deliver their results somewhere. Up until now you had to deliver results on your own, for example into an Amazon SQS queue, Amazon SNS topic, or Amazon DynamoDB table… and you still can. This enhancement just makes that a bit simpler.
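As a rough sketch of what "declarative" means here, the snippet below only builds the arguments for boto3's `put_function_event_invoke_config` call on the Lambda client. The function name and ARNs are hypothetical placeholders, and the helper names are mine, not part of any AWS SDK.

```python
def build_destination_config(function_name, on_success_arn=None, on_failure_arn=None):
    """Build keyword arguments for the Lambda put_function_event_invoke_config call."""
    destination_config = {}
    if on_success_arn:
        destination_config["OnSuccess"] = {"Destination": on_success_arn}
    if on_failure_arn:
        destination_config["OnFailure"] = {"Destination": on_failure_arn}
    if not destination_config:
        raise ValueError("at least one destination ARN is required")
    return {"FunctionName": function_name, "DestinationConfig": destination_config}


def apply_destination_config(kwargs):
    """Send the configuration to AWS. Needs boto3 and credentials, so it is imported lazily."""
    import boto3

    return boto3.client("lambda").put_function_event_invoke_config(**kwargs)


# Hypothetical function name and ARNs, for illustration only.
config = build_destination_config(
    "order-processor",
    on_success_arn="arn:aws:sqs:us-east-1:123456789012:order-results",
    on_failure_arn="arn:aws:sns:us-east-1:123456789012:order-failures",
)
```

The function code itself no longer needs to know where its result goes; the routing lives in configuration.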
Read more about it here: AWS Lambda Destinations
AWS Lambda Parallelization Factor:
Lambdas triggered by Amazon Kinesis and Amazon DynamoDB Streams execute just one at a time per shard. (Yes, DynamoDB has sharding too. You just don’t have to deal with it directly.) This is still the default, but you can now configure up to ten Lambdas to execute in parallel per shard, processing data at greater throughput.
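A minimal sketch of how that setting might be expressed: the helper below builds arguments for boto3's `create_event_source_mapping` call, which accepts a `ParallelizationFactor` between 1 and 10. The function name, stream ARN, and batch size are made-up values for illustration.

```python
def build_stream_mapping(function_name, stream_arn, parallelization_factor=1, batch_size=100):
    """Build keyword arguments for the Lambda create_event_source_mapping call."""
    if not 1 <= parallelization_factor <= 10:
        raise ValueError("parallelization factor must be between 1 and 10")
    return {
        "FunctionName": function_name,
        "EventSourceArn": stream_arn,
        "StartingPosition": "LATEST",
        "BatchSize": batch_size,
        "ParallelizationFactor": parallelization_factor,
    }


def max_concurrent_invocations(shard_count, parallelization_factor):
    """Upper bound on simultaneous invocations for one stream."""
    return shard_count * parallelization_factor


# Hypothetical stream with 4 shards, using the maximum factor.
mapping = build_stream_mapping(
    "clickstream-consumer",
    "arn:aws:kinesis:us-east-1:123456789012:stream/clicks",
    parallelization_factor=10,
)
ceiling = max_concurrent_invocations(shard_count=4, parallelization_factor=10)
```

With the default factor of 1, the same four-shard stream would be limited to four concurrent invocations.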
Read more about it in the AWS blog here: AWS Lambda Parallelization Factor Announcement
Amazon SQS FIFO Queues into AWS Lambda:
We’ve been able to have standard Amazon SQS queues invoke Lambdas to process items in the queue for just a year and a half. Now FIFO queues can be a source for Lambdas as well.
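To make the FIFO case concrete, here is a small sketch of a handler that keeps messages in arrival order within each message group. It assumes the usual SQS event shape delivered to Lambda, where each record carries a `body` and, for FIFO queues, a `MessageGroupId` under `attributes`. The sample event is invented.

```python
def handler(event, context=None):
    """Collect message bodies per FIFO message group, preserving arrival order."""
    groups = {}
    for record in event["Records"]:
        group_id = record["attributes"]["MessageGroupId"]
        groups.setdefault(group_id, []).append(record["body"])
    return groups


# Invented sample event with two message groups interleaved.
sample_event = {
    "Records": [
        {"body": "create order 1", "attributes": {"MessageGroupId": "customer-a"}},
        {"body": "create order 2", "attributes": {"MessageGroupId": "customer-b"}},
        {"body": "cancel order 1", "attributes": {"MessageGroupId": "customer-a"}},
    ]
}
ordered = handler(sample_event)
```

A real handler would act on each message rather than just collecting it, but the ordering guarantee it relies on is the same.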
Read more about it here: AWS Lambda Supports Amazon SQS FIFO
AWS Lambda Provisioned Concurrency
AWS has been busy attacking cold starts! Just a year ago they announced a fix to the massive Lambda-in-VPC penalty. That fix started rolling out just a few months ago, and us-east-1 is in the process of switching to it right now. The year-old Firecracker also reduced cold-start time. Now it seems AWS has eked out all the performance improvements that they can, and so they have dropped the final option: provisioned concurrency. Specify how many instances of your Lambda to keep around, and AWS will keep those warmed up and ready for a reduced price. (Basically, it still consumes memory but not CPU.) Philosophically it feels like a step backwards for serverless, yet it’s a good option to have. The plethora of tools and blogs about keeping Lambdas warm makes it clear that this is something people want. The VPC ENI fix has certainly reduced the need, but there will still be cases where the remaining second or two of cold start is a problem, and provisioned concurrency gives users another option.
Just as tools previously emerged to keep Lambdas warm, I expect a tool to appear any day now to take advantage of the new ProvisionedConcurrencyUtilization metric to automatically adjust provisioning based on load. It would have been a nice alternative to be able to define a number of idle instances to keep warm, rather than a hard count. Maybe that will come later.
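The core of such an adjustment tool could be a simple proportional rule. This is a sketch under my own assumptions: utilization is a fraction between 0 and 1 read from the ProvisionedConcurrencyUtilization metric, and the 70% target and the minimum and maximum bounds are arbitrary illustrative values.

```python
import math


def desired_provisioned_concurrency(current, utilization, target_utilization=0.7,
                                    minimum=1, maximum=100):
    """Scale the provisioned count so that utilization moves toward the target."""
    if current < 1:
        raise ValueError("current provisioned concurrency must be at least 1")
    if not 0 < target_utilization <= 1:
        raise ValueError("target utilization must be in (0, 1]")
    # Instances in use right now, divided by the share we want busy.
    desired = math.ceil(current * utilization / target_utilization)
    # Clamp to the configured bounds.
    return max(minimum, min(maximum, desired))


scale_up = desired_provisioned_concurrency(current=10, utilization=0.9)
scale_down = desired_provisioned_concurrency(current=10, utilization=0.2)
```

Here 10 instances at 90% utilization grow to 13, while the same 10 instances at 20% utilization shrink to 3.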
Read more about it here: AWS Lambda Provisioned Concurrency.
Amazon RDS Proxy
One barrier to using Lambdas in some applications has been the need to work with relational databases. These old SQL databases use a rather heavy protocol for establishing a connection, which presents its own cold-start penalty. In old stateful application servers running on Amazon EC2, connections would be put into an in-memory pool to be reused by new requests and freed upon becoming stale. This was a challenge with Lambdas running in separate processes with no clean shutdown hook upon being destroyed.
Now along comes Amazon RDS Proxy. This is simply the old stateful server keeping the in-memory pool of reusable connections. Your Lambda can connect to this proxy, managed by AWS, instead of directly to the Amazon RDS database. No more waiting for a connection (unless you exhaust the idle pool). This is still in preview, so don’t use it for production. So far it only works with MySQL, MariaDB, and the MySQL flavor of Amazon Aurora. No word yet on the ability to configure the connection pool or what the service will cost. It can be secured via AWS IAM, which opens the possibility that the Lambdas won’t need to be inside of a VPC anymore.
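The pooling idea itself is small enough to sketch. The toy pool below is not how Amazon RDS Proxy is implemented; it only illustrates why reusing an idle connection avoids the cost of opening a new one. `FakeConnection` stands in for a real database connection, and the idle limit is an arbitrary choice.

```python
import itertools


class FakeConnection:
    """Stand-in for an expensive database connection."""
    _ids = itertools.count(1)

    def __init__(self):
        self.id = next(FakeConnection._ids)


class ConnectionPool:
    def __init__(self, max_idle=2):
        self.max_idle = max_idle
        self._idle = []
        self.opened = 0  # how many connections had to be created

    def acquire(self):
        # Reuse an idle connection when one exists; otherwise pay to open a new one.
        if self._idle:
            return self._idle.pop()
        self.opened += 1
        return FakeConnection()

    def release(self, connection):
        # Keep the connection for later callers unless the idle pool is full.
        if len(self._idle) < self.max_idle:
            self._idle.append(connection)


pool = ConnectionPool(max_idle=2)
first = pool.acquire()
pool.release(first)
second = pool.acquire()  # same object as first; nothing new was opened
```

Each short-lived Lambda invocation plays the role of a caller that acquires and releases, while the long-lived proxy holds the pool.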
Read more about it here: Amazon RDS Proxy
HTTP APIs for API Gateway
API Gateway is one of the granddaddy AWS serverless services. Last year API Gateway introduced its v2 API and domains to access a new WebSocket capability. It turns out that was a sign of more to come. Now we have v2 of HTTP. The old v1 API Gateway is still around, full of features, and called the REST API. The new v2 is still in beta and being called HTTP APIs. For most uses, it serves the same purpose. It has fewer features, but that is likely to change as it matures. The point of this new version though is to address the primary complaints of API Gateway REST APIs: high latency (relatively), and high cost (absolutely). This new version should reduce your API Gateway bill by around 70%! That is, once it is generally available and has all the features you need.
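The "around 70%" figure can be checked with quick arithmetic. The per-million request prices below are assumptions on my part (first-tier list prices of roughly $3.50 for REST APIs and $1.00 for HTTP APIs), so check the current pricing page before relying on them.

```python
# Assumed first-tier prices in USD per million requests; verify against current pricing.
REST_PRICE_PER_MILLION = 3.50
HTTP_PRICE_PER_MILLION = 1.00


def monthly_cost(requests, price_per_million):
    """Request charges only; data transfer and caching are ignored."""
    return requests / 1_000_000 * price_per_million


def savings_percent(requests):
    rest = monthly_cost(requests, REST_PRICE_PER_MILLION)
    http = monthly_cost(requests, HTTP_PRICE_PER_MILLION)
    return (rest - http) / rest * 100


rest_bill = monthly_cost(50_000_000, REST_PRICE_PER_MILLION)
http_bill = monthly_cost(50_000_000, HTTP_PRICE_PER_MILLION)
saved = round(savings_percent(50_000_000), 1)
```

Under those assumed prices, 50 million requests a month cost $175 on a REST API versus $50 on an HTTP API, a saving of about 71%.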
Read more about the announcement here: HTTP APIs for API Gateway
Learn more about REST vs HTTP API features here.
On Wednesday, I attended a small session called MOB402: Build data-driven mobile and web apps with AWS AppSync. This title turned out to be code for: Introducing AWS Amplify DataStore!
You may have heard of AWS Amplify, the CLI toolchain, libraries, and CI/CD introduced by the AWS Mobile team in 2018. It manages a variety of AWS services behind its simplified console, CLI, and APIs to make it easier to develop client applications (in web, Android, and iOS) with AWS cloud backends. Because of this initial posture of limited capabilities, it has been largely dismissed as a tool for beginners and maybe rapid development of proof of concepts.
That view was shattered this week after the MOB402 session and my own chats with AWS Mobile team members Richard Threlkeld (who presented MOB402) and Nikhil Dabhade. AWS Amplify is intended for serious development. New features this year, like environments and pull, make large-scale enterprise application development with AWS Amplify possible. However, AWS Amplify was still missing a compelling reason for advanced AWS developers to start using it. There were other tools to accomplish the same tasks with perhaps a bit more complexity, but also more freedom.
DataStore is the killer feature
It turns out that the AWS Mobile team has been working on a secret, paradigm-changing feature all along. It took two years of development by a team that included someone who previously worked on Hibernate ORM (hint!). Here’s how it works:
- Define your relational data model using GraphQL.
- Use the data store in your client application: save, query, and observe changes. Browsers persist the data in IndexedDB, and iOS and Android use SQLite.
On its own, this is already a useful library for local data storage. It is unique in that one GraphQL definition generates classes for you in multiple languages, but otherwise it is nothing new.
Now use the AWS Amplify CLI to push it to AWS. This will deploy a new AWS AppSync API and two Amazon DynamoDB tables per GraphQL @model type: one for content and one to track changes. You don’t care about those details though. What matters is that you now have object-relational mapping (ORM) as a service. The clients will sync data between them, remain fully functional while offline, and automatically resolve conflicts when back online (there are options here). Access is managed with the existing AWS Amplify authentication category and the @auth directive in GraphQL.
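The save, query, and observe trio described above can be modeled in a few lines. To be clear, this is a conceptual Python toy and not the Amplify DataStore API, which is a client library for web, iOS, and Android; the class, the version counter, and the sample data are all invented for illustration.

```python
class LocalStore:
    """Toy local store with save, query, and change observation."""

    def __init__(self):
        self._items = {}
        self._observers = []

    def observe(self, callback):
        # Register a callback invoked as callback(operation, item) on every save.
        self._observers.append(callback)

    def save(self, model_id, fields):
        previous = self._items.get(model_id)
        # A per-item version counter; a sync engine could use something like it to detect conflicts.
        version = previous["version"] + 1 if previous else 1
        item = {"id": model_id, "version": version, **fields}
        self._items[model_id] = item
        operation = "UPDATE" if previous else "INSERT"
        for callback in self._observers:
            callback(operation, item)
        return item

    def query(self, predicate=lambda item: True):
        return [item for item in self._items.values() if predicate(item)]


events = []
store = LocalStore()
store.observe(lambda operation, item: events.append((operation, item["title"])))
store.save("post-1", {"title": "Hello", "rating": 3})
store.save("post-1", {"title": "Hello again", "rating": 5})
top_rated = store.query(lambda item: item["rating"] >= 4)
```

The application only talks to the local store; in the real service, syncing those changes to the cloud and to other clients happens behind that same interface.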
Serverless is about not dealing with backend infrastructure, and just focusing on application logic. (There are still servers, but we don’t manage them.) Does Amplify DataStore qualify as serverless? It allows us to focus only on our application. More specifically, it allows us to focus only on our client-side application logic. If we don’t have to do anything in the cloud, is this the next step? Cloudless?
What impact does this have?
Consider this: For the past year I have worked with a team building a web application that is essentially a UI for a large relational database. We have had to live with slow Lambda cold starts in a VPC and deal with connection pooling to Amazon RDS from Lambdas. These Lambdas only exist to move data between TypeScript data models and SQL via mapping, normalization, and denormalization, and to share data between clients. We even released Cargoplane to provide data change observability between clients.
Does the above sound exactly like what AWS just solved? Does it sound like anything you have done?
AWS has shown that they have no intention of losing their lead in cloud serverless. Most of the advancements are appreciated evolutionary improvements, while Amplify DataStore may prove to be revolutionary; time will tell. Advanced tooling lowers the barrier to entry, bringing more developers and applications into the serverless family. If you need help jumping into serverless or need to dive deeper than AWS Amplify can provide, we’d love to hear from you!