As it turns out, AWS Lambda promises and reality are worlds apart

The story so far

Once upon a time, people used the “server” paradigm. They would wake up each morning, brush their teeth, drink their coffee and decide whether they wanted to deploy a new version of their awesome app on a hosted server. They would use various means of getting their code to the server and running the new application, ranging all the way from manually copying it to the server, stopping the running app, deploying the new one and starting it, to using sophisticated configuration management tools that would do the same thing, albeit automatically.

Although it seemed easy enough (for DevOps people experienced in the ways of deployment it was, anyway), terrible things sometimes happened. Operating systems would go without updates for years for fear of breaking the deployment, which forced antiquated packages to be used. Sometimes someone courageous would update the system, only to find out that even though everything was godly on the developer’s machine, the new system did not work. Sometimes there were several systems in place (different OSes with different versions) and deployment had to be tailored individually for each one. Developers and DevOps alike spent days trying to make these things work, and great sadness swept the land.

Along came application containers, bringing with them the promise of running the application in a contained environment. “Bring us your application,” they said, “and we will put it on a system where it won’t matter which host you run on; it will work.” This was much better, and for the most part it really did remove some of the headaches. Unfortunately, hosts were still involved, not all hosts could run the containers, and other issues, such as automatic scalability, still remained. But even then, there was much rejoicing.

Then, one day, came the serverless functions. A new model with new promises. “Why would you need a server at all?” they whispered enticingly. “Come to us, we are nothing more than functions running on the cloud. Behold! There are no hosts! No infrastructure (for you to see)! All you must do is write your function, hand it over and we will take care of the rest. We will deploy it for you, we will call it for you or schedule times for it to run, and even have it respond to cloud events if you so desire. We will even scale it for you automatically when things get rough. You will not need to deal with pesky hosts ever again!” The call went out across the land, and multitudes of companies, startups and corporations alike, went to try out the wonder of serverless functions.

Starry-eyed, we were swept along with the masses to Amazon, which offers a serverless service by the name of “AWS Lambda”. We started using it for new things, as well as moving to it some of our old stuff which we thought needed a better home. We spent a few weeks trying out this transformation, but to our dismay we soon found out that the promise was full of holes, and it wasn’t long before we went back to our old ways, waiting for something better to come along or, at least, for the technology to get better. You might be wondering how we dare defy the new prophet. Here are our reasons why:

You cannot connect to a database on a large scale

That is, you can, but the thing about the serverless model is that every function is invoked in a fresh container (that is almost true: Lambda might reuse started containers for efficiency, but you can’t, and are not supposed to, count on that). This means that beloved abstractions such as connection pools are irrelevant, and each container needs its own database connection.
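To make the container point concrete, here is a minimal sketch in Python of about the only optimization available: keep the connection in module scope so that a reused container picks it up, and open it lazily so a fresh container still works. The names `get_connection` and `handler` are made up for illustration, and `sqlite3` is only a stand-in for a real database driver.

```python
import sqlite3

# Module scope survives between invocations only when the container is reused.
_connection = None


def get_connection():
    """Return the cached connection, opening one if this container has none."""
    global _connection
    if _connection is None:
        # sqlite3 is a stand-in; a real function would call its own driver here.
        _connection = sqlite3.connect(":memory:")
    return _connection


def handler(event, context):
    # Each container holds at most one connection, reused across warm invocations.
    conn = get_connection()
    row = conn.execute("SELECT 1").fetchone()
    return {"ok": row[0] == 1}
```

Note that this caps each container at one connection but does nothing about how many containers exist at once, which is where the trouble starts.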

So what happens when you have a lot of lambdas competing for coveted database connections? Basically, you are screwed. Some databases will just return a “too many connections” error (including Amazon’s own databases, which you would expect to be able to handle the serverless paradigm, but alas, most of them cannot) and you’ll have to recover from that by yourself, programmatically, making up intricate designs that should not be there in the first place. Even if you do happen to find some nice recovery method, the bottom line is that recovery takes time (and, putting SLA aside, Amazon charges you for every 100ms your function is up) and is usually not very elegant.
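A recovery method of the kind described here might look roughly like the sketch below: retry the connection with exponential backoff and jitter. It is an illustration rather than what we ran in production; `TooManyConnections` is a hypothetical exception (real drivers raise their own types) and `connect` is whatever callable opens your connection.

```python
import random
import time


class TooManyConnections(Exception):
    """Hypothetical error; real drivers raise their own exception types."""


def connect_with_backoff(connect, attempts=5, base_delay=0.1, sleep=time.sleep):
    """Call connect() until it succeeds, sleeping longer after each rejection."""
    for attempt in range(attempts):
        try:
            return connect()
        except TooManyConnections:
            if attempt == attempts - 1:
                raise  # out of attempts: let the invocation fail
            # Exponential backoff with jitter so retries do not arrive together.
            delay = base_delay * (2 ** attempt)
            sleep(delay + random.uniform(0, delay))
```

Every second spent inside `sleep` is a second the function is up, so the waiting itself is billed.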

But are all databases like that? No: some databases, such as Amazon’s own DynamoDB, have APIs over HTTP and so are not bound by this limitation. This, however, forces you to use a database that you may not have wanted in the first place, and locks you in to a vendor as well. DynamoDB, for instance, is basically a key-value store, whereas we needed something more document oriented. Another point here is that AWS Lambda allows you to limit how many instances of an individual lambda run at once, but you cannot set a limit for a group of lambdas together. So if your DB can accept X connections and more than one lambda needs it, you can’t just say “I can’t have more than X lambdas running at a time”.
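Since the limit can only be set per function, the closest you get to a group limit is carving X up by hand. The sketch below shows that arithmetic; `split_connection_budget` and the weights are hypothetical, and applying each cap (for example through boto3’s `put_function_concurrency`) is left out.

```python
def split_connection_budget(max_connections, weights):
    """Divide a database's connection limit into per-function concurrency caps.

    weights maps function name -> relative share; every function gets at least 1.
    """
    if max_connections < len(weights):
        raise ValueError("fewer connections than functions")
    total = sum(weights.values())
    caps = {name: max(1, max_connections * w // total) for name, w in weights.items()}
    # Flooring only undershoots, but the minimum of 1 can overshoot, so trim.
    while sum(caps.values()) > max_connections:
        largest = max(caps, key=caps.get)
        caps[largest] -= 1
    return caps
```

With a limit of 100 connections and weights of 3 and 1, this gives caps of 75 and 25. The split is static: a quiet function cannot lend its unused share to a busy one, which is exactly what a group limit would have allowed.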

No straightforward API for creating scheduled lambdas or event triggered lambdas

One of the nice things about lambdas is that they can be configured to respond to events. They can respond to S3 events (such as adding a new object), or you can configure them to run on a schedule. However, it appears that the only straightforward way to configure that is through the AWS web console, which is not very practical when you deploy new units with scripts. You would expect the AWS Lambda API to have some way of configuring a function to respond to events, but no, there isn’t any. There might be a way to do this using several different APIs, but seeing as the whole point of using Amazon services is that you aren’t supposed to need an Amazon super-expert, that rather defeats the purpose. That being said, there is a way of configuring the triggers with CloudFormation, but this approach brings a lot of trouble of its own; more on that later…
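For what it is worth, the “several different APIs” route for a scheduled function looks roughly like the sketch below. It assumes the two clients are boto3 clients for CloudWatch Events (`events`) and Lambda (`lambda`); `schedule_lambda` and its parameters are made-up names, and error handling is omitted.

```python
def schedule_lambda(events_client, lambda_client, rule_name, function_name,
                    function_arn, schedule="rate(5 minutes)"):
    """Wire a schedule to a function: create the rule, permit it, attach the target."""
    # 1. CloudWatch Events: create (or update) the schedule rule.
    rule_arn = events_client.put_rule(
        Name=rule_name, ScheduleExpression=schedule, State="ENABLED"
    )["RuleArn"]
    # 2. Lambda: allow that rule to invoke the function.
    lambda_client.add_permission(
        FunctionName=function_name,
        StatementId=f"{rule_name}-invoke",
        Action="lambda:InvokeFunction",
        Principal="events.amazonaws.com",
        SourceArn=rule_arn,
    )
    # 3. CloudWatch Events: point the rule at the function.
    events_client.put_targets(
        Rule=rule_name, Targets=[{"Id": "1", "Arn": function_arn}]
    )
    return rule_arn
```

That is three calls across two services for one schedule, and you have to know about the permission step in the middle.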

Can’t use multiple S3 triggers on the same bucket

That is, you can, but only if they don’t share the same suffix or prefix. Since the dream is to trigger several lambdas independently on ANY object change, this kind of defeats the point. I would have liked to have one lambda write the object to some DB, another process it and extract some meaningful data, and a third do something fantastic with it. Alas, I cannot. A way around it would be to have one lambda triggered and have IT trigger the other lambdas, but that is still somewhat annoying (not to mention it costs more $$$).
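The workaround of one lambda triggering the others could be sketched like this. It assumes `lambda_client` is a boto3 Lambda client; `make_fanout_handler` and the downstream function names are hypothetical.

```python
import json


def make_fanout_handler(lambda_client, downstream_functions):
    """Build a handler that forwards each incoming S3 event to several functions."""
    def handler(event, context):
        payload = json.dumps(event).encode("utf-8")
        for name in downstream_functions:
            # InvocationType="Event" queues the call and returns without waiting.
            lambda_client.invoke(
                FunctionName=name, InvocationType="Event", Payload=payload
            )
        return {"forwarded_to": list(downstream_functions)}
    return handler
```

The dispatcher itself is one more function to deploy, monitor and pay for on every object change.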

Memory limits are confusing

In AWS Lambda, you have to define the memory for your lambda ahead of time. You do not want to assign too much memory, since you are paying for “memory requested/second”: if you request too much you are paying for nothing, and if you don’t request enough you will get an OutOfMemory error. But here’s the interesting thing: we tried loading Java functions that basically just print to the screen, and we couldn’t get away with configuring less than 256M of memory. Even more interesting is that when a lambda finishes, it reports the maximum memory used in the process. More often than not we’ve seen that the lambda used a lot less memory (e.g. 65M) than it forced us to request. Wait, what?
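To put numbers on “paying for nothing”, here is a small sketch of the billing arithmetic as described in this post: requested memory multiplied by run time, rounded up to 100ms units. The function name is made up, and no price is included since that depends on your region and plan.

```python
def billed_gb_seconds(configured_mb, duration_ms, billing_unit_ms=100):
    """GB-seconds billed: configured memory times duration rounded up to the unit."""
    units = -(-duration_ms // billing_unit_ms)  # ceiling division
    billed_seconds = units * billing_unit_ms / 1000
    return configured_mb / 1024 * billed_seconds
```

For an illustrative 120ms run (a made-up duration), the 256M we had to request is billed at 256 / 65, or almost four times, what the 65M actually used would have cost.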

You can only use “printf” debugging

Some people are nostalgic for that, so they might like it, but modern developers use debuggers. An issue with Lambda is that you cannot debug it with your favorite debugging tool; you need to print to the console (or CloudWatch Logs or whatever) in order to debug, and there’s no way to hot-swap your code. Not the most convenient, especially since deploying the change takes time…
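If printing is all you get, you can at least make the prints searchable. The sketch below is one possible approach: it writes one JSON line per message, tagged with the request id that the Python runtime exposes as `context.aws_request_id`. `log_event` and the field names are hypothetical.

```python
import json
import logging

logger = logging.getLogger("handler")
logger.setLevel(logging.INFO)


def log_event(context, message, **fields):
    """Emit one JSON log line tagged with the invocation's request id."""
    record = {"request_id": getattr(context, "aws_request_id", None),
              "message": message, **fields}
    line = json.dumps(record, sort_keys=True)
    logger.info(line)
    return line
```

Inside a handler you would call something like `log_event(context, "loaded object", key="a.txt")`. It is still printf debugging, only with lines you can filter by invocation.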

Deploying changes takes time

Developing? Made a typo? Have a small bug? Deploying to AWS Lambda takes time. Depending on how big your codebase is, it could take up to several minutes. No, I don’t want to break up my code into a thousand smaller modules just to please the lambda; it’s just inconvenient. And by the way, if you find yourself needing to attach a network interface to your lambda (a not uncommon case), the deployment may take as much as 40 minutes…

Can’t call internet resources without putting in a lot of work in the lambda configuration

From the Amazon documentation (https://docs.aws.amazon.com/lambda/latest/dg/vpc.html):

“If your Lambda function needs Internet access, do not attach it to a public subnet or to a private subnet without Internet access. Instead, attach it only to private subnets with Internet access through a NAT instance or an Amazon VPC NAT gateway.”

Seems easy enough? It isn’t. For a function that wants to call anything on the internet (by the way, this might include your very own EC2 resources), you need to configure your VPC to include a public subnet with a NAT gateway and a private subnet to put your function in. Don’t forget to attach an ENI to your lambda as well. Don’t know what any or some of this stuff means? You just came to put your function on the cloud and see it “just” work? Look elsewhere… On that matter, if you did manage to pull it off (really, it’s not very hard for an experienced AWS gal with a good DevOps background, just a day or two if she has never done this before), the network interface that you attached to your lambda in order for this thing to work (and which you really didn’t care about in the first place) makes your lambda take around 40 minutes to update (googling why it takes so long turned up some mumbo jumbo about AWS needing this long to detach your interface for some reason).
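The last step of that setup, attaching the function to the right subnets, could be sketched as below. It assumes `lambda_client` is a boto3 Lambda client and that the VPC, NAT gateway and route tables already exist. The helper name and the `private_with_nat` flag are made up; the caller has to supply that flag, since the sketch does not inspect route tables.

```python
def attach_to_private_subnets(lambda_client, function_name, subnets, security_group_ids):
    """Attach a function to subnets, refusing any not marked as private with NAT.

    subnets is a list of {"id": ..., "private_with_nat": bool} dicts from the caller.
    """
    unsuitable = [s["id"] for s in subnets if not s["private_with_nat"]]
    if unsuitable:
        raise ValueError(f"subnets without NAT-routed Internet access: {unsuitable}")
    return lambda_client.update_function_configuration(
        FunctionName=function_name,
        VpcConfig={
            "SubnetIds": [s["id"] for s in subnets],
            "SecurityGroupIds": list(security_group_ids),
        },
    )
```

The check only encodes the rule from the documentation quoted above; all the real work of building the network stays with you.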

Clutter

With AWS this isn’t actually just about Lambda. It appears that there’s no way to group stuff in the console. If you take into consideration that every HTTP function, every scheduled function and every triggered function you have takes up console space, and multiply that by the number of environments you have (and maybe each developer deploys lambdas too), the console becomes very cluttered very fast. It is true that you can filter lambdas by name, but that is not as convenient as just having lambdas grouped. Also, if you use CloudFormation or Serverless, this will bloat the configuration file you will be using.

A word on CloudFormation and serverless.com

If you want to use some of the features of Lambda in a more decent way (such as having your lambda triggered by events), you might want to use CloudFormation. CloudFormation is basically a way to describe the whole of your stack: which S3 buckets you want to have, which lambdas, security groups, and so on. On paper this is nice, but in practice it doesn’t work very well, since deploying or undeploying a stack very often gets stuck for various reasons, and the description format itself is quite hideous (to say the least). So along came serverless.com (kudos to these guys for the initiative), which aims to abstract away the pain of CloudFormation. While it mostly focuses on serverless deployments, it in fact creates a CloudFormation descriptor and uses that to deploy your serverless functions. While this makes deployment of serverless functions much easier, it relies on CloudFormation and as such is prone to CloudFormation’s problems. For instance, CF only lets you create new things and will reject the notion of using existing stuff (such as S3 buckets). This makes it impossible to create a new lambda function and have it triggered by objects in a bucket that your app has been writing to for ages.

Conclusion

It generally looks like if you want to use AWS Lambda you will still need a full-time DevOps team, only instead of having them manage Linux hosts (Windows?) they now have to manage your AWS Lambda infrastructure. After trying out AWS Lambda for about a month, only to find ourselves writing workaround after workaround for Lambda issues, we finally gave up and decided to drop AWS Lambda altogether. If we’re going to have a DevOps team anyway, we might as well use “traditional” Docker deployments, where we have more control and are not locked in to any cloud architecture. We decided that we will revisit AWS Lambda in a few years, but for now we won’t use it, as it just cannot deliver on its promise.

Java/DevOps Architect & Tech Lead

Backend Group