Requirement Summary:
NLP Lambda function with a large pre-trained model
The Lambda layer grew to 8.7 GB, far exceeding AWS limits
Publishing the layer fails with RequestEntityTooLargeException
Need: high performance, portability, and low initialization time
Important AWS Limits:
Deployment package size (zipped, direct upload): 50 MB
Deployment package size (unzipped), including the function and all layers combined: 250 MB
Lambda container image support allows up to 10 GB image size
Evaluate Options:
A: Store model in S3 and load during execution
Adds download latency on every cold start (and on every invocation if the model is not cached between invocations)
Pulling a multi-GB model from S3 is slow and unsuitable for real-time NLP
Not optimal for performance
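The cold-start cost of option A comes from where the download happens. A minimal sketch of the pattern (bucket/key names and the `downloader` hook are illustrative, not from the source):

```python
import os

# Module-level cache: survives warm invocations of the same execution
# environment, so the download only happens on a cold start.
_model = None

def _download_from_s3(bucket, key, dest):
    # Illustrative S3 download using boto3's real download_file API.
    import boto3
    boto3.client("s3").download_file(bucket, key, dest)

def get_model(downloader=_download_from_s3):
    """Fetch the model on the first (cold) invocation only."""
    global _model
    if _model is None:
        dest = "/tmp/model.bin"  # /tmp is Lambda's only writable path
        downloader(os.environ["MODEL_BUCKET"], os.environ["MODEL_KEY"], dest)
        _model = dest  # a real function would deserialize the model here
    return _model

def handler(event, context):
    return {"model_path": get_model()}
```

Even with this caching, every cold start still pays the full multi-GB transfer before the first request can be served.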
B: Use EFS mounted to Lambda
⚠️ Valid for large models, but adds latency during cold start as model loads from EFS
Requires EFS setup and VPC configuration, and adds network I/O overhead
Still slower than bundling in container image
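For comparison, option B replaces the S3 download with a read from an EFS access point mounted into the function. A brief sketch (the `/mnt/ml` mount path is an assumption; real functions choose it when configuring the file system):

```python
import os

# Lambda mounts EFS access points under /mnt; the exact path is configured
# on the function (assumed here).
MODEL_PATH = os.environ.get("MODEL_PATH", "/mnt/ml/model.bin")

_model = None

def load_model(path=MODEL_PATH):
    """Read the model from the EFS mount on first use, cache for warm starts.

    Reading a multi-GB file over NFS still costs network I/O on every
    cold start, which is why this remains slower than baking the model
    into a container image."""
    global _model
    if _model is None:
        with open(path, "rb") as f:
            _model = f.read()  # a real function would deserialize here
    return _model
```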
C: Split into five Lambda layers
Still violates the 250 MB (unzipped) limit, which applies to the function and all layers combined
Splitting across multiple layers does not raise that limit
D: Use Docker container image
Allows bundling up to 10 GB of dependencies and models
High portability and performance
Avoids latency of downloading models at runtime
Ideal for scientific/NLP models
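A minimal Dockerfile sketch for option D, assuming a Python function in app.py with a handler() and model weights in a local model/ directory (file names are illustrative):

```dockerfile
# AWS-provided Lambda base image for Python
FROM public.ecr.aws/lambda/python:3.12

# Install dependencies into the image
COPY requirements.txt .
RUN pip install -r requirements.txt

# Bake the model into the image -- no runtime download on cold start
COPY model/ ${LAMBDA_TASK_ROOT}/model/
COPY app.py ${LAMBDA_TASK_ROOT}/

# Handler entry point: module.function
CMD ["app.handler"]
```

The image is pushed to Amazon ECR and referenced when creating the function; because the 8.7 GB model ships inside the image, it stays under the 10 GB image limit with no per-invocation download.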
Lambda container image support: https://docs.aws.amazon.com/lambda/latest/dg/images-create.html
Lambda limits: https://docs.aws.amazon.com/lambda/latest/dg/gettingstarted-limits.html
Using large models with Lambda: https://aws.amazon.com/blogs/machine-learning/deploying-large-machine-learning-models-on-aws-lambda-with-container-images/