Boto3 + AWS Lambda: A Production Serverless Pipeline


The Surprisingly High Lambda Bill That Taught Me Boto3
I once built a serverless analytics pipeline. The requirements seemed straightforward:
- Process user activity events from SQS queue
- Enrich events with user data from DynamoDB
- Store processed events in S3 for analysis
- Handle a high volume of events daily, with significant peak traffic
First month’s AWS bill: much higher than expected.
The problem wasn’t Lambda itself. The problem was how I used Boto3 in Lambda. This article documents the optimization journey that reduced costs significantly while improving reliability.
The Naive First Implementation (That Cost Way Too Much)
Here’s my initial Lambda function - a textbook example, but terrible for production:
import boto3
import json

# DON'T DO THIS - creates clients on every invocation
def lambda_handler(event, context):
    # Cold start penalty - initializing clients inside the handler
    s3 = boto3.client('s3')
    dynamodb = boto3.client('dynamodb')
    sqs = boto3.client('sqs')

    for record in event['Records']:
        # Parse event
        event_data = json.loads(record['body'])

        # Enrich with user data - one synchronous call per event (slow!)
        user_response = dynamodb.get_item(
            TableName='users',
            Key={'user_id': {'S': event_data['user_id']}}
        )

        # Process data
        processed = {
            'event': event_data,
            'user': user_response.get('Item', {})
        }

        # Write to S3 - one file per event (expensive!)
        s3.put_object(
            Bucket='analytics-raw',
            Key=f"events/{event_data['event_id']}.json",
            Body=json.dumps(processed)
        )

    return {'statusCode': 200}
What went wrong:
- Client initialization inside handler: noticeable cold start overhead per invocation
- One S3 PUT per event: every event becomes a PUT request, ballooning PUT costs
- Synchronous DynamoDB calls: slow average execution time
- No batch processing: each Lambda invocation handled a single event
- No error handling: failed events were lost forever
- Memory over-provisioned: 1024MB when 256MB would have sufficed
Where the costs piled up:
- Lambda execution dominated the bill (long durations × huge invocation count)
- S3 PUT requests added up fast (one PUT per event)
- DynamoDB reads were costly (one read per event, with consistent reads)
- Data transfer added a smaller but real chunk
The Optimized Implementation (Major Cost Reduction)
After several weeks of optimization, here’s the production version:
import boto3
import json
import os
from typing import Dict, List

# Initialize clients OUTSIDE the handler (reused across warm invocations)
s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
users_table = dynamodb.Table('users')

# Environment variables
BUCKET_NAME = os.environ['ANALYTICS_BUCKET']
BATCH_SIZE = 100

# In-memory cache (persists across warm invocations)
user_cache = {}

def lambda_handler(event, context):
    """
    Process SQS events in batches.

    Memory: 256MB (reduced from 1024MB)
    Timeout: 60s
    Batch size: 10 messages (configured in the SQS trigger)
    """
    events_buffer = []
    failed_items = []

    for record in event['Records']:
        try:
            event_data = json.loads(record['body'])

            # Cached lookup - avoids a DynamoDB read for repeat users
            user_data = get_user_cached(event_data['user_id'])

            events_buffer.append({
                'event': event_data,
                'user': user_data,
                'request_id': context.aws_request_id
            })

            # Flush buffer when full
            if len(events_buffer) >= BATCH_SIZE:
                write_batch_to_s3(events_buffer, context.aws_request_id)
                events_buffer = []
        except Exception as exc:
            # Log and report the failure so SQS retries it (and eventually DLQs it)
            print(f"Failed to process {record['messageId']}: {exc}")
            failed_items.append({'itemIdentifier': record['messageId']})

    # Flush remaining events
    if events_buffer:
        write_batch_to_s3(events_buffer, context.aws_request_id)

    # Return partial batch failures
    return {
        'batchItemFailures': failed_items
    }

def get_user_cached(user_id: str) -> Dict:
    """Get user with Lambda execution-context caching."""
    if user_id in user_cache:
        return user_cache[user_id]

    # Eventually consistent read - half the cost of a strongly consistent one
    response = users_table.get_item(
        Key={'user_id': user_id},
        ConsistentRead=False
    )
    user = response.get('Item', {})
    user_cache[user_id] = user  # Cache for warm invocations
    return user

def write_batch_to_s3(events: List[Dict], request_id: str):
    """Write a buffer of events as one S3 object instead of one PUT each."""
    timestamp = events[0]['event']['timestamp']
    date = timestamp[:10]  # YYYY-MM-DD

    s3.put_object(
        Bucket=BUCKET_NAME,
        Key=f"events/date={date}/{request_id}.json",
        Body='\n'.join(json.dumps(e) for e in events),
        ContentType='application/json'
    )
Key optimizations:
- Client initialization outside handler: Eliminated repeated cold start overhead
- Batch S3 writes: Many events per PUT (a large reduction in PUT requests)
- In-memory caching: Strong cache hit rate on users
- Eventual consistency for DynamoDB: meaningful cost reduction
- Reduced memory: 256MB (sufficient for workload)
- Partial batch failure handling: Failed events automatically retry (requires ReportBatchItemFailures; see the sketch below)
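One wiring detail that’s easy to miss: returning batchItemFailures only takes effect if the SQS event source mapping has ReportBatchItemFailures enabled. A minimal sketch with Boto3 - the mapping UUID is a placeholder; look yours up with list_event_source_mappings:

import boto3

lambda_client = boto3.client('lambda')

# Enable partial batch responses on an existing SQS event source mapping.
# The UUID is a placeholder - find it via lambda_client.list_event_source_mappings().
lambda_client.update_event_source_mapping(
    UUID='your-mapping-uuid',
    FunctionResponseTypes=['ReportBatchItemFailures'],
    BatchSize=10  # matches the batch size documented in the handler
)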
Where costs dropped:
- Lambda execution dropped sharply once average duration came down
- S3 PUT requests fell dramatically thanks to batching
- DynamoDB reads got cheaper with eventual consistency and caching
- A small DLQ storage line item appeared, but the overall bill saw a large reduction
Production Lessons from Running This at Scale
Lesson 1: Cold Starts Matter
Initial cold starts were painfully slow with client initialization inside the handler.
Optimizations that worked:
- Initialize Boto3 clients outside handler: noticeably faster cold starts
- Use Lambda layers for dependencies: further cold start improvement
- Minimize deployment package: another nudge faster
- Provisioned concurrency for critical paths: eliminated cold starts entirely (see the sketch below)
End result: substantially faster cold starts.
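For the provisioned concurrency piece, the Boto3 call looks roughly like this. The function name and alias are placeholders, and note that provisioned concurrency applies only to a published version or alias, never $LATEST:

import boto3

lambda_client = boto3.client('lambda')

# Keep 10 execution environments warm for the 'prod' alias.
# Function name and alias are placeholders for your own setup.
lambda_client.put_provisioned_concurrency_config(
    FunctionName='analytics-pipeline',
    Qualifier='prod',  # must be a version or alias, not $LATEST
    ProvisionedConcurrentExecutions=10
)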
Lesson 2: Concurrent Execution Limits Will Hit You
We hit AWS account concurrency limits during a traffic spike. Our queue backed up significantly.
The fix:
- Requested a concurrency limit increase
- Implemented exponential backoff in producers (sketched after this list)
- Added CloudWatch alarms for queue depth thresholds
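The producer-side backoff looked roughly like this. Treat it as a sketch: the throttling error codes worth matching vary by service, so verify them against what your producers actually see:

import random
import time

import boto3
from botocore.exceptions import ClientError

sqs = boto3.client('sqs')

def send_with_backoff(queue_url: str, body: str, max_attempts: int = 5):
    """Send a message, backing off exponentially on throttling errors."""
    for attempt in range(max_attempts):
        try:
            return sqs.send_message(QueueUrl=queue_url, MessageBody=body)
        except ClientError as exc:
            code = exc.response['Error']['Code']
            # Error codes are service-dependent; these two cover common throttling cases
            if code not in ('ThrottlingException', 'RequestThrottled'):
                raise
            # Full jitter: sleep between 0 and (2^attempt * 100ms)
            time.sleep(random.uniform(0, (2 ** attempt) * 0.1))
    raise RuntimeError(f'Giving up after {max_attempts} attempts')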
Lesson 3: DLQ Configuration is Not Optional
Early on, we lost events due to unhandled errors before implementing DLQ.
Proper error handling:
- Configure SQS DLQ with multi-day retention (see the sketch after this list)
- Set a sensible maxReceiveCount (retry failed messages a few times)
- Monitor DLQ depth daily
- Weekly review of DLQ messages to identify systematic issues
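Here’s a sketch of wiring up the DLQ with Boto3. The queue names and URL are placeholders; tune maxReceiveCount to however many retries make sense for your workload:

import json

import boto3

sqs = boto3.client('sqs')

# Create the DLQ with 14-day retention (the SQS maximum)
dlq = sqs.create_queue(
    QueueName='analytics-events-dlq',
    Attributes={'MessageRetentionPeriod': '1209600'}  # 14 days, in seconds
)
dlq_arn = sqs.get_queue_attributes(
    QueueUrl=dlq['QueueUrl'],
    AttributeNames=['QueueArn']
)['Attributes']['QueueArn']

# Point the main queue at the DLQ after 5 failed receives
sqs.set_queue_attributes(
    QueueUrl='https://sqs.us-east-1.amazonaws.com/123456789012/analytics-events',
    Attributes={
        'RedrivePolicy': json.dumps({
            'deadLetterTargetArn': dlq_arn,
            'maxReceiveCount': '5'
        })
    }
)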
Lesson 4: Memory vs Duration is a Trade-off
After testing several memory configurations, doubling memory from 128MB to 256MB roughly halved duration at similar cost — so 256MB ended up being the sweet spot for this workload. Going higher cost more without proportional speedup.
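A quick back-of-the-envelope check makes the trade-off concrete. Lambda bills GB-seconds (memory × duration), so if doubling memory halves duration, cost stays flat and you pocket the latency win. The durations below are illustrative, not measurements from this pipeline:

MS_TO_S = 1 / 1000
MB_TO_GB = 1 / 1024

# Illustrative durations for the same workload at different memory settings
configs = [
    (128, 2000),  # 128MB runs ~2000ms
    (256, 1000),  # 256MB runs ~1000ms - same GB-seconds, half the latency
    (512, 800),   # 512MB only drops to ~800ms - more GB-seconds, diminishing returns
]

for memory_mb, duration_ms in configs:
    gb_seconds = memory_mb * MB_TO_GB * duration_ms * MS_TO_S
    print(f'{memory_mb}MB x {duration_ms}ms = {gb_seconds:.3f} GB-s per invocation')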
Lesson 5: Boto3 Retries Need Configuration
Default Boto3 retry config caused timeout issues during AWS service hiccups.
Custom retry configuration:
from botocore.config import Config

retry_config = Config(
    retries={
        'max_attempts': 3,
        'mode': 'adaptive'  # exponential backoff plus client-side rate limiting
    },
    connect_timeout=5,
    read_timeout=10
)
s3 = boto3.client('s3', config=retry_config)
This dramatically reduced the rate of timeout errors across invocations.
Cost Optimization Checklist
From expensive mistakes:
- Initialize Boto3 clients outside handler function
- Batch operations (S3 PUTs, DynamoDB batch operations)
- Use eventual consistency for DynamoDB when possible
- Right-size memory allocation (test different configs)
- Implement caching for frequently accessed data
- Configure proper timeouts to avoid runaway executions
- Use reserved capacity for predictable workloads (notable discount)
- Enable compression for S3 objects (sketched below)
- Clean up old DLQ messages
- Monitor costs daily in first month
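On the compression item: since the pipeline already writes newline-delimited JSON batches, gzipping the body before the PUT is a small change. A sketch reusing the batching idea from earlier - write_compressed_batch is a hypothetical variant, not the function shown above:

import gzip
import json
from typing import Dict, List

def write_compressed_batch(s3, bucket: str, key: str, events: List[Dict]):
    """Hypothetical variant of write_batch_to_s3 that gzips the payload before the PUT."""
    body = '\n'.join(json.dumps(e) for e in events).encode('utf-8')
    s3.put_object(
        Bucket=bucket,
        Key=key + '.gz',
        Body=gzip.compress(body),  # newline-delimited JSON usually compresses well
        ContentType='application/json',
        ContentEncoding='gzip'
    )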
When NOT to Use Lambda + Boto3
After building many serverless pipelines, I’ve learned Lambda + Boto3 is NOT appropriate for:
- Long-running tasks (>15 minutes) - Use Fargate or EC2
- Large file processing (>10GB) - Lambda has 10GB storage limit
- Consistent sub-10ms latency requirements - Cold starts are unpredictable
- High-frequency, steady-state workloads - EC2 is cheaper
- Complex dependencies - Deployment packages >250MB don’t work well
Lambda + Boto3 excels at:
- Event-driven architectures
- Intermittent workloads
- Rapid scaling requirements (idle to many concurrent invocations in seconds)
- Variable traffic patterns
What's been your biggest challenge with serverless data pipelines? Cold starts? Cost optimization? Error handling?