Boto3 + AWS Lambda: A Production Serverless Pipeline

Karandeep Singh

Jan 9, 2023 • 6 minutes

Summary

Production guide to building serverless data pipelines with Boto3 and Lambda. Based on processing high-volume daily events through an analytics pipeline. Covers cold starts, concurrent execution limits, error handling, retries, and cost optimization.

The Surprisingly High Lambda Bill That Taught Me Boto3

I once built a serverless analytics pipeline. The requirements seemed straightforward:

Process user activity events from SQS queue
Enrich events with user data from DynamoDB
Store processed events in S3 for analysis
Handle a high volume of events daily, with significant peak traffic

First month’s AWS bill: much higher than expected.

The problem wasn’t Lambda itself. The problem was how I used Boto3 in Lambda. This article documents the optimization journey that reduced costs significantly while improving reliability.

The Naive First Implementation (That Cost Way Too Much)

Here’s my initial Lambda function - textbook example but terrible for production:

import boto3
import json

# DON'T DO THIS - creates client on every invocation
def lambda_handler(event, context):
    # Clients are rebuilt on every invocation, warm or cold, adding latency each time
    s3 = boto3.client('s3')
    dynamodb = boto3.client('dynamodb')
    sqs = boto3.client('sqs')

    for record in event['Records']:
        # Parse event
        event_data = json.loads(record['body'])

        # Enrich with user data - one network round trip per event, no caching
        user_response = dynamodb.get_item(
            TableName='users',
            Key={'user_id': {'S': event_data['user_id']}}
        )

        # Process data
        processed = {
            'event': event_data,
            'user': user_response.get('Item', {})
        }

        # Write to S3 - one file per event (expensive!)
        s3.put_object(
            Bucket='analytics-raw',
            Key=f"events/{event_data['event_id']}.json",
            Body=json.dumps(processed)
        )

    return {'statusCode': 200}

What went wrong:

Client initialization inside handler: clients are rebuilt on every invocation, not just on cold starts
One S3 PUT per event: every event becomes a PUT request, ballooning PUT costs
One DynamoDB read per event: sequential network round trips with no caching, so average execution time was slow
No batch processing: each invocation handled a single event
No error handling: one bad record fails the whole invocation, and with no DLQ, messages that keep failing are eventually dropped
Memory over-provisioned: 1024MB when 256MB was sufficient

Where the costs piled up:

Lambda execution dominated the bill (long durations × huge invocation count)
S3 PUT requests added up fast (one PUT per event)
DynamoDB reads were costly (one read per event, with consistent reads)
Data transfer added a smaller but real chunk

The Optimized Implementation (Major Cost Reduction)

After several weeks of optimization, here’s the production version:

import boto3
import json
import logging
import os
import uuid
from typing import Dict, List

logger = logging.getLogger()
logger.setLevel(logging.INFO)

# Initialize clients OUTSIDE handler (reused across warm invocations)
s3 = boto3.client('s3')
dynamodb = boto3.resource('dynamodb')
users_table = dynamodb.Table('users')

# Environment variables
BUCKET_NAME = os.environ['ANALYTICS_BUCKET']
# Flush threshold. With the 10-message trigger described below, one invocation
# never buffers this many events; the trigger batch size has to be raised
# (standard queues allow up to 10,000 with a batching window) to reach it.
BATCH_SIZE = 100

def lambda_handler(event, context):
    """
    Process SQS events in batches
    Memory: 256MB (reduced from 1024MB)
    Timeout: 60s
    Batch size: 10 messages (configured in SQS trigger)
    """
    events_buffer = []
    failed_items = []

    for record in event['Records']:
        try:
            event_data = json.loads(record['body'])

            # Look up the user, served from the in-memory cache when possible
            user_data = get_user_cached(event_data['user_id'])

            events_buffer.append({
                'event': event_data,
                'user': user_data,
                # The context object exposes the ID as aws_request_id
                'request_id': context.aws_request_id
            })

            # Flush buffer when full
            if len(events_buffer) >= BATCH_SIZE:
                write_batch_to_s3(events_buffer, context.aws_request_id)
                events_buffer = []

        except Exception:
            # Report the message as failed so SQS redelivers it; it only moves
            # to the DLQ after maxReceiveCount failed receives
            logger.exception("Failed to process message %s", record['messageId'])
            failed_items.append({'itemIdentifier': record['messageId']})

    # Flush remaining events
    if events_buffer:
        write_batch_to_s3(events_buffer, context.aws_request_id)

    # Return partial batch failures
    # (only honored when the trigger enables ReportBatchItemFailures)
    return {
        'batchItemFailures': failed_items
    }

# In-memory cache (persists across warm invocations; unbounded, no expiry)
user_cache = {}

def get_user_cached(user_id: str) -> Dict:
    """Get user with Lambda execution context caching"""
    if user_id in user_cache:
        return user_cache[user_id]

    # Single-item read. ConsistentRead=False is also the default; an eventually
    # consistent read uses half the read capacity of a strongly consistent one
    response = users_table.get_item(
        Key={'user_id': user_id},
        ConsistentRead=False
    )

    user = response.get('Item', {})
    user_cache[user_id] = user  # Cache for warm invocations
    return user

def write_batch_to_s3(events: List[Dict], request_id: str):
    """Write the buffered events as a single S3 object instead of one PUT per event"""
    timestamp = events[0]['event']['timestamp']
    date = timestamp[:10]  # YYYY-MM-DD

    s3.put_object(
        Bucket=BUCKET_NAME,
        # The uuid suffix keeps several flushes in one invocation from overwriting each other
        Key=f"events/date={date}/{request_id}-{uuid.uuid4().hex}.json",
        # default=str serializes the Decimal values the DynamoDB resource returns for numbers
        Body='\n'.join(json.dumps(e, default=str) for e in events),
        ContentType='application/json'
    )

Key optimizations:

Client initialization outside handler: clients are created once per execution environment and reused by every warm invocation
Batch S3 writes: Many events per PUT (a large reduction in PUT requests)
In-memory caching: Strong cache hit rate on users
Eventual consistency for DynamoDB: eventually consistent reads use half the read capacity of strongly consistent reads
Reduced memory: 256MB (sufficient for workload)
Partial batch failure handling: only the failed messages are retried, not the whole batch
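Two of these optimizations depend on how the SQS trigger is configured, not only on handler code: Lambda honors `batchItemFailures` only when the event source mapping enables `ReportBatchItemFailures`, and the trigger's batch size caps how many events one invocation can buffer. Below is a minimal sketch of that configuration. The helper names, the queue ARN, and the function name are hypothetical, and the batch size and window are illustrative values, not the ones used in this pipeline.

```python
def build_mapping_kwargs(queue_arn: str, function_name: str,
                         batch_size: int = 100, window_seconds: int = 5) -> dict:
    """Build keyword arguments for lambda_client.create_event_source_mapping()."""
    if batch_size > 10 and window_seconds < 1:
        # For SQS, a batch size above 10 requires a batching window of at least 1 second.
        raise ValueError("batch_size > 10 requires window_seconds >= 1")
    return {
        "EventSourceArn": queue_arn,
        "FunctionName": function_name,
        "BatchSize": batch_size,
        "MaximumBatchingWindowInSeconds": window_seconds,
        # Without this, the handler's batchItemFailures response is ignored
        # and any raised error retries the whole batch.
        "FunctionResponseTypes": ["ReportBatchItemFailures"],
    }


def create_mapping(kwargs: dict) -> dict:
    """Apply the configuration; boto3 is imported lazily so the builder works offline."""
    import boto3
    return boto3.client("lambda").create_event_source_mapping(**kwargs)
```

A larger batch size with a short batching window is what lets a flush threshold such as 100 events per S3 object actually be reached within one invocation.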

Where costs dropped:

Lambda execution dropped sharply once average duration came down
S3 PUT requests fell dramatically thanks to batching
DynamoDB reads got cheaper with eventual consistency and caching
A small DLQ storage line item appeared, but the overall bill saw a large reduction
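The handler shown earlier still reads users one at a time on a cache miss. If that becomes the bottleneck, DynamoDB's BatchGetItem can fetch up to 100 keys per request. The sketch below is one way to do it, assuming the same `users` table keyed on `user_id`; the `batch_get_users` helper is hypothetical and not part of the original pipeline.

```python
from typing import Dict, Iterable, List


def batch_get_users(dynamodb_resource, user_ids: Iterable[str],
                    table_name: str = "users") -> Dict[str, dict]:
    """Fetch many users via BatchGetItem, at most 100 keys per request."""
    unique_ids: List[str] = list(dict.fromkeys(user_ids))  # dedupe, keep order
    users: Dict[str, dict] = {}
    for start in range(0, len(unique_ids), 100):
        chunk = unique_ids[start:start + 100]
        request = {table_name: {"Keys": [{"user_id": uid} for uid in chunk]}}
        while request:
            response = dynamodb_resource.batch_get_item(RequestItems=request)
            for item in response.get("Responses", {}).get(table_name, []):
                users[item["user_id"]] = item
            # Throttled keys come back in UnprocessedKeys and must be requested
            # again; a production version would back off between attempts.
            request = response.get("UnprocessedKeys") or {}
    return users
```

Called as `batch_get_users(boto3.resource('dynamodb'), ids)`, this returns a dict keyed by user ID. IDs missing from the table are simply absent from the result, so callers should use `.get()`.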

Production Lessons from Running This at Scale

Lesson 1: Cold Starts Matter

Initial cold starts were painfully slow with client initialization inside the handler.

Optimizations that worked:

Initialize Boto3 clients outside handler: noticeably faster cold starts
Use Lambda layers for dependencies: further cold start improvement
Minimize deployment package: another nudge faster
Provisioned concurrency for critical paths: eliminated cold starts for traffic within the provisioned capacity

End result: substantially faster cold starts.
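To check whether changes like these help, it is useful to know which invocations were cold. One common pattern is a module-level flag that is true only for the first invocation in each execution environment. This is a minimal sketch with a placeholder handler body; the log format is an assumption, not what this pipeline emitted.

```python
import time

# Module scope runs once per execution environment, so this flag is True
# only until the first invocation completes its check.
_COLD_START = True


def lambda_handler(event, context):
    global _COLD_START
    started = time.perf_counter()
    was_cold, _COLD_START = _COLD_START, False

    # ... real work would go here ...

    handler_ms = (time.perf_counter() - started) * 1000
    # Handler time excludes the init phase; Lambda reports that separately as
    # "Init Duration" on the REPORT log line of a cold invocation.
    print(f"cold_start={was_cold} handler_ms={handler_ms:.1f}")
    return {"cold_start": was_cold}
```

Filtering logs on `cold_start=True` then gives the share of cold invocations and how their latency compares with warm ones.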

Lesson 2: Concurrent Execution Limits Will Hit You

We hit AWS account concurrency limits during a traffic spike. Our queue backed up significantly.

The fix:

Requested a concurrency limit increase
Implemented exponential backoff in producers
Added CloudWatch alarms for queue depth thresholds
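The producer-side backoff can be sketched as follows. This is an illustration using capped exponential backoff with full jitter, not the exact producer code; `send` stands for any callable that publishes one message, and the base delay, cap, and attempt count are assumed values.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def backoff_delay(attempt: int, base: float = 0.2, cap: float = 10.0) -> float:
    """Full jitter: a random delay between 0 and min(cap, base * 2**attempt)."""
    return random.uniform(0, min(cap, base * (2 ** attempt)))


def send_with_backoff(send: Callable[[], T], max_attempts: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Call send(), retrying failures with increasing randomized delays."""
    for attempt in range(max_attempts):
        try:
            return send()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(backoff_delay(attempt))
    raise RuntimeError("max_attempts must be at least 1")
```

The jitter matters as much as the exponent: without it, producers that failed together retry together and hit the same limit again.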

Lesson 3: DLQ Configuration is Not Optional

Early on, we lost events due to unhandled errors before implementing DLQ.

Proper error handling:

Configure SQS DLQ with multi-day retention
Set a sensible maxReceiveCount (retry failed messages a few times)
Monitor DLQ depth daily
Weekly review of DLQ messages to identify systematic issues
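The first two items map onto SQS queue attributes. The sketch below builds them as plain dictionaries; the helper names and default values are assumptions for illustration. The default of 5 for `maxReceiveCount` follows the minimum that AWS's Lambda documentation suggests for SQS triggers, as far as I know, so treat it as a starting point.

```python
import json


def build_redrive_attributes(dlq_arn: str, max_receive_count: int = 5) -> dict:
    """Attributes for the SOURCE queue: where failed messages go, and after how many receives."""
    if max_receive_count < 1:
        raise ValueError("max_receive_count must be at least 1")
    return {
        "RedrivePolicy": json.dumps({
            "deadLetterTargetArn": dlq_arn,
            "maxReceiveCount": max_receive_count,
        })
    }


def build_retention_attributes(days: int = 14) -> dict:
    """Attributes for the DLQ itself: SQS takes retention in seconds, up to 14 days."""
    if not 1 <= days <= 14:
        raise ValueError("days must be between 1 and 14")
    return {"MessageRetentionPeriod": str(days * 86400)}
```

Each dictionary is passed as `Attributes` to `sqs.set_queue_attributes(QueueUrl=..., Attributes=...)`, the redrive policy on the source queue and the retention period on the DLQ.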

Lesson 4: Memory vs Duration is a Trade-off

After testing several memory configurations, 256MB ended up being the sweet spot for this workload. Lambda bills duration multiplied by configured memory, and CPU scales with memory, so doubling from 128MB to 256MB roughly halved duration at similar cost. Going higher cost more without a proportional speedup.
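The arithmetic behind that trade-off is easy to check. The durations below are hypothetical round numbers chosen to show the shape of the comparison, not measurements from this pipeline, and the price constant is an assumed x86 on-demand rate that should be checked against current pricing for your region.

```python
# Assumed on-demand price per GB-second (x86); verify for your region.
PRICE_PER_GB_SECOND = 0.0000166667


def compute_cost(memory_mb: int, avg_duration_ms: float, invocations: int) -> float:
    """Compute charge only: GB-seconds consumed times the per-GB-second price."""
    gb_seconds = (memory_mb / 1024) * (avg_duration_ms / 1000) * invocations
    return gb_seconds * PRICE_PER_GB_SECOND


if __name__ == "__main__":
    # Hypothetical durations for one million invocations at each setting.
    for memory_mb, duration_ms in [(128, 400), (256, 200), (1024, 150)]:
        cost = compute_cost(memory_mb, duration_ms, 1_000_000)
        print(f"{memory_mb}MB at {duration_ms}ms: ${cost:.2f}")
```

When doubling memory halves duration, the GB-seconds and therefore the cost stay the same while latency improves. Once duration stops shrinking in proportion, each extra megabyte is pure cost.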

Lesson 5: Boto3 Retries Need Configuration

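The default Boto3 configuration caused timeout issues during AWS service hiccups: connect and read timeouts default to 60 seconds each, so a single stalled call could consume the function's entire 60s timeout before any retry happened.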

Custom retry configuration:

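import boto3
from botocore.config import Config

retry_config = Config(
    retries={
        'max_attempts': 3,
        # adaptive = standard mode's exponential backoff plus client-side rate limiting
        'mode': 'adaptive'
    },
    connect_timeout=5,  # seconds allowed to establish a connection
    read_timeout=10     # seconds allowed to wait for a response
)

s3 = boto3.client('s3', config=retry_config)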

This dramatically reduced the rate of timeout errors across invocations.

Cost Optimization Checklist

From expensive mistakes:

Initialize Boto3 clients outside handler function
Batch operations (S3 PUTs, DynamoDB batch operations)
Use eventual consistency for DynamoDB when possible
Right-size memory allocation (test different configs)
Implement caching for frequently accessed data
Configure proper timeouts to avoid runaway executions
Use Compute Savings Plans (which cover Lambda) or DynamoDB reserved capacity for predictable workloads (notable discount)
Enable compression for S3 objects
Clean up old DLQ messages
Monitor costs daily in first month
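For the compression item, newline-delimited JSON batches compress well because field names repeat on every line. This is a minimal sketch using only the standard library; the function names are hypothetical and the pipeline above does not include this step.

```python
import gzip
import json
from typing import Dict, List


def encode_batch(events: List[Dict]) -> bytes:
    """Serialize events as newline-delimited JSON, then gzip the result."""
    ndjson = "\n".join(json.dumps(e, default=str) for e in events)
    return gzip.compress(ndjson.encode("utf-8"))


def decode_batch(body: bytes) -> List[Dict]:
    """Inverse of encode_batch, for readers of the stored objects."""
    text = gzip.decompress(body).decode("utf-8")
    return [json.loads(line) for line in text.splitlines()]
```

The encoded bytes can be passed as `Body` to `s3.put_object`, with a `.json.gz` key and `ContentEncoding='gzip'`, so that downstream tools recognize the format.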

When NOT to Use Lambda + Boto3

After building many serverless pipelines, Lambda + Boto3 is NOT appropriate for:

Long-running tasks (>15 minutes) - Use Fargate or EC2
Large file processing (>10GB) - Lambda caps both memory and ephemeral /tmp storage at 10GB
Consistent sub-10ms latency requirements - Cold starts are unpredictable
High-frequency, steady-state workloads - EC2 is cheaper
Complex dependencies - Zip deployment packages are limited to 250MB unzipped, layers included (container images allow up to 10GB)

Lambda + Boto3 excels at:

Event-driven architectures
Intermittent workloads
Rapid scaling requirements (idle to many concurrent invocations in seconds)
Variable traffic patterns

Question

What's been your biggest challenge with serverless data pipelines? Cold starts? Cost optimization? Error handling?
