
Automation · 2025 · Pass Gallery

Pass Gallery Migration

500k+ images per client migrated in under 24 hours

Role: Backend Developer

Node.js · AWS ECS · SQS · Lambda · MongoDB · Puppeteer · Datadog

24 hrs

Migration Time

Down from 3 to 5 days

85%

Speed Improvement

Faster processing

40%

Cost Reduction

Infrastructure savings

500k+

Images / Client

Handled per run

The Challenge

  • Migrating millions of images from PicTime, Pixieset, ShootProof and Zenfolio while preserving structure and metadata
  • Multi-platform integration using APIs and Puppeteer across different provider formats
  • Running parallel scraping and upload processes across multiple virtual machines simultaneously
  • Designing fail-safe resumable pipelines that recover from failures without restarting the entire job

Process

  • 01 Designed a step-based, decoupled pipeline that processes galleries, sets and photos in stages
  • 02 Built a distributed scraping engine using Puppeteer and provider APIs, with 10+ parallel workers per VM
  • 03 Used a dual-queue AWS SQS setup, ECS task workers and Lambda triggers for event-driven processing
  • 04 Synced all metadata and job states to MongoDB for full traceability and resumability
  • 05 Created empty gallery structures in Pass Gallery in parallel with scraping to shorten overall migration time
  • 06 Built retry-friendly error handling that resumes from the last successful state after any interruption
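
The staged, resumable flow described in the steps above can be sketched roughly as follows. This is a minimal illustration, not the production code: the stage names, the `JobStateStore` class and the `runJob` function are hypothetical, and an in-memory `Map` stands in for the MongoDB job-state collection.

```javascript
// Hypothetical stage names, following the galleries -> sets -> photos order.
const STAGES = ["galleries", "sets", "photos"];

// In-memory stand-in for a MongoDB job-state collection (assumption for this sketch).
class JobStateStore {
  constructor() {
    this.docs = new Map();
  }
  load(jobId) {
    return this.docs.get(jobId) ?? { jobId, completed: [] };
  }
  markCompleted(jobId, stage) {
    const state = this.load(jobId);
    if (!state.completed.includes(stage)) state.completed.push(stage);
    this.docs.set(jobId, state);
  }
}

// Runs each stage at most once per job. A rerun after a failure skips
// every stage already recorded as completed.
function runJob(jobId, handlers, store) {
  const executed = [];
  for (const stage of STAGES) {
    if (store.load(jobId).completed.includes(stage)) continue;
    handlers[stage](jobId);            // may throw; nothing is recorded for this stage
    store.markCompleted(jobId, stage); // checkpoint only after success
    executed.push(stage);
  }
  return executed;
}

// Usage: the "sets" stage fails once, then the job is resumed.
const store = new JobStateStore();
let failOnce = true;
const handlers = {
  galleries: () => {},
  sets: () => {
    if (failOnce) {
      failOnce = false;
      throw new Error("provider timeout");
    }
  },
  photos: () => {},
};

try {
  runJob("client-42", handlers, store);
} catch (err) {
  // The first attempt stops at "sets"; "galleries" is already checkpointed.
}
const resumed = runJob("client-42", handlers, store);
console.log(resumed); // [ 'sets', 'photos' ]
```

Checkpointing only after a stage succeeds means a crash mid-stage repeats that stage, so each handler would need to be safe to run twice.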

The Solution

  • End-to-end automation covering the complete gallery migration lifecycle, from scraping to upload
  • Processed 500k+ images per client across distributed infrastructure with 99.9% accuracy
  • AWS-powered resilience using ECS, SQS and Lambda for fault-tolerant event-driven processing
  • Reduced total migration time per client from 3-5 days to under 24 hours
  • 40% reduction in operational costs, with near-zero manual involvement

Architecture

1

SQS-Driven Job Queue

Images are batched and enqueued into SQS with visibility timeouts tuned for worst-case processing. Dead-letter queues capture failures for inspection without losing work.
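
As a rough sketch of what such a queue configuration might look like, the helper below builds an SQS attribute map with a visibility timeout derived from a worst-case processing time, plus a redrive policy pointing at a dead-letter queue. `VisibilityTimeout` and `RedrivePolicy` are real SQS queue attributes. The function name, the safety factor, the ARN and all numeric values are assumptions for illustration only.

```javascript
// Builds the attribute map for an SQS work queue.
// SQS expects attribute values as strings; RedrivePolicy is a JSON string.
function buildQueueAttributes({ worstCaseSeconds, safetyFactor, dlqArn, maxReceiveCount }) {
  const SQS_MAX_VISIBILITY_SECONDS = 43200; // 12 hours, the SQS upper limit
  const visibility = Math.min(
    Math.ceil(worstCaseSeconds * safetyFactor),
    SQS_MAX_VISIBILITY_SECONDS
  );
  return {
    VisibilityTimeout: String(visibility),
    RedrivePolicy: JSON.stringify({ deadLetterTargetArn: dlqArn, maxReceiveCount }),
  };
}

const attrs = buildQueueAttributes({
  worstCaseSeconds: 600, // hypothetical: slowest batch observed
  safetyFactor: 1.5,     // hypothetical headroom
  dlqArn: "arn:aws:sqs:us-east-1:123456789012:migration-dlq", // placeholder ARN
  maxReceiveCount: 5,    // hypothetical: receives before a message is dead-lettered
});
console.log(attrs.VisibilityTimeout); // 900
```

If the timeout is shorter than the slowest batch, a message reappears while a worker is still processing it and the batch runs twice, which is why the timeout is sized from the worst case and not the average.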

2

ECS Worker Pool

Containerised Node.js workers on ECS consume from SQS concurrently. Auto-scaling triggers on queue depth, scaling from 2 to 20 workers based on backlog size.
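
The scaling rule can be illustrated with a small function that maps queue depth to a worker count clamped to the 2 to 20 range. The actual scaling mechanism is not described here; in practice it could be ECS Service Auto Scaling driven by a queue-depth metric such as `ApproximateNumberOfMessagesVisible`. The `messagesPerWorker` parameter and its value are hypothetical.

```javascript
const MIN_WORKERS = 2;  // lower bound stated in the text
const MAX_WORKERS = 20; // upper bound stated in the text

// messagesPerWorker is a hypothetical tuning value:
// the backlog one worker is expected to absorb.
function desiredWorkerCount(queueDepth, messagesPerWorker) {
  const needed = Math.ceil(queueDepth / messagesPerWorker);
  return Math.min(MAX_WORKERS, Math.max(MIN_WORKERS, needed));
}

console.log(desiredWorkerCount(0, 500));     // 2
console.log(desiredWorkerCount(4200, 500));  // 9
console.log(desiredWorkerCount(50000, 500)); // 20
```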

3

Observability with Datadog

Custom metrics track images/second, error rate per batch, and per-client progress. Dashboards give ops teams real-time visibility without needing to SSH into containers.
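
One common way to emit custom metrics like these is the DogStatsD datagram format, which is `name:value|type|#tags`. The sketch below only formats a datagram. The metric name, tag keys and the `formatMetric` helper are assumptions, not the names used in the project.

```javascript
// Formats one DogStatsD datagram: <name>:<value>|<type>|#<tag>,<tag>
// type is a DogStatsD type code, e.g. "c" (count) or "g" (gauge).
function formatMetric(name, value, type, tags = {}) {
  const tagList = Object.entries(tags).map(([key, val]) => `${key}:${val}`);
  const suffix = tagList.length > 0 ? `|#${tagList.join(",")}` : "";
  return `${name}:${value}|${type}${suffix}`;
}

const line = formatMetric("migration.images.processed", 250, "c", {
  client: "acme",       // hypothetical tag for per-client progress
  provider: "pixieset", // hypothetical tag for the source platform
});
console.log(line); // migration.images.processed:250|c|#client:acme,provider:pixieset
```

A worker would typically send such a line over UDP to a local Datadog Agent, which listens on port 8125 by default. Rates such as images per second can then be derived from the counter on the dashboard side.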

Takeaways

  • Automated gallery migration for multiple clients, moving millions of images
  • Delivered a highly scalable distributed system with AWS orchestration and MongoDB state management
  • Enabled Pass Gallery to onboard clients faster with reliable, low-downtime migrations