The workload is containerized, runs for hours, and is triggered by nightly data arriving in Amazon S3. The current architecture uses EC2 instances and cron jobs, which creates operational overhead (managing instances, patching, scaling, scheduling) and leaves compute idle between processing windows.
A key constraint is that the processing tasks can take hours. AWS Lambda caps a single invocation at 15 minutes (900 seconds), which makes it unsuitable for multi-hour batch processing. Lambda can run container images, but each invocation must still complete within that limit. Lambda layers are .zip archives for sharing libraries and runtime dependencies, so using them to package long-running container workloads is not an appropriate pattern and only adds complexity.
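The runtime constraint can be checked with simple arithmetic. The sketch below assumes Lambda's maximum configurable timeout of 900 seconds; the function name and the sample durations are hypothetical.

```python
# Lambda's maximum configurable timeout: 15 minutes.
LAMBDA_MAX_TIMEOUT_SECONDS = 15 * 60


def fits_in_lambda(expected_runtime_seconds):
    """Return True if a job of this length could finish in one Lambda invocation."""
    return expected_runtime_seconds <= LAMBDA_MAX_TIMEOUT_SECONDS


# Hypothetical job lengths: a 10-minute job and a 3-hour job.
short_job_ok = fits_in_lambda(10 * 60)
nightly_job_ok = fits_in_lambda(3 * 60 * 60)
print(short_job_ok, nightly_job_ok)
```

A three-hour job exceeds the limit by a factor of twelve, so it cannot run as a single invocation.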
A modern, low-ops approach for long-running, containerized batch jobs is to run containers on AWS Fargate. Fargate removes the need to manage EC2 instances and allows tasks to run for extended periods as needed, scaling based on demand. Because the workload is composed of several data-processing services that likely need orchestration (for example, fan-out, sequencing, retries, parallelism), AWS Step Functions is well suited to coordinate the workflow and invoke the appropriate ECS tasks.
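The orchestration described above could be sketched as an Amazon States Language definition, built here as a Python dictionary. This is a minimal sketch, not the workload's actual workflow: the state names, ARNs, subnet ID, timeout, and two-step sequence are all hypothetical. It uses the `ecs:runTask.sync` service integration, which makes Step Functions wait for the Fargate task to stop before moving to the next state.

```python
import json

# Hypothetical identifiers for illustration only.
CLUSTER_ARN = "arn:aws:ecs:us-east-1:123456789012:cluster/nightly-batch"
SUBNET_IDS = ["subnet-0123456789abcdef0"]
TASK_DEF_PREFIX = "arn:aws:ecs:us-east-1:123456789012:task-definition/"


def fargate_task_state(task_definition, next_state=None):
    """Build a Task state that runs one ECS task on Fargate and waits for it."""
    state = {
        "Type": "Task",
        # ".sync" is the run-a-job pattern: the state ends when the task stops.
        "Resource": "arn:aws:states:::ecs:runTask.sync",
        "Parameters": {
            "LaunchType": "FARGATE",
            "Cluster": CLUSTER_ARN,
            "TaskDefinition": task_definition,
            "NetworkConfiguration": {
                "AwsvpcConfiguration": {
                    "Subnets": SUBNET_IDS,
                    "AssignPublicIp": "DISABLED",
                }
            },
        },
        # Assumed upper bound of six hours for a single processing step.
        "TimeoutSeconds": 6 * 60 * 60,
        # Retry a failed task twice, waiting 60 s and then 120 s.
        "Retry": [
            {
                "ErrorEquals": ["States.TaskFailed"],
                "IntervalSeconds": 60,
                "MaxAttempts": 2,
                "BackoffRate": 2.0,
            }
        ],
    }
    if next_state is not None:
        state["Next"] = next_state
    else:
        state["End"] = True
    return state


# Two hypothetical processing services run in sequence.
definition = {
    "Comment": "Nightly batch pipeline (illustrative sketch)",
    "StartAt": "Transform",
    "States": {
        "Transform": fargate_task_state(
            TASK_DEF_PREFIX + "transform:1", next_state="Aggregate"
        ),
        "Aggregate": fargate_task_state(TASK_DEF_PREFIX + "aggregate:1"),
    },
}

print(json.dumps(definition, indent=2))
```

Fan-out or parallel steps would use `Map` or `Parallel` states in place of the simple sequence shown here.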
For triggering based on new S3 data, Amazon EventBridge provides a managed, scalable event bus for AWS service events, including S3 object events, and can route events to targets such as Step Functions state machines. Using EventBridge reduces the need for direct point-to-point notification wiring and provides centralized event routing, filtering, and monitoring.
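To make the filtering concrete, the sketch below shows an EventBridge event pattern for S3 "Object Created" events, along with a deliberately simplified matcher that illustrates how such a pattern selects events. The bucket name, key prefix, and sample events are hypothetical, and the matcher handles only exact values and `prefix` filters; it is not EventBridge's actual matching engine. It also assumes the bucket has delivery to EventBridge turned on.

```python
# Hypothetical bucket name and key prefix.
EVENT_PATTERN = {
    "source": ["aws.s3"],
    "detail-type": ["Object Created"],
    "detail": {
        "bucket": {"name": ["nightly-data-bucket"]},
        "object": {"key": [{"prefix": "incoming/"}]},
    },
}


def _value_matches(candidate, actual):
    """Match one allowed value: either a literal or a {"prefix": ...} filter."""
    if isinstance(candidate, dict) and "prefix" in candidate:
        return isinstance(actual, str) and actual.startswith(candidate["prefix"])
    return candidate == actual


def matches(pattern, event):
    """Simplified matcher: every pattern field must match the event."""
    for key, expected in pattern.items():
        if key not in event:
            return False
        actual = event[key]
        if isinstance(expected, dict):
            # Nested object: recurse into the corresponding event field.
            if not isinstance(actual, dict) or not matches(expected, actual):
                return False
        elif not any(_value_matches(c, actual) for c in expected):
            # Leaf: a list of allowed values, any one of which may match.
            return False
    return True


def sample_event(key):
    """Build a trimmed-down S3 event with only the fields the pattern reads."""
    return {
        "source": "aws.s3",
        "detail-type": "Object Created",
        "detail": {
            "bucket": {"name": "nightly-data-bucket"},
            "object": {"key": key},
        },
    }


new_data = matches(EVENT_PATTERN, sample_event("incoming/2024-01-01.csv"))
other_upload = matches(EVENT_PATTERN, sample_event("archive/old.csv"))
print(new_data, other_upload)
```

A rule with this pattern would list the Step Functions state machine as its target, so only uploads under the chosen prefix start the workflow.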
Option C combines all the right elements: it runs the containers as ECS tasks on Fargate to eliminate EC2 management and idle capacity, uses Step Functions to orchestrate tasks that can run for hours, and uses EventBridge to trigger the state machine when new data is uploaded to S3. This replaces the per-instance cron scheduling with an event-driven serverless orchestration model and significantly reduces operational overhead.
Option B is close but is less appropriate as written: S3 Event Notifications deliver to Amazon SQS, Amazon SNS, AWS Lambda, or Amazon EventBridge, and cannot start a Step Functions state machine directly. Starting the workflow from an S3 event is most naturally handled by an EventBridge rule that targets the state machine. EventBridge is also the recommended event routing layer for integrating service events into workflows.
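The destination limitation can be seen in the shape of the bucket notification configuration itself. The sketch below lists the top-level keys of that configuration and the service each one delivers to; the helper function is hypothetical and exists only for illustration.

```python
# Top-level keys of an S3 bucket notification configuration,
# mapped to the destination each one delivers to.
S3_NOTIFICATION_DESTINATIONS = {
    "TopicConfigurations": "Amazon SNS",
    "QueueConfigurations": "Amazon SQS",
    "LambdaFunctionConfigurations": "AWS Lambda",
    "EventBridgeConfiguration": "Amazon EventBridge",
}

# Turning on EventBridge delivery takes only an empty configuration object.
ENABLE_EVENTBRIDGE = {"EventBridgeConfiguration": {}}


def s3_can_notify_directly(service_name):
    """Hypothetical helper: is this service a direct S3 notification destination?"""
    return service_name in S3_NOTIFICATION_DESTINATIONS.values()


print(s3_can_notify_directly("Amazon EventBridge"))
print(s3_can_notify_directly("AWS Step Functions"))
```

Step Functions is absent from the list, which is why the event has to pass through EventBridge (or an intermediary such as Lambda) before it can start a state machine.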
Option A is not suitable because Lambda's 15-minute maximum execution time rules out multi-hour processing jobs.
Option D is incorrect because Lambda layers are for sharing libraries and runtime dependencies, not for packaging multi-hour container workloads. It also still depends on Lambda runtime limits and does not match the operational model for long-running batch processing.
Therefore, option C is the best modernization approach with the least operational overhead.
References:
AWS documentation on AWS Fargate for running container workloads without managing EC2 instances and supporting long-running tasks.
AWS documentation on AWS Step Functions for orchestrating long-running workflows, retries, parallelism, and service integrations including Amazon ECS.
AWS documentation on Amazon EventBridge for routing Amazon S3 object events to targets such as Step Functions state machines for event-driven architectures.