Apache Iceberg is an open table format designed for large-scale data lakes that supports ACID transactions, schema evolution, time travel, and row-level updates and deletes. Amazon S3 Tables provides fully managed storage for Iceberg tables and integrates with AWS analytics services such as Amazon Athena, Amazon Redshift, and Amazon EMR.
By using AWS Glue with the Iceberg catalog, the data engineer can perform daily updates and deletions without managing Spark clusters or manually scheduling compaction and metadata cleanup. The Iceberg format tracks table state through snapshots, and S3 Tables runs maintenance such as compaction, snapshot management, and unreferenced file removal automatically, which significantly reduces operational overhead.
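As a rough illustration of the daily updates and deletions described above, the sketch below builds the row-level SQL statements that an Iceberg-aware Spark job could submit. It assumes the Iceberg Spark SQL extensions are enabled. The catalog, namespace, table, staging view, and column names are all hypothetical and are not taken from the scenario.

```python
# Minimal sketch: build Iceberg row-level SQL for a daily upsert and delete.
# All identifiers below (catalog, namespace, table, staging view, columns)
# are hypothetical placeholders chosen for illustration only.

TABLE = "s3tables_catalog.sales.orders"   # assumed Iceberg table name
STAGING = "daily_changes"                 # assumed temp view with the day's changes


def build_merge_sql(table: str, staging: str, key: str) -> str:
    """Upsert: update rows whose key matches, insert rows that do not."""
    return (
        f"MERGE INTO {table} AS t\n"
        f"USING {staging} AS s\n"
        f"ON t.{key} = s.{key}\n"
        "WHEN MATCHED THEN UPDATE SET *\n"
        "WHEN NOT MATCHED THEN INSERT *"
    )


def build_delete_sql(table: str, predicate: str) -> str:
    """Row-level delete of every row matching the predicate."""
    return f"DELETE FROM {table} WHERE {predicate}"


if __name__ == "__main__":
    statements = [
        build_merge_sql(TABLE, STAGING, "order_id"),
        build_delete_sql(TABLE, "status = 'cancelled'"),
    ]
    for stmt in statements:
        # In a Spark job with a configured session this would be: spark.sql(stmt)
        print(stmt, end="\n\n")
```

Each statement commits as a new Iceberg snapshot, so only the data files affected by the change are rewritten or marked with deletes, rather than the whole dataset.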
The Apache Hudi option relies on Amazon EMR clusters, Spark jobs, and manually orchestrated compaction, which increases complexity. The Parquet-only approaches in options C and D do not support updates or deletes efficiently: Parquet files are immutable, so changing individual rows means rewriting the affected files or the entire dataset, which does not scale.
Therefore, using S3 Tables with Apache Iceberg provides the most efficient, scalable, and low-maintenance solution that satisfies all query and update requirements.