How does Apache Iceberg on S3 work?
Apache Iceberg on S3 brings ACID transactions, time travel, and schema evolution to data lakes. Here's how Erathos supports S3 + Iceberg as a destination.



What are Amazon S3 and Apache Iceberg?
Amazon S3 (Simple Storage Service) is a scalable and highly durable AWS storage service, used by companies to store large volumes of data with high availability. It is widely adopted as the foundation for data lakes, but on its own, it does not provide structure for efficient analytical table management.
Apache Iceberg is an open-source table format designed to overcome the limitations of traditional approaches such as Hive-style tables and directories of raw Parquet files. It provides ACID transactions, data versioning, and optimizations that improve analytical query performance.
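Concretely, an Iceberg table on S3 is just a set of objects: the data files plus a small tree of metadata files that a catalog points to. A typical layout looks roughly like the sketch below (the bucket, table, and file names are illustrative, not real paths):

```
s3://my-bucket/warehouse/sales/orders/
├── metadata/
│   ├── v1.metadata.json     # schema, partition spec, snapshot history
│   ├── v2.metadata.json     # a new one is written on each commit
│   ├── snap-....avro        # manifest list for one snapshot
│   └── ...-m0.avro          # manifest: data files + column statistics
└── data/
    └── ...-0.parquet        # the actual rows, stored as Parquet files
```

Query engines resolve the table by asking the catalog for the current metadata file, which is how readers and writers always agree on a consistent snapshot.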
Why S3 + Apache Iceberg?
Apache Iceberg has reshaped how analytical tables are managed by addressing the historical limitations of Hive-style tables and raw Parquet directories. It offers:
ACID Transactions: Update, delete, and merge without headaches.
Schema Evolution: Adapt your data as the business changes, without breaking pipelines.
Time Travel: Query previous data versions for audits or comparative analysis.
Scalability: Combine the power of Amazon S3 with Iceberg’s optimized structure.
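The idea behind time travel is simple: every commit produces an immutable snapshot, and a query "as of" some instant just reads the latest snapshot at or before that time. Below is a minimal pure-Python sketch of that concept; it is an illustration only, not the Iceberg or Erathos API.

```python
class TableSnapshots:
    """Toy model of Iceberg's snapshot log for one table."""

    def __init__(self):
        # Each commit appends an immutable (snapshot_id, timestamp_ms, rows) entry.
        self._snapshots = []

    def commit(self, snapshot_id, timestamp_ms, rows):
        """Record a new snapshot, as Iceberg does on every successful write."""
        self._snapshots.append((snapshot_id, timestamp_ms, rows))

    def as_of(self, timestamp_ms):
        """Time travel: return the rows visible at the given instant."""
        visible = [s for s in self._snapshots if s[1] <= timestamp_ms]
        if not visible:
            return []  # table did not exist yet at that time
        # The latest snapshot at or before the requested timestamp wins.
        return max(visible, key=lambda s: s[1])[2]


orders = TableSnapshots()
orders.commit(1, 1000, [("order-1", 10)])
orders.commit(2, 2000, [("order-1", 10), ("order-2", 25)])

print(orders.as_of(1500))  # only the first snapshot is visible
print(orders.as_of(2500))  # the latest snapshot is visible
```

In real Iceberg, engines expose this through syntax such as `FOR TIMESTAMP AS OF` / `FOR VERSION AS OF`, and old snapshots remain queryable until they are expired.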
What changes for Erathos users?
Now, you can route pipelines directly to Iceberg tables on Amazon S3, leveraging the best of both worlds:
Native integration with engines such as Databricks, Trino, Athena, and Spark.
Automatic metadata and partition management.
High performance for analytical queries even on massive datasets.
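As an example of what that integration looks like on the engine side, pointing Spark at an Iceberg catalog on S3 is a matter of catalog configuration. The property names below follow the Apache Iceberg documentation for Spark and AWS; the catalog name, bucket, and use of AWS Glue as the catalog are illustrative assumptions:

```
# spark-defaults style configuration (illustrative values)
spark.sql.extensions                    org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions
spark.sql.catalog.lake                  org.apache.iceberg.spark.SparkCatalog
spark.sql.catalog.lake.catalog-impl     org.apache.iceberg.aws.glue.GlueCatalog
spark.sql.catalog.lake.warehouse        s3://my-bucket/warehouse
spark.sql.catalog.lake.io-impl          org.apache.iceberg.aws.s3.S3FileIO
```

Because the tables live in open formats on S3, engines such as Trino and Athena can read the same tables through their own catalog integration, with no data copies.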
Practical use cases:
Governed lakes: Build scalable data lakes without sacrificing consistency and versioning.
Near real-time analytics: Ingest data incrementally and frequently without compromising query performance.
Audits and compliance: Use time travel to track changes and ensure compliance.
Get started now!
Update your pipelines and try the new S3 + Apache Iceberg destination. In just a few clicks, you’ll have a robust, optimized lakehouse.
➡️ Want to see it in action? Book a meeting with the Erathos team and discover how to simplify your data stack, or try it yourself by following the steps in the documentation.