Data Lakes in OCI Object Storage store raw data for analysis. The three correct characteristics are:
Schema on read (C):Data Lakes store data in its raw, native format (e.g., JSON, CSV, Parquet) without a predefined schema. The schema is applied when data is read or processed, not when written, offering flexibility. For example, a Parquet file with sales data might be queried with SQL only when analyzed, not structured upfront like in a database.
Multiple subject areas (D): Data Lakes aggregate data from diverse sources, such as sales, HR, and IoT systems, spanning multiple subject areas. This enables cross-domain analysis, like combining customer data with weather data for insights, all stored in a single OCI bucket.
Mixed data types (E): Data Lakes support varied formats: structured (e.g., CSV tables), semi-structured (e.g., JSON documents), and unstructured (e.g., videos). For instance, a single bucket might hold CSV logs, JSON events, and image files, all accessible for processing; the short Python sketch after this list pulls the three traits together.
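The following is a minimal sketch of schema on read in Python with pandas. The in-memory byte strings stand in for raw objects sitting in a bucket, and the column names, values, and the sales/weather pairing are illustrative assumptions rather than anything defined by OCI:

```python
import io
import json

import pandas as pd

# Hypothetical raw objects as they might land in one Object Storage bucket.
# No schema was enforced when they were written.
raw_sales_csv = (
    b"order_id,amount,order_date\n"
    b"1001,19.99,2024-05-01\n"
    b"1002,5.50,2024-05-02\n"
)
raw_weather_json = (
    b'[{"date": "2024-05-01", "temp_c": 21.4},'
    b' {"date": "2024-05-02", "temp_c": 17.9}]'
)

# Schema on read: column types are imposed only now, at analysis time.
sales = pd.read_csv(
    io.BytesIO(raw_sales_csv),
    dtype={"order_id": "int64", "amount": "float64"},
    parse_dates=["order_date"],
)

# A semi-structured object from a different subject area (weather, not sales).
weather = pd.DataFrame(json.loads(raw_weather_json))
weather["date"] = pd.to_datetime(weather["date"])

# Cross-domain analysis: join sales with weather by date.
combined = sales.merge(weather, left_on="order_date", right_on="date")
print(combined[["order_id", "amount", "temp_c"]])
```

The same bytes could just as easily be parsed with a different schema tomorrow; nothing about how they were stored constrains how they are read.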
The incorrect options are:
High concurrency (A): Data Lakes in Object Storage are not designed for high-concurrency transactional access (e.g., thousands of simultaneous updates). They are optimized for batch processing and analytics, unlike Oracle Autonomous Transaction Processing (ATP), which is built for high concurrency.
High transaction performance (B): Transactional performance (e.g., fast commits) is a database strength, not a Data Lake's. Object Storage prioritizes scalability and durability over transactional speed, making it unsuitable for OLTP workloads; the sketch below shows why.
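To make the contrast with OLTP concrete, here is a sketch of what "updating a single record" looks like against Object Storage: the whole object is read back, changed in memory, and re-uploaded. The oci Python SDK calls used (get_namespace, get_object, put_object) exist in the SDK, but the bucket name analytics-lake, the object name events/day1.json, and the appended record are hypothetical, and a default OCI CLI config file is assumed:

```python
import json

import oci

# Assumes a default OCI config at ~/.oci/config with access to the bucket.
config = oci.config.from_file()
client = oci.object_storage.ObjectStorageClient(config)
namespace = client.get_namespace().data

bucket = "analytics-lake"          # hypothetical bucket name
object_name = "events/day1.json"   # hypothetical object holding a JSON list

# There is no row-level UPDATE or commit: to change one record, the whole
# object is downloaded, modified in memory, and written back in full.
body = client.get_object(namespace, bucket, object_name).data.content
events = json.loads(body)
events.append({"event": "click", "user": "u42"})  # illustrative new record

client.put_object(
    namespace,
    bucket,
    object_name,
    json.dumps(events).encode("utf-8"),
)
```

Two writers doing this at the same time would simply overwrite each other unless optimistic checks such as if-match preconditions are used, which is why high concurrency and fast transactions are database traits rather than Data Lake traits.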
These traits make Data Lakes ideal for big data analytics, not real-time transactions.