Comprehensive and Detailed Explanation (from AWS Generative AI concepts and services documentation):
Option D is the best solution because it delivers a fully managed, scalable pipeline with minimal infrastructure management while meeting the 50 GB and 4-hour constraints. AWS Step Functions provides a serverless orchestration layer that can coordinate parallel processing steps, retries, and error handling without managing clusters or tuning long-running compute.
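The orchestration described above can be sketched as a Step Functions Distributed Map state, expressed here in Amazon States Language as a Python dict. The state names, the Lambda ARN, and the concurrency figure are illustrative placeholders, not values from the question.

```python
import json

# Hypothetical sketch of a Step Functions Distributed Map state (Amazon States
# Language) that fans out over document batches, caps parallelism with
# MaxConcurrency, and retries transient task failures with backoff.
sanitize_and_embed_map = {
    "Type": "Map",
    "ItemProcessor": {
        "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
        "StartAt": "SanitizeAndEmbed",
        "States": {
            "SanitizeAndEmbed": {
                "Type": "Task",
                # Placeholder ARN for the per-batch worker function.
                "Resource": "arn:aws:lambda:us-east-1:123456789012:function:sanitize-embed",
                "Retry": [
                    {
                        "ErrorEquals": ["States.TaskFailed"],
                        "IntervalSeconds": 5,
                        "MaxAttempts": 3,
                        "BackoffRate": 2.0,
                    }
                ],
                "End": True,
            }
        },
    },
    "MaxConcurrency": 100,  # the parallelism knob for meeting the 4-hour window
    "End": True,
}

print(json.dumps(sanitize_and_embed_map, indent=2))
```

The Retry block is what removes hand-rolled error handling from the worker code: transient Comprehend or Bedrock throttling errors are retried by the state machine itself.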
Using Amazon Comprehend for PII detection fulfills the requirement to remove customer PII in a managed and consistent way. Step Functions can coordinate Comprehend calls at scale and route sanitized outputs into the embedding step. Generating embeddings with Amazon Bedrock keeps the entire workflow within AWS managed services, eliminates the need to maintain custom embedding models, and supports consistent vector representations for downstream retrieval.
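A minimal sketch of the sanitize step, assuming the entity list comes from Amazon Comprehend's DetectPiiEntities API, which returns character offsets (BeginOffset/EndOffset) and an entity Type; the redaction itself is plain string slicing. The sample text and offsets below are invented for illustration.

```python
def redact_pii(text: str, entities: list[dict]) -> str:
    """Replace each detected PII span with a [TYPE] placeholder.

    `entities` mirrors Comprehend's response shape, e.g.:
    [{"Type": "EMAIL", "BeginOffset": 8, "EndOffset": 24}]
    """
    # Process spans right-to-left so earlier offsets stay valid after each
    # replacement changes the string length.
    for ent in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = (
            text[: ent["BeginOffset"]]
            + f"[{ent['Type']}]"
            + text[ent["EndOffset"] :]
        )
    return text

# In the pipeline, entities would come from a call like
# comprehend.detect_pii_entities(Text=..., LanguageCode="en"), and the
# sanitized text would then be embedded via Amazon Bedrock, e.g.
# bedrock_runtime.invoke_model(modelId="amazon.titan-embed-text-v2:0", body=...).
sample = "Contact jane@example.com for details."
entities = [{"Type": "EMAIL", "BeginOffset": 8, "EndOffset": 24}]
print(redact_pii(sample, entities))  # Contact [EMAIL] for details.
```

Keeping redaction offset-based (rather than regex-based) means the same helper works for every PII type Comprehend reports.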
Direct integration with Amazon OpenSearch Serverless provides a low-operations vector store that can handle large-scale indexing and similarity search without cluster sizing, node maintenance, or shard management. This aligns strongly with the requirement for least operational overhead and supports growth beyond the initial 50 GB corpus. Step Functions can batch and parallelize ingestion into OpenSearch Serverless to meet the 4-hour completion goal in a cost-effective manner by controlling concurrency, chunk sizes, and failure handling.
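A back-of-the-envelope sizing sketch for that ingestion plan: given the 50 GB corpus and the 4-hour deadline, how much Map-state concurrency is needed for a given batch size and per-batch processing time? The 64 MB batch size and 90-second batch time below are illustrative assumptions, not benchmarks.

```python
import math

def required_concurrency(corpus_gb: float, deadline_hours: float,
                         batch_mb: float, batch_seconds: float) -> int:
    """Minimum parallel workers so all batches finish within the deadline."""
    total_batches = math.ceil(corpus_gb * 1024 / batch_mb)
    batches_per_worker = math.floor(deadline_hours * 3600 / batch_seconds)
    return math.ceil(total_batches / batches_per_worker)

# 50 GB in 64 MB batches at ~90 s per batch (Comprehend + Bedrock + indexing):
print(required_concurrency(50, 4, 64, 90))
```

Even with pessimistic assumptions, the required parallelism stays modest, which is why controlling MaxConcurrency in Step Functions is enough to hit the window without dedicated compute.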
Option A can be difficult and costly at this scale: Lambda concurrency limits and per-invocation overhead become complex to tune for processing 50 GB within 4 hours. Option B introduces SageMaker Processing jobs and self-managed embedding models, increasing operational complexity. Option C requires EMR cluster provisioning and tuning, which is the opposite of minimal operational overhead.
Therefore, Option D is the most operationally efficient, scalable, and managed approach to build the required PII-sanitized embedding pipeline for a RAG corpus.