Spring Sale Special Limited Time 65% Discount Offer - Ends in 0d 00h 00m 00s - Coupon code: ac4s65

A company is developing a generative AI (GenAI) application that analyzes customer service calls in...

A company is developing a generative AI (GenAI) application that analyzes customer service calls in real time and generates suggested responses for human customer service agents. The application must process 500,000 concurrent calls during peak hours with less than 200 ms end-to-end latency for each suggestion. The company uses existing architecture to transcribe customer call audio streams. The application must not exceed a predefined monthly compute budget and must maintain auto scaling capabilities.

Which solution will meet these requirements?

A.

Deploy a large, complex reasoning model on Amazon Bedrock. Purchase provisioned throughput and optimize for batch processing.

B.

Deploy a low-latency, real-time optimized model on Amazon Bedrock. Purchase provisioned throughput and set up automatic scaling policies.

C.

Deploy a large language model (LLM) on an Amazon SageMaker real-time endpoint that uses dedicated GPU instances.

D.

Deploy a mid-sized language model on an Amazon SageMaker serverless endpoint that is optimized for batch processing.

AIP-C01 PDF/Engine
  • Printable Format
  • Value of Money
  • 100% Pass Assurance
  • Verified Answers
  • Researched by Industry Experts
  • Based on Real Exams Scenarios
  • 100% Real Questions
buy now AIP-C01 pdf
Get 65% Discount on All Products, Use Coupon: "ac4s65"