Option B is the correct solution because it leverages native Amazon Bedrock intelligent prompt routing, which is specifically designed to reduce cost and complexity in multi-model GenAI architectures. Intelligent prompt routing automatically analyzes incoming prompts and selects the most appropriate foundation model based on prompt characteristics and complexity—without requiring custom classification logic or orchestration code.
This approach directly meets the requirement for least implementation effort. The company does not need to deploy additional Lambda functions, maintain routing rules, or manage separate classification stages. Routing decisions are handled by Bedrock, which simplifies architecture and reduces operational risk.
By routing the majority (70%) of simple product inquiries to smaller, lower-cost models, the company minimizes inference cost and latency. More complex return policy inquiries are automatically routed to larger models that provide better reasoning capabilities, preserving response quality and customer satisfaction.
Because routing is handled inline by Bedrock, response latency remains low compared to multi-stage architectures that require an additional classification model call before inference. This is critical for customer service scenarios where responsiveness directly impacts satisfaction.
Option A introduces additional inference steps and custom logic. Option C increases cost by overusing a mid-sized model for all queries. Option D relies on brittle keyword rules and increases operational overhead through endpoint management.
Therefore, Option B delivers the optimal balance of cost efficiency, performance, and simplicity for dynamic model selection in Amazon Bedrock.