Microsoft AI-103 Question Answer

You are creating an agent workflow in a Microsoft Foundry project to support natural voice interactions.

The agent must receive continuous audio input, convert the input into text for reasoning, and then return spoken responses to a user. The workflow must meet the following requirements:

- Support turn-taking dynamics, where the agent begins to generate the speech output before the user finishes speaking.

- Operate with low latency to maintain a conversational experience.

You need to enable both speech to text and text to speech in a real-time agent interaction.

What should you do?

Use an embeddings model to encode the audio, and then decode the audio into text and speech.

Use batch transcription to convert the audio input and return text responses from the agent.

Use speech translation to convert the audio into another language and return the translated text.

Use real-time speech to text for incoming audio and text to speech for agent responses.
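
The streaming pattern named in the last option (real-time speech to text for incoming audio, text to speech for agent responses) can be sketched roughly as below. This is a minimal illustration only: every function here is a hypothetical stand-in written for this sketch, not a call from the Azure Speech SDK or a Foundry API, and each audio chunk is modeled as the word it contains. The sketch replies only after the final transcript and does not model the agent starting to speak before the user finishes.

```python
from typing import Iterable, Iterator, List, Tuple


def stream_speech_to_text(audio_chunks: Iterable[str]) -> Iterator[Tuple[str, bool]]:
    """Hypothetical real-time recognizer: yields (transcript_so_far, is_final).

    Partial results are emitted while audio is still arriving, which is what
    separates real-time recognition from batch transcription.
    """
    words: List[str] = []
    for chunk in audio_chunks:
        words.append(chunk)  # assumption: one chunk carries one word
        yield " ".join(words), False
    yield " ".join(words), True


def agent_reply(transcript: str) -> str:
    """Hypothetical agent reasoning step that works on text."""
    return f"You said: {transcript}"


def stream_text_to_speech(text: str) -> Iterator[bytes]:
    """Hypothetical synthesizer: yields audio incrementally, one piece per word."""
    for word in text.split():
        yield word.encode("utf-8")


def run_voice_turn(audio_chunks: Iterable[str]) -> Tuple[List[str], List[bytes]]:
    """Run one conversational turn: audio in, partial transcripts, spoken reply out."""
    partials: List[str] = []
    spoken: List[bytes] = []
    for text, is_final in stream_speech_to_text(audio_chunks):
        if not is_final:
            partials.append(text)  # available before the user stops speaking
            continue
        # Final transcript received: reason over the text, then stream speech back.
        for audio in stream_text_to_speech(agent_reply(text)):
            spoken.append(audio)
    return partials, spoken


if __name__ == "__main__":
    partials, spoken = run_voice_turn(["book", "a", "table"])
    print(partials)
    print(spoken)
```

In this sketch the partial transcripts exist before the input ends, so a real agent could begin preparing its response early. Batch transcription produces text only after the whole recording has been processed, which conflicts with the low-latency requirement.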
