According to the Microsoft Azure AI Fundamentals (AI-900) official study guide and the Microsoft Learn module “Explore computer vision”, Optical Character Recognition (OCR) is a form of computer vision that detects and extracts printed or handwritten text from images or documents. OCR is particularly useful when the goal is to digitize text from physical documents such as receipts, invoices, or travel expense forms, which is exactly the scenario described in this question.
In the given scenario, employees need a mobile application that lets them scan and store expenses while traveling. They photograph receipts that contain printed text, such as vendor names, totals, dates, and item descriptions. OCR detects the text regions within each image and converts them into machine-readable, searchable data that can be stored in a database or processed further for expense management.
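To make the "machine-readable and searchable" step concrete, the following is a minimal sketch of what an expense app might do once OCR has run. It does not call any OCR service: the `ocr_lines` data, the table name `receipt_lines`, and the helpers `store_receipt` and `search_receipts` are all hypothetical and stand in for whatever the real application would use.

```python
import sqlite3

# Hypothetical OCR result for one receipt photo (not produced by a real service):
# each entry is a recognized text line plus its bounding box
# (left, top, width, height) in pixels.
ocr_lines = [
    ("Contoso Cafe", (40, 20, 300, 40)),
    ("2024-03-14", (40, 70, 160, 30)),
    ("Coffee 3.50", (40, 120, 220, 30)),
    ("TOTAL 3.50", (40, 170, 220, 30)),
]


def store_receipt(conn, receipt_id, lines):
    """Store each recognized text line so it can be searched later."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS receipt_lines "
        "(receipt_id TEXT, line_no INTEGER, text TEXT)"
    )
    conn.executemany(
        "INSERT INTO receipt_lines VALUES (?, ?, ?)",
        [(receipt_id, i, text) for i, (text, _box) in enumerate(lines)],
    )
    conn.commit()


def search_receipts(conn, term):
    """Return the ids of receipts whose text contains the search term."""
    rows = conn.execute(
        "SELECT DISTINCT receipt_id FROM receipt_lines "
        "WHERE text LIKE ? ORDER BY receipt_id",
        (f"%{term}%",),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
store_receipt(conn, "r-001", ocr_lines)
print(search_receipts(conn, "contoso"))
```

A photo on its own cannot be queried this way. Searching by vendor name becomes possible only after OCR has turned the pixels into text.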
Microsoft’s Azure Cognitive Services (since renamed Azure AI services) include the Computer Vision API and the Form Recognizer service (now called Azure AI Document Intelligence), both of which use OCR. Form Recognizer builds on OCR by adding document understanding, which lets it extract structured data such as the vendor, date, and total from expense receipts automatically.
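The difference between raw OCR text and structured data can be illustrated with a small rule-based sketch. This is not how Form Recognizer works internally, since that service uses trained models. The `ReceiptFields` type, the `parse_receipt` function, the regular expressions, and the "first non-empty line is the vendor" heuristic are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ReceiptFields:
    vendor: Optional[str]
    date: Optional[str]
    total: Optional[float]


# Assumed formats: ISO dates (YYYY-MM-DD) and a line that starts with "total".
DATE_RE = re.compile(r"\b(\d{4}-\d{2}-\d{2})\b")
TOTAL_RE = re.compile(r"^\s*total\b[^\d]*(\d+(?:\.\d{2})?)", re.IGNORECASE)


def parse_receipt(lines: List[str]) -> ReceiptFields:
    """Turn plain OCR text lines into named fields using simple rules."""
    # Heuristic (assumption): the first non-empty line is the vendor name.
    vendor = next((ln.strip() for ln in lines if ln.strip()), None)
    date = None
    total = None
    for ln in lines:
        if date is None:
            date_match = DATE_RE.search(ln)
            if date_match:
                date = date_match.group(1)
        total_match = TOTAL_RE.search(ln)
        if total_match:
            # Keep the last matching line; "Subtotal ..." does not match.
            total = float(total_match.group(1))
    return ReceiptFields(vendor, date, total)


fields = parse_receipt(
    ["Contoso Cafe", "2024-03-14", "Coffee 3.50", "Subtotal 3.50", "TOTAL 3.50"]
)
print(fields)
```

OCR alone supplies the list of strings passed to `parse_receipt`. The mapping from those strings to named fields is the extra layer that a document-understanding service provides.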
The other answer options are incorrect for the following reasons:
A. Semantic segmentation assigns a class label to every pixel in an image. It is typically used in autonomous driving or medical imaging, not for text extraction.
B. Image classification identifies the overall category of an image (e.g., “This is a receipt”), but it does not extract the textual content.
C. Object detection identifies and locates objects in an image with bounding boxes, but it does not read the text they contain or convert it into characters.
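One way to see why only OCR fits is to compare what each technique returns. The sketch below uses made-up toy values and hypothetical types (`Detection`, `TextLine`, `readable_text`) and does not represent the response format of any Azure API.

```python
from dataclasses import dataclass
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # left, top, width, height


@dataclass
class Detection:
    label: str  # what the object is
    box: Box    # where it is


@dataclass
class TextLine:
    text: str   # the characters that were read
    box: Box    # where they were found


# Toy results for the same receipt photo (illustrative values only).
segmentation_mask = [            # semantic segmentation: one label per pixel
    ["paper", "paper", "table"],
    ["paper", "paper", "table"],
]
classification = "receipt"       # image classification: one label per image
detections = [Detection("receipt", (0, 0, 2, 2))]  # object detection: labeled boxes
ocr_result = [TextLine("TOTAL 3.50", (0, 1, 2, 1))]  # OCR: text plus location


def readable_text(result) -> List[str]:
    """Return the text strings contained in a result, or [] if it has none."""
    if isinstance(result, list):
        return [item.text for item in result if isinstance(item, TextLine)]
    return []


for name, result in [
    ("semantic segmentation", segmentation_mask),
    ("image classification", classification),
    ("object detection", detections),
    ("OCR", ocr_result),
]:
    print(name, "->", readable_text(result))
```

Only the OCR result carries the characters printed on the receipt. The other three describe what is in the image or where it is, but not what it says.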
Therefore, based on the official AI-900 training and Microsoft Learn content, the correct answer is D. Optical Character Recognition (OCR), the technology that extracts textual information from scanned expense receipts.