Domain 5: Evaluation and Troubleshooting - Exam Cues

"Least Operational Overhead" Winners

The services below appear repeatedly as the "correct" answer whenever a question asks for the option with the "least operational overhead" or "least development effort."

| Category | "Winner" Service | Why? |
|---|---|---|
| Orchestration | Step Functions | Managed state machine; better than managing Lambda code/retries. |
| Text Extraction | Textract / BDA | Managed OCR; better than custom libraries. |
| Search/RAG | Knowledge Bases | Fully managed RAG pipeline; better than building your own. |
| Apps/Chat | Amazon Q Business | Out-of-the-box chat app; better than building a React UI. |
| Evaluation | LLM-as-a-judge | Automated scoring; better than manual human review. |
| Data Quality | Glue Data Quality | Managed rules; better than writing Python validation scripts. |
| Private Network | VPC Endpoints | Managed PrivateLink; better than VPN/Direct Connect for service access. |
| Classification | Comprehend | Managed classifier; better than fine-tuning a model for simple tags. |
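To make the "LLM-as-a-judge" row concrete, here is a minimal sketch of the pattern the managed evaluation automates: build a grading prompt for a judge model, then parse a numeric score from its reply. The template, the 1-5 scale, and the function names are illustrative assumptions, not the Bedrock API.

```python
import re

# Illustrative judge prompt; real evaluation services supply their own rubrics.
JUDGE_TEMPLATE = (
    "You are an impartial judge. Rate the response below from 1 to 5 "
    "for helpfulness and accuracy. Reply with only the number.\n\n"
    "Question: {question}\nResponse: {response}\nScore:"
)

def build_judge_prompt(question: str, response: str) -> str:
    """Fill the judge template with the item under evaluation."""
    return JUDGE_TEMPLATE.format(question=question, response=response)

def parse_score(judge_reply: str) -> int:
    """Extract the first digit 1-5 from the judge model's reply."""
    m = re.search(r"[1-5]", judge_reply)
    if not m:
        raise ValueError(f"No score found in: {judge_reply!r}")
    return int(m.group())

print(parse_score("Score: 4 - mostly accurate"))  # → 4
```

The exam point is that a managed service (Bedrock model evaluation with an LLM judge) runs this loop for you at scale, rather than your team scoring outputs by hand.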

Exam Cues

| If you see… | Think… |
|---|---|
| "Compare summaries" + "Automated" | ROUGE / BLEU (via Bedrock Evaluation) |
| "Validate new model" + "No user impact" | Shadow Testing (SageMaker) |
| "Drift" or "Degraded accuracy" | Model Monitor |
| "Audit all prompts/responses" | Invocation Logging (to S3) |
| "Latency alert" | CloudWatch Alarm (on InvocationLatency) |
| "Subjective quality" + "Scale" | LLM-as-a-judge (Automated Evaluation) |
| "Ground Truth" + "High Risk" | Human Evaluation (Own Team) |
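For the ROUGE / BLEU cue, it helps to know what the metric actually measures. ROUGE-1 recall, for example, is the fraction of reference unigrams that also appear in the candidate summary. A simplified pure-Python sketch (Bedrock evaluation jobs compute these metrics for you; this is only to show the idea):

```python
from collections import Counter

def rouge1_recall(reference: str, candidate: str) -> float:
    """ROUGE-1 recall: share of reference unigrams found in the candidate.

    Simplified: lowercase whitespace tokenization, clipped counts.
    """
    ref = Counter(reference.lower().split())
    cand = Counter(candidate.lower().split())
    # Clip each word's overlap at its count in the candidate.
    overlap = sum(min(n, cand[w]) for w, n in ref.items())
    return overlap / sum(ref.values())

score = rouge1_recall("the cat sat on the mat", "the cat is on the mat")
print(round(score, 3))  # → 0.833 (5 of 6 reference unigrams matched)
```

This is why ROUGE suits the "automated comparison of summaries" cue: it needs only the reference and candidate text, no human rater.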