Bedrock deep dive: structuring resumes at scale
When you're processing tens of thousands of CVs every day, the difference between a well-designed AI pipeline and a naïve one is the difference between a bill you can justify and one that keeps the CFO up at night. This post walks through how we built a scalable, cost-efficient document structuring service using Amazon Bedrock at eFinancialCareers.
The problem
CVs arrive in every conceivable format — PDFs, Word documents, plain text, HTML scraped from profiles. The data inside is unstructured: inconsistent date formats, freeform job titles, missing fields, multilingual content. The goal was to extract a clean, structured JSON representation of each CV that downstream services could consume reliably.
Why Amazon Bedrock?
We evaluated several approaches including fine-tuned models, traditional NLP pipelines, and managed LLM services. Bedrock won on three axes:
- No model hosting overhead — serverless API, no GPU instances to manage
- AWS-native integration — IAM, CloudWatch, and VPC support out of the box
- Model flexibility — ability to swap foundation models (Claude, Titan) without changing infrastructure
Architecture overview
The pipeline is entirely serverless and event-driven:
S3 (raw CV upload)
→ SQS queue (decoupling & retry)
→ Lambda (Bedrock extraction)
→ DynamoDB (structured output)
→ EventBridge (downstream notification)
Each Lambda invocation processes one CV, calls the Bedrock API with a structured extraction prompt, validates the JSON response, and writes to DynamoDB. Failed extractions are routed to a dead-letter queue for inspection.
Prompt engineering for structured output
The key to reliable extraction is a well-structured prompt with explicit output schema instructions. We use Claude via Bedrock with a system prompt that defines the expected JSON schema and instructs the model to return only valid JSON — no prose, no explanation. We validate the response with a JSON schema validator before writing, and fall back to a retry with a simplified prompt if validation fails.
Cost management at scale
At 10k+ documents per day, Bedrock input/output tokens add up quickly. Our optimisations:
- Pre-process documents to strip noise (headers, footers, repeated boilerplate) before sending to Bedrock
- Cache structured results in DynamoDB with a TTL — re-parse only when the source document changes
- Use the smallest capable model for each task — lighter models for field extraction, larger ones only for ambiguous cases
Lessons learned
The biggest surprises weren't technical — they were operational. SQS visibility timeouts need careful calibration when Bedrock latency varies. CloudWatch alarms on DLQ depth are essential. And prompt versioning (treating prompts like code, with version control and testing) turned out to be one of the most valuable practices we adopted.
If you're evaluating Bedrock for a similar use case, feel free to get in touch — we're always happy to discuss AI pipeline architecture.