Claude Prep Lab — CCA-F Cheat Sheet
5 domains · Scenario-based MCQ · 720/1000 to pass
Correct loop control
- Continue when
stop_reason = "tool_use" - Terminate when
stop_reason = "end_turn" - Append tool results to conversation history each iteration
Anti-patterns
- Parse natural language signals to detect completion
- Use arbitrary iteration caps as primary stop mechanism
- Check for assistant text content as completion indicator
Hub-and-spoke pattern
- Coordinator manages ALL inter-subagent communication and error handling
- Subagents have isolated context — they do not inherit coordinator history
- Pass complete findings explicitly in each subagent's prompt
- Use
Tasktool to spawn subagents;allowedToolsmust include"Task" - Spawn parallel subagents by emitting multiple
Taskcalls in a single coordinator response (not across separate turns) - Coordinator prompts: specify goals + quality criteria, not step-by-step procedures
Narrow decomposition
If coordinator assigns subtasks that don't cover the full topic, downstream agents succeed but output is incomplete. Design coordinator to map the full domain before assigning subtasks.
Forking & resumption
--resume <name>continues a named sessionfork_sessionbranches from a shared baseline- If tool results are stale, start fresh with injected summary
- When resuming, tell the agent which files changed
Deterministic. Use when compliance is non-negotiable: identity verification, financial thresholds, policy gates. Zero failure rate.
Probabilistic. Non-zero failure rate. OK for guidance and suggestions. Never rely on this for critical business logic.
SDK Hooks (PostToolUse)
- Intercept tool results to normalize formats before model sees them (Unix timestamps, ISO 8601, numeric codes)
- Intercept outgoing tool calls to enforce rules (block refunds > $500, redirect to escalation)
Prompt chaining
Fixed sequential steps. Best for predictable multi-aspect reviews. Example: analyze each file, then run a cross-file integration pass.
Dynamic decomposition
Generate subtasks based on what is discovered at each step. Best for open-ended investigation. Map structure first, then prioritize adaptively.
What a good tool description includes
- Purpose (specific, not generic)
- Expected input formats and example queries
- Edge cases and what it does NOT handle
- When to use it vs similar alternatives
Overlapping descriptions
Two tools with near-identical descriptions cause misrouting. Fix: rename + rewrite to eliminate overlap. System prompt wording can also create unintended tool associations.
Split generic tools
Replace one vague tool with 2-3 purpose-specific tools with defined input/output contracts. E.g.,
analyze_document → extract_data_points, summarize_content,
verify_claim.
Error response must include:
errorCategory: transient / validation / permission / businessisRetryable: boolean- Human-readable description
- For business violations:
retriable: false+ customer-friendly message
Needs retry logic. Propagate with structured context so coordinator can decide.
Successful query, no matches. Do NOT treat as an error. Distinguish clearly.
Too many tools = worse performance
18 tools vs 4-5 tools — decision complexity degrades reliability. Give each agent only the tools its role needs. Agents with out-of-scope tools will misuse them.
tool_choice options
"auto"— model may return text or call a tool"any"— model must call some tool (prevents conversational text){"type":"tool","name":"..."}— forces a specific named tool first
Project-level (.mcp.json)
Shared via version control. Use for shared team tooling. Use ${ENV_VAR} expansion for auth
tokens — never commit secrets.
User-level (~/.claude.json)
Personal/experimental only. Not shared with teammates. Both levels can be active simultaneously.
MCP Resources
Expose content catalogs (issue summaries, DB schemas, doc hierarchies) as resources to reduce exploratory tool calls. Use existing community MCP servers for standard integrations (e.g., Jira) — build custom only for team-specific workflows.
Grep vs Glob
- Grep: search file contents (function names, imports, error strings)
- Glob: find files by name/extension pattern (
**/*.test.tsx)
Edit vs Read+Write
- Edit: targeted modification using unique text match
- If Edit fails (non-unique match), fall back to Read then Write
Three levels
~/.claude/CLAUDE.md— user-level: personal only, not version-controlled.claude/CLAUDE.mdor rootCLAUDE.md— project-level: shared via git- Subdirectory
CLAUDE.md— directory-level: scoped to that folder
@import syntax
Reference external files to keep CLAUDE.md modular. Import only the standards relevant to each package.
.claude/rules/
Topic-specific rule files (e.g., testing.md, api-conventions.md). Use YAML
frontmatter with paths: glob patterns for conditional loading only when editing matching
files.
User-level config doesn't share to teammates
If a new team member doesn't receive instructions, check whether they live in
~/.claude/CLAUDE.md instead of project-level. Use /memory to verify which files
are loaded in a session.
Commands
.claude/commands/— project-scoped, version-controlled, team-wide~/.claude/commands/— personal only, not shared
Skills (.claude/skills/)
context: fork— runs in isolated sub-agent context, prevents output polluting main sessionallowed-tools— restricts tools during skill executionargument-hint— prompts for required params if invoked without arguments
On-demand invocation for task-specific workflows (e.g., codebase analysis, brainstorming).
Always-loaded universal standards. Rules that must apply to every single session.
Use .claude/rules/ with glob patterns when:
- Conventions span multiple directories (test files spread throughout codebase)
- You need automatic loading without manual invocation
- Rules apply to file types regardless of location (
**/*.test.tsx)
Preferred over subdirectory CLAUDE.md files for cross-directory conventions.
Plan Mode when:
- Large-scale changes across many files
- Multiple valid approaches exist
- Architectural decisions required
- Monolith-to-microservices, library migrations (45+ files)
- You need to explore before committing
Direct Execution when:
- Simple, well-scoped single-file changes
- Clear stack trace pointing to one fix
- Adding a single validation or conditional
- Scope and approach are already clear
Key CLI flags
-p/--print— non-interactive mode (prevents pipeline hangs waiting for input)--output-format json+--json-schema— machine-parseable structured output for inline PR comments
Session isolation for code review
The same session that generated code is less effective at reviewing it (retains reasoning context, less likely to question its own decisions). Use an independent review instance without the generator's reasoning context.
"Flag comments only when claimed behavior contradicts actual code behavior"
"Be conservative" / "Only report high-confidence findings" — no meaningful improvement in precision
False positive effect
High false-positive categories undermine trust in accurate ones. Temporarily disable noisy categories to restore developer trust while improving those prompts separately. Define explicit severity criteria with concrete code examples per severity level.
When few-shot is the most effective technique
- Detailed instructions alone produce inconsistent output
- Ambiguous scenarios where judgment must be demonstrated
- Reducing hallucination in extraction tasks with varied formats
- Showing model what to accept vs flag (false positive reduction)
- Demonstrating correct handling of varied document structures
Structure of good few-shot examples
- 2-4 targeted examples covering ambiguous cases
- Show reasoning: why this choice over plausible alternatives
- Demonstrate desired output format (location, issue, severity, suggested fix)
- Include varied document structures to enable generalization to novel patterns
Key rules
tool_use+ JSON schema = most reliable structured output (eliminates syntax errors)- Strict schemas prevent syntax errors but not semantic errors (line items not summing, values placed in wrong fields)
- Make fields optional/nullable when source may not contain that data — prevents fabrication to satisfy required fields
- Use
"other" + detailfield pattern for extensible enum categories - Add
"unclear"enum value for genuinely ambiguous cases
Retry works for:
- Format mismatches
- Structural output errors
- Schema compliance failures
Include on retry: original doc + failed extraction + specific validation error.
Retry won't fix:
- Information absent from the source document
- Data only in an external doc not provided
Detect these cases and route to human review. Don't loop indefinitely.
Semantic error detection
Extract calculated_total alongside stated_total and flag mismatches for human
review. Never silently reconcile financial data or pick the "most likely correct" value automatically.
Use Message Batches API for:
- Overnight reports, weekly audits
- Nightly test generation
- Any latency-tolerant workload
50% cost savings. Up to 24h processing window. No guaranteed latency SLA.
Do NOT use batch for:
- Blocking pre-merge checks (devs wait for results)
- Any workflow requiring fast response
- Multi-turn tool calling (not supported)
Batch SLA formula
Worst-case total = max wait time + 24h batch window. Must be under SLA. Example: 30h SLA → submit every 4h (4+24=28h, 2h buffer). 6h submission windows leave zero buffer.
Independent review beats self-review
A model retains reasoning context from generation — less likely to question its own decisions. An independent Claude instance (without prior context) catches more subtle issues than self-review instructions or extended thinking. Split large reviews into per-file local passes + cross-file integration pass to avoid attention dilution.
Lost in the middle
Models reliably process info at the beginning and end of long inputs. Middle sections get attention gaps. Put key findings summaries first. Use explicit section headers for detailed middle content.
Structured fact extraction
Extract transactional facts (amounts, dates, order numbers, statuses) into a persistent "case facts" block included in each prompt — outside summarized history. Don't let these get condensed.
Trim verbose tool outputs
Tool results accumulate and consume tokens disproportionately to their relevance. Keep only relevant fields (e.g., 5 return-relevant fields from an order lookup with 40+ fields).
Long multi-issue sessions
Progressive summarization for resolved earlier issues — compress into narrative summaries, keep full history only for the active issue. Simpler and more reliable than sidecar context layers or sliding windows.
Escalate when:
- Customer explicitly requests a human — honor immediately, no investigation first
- Policy is ambiguous or silent on the specific request (not just "complex")
- Unable to make meaningful progress on the case
Frustration without explicit escalation request
Acknowledge frustration + offer to resolve. Only escalate if the customer reiterates their preference for a human. Never take unilateral action (e.g., process the refund anyway) without honoring their stated preference first.
Unreliable escalation triggers — don't use:
- Sentiment analysis / frustration detection (doesn't correlate with case complexity)
- Self-reported LLM confidence scores (poorly calibrated on hard cases)
- "Case complexity" heuristics without explicit criteria
Multiple customer matches
When tool returns multiple matches: ask for additional identifiers. Never select based on heuristics (e.g., "most recent" or "most likely").
Structured error context to coordinator must include:
- Failure type
- What was attempted (query, parameters)
- Any partial results obtained
- Potential alternative approaches
Anti-pattern: silent suppression
Returning empty results as "success" prevents recovery and risks incomplete research outputs. Never mask errors this way.
Anti-pattern: full workflow termination
Terminating the entire workflow on one subagent failure. Handle locally when possible; propagate with context when not. Proceed with partial results.
Scratchpad files
Persist key findings across context boundaries. Agents reference scratchpad for subsequent questions to counteract context degradation (model referencing "typical patterns" instead of actual classes found).
Crash recovery pattern
Each agent exports state to a known location. Coordinator loads a manifest on resume and injects state into agent prompts. Don't require full re-exploration.
Coordinator spawning subagents for simple follow-ups
If the coordinator already has findings in its context, do not spawn a subagent to re-ingest 80K tokens for a simple summarization request. Subagent spawning is for complex analysis — not follow-up questions on already-retrieved data.
/compact
Use to reduce context usage during extended exploration sessions when context fills with verbose discovery output. Combine with Explore subagent to isolate verbose discovery phases.
Don't trust aggregate accuracy metrics
97% overall accuracy can mask poor performance on specific document types or fields. Validate accuracy by document type and field segment before reducing human review. Use stratified random sampling to detect novel error patterns.
Confidence-based routing
- Output field-level confidence scores per extraction
- Calibrate review thresholds using labeled validation sets
- Route low-confidence or conflicting-source extractions to human review
- Prioritize limited reviewer capacity toward highest-impact cases
Claim-source mapping requirements
- Subagents must output structured claim-source mappings (URL, doc name, excerpt)
- Synthesis agents must preserve attribution through the synthesis step
- Conflicting statistics from credible sources: annotate with source attribution, don't arbitrarily pick one
- Always include publication/collection dates to prevent temporal differences being misread as contradictions
Format
- Multiple choice, 4 options, 1 correct answer
- 4 scenarios drawn randomly from 6
- No penalty for guessing (answer everything)
Scoring
- Scaled score: 100–1000
- Minimum passing score: 720
- Pass/Fail designation only