Structured Outputs with Claude API: Production Patterns in Go
The difference between a demo LLM integration and a production one often comes down to structured outputs. In a demo, free-form text is fine because you are showing a human-readable result. In production, you need to reliably parse the response into typed data structures, validate it, handle failures gracefully, and integrate it into downstream systems that expect specific types. This post covers the patterns that have worked in our Go services.
Why Free-Form Text Fails in Production
LLMs are probabilistic. Even with a fixed system prompt, the same input can produce slightly different output formats across calls. "Return the ACOS as a number" might produce 23.5 on one call, 23.5% on the next, and "ACOS: 23.5%" after that. Your production system must handle every variant or fail on the ones it did not anticipate.
Structured outputs — combined with JSON schema validation — eliminate this class of problem. Instead of parsing the LLM response as free text, you define exactly what shape the response should take, enforce it in the prompt, validate it on receipt, and retry if validation fails.
Approach 1: JSON Schema in System Prompt
The simplest approach: tell the model exactly what JSON schema to produce, and refuse to use any response that does not conform:
const analysisSchema = `{
"type": "object",
"required": ["acos", "roas", "trend", "recommendation"],
"properties": {
"acos": {
"type": "number",
"description": "Advertising Cost of Sales percentage (0-100)"
},
"roas": {
"type": "number",
"description": "Return on Ad Spend"
},
"trend": {
"type": "string",
"enum": ["improving", "stable", "degrading"]
},
"recommendation": {
"type": "object",
"required": ["action", "reason", "estimated_impact"],
"properties": {
"action": {"type": "string"},
"reason": {"type": "string"},
"estimated_impact": {"type": "string"}
}
}
}
}`
func buildSystemPrompt() string {
return fmt.Sprintf(`You analyse Amazon advertising campaign performance.
Respond ONLY with a JSON object matching this schema:
%s
Rules:
- acos and roas must be numbers, not strings
- trend must be exactly one of: "improving", "stable", "degrading"
- Do not include markdown, code blocks, or any text outside the JSON`, analysisSchema)
}
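The later snippets unmarshal into an Analysis type that the post never shows. A minimal sketch of what it might look like, assuming the Go fields simply mirror the schema above (the struct layout and the parseAnalysis helper are illustrative assumptions, not the original service code):

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Recommendation mirrors the nested "recommendation" object in the schema.
// Hypothetical layout: the original type is not shown in the post.
type Recommendation struct {
	Action          string `json:"action"`
	Reason          string `json:"reason"`
	EstimatedImpact string `json:"estimated_impact"`
}

// Analysis mirrors the top-level schema fields.
type Analysis struct {
	ACOS           float64        `json:"acos"`
	ROAS           float64        `json:"roas"`
	Trend          string         `json:"trend"`
	Recommendation Recommendation `json:"recommendation"`
}

// parseAnalysis decodes a JSON document into an Analysis and wraps any
// decode error with context. It does not perform schema validation.
func parseAnalysis(raw string) (*Analysis, error) {
	var a Analysis
	if err := json.Unmarshal([]byte(raw), &a); err != nil {
		return nil, fmt.Errorf("decode analysis: %w", err)
	}
	return &a, nil
}
```

Keeping the JSON tags identical to the schema property names means a response that passes schema validation should decode without surprises.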
Robust JSON Extraction
Even with explicit instructions, Claude sometimes wraps the JSON in markdown code blocks. A robust extractor handles this:
var jsonExtractors = []struct {
pattern *regexp.Regexp
capture int
}{
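// Backslashes are doubled because these are interpreted string literals;
// raw (backtick) strings cannot be used since the patterns contain backticks.
{regexp.MustCompile("(?s)```json\\s*(\\{.*?\\})\\s*```"), 1},
{regexp.MustCompile("(?s)```\\s*(\\{.*?\\})\\s*```"), 1},
{regexp.MustCompile("(?s)(\\{.*\\})"), 1}, // fallback: span from the first '{' to the last '}'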
}
func extractJSON(raw string) (string, error) {
raw = strings.TrimSpace(raw)
// Happy path: the entire response is already JSON
if strings.HasPrefix(raw, "{") && json.Valid([]byte(raw)) {
return raw, nil
}
// Try extractors in order
for _, e := range jsonExtractors {
if m := e.pattern.FindStringSubmatch(raw); len(m) > e.capture {
candidate := strings.TrimSpace(m[e.capture])
if json.Valid([]byte(candidate)) {
return candidate, nil
}
}
}
return "", fmt.Errorf("no valid JSON found in response (raw: %q)", truncate(raw, 200))
}
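The extractor above calls a truncate helper that is not shown. One possible version, sketched here as an assumption about its behaviour (cut to a byte limit without splitting a multi-byte UTF-8 character), could look like this:

```go
package main

import "unicode/utf8"

// truncate returns s cut to at most limit bytes. If the cut would land in the
// middle of a multi-byte UTF-8 sequence, it backs up to the previous rune
// boundary so the result stays valid UTF-8 (assuming s was valid to begin with).
// Hypothetical helper: the original implementation is not shown in the post.
func truncate(s string, limit int) string {
	if limit < 0 {
		limit = 0
	}
	if len(s) <= limit {
		return s
	}
	// Step back while the byte at the cut point is a continuation byte.
	for limit > 0 && !utf8.RuneStart(s[limit]) {
		limit--
	}
	return s[:limit]
}
```

The same helper would also work for capping the Prompt and Response fields of a trace record at a fixed byte size.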
Schema Validation with jsonschema
Parsing the JSON is necessary but not sufficient. A response like {"acos": "high", "roas": null} parses fine but violates the schema. Validate against the schema before using any field:
import "github.com/santhosh-tekuri/jsonschema/v5"
var compiledSchema *jsonschema.Schema
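func init() {
compiler := jsonschema.NewCompiler()
// AddResource returns an error; check it so a bad schema fails at startup.
if err := compiler.AddResource("analysis.json", strings.NewReader(analysisSchema)); err != nil {
panic(err)
}
var err error
compiledSchema, err = compiler.Compile("analysis.json")
if err != nil {
panic(err)
}
}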
func validateAnalysis(raw []byte) error {
var v interface{}
if err := json.Unmarshal(raw, &v); err != nil {
return fmt.Errorf("invalid JSON: %w", err)
}
if err := compiledSchema.Validate(v); err != nil {
var ve *jsonschema.ValidationError
if errors.As(err, &ve) {
return fmt.Errorf("schema violation: %s", ve.Message)
}
return err
}
return nil
}
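To see why parsing alone is not enough, it can help to run the example payload through the standard library. The sketch below uses only encoding/json; the looseAnalysis type and decodeLoose helper are made up for illustration. It shows that a null or missing field decodes silently to zero, which is exactly the kind of gap schema validation is meant to close:

```go
package main

import "encoding/json"

// looseAnalysis is a hypothetical two-field target type used only for this demo.
type looseAnalysis struct {
	ACOS float64 `json:"acos"`
	ROAS float64 `json:"roas"`
}

// decodeLoose reports whether raw is syntactically valid JSON, and what a plain
// json.Unmarshal into a typed struct makes of it.
func decodeLoose(raw string) (valid bool, out looseAnalysis, err error) {
	valid = json.Valid([]byte(raw))
	err = json.Unmarshal([]byte(raw), &out)
	return valid, out, err
}
```

With {"acos": "high", "roas": null} the input is valid JSON and Unmarshal reports a type error for acos, but the null roas raises no complaint. With {"roas": null} alone there is no error at all and both fields end up as 0, indistinguishable from a genuine zero.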
Retry with Correction Message
When the response fails validation, retry once with an explicit correction message appended to the conversation. This handles ~95% of malformed responses:
func (c *ClaudeClient) GetStructuredAnalysis(ctx context.Context, data CampaignData) (*Analysis, error) {
messages := []anthropic.MessageParam{
anthropic.NewUserMessage(anthropic.NewTextBlock(formatData(data))),
}
for attempt := 0; attempt < 2; attempt++ {
resp, err := c.client.Messages.New(ctx, anthropic.MessageNewParams{
Model: anthropic.F(c.model),
MaxTokens: anthropic.F(int64(512)),
System: anthropic.F([]anthropic.TextBlockParam{anthropic.NewTextBlock(buildSystemPrompt())}),
Messages: anthropic.F(messages),
})
if err != nil { return nil, fmt.Errorf("claude api: %w", err) }
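// Guard against an empty content slice before indexing into it.
if len(resp.Content) == 0 {
return nil, fmt.Errorf("claude api: empty response content")
}
rawText := resp.Content[0].Text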
jsonStr, err := extractJSON(rawText)
if err != nil {
if attempt == 0 {
// Append Claude's response and a correction to the conversation
messages = append(messages,
anthropic.NewAssistantMessage(anthropic.NewTextBlock(rawText)),
anthropic.NewUserMessage(anthropic.NewTextBlock(
"Your response did not contain valid JSON. Please respond with ONLY the JSON object, "+
"no markdown, no explanation.")),
)
continue
}
return nil, fmt.Errorf("no JSON after retry: %w", err)
}
if err := validateAnalysis([]byte(jsonStr)); err != nil {
if attempt == 0 {
messages = append(messages,
anthropic.NewAssistantMessage(anthropic.NewTextBlock(rawText)),
anthropic.NewUserMessage(anthropic.NewTextBlock(
"Schema validation failed: "+err.Error()+". Please correct the response.")),
)
continue
}
return nil, fmt.Errorf("validation failed after retry: %w", err)
}
var analysis Analysis
if err := json.Unmarshal([]byte(jsonStr), &analysis); err != nil {
return nil, fmt.Errorf("decode analysis: %w", err)
}
return &analysis, nil
}
return nil, ErrUnparseable
}
Approach 2: Claude's Native Tool Use
For new integrations, Claude's native tool/function calling is more reliable than manual JSON prompting. You define a tool with a JSON schema, Claude calls it with structured arguments, and you never deal with extraction:
resp, err := client.Messages.New(ctx, anthropic.MessageNewParams{
Model: anthropic.F(anthropic.ModelClaude3_5SonnetLatest),
MaxTokens: anthropic.F(int64(1024)),
Tools: anthropic.F([]anthropic.ToolParam{{
Name: anthropic.F("record_analysis"),
Description: anthropic.F("Record the campaign performance analysis"),
InputSchema: anthropic.F(anthropic.ToolInputSchemaParam{
Type: anthropic.F(anthropic.ToolInputSchemaTypeObject),
Properties: anthropic.F[interface{}](map[string]interface{}{
"acos": map[string]interface{}{"type": "number"},
"roas": map[string]interface{}{"type": "number"},
"trend": map[string]interface{}{"type": "string", "enum": []string{"improving","stable","degrading"}},
}),
Required: anthropic.F([]string{"acos", "roas", "trend"}),
}),
}}),
ToolChoice: anthropic.F[anthropic.ToolChoiceUnionParam](anthropic.ToolChoiceToolParam{
Type: anthropic.F(anthropic.ToolChoiceToolTypeTool), // "tool" forces this specific tool to be called
Name: anthropic.F("record_analysis"),
}),
Messages: anthropic.F(messages),
})
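// Extract the tool call arguments. They arrive as JSON, so no text extraction
// is needed, but still check the decode error before trusting the result.
for _, block := range resp.Content {
if block.Type == "tool_use" {
var analysis Analysis
if err := json.Unmarshal(block.Input, &analysis); err != nil {
return nil, fmt.Errorf("decode tool input: %w", err)
}
return &analysis, nil
}
}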
Observability for LLM Calls
Log a sampled subset of prompt/response pairs to a separate analytics table. This is essential for debugging quality issues and understanding where the model is struggling:
type LLMTrace struct {
RequestID string `db:"request_id"`
Model string `db:"model"`
Prompt string `db:"prompt"` // truncated to 5KB
Response string `db:"response"` // truncated to 5KB
ParsedOK bool `db:"parsed_ok"`
RetryCount int `db:"retry_count"`
LatencyMS int64 `db:"latency_ms"`
InputTokens int `db:"input_tokens"`
OutputTokens int `db:"output_tokens"`
Cost float64 `db:"cost"`
CreatedAt time.Time `db:"created_at"`
}
// Sample 10% of calls — enough for debugging without storing everything
if rand.Float64() < 0.10 {
go s.logTrace(context.Background(), trace)
}
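Pure random sampling can drop exactly the traces you most want when debugging. One variation, sketched here as an alternative to the rand.Float64 check rather than a description of the original service, always keeps failed or retried calls and samples the rest deterministically by hashing the request ID (shouldLogTrace is a hypothetical helper):

```go
package main

import "hash/fnv"

// shouldLogTrace decides whether to persist a trace. Calls that failed to
// parse or needed a retry are always kept. Successful calls are sampled by
// hashing the request ID, so the same request always gets the same decision.
func shouldLogTrace(requestID string, parsedOK bool, retryCount int, rate float64) bool {
	if !parsedOK || retryCount > 0 {
		return true
	}
	h := fnv.New32a()
	h.Write([]byte(requestID)) // Write on a hash.Hash never returns an error
	// Map the 32-bit hash onto [0, 1) and compare against the sampling rate.
	return float64(h.Sum32())/float64(1<<32) < rate
}
```

Because the decision depends only on the request ID, every service that sees the same ID will agree on whether it is sampled, which can make cross-service debugging easier.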
Structured outputs are the foundation of reliable LLM-powered features. Once you have extraction, validation, and retry in place, you can build on top of Claude with the same confidence you would have with any other typed API.