Trace Context Propagation Across Go Workers and AWS Queues

Distributed tracing is straightforward for HTTP calls. A request comes in, middleware starts a span, headers propagate to the next service, and the trace forms a nice chain. Queues break that chain unless you explicitly carry trace context through the message.

For systems built with Go workers, SQS, EventBridge, and background jobs, trace context propagation is the difference between seeing a complete workflow and seeing disconnected islands.

Put Trace Context in Message Attributes

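Do not hide trace metadata inside business payloads. Use message attributes when the transport supports them. For SQS, the W3C `traceparent` header can be stored as a string attribute and extracted by the consumer. SQS allows at most 10 message attributes per message, and every key the propagator writes counts toward that limit.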

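import (
    "context"
    "github.com/aws/aws-sdk-go-v2/aws"
    "github.com/aws/aws-sdk-go-v2/service/sqs/types"
    "go.opentelemetry.io/otel"
    "go.opentelemetry.io/otel/propagation"
)

// addTraceAttributes copies the trace context from ctx into attrs, which must
// be a non-nil map. It uses the global propagator, which injects nothing
// until one is registered with otel.SetTextMapPropagator.
func addTraceAttributes(ctx context.Context, attrs map[string]types.MessageAttributeValue) {
    carrier := propagation.MapCarrier{}
    otel.GetTextMapPropagator().Inject(ctx, carrier)

    for k, v := range carrier {
        attrs[k] = types.MessageAttributeValue{
            DataType:    aws.String("String"),
            StringValue: aws.String(v),
        }
    }
}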
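Some transports have no attribute slot. An EventBridge event, for instance, carries its data in the JSON `detail` field, so the trace context has to travel in the body. The following is a minimal sketch of one way to still keep it apart from business fields: wrap the payload in an envelope with a dedicated `trace` key. The `Envelope` type, its JSON field names, and the `wrap` and `unwrap` helpers are hypothetical, not part of any SDK.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Envelope is a hypothetical wrapper that keeps trace context next to,
// not inside, the business payload.
type Envelope struct {
	Trace   map[string]string `json:"trace,omitempty"`
	Payload json.RawMessage   `json:"payload"`
}

// wrap serializes payload and attaches the already-injected carrier.
func wrap(carrier map[string]string, payload any) ([]byte, error) {
	body, err := json.Marshal(payload)
	if err != nil {
		return nil, err
	}
	return json.Marshal(Envelope{Trace: carrier, Payload: body})
}

// unwrap returns the carrier and the untouched business payload.
func unwrap(data []byte) (map[string]string, json.RawMessage, error) {
	var env Envelope
	if err := json.Unmarshal(data, &env); err != nil {
		return nil, nil, err
	}
	if env.Trace == nil {
		// No trace context was sent; return an empty carrier.
		env.Trace = map[string]string{}
	}
	return env.Trace, env.Payload, nil
}

func main() {
	carrier := map[string]string{
		"traceparent": "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01",
	}
	data, err := wrap(carrier, map[string]any{"report_id": 42})
	if err != nil {
		panic(err)
	}
	got, payload, err := unwrap(data)
	if err != nil {
		panic(err)
	}
	fmt.Println(got["traceparent"])
	fmt.Println(string(payload))
}
```

The carrier here is a plain `map[string]string`, which is the underlying type of `propagation.MapCarrier`, so a conversion such as `propagation.MapCarrier(carrier)` should be enough to hand it to `Inject` or `Extract`.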

Extract Before Starting Work

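On the worker side, extract context before creating the processing span. That makes the worker span a child of the span that published the message instead of the root of a new trace. SQS returns only the message attributes the consumer asks for, so the `ReceiveMessage` call must list the propagation keys in `MessageAttributeNames` (or request `All`); otherwise the carrier is empty and the chain breaks silently.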

// contextFromMessage rebuilds the producer's trace context from the string
// attributes of msg. Non-string attributes are skipped.
func contextFromMessage(ctx context.Context, msg types.Message) context.Context {
    carrier := propagation.MapCarrier{}
    for k, v := range msg.MessageAttributes {
        if v.StringValue != nil {
            carrier[k] = *v.StringValue
        }
    }
    return otel.GetTextMapPropagator().Extract(ctx, carrier)
}
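If worker spans still show up as new roots, the most likely cause is that the propagator found no usable `traceparent` value and returned the context unchanged. One quick check, sketched here with only the standard library, is to validate the attribute value against the W3C format `version-traceid-parentid-flags`. `validTraceparent` and `isLowerHex` are hypothetical debugging helpers; they cover version `00` only, not the full specification.

```go
package main

import (
	"fmt"
	"strings"
)

// isLowerHex reports whether s has exactly n characters, all in 0-9a-f.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		if !(c >= '0' && c <= '9' || c >= 'a' && c <= 'f') {
			return false
		}
	}
	return true
}

// validTraceparent checks a version-00 traceparent value:
// 2 hex version, 32 hex trace ID, 16 hex parent ID, 2 hex flags.
func validTraceparent(v string) bool {
	parts := strings.Split(v, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return false
	}
	traceID, parentID, flags := parts[1], parts[2], parts[3]
	if !isLowerHex(traceID, 32) || !isLowerHex(parentID, 16) || !isLowerHex(flags, 2) {
		return false
	}
	// All-zero IDs are invalid per the W3C Trace Context specification.
	return strings.Trim(traceID, "0") != "" && strings.Trim(parentID, "0") != ""
}

func main() {
	fmt.Println(validTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"))
	fmt.Println(validTraceparent("00-4BF92F3577B34DA6A3CE929D0E0E4736-00f067aa0ba902b7-01"))
}
```

Logging the result of such a check next to the raw attribute value usually shows whether the producer never injected the context or the consumer never received it.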

Name Spans Around Business Work

Queue spans should be named for the business operation, not the infrastructure. `sqs.receive` is less useful than `campaign.sync_report`. Infrastructure details belong as attributes.

  • messaging.system = aws.sqs
  • messaging.destination = report-sync-queue
  • tenant.company_id = 5000
  • ads.profile_id = 50000100
  • job.type = campaign-report-sync
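The attribute list above can be assembled in one place so that every worker reports the same keys. A minimal sketch, with a plain map standing in for the tracing library's attribute type; the `Job` struct, its fields, the `queueName` helper, and the example queue URL are all hypothetical.

```go
package main

import (
	"fmt"
	"net/url"
	"path"
	"strconv"
)

// Job is a hypothetical description of one unit of queued work.
type Job struct {
	QueueURL  string
	Type      string
	CompanyID int64
	ProfileID int64
}

// queueName returns the last path segment of an SQS queue URL,
// or "" if the URL cannot be parsed or has no path.
func queueName(queueURL string) string {
	u, err := url.Parse(queueURL)
	if err != nil {
		return ""
	}
	name := path.Base(u.Path)
	if name == "." || name == "/" {
		return ""
	}
	return name
}

// spanAttributes builds the attribute set from the list above.
func spanAttributes(j Job) map[string]string {
	return map[string]string{
		"messaging.system":      "aws.sqs",
		"messaging.destination": queueName(j.QueueURL),
		"tenant.company_id":     strconv.FormatInt(j.CompanyID, 10),
		"ads.profile_id":        strconv.FormatInt(j.ProfileID, 10),
		"job.type":              j.Type,
	}
}

func main() {
	attrs := spanAttributes(Job{
		QueueURL:  "https://sqs.us-east-1.amazonaws.com/123456789012/report-sync-queue",
		Type:      "campaign-report-sync",
		CompanyID: 5000,
		ProfileID: 50000100,
	})
	fmt.Println(attrs["messaging.destination"], attrs["job.type"])
}
```

The sketch keeps the key names from the list. Recent OpenTelemetry semantic conventions appear to spell the first two as `messaging.system = aws_sqs` and `messaging.destination.name`, so it is worth checking which version your backend expects.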

Handle Retries Clearly

Retries should be visible in traces. Record the receive count, attempt number, and idempotency key as span attributes on every processing span. When a message lands in the dead-letter queue, the trace should make it obvious which dependency failed and how many times.
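SQS reports the receive count as the `ApproximateReceiveCount` system attribute, a decimal string that the consumer has to request explicitly in `ReceiveMessage`. Below is a minimal sketch of turning it into span attributes. The `retryAttributes` helper, the `job.attempt`, `job.final_attempt`, and `job.idempotency_key` key names, and the `maxReceives` parameter (meant to mirror the queue's redrive `maxReceiveCount`) are hypothetical choices, not an established convention.

```go
package main

import (
	"fmt"
	"strconv"
)

// retryAttributes derives retry-related span attributes from the SQS system
// attributes of a message (a map[string]string in aws-sdk-go-v2).
func retryAttributes(sysAttrs map[string]string, idempotencyKey string, maxReceives int) map[string]string {
	attempt, err := strconv.Atoi(sysAttrs["ApproximateReceiveCount"])
	if err != nil || attempt < 1 {
		// Attribute missing or malformed: assume this is the first delivery.
		attempt = 1
	}
	return map[string]string{
		"job.attempt": strconv.Itoa(attempt),
		// Once the receive count reaches maxReceives, a further failure
		// sends the message to the dead-letter queue.
		"job.final_attempt":   strconv.FormatBool(attempt >= maxReceives),
		"job.idempotency_key": idempotencyKey,
	}
}

func main() {
	attrs := retryAttributes(map[string]string{"ApproximateReceiveCount": "3"}, "report-42", 3)
	fmt.Println(attrs["job.attempt"], attrs["job.final_attempt"], attrs["job.idempotency_key"])
}
```

Marking the final attempt on the span makes the last failure before the dead-letter queue easy to find, and the idempotency key ties all attempts for the same job together across traces.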

Once trace context flows through queues, production debugging changes. Instead of starting from logs and reconstructing causality, you can open one trace and see the HTTP request, event publication, queue delay, worker processing, database calls, and downstream API calls in order.
