Architecture Deep Dive

Trace schema, span conventions, and context propagation patterns

W3C Trace Context Format

traceparent Header Format
00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01
Version (2 hex): 00
Trace ID (32 hex): 4bf92f3577b34da6a3ce929d0e0e4736
Span ID (16 hex): 00f067aa0ba902b7
Flags (2 hex): 01

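The W3C Trace Context specification defines a standard header format for propagating trace context across service boundaries. The traceparent header carries a format version, the trace ID (globally unique and shared by every span in the trace), the span ID of the caller (which becomes the parent of the next span), and trace flags, whose lowest bit marks the trace as sampled.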
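To make the field layout concrete, here is a minimal Python sketch of a traceparent parser. It uses only the standard library; the names `TraceParent` and `parse_traceparent` are illustrative and not part of any SDK. It assumes version 00 headers with exactly four fields, and it rejects the all-zero trace ID and span ID that the specification treats as invalid.

```python
import re
from dataclasses import dataclass
from typing import Optional

# version(2 hex) - trace_id(32 hex) - span_id(16 hex) - flags(2 hex), lowercase only
_TRACEPARENT_RE = re.compile(
    r"^(?P<version>[0-9a-f]{2})-(?P<trace_id>[0-9a-f]{32})-"
    r"(?P<span_id>[0-9a-f]{16})-(?P<flags>[0-9a-f]{2})$"
)


@dataclass(frozen=True)
class TraceParent:
    """Illustrative container for the four traceparent fields."""
    version: str
    trace_id: str
    span_id: str
    flags: int

    @property
    def sampled(self) -> bool:
        # The lowest flag bit is the "sampled" flag.
        return bool(self.flags & 0x01)


def parse_traceparent(header: str) -> Optional[TraceParent]:
    """Return the parsed header, or None if it is malformed."""
    match = _TRACEPARENT_RE.match(header.strip())
    if match is None:
        return None
    if match["version"] == "ff":  # version ff is reserved as invalid
        return None
    if set(match["trace_id"]) == {"0"} or set(match["span_id"]) == {"0"}:
        return None  # all-zero IDs are invalid
    return TraceParent(
        version=match["version"],
        trace_id=match["trace_id"],
        span_id=match["span_id"],
        flags=int(match["flags"], 16),
    )
```

Returning None instead of raising lets a caller fall back to starting a fresh trace when the header is missing or malformed.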

Context Propagation Flow

1. Browser Initiates Trace
The user clicks the optimize button. WebTracerProvider creates a root span with a new trace_id, and FetchInstrumentation automatically injects the traceparent header into the outgoing HTTP request.
traceparent: 00-{trace_id}-{span_id}-01
2. CNC AppService Receives Request
The ASP.NET Core middleware extracts the traceparent header and creates an Activity with the same trace_id but a new span_id, which establishes the parent-child relationship.
Activity.Current.TraceId == incoming_trace_id
Activity.Current.ParentSpanId == incoming_span_id
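The two invariants above (same trace ID, caller's span ID becomes the parent) can be sketched independently of ASP.NET Core. The Python below is a simplified illustration, not the middleware's actual implementation; `SpanContext`, `start_server_span`, and `to_traceparent` are hypothetical names, and the input is assumed to be an already validated traceparent value.

```python
import secrets
from dataclasses import dataclass


@dataclass(frozen=True)
class SpanContext:
    """Illustrative span context for a server-side span."""
    trace_id: str
    span_id: str
    parent_span_id: str
    flags: str


def start_server_span(incoming_traceparent: str) -> SpanContext:
    """Continue the caller's trace with a freshly generated span ID."""
    _version, trace_id, caller_span_id, flags = incoming_traceparent.split("-")
    span_id = secrets.token_hex(8)  # 8 random bytes -> 16 hex characters
    # Regenerate in the very unlikely case of an invalid or colliding ID.
    while span_id == "0" * 16 or span_id == caller_span_id:
        span_id = secrets.token_hex(8)
    return SpanContext(trace_id, span_id, caller_span_id, flags)


def to_traceparent(ctx: SpanContext) -> str:
    """Header value to send on outgoing calls made within this span."""
    return f"00-{ctx.trace_id}-{ctx.span_id}-{ctx.flags}"
```

The sampling flags are copied unchanged, so a decision made in the browser is respected by the services downstream.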
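3. Service Bus Message Injection
Before sending to the queue, inject the trace context into the message's application properties. The Azure SDK auto-instrumentation handles the Diagnostic-Id property, so the explicit assignment below is only needed where that instrumentation is not active.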
message.ApplicationProperties["Diagnostic-Id"] = Activity.Current.Id
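4. QSM Sidecar Extracts Context
The consumer extracts Diagnostic-Id from the message properties and starts a new Activity whose parent is the producer's span, so it joins the original trace.
// ApplicationProperties values are typed as object; StartActivity takes the parent ID as a string.
// activitySource is the consumer's ActivitySource instance.
var traceContext = message.ApplicationProperties["Diagnostic-Id"] as string;
using var activity = activitySource.StartActivity("ProcessMessage", ActivityKind.Consumer, traceContext);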
5. Julia Inference Receives Context
The trace ID is passed to Julia via RPC metadata or an environment variable. Julia logs include the trace_id for correlation.
@info "Starting optimization" trace_id=get(ENV, "TRACE_ID", "unknown") iterations=max_iter
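For the environment-variable route, the launching process has to place the trace ID into the child's environment. The Python sketch below shows one way this might look, assuming the launcher holds the current traceparent value; `julia_env` is a hypothetical helper, and the script name in the usage comment is a placeholder.

```python
import os
from typing import Dict, Optional


def julia_env(traceparent: str,
              base_env: Optional[Dict[str, str]] = None) -> Dict[str, str]:
    """Copy the environment and add TRACE_ID taken from a traceparent value."""
    env = dict(os.environ if base_env is None else base_env)
    parts = traceparent.split("-")
    # Only set TRACE_ID when the value has the expected four-field shape.
    if len(parts) == 4 and len(parts[1]) == 32:
        env["TRACE_ID"] = parts[1]
    return env


# Usage (placeholder script name):
#   subprocess.run(["julia", "optimize.jl"], env=julia_env(current_traceparent))
```

Passing only the trace ID is enough for log correlation, but it does not let Julia create child spans with a correct parent; that would need the span ID as well.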

Span Schema Definitions

cnc.optimize SpanKind.SERVER

Attribute              Type    Required  Description
cnc.request_id         string  required  Unique request identifier
cnc.model_type         string  required  Optimization model type (e.g., "milp", "quadratic")
cnc.problem_size       int     optional  Number of variables in the problem
cnc.constraints_count  int     optional  Number of constraints

cnc.queue.send SpanKind.PRODUCER

Attribute              Type    Required  Description
messaging.system       string  required  "azure_servicebus"
messaging.destination  string  required  Queue name
messaging.message_id   string  required  Service Bus message ID

julia.inference SpanKind.INTERNAL

Attribute              Type    Required  Description
julia.solver           string  required  "HiGHS"
julia.solve_time_ms    float   required  Solver execution time in milliseconds
julia.iterations       int     optional  Number of solver iterations
julia.objective_value  float   optional  Final objective function value
julia.gap              float   optional  MIP optimality gap
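A schema like this is easiest to keep honest when it is checked in tests. The Python sketch below encodes the three tables as data and validates a span's attributes against them. `SPAN_SCHEMAS` and `validate_span` are illustrative names; the checker is an assumption about how such validation could be done, not part of OpenTelemetry. Attributes not listed in the schema are ignored.

```python
from typing import Dict, List, Tuple

# attribute name -> (expected Python type, required?)
SPAN_SCHEMAS: Dict[str, Dict[str, Tuple[type, bool]]] = {
    "cnc.optimize": {
        "cnc.request_id": (str, True),
        "cnc.model_type": (str, True),
        "cnc.problem_size": (int, False),
        "cnc.constraints_count": (int, False),
    },
    "cnc.queue.send": {
        "messaging.system": (str, True),
        "messaging.destination": (str, True),
        "messaging.message_id": (str, True),
    },
    "julia.inference": {
        "julia.solver": (str, True),
        "julia.solve_time_ms": (float, True),
        "julia.iterations": (int, False),
        "julia.objective_value": (float, False),
        "julia.gap": (float, False),
    },
}


def validate_span(name: str, attributes: Dict[str, object]) -> List[str]:
    """Return a list of problems; an empty list means the span conforms."""
    problems: List[str] = []
    for key, (expected, required) in SPAN_SCHEMAS[name].items():
        if key not in attributes:
            if required:
                problems.append(f"missing required attribute: {key}")
            continue
        value = attributes[key]
        # Accept ints where a float is expected; never accept bools as numbers.
        accepted = (int, float) if expected is float else (expected,)
        if isinstance(value, bool) or not isinstance(value, accepted):
            problems.append(
                f"{key}: expected {expected.__name__}, got {type(value).__name__}"
            )
    return problems
```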

Span Naming Conventions

HTTP Spans
  • Use HTTP method and route pattern
  • Include service prefix for clarity
  • Parameterize dynamic segments
Example: HTTP POST /api/v1/optimize

Database Spans
  • Prefix with db type
  • Include operation type
  • Sanitize query parameters
Example: db.postgresql SELECT results

Queue Spans
  • Use messaging.* semantic conventions
  • Distinguish send vs receive
  • Include queue/topic name
Example: cnc.queue.send optimize-jobs

Internal Spans
  • Use service.operation format
  • Be specific but not verbose
  • Include business context
Example: julia.inference HiGHS
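Naming rules are easier to apply consistently when they live in small helper functions. The Python sketch below reproduces the HTTP, database, and queue examples above. The helper names and the `{id}` placeholder for dynamic path segments are assumptions made for illustration; only purely numeric and UUID-shaped segments are treated as dynamic.

```python
import re

# A path segment is "dynamic" if it is all digits or looks like a UUID.
_DYNAMIC_SEGMENT = re.compile(
    r"^(\d+|[0-9a-fA-F]{8}-[0-9a-fA-F]{4}-[0-9a-fA-F]{4}-"
    r"[0-9a-fA-F]{4}-[0-9a-fA-F]{12})$"
)


def parameterize(path: str) -> str:
    """Replace dynamic path segments so span names stay low-cardinality."""
    return "/".join(
        "{id}" if _DYNAMIC_SEGMENT.match(segment) else segment
        for segment in path.split("/")
    )


def http_span_name(method: str, path: str) -> str:
    return f"HTTP {method.upper()} {parameterize(path)}"


def db_span_name(db_type: str, operation: str, target: str) -> str:
    # Only the operation and target go in the name, never the query text.
    return f"db.{db_type} {operation.upper()} {target}"


def queue_span_name(direction: str, queue: str) -> str:
    if direction not in ("send", "receive"):
        raise ValueError("direction must be 'send' or 'receive'")
    return f"cnc.queue.{direction} {queue}"
```

Keeping raw IDs out of span names matters because most backends group and index traces by name.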

Sampling Strategy

Sampling determines which traces are recorded and exported. The right sampling strategy balances visibility with cost and performance.

Scope        Sample rate  Purpose
Development  100%         Capture all traces for debugging
Staging      10%          Balance visibility and volume
Production   1%           Cost-effective monitoring
Errors       100%         Always capture error traces

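The per-environment rates can be applied at the start of a trace with a deterministic, trace-ID-based decision, so that every service reaches the same verdict for the same trace. The Python sketch below shows that technique under stated assumptions: `SAMPLE_RATES` and `should_sample` are illustrative names, and comparing the low 64 bits of the trace ID against a threshold is one common way to implement ratio sampling, not necessarily what a given SDK does.

```python
SAMPLE_RATES = {
    "development": 1.0,
    "staging": 0.10,
    "production": 0.01,
}


def should_sample(trace_id: str, environment: str) -> bool:
    """Head-based decision derived only from the trace ID and environment."""
    rate = SAMPLE_RATES[environment]
    # Treat the low 64 bits of the (random) trace ID as a uniform number
    # and keep the trace if it falls below rate * 2**64.
    bound = int(rate * (1 << 64))
    return int(trace_id[-16:], 16) < bound
```

The "Errors" row cannot be honored this way, because whether a trace contains an error is not known when it starts. That limitation is what motivates deciding after the trace has completed.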
Recommended: Tail-based Sampling

Use tail-based sampling in the OTel Collector to make sampling decisions after seeing the complete trace. This ensures error traces and slow requests are always captured.

  • Sample 100% of error traces (HTTP status code >= 400)
  • Sample 100% of slow traces (duration > 5s)
  • Sample 1% of successful fast traces
  • Always sample traces with specific baggage items
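The four rules above amount to a short decision function evaluated once per completed trace. The Python sketch below models that logic for illustration only; in the OTel Collector this would be expressed as processor configuration rather than code. `TraceSummary`, `tail_sample`, and the `debug` baggage key are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict

# Hypothetical baggage keys that force a trace to be kept.
ALWAYS_SAMPLE_BAGGAGE_KEYS = {"debug"}


@dataclass
class TraceSummary:
    """What the sampler knows once all spans of a trace have arrived."""
    trace_id: str
    max_status_code: int  # highest HTTP status code seen in the trace
    duration_s: float     # end-to-end duration in seconds
    baggage: Dict[str, str] = field(default_factory=dict)


def tail_sample(trace: TraceSummary, baseline_rate: float = 0.01) -> bool:
    """Apply the rules in order; the first matching rule keeps the trace."""
    if trace.max_status_code >= 400:
        return True  # error traces: keep 100%
    if trace.duration_s > 5.0:
        return True  # slow traces: keep 100%
    if ALWAYS_SAMPLE_BAGGAGE_KEYS & set(trace.baggage):
        return True  # explicitly flagged via baggage
    # Successful fast traces: keep roughly baseline_rate of them,
    # decided deterministically from the low 64 bits of the trace ID.
    bound = int(baseline_rate * (1 << 64))
    return int(trace.trace_id[-16:], 16) < bound
```

Because the decision needs the whole trace, the component that makes it has to buffer spans until the trace is complete, which is the main operational cost of tail-based sampling.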