Step-by-step instructions for implementing end-to-end observability with OpenTelemetry
Establish the current state of your system and collect baseline measurements. This phase identifies all instrumentation points and documents where time is being spent in the 14-second user experience.
Analyze the current React/Next.js setup and identify all instrumentation points for the optimize flow.
    # Find all optimize-related components (case-insensitive, so "OptimizeButton" also matches)
    grep -ri "optimize" src/components --include="*.tsx"

    # Check for existing performance monitoring
    grep -r "performance.mark\|performance.measure" src/

    # List API call patterns (hooks such as useSWR usually live in .tsx files too)
    grep -r "fetch\|axios\|useSWR" src/ --include="*.ts" --include="*.tsx"
Map the CNC AppService and QSM code paths to identify trace injection points.
    // Key areas to map in CNC AppService:

    // 1. Controller entry point
    [HttpPost("optimize")]
    public async Task<IActionResult> Optimize(OptimizeRequest request)
    {
        // REQUEST RECEIVED - first trace point
    }

    // 2. Queue message creation
    await _serviceBusClient.SendMessageAsync(message);
    // QUEUE INSERTION - second trace point

    // 3. Response polling/waiting
    var response = await WaitForResponse(correlationId);
    // RESPONSE PICKUP - third trace point
Measure the current 14-second user experience and identify where time is being spent.
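    // Quick baseline measurement script (run in the browser console)
    const baseline = {
      buttonClick: 0,
      apiCallStart: 0,
      apiCallEnd: 0,
      stateUpdate: 0,
      renderComplete: 0,
    };

    // Assumption: this selector is a placeholder; point it at the real optimize button.
    const optimizeButton = document.querySelector('[data-testid="optimize-button"]');

    // Record the click time in the capture phase so it runs before the app's own handlers.
    // (React attaches handlers by event delegation, so `optimizeButton.onclick` is usually null
    // and cannot be wrapped directly.)
    optimizeButton.addEventListener(
      'click',
      () => { baseline.buttonClick = performance.now(); },
      { capture: true }
    );

    // Set the remaining fields at the matching points in the flow,
    // then log the results after optimization completes
    console.table(baseline);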
| Phase | Estimated time |
|---|---|
| Button Click → API Start | ~50-200ms |
| API Round-trip | ~2-8s (includes queue + Julia) |
| State Update | ~100-500ms |
| Re-render (soft refresh?) | ~4-10s (suspected bottleneck) |
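Once the `baseline` timestamps are filled in, each row of the table above is the difference between two adjacent timestamps. A minimal TypeScript sketch of that arithmetic follows; the `Baseline` field names mirror the console script, while `PhaseDurations` and `phaseDurations` are hypothetical names introduced here for illustration.

```typescript
// Field names mirror the `baseline` object from the console script.
interface Baseline {
  buttonClick: number;
  apiCallStart: number;
  apiCallEnd: number;
  stateUpdate: number;
  renderComplete: number;
}

// Hypothetical result shape: one entry per row of the baseline table, plus a total.
interface PhaseDurations {
  clickToApiStart: number;
  apiRoundTrip: number;
  stateUpdate: number;
  reRender: number;
  total: number;
}

// Convert absolute performance.now() timestamps (ms) into per-phase durations (ms).
function phaseDurations(b: Baseline): PhaseDurations {
  return {
    clickToApiStart: b.apiCallStart - b.buttonClick,
    apiRoundTrip: b.apiCallEnd - b.apiCallStart,
    stateUpdate: b.stateUpdate - b.apiCallEnd,
    reRender: b.renderComplete - b.stateUpdate,
    total: b.renderComplete - b.buttonClick,
  };
}
```

Calling `console.table(phaseDurations(baseline))` prints the measured durations in the same order as the estimates, which makes it easy to see which phase departs most from its expected range.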
Add immediate visibility with manual performance marks. This provides instant insight into where time is being spent without the overhead of setting up full OpenTelemetry infrastructure.
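On the frontend, manual marks can be recorded with the standard User Timing API (`performance.mark` and `performance.measure`). The sketch below is one possible shape, not the project's actual helper: the phase names, the `optimize:` prefix, and the function names `markPhase` and `measureBetween` are all assumptions made for illustration.

```typescript
// Hypothetical phase names for the optimize flow; rename to match the real code.
type OptimizePhase = 'click' | 'api-start' | 'api-end' | 'state-updated' | 'rendered';

// Assumed prefix so these marks do not clash with marks from other code.
const MARK_PREFIX = 'optimize:';

// Record a high-resolution timestamp for one phase of the flow.
function markPhase(phase: OptimizePhase): void {
  performance.mark(MARK_PREFIX + phase);
}

// Return the elapsed milliseconds between two previously recorded phases.
// performance.measure throws if either mark has not been recorded yet.
function measureBetween(start: OptimizePhase, end: OptimizePhase): number {
  const entry = performance.measure(
    `${MARK_PREFIX}${start}->${end}`,
    MARK_PREFIX + start,
    MARK_PREFIX + end,
  );
  return entry.duration;
}
```

A call such as `markPhase('click')` in the button handler and `markPhase('rendered')` after the final render, followed by `measureBetween('click', 'rendered')`, gives the end-to-end time; the measures also show up in the browser's performance tooling.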
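Add per-stage timing to the CNC AppService endpoints. The class below records timestamps with DateTime; for intervals measured inside a single process, Stopwatch is the more precise choice.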
    // CncTiming.cs
    using System;
    using Microsoft.Extensions.Logging;

    public class CncTiming
    {
        public string RequestId { get; set; } = string.Empty;
        public DateTime RequestReceived { get; set; }
        public DateTime? QueueInserted { get; set; }
        public DateTime? ResponseReceived { get; set; }
        public DateTime? ResponseSent { get; set; }

        // Logs the elapsed time per stage; a stage that was never recorded is logged as null.
        public void LogTimings(ILogger logger)
        {
            var queueTime = QueueInserted - RequestReceived;
            var processTime = ResponseReceived - QueueInserted;
            var totalTime = ResponseSent - RequestReceived;

            logger.LogInformation(
                "CNC Timing [{RequestId}]: Queue={QueueMs}ms, Process={ProcessMs}ms, Total={TotalMs}ms",
                RequestId,
                queueTime?.TotalMilliseconds,
                processTime?.TotalMilliseconds,
                totalTime?.TotalMilliseconds);
        }
    }
Implement the core distributed tracing infrastructure. This phase establishes the trace schema, deploys the collector, and instruments all services for end-to-end visibility.
Deploy the OTel Collector with Azure Monitor exporter.
    # otel-collector-config.yaml
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
            cors:
              allowed_origins:
                - "https://*.yourdomain.com"
                - "http://localhost:*"

    processors:
      batch:
        timeout: 10s
        send_batch_size: 1024
      memory_limiter:
        check_interval: 1s
        limit_mib: 512

    exporters:
      azuremonitor:
        connection_string: ${APPLICATIONINSIGHTS_CONNECTION_STRING}
      debug:
        verbosity: detailed

    service:
      pipelines:
        traces:
          receivers: [otlp]
          processors: [memory_limiter, batch]
          exporters: [azuremonitor, debug]
Set up the WebTracerProvider and auto-instrumentation.
    // src/lib/otel/provider.ts
    // Note: uses the OpenTelemetry JS SDK 1.x API (`new Resource`, `addSpanProcessor`); 2.x changed both.
    import { WebTracerProvider } from '@opentelemetry/sdk-trace-web';
    import { BatchSpanProcessor } from '@opentelemetry/sdk-trace-base';
    import { OTLPTraceExporter } from '@opentelemetry/exporter-trace-otlp-http';
    import { Resource } from '@opentelemetry/resources';
    import { SemanticResourceAttributes } from '@opentelemetry/semantic-conventions';
    import { registerInstrumentations } from '@opentelemetry/instrumentation';
    import { FetchInstrumentation } from '@opentelemetry/instrumentation-fetch';
    import { UserInteractionInstrumentation } from '@opentelemetry/instrumentation-user-interaction';

    // Call once, from client-side code only (this provider targets the browser).
    export function initTelemetry() {
      const resource = new Resource({
        [SemanticResourceAttributes.SERVICE_NAME]: 'cnc-frontend',
        [SemanticResourceAttributes.SERVICE_VERSION]: process.env.NEXT_PUBLIC_VERSION,
        [SemanticResourceAttributes.DEPLOYMENT_ENVIRONMENT]: process.env.NODE_ENV,
      });

      // Export spans to the collector's OTLP/HTTP endpoint
      const exporter = new OTLPTraceExporter({
        url: process.env.NEXT_PUBLIC_OTEL_COLLECTOR_URL + '/v1/traces',
        headers: {},
      });

      const provider = new WebTracerProvider({ resource });
      provider.addSpanProcessor(new BatchSpanProcessor(exporter));
      provider.register();

      registerInstrumentations({
        instrumentations: [
          // Propagate trace headers on cross-origin calls to the API
          new FetchInstrumentation({
            propagateTraceHeaderCorsUrls: [
              new RegExp(`${process.env.NEXT_PUBLIC_API_URL}`),
            ],
          }),
          new UserInteractionInstrumentation({
            eventNames: ['click', 'submit'],
          }),
        ],
      });

      return provider;
    }
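One caveat with building the `propagateTraceHeaderCorsUrls` pattern directly from an environment variable: characters such as `.` are regex metacharacters, and an unset variable produces a pattern that matches the literal text "undefined". A possible hardening is sketched below; `escapeRegExp` and `buildApiUrlPattern` are hypothetical helper names, not part of OpenTelemetry.

```typescript
// Escape regex metacharacters so the URL is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, '\\$&');
}

// Build a pattern that matches URLs starting with the API base URL.
// Returns null when the base URL is missing, so the caller can skip propagation
// instead of registering a pattern built from the string "undefined".
function buildApiUrlPattern(apiUrl: string | undefined): RegExp | null {
  if (!apiUrl) {
    return null;
  }
  return new RegExp('^' + escapeRegExp(apiUrl));
}
```

With this in place, the instrumentation config could pass `pattern ? [pattern] : []`, where `pattern` is the result of `buildApiUrlPattern(process.env.NEXT_PUBLIC_API_URL)`.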
Configure ActivitySource and Azure SDK instrumentation.
    // Program.cs - OpenTelemetry Configuration
    using OpenTelemetry.Resources;
    using OpenTelemetry.Trace;
    using Azure.Monitor.OpenTelemetry.Exporter;

    builder.Services.AddOpenTelemetry()
        .ConfigureResource(resource => resource
            .AddService(
                serviceName: "cnc-appservice",
                serviceVersion: typeof(Program).Assembly.GetName().Version?.ToString()))
        .WithTracing(tracing => tracing
            // Auto-instrumentation
            .AddAspNetCoreInstrumentation()
            .AddHttpClientInstrumentation()
            .AddSource("Azure.*")
            // Custom ActivitySources
            .AddSource("CNC.Optimize")
            .AddSource("CNC.Queue")
            // Exporters
            .AddAzureMonitorTraceExporter(options =>
            {
                options.ConnectionString = builder.Configuration["ApplicationInsights:ConnectionString"];
            })
            .AddOtlpExporter(options =>
            {
                options.Endpoint = new Uri(builder.Configuration["OtelCollector:Endpoint"]);
            }));
    // Observability/OptimizeActivitySource.cs
    using System.Diagnostics;

    public static class OptimizeActivitySource
    {
        public static readonly ActivitySource Source = new("CNC.Optimize");

        // StartActivity returns null when no listener is sampling this source.
        public static Activity? StartOptimize(string requestId)
        {
            return Source.StartActivity("Optimize", ActivityKind.Server)
                ?.SetTag("cnc.request_id", requestId);
        }

        public static Activity? StartQueueSend(string correlationId)
        {
            return Source.StartActivity("QueueSend", ActivityKind.Producer)
                ?.SetTag("messaging.destination", "optimize-queue")
                .SetTag("messaging.correlation_id", correlationId);
        }
    }
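For a trace to continue on the other side of the queue, the message has to carry the trace context with it. The W3C Trace Context format for this is the `traceparent` value: `version-traceid-spanid-flags`, with a 32-hex-character trace ID and a 16-hex-character span ID. The TypeScript sketch below only illustrates that format (version `00`); how CNC AppService and QSM actually attach it to queue messages is not specified here, and the `TraceContext` type and both function names are hypothetical.

```typescript
// Hypothetical shape for the fields carried in a traceparent value.
interface TraceContext {
  traceId: string; // 32 lowercase hex characters
  spanId: string;  // 16 lowercase hex characters
  sampled: boolean;
}

// Matches version 00 of the W3C traceparent format only.
const TRACEPARENT_V00 = /^00-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

// Serialize a trace context into a traceparent value.
function formatTraceparent(ctx: TraceContext): string {
  const flags = ctx.sampled ? '01' : '00';
  return `00-${ctx.traceId}-${ctx.spanId}-${flags}`;
}

// Parse a traceparent value; returns null for malformed or all-zero IDs.
function parseTraceparent(value: string): TraceContext | null {
  const match = TRACEPARENT_V00.exec(value);
  if (!match) {
    return null;
  }
  const [, traceId, spanId, flags] = match;
  // The specification treats all-zero trace and span IDs as invalid.
  if (/^0+$/.test(traceId) || /^0+$/.test(spanId)) {
    return null;
  }
  // Bit 0 of the flags byte is the "sampled" flag.
  return { traceId, spanId, sampled: (parseInt(flags, 16) & 1) === 1 };
}
```

Checking that the same trace ID appears on both sides of the queue is a quick way to confirm that propagation works before building dashboards on top of it.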
Profile and optimize React rendering to eliminate the suspected "soft refresh" bottleneck. Target: 50% reduction in re-render time.
Use React DevTools Profiler to identify unnecessary re-renders and optimize the component tree.
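A frequent cause of unnecessary re-renders is passing a freshly created object or callback as a prop. By default `React.memo` compares props shallowly, so a new object with identical contents still counts as a change. The sketch below imitates that comparison in plain TypeScript to show the effect; `shallowEqual` is an illustrative helper written for this example, not React's internal function.

```typescript
// Illustrative shallow comparison: same keys, and each value identical by Object.is.
function shallowEqual(
  prev: Record<string, unknown>,
  next: Record<string, unknown>,
): boolean {
  const prevKeys = Object.keys(prev);
  const nextKeys = Object.keys(next);
  if (prevKeys.length !== nextKeys.length) {
    return false;
  }
  return prevKeys.every(
    (key) =>
      Object.prototype.hasOwnProperty.call(next, key) &&
      Object.is(prev[key], next[key]),
  );
}

// Hypothetical props for an optimize results component.
const stableOptions = { precision: 2 };

// Same object reference on both renders: the props compare equal.
const reusedReference = shallowEqual(
  { id: 'run-1', options: stableOptions },
  { id: 'run-1', options: stableOptions },
);

// A new object literal on each render: equal contents, different identity.
const freshLiteral = shallowEqual(
  { id: 'run-1', options: { precision: 2 } },
  { id: 'run-1', options: { precision: 2 } },
);
```

Here `reusedReference` is `true` and `freshLiteral` is `false`. When the Profiler shows a memoized component re-rendering with seemingly unchanged props, stabilizing object and callback identities (for example with `useMemo` and `useCallback`) is usually the first thing to try.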
Create operational dashboards in Azure Monitor and Grafana for ongoing visibility and SLA monitoring.
Build the ardeshir.io/open documentation site with interactive visualizations and implementation guides.