Observability
Graphile Worker RS exposes observability through standard Rust tracing, optional
OpenTelemetry compatibility features, and lifecycle hooks. The worker does not
force a logging or metrics backend on applications; install the subscriber,
exporter, and hook plugin that match your runtime.
Tracing
The main crate depends on tracing, so worker internals and application task
handlers can emit spans and events into the subscriber you install. The hooks
example uses tracing-subscriber with an environment filter and a formatted
output layer:
use tracing_subscriber::{
filter::EnvFilter, prelude::__tracing_subscriber_SubscriberExt, util::SubscriberInitExt,
};
pub fn enable_logs() {
let fmt_layer = tracing_subscriber::fmt::layer();
let filter_layer = EnvFilter::try_new("debug,sqlx=warn").unwrap();
tracing_subscriber::registry()
.with(filter_layer)
.with(fmt_layer)
.init();
}
Initialize the subscriber once, before starting the worker. The example keeps
SQLx noise at warn while enabling debug logging elsewhere.
OpenTelemetry compatibility
OpenTelemetry support is feature-gated. Enable exactly one compatibility feature:
[dependencies]
graphile_worker = {
version = "0.13",
features = ["opentelemetry_0_32"]
}
Available compatibility features are:
| Feature | OpenTelemetry crate version |
|---|---|
opentelemetry_0_30 | opentelemetry 0.30 |
opentelemetry_0_31 | opentelemetry 0.31 |
opentelemetry_0_32 | opentelemetry 0.32 |
The crate rejects builds that enable more than one of these features at the same time.
When an OpenTelemetry feature is enabled, job insertion code can capture the
current span context and write it into the job payload under _trace. Object
payloads receive the field directly. Array payloads receive it on each object
item in the array, while scalar values are left unchanged.
{
"user_id": 42,
"_trace": {
"flags": 1,
"trace_id": "4bf92f3577b34da6a3ce929d0e0e4736",
"span_id": "00f067aa0ba902b7"
}
}
When the worker later runs a job, it reads _trace from object payloads and
adds the recorded span context as a link to the current span. Invalid,
non-object, or missing trace payloads are ignored. Without an OpenTelemetry
feature, this trace extraction and linking path is a no-op.
Metrics with lifecycle hooks
Lifecycle hooks are the extension point for metrics and structured operational
events. A plugin registers callbacks on HookRegistry; those callbacks receive
context objects for worker and job events.
The metrics example counts started, completed, and failed jobs with atomics:
use std::sync::Arc;
use std::sync::atomic::{AtomicU64, Ordering};
use graphile_worker::{
HookRegistry, JobComplete, JobFail, JobStart, Plugin, WorkerShutdown, WorkerStart,
};
pub struct MetricsPlugin {
jobs_started: AtomicU64,
jobs_completed: AtomicU64,
jobs_failed: AtomicU64,
}
impl Plugin for MetricsPlugin {
fn register(self, hooks: &mut HookRegistry) {
hooks.on(WorkerStart, async |ctx| {
println!("[MetricsPlugin] Worker {} started", ctx.worker_id);
});
let jobs_started = Arc::new(self.jobs_started);
let jobs_completed = Arc::new(self.jobs_completed);
let jobs_failed = Arc::new(self.jobs_failed);
{
let jobs_started = jobs_started.clone();
hooks.on(JobStart, move |ctx| {
let jobs_started = jobs_started.clone();
async move {
jobs_started.fetch_add(1, Ordering::Relaxed);
println!(
"[MetricsPlugin] Job {} started (task: {})",
ctx.job.id(),
ctx.job.task_identifier()
);
}
});
}
{
let jobs_completed = jobs_completed.clone();
hooks.on(JobComplete, move |ctx| {
let jobs_completed = jobs_completed.clone();
async move {
jobs_completed.fetch_add(1, Ordering::Relaxed);
println!(
"[MetricsPlugin] Job {} completed in {:?}",
ctx.job.id(),
ctx.duration
);
}
});
}
{
let jobs_failed = jobs_failed.clone();
hooks.on(JobFail, move |ctx| {
let jobs_failed = jobs_failed.clone();
async move {
jobs_failed.fetch_add(1, Ordering::Relaxed);
println!(
"[MetricsPlugin] Job {} failed: {} (will_retry: {})",
ctx.job.id(),
ctx.error,
ctx.will_retry
);
}
});
}
hooks.on(WorkerShutdown, move |ctx| async move {
println!(
"[MetricsPlugin] Worker {} shutting down (reason: {:?})",
ctx.worker_id, ctx.reason
);
});
}
}
Use the same hook points to export counters, histograms, or structured events to your metrics backend:
| Hook | Useful signal |
|---|---|
WorkerStart | worker process started and has a worker id |
WorkerShutdown | worker process is stopping and exposes the shutdown reason |
JobStart | job id and task identifier started executing |
JobComplete | job completed and exposes execution duration |
JobFail | job failed, exposes the error and whether it will retry |
Logging
For application logs, combine a subscriber with hook callbacks and task-handler instrumentation. The hook contexts expose job ids, task identifiers, errors, durations, worker ids, and shutdown reasons, which are the stable values to put into operational logs.
For example, a JobFail hook can log the job id, error, and retry decision;
JobComplete can log duration; WorkerShutdown can log the shutdown reason.
The example code prints these values with println!, but production code will
usually emit tracing events or send them to a metrics/logging client.
Admin and monitor visibility
The workspace includes graphile_worker_admin_api,
graphile_worker_admin_ui, and graphile_worker_admin_ui_client crates. Use
those admin surfaces for queue visibility, and use hooks and tracing for
process-local signals that admin views cannot infer on their own, such as
per-process shutdown reasons, local counters, and span links.
For runtime monitoring, the most useful signals visible from the current public hooks are:
| Signal | Source |
|---|---|
| Worker started | WorkerStart hook |
| Worker shutdown reason | WorkerShutdown hook |
| Job throughput | JobStart and JobComplete hooks |
| Job failure rate | JobFail hook |
| Job duration | JobComplete hook |
| Trace continuity from enqueue to execution | _trace payload plus OpenTelemetry span links |