Open Source AIOps · Apache 2.0

Your infrastructure already knows what's wrong.
Now your tooling does too.

pulse-agent · us-east-1 · live stream

47 nodes

02:17:43.221 INFO  node-exporter[us-east-1a] cpu_usage=0.34 mem=68%
02:17:43.289 TRACE span[auth-svc] GET /api/v2/session latency=12ms
02:17:43.312 INFO  kafka consumer lag=142 topic=events partition=3
02:17:43.401 DEBUG pg_pool connections=18/25 idle=4 wait=0
02:17:43.445 INFO  node-exporter[us-east-1b] cpu_usage=0.31 mem=71%
02:17:43.502 WARN  redis eviction_rate=0.02 used_memory=7.8GB
02:17:43.578 TRACE span[payment-svc] POST /charge latency=89ms
02:17:43.601 INFO  ingress nginx 200 GET /health 0.3ms
02:17:43.634 DEBUG dns lookup api.stripe.com ttl=60 ok
02:17:43.712 INFO  node-exporter[us-east-1c] cpu_usage=0.29 mem=65%
02:17:44.001 INFO  prometheus scrape interval=15s targets=47
02:17:44.089 TRACE span[order-svc] POST /order latency=210ms
02:17:44.122 DEBUG gc pause 14ms heap=2.1GB
02:17:44.201 INFO  cdn edge cache hit_rate=0.94 region=iad
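
For readers who want to work with a stream like the one above, here is a minimal Python sketch of splitting each line into a timestamp, a level, and key=value fields. The line layout is inferred from the sample only, and `parse_line` and both regexes are hypothetical helpers, not part of Pulse.

```python
import re

# Assumed layout (inferred from the sample stream): HH:MM:SS.mmm, a level
# keyword, then free-form text. \s* tolerates missing or repeated spaces.
LINE_RE = re.compile(
    r"^(\d{2}:\d{2}:\d{2}\.\d{3})\s*(TRACE|DEBUG|INFO|WARN|ERROR)\s*(.*)$"
)
KV_RE = re.compile(r"(\w+)=(\S+)")


def parse_line(line):
    """Return a dict for one stream line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    ts, level, message = match.groups()
    return {
        "ts": ts,
        "level": level,
        "message": message,
        # Values stay as strings; units such as "GB" or "%" are left intact.
        "fields": dict(KV_RE.findall(message)),
    }


record = parse_line("02:17:43.502 WARN redis eviction_rate=0.02 used_memory=7.8GB")
print(record["level"], record["fields"])
# WARN {'eviction_rate': '0.02', 'used_memory': '7.8GB'}
```

Lines without key=value pairs, such as the `gc pause` entry, still parse; their `fields` dict is simply empty.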


Simulate a production incident →

Pulse · Anomaly Dashboard

auto-correlating

All Systems Nominal

47/47 nodes healthy · SLO 99.97%

INFO · Auth service elevated P99 latency

14m ago

auth-svc

↳ Routine traffic spike — no action needed

📖 Runbook RB-204

Step 1 of 1

From zero to observability in 90 seconds.

One command. No YAML sprawl. No six-figure contracts. Pulse auto-discovers your services and starts correlating within minutes.

$ curl -fsSL https://get.pulse-obs.dev | bash

Docker · Helm · K8s · Binary · Terraform

Star on GitHub

Why Pulse

Every SRE's nightmare. Pulse's baseline.

The Status Quo

Alert storms that cry wolf at 3 a.m.

Your Prometheus rules fire 40 alerts for one bad deploy. By the time you triage, the real cause is buried.

40+ alerts for a single incident
Manual correlation across 6 dashboards
MTTR measured in hours, not seconds

Pulse Answer · 94% less noise

One incident. One root cause. One runbook.

Pulse clusters correlated signals in real time, traces the blast radius, and surfaces the exact runbook before you open Slack.

Alert noise reduced by 94% on average
Automatic root-cause graph in < 1s
MTTR cut from 47 min → 4 min
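
To make "one incident" concrete, here is a minimal Python sketch of the simplest form of alert grouping: alerts that fire close together in time are folded into one incident, and the earliest alert is shown as a starting point for triage. This is a generic time-gap heuristic, not Pulse's correlator. The `Alert` type, the `cluster_by_gap` function, the 60-second gap, and the alert names are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Alert:
    ts: float      # seconds since an arbitrary origin
    service: str
    name: str      # hypothetical alert name


def cluster_by_gap(alerts, max_gap=60.0):
    """Group alerts that fire within max_gap seconds of the previous alert."""
    clusters = []
    for alert in sorted(alerts, key=lambda a: a.ts):
        if clusters and alert.ts - clusters[-1][-1].ts <= max_gap:
            clusters[-1].append(alert)   # close in time: same incident
        else:
            clusters.append([alert])     # quiet period: new incident
    return clusters


alerts = [
    Alert(100.0, "payment-svc", "HighLatency"),
    Alert(95.0, "pg_pool", "ConnectionsExhausted"),
    Alert(130.0, "order-svc", "ErrorRate"),
    Alert(900.0, "redis", "EvictionRate"),
]

for incident in cluster_by_gap(alerts):
    first = incident[0]
    print(f"{len(incident)} alert(s), earliest: {first.service}/{first.name}")
# 3 alert(s), earliest: pg_pool/ConnectionsExhausted
# 1 alert(s), earliest: redis/EvictionRate
```

Time proximity alone will merge unrelated alerts that happen to coincide, so a real correlator would likely also use service dependencies or trace context. The earliest alert is only a candidate for the root cause, not a diagnosis.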

The Status Quo

Duct-taped dashboards nobody trusts.

Grafana, Datadog, PagerDuty, Jaeger — four tools, four panes of glass, zero shared context.

Metrics, logs, and traces siloed
No correlation between services
On-call engineers context-switch constantly

Pulse Answer · 30+ integrations

Unified telemetry. One source of truth.

Pulse ingests from Prometheus, OpenTelemetry, Loki, and 30+ exporters — then correlates everything in a single timeline.

OTLP, Prometheus, Loki native ingestion
Cross-signal correlation engine
30+ integrations, zero config drift
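
As a rough picture of what a single timeline means, the Python sketch below merges three already-sorted signal streams into one time-ordered sequence using the standard library's `heapq.merge`. The tuple layout and the sample events are assumptions for illustration and do not reflect Pulse's internal data model.

```python
import heapq

# Each event is (timestamp_seconds, signal_type, detail).
# Every stream must already be sorted by timestamp for heapq.merge to work.
metrics = [(1.0, "metric", "cpu_usage=0.34"), (4.0, "metric", "cpu_usage=0.91")]
logs = [(2.5, "log", "pg_pool connections=25/25 wait=12")]
traces = [(3.0, "trace", "POST /charge latency=890ms")]

# Lazily interleave the streams by timestamp without loading or re-sorting
# everything at once.
timeline = list(heapq.merge(metrics, logs, traces, key=lambda event: event[0]))

for ts, kind, detail in timeline:
    print(f"{ts:>5.1f}s  {kind:<6} {detail}")
```

Because `heapq.merge` returns an iterator, the same approach works on long-running streams; `list(...)` is used here only to keep the example small.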

The Status Quo

Observability that costs more than your infra.

Datadog bills scale with cardinality. A Series B startup can easily hit $80k/year just for metrics.

$80k+/year for mid-size stacks
Vendor lock-in and data egress fees
Open-source alternatives require FTEs to maintain

Pulse Answer · $0 to start

Enterprise observability. Open-source price.

Self-hosted, Apache 2.0. Your data stays in your VPC. Scale to billions of events without a per-seat conversation.

Free forever for self-hosted
Your data, your VPC, your rules
Cloud-managed tier from $0 → scales with you

Community

Built in the open. Accelerating.

Pulse ships weekly. 187 contributors across 23 countries. The repo doesn't sleep — and neither does the correlator.

GitHub Stars

Contributors


Monthly Downloads

Core Contributors

Marcus Kim

Core maintainer

Shreya Rajan

Alert engine

Tomás Pérez

OTLP ingestion

Aisha Wallace

UI/Dashboard

Daniel Luo

ML correlator

Fatima Osei

K8s operator

Jonas Nilsson

Prometheus adapter

Priya Bhatt

Runbook engine

+ 179 more contributors

Recent Commits · live

a3f92c1

feat(correlator): reduce false positive rate by 18%

Marcus Kim · 4 min ago

b7e14d8

fix(alert): deduplicate firing alerts across partitions

Shreya Rajan · 12 min ago

c2a8f3e

perf(ingest): batch OTLP spans 40% faster

Tomás Pérez · 31 min ago

d9c4b2a

feat(ml): add temporal anomaly window support

Daniel Luo · 1h ago

e5f1c9b

ui(dashboard): add root-cause graph zoom controls

Aisha Wallace · 2h ago

f8a3d7c

chore(k8s): bump operator CRD to v1beta2

Fatima Osei · 3h ago

g1b4e6a

fix(prometheus): handle remote_write backpressure

Jonas Nilsson · 4h ago

h6c2f8d

feat(runbook): add auto-remediation dry-run mode

Priya Bhatt · 5h ago

Benchmarks

The numbers that wake you up less.

Measured against 10,000 replayed production incidents across 40 customer stacks. Published methodology, reproducible results.

97.4%

Incident Detection Accuracy

True positive rate across 10k incident replay tests

Manual triage: 61%

+36.4 percentage points

0.8s

Mean Time to Detect

Median time from anomaly onset to correlated incident

PagerDuty threshold: 180s

225× faster

94%

Alert Noise Reduction

Signals suppressed that would have fired without correlation

Raw Prometheus: 0%

+94%

12ms

P99 Ingest Latency

End-to-end from metric scrape to correlation engine

Datadog agent: 340ms

28× faster
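
The comparison figures above follow directly from the headline numbers. As a quick check, this Python sketch recomputes them from the values shown on the cards; the variable names are illustrative, and the inputs are taken from the page as-is rather than independently measured.

```python
# Values as shown on the benchmark cards.
pulse_accuracy, manual_accuracy = 97.4, 61.0   # percent
pulse_detect_s, pagerduty_s = 0.8, 180.0       # seconds
pulse_ingest_ms, datadog_ms = 12.0, 340.0      # milliseconds

# Accuracy is compared as a difference in percentage points.
accuracy_gain = round(pulse_accuracy - manual_accuracy, 1)

# Speed is compared as a ratio of baseline to Pulse.
detect_speedup = round(pagerduty_s / pulse_detect_s)
ingest_speedup = round(datadog_ms / pulse_ingest_ms, 1)

print(accuracy_gain)    # 36.4
print(detect_speedup)   # 225
print(ingest_speedup)   # 28.3
```

Rounding matters here: the raw subtraction `97.4 - 61` yields `36.400000000000006` in floating point, and the ingest ratio is closer to 28.3 than a flat 28.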

Production-ready · Apache 2.0 · Self-hostable

Your infra is talking.
Start listening.

Deploy in 90 seconds. No credit card. No vendor lock-in. Join 187 contributors shipping observability that doesn't cost six figures.

Star on GitHub

Docker · Helm · K8s · Binary · Terraform · 30+ integrations
