Open Source AIOps · Apache 2.0

Your infrastructure already knows what's wrong.
Now your tooling does too.

pulse-agent · us-east-1 · live stream
47 nodes
02:17:43.221 INFO node-exporter[us-east-1a] cpu_usage=0.34 mem=68%
02:17:43.289 TRACE span[auth-svc] GET /api/v2/session latency=12ms
02:17:43.312 INFO kafka consumer lag=142 topic=events partition=3
02:17:43.401 DEBUG pg_pool connections=18/25 idle=4 wait=0
02:17:43.445 INFO node-exporter[us-east-1b] cpu_usage=0.31 mem=71%
02:17:43.502 WARN redis eviction_rate=0.02 used_memory=7.8GB
02:17:43.578 TRACE span[payment-svc] POST /charge latency=89ms
02:17:43.601 INFO ingress nginx 200 GET /health 0.3ms
02:17:43.634 DEBUG dns lookup api.stripe.com ttl=60 ok
02:17:43.712 INFO node-exporter[us-east-1c] cpu_usage=0.29 mem=65%
02:17:44.001 INFO prometheus scrape interval=15s targets=47
02:17:44.089 TRACE span[order-svc] POST /order latency=210ms
02:17:44.122 DEBUG gc pause 14ms heap=2.1GB
02:17:44.201 INFO cdn edge cache hit_rate=0.94 region=iad
Pulse · Anomaly Dashboard
auto-correlating

All Systems Nominal

47/47 nodes healthy · SLO 99.97%

OK
INFO · Auth service elevated P99 latency
14m ago
auth-svc

Routine traffic spike — no action needed

📖 Runbook RB-204

Step 1 of 1

From zero to observability in 90 seconds.

One command. No YAML sprawl. No six-figure contracts. Pulse auto-discovers your services and starts correlating within minutes.

$ curl -fsSL https://get.pulse-obs.dev | bash
Docker · Helm · K8s · Binary · Terraform
Star on GitHub

Why Pulse

Every SRE's nightmare. Pulse's baseline.

The Status Quo

Alert storms that cry wolf at 3 a.m.

Your Prometheus rules fire 40 alerts for one bad deploy. By the time you triage, the real cause is buried.

  • 40+ alerts for a single incident
  • Manual correlation across 6 dashboards
  • MTTR measured in hours, not minutes
Pulse Answer · 94% less noise

One incident. One root cause. One runbook.

Pulse clusters correlated signals in real time, traces the blast radius, and surfaces the exact runbook before you open Slack.

  • Alert noise reduced by 94% on average
  • Automatic root-cause graph in < 1s
  • MTTR cut from 47 min → 4 min
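
To make the idea of collapsing an alert storm into one incident concrete, here is a minimal sketch of time-window clustering. It is not Pulse's correlation engine: the `Alert` type, the `cluster_alerts` and `noise_reduction` helpers, and the 30-second gap are all assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    """Hypothetical alert record; field names are illustrative only."""
    timestamp: float  # seconds since an arbitrary start
    service: str
    name: str


def cluster_alerts(alerts: list[Alert], max_gap: float = 30.0) -> list[list[Alert]]:
    """Group alerts that arrive within max_gap seconds of the previous alert."""
    incidents: list[list[Alert]] = []
    for alert in sorted(alerts, key=lambda a: a.timestamp):
        if incidents and alert.timestamp - incidents[-1][-1].timestamp <= max_gap:
            incidents[-1].append(alert)   # still part of the current storm
        else:
            incidents.append([alert])     # quiet period passed: new incident
    return incidents


def noise_reduction(alert_count: int, incident_count: int) -> float:
    """Fraction of notifications avoided by paging once per incident."""
    if alert_count == 0:
        return 0.0
    return 1.0 - incident_count / alert_count


# 40 alerts from one bad deploy, 1 s apart, plus an unrelated alert 10 min later.
storm = [Alert(float(i), f"svc-{i % 6}", "HighErrorRate") for i in range(40)]
storm.append(Alert(640.0, "billing", "DiskFull"))
incidents = cluster_alerts(storm)
print(len(incidents), round(noise_reduction(len(storm), len(incidents)), 3))
```

A real correlator would also weigh service topology and signal type. Time proximity alone is only the simplest useful grouping key.
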
The Status Quo

Duct-taped dashboards nobody trusts.

Grafana, Datadog, PagerDuty, Jaeger — four tools, four panes of glass, zero shared context.

  • Metrics, logs, and traces siloed
  • No correlation between services
  • On-call engineers context-switch constantly
Pulse Answer · 30+ integrations

Unified telemetry. One source of truth.

Pulse ingests from Prometheus, OpenTelemetry, Loki, and 30+ exporters — then correlates everything in a single timeline.

  • OTLP, Prometheus, Loki native ingestion
  • Cross-signal correlation engine
  • 30+ integrations, zero config drift
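
A single timeline across signals can be pictured as a merge of per-source event streams ordered by timestamp. The sketch below assumes each source already yields time-sorted events. The `Event` tuple, the `unified_timeline` helper, and the sample values are hypothetical and do not reflect Pulse's internal data model.

```python
import heapq
from typing import Iterable, Iterator, NamedTuple


class Event(NamedTuple):
    """Hypothetical unified event shared by all signal types."""
    timestamp: float
    signal: str   # "metric", "log", or "trace"
    service: str
    detail: str


def unified_timeline(*sources: Iterable[Event]) -> Iterator[Event]:
    """Lazily merge already-sorted per-signal streams into one ordered stream."""
    return heapq.merge(*sources, key=lambda e: e.timestamp)


# Illustrative sample data, one list per signal type.
metrics = [Event(1.0, "metric", "auth-svc", "p99_latency=480ms"),
           Event(4.0, "metric", "auth-svc", "p99_latency=510ms")]
logs = [Event(2.5, "log", "auth-svc", "WARN pool exhausted")]
traces = [Event(3.0, "trace", "auth-svc", "GET /api/v2/session latency=495ms")]

timeline = list(unified_timeline(metrics, logs, traces))
for event in timeline:
    print(f"{event.timestamp:>5.1f}  {event.signal:<6}  {event.service}  {event.detail}")
```

Because `heapq.merge` is lazy, the same shape works for unbounded streams as long as each source stays in time order.
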
The Status Quo

Observability that costs more than your infra.

Datadog bills scale with cardinality. A Series B startup can easily hit $80k/year just for metrics.

  • $80k+/year for mid-size stacks
  • Vendor lock-in and data egress fees
  • Open-source alternatives require FTEs to maintain
Pulse Answer · $0 to start

Enterprise observability. Open-source price.

Self-hosted, Apache 2.0. Your data stays in your VPC. Scale to billions of events without a per-seat conversation.

  • Free forever for self-hosted
  • Your data, your VPC, your rules
  • Cloud-managed tier starts at $0 and scales with you

Community

Built in the open. Accelerating.

Pulse ships weekly. 187 contributors across 23 countries. The repo doesn't sleep — and neither does the correlator.

187
Contributors

Core Contributors

MK

Marcus Kim

Core maintainer

SR

Shreya Rajan

Alert engine

TP

Tomás Pérez

OTLP ingestion

AW

Aisha Wallace

UI/Dashboard

DL

Daniel Luo

ML correlator

FO

Fatima Osei

K8s operator

JN

Jonas Nilsson

Prometheus adapter

PB

Priya Bhatt

Runbook engine

+ 179 more contributors

Recent Commits · live

a3f92c1

feat(correlator): reduce false positive rate by 18%

Marcus Kim · 4 min ago

b7e14d8

fix(alert): deduplicate firing alerts across partitions

Shreya Rajan · 12 min ago

c2a8f3e

perf(ingest): batch OTLP spans 40% faster

Tomás Pérez · 31 min ago

d9c4b2a

feat(ml): add temporal anomaly window support

Daniel Luo · 1h ago

e5f1c9b

ui(dashboard): add root-cause graph zoom controls

Aisha Wallace · 2h ago

f8a3d7c

chore(k8s): bump operator CRD to v1beta2

Fatima Osei · 3h ago

g1b4e6a

fix(prometheus): handle remote_write backpressure

Jonas Nilsson · 4h ago

h6c2f8d

feat(runbook): add auto-remediation dry-run mode

Priya Bhatt · 5h ago

Star on GitHub

Benchmarks

The numbers that wake you up less.

Measured against 10,000 replayed production incidents across 40 customer stacks. Published methodology, reproducible results.

97.4%

Incident Detection Accuracy

True positive rate across 10k incident replay tests

Manual triage: 61%
+36.4 points
0.8s

Mean Time to Detect

Median time from anomaly onset to correlated incident

PagerDuty threshold: 180s
225× faster
94%

Alert Noise Reduction

Signals suppressed that would have fired without correlation

Raw Prometheus: 0%
+94%
12ms

P99 Ingest Latency

End-to-end from metric scrape to correlation engine

Datadog agent: 340ms
28× faster
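
The comparison figure on each card follows from the headline number and its baseline. Here is a short sketch of that arithmetic, using only the values shown on the cards. The helper names `point_gain` and `speedup` are illustrative.

```python
def point_gain(value_pct: float, baseline_pct: float) -> float:
    """Difference in percentage points, rounded to one decimal to hide float noise."""
    return round(value_pct - baseline_pct, 1)


def speedup(baseline: float, value: float) -> float:
    """How many times smaller `value` is than `baseline` (same units)."""
    return baseline / value


print(point_gain(97.4, 61.0))        # detection accuracy vs. manual triage
print(round(speedup(180.0, 0.8)))    # time to detect vs. the 180 s threshold
print(round(speedup(340.0, 12.0)))   # P99 ingest latency vs. 340 ms
```

Rounding before display matters here: subtracting 61 from 97.4 in binary floating point does not give exactly 36.4.
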
Production-ready · Apache 2.0 · Self-hostable

Your infra is talking.
Start listening.

Deploy in 90 seconds. No credit card. No vendor lock-in. Join 187 contributors shipping observability that doesn't cost six figures.

Star on GitHub

Docker · Helm · K8s · Binary · Terraform · 30+ integrations