
Backfill and Gap Detection: How Paychainly Recovers from RPC Downtime

May 21, 2026· 2 min read

The Problem: RPC Downtime

No RPC endpoint has 100% uptime. Any block produced while your primary node is down goes unmonitored unless something recovers it later. Paychainly handles this with two complementary systems, live failover and startup backfill, backed by an hourly gap detector.

Multi-RPC Failover

The rpcs table holds a pool of RPC endpoints ordered by priority. When a request fails or returns a 429 rate-limit response, RpcFailoverService immediately switches to the next endpoint in the pool. The failed endpoint sits out a 30-second cooldown before it rejoins the pool.
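To make the failover rule concrete, here is a minimal sketch of a priority-ordered pool with a cooldown. It is not the actual RpcFailoverService: the RpcPool class, the Endpoint shape, the injected clock, and the assumption that a lower priority number is tried first are all hypothetical.

```typescript
// Hypothetical sketch of priority failover with a cooldown.
// Assumption: a lower `priority` number means "try this endpoint first".
type Endpoint = { url: string; priority: number };

class RpcPool {
  private readonly endpoints: Endpoint[];
  private readonly cooldownMs: number;
  private readonly now: () => number;
  // url -> timestamp (ms) at which the endpoint may be used again
  private readonly cooldownUntil = new Map<string, number>();

  constructor(
    endpoints: Endpoint[],
    cooldownMs: number = 30_000,
    now: () => number = () => Date.now(),
  ) {
    this.endpoints = [...endpoints].sort((a, b) => a.priority - b.priority);
    this.cooldownMs = cooldownMs;
    this.now = now;
  }

  /** Endpoints that are not cooling down, in priority order. */
  available(): Endpoint[] {
    const t = this.now();
    return this.endpoints.filter((e) => (this.cooldownUntil.get(e.url) ?? 0) <= t);
  }

  /** Bench an endpoint for the cooldown period. */
  reportFailure(url: string): void {
    this.cooldownUntil.set(url, this.now() + this.cooldownMs);
  }

  /**
   * Try each available endpoint in turn. `call` is expected to throw on
   * network errors and on HTTP 429 responses.
   */
  async request<T>(call: (url: string) => Promise<T>): Promise<T> {
    let lastError: unknown = new Error("no RPC endpoint available");
    for (const e of this.available()) {
      try {
        return await call(e.url);
      } catch (err) {
        this.reportFailure(e.url);
        lastError = err;
      }
    }
    throw lastError;
  }
}
```

Injecting the clock keeps the cooldown testable without waiting 30 real seconds.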

The listenerLastBlock Checkpoint

Every 10 blocks (configurable via LISTENER_CHECKPOINT_INTERVAL), the current block number is saved to network_configs.listenerLastBlock. On restart, this value determines where to resume.
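As an illustration, the checkpoint rule might look like the sketch below. The Checkpointer class and its save callback are hypothetical stand-ins for whatever writes network_configs.listenerLastBlock. Only the 10-block default comes from the description above.

```typescript
// Hypothetical sketch: persist a checkpoint once every `interval` blocks.
const LISTENER_CHECKPOINT_INTERVAL = 10; // default from the article

class Checkpointer {
  private lastSaved: number;
  private readonly interval: number;
  private readonly save: (blockNumber: number) => void;

  constructor(
    lastSaved: number,
    save: (blockNumber: number) => void,
    interval: number = LISTENER_CHECKPOINT_INTERVAL,
  ) {
    this.lastSaved = lastSaved;
    this.save = save;
    this.interval = interval;
  }

  /** Call after a block has been fully processed. */
  onBlockProcessed(blockNumber: number): void {
    if (blockNumber - this.lastSaved >= this.interval) {
      this.save(blockNumber); // e.g. update network_configs.listenerLastBlock
      this.lastSaved = blockNumber;
    }
  }
}
```

With this rule, a crash loses at most one interval of progress. Blocks processed after the last saved checkpoint are simply processed again on restart.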

Startup Backfill

On boot, BlockPipelineBootstrapService computes the gap:

gap = currentSafeBlock - lastCheckpointBlock
// if gap > LISTENER_MAX_BACKFILL_BLOCKS (50000): alert + partial backfill
// else: full backfill in LISTENER_BACKFILL_CHUNK_BLOCKS (10) chunks

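Backfill jobs run at BullMQ priority 10, while live blocks run at priority 1. In BullMQ a smaller number means a higher priority, so workers pick up live blocks before backfill work, and catching up does not hold back live payments.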
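The gap computation and chunking can be sketched as a pure planning function. The names planBackfill, BackfillJob, and BackfillPlan are hypothetical. The job objects only mirror the (name, data, opts) arguments of BullMQ's Queue.add, and nothing is enqueued here. The original does not say which part of an oversized gap is kept, so this sketch assumes the newest LISTENER_MAX_BACKFILL_BLOCKS blocks are backfilled.

```typescript
// Hypothetical sketch of startup backfill planning; values are the article's defaults.
const LISTENER_MAX_BACKFILL_BLOCKS = 50_000;
const LISTENER_BACKFILL_CHUNK_BLOCKS = 10;
const BACKFILL_PRIORITY = 10; // BullMQ: larger number = lower priority

type BackfillJob = {
  name: string;
  data: { fromBlock: number; toBlock: number }; // inclusive range
  opts: { priority: number };
};

type BackfillPlan = { truncated: boolean; jobs: BackfillJob[] };

function planBackfill(
  lastCheckpointBlock: number,
  currentSafeBlock: number,
  maxBlocks: number = LISTENER_MAX_BACKFILL_BLOCKS,
  chunk: number = LISTENER_BACKFILL_CHUNK_BLOCKS,
): BackfillPlan {
  const gap = currentSafeBlock - lastCheckpointBlock;
  if (gap <= 0) return { truncated: false, jobs: [] };

  // When truncated is true, the caller should raise an alert.
  const truncated = gap > maxBlocks;
  // ASSUMPTION: a partial backfill keeps the newest `maxBlocks` blocks.
  const start = truncated
    ? currentSafeBlock - maxBlocks + 1
    : lastCheckpointBlock + 1;

  const jobs: BackfillJob[] = [];
  for (let from = start; from <= currentSafeBlock; from += chunk) {
    const to = Math.min(from + chunk - 1, currentSafeBlock);
    jobs.push({
      name: "backfill",
      data: { fromBlock: from, toBlock: to },
      opts: { priority: BACKFILL_PRIORITY },
    });
  }
  return { truncated, jobs };
}
```

For a checkpoint at block 100 and a safe head at 125, this yields three jobs covering 101-110, 111-120, and 121-125.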

Hourly Gap Detector

Even mid-session, the hourly GapDetectorService runs a generate_series SQL query against the block_audit table to find any missing block ranges, then enqueues backfill jobs for them.
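A sketch of what such a detector could look like follows. The SQL assumes PostgreSQL and a block_audit.block_number column, and neither is confirmed by this post. The toRanges helper is likewise hypothetical: it collapses the missing block numbers into contiguous ranges that can be handed to backfill jobs.

```typescript
// Hypothetical query: every block number in [$1, $2] with no row in block_audit.
// ASSUMPTION: PostgreSQL, and a column named block_number.
const MISSING_BLOCKS_SQL = `
  SELECT s.n AS block_number
  FROM generate_series($1::bigint, $2::bigint) AS s(n)
  LEFT JOIN block_audit b ON b.block_number = s.n
  WHERE b.block_number IS NULL
  ORDER BY s.n`;

type BlockRange = { fromBlock: number; toBlock: number }; // inclusive

/** Collapse individual missing block numbers into contiguous ranges. */
function toRanges(missing: number[]): BlockRange[] {
  const ranges: BlockRange[] = [];
  for (const n of [...missing].sort((a, b) => a - b)) {
    const last = ranges[ranges.length - 1];
    if (last !== undefined && n <= last.toBlock + 1) {
      // Adjacent to (or duplicate of) the current range: extend it.
      last.toBlock = Math.max(last.toBlock, n);
    } else {
      ranges.push({ fromBlock: n, toBlock: n });
    }
  }
  return ranges;
}
```

Grouping into ranges keeps the number of enqueued jobs proportional to the number of gaps rather than the number of missing blocks.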

Idempotent Processing

Backfill may re-process blocks that were already seen. The unique txHash constraint on the transactions table rejects the duplicate insert, so every payment is credited exactly once and no webhook is sent twice.
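The effect of the constraint can be simulated in memory. In the sketch below, a Set stands in for the unique txHash index. TransactionStore, processTransfers, and the sendWebhook callback are hypothetical names, and in a real database the same check would come from the insert failing or being skipped on conflict.

```typescript
// Hypothetical in-memory stand-in for the transactions table.
class TransactionStore {
  // Plays the role of the UNIQUE constraint on txHash.
  private readonly seen = new Set<string>();

  /** Returns true if the row was inserted, false if txHash already existed. */
  insertIfNew(txHash: string): boolean {
    if (this.seen.has(txHash)) return false;
    this.seen.add(txHash);
    return true;
  }
}

/** Credit and notify only for transactions that were not stored before. */
function processTransfers(
  store: TransactionStore,
  txHashes: string[],
  sendWebhook: (txHash: string) => void,
): number {
  let credited = 0;
  for (const txHash of txHashes) {
    if (store.insertIfNew(txHash)) {
      sendWebhook(txHash);
      credited++;
    }
  }
  return credited;
}
```

Running the same block through processTransfers twice credits its transactions on the first pass only, which is what makes overlapping backfill ranges safe.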
