Surviving Solana RPC 429s: Rate Limits in Production
If you run anything serious on Solana — an arbitrage bot, an indexer, a trading frontend — you have already met HTTP 429. It shows up exactly when you don't want it: during a volatility spike, when your bot is polling hardest and the chain is busiest, and suddenly half your getAccountInfo calls come back 429 Too Many Requests instead of data. The position you were tracking moves, and you're flying blind for a few hundred milliseconds.
Solana makes this worse than most chains for one structural reason: blocks come roughly every 400ms. That's about 30× the block cadence of Ethereum mainnet (12-second slots) and still 5–10× that of EVM chains with 2–4 second blocks, which means anything that polls "once per block" makes many times the requests, and anything that reacts to state changes is fighting a firehose. Rate limits that would be generous on Ethereum get eaten alive on Solana. This post is about why the 429s happen and how to stop them: first by handling them correctly, then by removing the cause.
Why 429s happen on Solana specifically
A 429 means you crossed a request-per-second (or per-method) ceiling on your endpoint. Three things conspire to make that easy on Solana:
- Block time. At ~400ms/slot, a naive "poll every block" loop is 2.5 requests/second per thing you watch. Watch 40 accounts and you're at 100 req/s before doing anything clever.
- Heavy methods. getProgramAccounts (gPA) scans every account owned by a program and can return megabytes. Many providers count it as several requests, or rate-limit it separately, or both. One unfiltered gPA can blow your whole budget for that second.
- Polling instead of subscribing. The most common cause. People reach for getAccountInfo in a loop because it's familiar, when Solana's WebSocket subscriptions exist precisely so you don't have to poll.
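The block-time arithmetic in the first bullet is easy to check for your own workload. The following is a minimal sketch: `pollingLoad` and its parameter names are hypothetical, not part of any library, and the 400ms slot time is the approximate figure used in this post rather than a guarantee.

```javascript
// Hypothetical helper: estimate the request rate of a naive "poll every slot" loop.
// slotMs defaults to the approximate Solana slot time (~400ms); this is an assumption.
function pollingLoad({ watchedAccounts, slotMs = 400, callsPerAccountPerSlot = 1 }) {
  const slotsPerSecond = 1000 / slotMs;
  return watchedAccounts * callsPerAccountPerSlot * slotsPerSecond;
}

console.log(pollingLoad({ watchedAccounts: 1 }));  // 2.5 req/s for one account
console.log(pollingLoad({ watchedAccounts: 40 })); // 100 req/s for forty accounts
```

Plugging in your real account count before you pick an endpoint tells you whether polling is even feasible, or whether you need subscriptions from day one.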
The fix has two halves: handle the 429s you still get gracefully, and restructure so you generate far fewer of them.
Half one: handle 429s correctly
You will never reach exactly zero 429s, so your client has to survive them. The wrong response is to retry immediately — that just adds load to an already-saturated endpoint and makes it worse for every other call you have in flight.
Respect the Retry-After header when it's present, and fall back to exponential backoff with jitter when it isn't:
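```javascript
// Run `fn` (a function that performs one RPC call) and retry it on HTTP 429.
// `connection` is currently unused; `fn` is expected to close over it.
async function rpcWithRetry(connection, fn, { max = 5 } = {}) {
  let attempt = 0;
  for (;;) {
    try {
      return await fn();
    } catch (err) {
      // Error shapes differ between clients, so check the common places.
      const is429 =
        err?.message?.includes("429") ||
        err?.code === 429 ||
        err?.statusCode === 429;
      if (!is429 || attempt >= max) throw err;
      // Honor Retry-After (in seconds) if the server sent one, else back off
      // exponentially: 250ms, 500ms, 1s, 2s, capped at 4s.
      const retryAfter = Number(err?.headers?.["retry-after"]);
      const backoff = Number.isFinite(retryAfter)
        ? retryAfter * 1000
        : Math.min(2 ** attempt * 250, 4000);
      // Add up to 25% random jitter so workers don't retry in lockstep.
      const jitter = backoff * 0.25 * Math.random();
      await new Promise((r) => setTimeout(r, backoff + jitter));
      attempt++;
    }
  }
}
```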
Two details that matter in production:
- Jitter is not optional. If ten workers all 429 at the same instant and all back off for exactly 500ms, they all retry at the same instant and 429 again. Jitter spreads the herd.
- Cap the backoff. On a 400ms chain, a retry that lands four seconds late is stale data. Past a ceiling, it's better to drop the call and re-fetch fresh than to deliver an answer about a slot that's long gone.
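The "drop it rather than deliver a stale answer" rule can be made explicit by giving each call a staleness budget. This is a sketch under stated assumptions, not a definitive implementation: `nextRetryDelay`, its parameters, and the 2000ms default deadline (about five slots) are all hypothetical choices you would tune for your own workload.

```javascript
// Hypothetical helper: compute the next retry delay in ms, or return null when
// waiting that long would push the call past its staleness deadline.
function nextRetryDelay({
  attempt,            // 0-based retry attempt
  elapsedMs,          // time already spent on this call
  retryAfterSec,      // parsed Retry-After value in seconds, or undefined
  baseMs = 250,
  capMs = 4000,
  deadlineMs = 2000,  // assumed budget: roughly five slots at ~400ms each
  random = Math.random,
}) {
  const fromServer = Number.isFinite(retryAfterSec) ? retryAfterSec * 1000 : null;
  const backoff = fromServer ?? Math.min(2 ** attempt * baseMs, capMs);
  const delay = backoff + backoff * 0.25 * random(); // up to 25% jitter
  return elapsedMs + delay > deadlineMs ? null : delay;
}

// A caller treats null as "give up on this read and fetch fresh state instead".
const delay = nextRetryDelay({ attempt: 1, elapsedMs: 300 });
console.log(delay === null ? "drop and re-fetch" : `retry in ${Math.round(delay)}ms`);
```

Returning null instead of throwing keeps the decision with the caller, which usually knows whether a fresh read or a skipped tick is the better recovery.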
Half two: stop generating the load
Backoff keeps you alive; it doesn't make you fast. The real win is making fewer, cheaper requests.
Subscribe, don't poll
If you're calling getAccountInfo in a loop to watch an account, replace it with an accountSubscribe over WebSocket. The server pushes you the new state the moment it changes — one persistent connection instead of 2.5 polls/second/account. For most "watch these accounts" workloads this single change cuts request volume by an order of magnitude.
```javascript
import { Connection } from "@solana/web3.js";

const connection = new Connection(
  "https://rpc.swiftnodes.io/rpc/solana?key=YOUR_API_KEY",
  {
    wsEndpoint: "wss://rpc.swiftnodes.io/ws/solana?key=YOUR_API_KEY",
    commitment: "confirmed",
  }
);

// Push-based: no polling, no per-block request cost.
// `pubkey` (a PublicKey) and `handleUpdate` are defined by your application.
const subId = connection.onAccountChange(
  pubkey,
  (accountInfo, ctx) => {
    handleUpdate(accountInfo.data, ctx.slot);
  },
  "confirmed"
);
```
onProgramAccountChange does the same for a whole program, and onLogs lets you react to transactions touching a program without polling for signatures. Anything you were polling to detect a change is a subscription candidate.
Batch what you can't subscribe to
For reads you genuinely have to pull, use the batch methods instead of N single calls. The getMultipleAccounts RPC method (exposed as getMultipleAccountsInfo in @solana/web3.js) fetches up to 100 accounts in one request:
```javascript
// Instead of 100 getAccountInfo calls:
const infos = await connection.getMultipleAccountsInfo(pubkeys); // 1 request
```
That's a 100:1 reduction against your rate limit for the same data. Same idea applies to bundling reads inside a single tick rather than scattering them across the event loop.
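When the key list can grow past 100, the batch call needs to be split into chunks. The sketch below is one way to do that; `chunk` and `fetchAccountsBatched` are hypothetical helpers, and `connection` is assumed to be a @solana/web3.js Connection or anything exposing getMultipleAccountsInfo with the same shape.

```javascript
// Split an array into chunks of at most `size` items.
function chunk(items, size) {
  const out = [];
  for (let i = 0; i < items.length; i += size) out.push(items.slice(i, i + size));
  return out;
}

// Hypothetical helper: fetch any number of accounts with one request per 100 keys.
async function fetchAccountsBatched(connection, pubkeys, batchSize = 100) {
  const results = [];
  // Sequential on purpose: firing every chunk at once would recreate the burst
  // that batching is meant to avoid.
  for (const group of chunk(pubkeys, batchSize)) {
    const infos = await connection.getMultipleAccountsInfo(group);
    results.push(...infos);
  }
  return results; // same order as `pubkeys`; missing accounts come back as null
}
```

Under these assumptions, 250 keys cost three requests instead of 250. If latency matters more than smoothness, a small bounded concurrency would be a reasonable variation.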
Tame getProgramAccounts
If you must call gPA, always pass filters (memcmp + dataSize) so the server returns a slice, not the whole program. Better still, if you only need to know when accounts change, subscribe with onProgramAccountChange and keep a local map instead of re-scanning. An unfiltered gPA on a busy program is the single most common "why am I getting 429s" culprit.
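Both pieces of advice, filtering and keeping a local map, fit in one small sketch. The helper names (`tokenAccountFilters`, `mirrorProgramAccounts`) are hypothetical. The filter values assume the SPL Token account layout (165 bytes, owner at byte offset 32) purely as an example, and the subscription call assumes the positional onProgramAccountChange(programId, callback, commitment, filters) signature from @solana/web3.js v1.

```javascript
// Hypothetical helper: gPA filters for SPL Token accounts owned by one wallet.
// Other programs use different sizes and offsets; treat these as an example only.
function tokenAccountFilters(ownerBase58) {
  return [
    { dataSize: 165 },
    { memcmp: { offset: 32, bytes: ownerBase58 } },
  ];
}

// Hypothetical helper: seed a local map with ONE filtered scan, then keep it
// current from pushed updates instead of re-scanning.
async function mirrorProgramAccounts(connection, programId, filters, commitment = "confirmed") {
  const cache = new Map(); // base58 address -> { slot, data }

  // Ignore updates older than what the cache already holds.
  const apply = (address, data, slot) => {
    const prev = cache.get(address);
    if (!prev || slot >= prev.slot) cache.set(address, { slot, data });
  };

  // Subscribe first so changes that land during the scan are not missed.
  const subId = connection.onProgramAccountChange(
    programId,
    ({ accountId, accountInfo }, ctx) =>
      apply(accountId.toBase58(), accountInfo.data, ctx.slot),
    commitment,
    filters
  );

  // Scan results are stored at slot 0 so any pushed update takes precedence.
  const initial = await connection.getProgramAccounts(programId, { commitment, filters });
  for (const { pubkey, account } of initial) apply(pubkey.toBase58(), account.data, 0);

  return { cache, subId };
}
```

After the initial scan, reads come from `cache` and cost nothing against the rate limit. Pair this with reconnect handling, since a silently dropped subscription would leave the map stale.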
After both halves: size the endpoint so the ceiling is high enough
Handling and reducing load only get you so far if the ceiling itself is too low for your real traffic. Public Solana RPC endpoints are aggressively rate-limited precisely because Solana's block rate makes them expensive to run — they're fine for a wallet, not for a bot. This is where flat-rate, per-second limits beat opaque credit systems: you can actually do the arithmetic ahead of time.
| Plan | HTTP req/s | WS msg/s | WS conns |
|---|---|---|---|
| Free | 2 | 1 | 2 |
| Starter | 50 | 25 | 20 |
| Growth | 150 | 75 | 50 |
| Scale | 300 | 150 | 100 |
| Pro | 500 | 250 | 200 |
Sizing is a multiplication, not a guess: take your steady-state req/s, multiply by your worst-case burst factor (volatility spikes are often 3–5×), and pick the tier above that. Because the limit is a flat req/s number and not a pool of compute units that heavy methods drain unpredictably, you can size once and not get surprised by a getProgramAccounts-shaped bill. See RPC rate limits decoded for how req/s, CUs, and API credits compare as limit models.
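The sizing rule can be written down directly. This is a minimal sketch: the plan limits are copied from the table above, while `PLANS` and `pickPlan` are hypothetical names, and the example workloads are made-up inputs.

```javascript
// HTTP limits copied from the plan table (requests per second).
const PLANS = [
  { name: "Free", httpReqPerSec: 2 },
  { name: "Starter", httpReqPerSec: 50 },
  { name: "Growth", httpReqPerSec: 150 },
  { name: "Scale", httpReqPerSec: 300 },
  { name: "Pro", httpReqPerSec: 500 },
];

// Hypothetical helper: smallest plan whose limit is strictly above
// steady-state x burst factor. Returns null if no listed plan is large enough.
function pickPlan(steadyReqPerSec, burstFactor) {
  const peak = steadyReqPerSec * burstFactor;
  return PLANS.find((p) => p.httpReqPerSec > peak) ?? null;
}

console.log(pickPlan(30, 4)?.name); // peak 120 req/s -> "Growth"
console.log(pickPlan(200, 3));      // peak 600 req/s -> null (exceeds every listed plan)
```

A null result is a signal to cut load further (more subscriptions, more batching) before shopping for a bigger ceiling. The same check applies to the WebSocket columns if your workload is subscription-heavy.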
A production checklist
- Wrap every RPC call in retry-with-jitter; honor Retry-After; cap backoff so you never deliver stale slots.
- Replace polling loops with accountSubscribe / onProgramAccountChange / onLogs wherever you're watching for change.
- Batch reads with getMultipleAccounts (up to 100/call) instead of looping single fetches.
- Never call getProgramAccounts without filters; prefer a subscription + local cache.
- Size your endpoint to steady-state req/s × burst factor, then leave headroom.
Get all of that right and 429s go from a daily fire to a rare event you simply ride out.
SwiftNodes gives you a flat per-second Solana limit — HTTP and WebSocket under one key across 75+ chains, with no compute units to budget around. Start on the free tier and point your bot at https://rpc.swiftnodes.io/rpc/solana?key=YOUR_API_KEY to see where your real traffic lands.
