# Rate limits

> 60 requests/minute/key. Token-bucket algorithm. Standard X-RateLimit headers.

Every API key gets its own request budget. Stay inside it and you'll never see
a `429`; exceed it and the response tells you exactly how long to wait.

## The numbers

| Knob                | Value                            |
| ------------------- | -------------------------------- |
| Requests per minute | **60**                           |
| Burst capacity      | **60 tokens**                    |
| Refill rate         | **1 token / second**             |
| Algorithm           | Token bucket, in-memory, per-key |

The bucket starts full (60 tokens) and refills at 1 token/second up to
the cap. Each request consumes 1 token regardless of which endpoint or
how big the response is.
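To make the refill behaviour concrete, here is a minimal client-side model of a bucket with the numbers above. It is a sketch for reasoning about your own request pacing, not the server's actual implementation; the `TokenBucket` class, its method names, and the injectable `clock` parameter are all illustrative assumptions.

```python
import time


class TokenBucket:
    """Illustrative model: 60-token cap, refilled at 1 token/second."""

    def __init__(self, capacity=60, refill_per_sec=1.0, clock=time.monotonic):
        self.capacity = capacity
        self.refill_per_sec = refill_per_sec
        self.clock = clock              # injectable so the model can be tested
        self.tokens = float(capacity)   # the bucket starts full
        self.updated = clock()

    def _refill(self):
        # Add tokens for the time elapsed since the last update, up to the cap.
        now = self.clock()
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.refill_per_sec)
        self.updated = now

    def try_consume(self):
        """Spend one token if available; True means the request may go out."""
        self._refill()
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Usage with a fake clock: a burst of 61 calls, then one second of waiting.
now = [0.0]
bucket = TokenBucket(clock=lambda: now[0])
allowed_in_burst = sum(bucket.try_consume() for _ in range(61))  # 60 pass, 1 is refused
now[0] += 1.0                                                    # one token refills
allowed_after_wait = bucket.try_consume()
```

Gating outgoing requests on a local model like this is one way to stay under the limit without ever triggering a `429`, at the cost of some drift from the server's view of the bucket.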

## Headers on every response

The first three headers below ride on every API response, success or failure; `Retry-After` appears only on a `429`:

| Header                  | Example      | Meaning                                                                     |
| ----------------------- | ------------ | --------------------------------------------------------------------------- |
| `X-RateLimit-Limit`     | `60`         | Bucket capacity. Constant.                                                  |
| `X-RateLimit-Remaining` | `42`         | Tokens left right now.                                                      |
| `X-RateLimit-Reset`     | `1777358866` | Unix epoch seconds when the bucket would be full again at the current rate. |
| `Retry-After`           | `8`          | **Only on 429.** Seconds to wait before retrying.                           |

A simple polite client just respects `Retry-After` whenever it sees
`429`:

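```python
import time
import requests

def call(path, key, retries=1):
    """GET an API path, retrying at most `retries` times after a 429."""
    r = requests.get(f"https://verseodin.com/api/v1{path}",
                     headers={"Authorization": f"Bearer {key}"})
    if r.status_code == 429 and retries > 0:
        # Wait the advertised time plus a 1 s safety margin, then retry.
        time.sleep(int(r.headers.get("Retry-After", "5")) + 1)
        return call(path, key, retries - 1)
    r.raise_for_status()  # a second 429 (or any other error) raises here
    return r.json()
```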
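The headers from the table above can also be read on successful responses, so a client can slow down before it ever reaches zero. The sketch below parses them into a small record; `RateLimitInfo`, `parse_rate_limit`, and `seconds_until_full` are hypothetical helper names, not part of any SDK.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass
class RateLimitInfo:
    limit: int                  # X-RateLimit-Limit
    remaining: int              # X-RateLimit-Remaining
    reset_epoch: int            # X-RateLimit-Reset (Unix epoch seconds)
    retry_after: Optional[int]  # Retry-After, present only on 429


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimitInfo:
    # A plain dict needs exact header casing; the `requests` library's
    # response.headers mapping is case-insensitive, so either works there.
    retry_after = headers.get("Retry-After")
    return RateLimitInfo(
        limit=int(headers["X-RateLimit-Limit"]),
        remaining=int(headers["X-RateLimit-Remaining"]),
        reset_epoch=int(headers["X-RateLimit-Reset"]),
        retry_after=int(retry_after) if retry_after is not None else None,
    )


def seconds_until_full(info: RateLimitInfo, now_epoch: int) -> int:
    """Seconds until the bucket is full again, never negative."""
    return max(0, info.reset_epoch - now_epoch)


# Example values taken from the header table.
info = parse_rate_limit({
    "X-RateLimit-Limit": "60",
    "X-RateLimit-Remaining": "42",
    "X-RateLimit-Reset": "1777358866",
})
```

A client might, for instance, pause briefly whenever `info.remaining` drops below a small threshold of its own choosing.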

## When you hit 429

```http
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 1777358900
Retry-After: 8
Content-Type: application/json

{
  "error": {
    "type": "rate_limit_error",
    "message": "Rate limit exceeded. Retry after 8s."
  }
}
```

Wait at least `Retry-After` seconds, then retry. The bucket refills
continuously — you don't need to wait until it's full to issue the
next request.
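The wait-then-retry rule can be sketched as a bounded loop. To keep the example independent of any HTTP library, it takes a `send` callable that returns a `(status, headers)` pair; that callable, the `send_with_retries` name, and the `max_retries` and `default_wait` parameters are assumptions made for illustration.

```python
import time
from typing import Callable, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, response headers)


def send_with_retries(send: Callable[[], Response],
                      max_retries: int = 3,
                      default_wait: int = 5,
                      sleep: Callable[[float], None] = time.sleep) -> Response:
    """Call `send`, waiting Retry-After seconds after each 429.

    Returns the last response once it is not a 429 or once the retry
    budget is spent, so the loop can never run forever.
    """
    attempt = 0
    while True:
        status, headers = send()
        if status != 429 or attempt >= max_retries:
            return status, headers
        # Fall back to default_wait if the header is unexpectedly missing.
        sleep(int(headers.get("Retry-After", default_wait)))
        attempt += 1
```

Passing `sleep` as a parameter keeps the function easy to test without real delays; callers normally leave it at the default.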

## Tips

* **Cache aggressively.** History rows are immutable once written for a given `(universe, day, engine)` triple — cache them client-side and you'll rarely re-query.
* **Use `/tabs/{tab}` instead of N `/metrics/{metric}` calls.** One round-trip vs five.
* **Use `/history` to grab everything once a day.** A single 2 MB response beats 50 small ones.
* **Spread bursts across keys** if you're running heavy back-fills. Multiple keys → multiple buckets.
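The caching tip can be sketched as a small in-memory cache keyed by the `(universe, day, engine)` triple. The `HistoryCache` class and the `fetch` callable are hypothetical; `fetch` stands in for whatever function in your client actually calls the API.

```python
from typing import Callable, Dict, Tuple

HistoryKey = Tuple[str, str, str]  # (universe, day, engine)


class HistoryCache:
    """Remember history rows so each triple is fetched at most once."""

    def __init__(self, fetch: Callable[[str, str, str], dict]):
        self._fetch = fetch                      # hypothetical API-calling function
        self._rows: Dict[HistoryKey, dict] = {}

    def get(self, universe: str, day: str, engine: str) -> dict:
        key = (universe, day, engine)
        if key not in self._rows:
            # Cache miss: this is the only path that spends a rate-limit token.
            self._rows[key] = self._fetch(universe, day, engine)
        return self._rows[key]
```

Because rows are immutable once written, this sketch never needs to invalidate an entry; a long-running process may still want to cap the cache size.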

## Roadmap

* **Tier-based limits** — Enterprise customers will get higher buckets in a future release.
* **Redis-backed counter** — when we run multiple frontend instances, we'll move the counter off in-process to keep the budget consistent across instances.

Both of these are transparent to the API surface — the headers stay
the same, just the numbers change.
