MarkDB

Rate limits & quotas

The API rate limiter, and how MarkDB backs off from provider quota errors.

There are two separate throttles to know about: MarkDB's own HTTP rate limit on the API, and how the background worker handles model-provider quota errors.

API rate limits

The Memory API (api.markdb.cloud/v1/*) is rate-limited per tenant on a fixed one-minute window:

Tier      Default
Personal  60 requests/min (MARKDB_RATE_LIMIT_PERSONAL_PER_MINUTE)
Team      600 requests/min (MARKDB_RATE_LIMIT_TEAM_PER_MINUTE)

When you exceed your limit within a window, the API returns 429 Too Many Requests with these headers:

  • Retry-After -- seconds until the window resets
  • X-RateLimit-Limit -- your limit
  • X-RateLimit-Remaining -- 0
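
A client can honor these headers by waiting for Retry-After before trying again. The following is a minimal sketch, not an official MarkDB client: `send`, `call_with_retry`, and `retry_after_seconds` are hypothetical names, the 60-second fallback is an assumption based on the one-minute window, and Retry-After is assumed to be a plain number of seconds.

```python
import time
from typing import Callable, Mapping, Tuple

# (status code, response headers). Real HTTP clients usually expose
# case-insensitive headers; a plain mapping is used here for simplicity.
Response = Tuple[int, Mapping[str, str]]


def retry_after_seconds(headers: Mapping[str, str], default: float = 60.0) -> float:
    """Parse Retry-After as seconds; fall back to `default` (an assumption)."""
    raw = headers.get("Retry-After")
    try:
        return max(0.0, float(raw))
    except (TypeError, ValueError):
        return default


def call_with_retry(
    send: Callable[[], Response],
    max_attempts: int = 3,
    sleep: Callable[[float], None] = time.sleep,
) -> Response:
    """Call `send`, waiting out the window whenever a 429 comes back."""
    status, headers = send()
    for _ in range(max_attempts - 1):
        if status != 429:
            break
        sleep(retry_after_seconds(headers))
        status, headers = send()
    return status, headers
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays; in production the default `time.sleep` applies.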

What is NOT rate-limited

Only the /v1 data-plane API is limited. The LLM proxy (proxy.markdb.cloud) and the MCP server (mcp.markdb.cloud) are not rate-limited by MarkDB -- their throughput is governed by your upstream model provider's own limits. The limiter also requires Redis; if Redis is down, it fails open (requests pass through).
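
To illustrate what a fixed one-minute window with fail-open behavior means, here is a rough sketch. It is not MarkDB's implementation: `InMemoryCounterStore` is a hypothetical stand-in for Redis, and `check_request` and `CounterStoreDown` are invented names for illustration only.

```python
import math
from typing import Dict, Tuple


class CounterStoreDown(Exception):
    """Raised by the hypothetical counter store when it is unreachable."""


class InMemoryCounterStore:
    """Stand-in for a shared counter store such as Redis (assumption)."""

    def __init__(self) -> None:
        self.counts: Dict[Tuple[str, int], int] = {}
        self.available = True

    def incr(self, key: Tuple[str, int]) -> int:
        if not self.available:
            raise CounterStoreDown()
        self.counts[key] = self.counts.get(key, 0) + 1
        return self.counts[key]


def check_request(
    store: InMemoryCounterStore, tenant: str, limit: int, now: float
) -> Tuple[bool, Dict[str, str]]:
    """Return (allowed, headers); headers are only filled in for a 429."""
    window = int(now // 60)  # fixed window: all requests in the same minute share a counter
    try:
        count = store.incr((tenant, window))
    except CounterStoreDown:
        return True, {}  # fail open: no counter available, so let the request through
    if count <= limit:
        return True, {}
    seconds_to_reset = math.ceil((window + 1) * 60 - now)
    return False, {
        "Retry-After": str(seconds_to_reset),
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": "0",
    }
```

Because the window is fixed rather than sliding, the counter resets at the minute boundary regardless of when the first request arrived.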

Provider quotas & backoff

Enrichment and embeddings call your model provider, which enforces its own per-minute and per-day quotas. When an upstream call fails with a quota or rate-limit error (e.g. Gemini RESOURCE_EXHAUSTED, a spending cap, or rate_limit_exceeded), MarkDB classifies it and backs off instead of hammering the provider:

  • Escalating retry. A quota-blocked enrichment job is requeued with a growing delay -- roughly 5 min, then 30 min, then 2 h, then 6 h (or the provider's own retryDelay hint if it's longer).
  • Per-pass circuit breaker. If several jobs in one worker pass hit quota errors, the pass stops scheduling the rest and tries again on the next tick, rather than burning the whole batch against a capped provider.
  • No data loss. Quota-blocked jobs are retried indefinitely; they are never dropped. Once quota frees up, they complete.
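
The retry schedule and the per-pass circuit breaker can be sketched roughly as follows. This illustrates the behavior described above and is not the worker's actual code: the function names, the `QuotaError` type, and the breaker threshold of 3 are assumptions ("several" is not quantified here), as is holding the delay at 6 h once the schedule is exhausted.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# Approximate schedule from the text: 5 min, 30 min, 2 h, 6 h (in seconds).
RETRY_DELAYS_S = [5 * 60, 30 * 60, 2 * 3600, 6 * 3600]


class QuotaError(Exception):
    """Hypothetical marker for an upstream error classified as quota-related."""


def next_delay(attempt: int, provider_hint_s: Optional[float] = None) -> float:
    """Delay before retry number `attempt` (1 = first quota failure).

    Assumption: after the last step the delay stays at 6 h, since jobs
    are retried indefinitely.
    """
    idx = min(max(attempt, 1), len(RETRY_DELAYS_S)) - 1
    delay = float(RETRY_DELAYS_S[idx])
    # Prefer the provider's own retry hint when it is longer.
    if provider_hint_s is not None and provider_hint_s > delay:
        return float(provider_hint_s)
    return delay


def run_pass(
    jobs: Iterable[str],
    process: Callable[[str], None],
    breaker_threshold: int = 3,  # assumed value
) -> Tuple[List[str], List[str]]:
    """Process one worker pass; return (done, deferred)."""
    done: List[str] = []
    deferred: List[str] = []
    quota_errors = 0
    for job in jobs:
        if quota_errors >= breaker_threshold:
            deferred.append(job)  # breaker open: leave the rest for the next tick
            continue
        try:
            process(job)
            done.append(job)
        except QuotaError:
            quota_errors += 1
            deferred.append(job)  # requeued, never dropped
    return done, deferred
```

Deferred jobs are returned rather than discarded, which mirrors the no-data-loss guarantee: a caller would requeue each one with `next_delay`.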

Search indexing behaves similarly, retrying quota-blocked pages on subsequent passes.

What you'll observe

Capture is never blocked by enrichment quota -- your agent's proxy requests still succeed and are mirrored. Under provider quota pressure, only summaries and the search index lag until quota recovers. If summaries stop appearing, check your provider key and its quota, and consider a higher-throughput enrichment model (see Models and Processing).