GET /v1/usage is the customer-facing analytics surface. It serves the per-team dashboard charts (/dashboard/usage), and is the same API you can hit programmatically for billing reports, custom FinOps dashboards, and downstream observability.
The endpoint returns time-bucketed usage rows, grouped along one or more dimensions (type, model, api_key_id, user_id, status) and filtered by any combination of those plus lora_id and character_id.
Quick shape
data[*] are time buckets, each with groups[*] for the slices. Each group’s identity is in key; the numbers are in metrics. Whether you grouped by type alone or type,model together, this nesting stays the same — key carries every grouped field.
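As a rough illustration of that nesting, the sketch below flattens a response into one row per (bucket, group). Only data, groups, key, and metrics come from the description above; the function name iter_group_rows and all sample values are invented for illustration, and real buckets presumably carry time-boundary fields that are left out here.

```python
from typing import Any, Dict, Iterator, Tuple

Row = Tuple[int, Dict[str, Any], Dict[str, Any]]


def iter_group_rows(response: Dict[str, Any]) -> Iterator[Row]:
    """Yield (bucket_index, key, metrics) for every group in every bucket."""
    for bucket_index, bucket in enumerate(response.get("data", [])):
        for group in bucket.get("groups", []):
            yield bucket_index, group.get("key", {}), group.get("metrics", {})


# Hypothetical response for group_by=type,model (values are made up).
sample = {
    "data": [
        {
            "groups": [
                {
                    "key": {"type": "chat", "model": "aurous-grow-2.0-pro"},
                    "metrics": {"request_count": 12, "credits_used": 3.5},
                },
                {
                    "key": {"type": "embedding", "model": "aurous-embed-vision-1.0"},
                    "metrics": {"request_count": 40, "credits_used": 0.8},
                },
            ]
        }
    ]
}

rows = list(iter_group_rows(sample))
```

Because key always carries every grouped field, the same loop works whether the request grouped by one dimension or several.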
Query parameters
Required
- start_time (RFC 3339 timestamp): inclusive lower bound. Maximum lookback is 730 days.
- end_time (RFC 3339 timestamp): exclusive upper bound. Must be > start_time. Defaults to now if omitted.
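A minimal client-side sketch of these rules, assuming the 730-day lookback is measured from the current time (the text above does not say what it is measured against). The helper name build_time_window is hypothetical; it only prepares query parameters and does not call the API.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict, Optional

MAX_LOOKBACK = timedelta(days=730)  # limit quoted in the parameter list


def build_time_window(
    start: datetime,
    end: Optional[datetime] = None,
    now: Optional[datetime] = None,
) -> Dict[str, str]:
    """Return start_time/end_time query parameters as RFC 3339 strings."""
    now = now or datetime.now(timezone.utc)
    if start.tzinfo is None or (end is not None and end.tzinfo is None):
        raise ValueError("use timezone-aware datetimes")
    # Assumption: lookback is checked relative to "now".
    if now - start > MAX_LOOKBACK:
        raise ValueError("start_time is more than 730 days back")
    params = {"start_time": start.isoformat()}
    if end is not None:
        if end <= start:
            raise ValueError("end_time must be after start_time")
        params["end_time"] = end.isoformat()
    # When end is None the parameter is omitted, so the server default applies.
    return params


params = build_time_window(
    datetime(2024, 5, 1, tzinfo=timezone.utc),
    datetime(2024, 5, 8, tzinfo=timezone.utc),
    now=datetime(2024, 6, 1, tzinfo=timezone.utc),
)
```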
Bucket sizing
- bucket_width: 1m / 5m / 15m / 1h / 1d / 7d / 30d. Determines the granularity of the time-bucketed rows. Default: chosen automatically based on (end_time - start_time) to keep the result under the bucket cap.
For example, 1m buckets over a 30-day window (30 × 24 × 60 = 43,200 buckets) would exceed the bucket cap. In that case the response is 400 too_many_buckets, with a hint to either widen bucket_width or shrink the time range.
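The bucket arithmetic can be checked before sending a request. The sketch below is an assumption-laden illustration: the actual cap value and the server's automatic selection rule are not stated here, so the cap is a caller-supplied parameter, and counting a partial trailing bucket as a full one is a guess.

```python
import math
from datetime import timedelta
from typing import Optional

# Widths listed for bucket_width, finest first.
BUCKET_WIDTHS = {
    "1m": timedelta(minutes=1),
    "5m": timedelta(minutes=5),
    "15m": timedelta(minutes=15),
    "1h": timedelta(hours=1),
    "1d": timedelta(days=1),
    "7d": timedelta(days=7),
    "30d": timedelta(days=30),
}


def bucket_count(window: timedelta, width: str) -> int:
    """Number of buckets needed to cover the window (partial buckets count)."""
    return math.ceil(window / BUCKET_WIDTHS[width])


def smallest_fitting_width(window: timedelta, cap: int) -> Optional[str]:
    """Finest width whose bucket count stays at or under the given cap."""
    for name in BUCKET_WIDTHS:  # dicts keep insertion order: finest first
        if bucket_count(window, name) <= cap:
            return name
    return None


one_minute_buckets = bucket_count(timedelta(days=30), "1m")
# 1000 is a placeholder cap for the example, not the documented value.
suggested = smallest_fitting_width(timedelta(days=30), cap=1000)
```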
Grouping dimensions
- group_by (comma-separated list): any combination of type, model, api_key_id, user_id, status. Default: no grouping (one row per bucket).
group_by=type is the most common shape — t2i / t2v / chat / embedding rows per bucket. group_by=model breaks LLM rows down by aurous-grow-2.0-pro vs aurous-embed-vision-1.0. group_by=type,model does both.
The lora_id and character_id dimensions are NOT valid in group_by (they have too many values to be useful as a top-level rollup) but ARE accepted as filters.
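A small client-side guard can catch an invalid group_by before the request is sent. This is only a sketch: the helper name build_group_by is hypothetical, and how the server itself responds to an invalid dimension is not described here.

```python
from typing import Iterable, List

GROUPABLE = ("type", "model", "api_key_id", "user_id", "status")
FILTER_ONLY = ("lora_id", "character_id")


def build_group_by(dimensions: Iterable[str]) -> str:
    """Validate dimensions and return the comma-separated group_by value."""
    seen: List[str] = []
    for dim in dimensions:
        if dim in FILTER_ONLY:
            raise ValueError(f"{dim} is filter-only and cannot be used in group_by")
        if dim not in GROUPABLE:
            raise ValueError(f"unknown group_by dimension: {dim}")
        if dim not in seen:  # drop duplicates, keep caller's order
            seen.append(dim)
    return ",".join(seen)


group_by = build_group_by(["type", "model"])
```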
Filters
The dimensions below act as filters: only rows that match are returned. To pass multiple values, use a comma-separated list or repeat the query parameter.
- status: completed / failed / cancelled / processing / pending
- type: t2i (text-to-image) / i2i (image-to-image) / t2v (text-to-video) / i2v (image-to-video) / chat / embedding. Note: this is the inference-type granularity, which is finer than image/video. To select all image-generation rows, pass ?type=t2i,i2i.
- api_key_id: apikey_<ulid>
- user_id: Supabase user UUID
- lora_id: lora_<ulid> (filter only; not valid in group_by)
- character_id: cha_<ulid> (filter only; not valid in group_by)
- model: slug, e.g. aurous-grow-2.0-pro. Multi-value: ?model=aurous-grow-2.0-pro,aurous-embed-vision-1.0 or ?model=aurous-grow-2.0-pro&model=aurous-embed-vision-1.0.
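Both multi-value styles can be produced with the standard library. The sketch below uses urllib.parse.urlencode and keeps commas unescaped, matching the literal commas in the examples above; whether the server also accepts percent-encoded commas is not stated. The function name encode_filters is hypothetical.

```python
from typing import Mapping, Sequence
from urllib.parse import urlencode


def encode_filters(filters: Mapping[str, Sequence[str]], repeated: bool = False) -> str:
    """Encode filters as a query string.

    repeated=False -> one parameter per filter, values joined by commas.
    repeated=True  -> the parameter is repeated once per value.
    """
    if repeated:
        pairs = [(name, value) for name, values in filters.items() for value in values]
    else:
        pairs = [(name, ",".join(values)) for name, values in filters.items() if values]
    # safe="," keeps the comma separator readable instead of %2C.
    return urlencode(pairs, safe=",")


all_images = encode_filters({"type": ["t2i", "i2i"]})
two_models = encode_filters(
    {"model": ["aurous-grow-2.0-pro", "aurous-embed-vision-1.0"]},
    repeated=True,
)
```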
Pagination
- limit: buckets per page (default 100, max 500). Each bucket carries all its groups regardless of limit (but see the pagination entry under Known edge cases).
- page_token: opaque cursor from the previous response’s next_page. The walk is forward-only; the cursor encodes the query fingerprint, so changing filters between pages returns 400 invalid_page_token.
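The cursor walk can be sketched as a generator. The HTTP call is abstracted behind a caller-supplied fetch_page function (hypothetical), and the sketch assumes next_page is a top-level response field that is missing or empty on the last page, which the text above does not confirm.

```python
from typing import Any, Callable, Dict, Iterator, Mapping


def iter_buckets(
    fetch_page: Callable[[Dict[str, Any]], Mapping[str, Any]],
    params: Mapping[str, Any],
) -> Iterator[Mapping[str, Any]]:
    """Yield every bucket across all pages, following next_page cursors."""
    # Copy once and only ever add page_token: changing filters mid-walk
    # would invalidate the cursor (400 invalid_page_token).
    query = dict(params)
    while True:
        page = fetch_page(dict(query))
        yield from page.get("data", [])
        token = page.get("next_page")  # assumed absent or empty on the last page
        if not token:
            return
        query["page_token"] = token
```

In practice fetch_page would issue GET /v1/usage with the given parameters and return the decoded JSON body; injecting it keeps the walk itself easy to test.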
Bucket + group shape
Each data[*] entry is a bucket (a time window). Inside, groups[*] are the grouped slices for that bucket. If you don’t pass group_by, each bucket has exactly one group with key: {} (the un-grouped total).
Metric fields
Every group’s metrics includes:
- request_count: total billed requests in the bucket
- successful_count: terminal completed rows
- failed_count: terminal failed rows (see Known edge cases below)
- cancelled_count: terminal cancelled rows
- credits_used: sum of credits_charged across all rows in the group (4 decimal places)
- image_count: total successful image-generation outputs (0 for chat / embedding / video buckets)
- video_seconds: total successful video-generation duration in seconds (0 for chat / embedding / image buckets)
- total_input_tokens: sum of input tokens for chat + embedding rows (0 for image / video)
- total_output_tokens: sum of output tokens for chat rows (0 for embedding / image / video)
- duration_ms_p50 / duration_ms_p95: the 50th and 95th percentile end-to-end duration in milliseconds (populated only when request_count ≥ 20; otherwise null)
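When consuming these fields, the null percentiles and zero-request groups are the two cases most likely to trip up a client. The helpers below are a hedged sketch with hypothetical names (success_rate, latency_label); they only read the fields listed above.

```python
from typing import Any, Mapping, Optional


def success_rate(metrics: Mapping[str, Any]) -> Optional[float]:
    """successful_count / request_count, or None when there were no requests."""
    total = metrics.get("request_count", 0)
    if not total:
        return None
    return metrics.get("successful_count", 0) / total


def latency_label(metrics: Mapping[str, Any]) -> str:
    """Human-readable latency summary that tolerates null percentiles."""
    p50 = metrics.get("duration_ms_p50")
    p95 = metrics.get("duration_ms_p95")
    if p50 is None or p95 is None:
        return "n/a (fewer than 20 requests)"
    return f"p50={p50}ms p95={p95}ms"


busy = {"request_count": 40, "successful_count": 30,
        "duration_ms_p50": 820, "duration_ms_p95": 2400}
quiet = {"request_count": 5, "successful_count": 5,
         "duration_ms_p50": None, "duration_ms_p95": None}
```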
Known edge cases
failed_count does NOT include failed_provider_unavailable
The current implementation buckets these under a separate errored_count that is not yet surfaced on this endpoint. As a result, successful_count + failed_count + cancelled_count may be less than request_count for buckets where some requests hit failed_provider_unavailable.
A fix to either rename failed_count to errored_count (inclusive) or surface separate counters is on the v1.1 roadmap.
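Until then, the gap can be estimated per group from the surfaced counters. This is a sketch only: unaccounted_requests is a hypothetical helper, and the gap is an upper bound on failed_provider_unavailable rows, since it may also include anything else that request_count counts but the three terminal counters do not (for instance in-flight rows, if those are counted, which is not stated here).

```python
from typing import Any, Mapping


def unaccounted_requests(metrics: Mapping[str, Any]) -> int:
    """request_count minus the three surfaced terminal counters (never negative)."""
    terminal = (
        metrics.get("successful_count", 0)
        + metrics.get("failed_count", 0)
        + metrics.get("cancelled_count", 0)
    )
    return max(metrics.get("request_count", 0) - terminal, 0)


# Invented numbers: 10 requests, 9 accounted for by the surfaced counters.
gap = unaccounted_requests(
    {"request_count": 10, "successful_count": 6, "failed_count": 2, "cancelled_count": 1}
)
```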
Pagination drops groups when limit < group_count
If a bucket has, say, 3 groups (chat + embedding + t2i) and you set limit=1, the response returns that bucket with only the first group and a next_page cursor. Following the cursor returns nothing (the cursor encodes the bucket boundary; the engine has already moved past it), so the remaining groups are never delivered.
Workaround: choose a limit larger than your largest bucket’s group count. With group_by=type that’s ≤ 6 (t2i, i2i, t2v, i2v, chat, embedding). Fix on the v1.1 roadmap.
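One way to apply the workaround is to derive a worst-case group count from the number of distinct values per grouped dimension. The sketch below assumes the worst case is the product of those counts; only the figure of 6 for type comes from the text, every other cardinality must be supplied by the caller, and min_safe_limit is a hypothetical helper.

```python
import math
from typing import Mapping

MAX_LIMIT = 500  # documented maximum for limit
TYPE_VALUES = 6  # t2i, i2i, t2v, i2v, chat, embedding


def min_safe_limit(cardinalities: Mapping[str, int]) -> int:
    """Smallest limit strictly larger than the worst-case groups per bucket."""
    worst_case_groups = math.prod(cardinalities.values())  # 1 when ungrouped
    limit = worst_case_groups + 1
    if limit > MAX_LIMIT:
        raise ValueError(
            f"worst case of {worst_case_groups} groups cannot be covered by limit <= {MAX_LIMIT}"
        )
    return limit


by_type = min_safe_limit({"type": TYPE_VALUES})
# Two models is an illustrative figure, not a statement about the catalogue.
by_type_and_model = min_safe_limit({"type": TYPE_VALUES, "model": 2})
```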
Streaming individual events
If you need event-level granularity (one row per billed request), see GET /v1/usage/events, which exposes the same data without aggregation.
Where to next?
- Usage pagination — cursor walk + known limitations
- Usage event stream — per-event detail rows
- Cost transparency — reconciling totals against per-call receipts
- GET /v1/usage — endpoint reference

