API Documentation
InstantTranscriber API v1
Build audio-to-text and video transcription into your app. Upload files, import public URLs, request speaker labels and timestamps, receive webhooks, and export transcripts from one REST API.
Base URL
https://api.instanttranscriber.com/v1
Authentication
All endpoints require a bearer API key created in account settings.
Authorization: Bearer it_live_<key_id>_<secret>
Processing
Jobs usually process at roughly 10x audio speed. Up to 10 jobs can run in parallel.
Quick Start
Create a transcription
Submit a multipart file upload or a JSON body with a public file_url. Enhanced speaker labels can clean up unclear turns, and optional speaker name detection replaces generic labels only when names are clearly stated in the transcript.
curl -X POST https://api.instanttranscriber.com/v1/transcriptions \
-H "Authorization: Bearer $INSTANTTRANSCRIBER_API_KEY" \
-F "[email protected]" \
-F "speaker_labels=enhanced" \
-F "speaker_name_detection=auto" \
-F "timestamps=false"
Request Options
Control transcript output per job
| Field | Values | Default |
|---|---|---|
| speaker_labels | none, standard, enhanced | standard |
| speaker_name_detection | off, auto | off |
| timestamps | true, false | true |
| summaries | short, detailed, or both | none |
| language | auto, omitted, or language code | auto |
| num_speakers | exact speaker count | auto |
| min_speakers / max_speakers | speaker range | auto |
| callback_url | webhook URL | unset |
| callback_secret | HMAC secret | unset |
| wait | hold request open, max 70 seconds | 0 |
Use language=auto or omit language for automatic language detection. The API supports the same 100 transcription language codes available in the web app.
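The options in the table above can also be sent in a JSON body alongside a public file_url. The following is a minimal sketch using only the Python standard library. The helper name `build_create_request`, the example media URL, and the JSON value types (for instance a JSON boolean for timestamps) are assumptions for illustration, not confirmed API behavior.

```python
import json
import urllib.request

BASE_URL = "https://api.instanttranscriber.com/v1"


def build_create_request(api_key, file_url, **options):
    """Build (but do not send) a POST /transcriptions request with a JSON body.

    Hypothetical helper: option names come from the table above, but the
    JSON encoding of each value is an assumption.
    """
    body = {"file_url": file_url}
    # Drop options set to None so the documented defaults apply.
    body.update({k: v for k, v in options.items() if v is not None})
    return urllib.request.Request(
        f"{BASE_URL}/transcriptions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = build_create_request(
    "it_live_<key_id>_<secret>",          # placeholder key, as in the header example
    "https://example.com/interview.mp3",  # hypothetical public URL
    speaker_labels="enhanced",
    timestamps=False,
    language=None,                        # omitted, so the language is auto-detected
)
# Sending is left to the caller, e.g. urllib.request.urlopen(req).
```

Leaving an option out of the body, rather than sending an explicit default, keeps the request aligned with whatever defaults the server applies.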
Plans And Quota
Included API audio time
| Plan | Included API audio time | Max upload size | Max audio duration |
|---|---|---|---|
| Free | 1 hour per UTC calendar month | 50 MiB | 35 minutes |
| Premium | 8 hours per Stripe billing month | 1 GiB | 10 hours |
| API Plan | 100 hours per Stripe billing month | 1 GiB | 10 hours |
API usage is billed by audio duration, rounded down to the completed audio second. Each job has a 1 minute minimum. Failed jobs do not count, and dashboard transcriptions do not count against API quota. API Plan overage is billed at $0.49 per audio hour and can be capped from the quota endpoint.
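The billing rules above can be worked through in a few lines. This is a sketch for estimating usage on the client side, not the service's billing code. It assumes the 1 minute minimum is applied after rounding down, and that overage is prorated per second and rounded to cents; neither detail is stated above.

```python
import math

MIN_BILLED_SECONDS = 60                 # 1 minute minimum per job
OVERAGE_USD_PER_HOUR = 0.49             # API Plan overage rate
API_PLAN_INCLUDED_SECONDS = 100 * 3600  # 100 hours per billing month


def billed_seconds(duration_seconds: float) -> int:
    """Round down to the completed audio second, then apply the minimum."""
    return max(MIN_BILLED_SECONDS, math.floor(duration_seconds))


def overage_usd(total_billed_seconds: int,
                included_seconds: int = API_PLAN_INCLUDED_SECONDS) -> float:
    """Estimate overage cost; per-second proration is an assumption."""
    extra = max(0, total_billed_seconds - included_seconds)
    return round(extra / 3600 * OVERAGE_USD_PER_HOUR, 2)


print(billed_seconds(12.9))     # 60  (short clip hits the 1 minute minimum)
print(billed_seconds(125.7))    # 125 (partial second is dropped)
print(overage_usd(102 * 3600))  # 0.98 (2 hours over the included 100)
```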
Upload And URL Imports
Supported media and remote file rules
The API accepts common audio and video containers/codecs that ffmpeg can probe and decode, including mp3, m4a, wav, flac, ogg, opus, aiff, mp4, mov, mkv, webm, and 3gp. Video files are accepted when they contain an audio stream.
URL imports must be public http:// or https:// URLs on standard ports 80 or 443. Localhost, private IPs, link-local hosts, authenticated URLs, and YouTube-family URLs are rejected.
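A client can catch most of these rejections before submitting a job. The sketch below mirrors the rules above with the Python standard library. The function name `precheck_file_url` and the `YOUTUBE_HOSTS` list are assumptions, since the exact set of YouTube-family hosts is not given here. It also inspects only literal IP addresses and does not resolve DNS, so a hostname that points at a private address would still pass this check.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit

# Assumption: an illustrative, incomplete list of YouTube-family hosts.
YOUTUBE_HOSTS = ("youtube.com", "youtu.be")


def precheck_file_url(url: str) -> Optional[str]:
    """Return a reason the URL would likely be rejected, or None if it looks fine."""
    parts = urlsplit(url)
    if parts.scheme not in ("http", "https"):
        return "scheme must be http or https"
    if parts.username is not None or parts.password is not None:
        return "authenticated URLs are rejected"
    try:
        port = parts.port
    except ValueError:
        return "port is not a valid number"
    if port not in (None, 80, 443):
        return "port must be 80 or 443"
    host = (parts.hostname or "").lower()
    if not host:
        return "URL has no host"
    if host == "localhost" or host.endswith(".localhost"):
        return "localhost is rejected"
    if any(host == h or host.endswith("." + h) for h in YOUTUBE_HOSTS):
        return "YouTube-family URLs are rejected"
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        return None  # a hostname rather than an IP literal; DNS is not resolved here
    if ip.is_private or ip.is_loopback or ip.is_link_local:
        return "private, loopback, and link-local addresses are rejected"
    return None
```

The server remains the authority: a URL that passes this precheck can still fail with remote_fetch_failed.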
Remote download model
InstantTranscriber downloads the remote file server-side before validation, transcoding, queueing, and billing checks. Workers do not stream directly from your URL.
Endpoints
API surface
| Endpoint | Purpose |
|---|---|
| /v1/transcriptions | Create a transcription job from a file upload or public file_url. |
| /v1/transcriptions/{id}/status | Poll queued, transcribing, post_processing, completed, or failed status. |
| /v1/transcriptions/{id} | Fetch the completed transcript, segments, language, and requested summaries. |
| /v1/transcriptions?limit=25&offset=0 | List recent API-created jobs and quota counted for them. |
| /v1/quota | Inspect included hours, used hours, remaining quota, reset time, and overage state. |
| /v1/quota/overage-cap | Set or remove an API Plan monthly overage cap. |
| /v1/transcriptions/{id} | Delete a transcript and make future requests for that ID return 404. |
Webhooks
Receive completion callbacks
Set callback_url to receive a best-effort POST when the top-level job reaches completed or failed. If callback_secret is set, requests include an X-IT-Signature header using HMAC-SHA256 over compact JSON.
Webhooks are at-least-once notifications. Use the transcript ID plus status as your idempotency key and fetch the transcript by ID for the authoritative result.
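A receiver might combine signature verification with the idempotency key described above. The sketch below makes several assumptions that are not confirmed here: that X-IT-Signature carries a lowercase hex digest, that the HMAC is computed over the exact bytes of the request body, and that an in-memory set is enough for deduplication (a real service would use durable storage). The function names are hypothetical.

```python
import hashlib
import hmac
import json

_seen_events = set()  # (transcript id, status) pairs already processed


def verify_signature(raw_body: bytes, header_value: str, secret: str) -> bool:
    """Compare the header against HMAC-SHA256 of the raw body (hex is assumed)."""
    expected = hmac.new(secret.encode("utf-8"), raw_body, hashlib.sha256).hexdigest()
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode("utf-8"), header_value.encode("utf-8"))


def handle_webhook(raw_body: bytes, header_value: str, secret: str) -> str:
    """Return 'rejected', 'duplicate', or 'accepted' for one delivery."""
    if not verify_signature(raw_body, header_value, secret):
        return "rejected"
    event = json.loads(raw_body)
    key = (event["id"], event["status"])  # idempotency key: transcript ID plus status
    if key in _seen_events:
        return "duplicate"  # at-least-once delivery can repeat a notification
    _seen_events.add(key)
    return "accepted"
```

Verifying against the raw bytes, rather than re-serializing the parsed JSON, avoids mismatches from key order or whitespace. After an accepted delivery, fetch the transcript by ID for the authoritative result. The payload that follows is the kind of body such a handler would receive.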
{
"id": "0a2c9f72-0f0b-42f3-a30b-15dc82619500",
"status": "completed",
"download_urls": {
"srt": "https://api.instanttranscriber.com/export/0a2c9f72.srt",
"vtt": "https://api.instanttranscriber.com/export/0a2c9f72.vtt",
"docx": "https://api.instanttranscriber.com/export/0a2c9f72.docx",
"pdf": "https://api.instanttranscriber.com/export/0a2c9f72.pdf"
}
}
Polling
Job statuses
| Status | Meaning |
|---|---|
| queued | Accepted but not yet started. |
| transcribing | ASR is running. |
| succeeded | ASR has finished and post-processing may start shortly. |
| post_processing | Enhanced speaker labels or summaries are still being generated. |
| completed | The final result is ready. |
| failed | The job failed or was rejected. |
Poll every 5 seconds for one-off jobs. For many concurrent jobs, use webhook callbacks or per-job backoff. The default status polling budget is 60 status requests per minute per API key and no more than 1 status request per job every 5 seconds.
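The polling budget above implies a simple rule for the interval. With n jobs each polled every t seconds, the key makes n * 60 / t status requests per minute, so staying within 60 requires t >= n, and the per-job limit requires t >= 5. The sketch below applies that rule. `fetch_status` is a hypothetical callable that returns the current status string for one job, and the `max_polls` default is an arbitrary safety limit, not an API value.

```python
import time
from typing import Callable

TERMINAL_STATUSES = {"completed", "failed"}


def poll_interval_seconds(concurrent_jobs: int) -> float:
    """Smallest interval that respects both documented polling limits."""
    return float(max(5, concurrent_jobs))


def poll_until_terminal(fetch_status: Callable[[], str],
                        concurrent_jobs: int = 1,
                        max_polls: int = 720,
                        sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll one job until it reaches completed or failed, then return that status."""
    interval = poll_interval_seconds(concurrent_jobs)
    for _ in range(max_polls):
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval)
    raise TimeoutError("job did not reach a terminal status within max_polls")
```

For example, 10 jobs polled in parallel would each wait 10 seconds between requests, which keeps the total at 60 requests per minute.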
Results And Exports
Fetch transcript text, segments, summaries, and exports
GET /v1/transcriptions/{id} returns 409 result_not_ready until requested post-processing finishes. Completed results include full transcript text, language, speaker segments, and requested short or detailed summaries.
Completed webhook payloads include download URLs for SRT, VTT, DOCX, and PDF exports. Plain transcript text is available from the result endpoint.
Export formats
Use TXT for automation, SRT/VTT for captions, DOCX for review, and PDF for sharing or archive.
Errors
Stable error codes
Canonical errors return JSON with error.code, error.message, optional details, and a request_id. Include the request ID in support requests.
| HTTP | error.code | Retry guidance |
|---|---|---|
| 400 | validation_failed | Fix invalid request body, query string, or option values. |
| 400 | unsupported_format | Retry with a supported audio or video file. |
| 400 | remote_fetch_failed | Fix the public URL or upload the file directly. |
| 402 | quota_exceeded | Upgrade, increase the cap, or wait for the quota reset. |
| 402 | requires_upgrade | Upgrade before starting a free-plan job above plan duration limits. |
| 403 | auth_failed | Send a valid API key. |
| 404 | not_found | Use another transcript ID; the object is missing, deleted, or inaccessible. |
| 409 | result_not_ready | Poll status or wait for a webhook before fetching the result. |
| 413 | payload_too_large | Use a smaller file or a higher plan. |
| 422 | file_too_long | Trim or split the file. |
| 429 | rate_limited | Honor the Retry-After header. |
| 500 | internal_error | Retry with backoff and include request_id if it persists. |
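The retry guidance in the table can be folded into one client-side decision function. This is a sketch under stated assumptions: the error body is taken to look like `{"error": {"code": ..., "message": ...}, "request_id": ...}`, Retry-After is taken to be a number of seconds, and the fallback delays (5 seconds, exponential backoff capped at 60 seconds) are illustrative choices rather than API values. The function name is hypothetical.

```python
import json
from typing import Mapping, Optional


def retry_delay_seconds(http_status: int, body: bytes,
                        headers: Mapping[str, str], attempt: int) -> Optional[float]:
    """Return seconds to wait before retrying, or None if a retry will not help."""
    try:
        code = json.loads(body)["error"]["code"]
    except (ValueError, KeyError, TypeError):
        code = None  # body is not the expected error JSON
    if code == "rate_limited" or http_status == 429:
        try:
            # Header lookup is case-sensitive for a plain dict.
            return float(headers.get("Retry-After", ""))
        except (TypeError, ValueError):
            return 5.0  # assumption: fallback when the header is missing or not numeric
    if code == "result_not_ready":
        return 5.0  # matches the documented 5 second polling interval
    if code == "internal_error" or http_status >= 500:
        return min(60.0, 2.0 ** attempt)  # exponential backoff, capped
    return None  # validation, auth, quota, and size errors need a changed request
```

Errors that return None call for a different request, such as a new key, a smaller file, or a plan change, so retrying them unchanged only spends rate-limit budget.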
Start building with the transcription API
Create an account, generate an API key in account settings, and submit your first audio-to-text job.