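How do I call Whisper through 4All API?

Endpoint: POST https://api.4allapi.com/v1/audio/transcriptions. Upload the audio file as multipart/form-data together with the form field model=whisper-1. The returned JSON contains a text field that holds the transcription. Send the header Authorization: Bearer YOUR_TOKEN.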

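Which audio formats are supported?

Whisper supports mainstream formats including mp3, mp4, mpeg, mpga, m4a, wav, and webm. The size limit per file is 25MB. Longer audio must be split into chunks, transcribed with one call per chunk, and the results concatenated.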
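The split-and-concatenate workflow can be sketched as follows. This is a minimal illustration, not part of the 4All API: the helper names plan_chunks and join_transcripts and the 600-second default chunk length are assumptions, and the chunk length that keeps each file under 25MB depends on your codec and bitrate. Cutting the audio itself (for example with ffmpeg) is left out.

```python
import math

def plan_chunks(duration_s: float, chunk_s: float = 600.0) -> list[tuple[float, float]]:
    """Return (start, end) ranges in seconds that cover the whole recording.

    chunk_s is a hypothetical default; pick a value that keeps each
    exported chunk under the 25MB per-file limit for your bitrate.
    """
    if duration_s <= 0 or chunk_s <= 0:
        return []
    count = math.ceil(duration_s / chunk_s)
    return [(i * chunk_s, min((i + 1) * chunk_s, duration_s)) for i in range(count)]

def join_transcripts(parts: list[str], sep: str = " ") -> str:
    """Concatenate per-chunk transcripts in order, skipping empty ones."""
    return sep.join(p.strip() for p in parts if p.strip())
```

For Chinese transcripts, passing sep="" avoids inserting spaces between chunks. Splitting at fixed offsets can cut a word in half, so cutting at silences is usually preferable where possible.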

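How accurate is Whisper?

whisper-1 (the same model as whisper-large-v2) reaches above 95% accuracy for Chinese on clean recordings and detects the language automatically, so the language parameter does not need to be set. For recordings with heavy background noise, run a denoiser such as RNNoise first.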

Can it output subtitles with timestamps?

Yes. Add response_format=verbose_json, srt, or vtt to the request to get timestamped output. verbose_json includes start/end/text for each segment; srt and vtt are common subtitle formats.
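If you request verbose_json and want to build subtitles yourself, the per-segment start/end/text values map directly onto SRT cues. The sketch below assumes you have already extracted the segments as a list of dicts with those three keys (in OpenAI's response they sit under a "segments" key, which is assumed to carry over here); the function names are illustrative.

```python
def format_timestamp(seconds: float) -> str:
    """Convert seconds to the SRT timestamp form HH:MM:SS,mmm."""
    ms_total = int(round(seconds * 1000))
    hours, rem = divmod(ms_total, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def segments_to_srt(segments: list[dict]) -> str:
    """Render segments (each with start, end, text) as SRT cue blocks."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text'].strip()}\n")
    # Cue blocks are separated by a blank line.
    return "\n".join(blocks)
```

When plain subtitles are all you need, requesting response_format=srt directly is simpler; converting from verbose_json is mainly useful if you want to filter or merge segments first.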

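How is Whisper billed?

Billing is per minute of uploaded audio at a fixed unit price, with any partial minute rounded up. In the console's "Usage Logs", filter by model=whisper-1 to see the duration and charge for each transcription.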
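The round-up rule can be illustrated with a small estimate. The unit price is not stated here, so price_per_minute is a placeholder you would fill in from the pricing page; the function names are hypothetical, and treating a zero-length file as free is an assumption.

```python
import math

def billed_minutes(duration_s: float) -> int:
    """Minutes charged for one upload: partial minutes round up."""
    if duration_s <= 0:
        return 0  # assumption: an empty file is not billed
    return math.ceil(duration_s / 60)

def estimate_cost(duration_s: float, price_per_minute: float) -> float:
    """Estimated charge for one upload at a fixed per-minute price."""
    return billed_minutes(duration_s) * price_per_minute
```

Because rounding applies per upload, a 61-second file is billed as 2 minutes, and splitting long audio into many short chunks can raise the total: ten 61-second chunks would be billed as 20 minutes, not 11.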

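Can it identify speakers (tell who is speaking)?

OpenAI's original Whisper does not distinguish speakers (speaker diarization). If you need speaker separation, 4All API also offers other ASR models that support diarization, or you can post-process Whisper's output with a tool such as pyannote-audio.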
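One common way to do that post-processing is to label each Whisper segment with the speaker whose turn overlaps it the most. The sketch below is only an illustration of that idea: it assumes the diarization tool's output has already been converted to (start, end, speaker) tuples, and it does not call pyannote-audio itself. The function name assign_speakers is hypothetical.

```python
def assign_speakers(segments: list[dict], turns: list[tuple[float, float, str]]) -> list[dict]:
    """Attach a speaker label to each segment by maximum time overlap.

    segments: dicts with start, end, text (as in verbose_json output).
    turns: (start, end, speaker) tuples from a diarization tool (assumed shape).
    """
    labeled = []
    for seg in segments:
        best_speaker, best_overlap = None, 0.0
        for turn_start, turn_end, speaker in turns:
            # Length of the intersection of the two intervals; negative if disjoint.
            overlap = min(seg["end"], turn_end) - max(seg["start"], turn_start)
            if overlap > best_overlap:
                best_speaker, best_overlap = speaker, overlap
        # speaker stays None when no turn overlaps the segment.
        labeled.append({**seg, "speaker": best_speaker})
    return labeled
```

A segment that spans a speaker change gets only the dominant speaker, so this works best when segments are short.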

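Is real-time streaming transcription supported?

/v1/audio/transcriptions itself is a synchronous endpoint (upload, wait, return). For real-time streaming ASR, 4All API provides /v1/realtime (the OpenAI Realtime API), which supports bidirectional audio streaming.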

Whisper Speech-to-Text

Whisper Model API Reference

This API is based on the Whisper model and provides speech-to-text functionality, supporting common audio formats.

Basic Concepts

Whisper model: OpenAI’s open-source speech recognition model, supporting multilingual transcription
Audio formats: Supports common formats such as mp3, wav, and m4a

API Endpoint

POST https://api.4allapi.com/v1/audio/transcriptions

Request Parameters

Parameter	Type	Required	Description
model	string	Yes	Fixed value `"whisper-1"`
file	file	Yes	The audio file to transcribe

Request Headers

Authorization: Bearer sk-*********************  # replace with your API token

Python Example

import os
import requests

def voice_to_text(file_path):
    """
    Speech-to-text functionality

    Parameters:
    file_path: path to the audio file

    Returns:
    recognized text content
    """
    url = "https://api.4allapi.com/v1/audio/transcriptions"

    # Set request headers (replace with your API token)
    headers = {"Authorization": "Bearer sk-***************************"}

    # Form field required by the endpoint
    payload = {"model": "whisper-1"}

    # Open the file in a with-block so the handle is closed after the upload,
    # and send the real file name instead of a hard-coded one
    with open(file_path, "rb") as audio_file:
        files = {"file": (os.path.basename(file_path), audio_file)}
        response = requests.post(url, headers=headers, data=payload, files=files)

    # Raise an exception on HTTP errors instead of silently returning ""
    response.raise_for_status()

    # Parse the JSON body and return the transcription result
    return response.json().get("text", "")

# Example usage
print(voice_to_text("audio.mp3"))  # replace with the path to your audio file

Response Example

Successful response:

{
    "text": "This is the recognized text content"
}

Notes

Keep each audio file within 25MB; split longer recordings before uploading
Supports multiple languages, including Chinese and English
Please keep your API key secure and do not disclose it
