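How do I call OpenAI TTS through 4All API?

Send a POST request to https://api.4allapi.com/v1/audio/speech with a JSON body: { "model": "tts-1", "voice": "alloy", "input": "text to synthesize", "response_format": "mp3" }. Include the header Authorization: Bearer YOUR_TOKEN. The response is a binary audio stream that can be saved directly as an mp3 file.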
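The request described above could be sketched in Python with only the standard library. The helper names `build_speech_request` and `synthesize_to_file`, the 60-second timeout, and the output path are illustrative assumptions, not part of the platform.

```python
import json
import urllib.request

API_URL = "https://api.4allapi.com/v1/audio/speech"

def build_speech_request(text, token, model="tts-1", voice="alloy", response_format="mp3"):
    """Build (but do not send) the POST request for /v1/audio/speech."""
    body = {
        "model": model,
        "voice": voice,
        "input": text,
        "response_format": response_format,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )

def synthesize_to_file(text, token, path="speech.mp3"):
    """Send the request and write the binary audio response to `path`."""
    request = build_speech_request(text, token)
    with urllib.request.urlopen(request, timeout=60) as response:
        audio = response.read()
    with open(path, "wb") as f:
        f.write(audio)
    return path
```

Nothing is sent until `synthesize_to_file` is called with a real token.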

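Which TTS voices are supported?

OpenAI TTS provides 6 voices: alloy, echo, fable, onyx, nova, and shimmer. alloy is neutral, echo is a steady male voice, fable is a British-accented male voice, onyx is a deep male voice, nova is a young female voice, and shimmer is a gentle female voice. For Chinese content, alloy or nova is recommended.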

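Which audio output formats are available?

response_format supports mp3 (default), opus (low-bandwidth real-time communication), aac (mobile-friendly), flac (lossless), wav, and pcm. mp3, opus, and aac use lossy compression; flac is losslessly compressed; wav and pcm are uncompressed and suit downstream audio processing pipelines.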

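How is TTS billed?

Billing is based on the character count of the input text, and tts-1 and tts-1-hd have different unit prices. tts-1-hd produces higher-quality output but costs about twice as much as tts-1. The "Usage Logs" page in the console shows the character count and charge for each synthesis call.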
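As a rough illustration of character-based billing, the sketch below estimates the charge for one call. The unit price is a hypothetical input (the real value comes from the console), the 2x multiplier only mirrors the "about twice" guidance above, and counting characters with `len` is an assumption about how the platform counts them.

```python
def estimate_cost(text, price_per_1k_chars, model="tts-1", hd_multiplier=2.0):
    """Estimate the charge for one synthesis call from the input character count.

    price_per_1k_chars is a hypothetical tts-1 unit price; hd_multiplier
    approximates the tts-1-hd surcharge and is not an official figure.
    """
    chars = len(text)
    unit_price = price_per_1k_chars * (hd_multiplier if model == "tts-1-hd" else 1.0)
    return chars, chars / 1000 * unit_price
```

With a made-up price of 0.5 per 1,000 characters, a 2,000-character input would come to 1.0 on tts-1 and 2.0 on tts-1-hd.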

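Is Chinese supported?

Yes. OpenAI TTS natively supports 50+ languages. Chinese synthesis is acceptably natural, but the voices still sound like native English speakers. If you need higher-quality Chinese, consider other TTS models on the 4All API platform, such as the Azure series.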

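What is the maximum length of the returned audio?

The input text of a single request is limited to 4096 characters, which yields roughly 5-8 minutes of audio (depending on language and speaking rate). Longer text must be split by the caller, synthesized in multiple requests, and the resulting audio concatenated.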
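One way to split long text before calling the API is to cut at sentence boundaries and fall back to a hard cut only when a single sentence exceeds the limit. This is a minimal sketch; `split_text` is an illustrative name, and the punctuation set is an assumption that may need extending for other languages.

```python
import re

MAX_CHARS = 4096  # per-request input limit stated above

def split_text(text, max_chars=MAX_CHARS):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    # Split after Chinese or Western sentence-ending punctuation, keeping it.
    sentences = [s for s in re.split(r"(?<=[。！？.!?])", text) if s]
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is itself over the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        # Start a new chunk when adding this sentence would exceed the limit.
        if len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk can then be sent as the `input` of its own request; joining the chunks reproduces the original text.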

TTS Speech Synthesis

Speech Synthesis API Documentation

Overview

The Speech Synthesis (Text-to-Speech, TTS) API lets you convert text into natural, fluent speech. This API is compatible with the OpenAI standard and also supports the Tongyi Qianwen qwen-tts model family, delivering high-quality Chinese and English speech synthesis.

Endpoint Information

Endpoint: /v1/audio/speech
Method: POST
Content Type: application/json
Authentication: Bearer Token

Supported Models

Qwen-TTS Model Family

This system fully supports the Tongyi Qianwen qwen-tts model family:

Model Name	Description	Features
`qwen-tts`	Basic version	Standard audio quality, suitable for general use
`qwen-tts-latest`	Latest version	Better audio quality, supports more voices
`qwen-tts-2025-05-22`	Specific version	Stable release, suitable for production environments

Supported Voices

General Voices (Supported by All Versions)

Voice Code	Voice Name	Gender	Features
`Cherry`	Sweet female voice	Female	Sweet and pleasant, suitable for warm, friendly scenarios
`Serena`	Gentle female voice	Female	Soft and gentle, suitable for professional announcements
`Ethan`	Steady male voice	Male	Calm and composed, suitable for business scenarios
`Chelsie`	Lively female voice	Female	Energetic and lively, suitable for youthful content

Premium Voices (Supported by `qwen-tts-latest` and `qwen-tts-2025-05-22`)

Voice Code	Voice Name	Gender	Features
`Dylan`	Beijing dialect	Male	Youthful and energetic, suitable for trendy content
`Jada`	Wu dialect	Female	Intelligent and elegant, suitable for educational content
`Sunny`	Sichuan dialect	Female	Bright and cheerful, suitable for children’s content
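The two voice tables can be encoded as a small client-side check, so that an unsupported model and voice pair is caught before a request is sent. This is a sketch built only from the tables above; `voice_supported` is an illustrative helper, not part of the API.

```python
GENERAL_VOICES = {"Cherry", "Serena", "Ethan", "Chelsie"}
PREMIUM_VOICES = {"Dylan", "Jada", "Sunny"}
PREMIUM_MODELS = {"qwen-tts-latest", "qwen-tts-2025-05-22"}
QWEN_MODELS = PREMIUM_MODELS | {"qwen-tts"}

def voice_supported(model, voice):
    """Return True if the voice is listed for the given qwen-tts model."""
    if model not in QWEN_MODELS:
        raise ValueError(f"{model!r} is not a qwen-tts model covered by these tables")
    if voice in GENERAL_VOICES:
        return True
    # Premium voices are listed only for the latest and dated versions.
    return voice in PREMIUM_VOICES and model in PREMIUM_MODELS
```

Voice codes are compared case-sensitively here, matching the capitalization used in the tables.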

Request Format

Basic Request

{
    "model": "qwen-tts",
    "input": "Hello, welcome to the speech synthesis service!",
    "voice": "Cherry"
}

Full Request Parameters

{
    "model": "qwen-tts-latest",
    "input": "This is a piece of text that needs to be converted into speech. It supports Chinese, English, and mixed Chinese-English input.",
    "voice": "Serena",
    "speed": 1.0,
    "response_format": "wav"
}

Request Parameters

Parameter	Type	Required	Description
`model`	string	Yes	The TTS model to use, supports OpenAI tts and the qwen-tts family
`input`	string	Yes	Text to convert into speech, up to 512 tokens
`voice`	string	Yes	Voice selection, see the supported voice list
`speed`	number	No	Speech speed, range 0.25-4.0, default 1.0
`response_format`	string	No	Audio format, currently supports wav
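The parameter table translates into a small payload builder that rejects missing required fields and out-of-range speeds before any network call. `build_payload` is a hypothetical helper name; the checks simply restate the table and do not replace server-side validation.

```python
def build_payload(model, text, voice, speed=1.0, response_format="wav"):
    """Build the JSON body for /v1/audio/speech, validating documented limits."""
    # model, input, and voice are marked as required in the table above.
    for name, value in (("model", model), ("input", text), ("voice", voice)):
        if not value:
            raise ValueError(f"{name} is required")
    # speed is documented as 0.25-4.0 with a default of 1.0.
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "speed": speed,
        "response_format": response_format,
    }
```

The 512-token input limit is not checked here because the tokenizer used for counting is not specified in this document.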

Response Format

Successful Response

The API returns the audio file content directly, with the following response headers:

Content-Type: audio/wav
Content-Disposition: attachment; filename="audio.wav"

Audio format specifications:

Format: WAV (RIFF)
Encoding: 16-bit PCM
Channels: Mono
Sample Rate: 24000 Hz
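To confirm that a downloaded response matches these specifications, the WAV header can be read with Python's standard `wave` module. The function names below are illustrative, and the sketch assumes the response body is a complete WAV file held in memory.

```python
import io
import wave

def describe_wav(audio_bytes):
    """Read channel count, bit depth, sample rate, and duration from WAV bytes."""
    with wave.open(io.BytesIO(audio_bytes), "rb") as wav:
        return {
            "channels": wav.getnchannels(),
            "sample_width_bits": wav.getsampwidth() * 8,
            "sample_rate": wav.getframerate(),
            "duration_seconds": wav.getnframes() / wav.getframerate(),
        }

def matches_documented_spec(info):
    """Check for mono, 16-bit, 24000 Hz audio as listed above."""
    return (info["channels"], info["sample_width_bits"], info["sample_rate"]) == (1, 16, 24000)
```

At this specification one second of audio occupies 24000 × 2 = 48000 bytes of sample data, which gives a quick sanity check on file size.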

Error Response

{
    "error": {
        "message": "Error description",
        "type": "invalid_request_error",
        "code": "error_code"
    }
}
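A client can unpack this error shape defensively, since a proxy or gateway may return a non-JSON body. The sketch below assumes only the three fields shown above; `parse_error` is an illustrative name.

```python
import json

def parse_error(status_code, body_text):
    """Extract message, type, and code from an error response body."""
    try:
        payload = json.loads(body_text)
    except ValueError:
        payload = None
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        # Not the documented shape (for example an HTML gateway page).
        return {"status": status_code, "message": body_text, "type": None, "code": None}
    return {
        "status": status_code,
        "message": error.get("message"),
        "type": error.get("type"),
        "code": error.get("code"),
    }
```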

Examples

cURL Examples

Basic Request

curl -X POST "https://api.4allapi.com/v1/audio/speech" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-tts",
    "input": "Hello, this is a speech synthesis test",
    "voice": "Cherry"
  }' \
  --output audio.wav

Premium Voice Request

curl -X POST "https://api.4allapi.com/v1/audio/speech" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "qwen-tts-latest",
    "input": "Welcome to the Tongyi Qianwen speech synthesis service, I’m Jada!",
    "voice": "Jada",
    "speed": 1.2
  }' \
  --output audio_jada.wav

JavaScript Example

async function generateSpeech(text, voice = 'Cherry') {
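  // Use the full API URL; a relative path would resolve against the current page's origin
  const response = await fetch('https://api.4allapi.com/v1/audio/speech', {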
    method: 'POST',
    headers: {
      'Authorization': 'Bearer YOUR_API_KEY',
      'Content-Type': 'application/json'
    },
    body: JSON.stringify({
      model: 'qwen-tts-latest',
      input: text,
      voice: voice
    })
  });

  if (response.ok) {
    const audioBlob = await response.blob();
    const audioUrl = URL.createObjectURL(audioBlob);

    // Play audio
    const audio = new Audio(audioUrl);
    audio.play();

    return audioUrl;
  } else {
    const error = await response.json();
    throw new Error(error.error.message);
  }
}

// Usage example
generateSpeech('Hello, world!', 'Serena');

Python Example

import requests

def generate_speech(text, voice='Cherry', model='qwen-tts-latest'):
    url = 'https://api.4allapi.com/v1/audio/speech'
    headers = {
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
    }
    data = {
        'model': model,
        'input': text,
        'voice': voice
    }

    # Set a timeout so a stalled connection cannot block forever
    response = requests.post(url, headers=headers, json=data, timeout=60)

    if response.status_code == 200:
        return response.content
    else:
        raise Exception(f"API Error: {response.status_code} - {response.text}")

# Usage example
audio_content = generate_speech('Hello, this is a Python call example!', 'Ethan')

# Save the audio file
with open('output.wav', 'wb') as f:
    f.write(audio_content)

Node.js Example

const fs = require('fs');
// Uses the global fetch built into Node.js 18+; no node-fetch dependency is needed.

async function generateSpeech(text, voice = 'Cherry') {
  try {
    const response = await fetch('https://api.4allapi.com/v1/audio/speech', {
      method: 'POST',
      headers: {
        'Authorization': 'Bearer YOUR_API_KEY',
        'Content-Type': 'application/json'
      },
      body: JSON.stringify({
        model: 'qwen-tts-latest',
        input: text,
        voice: voice
      })
    });

    if (response.ok) {
      // Convert the binary response body into a Buffer before writing it to disk
      const buffer = Buffer.from(await response.arrayBuffer());
      fs.writeFileSync('audio.wav', buffer);
      console.log('Audio file saved as audio.wav');
    } else {
      const error = await response.json();
      console.error('API error:', error);
    }
  } catch (error) {
    console.error('Request failed:', error);
  }
}

// Usage example
generateSpeech('Welcome to Node.js speech synthesis!', 'Dylan');

Streaming Response (Coming Soon)

For long text, a streaming response is intended to give a faster time to first byte. The request is expected to look like this:

curl -X POST "https://api.4allapi.com/v1/audio/speech" \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -H "X-DashScope-SSE: enable" \
  -d '{
    "model": "qwen-tts",
    "input": "This is a longer piece of text; using a streaming response can provide a better experience...",
    "voice": "Chelsie"
  }' \
  --no-buffer

Limitations

Text Limits

Maximum text length per request: 512 tokens
Supported languages: Chinese, English, and mixed Chinese-English text
Special characters are handled automatically

Request Limits

Rate limits: Based on your subscription plan
Concurrency limits: Multiple concurrent requests are supported by default
File size: The generated audio file size depends on text length

Audio Specifications

Maximum audio duration: About 5 minutes (depending on text length)
Audio quality: 24kHz, 16-bit PCM
Output format: WAV
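Because every response uses the same WAV specification, audio from several requests can be joined by copying the sample frames into one file. This is a sketch using the standard `wave` module; `concat_wav` is an illustrative name, and it assumes each segment is a complete WAV file in memory.

```python
import io
import wave

def concat_wav(segments):
    """Join WAV byte strings that share the same channels, width, and rate."""
    if not segments:
        raise ValueError("no segments to join")
    output = io.BytesIO()
    params = None
    with wave.open(output, "wb") as out:
        for segment in segments:
            with wave.open(io.BytesIO(segment), "rb") as wav:
                current = (wav.getnchannels(), wav.getsampwidth(), wav.getframerate())
                if params is None:
                    # The first segment defines the output format.
                    params = current
                    out.setnchannels(current[0])
                    out.setsampwidth(current[1])
                    out.setframerate(current[2])
                elif current != params:
                    raise ValueError("segments have different audio formats")
                out.writeframes(wav.readframes(wav.getnframes()))
    return output.getvalue()
```

Concatenating the raw response bytes directly would not work, since each file carries its own RIFF header.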

Error Handling

Common Error Codes

Error Code	Description	Solution
`invalid_api_key`	Invalid API key	Check the API key in the Authorization header
`model_not_found`	Model does not exist	Make sure you are using the correct qwen-tts model name
`invalid_voice`	Unsupported voice	Check whether the voice parameter is in the supported list
`text_too_long`	Text too long	Reduce the input text to within 512 tokens
`quota_exceeded`	Insufficient quota	Check your account balance or request rate limits

Troubleshooting

Empty or corrupted audio file

Check whether the API key is valid
Make sure the channel configuration is correct
Verify the model name and voice parameter

Request timeout

Check your network connection
Reduce the text length
Retry the request
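The retry advice above could be implemented with exponential backoff, as in this sketch. `with_retries`, the attempt count, and the delays are illustrative choices rather than platform recommendations.

```python
import time

def with_retries(call, attempts=3, base_delay=1.0,
                 retry_on=(TimeoutError, ConnectionError), sleep=time.sleep):
    """Run call(), retrying on transient errors with exponential backoff."""
    for attempt in range(attempts):
        try:
            return call()
        except retry_on:
            # Give up after the final attempt and surface the original error.
            if attempt == attempts - 1:
                raise
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** attempt)
```

When the request is made with the `requests` library, pass `retry_on=(requests.exceptions.Timeout, requests.exceptions.ConnectionError)`, since those exceptions do not derive from the built-in `TimeoutError`.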

Voice does not take effect

Confirm that the model version being used supports the voice
Check the case of the voice parameter

Pricing

The qwen-tts model family is billed by character count:

Billing unit: Calculated based on input character count
Billing method: Prepaid model, deducted from account balance
Price: See the pricing configuration in the admin console

Best Practices

Text Optimization

Punctuation: Using punctuation properly can improve speech rhythm
Numbers: It is recommended to write numbers in Chinese form (for example: 123 → 一百二十三)
English words: In mixed Chinese-English text, English words will be pronounced according to Chinese speech rules
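The number recommendation can be automated for simple cases. The sketch below spells out whole numbers from 0 to 9999 in Chinese and rewrites digit runs in a sentence; both function names are illustrative, and decimals, dates, phone numbers, and larger values are deliberately left for separate handling.

```python
import re

DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]

def to_chinese(n):
    """Spell out an integer in the range 0-9999 using Chinese numerals."""
    if not 0 <= n <= 9999:
        raise ValueError("only 0-9999 is supported in this sketch")
    if n == 0:
        return "零"
    digits = [int(c) for c in str(n)]
    parts, pending_zero = [], False
    for i, d in enumerate(digits):
        position = len(digits) - 1 - i
        if d == 0:
            pending_zero = True
            continue
        # A run of interior zeros is read as a single 零.
        if pending_zero and parts:
            parts.append("零")
        pending_zero = False
        parts.append(DIGITS[d] + UNITS[position])
    result = "".join(parts)
    # 10-19 are conventionally read 十, 十一, ... rather than 一十.
    return result[1:] if result.startswith("一十") else result

def spell_out_numbers(text):
    """Replace runs of 1-4 digits (without leading zeros) with Chinese numerals."""
    def replace(match):
        run = match.group()
        if len(run) > 4 or (len(run) > 1 and run[0] == "0"):
            return run  # leave long numbers and codes such as 007 unchanged
        return to_chinese(int(run))
    return re.sub(r"[0-9]+", replace, text)
```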

Voice Selection

Scenario matching: Choose the right voice based on the content type
Consistency: For the same application, it is recommended to use a consistent voice
Testing: Test different voices first before making a final choice

Performance Optimization

Caching: Cache audio files for repeated text
Chunking: For long text, process it in segments
Concurrency: Control the number of concurrent requests appropriately
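The caching suggestion can be sketched as a file cache keyed by a hash of everything that affects the audio. The `synthesize` callable stands in for whatever function actually calls the API, and the names and on-disk layout here are assumptions for illustration.

```python
import hashlib
import json
from pathlib import Path

def cache_key(model, voice, text, speed=1.0):
    """Derive a stable key from every parameter that changes the audio."""
    raw = json.dumps([model, voice, speed, text], ensure_ascii=False)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()

def cached_synthesize(synthesize, cache_dir, model, voice, text, speed=1.0):
    """Return cached audio if present; otherwise call synthesize and store it."""
    path = Path(cache_dir) / f"{cache_key(model, voice, text, speed)}.wav"
    if path.exists():
        return path.read_bytes()
    audio = synthesize(model=model, voice=voice, text=text, speed=speed)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio
```

Since billing is based on input characters, a cache hit for repeated text avoids both the request latency and the charge.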

Changelog

v1.0.0 (2025-08-29)

✅ Added support for the qwen-tts model family
✅ Supports 7 different voices
✅ Compatible with OpenAI standard API format
✅ Supports both streaming and non-streaming responses
✅ Complete error handling mechanism
✅ Multilingual SDK examples

Technical Support

If you run into any issues while using the service, please:

Check the troubleshooting section in this document
Review the error message in the API response
Contact the technical support team

Note: This API is fully compatible with OpenAI’s /v1/audio/speech interface specification, so it can directly replace existing OpenAI TTS calls.

4All API · One-stop AI large model API aggregation platform | Pricing | Contact Us
