TTS Speech Synthesis
Speech Synthesis API Documentation
Overview
The Speech Synthesis (Text-to-Speech, TTS) API converts text into natural, fluent speech. It is compatible with the OpenAI speech API format and also supports the Tongyi Qianwen qwen-tts model family, which provides high-quality Chinese and English speech synthesis.
Endpoint Information
- Endpoint: /v1/audio/speech
- Method: POST
- Content Type: application/json
- Authentication: Bearer Token
Supported Models
Qwen-TTS Model Family
This system fully supports the Tongyi Qianwen qwen-tts model family:
| Model Name | Description | Features |
|---|---|---|
| qwen-tts | Basic version | Standard audio quality, suitable for general use |
| qwen-tts-latest | Latest version | Better audio quality, supports more voices |
| qwen-tts-2025-05-22 | Specific version | Stable release, suitable for production environments |
Supported Voices
General Voices (Supported by All Versions)
| Voice Code | Voice Name | Gender | Features |
|---|---|---|---|
| Cherry | Sweet female voice | Female | Sweet and pleasant, suitable for warm, friendly scenarios |
| Serena | Gentle female voice | Female | Soft and gentle, suitable for professional announcements |
| Ethan | Steady male voice | Male | Calm and composed, suitable for business scenarios |
| Chelsie | Lively female voice | Female | Energetic and lively, suitable for youthful content |
Premium Voices (Supported by qwen-tts-latest and qwen-tts-2025-05-22)
| Voice Code | Voice Name | Gender | Features |
|---|---|---|---|
| Dylan | Beijing dialect | Male | Youthful and energetic, suitable for trendy content |
| Jada | Wu dialect | Female | Intelligent and elegant, suitable for educational content |
| Sunny | Sichuan dialect | Female | Bright and cheerful, suitable for children’s content |
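
The two voice tables can be encoded as a small client-side lookup so an unsupported model and voice pair is caught before a request is sent. This is a minimal sketch: the helper names `voices_for_model` and `check_voice` are hypothetical, and it covers only the qwen-tts models and voices listed above.

```python
# Voice availability as listed in the tables above.
GENERAL_VOICES = {"Cherry", "Serena", "Ethan", "Chelsie"}
PREMIUM_VOICES = {"Dylan", "Jada", "Sunny"}
PREMIUM_MODELS = {"qwen-tts-latest", "qwen-tts-2025-05-22"}
ALL_MODELS = {"qwen-tts"} | PREMIUM_MODELS


def voices_for_model(model: str) -> set[str]:
    """Return the voice codes the given qwen-tts model is documented to support."""
    if model not in ALL_MODELS:
        raise ValueError(f"unknown qwen-tts model: {model!r}")
    if model in PREMIUM_MODELS:
        return GENERAL_VOICES | PREMIUM_VOICES
    return set(GENERAL_VOICES)


def check_voice(model: str, voice: str) -> None:
    """Raise ValueError if the voice is not listed for the model (case-sensitive)."""
    if voice not in voices_for_model(model):
        raise ValueError(f"voice {voice!r} is not listed for model {model!r}")
```

For example, `check_voice("qwen-tts", "Jada")` raises, because Jada appears only in the premium table.
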
Request Format
Basic Request

    {
      "model": "qwen-tts",
      "input": "Hello, welcome to the speech synthesis service!",
      "voice": "Cherry"
    }

Full Request Parameters

    {
      "model": "qwen-tts-latest",
      "input": "This is a piece of text that needs to be converted into speech. It supports Chinese, English, and mixed Chinese-English input.",
      "voice": "Serena",
      "speed": 1.0,
      "response_format": "wav"
    }

Request Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The TTS model to use; supports OpenAI TTS models and the qwen-tts family |
| input | string | Yes | Text to convert into speech, up to 512 tokens |
| voice | string | Yes | Voice selection; see the supported voice list |
| speed | number | No | Speech speed, range 0.25-4.0, default 1.0 |
| response_format | string | No | Audio format; currently only wav is supported |
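
The parameter table can be turned into a small request builder that applies the documented defaults and rejects out-of-range values locally. This is a sketch under stated assumptions: `build_speech_payload` is a hypothetical helper name, rejecting empty input is an assumption (the table only says the field is required), and the 512-token limit is not checked because the tokenizer is not specified here.

```python
def build_speech_payload(
    model: str,
    text: str,
    voice: str,
    speed: float = 1.0,
    response_format: str = "wav",
) -> dict:
    """Build a JSON-serializable body for POST /v1/audio/speech."""
    if not model:
        raise ValueError("model is required")
    if not text:
        # Assumption: an empty string is treated as a missing required field.
        raise ValueError("input is required")
    if not voice:
        raise ValueError("voice is required")
    if not 0.25 <= speed <= 4.0:
        raise ValueError("speed must be between 0.25 and 4.0")
    if response_format != "wav":
        raise ValueError("only the wav format is currently supported")
    return {
        "model": model,
        "input": text,
        "voice": voice,
        "speed": speed,
        "response_format": response_format,
    }
```

The returned dictionary can be passed as the JSON body of any HTTP client.
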
Response Format
Successful Response
The API returns the audio file content directly, with the following response headers:

    Content-Type: audio/wav
    Content-Disposition: attachment; filename="audio.wav"

Audio format specifications:
- Format: WAV (RIFF)
- Encoding: 16-bit PCM
- Channels: Mono
- Sample Rate: 24000 Hz
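
One way to confirm that a response body matches these specifications is to read its header with Python's standard `wave` module. The sketch below is illustrative only; `describe_wav` and `matches_documented_format` are hypothetical helper names, and the input is assumed to be the complete response body.

```python
import io
import wave

# Values taken from the audio format specifications above.
EXPECTED = {"channels": 1, "sample_width_bytes": 2, "sample_rate_hz": 24000}


def describe_wav(data: bytes) -> dict:
    """Read basic properties from WAV bytes; raises wave.Error if not a WAV file."""
    with wave.open(io.BytesIO(data), "rb") as wav:
        frames = wav.getnframes()
        rate = wav.getframerate()
        return {
            "channels": wav.getnchannels(),
            "sample_width_bytes": wav.getsampwidth(),
            "sample_rate_hz": rate,
            "duration_s": frames / rate,
        }


def matches_documented_format(data: bytes) -> bool:
    """True if the audio is mono, 16-bit, 24000 Hz as documented."""
    info = describe_wav(data)
    return all(info[key] == value for key, value in EXPECTED.items())
```
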
Error Response

    {
      "error": {
        "message": "Error description",
        "type": "invalid_request_error",
        "code": "error_code"
      }
    }

Examples
cURL Examples
Basic Request

    curl -X POST "https://api.4allapi.com/v1/audio/speech" \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "qwen-tts",
        "input": "Hello, this is a speech synthesis test",
        "voice": "Cherry"
      }' \
      --output audio.wav

Premium Voice Request

    curl -X POST "https://api.4allapi.com/v1/audio/speech" \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -d '{
        "model": "qwen-tts-latest",
        "input": "Welcome to the Tongyi Qianwen speech synthesis service, I’m Jada!",
        "voice": "Jada",
        "speed": 1.2
      }' \
      --output audio_jada.wav

JavaScript Example

    async function generateSpeech(text, voice = 'Cherry') {
      const response = await fetch('/v1/audio/speech', {
        method: 'POST',
        headers: {
          'Authorization': 'Bearer YOUR_API_KEY',
          'Content-Type': 'application/json'
        },
        body: JSON.stringify({
          model: 'qwen-tts-latest',
          input: text,
          voice: voice
        })
      });

      if (response.ok) {
        const audioBlob = await response.blob();
        const audioUrl = URL.createObjectURL(audioBlob);

        // Play audio
        const audio = new Audio(audioUrl);
        audio.play();

        return audioUrl;
      } else {
        const error = await response.json();
        throw new Error(error.error.message);
      }
    }

    // Usage example
    generateSpeech('Hello, world!', 'Serena');

Python Example

    import requests

    def generate_speech(text, voice='Cherry', model='qwen-tts-latest'):
        """Request synthesized speech and return the raw WAV bytes."""
        url = 'https://api.4allapi.com/v1/audio/speech'
        headers = {
            'Authorization': 'Bearer YOUR_API_KEY',
            'Content-Type': 'application/json'
        }
        data = {
            'model': model,
            'input': text,
            'voice': voice
        }

        response = requests.post(url, headers=headers, json=data)

        if response.status_code == 200:
            return response.content
        else:
            raise Exception(f"API Error: {response.status_code} - {response.text}")

    # Usage example
    audio_content = generate_speech('Hello, this is a Python call example!', 'Ethan')

    # Save the audio file
    with open('output.wav', 'wb') as f:
        f.write(audio_content)

Node.js Example

    const fs = require('fs');
    // require() works with node-fetch v2 (CommonJS); v3 is ESM-only.
    const fetch = require('node-fetch');

    async function generateSpeech(text, voice = 'Cherry') {
      try {
        const response = await fetch('https://api.4allapi.com/v1/audio/speech', {
          method: 'POST',
          headers: {
            'Authorization': 'Bearer YOUR_API_KEY',
            'Content-Type': 'application/json'
          },
          body: JSON.stringify({
            model: 'qwen-tts-latest',
            input: text,
            voice: voice
          })
        });

        if (response.ok) {
          // arrayBuffer() is available in every fetch implementation, unlike buffer().
          const buffer = Buffer.from(await response.arrayBuffer());
          fs.writeFileSync('audio.wav', buffer);
          console.log('Audio file saved as audio.wav');
        } else {
          const error = await response.json();
          console.error('API error:', error);
        }
      } catch (error) {
        console.error('Request failed:', error);
      }
    }

    // Usage example
    generateSpeech('Welcome to Node.js speech synthesis!', 'Dylan');

Streaming Response (Coming Soon)
For long text, you can use a streaming response to get a faster time to first byte:

    curl -X POST "https://api.4allapi.com/v1/audio/speech" \
      -H "Authorization: Bearer YOUR_API_KEY" \
      -H "Content-Type: application/json" \
      -H "X-DashScope-SSE: enable" \
      -d '{
        "model": "qwen-tts",
        "input": "This is a longer piece of text; using a streaming response can provide a better experience...",
        "voice": "Chelsie"
      }' \
      --no-buffer

Limitations
Text Limits
- Maximum text length per request: 512 tokens
- Supported languages: Chinese, English, and mixed Chinese-English text
- Special characters are handled automatically
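
Text that exceeds the per-request limit has to be split before it is sent. The sketch below splits at Chinese and English sentence-ending punctuation and packs sentences into chunks. It rests on a loud assumption: the limit is stated in tokens, but the tokenizer is not documented here, so a character budget (`max_chars`, default 300, a hypothetical value) is used as a conservative stand-in. `split_text` is a hypothetical helper name.

```python
import re

# A sentence is any run of text ending in sentence punctuation (plus trailing
# whitespace), or a final run with no punctuation at all.
_SENTENCE = re.compile(r"[^。！？.!?]*[。！？.!?]+\s*|[^。！？.!?]+")


def split_text(text: str, max_chars: int = 300) -> list[str]:
    """Split text into chunks of at most max_chars, preferring sentence boundaries."""
    if max_chars <= 0:
        raise ValueError("max_chars must be positive")
    chunks: list[str] = []
    current = ""
    for sentence in _SENTENCE.findall(text):
        # A single sentence longer than the budget is hard-split.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        if len(current) + len(sentence) > max_chars:
            chunks.append(current)
            current = ""
        current += sentence
    if current:
        chunks.append(current)
    # Drop whitespace-only chunks and trim the rest.
    return [chunk.strip() for chunk in chunks if chunk.strip()]
```

Each chunk can then be sent as its own request and the resulting audio played or concatenated in order. The pattern also splits at decimal points, which is acceptable for chunking but worth knowing.
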
Request Limits
- Rate limits: Based on your subscription plan
- Concurrency limits: Multiple concurrent requests are supported by default
- File size: The generated audio file size depends on text length
Audio Specifications
- Maximum audio duration: About 5 minutes (depending on text length)
- Audio quality: 24kHz, 16-bit PCM
- Output format: WAV
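
These specifications determine the output size: 24,000 samples per second at 2 bytes per sample on one channel is 48,000 bytes per second, so five minutes of audio is roughly 14.4 MB. The sketch below works that out; the 44-byte header is the size of a canonical PCM WAV header and is an assumption about the files this service returns, and the function names are hypothetical.

```python
SAMPLE_RATE_HZ = 24_000   # from the audio specifications
BYTES_PER_SAMPLE = 2      # 16-bit PCM
CHANNELS = 1              # mono
WAV_HEADER_BYTES = 44     # assumption: canonical PCM WAV header, no extra chunks


def pcm_bytes_per_second() -> int:
    """Raw audio data rate implied by the documented format."""
    return SAMPLE_RATE_HZ * BYTES_PER_SAMPLE * CHANNELS


def estimated_wav_size(duration_s: float) -> int:
    """Approximate file size in bytes for audio of the given duration."""
    return WAV_HEADER_BYTES + round(duration_s * pcm_bytes_per_second())
```
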
Error Handling
Common Error Codes
| Error Code | Description | Solution |
|---|---|---|
| invalid_api_key | Invalid API key | Check the API key in the Authorization header |
| model_not_found | Model does not exist | Make sure you are using the correct qwen-tts model name |
| invalid_voice | Unsupported voice | Check whether the voice parameter is in the supported list |
| text_too_long | Text too long | Reduce the input text to within 512 tokens |
| quota_exceeded | Insufficient quota | Check your account balance or request rate limits |
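
A client can read the `code` field of the error body and look up the matching solution from this table. The following is a minimal sketch, assuming the error body has the shape shown under Error Response; `parse_error` is a hypothetical helper, and the fallback hint for unknown codes is a choice made for this example.

```python
import json

# Solutions copied from the error code table above.
SOLUTIONS = {
    "invalid_api_key": "Check the API key in the Authorization header",
    "model_not_found": "Make sure you are using the correct qwen-tts model name",
    "invalid_voice": "Check whether the voice parameter is in the supported list",
    "text_too_long": "Reduce the input text to within 512 tokens",
    "quota_exceeded": "Check your account balance or request rate limits",
}
FALLBACK_HINT = "Contact the technical support team"


def parse_error(body) -> dict:
    """Extract code, type, message, and a hint from an error response body."""
    try:
        payload = json.loads(body)
    except ValueError:
        payload = None  # body was not valid JSON
    error = payload.get("error") if isinstance(payload, dict) else None
    if not isinstance(error, dict):
        error = {}
    code = error.get("code")
    hint = SOLUTIONS.get(code) if isinstance(code, str) else None
    return {
        "code": code,
        "type": error.get("type"),
        "message": error.get("message"),
        "hint": hint or FALLBACK_HINT,
    }
```
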
Troubleshooting
- Empty or corrupted audio file
  - Check whether the API key is valid
  - Make sure the channel configuration is correct
  - Verify the model name and voice parameter
- Request timeout
  - Check your network connection
  - Reduce the text length
  - Retry the request
- Voice does not take effect
  - Confirm that the model version being used supports the voice
  - Check the case of the voice parameter
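
The "retry the request" step for timeouts can be automated with a small retry wrapper. This is a sketch, not a prescribed policy: `call_with_retries` is a hypothetical helper, and the attempt count and exponential backoff delays are illustrative defaults. The request itself is passed in as a callable so the wrapper works with any HTTP client.

```python
import time


def call_with_retries(
    request_fn,
    attempts: int = 3,
    base_delay_s: float = 1.0,
    retry_on: tuple = (TimeoutError, ConnectionError),
    sleep=time.sleep,
):
    """Call request_fn, retrying on the given exceptions with exponential backoff."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return request_fn()
        except retry_on:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay_s * 2 ** attempt)
```

With the `requests` library, pass `retry_on=(requests.Timeout, requests.ConnectionError)`, since those exceptions do not derive from the built-in `TimeoutError` and `ConnectionError`.
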
Pricing
The qwen-tts model family is billed by character count:
- Billing unit: Calculated based on input character count
- Billing method: Prepaid model, deducted from account balance
- Price: See the pricing configuration in the admin console
Best Practices
Text Optimization
- Punctuation: Using punctuation properly can improve speech rhythm
- Numbers: It is recommended to write numbers in Chinese form (for example: 123→一百二十三)
- English words: In mixed Chinese-English text, English words will be pronounced according to Chinese speech rules
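
The number recommendation can be applied automatically before text is submitted. The sketch below converts standalone integers from 0 to 9999 into Chinese numerals and leaves everything else untouched (larger numbers, decimals, and numbers with leading zeros). Both function names are hypothetical, and the range limit is a simplification chosen for this example.

```python
import re

DIGITS = "零一二三四五六七八九"
UNITS = ["", "十", "百", "千"]

# Integers that are not part of a decimal such as 3.14.
_INTEGER = re.compile(r"(?<![0-9.])[0-9]+(?!\.?[0-9])")


def to_chinese(n: int) -> str:
    """Convert an integer in 0..9999 to Chinese numerals."""
    if not 0 <= n <= 9999:
        raise ValueError("only 0..9999 is supported in this sketch")
    if n == 0:
        return "零"
    digits = [int(ch) for ch in str(n)]
    parts = []
    pending_zero = False
    for i, d in enumerate(digits):
        if d == 0:
            pending_zero = True  # emit a single 零 only if a nonzero digit follows
            continue
        if pending_zero:
            parts.append("零")
            pending_zero = False
        parts.append(DIGITS[d] + UNITS[len(digits) - 1 - i])
    result = "".join(parts)
    # 10..19 are conventionally read 十, 十一, ... rather than 一十.
    return result[1:] if result.startswith("一十") else result


def spell_out_numbers(text: str) -> str:
    """Replace standalone integers of up to four digits with Chinese numerals."""
    def replace(match: re.Match) -> str:
        token = match.group()
        if len(token) > 4 or (len(token) > 1 and token.startswith("0")):
            return token  # out of range or has leading zeros: leave unchanged
        return to_chinese(int(token))
    return _INTEGER.sub(replace, text)
```
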
Voice Selection
- Scenario matching: Choose the right voice based on the content type
- Consistency: For the same application, it is recommended to use a consistent voice
- Testing: Test different voices first before making a final choice
Performance Optimization
- Caching: Cache audio files for repeated text
- Chunking: For long text, process it in segments
- Concurrency: Control the number of concurrent requests appropriately
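
Caching and bounded concurrency can be sketched as follows. The cache key hashes every parameter that affects the audio, so the same text with a different voice or speed is stored separately. All names here are hypothetical, the on-disk layout is an assumption, and the actual API call is injected as a callable so the helpers stay independent of any HTTP client.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def cache_key(model: str, voice: str, speed: float, text: str) -> str:
    """Stable key covering every input that changes the synthesized audio."""
    raw = "\x1f".join([model, voice, f"{speed:.2f}", text])
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def cached_speech(cache_dir: Path, model: str, voice: str, speed: float,
                  text: str, synthesize) -> bytes:
    """Return cached audio if present; otherwise call synthesize and store the result."""
    path = cache_dir / f"{cache_key(model, voice, speed, text)}.wav"
    if path.exists():
        return path.read_bytes()
    audio = synthesize(model, voice, speed, text)
    cache_dir.mkdir(parents=True, exist_ok=True)
    path.write_bytes(audio)
    return audio


def synthesize_many(texts, worker, max_workers: int = 4) -> list:
    """Run worker over texts with at most max_workers requests in flight, keeping order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(worker, texts))
```

A suitable `max_workers` depends on the rate limits of your subscription plan.
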
Changelog
v1.0.0 (2025-08-29)
- ✅ Added support for the qwen-tts model family
- ✅ Supports 7 different voices
- ✅ Compatible with OpenAI standard API format
- ✅ Supports both streaming and non-streaming responses
- ✅ Complete error handling mechanism
- ✅ Multilingual SDK examples
Technical Support
If you run into any issues while using the service, please:
- Check the troubleshooting section in this document
- Review the error message in the API response
- Contact the technical support team
Note: This API is fully compatible with OpenAI’s /v1/audio/speech interface specification, so it can directly replace existing OpenAI TTS calls.
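
Given that compatibility, existing code built on the official `openai` Python package should only need a different base URL. The sketch below is untested against this service and makes several assumptions: the base URL is inferred from the cURL examples, the function name is hypothetical, and the service is assumed to accept the qwen-tts voice codes through the SDK's `voice` argument.

```python
def synthesize_with_openai_sdk(
    text: str,
    voice: str = "Cherry",
    model: str = "qwen-tts",
    api_key: str = "YOUR_API_KEY",
) -> bytes:
    """Call /v1/audio/speech through the OpenAI Python SDK and return WAV bytes."""
    # Imported here so this module loads even without the optional `openai` package.
    from openai import OpenAI

    # Assumption: the SDK appends /audio/speech to this base URL.
    client = OpenAI(api_key=api_key, base_url="https://api.4allapi.com/v1")
    response = client.audio.speech.create(
        model=model,
        voice=voice,
        input=text,
        response_format="wav",
    )
    return response.read()
```
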
© 2025 4All API. All rights reserved.