Text-to-Speech
OpenAPI Specification
yaml
openapi: 3.0.1
info:
title: ''
description: ''
version: 1.0.0
paths:
/ent/v2/audio-tts:
post:
summary: Text-to-Speech
deprecated: false
description: 'Official documentation: https://platform.vidu.cn/docs/text-to-speech'
tags:
- Official VIDU video, image, and audio generation
parameters:
- name: Authorization
in: header
description: ''
required: false
example: Bearer {{YOUR_API_KEY}}
schema:
type: string
- name: Content-Type
in: header
description: ''
required: false
example: application/json
schema:
type: string
requestBody:
content:
application/json:
schema:
type: object
properties:
text:
type: string
description: >-
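Text to synthesize into speech.
1. Length must be less than 10000 characters.
2. Mark paragraph breaks with newline characters.
3. Pause control: custom pauses can be placed between spans of text to control the silence between them.
- Usage: insert a <#x#> marker in the text, where x is the pause duration in seconds, in the range [0.01,
99.99] with at most two decimal places. A marker must sit between two spans of pronounceable text, and markers must not be used consecutively.
- Example: 你好<#2#>我是vidu<#2#>很高兴见到你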
voice_setting_voice_id:
type: string
description: >-
ID of the voice used for the synthesized audio.
See the voice list for all available voices: https://shengshu.feishu.cn/sheets/EgFvs6DShhiEBStmjzccr5gonOg
voice_setting_speed:
type: string
description: |-
Speech rate, default 1.0.
1.0 is normal speed. Range [0.5, 2]: 0.5 is the slowest and 2 is the fastest.
voice_setting_volume:
type: string
description: |-
Volume.
Range [0, 10], default 0, which means normal volume. Larger values are louder.
voice_setting_pitch:
type: string
description: |-
Pitch of the synthesized audio.
Range [-12, 12], default 0, where 0 outputs the voice at its original pitch.
voice_setting_emotion:
type: string
description: >-
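Controls the emotion of the synthesized speech.
1. Allowed values: ["happy", "sad", "angry", "fearful", "disgusted",
"surprised", "calm"], corresponding to seven emotions: happy, sad, angry, fearful, disgusted, surprised, and neutral.
2. The model picks a suitable emotion from the input text automatically, so this usually does not need to be set by hand.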
pronunciation_dict_tone:
type: string
description: >-
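Defines pronunciations for polyphonic characters.
- Specifies the phonetic annotation or pronunciation replacement rule for characters or symbols that need special handling. For polyphonic characters in Chinese text, tones are written as digits: 1 for the first tone,
2 for the second tone, 3 for the third tone, 4 for the fourth tone, and 5 for the neutral tone.
- Example:
["燕少飞/(yan4)(shao3)(fei1)", "达菲/(da2)(fei1)", "omg/oh my
god"]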
payload:
type: string
description: |-
Pass-through parameter.
It is not processed in any way and is only transmitted as data.
Note: at most 1048576 characters.
required:
- text
- voice_setting_voice_id
x-apifox-orders:
- text
- voice_setting_voice_id
- voice_setting_speed
- voice_setting_volume
- voice_setting_pitch
- voice_setting_emotion
- pronunciation_dict_tone
- payload
example:
text: 人工智能正在改变我们的生活方式,从智能家居到自动驾驶,技术的进步让世界变得更加便利。
voice_setting_voice_id: male-qn-daxuesheng
responses:
'200':
description: ''
content:
application/json:
schema:
type: object
properties:
task_id:
type: string
state:
type: string
model:
type: string
prompt:
type: string
duration:
type: integer
seed:
type: integer
created_at:
type: string
credits:
type: integer
required:
- task_id
- state
- model
- prompt
- duration
- seed
- created_at
- credits
x-apifox-orders:
- task_id
- state
- model
- prompt
- duration
- seed
- created_at
- credits
example:
task_id: '911094612548939776'
state: created
model: audio1.0
prompt: 雨滴落在窗户上的声音,伴随着轻柔的雷声
duration: 5
seed: 0
created_at: '2026-01-20T07:16:38.094635957Z'
credits: 10
headers: {}
x-apifox-name: Success
security: []
x-apifox-folder: Official VIDU video, image, and audio generation
x-apifox-status: released
x-run-in-apifox: https://app.apifox.com/web/project/5443236/apis/api-407989987-run
components:
schemas: {}
securitySchemes:
bearer:
type: http
scheme: bearer
servers:
- url: https://www.anyapi.vip
description: Production environment
security:
- bearer: []
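
The request rules in the spec above (text length, pause markers, parameter ranges, bearer token) can be checked on the client before a call is made. The following is a minimal Python sketch under stated assumptions, not an official client. The helper names (validate_text, build_tts_payload, build_request) are hypothetical, "pronounceable text" is approximated as any non-whitespace text, and optional numeric settings are serialized as strings only because the schema declares them as type string. The endpoint URL is assembled from the servers entry and the path in the spec.

```python
import json
import re
import urllib.request

# Assembled from the spec's server URL and path.
TTS_URL = "https://www.anyapi.vip/ent/v2/audio-tts"
EMOTIONS = {"happy", "sad", "angry", "fearful", "disgusted", "surprised", "calm"}
PAUSE_MARKER = re.compile(r"<#(.*?)#>")
PAUSE_VALUE = re.compile(r"\d{1,2}(\.\d{1,2})?")  # at most two decimal places


def validate_text(text):
    """Check the length limit and the <#x#> pause-marker rules from the spec."""
    if not text or len(text) >= 10000:
        raise ValueError("text must be non-empty and shorter than 10000 characters")
    previous_end = 0
    seen_marker = False
    for match in PAUSE_MARKER.finditer(text):
        value = match.group(1)
        if not PAUSE_VALUE.fullmatch(value) or not 0.01 <= float(value) <= 99.99:
            raise ValueError(f"invalid pause duration: {value!r}")
        # Assumption: any non-whitespace text counts as pronounceable.
        if not text[previous_end:match.start()].strip():
            raise ValueError("a pause marker must follow text, not another marker")
        previous_end = match.end()
        seen_marker = True
    if seen_marker and not text[previous_end:].strip():
        raise ValueError("a pause marker must be followed by text")


def _ranged(name, value, low, high):
    """Return value as a string after checking it lies in [low, high]."""
    if not low <= value <= high:
        raise ValueError(f"{name} must be in [{low}, {high}], got {value}")
    return str(value)


def build_tts_payload(text, voice_id, speed=None, volume=None, pitch=None, emotion=None):
    """Build the JSON body; only text and voice_setting_voice_id are required."""
    validate_text(text)
    payload = {"text": text, "voice_setting_voice_id": voice_id}
    if speed is not None:
        payload["voice_setting_speed"] = _ranged("speed", speed, 0.5, 2)
    if volume is not None:
        payload["voice_setting_volume"] = _ranged("volume", volume, 0, 10)
    if pitch is not None:
        payload["voice_setting_pitch"] = _ranged("pitch", pitch, -12, 12)
    if emotion is not None:
        if emotion not in EMOTIONS:
            raise ValueError(f"unknown emotion: {emotion!r}")
        payload["voice_setting_emotion"] = emotion
    return payload


def build_request(payload, api_key):
    """Prepare (but do not send) the POST request with the bearer token."""
    return urllib.request.Request(
        TTS_URL,
        data=json.dumps(payload, ensure_ascii=False).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    body = build_tts_payload("你好<#2#>我是vidu<#2#>很高兴见到你", "male-qn-daxuesheng", speed=1.5)
    print(json.dumps(body, ensure_ascii=False, indent=2))
```

Passing the prepared request to urllib.request.urlopen would send it. According to the response schema, the JSON reply carries fields such as task_id and state. How the finished audio is retrieved is not covered by this spec.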