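Audio Prompting
AI audio generation opens up new creative possibilities, from background music to voiceover narration to sound effects. This guide shows you how to write more effective prompts for audio models.
Audio Models in Armox
Music Generation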
| Model | Credits | Best For |
|---|---|---|
| Music 1.5 | 48 | Quick background music |
| Lyria 2 | 120 | High-quality songs and scores |
| MusicGen | 176 | Finer-grained control |
| Music-01 | 200 | More complex music generation |
Speech & Voice
| Model | Credits | Best For |
|---|---|---|
| Speech-02 Turbo | 8 | Quick narration drafts |
| XTTS-v2 | 20 | Voice cloning, multilingual speech |
| Speech-02 HD | 40 | High-quality speech |
| Voice Cloning | 100 | Cloning any voice |
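If you want to budget a project before generating anything, the credit figures in the two tables above can be added up with a small helper. This is only a sketch: it assumes the listed credits are charged once per generation, which the tables do not state explicitly, and `estimate_credits` is a hypothetical name, not part of Armox.

```python
# Credit costs copied from the music and speech tables above.
# Assumption: each figure is the cost of a single generation.
CREDITS = {
    "Music 1.5": 48,
    "Lyria 2": 120,
    "MusicGen": 176,
    "Music-01": 200,
    "Speech-02 Turbo": 8,
    "XTTS-v2": 20,
    "Speech-02 HD": 40,
    "Voice Cloning": 100,
}


def estimate_credits(runs: dict[str, int]) -> int:
    """Return total credits for a plan mapping model name -> number of generations."""
    unknown = [name for name in runs if name not in CREDITS]
    if unknown:
        raise KeyError(f"Unknown model(s): {unknown}")
    return sum(CREDITS[name] * count for name, count in runs.items())


# Example plan: three quick music drafts, one polished track, two narration takes.
plan = {"Music 1.5": 3, "Lyria 2": 1, "Speech-02 HD": 2}
print(estimate_credits(plan))  # 3*48 + 120 + 2*40 = 344
```

Drafting with the cheaper models first and switching to a higher-cost model only for the final take keeps the total down.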
Music Prompting Basics
A music prompt should describe:
- Genre/Style: the kind of music
- Mood/Emotion: the feeling or atmosphere
- Instruments: the instruments and sound elements
- Tempo: how fast or slow the music is
- Purpose: the use case or setting
Basic Structure
Prompt Template
[Genre] music, [mood], [instruments], [tempo], [purpose/context]
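If you assemble prompts in a script, the template above can be filled in with a few lines of Python. This is a minimal sketch: `build_music_prompt` is a hypothetical helper, not an Armox API, and it simply joins the five fields in template order, skipping any you leave empty.

```python
def build_music_prompt(
    genre: str,
    mood: str,
    instruments: str = "",
    tempo: str = "",
    purpose: str = "",
) -> str:
    """Fill in: [Genre] music, [mood], [instruments], [tempo], [purpose/context]."""
    fields = [f"{genre.strip()} music", mood, instruments, tempo, purpose]
    # Drop empty fields so a partial prompt has no stray commas.
    return ", ".join(field.strip() for field in fields if field.strip())


prompt = build_music_prompt(
    genre="Uplifting acoustic folk",
    mood="warm and hopeful",
    instruments="acoustic guitar and light percussion",
    tempo="moderate tempo",
    purpose="suitable for lifestyle brand video",
)
print(prompt)
```

The resulting string is what you would paste into the music model's prompt field.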
Genre Keywords
Describe the genre as specifically as you can:
Popular Genres
| Genre | Description |
|---|---|
| "Lo-fi hip hop" | Relaxed, study-friendly |
| "Cinematic orchestral" | Epic film score |
| "Upbeat pop" | Catchy, commercial |
| "Ambient electronic" | Atmospheric, background |
| "Acoustic folk" | Warm, organic |
| "Corporate" | Professional, business |
| "Epic trailer" | Trailer-style, building in layers |
| "Chill electronic" | Relaxed, modern electronic |
Fusion Styles
Combine genres to get a more distinctive sound:
- "Jazz-influenced lo-fi"
- "Orchestral with electronic elements"
- "Acoustic pop with indie vibes"
- "Cinematic ambient"
Mood and Emotion
Positive Moods
- "Uplifting and inspiring"
- "Happy and cheerful"
- "Energetic and exciting"
- "Warm and comforting"
- "Hopeful and optimistic"
Calm Moods
- "Peaceful and serene"
- "Relaxing and meditative"
- "Dreamy and ethereal"
- "Gentle and soothing"
- "Contemplative and reflective"
Dramatic Moods
- "Intense and powerful"
- "Mysterious and suspenseful"
- "Epic and triumphant"
- "Dark and moody"
- "Emotional and moving"
Instruments and Sounds
Acoustic Instruments
Piano, acoustic guitar, strings,
violin, cello, flute,
drums, bass, percussion
Electronic Sounds
Synthesizer, electronic beats,
bass drops, pads, arpeggios,
808 drums, ambient textures
Orchestral
Full orchestra, brass section,
string ensemble, timpani,
French horn, choir
Example with Instruments
Cinematic orchestral music with soaring strings,
powerful brass, and thundering timpani,
building to an epic climax,
movie trailer style
Tempo and Energy
Tempo Keywords
| Term | BPM Range | Feel |
|---|---|---|
| "Very slow" | 40-60 | Meditative, unhurried |
| "Slow" | 60-80 | Relaxed |
| "Moderate" | 80-100 | Walking pace |
| "Upbeat" | 100-120 | Lively |
| "Fast" | 120-140 | Exciting, fast-paced |
| "Very fast" | 140+ | Intense, tense |
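If you already have a target BPM in mind, you can translate it into one of the tempo keywords from the table rather than putting the raw number in the prompt. The sketch below makes two assumptions the table leaves open: a value that sits exactly on a boundary (for example 60) goes to the faster band, and anything below 40 BPM is still treated as "very slow". `tempo_keyword` is a hypothetical helper name.

```python
# (exclusive upper BPM bound, keyword) pairs taken from the table above.
# Assumption: a boundary value such as 60 belongs to the faster band.
TEMPO_BANDS = [
    (60, "very slow"),
    (80, "slow"),
    (100, "moderate"),
    (120, "upbeat"),
    (140, "fast"),
]


def tempo_keyword(bpm: float) -> str:
    """Map a BPM value to the tempo keyword used in prompts."""
    if bpm <= 0:
        raise ValueError("bpm must be positive")
    for upper_bound, keyword in TEMPO_BANDS:
        if bpm < upper_bound:
            return keyword
    return "very fast"  # 140 BPM and above


for bpm in (50, 95, 128, 160):
    print(bpm, "->", tempo_keyword(bpm))
```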
Energy Descriptions
- "Starts soft, builds gradually"
- "High energy throughout"
- "Ebb and flow dynamics"
- "Steady and consistent"
- "Explosive crescendo"
Music Prompt Templates
Background Music
Prompt Template
[Genre] background music, [mood], [tempo], suitable for [use case], [instruments], [duration note]
Example:
Lo-fi hip hop background music,
relaxing and focused, moderate tempo,
suitable for studying or working,
mellow beats with soft piano and vinyl crackle,
loopable
Video Soundtrack
Prompt Template
[Genre] music for [video type], [mood] feeling, [tempo], [instruments], syncs with [visual description]
Example:
Uplifting corporate music for product launch video,
inspiring and professional feeling, moderate upbeat tempo,
piano, light percussion, subtle strings,
builds energy toward the end
Podcast/YouTube Intro
Prompt Template
[Genre] intro music, [duration] seconds, [mood], [instruments], catchy and memorable, suitable for [content type]
Example:
Modern electronic intro music,
10-15 seconds, energetic and exciting,
punchy synths and driving beat,
catchy and memorable hook,
suitable for tech podcast
Speech and Voiceover Prompting
Text-to-Speech
For models such as Speech-02, your prompt is simply the text to be read aloud:
Prompt Template
[The actual words you want spoken]
Voice Characteristics
Some models let you specify voice characteristics:
| Trait | Options |
|---|---|
| Gender | Male, female, neutral |
| Age | Young, middle-aged, elderly |
| Tone | Professional, friendly, authoritative |
| Accent | American, British, Australian |
| Speed | Slow, normal, fast |
Speech Examples
Professional Narration:
Welcome to our quarterly report.
This presentation covers our key achievements
and strategic initiatives for the coming year.
Friendly Explainer:
Hey there! In this video, we're going to show you
exactly how to get started with our app.
It's super easy, I promise!
Dramatic Trailer:
In a world where technology has changed everything...
one company dares to reimagine the future.
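Voice Cloning
To clone a voice:
- Upload a clear audio sample (10-30 seconds)
- Connect it to the Voice Cloning node
- Provide the text you want spoken
- The AI generates speech in that voice
Best Practices for Voice Samples
- ✅ Clean, low-noise recording
- ✅ A single speaker
- ✅ Natural speaking pace
- ✅ 10-30 seconds of speech
- ❌ Background music or noise
- ❌ Multiple overlapping speakers
- ❌ Heavily processed audio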
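The 10-30 second guideline is the one item on this checklist that is easy to verify before you upload. The sketch below uses Python's standard `wave` module, so it assumes the sample is an uncompressed WAV file; the function names are hypothetical, and the check says nothing about noise, speaker count, or processing, which you still need to judge by ear.

```python
import wave

# Recommended sample length from the checklist above, in seconds.
MIN_SECONDS = 10.0
MAX_SECONDS = 30.0


def wav_duration_seconds(path: str) -> float:
    """Return the duration of a WAV file: frame count divided by sample rate."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def sample_length_ok(seconds: float) -> bool:
    """True if the duration falls inside the recommended 10-30 second window."""
    return MIN_SECONDS <= seconds <= MAX_SECONDS


# Usage, assuming a local file named voice_sample.wav:
# duration = wav_duration_seconds("voice_sample.wav")
# print(f"{duration:.1f}s", "OK" if sample_length_ok(duration) else "outside 10-30s")
```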
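Combining Audio with Video
Workflow
- Generate the video first
- Generate audio that matches the video's length
- Combine them in your video editing software
Matching Audio to Video
Things to consider:
- Video duration: the audio length should match
- Video mood: the atmosphere should be consistent
- Key moments: align beats with cuts in the footage
- Pacing: pair fast-paced video with higher-energy music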
Common Music Mistakes
❌ Too Vague
Nice music
✅ Better
Uplifting acoustic folk music,
warm and hopeful, moderate tempo,
acoustic guitar and light percussion,
suitable for lifestyle brand video
❌ Conflicting Descriptions
Sad and depressing but also happy and upbeat
✅ Better
Bittersweet and nostalgic,
melancholic melody with hopeful undertones
❌ Too Specific Technically
Music in C major at 120 BPM with
a I-IV-V-I chord progression
✅ Better
Upbeat pop music with a classic,
familiar chord progression,
catchy and singable
Use Case Examples
YouTube Video Background Music
Upbeat electronic background music,
energetic but not distracting,
moderate-fast tempo,
synth pads and light beats,
suitable for tech review video,
loopable for long videos
Meditation/Relaxation
Ambient meditation music,
deeply calming and peaceful,
very slow tempo,
soft pads, gentle bells, nature sounds,
suitable for yoga or sleep
Product Advertisement
Modern corporate music,
confident and innovative feeling,
moderate tempo building to upbeat,
clean synths and subtle percussion,
30 seconds, ends with resolution
Social Media Reel
Trendy pop music,
catchy and fun, fast tempo,
suitable for Instagram Reels,
15-30 seconds,
hook in first 3 seconds
Iteration Strategy
- Start simple: genre + mood + tempo
- Add instruments: specify the key sounds
- Refine energy: describe dynamics and build
- Match purpose: fit the use case
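The four steps above can be sketched as a function that returns one prompt per iteration, each extending the previous one with a single new layer. `iteration_prompts` is a hypothetical helper for illustration only; in practice you would listen to each result and adjust the wording before adding the next layer.

```python
def iteration_prompts(
    genre: str,
    mood: str,
    tempo: str,
    instruments: str,
    energy: str,
    purpose: str,
) -> list[str]:
    """Return four prompts: the simple base, then one added layer per step."""
    # Step 1: start simple with genre + mood + tempo.
    prompts = [f"{genre} music, {mood}, {tempo}"]
    # Steps 2-4: add instruments, then energy, then purpose.
    for layer in (instruments, energy, purpose):
        prompts.append(f"{prompts[-1]}, {layer}")
    return prompts


steps = iteration_prompts(
    genre="Modern corporate",
    mood="confident and innovative feeling",
    tempo="moderate tempo",
    instruments="clean synths and subtle percussion",
    energy="builds energy toward the end",
    purpose="suitable for product launch video",
)
for number, text in enumerate(steps, start=1):
    print(f"Step {number}: {text}")
```

Changing one layer at a time makes it easier to tell which part of the prompt caused a change in the output.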
Next Steps
- Audio Nodes: audio node setup
- Audio & Music Workflow: the complete audio production workflow
- Video Prompting: generating video to pair with your audio