Audio Prompting
AI audio generation opens up new creative possibilitiesβfrom background music to voiceovers to sound effects. This guide teaches you how to prompt audio models effectively.
Audio Models in Armox
Music Generation
| Model | Credits | Best For |
|---|---|---|
| Music 1.5 | 48 | Quick background music |
| Lyria 2 | 120 | High-quality songs |
| MusicGen | 176 | Detailed compositions |
| Music-01 | 200 | Complex music |
Speech & Voice
| Model | Credits | Best For |
|---|---|---|
| Speech-02 Turbo | 8 | Quick voiceovers |
| XTTS-v2 | 20 | Voice cloning |
| Speech-02 HD | 40 | High-quality speech |
| Voice Cloning | 100 | Clone any voice |
Music Prompting Basics
Music prompts should describe:
- Genre/Style β What type of music?
- Mood/Emotion β How should it feel?
- Instruments β What sounds?
- Tempo β Fast or slow?
- Purpose β What's it for?
Basic Structure
Prompt Template
[Genre] music, [mood], [instruments], [tempo], [purpose/context]
Genre Keywords
Be specific about the musical style:
Popular Genres
| Genre | Description |
|---|---|
| "Lo-fi hip hop" | Relaxed, study music vibe |
| "Cinematic orchestral" | Epic, movie soundtrack |
| "Upbeat pop" | Catchy, commercial |
| "Ambient electronic" | Atmospheric, background |
| "Acoustic folk" | Warm, organic |
| "Corporate" | Professional, business |
| "Epic trailer" | Dramatic, building |
| "Chill electronic" | Relaxed, modern |
Fusion Styles
Combine genres for unique sounds:
- "Jazz-influenced lo-fi"
- "Orchestral with electronic elements"
- "Acoustic pop with indie vibes"
- "Cinematic ambient"
Mood and Emotion
Positive Moods
- "Uplifting and inspiring"
- "Happy and cheerful"
- "Energetic and exciting"
- "Warm and comforting"
- "Hopeful and optimistic"
Calm Moods
- "Peaceful and serene"
- "Relaxing and meditative"
- "Dreamy and ethereal"
- "Gentle and soothing"
- "Contemplative and reflective"
Dramatic Moods
- "Intense and powerful"
- "Mysterious and suspenseful"
- "Epic and triumphant"
- "Dark and moody"
- "Emotional and moving"
Instruments and Sounds
Acoustic Instruments
Piano, acoustic guitar, strings,
violin, cello, flute,
drums, bass, percussion
Electronic Sounds
Synthesizer, electronic beats,
bass drops, pads, arpeggios,
808 drums, ambient textures
Orchestral
Full orchestra, brass section,
string ensemble, timpani,
French horn, choir
Example with Instruments
Cinematic orchestral music with soaring strings,
powerful brass, and thundering timpani,
building to an epic climax,
movie trailer style
Tempo and Energy
Tempo Keywords
| Term | BPM Range | Feel |
|---|---|---|
| "Very slow" | 40-60 | Meditative |
| "Slow" | 60-80 | Relaxed |
| "Moderate" | 80-100 | Walking pace |
| "Upbeat" | 100-120 | Energetic |
| "Fast" | 120-140 | Exciting |
| "Very fast" | 140+ | Intense |
Energy Descriptions
- "Starts soft, builds gradually"
- "High energy throughout"
- "Ebb and flow dynamics"
- "Steady and consistent"
- "Explosive crescendo"
Music Prompt Templates
Background Music
Prompt Template
[Genre] background music, [mood], [tempo], suitable for [use case], [instruments], [duration note]
Example:
Lo-fi hip hop background music,
relaxing and focused, moderate tempo,
suitable for studying or working,
mellow beats with soft piano and vinyl crackle,
loopable
Video Soundtrack
Prompt Template
[Genre] music for [video type], [mood] feeling, [tempo], [instruments], syncs with [visual description]
Example:
Uplifting corporate music for product launch video,
inspiring and professional feeling, moderate upbeat tempo,
piano, light percussion, subtle strings,
builds energy toward the end
Podcast/YouTube Intro
Prompt Template
[Genre] intro music, [duration] seconds, [mood], [instruments], catchy and memorable, suitable for [content type]
Example:
Modern electronic intro music,
10-15 seconds, energetic and exciting,
punchy synths and driving beat,
catchy and memorable hook,
suitable for tech podcast
Speech and Voiceover Prompting
Text-to-Speech
For Speech-02 and similar models, your prompt is the text to be spoken:
Prompt Template
[The actual words you want spoken]Voice Characteristics
Some models let you specify voice traits:
| Trait | Options |
|---|---|
| Gender | Male, female, neutral |
| Age | Young, middle-aged, elderly |
| Tone | Professional, friendly, authoritative |
| Accent | American, British, Australian |
| Speed | Slow, normal, fast |
Speech Examples
Professional Narration:
Welcome to our quarterly report.
This presentation covers our key achievements
and strategic initiatives for the coming year.
Friendly Explainer:
Hey there! In this video, we're going to show you
exactly how to get started with our app.
It's super easy, I promise!
Dramatic Trailer:
In a world where technology has changed everything...
one company dares to reimagine the future.
Voice Cloning
To clone a voice:
- Upload a clear audio sample (10-30 seconds)
- Connect to Voice Cloning node
- Provide the text you want spoken
- The AI generates speech in that voice
Best Practices for Voice Samples
- β Clear, noise-free recording
- β Single speaker only
- β Natural speaking pace
- β 10-30 seconds of speech
- β Background music or noise
- β Multiple speakers
- β Heavily processed audio
Combining Audio with Video
Workflow
- Generate your video first
- Create audio that matches the video length
- Combine in your video editor
Matching Audio to Video
Consider:
- Video duration β Match audio length
- Video mood β Audio should complement
- Key moments β Music beats align with visual cuts
- Pacing β Fast video = energetic audio
Common Music Mistakes
β Too Vague
Nice music
β Better
Uplifting acoustic folk music,
warm and hopeful, moderate tempo,
acoustic guitar and light percussion,
suitable for lifestyle brand video
β Conflicting Descriptions
Sad and depressing but also happy and upbeat
β Better
Bittersweet and nostalgic,
melancholic melody with hopeful undertones
β Too Specific Technically
Music in C major at 120 BPM with
a I-IV-V-I chord progression
β Better
Upbeat pop music with a classic,
familiar chord progression,
catchy and singable
Use Case Examples
YouTube Video Background
Upbeat electronic background music,
energetic but not distracting,
moderate-fast tempo,
synth pads and light beats,
suitable for tech review video,
loopable for long videos
Meditation/Relaxation
Ambient meditation music,
deeply calming and peaceful,
very slow tempo,
soft pads, gentle bells, nature sounds,
suitable for yoga or sleep
Product Commercial
Modern corporate music,
confident and innovative feeling,
moderate tempo building to upbeat,
clean synths and subtle percussion,
30 seconds, ends with resolution
Social Media Reel
Trendy pop music,
catchy and fun, fast tempo,
suitable for Instagram Reels,
15-30 seconds,
hook in first 3 seconds
Iteration Strategy
- Start simple β Genre + mood + tempo
- Add instruments β Specify key sounds
- Refine energy β Describe dynamics and build
- Match purpose β Tailor to your use case
Next Steps
- Audio Nodes β Master audio node settings
- Audio & Music Workflow β Complete audio creation process
- Video Prompting β Create videos to pair with your audio