Audio Models
Audio models in Armox generate music, speech, and sound effects from text descriptions or reference inputs.
Overview
Audio models support the following tasks:
- Music generation: create original music from descriptions
- Text-to-speech: generate natural-sounding voices from text
- Sound effects: produce ambient sounds and effects
- Voice cloning: generate speech in specific voices
- Audio continuation: extend existing audio
Available Audio Models
| Model | Provider | Cost | Duration | Best For |
|---|---|---|---|---|
| MusicGen | Meta | 100 credits | 8-30s | Music generation |
| Ace Step | Various | 100 credits | 60-300s | Long-form music |
| Dia TTS | Nari Labs | 50 credits | Variable | Text-to-speech |
| Kokoro TTS | Kokoro | 50 credits | Variable | Fast TTS |
| Chatterbox | Various | 50 credits | Variable | Voice cloning |
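
To budget credits before running several generations, the table above can be mirrored in a small lookup. The following is a minimal sketch, not an Armox API: the `AUDIO_MODELS` dictionary and the `total_credits` helper are hypothetical names, and the sketch assumes the listed cost is charged once per generation.

```python
# Credit costs and duration ranges copied from the table above.
# The dictionary layout and helper name are illustrative, not an Armox API.
AUDIO_MODELS = {
    "MusicGen":   {"credits": 100, "duration_s": (8, 30),   "best_for": "Music generation"},
    "Ace Step":   {"credits": 100, "duration_s": (60, 300), "best_for": "Long-form music"},
    "Dia TTS":    {"credits": 50,  "duration_s": None,      "best_for": "Text-to-speech"},
    "Kokoro TTS": {"credits": 50,  "duration_s": None,      "best_for": "Fast TTS"},
    "Chatterbox": {"credits": 50,  "duration_s": None,      "best_for": "Voice cloning"},
}


def total_credits(runs):
    """Sum the credit cost of a list of model names (assumes one charge per generation)."""
    return sum(AUDIO_MODELS[name]["credits"] for name in runs)


# One music clip plus two speech clips.
print(total_credits(["MusicGen", "Dia TTS", "Dia TTS"]))  # 200
```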
Connection Colors
In the Armox canvas, audio connections use color-coded handles and edges:
- Input handle: red circle on the left side of a node
- Output handle: red circle on the right side of a node
- Connection edge: red line connecting nodes
Common Settings
Duration
Controls the length of the generated audio.
Sample Rate
- 44.1 kHz: CD quality
- 48 kHz: Professional audio
Format
- MP3: compressed, smaller files
- WAV: uncompressed, higher quality
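
The settings above can be captured and validated in a small structure before a generation is started. This is a sketch under stated assumptions: `AudioSettings` and `wav_size_bytes` are hypothetical names, not part of Armox, and the size estimate assumes 16-bit stereo PCM, which this page does not specify.

```python
from dataclasses import dataclass

# Values listed under Sample Rate and Format above.
SAMPLE_RATES_HZ = (44_100, 48_000)
FORMATS = ("mp3", "wav")


@dataclass(frozen=True)
class AudioSettings:
    """Hypothetical container for the common settings; not an Armox type."""
    duration_s: float
    sample_rate_hz: int = 44_100
    audio_format: str = "mp3"

    def __post_init__(self):
        if self.duration_s <= 0:
            raise ValueError("duration_s must be positive")
        if self.sample_rate_hz not in SAMPLE_RATES_HZ:
            raise ValueError(f"sample_rate_hz must be one of {SAMPLE_RATES_HZ}")
        if self.audio_format not in FORMATS:
            raise ValueError(f"audio_format must be one of {FORMATS}")


def wav_size_bytes(settings, channels=2, bit_depth=16):
    """Estimate uncompressed PCM data size (file header excluded).

    Assumes 16-bit stereo by default; these defaults are an assumption.
    """
    bytes_per_sample = bit_depth // 8
    return int(settings.duration_s * settings.sample_rate_hz * channels * bytes_per_sample)


settings = AudioSettings(duration_s=30, sample_rate_hz=44_100, audio_format="wav")
print(wav_size_bytes(settings))  # 5292000 bytes, roughly 5.3 MB
```

The estimate illustrates why WAV files are larger: uncompressed size grows linearly with duration and sample rate, while MP3 size depends on the encoder bitrate instead.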
Choosing the Right Model
For Music
- MusicGen (100 credits): short music clips
- Ace Step (100 credits): long-form music
For Speech
- Dia TTS (50 credits): natural dialogue
- Kokoro TTS (50 credits): fast generation
- Chatterbox (50 credits): voice cloning
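
The selection guidance above can be expressed as a pair of small helpers. This is an illustrative sketch only: the function names are hypothetical, the duration ranges come from the Available Audio Models table, and the priority order for speech (cloning first, then speed) is an assumption.

```python
def choose_music_model(duration_s):
    """Pick a music model by target duration, using the documented ranges."""
    if 8 <= duration_s <= 30:
        return "MusicGen"   # short music clips
    if 60 <= duration_s <= 300:
        return "Ace Step"   # long-form music
    return None             # no listed model covers this duration


def choose_speech_model(voice_cloning=False, prefer_speed=False):
    """Pick a speech model; cloning takes priority over speed (an assumption)."""
    if voice_cloning:
        return "Chatterbox"
    if prefer_speed:
        return "Kokoro TTS"
    return "Dia TTS"        # default: natural dialogue


print(choose_music_model(20))                  # MusicGen
print(choose_music_model(120))                 # Ace Step
print(choose_speech_model(prefer_speed=True))  # Kokoro TTS
```

Durations between 30 and 60 seconds fall outside both documented ranges, so the sketch returns `None` for them rather than guessing.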
Best Practices
- Specify the genre: "jazz", "electronic", "orchestral"
- Describe the mood: "upbeat", "melancholic", "energetic"
- Name the instruments: "piano", "guitar", "synthesizer"
- State the tempo: "slow", "moderate", "fast"
- For speech, use natural text: write the way you would speak
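
The music prompt guidelines above can be combined into a single description string. The helper below is a minimal sketch: `build_music_prompt` is a hypothetical name, and the output wording is one possible phrasing, not a format that Armox models require.

```python
def build_music_prompt(genre, mood=None, instruments=(), tempo=None):
    """Compose a text prompt from genre, mood, instruments, and tempo."""
    parts = []
    # Mood reads naturally as an adjective in front of the genre.
    parts.append(f"{mood} {genre}" if mood else genre)
    if instruments:
        parts.append("featuring " + ", ".join(instruments))
    if tempo:
        parts.append(f"{tempo} tempo")
    return "; ".join(parts)


prompt = build_music_prompt(
    "jazz",
    mood="melancholic",
    instruments=["piano", "guitar"],
    tempo="slow",
)
print(prompt)  # melancholic jazz; featuring piano, guitar; slow tempo
```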
Next Steps
Explore individual model documentation for detailed settings and use cases.