
    Audio Nodes

    Audio nodes add sound to your creative work. Generate background music, voiceovers, sound effects, and even clone voices.

    What Are Audio Nodes?

    Figure: An Audio node using MusicGen for AI music generation.

    Audio nodes generate sound using AI. They can:

    • 🎵 Generate music — Background tracks, jingles, songs
    • 🗣️ Create speech — Voiceovers, narration
    • 🎙️ Clone voices — Replicate a voice from a sample
    • 🔊 Make sounds — Sound effects, ambient audio

    Adding an Audio Node

    1. Open the node sidebar on the left
    2. Find Audio in the list
    3. Drag it onto your canvas

    Audio Node Inputs

    Input             Type             Required   Purpose
    Prompt/Text       Text (yellow)    Yes        Description or script
    Reference Audio   Audio (orange)   No         For voice cloning or style
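
The input rules in the table can be written down as a small check. This is only a minimal sketch: `AudioNodeInputs` and `validate` are hypothetical names invented for illustration, not part of Armox, and the canvas itself may enforce these rules differently.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class AudioNodeInputs:
    """Hypothetical stand-in for what is wired into one Audio node."""
    prompt_text: Optional[str] = None      # Text (yellow) input, required
    reference_audio: Optional[str] = None  # Audio (orange) input, optional


def validate(inputs: AudioNodeInputs) -> List[str]:
    """Return a list of problems; an empty list means the node can run."""
    problems = []
    if not inputs.prompt_text or not inputs.prompt_text.strip():
        problems.append("Prompt/Text input is required")
    # Reference Audio is optional, so its absence is never reported.
    return problems


print(validate(AudioNodeInputs(prompt_text="Upbeat electronic music")))
print(validate(AudioNodeInputs(reference_audio="sample.wav")))
```

The first call reports no problems; the second reports the missing prompt, because a reference sample alone is not enough to run the node.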

    Audio Models

    Music Generation

    Model       Credits   Best For
    Music 1.5   48        Quick background music
    Lyria 2     120       High-quality songs
    MusicGen    176       Detailed compositions
    Music-01    200       Complex music

    Speech Generation

    Model             Credits   Best For
    Speech-02 Turbo   8         Quick voiceovers
    XTTS-v2           20        Voice cloning
    Speech-02 HD      40        High-quality speech

    Voice Tools

    Model           Credits   Best For
    Voice Cloning   100       Clone any voice
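
To budget a project, the credit figures above can be added up before running anything. The sketch below assumes each run is billed once at the listed price; that is an assumption, since the tables do not say whether cost also depends on duration or other settings. `total_credits` is an illustrative helper, not an Armox function.

```python
# Credit costs copied from the model tables above.
CREDITS = {
    "Music 1.5": 48,
    "Lyria 2": 120,
    "MusicGen": 176,
    "Music-01": 200,
    "Speech-02 Turbo": 8,
    "XTTS-v2": 20,
    "Speech-02 HD": 40,
    "Voice Cloning": 100,
}


def total_credits(runs):
    """Sum credits for a list of (model name, number of runs) pairs.

    Assumes a flat per-run price, which the tables do not confirm.
    """
    return sum(CREDITS[model] * count for model, count in runs)


# Three Turbo drafts, one HD final, and one quick music track.
plan = [("Speech-02 Turbo", 3), ("Speech-02 HD", 1), ("Music 1.5", 1)]
print(total_credits(plan))  # 3*8 + 40 + 48 = 112
```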

    Music Generation

    Generate original music from text descriptions.

    Workflow

    [Text Node] → [Audio Node (Music)]
    Music description → Generated music

    Example

    1. Add a Text Node with a music description:
      Upbeat electronic music, energetic and modern,
      suitable for tech product video,
      driving beat, synth melodies
      
    2. Add an Audio Node
    3. Select a music model (e.g., Music 1.5)
    4. Connect and run

    Music Prompt Tips

    Include these elements:

    • Genre — "lo-fi hip hop," "cinematic orchestral"
    • Mood — "uplifting," "mysterious," "relaxing"
    • Instruments — "piano," "synths," "strings"
    • Tempo — "slow," "upbeat," "moderate"
    • Purpose — "background music," "intro jingle"

    See Audio Prompting for detailed guidance.
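
If you write many music prompts, the five elements above can be assembled consistently with a small helper. This is a sketch only: `build_music_prompt` and its phrasing ("featuring", "suitable for") are illustrative choices, not a format that any Armox model is known to require.

```python
def build_music_prompt(genre, mood, instruments=(), tempo=None, purpose=None):
    """Join the prompt elements into one comma-separated description."""
    parts = [f"{mood} {genre} music"]
    if instruments:
        parts.append("featuring " + " and ".join(instruments))
    if tempo:
        parts.append(f"{tempo} tempo")
    if purpose:
        parts.append(f"suitable for {purpose}")
    return ", ".join(parts)


prompt = build_music_prompt(
    genre="lo-fi hip hop",
    mood="relaxing",
    instruments=["piano", "synths"],
    tempo="slow",
    purpose="background music",
)
print(prompt)
# relaxing lo-fi hip hop music, featuring piano and synths, slow tempo, suitable for background music
```

Paste the resulting text into the Text Node that feeds the Audio Node.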


    Speech Generation (Text-to-Speech)

    Convert text into spoken audio.

    Workflow

    [Text Node] → [Audio Node (Speech)]
    Your script → Spoken audio

    Example

    1. Add a Text Node with your script:
      Welcome to our product demonstration. 
      Today, we'll show you how easy it is 
      to create amazing content with AI.
      
    2. Add an Audio Node
    3. Select Speech-02 Turbo or Speech-02 HD
    4. Connect and run

    Speech Settings

    Setting    Description
    Voice      Choose from available voices
    Speed      Speaking pace
    Pitch      Voice pitch adjustment
    Language   Output language

    Voice Cloning

    Replicate a specific voice from an audio sample.

    Workflow

    [Upload Node (Audio)] → [Audio Node (Voice Clone)]
    Voice sample → Cloned speech

    Steps

    1. Upload a voice sample
      • Add an Upload Node
      • Upload a clear audio file (10-30 seconds)
      • Use a single speaker with no background noise
    2. Add an Audio Node
      • Select the XTTS-v2 or Voice Cloning model
      • Connect the audio sample
    3. Add your script
      • Add a Text Node with what you want said
      • Connect it to the Audio Node
    4. Generate
      • Run to create speech in the cloned voice

    Voice Sample Requirements

    Requirement   Details
    Length        10-30 seconds ideal
    Quality       Clear, noise-free
    Speaker       Single voice only
    Content       Natural speech
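
The length requirement is easy to check locally before uploading. The sketch below uses Python's standard `wave` module, so it only works for uncompressed WAV files, and it checks duration only; noise level and speaker count still need a listen. The function names are illustrative.

```python
import wave


def sample_duration_seconds(path):
    """Return the duration of an uncompressed WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def is_ideal_length(path, minimum=10.0, maximum=30.0):
    """True if the sample falls in the 10-30 second range from the table."""
    return minimum <= sample_duration_seconds(path) <= maximum
```

For example, `is_ideal_length("my_voice.wav")` (a hypothetical file name) returns `False` for a 5-second clip, which is a cue to record a longer sample.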

    Audio Node Settings

    Duration

    Control audio length:

    Use Case          Duration
    Short jingle      5-15 seconds
    Background loop   30-60 seconds
    Full track        60-180 seconds
    Voiceover         Varies by script
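
Since voiceover length depends on the script, a rough estimate can be made from the word count. The sketch below assumes a speaking rate of 150 words per minute, a common rule of thumb for narration and not an Armox figure; the actual length will vary with the chosen voice and the Speed setting.

```python
def estimate_voiceover_seconds(script, words_per_minute=150):
    """Rough spoken length of a script at an assumed speaking rate."""
    word_count = len(script.split())
    return word_count / words_per_minute * 60


script = (
    "Welcome to our product demonstration. "
    "Today, we'll show you how easy it is "
    "to create amazing content with AI."
)
# 19 words at the assumed 150 words per minute is roughly 7.6 seconds.
print(round(estimate_voiceover_seconds(script), 1))
```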

    Quality Settings

    Setting       Effect
    Sample Rate   Audio quality (higher = better)
    Format        Output format (MP3, WAV)

    Viewing and Downloading

    When generation completes:

    1. An audio player appears in the node
    2. Click play to preview
    3. Click the download button to save the file locally
    4. The audio is also saved automatically to your Gallery

    Combining Audio with Video

    Method 1: In Armox

    Some video models accept audio input:

    [Audio Node] → [Video Node]
    Generated music → Video with audio

    Method 2: External Editor

    1. Generate video in Armox
    2. Generate audio in Armox
    3. Download both
    4. Combine them in a video editor (Premiere, Final Cut, etc.)
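
If you prefer the command line to a full editor, the combining step can also be done with ffmpeg. This is an alternative the guide itself does not mention, shown here as a sketch: it assumes ffmpeg is installed, and the file names are placeholders for whatever you downloaded. The helper only builds and prints the command, so nothing runs until you choose to execute it.

```python
import shlex


def build_mux_command(video_path, audio_path, output_path):
    """Build an ffmpeg command that pairs a video stream with an audio track."""
    return [
        "ffmpeg",
        "-i", video_path,   # input 0: the downloaded video
        "-i", audio_path,   # input 1: the downloaded audio
        "-map", "0:v:0",    # take the first video stream from input 0
        "-map", "1:a:0",    # take the first audio stream from input 1
        "-c:v", "copy",     # keep the video as-is, no re-encode
        "-c:a", "aac",      # encode audio to AAC for MP4 compatibility
        "-shortest",        # stop at the end of the shorter input
        output_path,
    ]


cmd = build_mux_command("video.mp4", "music.mp3", "final.mp4")
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True), with ffmpeg on your PATH.
```

Because of `-shortest`, music that runs slightly longer than the video is simply cut off at the end of the video.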

    Tips for Matching

    • Generate audio after you know video duration
    • Match audio mood to video content
    • Consider generating music slightly longer than needed

    Tips for Better Audio

    Music Tips

    • Be specific about genre and mood
    • Include instrument preferences
    • Specify tempo if important
    • Mention the use case (video, podcast, etc.)

    Speech Tips

    • Write natural, conversational scripts
    • Include punctuation for natural pauses
    • Test with the Speech-02 Turbo model first
    • Use Speech-02 HD for final production

    Voice Cloning Tips

    • Use high-quality source audio
    • Longer samples, within the ideal 10-30 second range, generally clone better
    • Keep scripts natural to the voice
    • Test with short phrases first

    Common Issues

    Music Doesn't Match Description

    • Be more specific about genre
    • Include more mood keywords
    • Try a different model
    • Simplify the description

    Speech Sounds Unnatural

    • Add punctuation for pauses
    • Shorten long sentences
    • Use more conversational language
    • Try a different voice
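
Long sentences can be spotted automatically before a script goes into the Text Node. A minimal sketch follows; the 20-word threshold is an arbitrary assumption, not an Armox limit, and the sentence splitting is a rough heuristic that will mis-split abbreviations such as "e.g.".

```python
import re


def long_sentences(script, max_words=20):
    """Return sentences with more than max_words words."""
    # Split after ., ! or ? followed by whitespace; a rough heuristic.
    sentences = re.split(r"(?<=[.!?])\s+", script.strip())
    return [s for s in sentences if len(s.split()) > max_words]
```

Any sentence this returns is a candidate for splitting in two or for adding commas where a speaker would naturally pause.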

    Voice Clone Sounds Wrong

    • Improve source audio quality
    • Use a longer sample (up to about 30 seconds)
    • Ensure single speaker in sample
    • Remove background noise from sample

    Use Case Examples

    YouTube Intro Music

    Energetic electronic intro music,
    10 seconds, catchy hook,
    modern and professional,
    suitable for tech channel
    

    Podcast Voiceover

    Welcome to The Creative Hour, 
    where we explore the intersection 
    of technology and creativity. 
    I'm your host, and today we're diving into...
    

    Product Video Background

    Corporate background music,
    uplifting and professional,
    moderate tempo, subtle and non-distracting,
    suitable for product demonstration video
    

    Meditation Audio

    Peaceful ambient music,
    very slow and calming,
    soft pads and gentle nature sounds,
    suitable for meditation or relaxation
    

    Next Steps