    Grok Video

    Grok Video is xAI's video generation suite. It has three modes: text-to-video (T2V), image-to-video (I2V), and video editing (Edit). All three modes generate audio along with the video.

    Overview

    Property         Value
    Provider         xAI
    Models           T2V, I2V, Edit
    Modality         Video
    Duration         1-15 seconds (T2V/I2V), up to 8 seconds (Edit)
    Resolution       480p, 720p
    Prompt Required  Yes

    Available Models

    Grok Video T2V (Text-to-Video)

    Generate videos with audio from text descriptions.

    Property   Value
    Cost       50-750 credits (varies by duration)
    Base Cost  50 credits per second

    Grok Video I2V (Image-to-Video)

    Animate images into videos with audio.

    Property   Value
    Cost       52-752 credits (varies by duration)
    Base Cost  50 credits per second + 2 credits for the image

    Grok Video Edit

    Edit existing videos using text descriptions.

    Property   Value
    Cost       360 credits
    Max Input  8 seconds

    What It's Best For

    • Quick video generation — Fast turnaround times
    • Audio included — Native audio generation
    • Image animation — Bring still images to life
    • Video editing — Transform and colorize videos
    • Flexible duration — 1-15 seconds for T2V/I2V

    Inputs

    Prompt (Required)

    Describe the video scene, action, or edit.

    Connection Color: Yellow

    Input Image (I2V only)

    Image to animate into a video.

    Connection Color: Blue

    Input Video (Edit only)

    Video to edit and transform.

    Connection Color: Green
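
    The input requirements above can be summarized as a small lookup. This is a minimal sketch, not an Armox or xAI API: the mode keys ("t2v", "i2v", "edit"), the input names, and the helper missing_inputs are all hypothetical, and it assumes the image and video inputs are mandatory for the modes they belong to.

```python
# Hypothetical lookup of the inputs each mode needs (illustrative names only).
REQUIRED_INPUTS = {
    "t2v": {"prompt"},
    "i2v": {"prompt", "image"},   # assumes the image is mandatory for I2V
    "edit": {"prompt", "video"},  # assumes the video is mandatory for Edit
}


def missing_inputs(mode, connected):
    """Return the set of inputs a mode still needs, given the connected ones."""
    if mode not in REQUIRED_INPUTS:
        raise ValueError(f"unknown mode: {mode!r}")
    return REQUIRED_INPUTS[mode] - set(connected)


if __name__ == "__main__":
    # An I2V node with only a prompt connected still needs an image.
    print(missing_inputs("i2v", {"prompt"}))
```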

    Configuration

    Duration (T2V & I2V)

    Type: Slider
    Range: 1-15 seconds
    Default: 6

    Video duration. Cost scales with duration.

    Aspect Ratio (T2V & I2V)

    Type: Select
    Default: 16:9 (T2V), Auto (I2V)

    Option  Description
    16:9    Landscape
    9:16    Portrait
    1:1     Square
    4:3     Classic
    3:4     Portrait classic
    3:2     Photo landscape
    2:3     Photo portrait
    Auto    Match input image (I2V only)

    Resolution

    Type: Select
    Default: 720p

    Option  Description
    480p    Faster, lower quality
    720p    Standard HD quality
    Auto    Match input (Edit only)
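
    The configuration rules above (duration range, per-mode aspect ratio defaults, and the mode-specific Auto options) can be checked before a run. The following is a rough sketch under stated assumptions: validate_settings and the mode keys are hypothetical names, and it assumes that Duration and Aspect Ratio do not apply to Edit because they are labeled "T2V & I2V".

```python
# Illustrative validator for the settings described above (not an official API).
ASPECT_RATIOS = ["16:9", "9:16", "1:1", "4:3", "3:4", "3:2", "2:3"]
DEFAULT_ASPECT = {"t2v": "16:9", "i2v": "Auto"}


def validate_settings(mode, duration=6, aspect_ratio=None, resolution="720p"):
    """Return a list of problems; an empty list means the settings look valid."""
    if mode not in ("t2v", "i2v", "edit"):
        return [f"unknown mode: {mode!r}"]

    problems = []
    if mode in ("t2v", "i2v"):
        # Duration and aspect ratio are only documented for T2V and I2V.
        if not 1 <= duration <= 15:
            problems.append("duration must be between 1 and 15 seconds")
        ratio = aspect_ratio or DEFAULT_ASPECT[mode]
        allowed_ratios = ASPECT_RATIOS + (["Auto"] if mode == "i2v" else [])
        if ratio not in allowed_ratios:
            problems.append(f"aspect ratio {ratio!r} is not available for {mode}")

    # "Auto" resolution is documented for Edit only.
    allowed_res = ["480p", "720p"] + (["Auto"] if mode == "edit" else [])
    if resolution not in allowed_res:
        problems.append(f"resolution {resolution!r} is not available for {mode}")
    return problems


if __name__ == "__main__":
    print(validate_settings("t2v", aspect_ratio="Auto"))
```

    Returning a list of problems rather than raising on the first one lets a caller show every issue at once.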

    Output

    Type: Video with audio
    Connection Color: Green

    Use Cases

    Text-to-Video

    Anime schoolgirl bursting out of house door, 
    cherry blossoms blowing, morning light, 
    speed lines indicating rush, 
    classic shojo aesthetic, vibrant colors.
    

    Image-to-Video

    Medieval knight in ornate armor walking through 
    a mystical forest, bioluminescent plants pulsing 
    with light, ancient stone ruins overgrown with 
    glowing vines, dark fantasy aesthetic.
    

    Video Editing

    Colorize this black and white footage, 
    add warm golden hour lighting, 
    enhance the contrast for a cinematic look.
    

    Tips for Best Results

    1. Be descriptive — Include camera movements and lighting
    2. Use I2V for consistency — Start with an image for better character control
    3. Edit creatively — Transform old footage into new styles
    4. Optimize duration — Longer videos cost more, start short
    5. Match aspect ratios — Use Auto for I2V to preserve image proportions

    Pricing Details

    Duration    T2V Cost (credits)  I2V Cost (credits)
    1 second    50                  52
    6 seconds   300                 302
    10 seconds  500                 502
    15 seconds  750                 752

    Edit video: Fixed 360 credits for up to 8 seconds.
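
    The pricing above follows a simple rule: 50 credits per second, plus 2 credits for the input image in I2V, and a flat 360 credits for Edit. A minimal sketch of that arithmetic is shown here; estimate_credits and the mode keys are hypothetical names chosen for illustration, and actual billing may differ.

```python
# Credit values taken from the pricing tables on this page.
CREDITS_PER_SECOND = 50
IMAGE_FEE = 2
EDIT_FLAT_FEE = 360


def estimate_credits(mode, duration=6):
    """Estimate the credit cost of one run (illustrative helper)."""
    if mode == "edit":
        return EDIT_FLAT_FEE  # fixed price, input up to 8 seconds
    if mode not in ("t2v", "i2v"):
        raise ValueError(f"unknown mode: {mode!r}")
    if not 1 <= duration <= 15:
        raise ValueError("duration must be between 1 and 15 seconds")
    cost = CREDITS_PER_SECOND * duration
    if mode == "i2v":
        cost += IMAGE_FEE  # one-time fee for the input image
    return cost


if __name__ == "__main__":
    # Reproduces the rows of the pricing table.
    for seconds in (1, 6, 10, 15):
        print(seconds, estimate_credits("t2v", seconds), estimate_credits("i2v", seconds))
```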

    Comparison

    Feature         Grok Video  Kling 2.6 Pro  Veo 3.1
    Text-to-Video   Yes         Yes            Yes
    Image-to-Video  Yes         Yes            Yes
    Video Editing   Yes         No             No
    Audio           Yes         Yes            Yes
    Max Duration    15s         10s            8s
    Base Cost (6s)  300         1,200          4,000
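
    To put the base costs in the comparison into perspective, the sketch below works out how many 6-second clips a given budget covers for each model. It assumes all three base costs are quoted in the same credits; clips_per_budget is a hypothetical helper, and the budget figure is only an example.

```python
# Base cost of a 6-second clip, taken from the comparison table.
COST_6S = {"Grok Video": 300, "Kling 2.6 Pro": 1200, "Veo 3.1": 4000}


def clips_per_budget(budget):
    """Return how many whole 6-second clips each model allows for a budget."""
    return {name: budget // cost for name, cost in COST_6S.items()}


if __name__ == "__main__":
    for name, count in clips_per_budget(12000).items():
        print(f"{name}: {count} clips")
```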