AI Actors

Transform a static image into a talking character. Provide text or upload an audio file, select a voice, and Vinci will generate a lip‑synced speaking video.
This page reflects the current app implementation; some advanced options are listed under Coming Soon.

Front-end controls

Inputs

  • Image: Upload a character image (JPEG or PNG), minimum 512×512 px
  • Dialogue:
    • Text: Type your script
    • Audio: Upload your own recorded audio
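
A minimal sketch of how these input rules might be checked client-side before submission. The JPEG/PNG types, the 512×512 minimum, and the text-or-audio rule come from the list above; the helper itself and the browser APIs it relies on are assumptions:

```ts
// Hypothetical client-side validation for the AI Actors inputs.
// Assumes a browser environment (File, createImageBitmap).

const MIN_DIMENSION = 512; // minimum 512×512 image, per the Inputs list

async function validateInputs(image: File, text: string, audio?: File): Promise<string[]> {
  const errors: string[] = [];

  // Image must be JPEG or PNG.
  if (!["image/jpeg", "image/png"].includes(image.type)) {
    errors.push("Image must be JPEG or PNG.");
  } else {
    // Both dimensions must meet the 512×512 minimum.
    const bitmap = await createImageBitmap(image);
    if (bitmap.width < MIN_DIMENSION || bitmap.height < MIN_DIMENSION) {
      errors.push(`Image must be at least ${MIN_DIMENSION}×${MIN_DIMENSION}.`);
    }
    bitmap.close();
  }

  // Dialogue: a script is required unless an audio file is uploaded.
  if (!audio && text.trim().length === 0) {
    errors.push("Provide a script or upload an audio file.");
  }

  return errors;
}
```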

Voice options

  • Vinci Voices: Curated professional voices
  • User Voices: Your voice clones created from audio samples
  • Voice Selection Mode:
    • “Vinci” (prebuilt)
    • “User” (custom)
  • Default fallback voice ID: 21m00Tcm4TlvDq8ikWAM
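
The fallback behaviour can be summarised in a few lines. This is a hypothetical sketch; only the Vinci/User modes and the default voice ID 21m00Tcm4TlvDq8ikWAM are taken from the list above:

```ts
// Hypothetical voice resolution: prefer the explicit selection,
// otherwise fall back to the default voice ID noted above.

type VoiceSelectionMode = "Vinci" | "User";

const DEFAULT_VOICE_ID = "21m00Tcm4TlvDq8ikWAM"; // default fallback

interface VoiceSelection {
  mode: VoiceSelectionMode;
  voiceId?: string; // a Vinci voice or one of the user's clones
}

function resolveVoiceId(selection?: VoiceSelection): string {
  // No selection (or a selection without an ID) uses the fallback voice.
  return selection?.voiceId ?? DEFAULT_VOICE_ID;
}
```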

Output

  • Duration: Auto from text/audio
  • Format: MP4 (H.264/AAC)
  • Resolution: 1080p default
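
As a rough illustration of the automatic duration: a clip's length can be read directly when audio is uploaded, or estimated from the script. The helper below is hypothetical, and the ~150 words-per-minute speaking rate is an assumption, not an app value:

```ts
// Hypothetical duration estimate: use the uploaded clip's length when
// audio is provided; otherwise estimate from the script length.

async function estimateDurationSeconds(text: string, audio?: File): Promise<number> {
  if (audio) {
    // Decode the uploaded clip to read its real duration (browser environment).
    const ctx = new AudioContext();
    const buffer = await ctx.decodeAudioData(await audio.arrayBuffer());
    await ctx.close();
    return buffer.duration;
  }
  // Assumed speaking rate of ~150 words per minute.
  const words = text.trim().split(/\s+/).filter(Boolean).length;
  return (words / 150) * 60;
}
```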

Advanced (defaults)

  • Frame rate: 30 fps
  • Batch size: 8
  • CRF: 19 (constant rate factor; lower values mean higher quality)
  • Audio processing: Automatic format detection/conversion
  • Seed: Randomized
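
Taken together, the Output format and these Advanced defaults correspond to a fairly standard H.264/AAC encode. The sketch below collects them into one structure and shows an illustrative ffmpeg-style argument list; it is not the app's actual pipeline:

```ts
// Defaults from the Output and Advanced sections, collected in one place.
// The ffmpeg-style argument list is only an illustration of what
// H.264/AAC at 1080p, 30 fps, CRF 19 typically looks like.

interface RenderDefaults {
  width: number;
  height: number;
  fps: number;
  crf: number;       // lower = higher quality
  batchSize: number;
  seed?: number;      // randomized when omitted
}

const DEFAULTS: RenderDefaults = {
  width: 1920,
  height: 1080,       // 1080p default
  fps: 30,
  crf: 19,
  batchSize: 8,
};

function encodeArgs(d: RenderDefaults, input: string, output: string): string[] {
  return [
    "-i", input,
    "-c:v", "libx264", "-crf", String(d.crf),   // H.264 video, CRF quality
    "-r", String(d.fps),                        // frame rate
    "-vf", `scale=${d.width}:${d.height}`,      // 1080p
    "-c:a", "aac",                              // AAC audio
    output,                                     // MP4 container inferred from extension
  ];
}
```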

Placeholders

  • “Type your dialogue here…”
  • “Click to type dialogue”
  • “Search by voice, character, or workflow…”

Typical workflow

1. Choose a character: upload a clear, well-lit portrait image (or select from your Character Library).
2. Add dialogue: type your script or upload an audio file. If you upload audio, text is optional.
3. Pick a voice: choose a Vinci voice or one of your User voice clones. Leave empty to use the default fallback.
4. Generate and review: start generation and monitor progress. Preview the output, then download or share.
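
As an illustration only, the four steps above could map onto a client flow like the following. The endpoint paths, payload fields, and job states are hypothetical, not the actual Vinci API:

```ts
// Purely illustrative client flow for the steps above.
// Endpoint paths and payload fields are hypothetical.

interface GenerateRequest {
  imageUrl: string;     // step 1: character image
  text?: string;        // step 2: script (optional if audio is supplied)
  audioUrl?: string;    // step 2: uploaded audio
  voiceId?: string;     // step 3: omit to use the default fallback
}

async function generateTalkingVideo(req: GenerateRequest): Promise<string> {
  // Step 4: start the job...
  const start = await fetch("/api/ai-actors/generate", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(req),
  });
  const { jobId } = await start.json();

  // ...then poll until the video is ready and return its URL.
  for (;;) {
    const status = await fetch(`/api/ai-actors/jobs/${jobId}`).then(r => r.json());
    if (status.state === "completed") return status.videoUrl;
    if (status.state === "failed") throw new Error(status.error ?? "Generation failed");
    await new Promise(resolve => setTimeout(resolve, 2000)); // wait 2 s between checks
  }
}
```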

Best practices

  • Use high-resolution faces with clear features
  • Keep scripts concise and natural
  • For uploads, provide clean audio (no background noise)
  • Align language of the text/voice with target audience
  • Test multiple voices to match character tone

Asset libraries

  • Characters: Pre‑made or uploaded images
  • Voices: Vinci library and your cloned voices

Cost and usage

Coming Soon

  • Emotion/style controls
  • Phoneme‑level timing controls
  • Multi‑segment dialogue with pauses