Voice Cloning captures the characteristics of a real voice from an uploaded audio sample and uses it to speak any text you write. The cloned voice preserves the tone, cadence, and timbre of the original, so the output sounds like the source speaker. To open the tool, go to Tools > Audio > VoiceZen > Voice Cloning.

Clone a voice

1. Upload a voice sample

Click Upload Audio and select your reference audio file. Only MP3 and WAV files are accepted; uploading any other format produces an error and the file is not processed.
For the best results, your sample should:
  • Be at least a few seconds of clear, uninterrupted speech
  • Contain a single speaker with no background music or noise
  • Be recorded at a consistent volume without clipping or distortion
  • Use a clean WAV or high-quality MP3 (128 kbps or higher)
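
If you want to catch obvious problems before uploading, a local pre-check along these lines may help. This is a minimal sketch, not part of Artnex: the function name `check_sample` is hypothetical, and the 3-second minimum is an assumed stand-in for "a few seconds". Duration is only measured for WAV files, because Python's standard library cannot read MP3.

```python
import wave
from pathlib import Path

ALLOWED_EXTENSIONS = {".mp3", ".wav"}  # formats listed as accepted above
MIN_SECONDS = 3.0  # assumption: stand-in for "a few seconds"


def check_sample(path: str) -> list[str]:
    """Return a list of problems found; an empty list means none were detected."""
    problems = []
    p = Path(path)
    suffix = p.suffix.lower()
    if suffix not in ALLOWED_EXTENSIONS:
        problems.append(f"unsupported format: {suffix or '(none)'}")
        return problems
    if suffix == ".wav":
        # Duration = number of frames / frames per second.
        with wave.open(str(p), "rb") as wav:
            seconds = wav.getnframes() / wav.getframerate()
        if seconds < MIN_SECONDS:
            problems.append(
                f"only {seconds:.1f}s of audio; aim for at least {MIN_SECONDS:.0f}s"
            )
    return problems
```

Calling `check_sample("my_voice.wav")` returns an empty list when nothing was flagged. The check says nothing about noise or the number of speakers, so it complements listening to the sample rather than replacing it.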

2. Enter the text to speak

Type the text you want the cloned voice to say into the text field.

3. Select a language

Choose the output language from the language dropdown. Supported languages include Arabic, Chinese, Danish, Dutch, English, Finnish, French, German, Greek, Hebrew, Hindi, Italian, Japanese, Korean, Portuguese, Russian, Spanish, and Turkish.

4. Adjust advanced settings (optional)

Click Settings to fine-tune the generation:
  • Exaggeration (0.25–2.0) — Controls how expressive and dramatic the speech delivery is. Lower values are neutral; higher values amplify emotion.
  • Temperature (0.05–5.0) — Controls randomness. Lower values produce more predictable, consistent output. Higher values introduce more variation.
  • CFG Weight (2–10) — How closely the output follows the voice characteristics. A value of 0.5 gives balanced results.
  • Seed — Enter a specific seed to reproduce an exact output. Set to 0 for a random result each time.
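
If you keep your preferred settings in a script or notes file, the ranges above can be captured in a small validator. This is a minimal sketch under stated assumptions: `CloneSettings` and its field names are hypothetical and do not correspond to any published Artnex API. CFG Weight is not checked here; confirm its limits in the Settings panel before adding it.

```python
from dataclasses import dataclass

# Ranges copied from the settings list above.
EXAGGERATION_RANGE = (0.25, 2.0)
TEMPERATURE_RANGE = (0.05, 5.0)


@dataclass
class CloneSettings:
    """Hypothetical container for the advanced settings (not an Artnex API)."""

    exaggeration: float
    temperature: float
    seed: int = 0  # 0 means a random result each time

    def validate(self) -> None:
        """Raise ValueError if a value falls outside its documented range."""
        low, high = EXAGGERATION_RANGE
        if not low <= self.exaggeration <= high:
            raise ValueError(f"exaggeration must be between {low} and {high}")
        low, high = TEMPERATURE_RANGE
        if not low <= self.temperature <= high:
            raise ValueError(f"temperature must be between {low} and {high}")

    @property
    def is_reproducible(self) -> bool:
        """True when a specific (non-zero) seed has been chosen."""
        return self.seed != 0
```

For example, `CloneSettings(exaggeration=0.5, temperature=0.8, seed=42)` passes `validate()` and reports `is_reproducible` as true. The values 0.5 and 0.8 are arbitrary in-range examples, not recommended defaults.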

5. Generate

Click Create. The tool uploads your audio sample, processes the voice, and generates the cloned speech. This may take longer than standard TTS — allow up to 30 seconds.

6. Download

Click the download button on the audio player to save your result as a WAV file.

Credit cost

Voice Cloning costs 500 credits per generation.
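
Because the cost is flat, budgeting is simple integer arithmetic. The sketch below only illustrates that calculation; the function names are hypothetical, and it assumes every generation, including a retry with different settings, is charged the same 500 credits.

```python
CREDITS_PER_GENERATION = 500  # cost stated above


def generations_affordable(balance: int) -> int:
    """Number of full generations a credit balance covers."""
    return max(balance, 0) // CREDITS_PER_GENERATION


def credits_needed(generations: int) -> int:
    """Total credits required for a planned number of generations."""
    return generations * CREDITS_PER_GENERATION
```

A balance of 1,800 credits covers three generations with 300 credits left over, and four planned takes require 2,000 credits.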

Tips for clean audio samples

A short, clean sample consistently outperforms a long, noisy one. Prioritize audio quality over duration.
  • Record in a quiet room with soft furnishings to reduce echo
  • Use a dedicated microphone rather than a phone or laptop mic when possible
  • Avoid samples with background music, reverb, or crowd noise
  • Keep the speaker’s mouth close to the microphone and at a steady distance
  • Remove long silences or multiple speakers before uploading
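
One of these problems, clipping, can be estimated locally before you upload. The sketch below is an illustrative check, not an Artnex feature: `clipping_ratio` is a hypothetical helper, it handles 16-bit PCM WAV files only, and it counts a sample as clipped when it sits at the maximum 16-bit value.

```python
import array
import sys
import wave

FULL_SCALE = 32767  # largest positive value of a signed 16-bit sample


def clipping_ratio(path: str) -> float:
    """Fraction of samples at full scale in a 16-bit PCM WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM WAV files")
        frames = wav.readframes(wav.getnframes())
    samples = array.array("h", frames)  # signed 16-bit integers
    if sys.byteorder == "big":
        samples.byteswap()  # WAV data is little-endian
    if not samples:
        return 0.0
    clipped = sum(1 for s in samples if abs(s) >= FULL_SCALE)
    return clipped / len(samples)
```

A ratio well above zero suggests the recording level was too high. What counts as acceptable is a judgment call, so treat the number as a prompt to listen again rather than a pass or fail result.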

Use cases

Personalized content

Create narration in your own voice for videos, courses, or presentations without re-recording every time you update the script.

Accessibility tools

Generate audio for written content in a familiar voice to help users who benefit from personalized audio delivery.

Creative projects

Produce character voices for games, audiobooks, animations, or interactive fiction using voice samples as a starting point.

Localization

Combine voice cloning with language selection to produce multilingual versions of content while preserving voice identity.

Responsible use

Voice Cloning is a powerful tool that carries ethical responsibilities. Only clone voices you have explicit permission to replicate. Do not use this feature to impersonate individuals without their consent, create deceptive content, or generate audio intended to mislead or defraud. Misuse may violate Artnex’s Terms of Service and applicable laws.