Responsible AI

AI Usage Policy

Transparency about how we use AI: what runs where, what we do and do not train on, and the guardrails we enforce.

Last updated: May 18, 2026

Models we use

ImageToSpeech combines best-in-class OCR engines with neural text-to-speech models from licensed providers and, in some cases, our own fine-tuned models. The provider used for a given request is chosen at runtime based on language, content type, and quality requirements.

Training and your data

We do not train models on your uploaded content by default. If you opt in to help improve quality, your contribution is anonymised, reviewed, and used only for the stated purpose. You can withdraw your opt-in at any time.

Voice cloning and identity

We do not allow the creation of voice clones that imitate identifiable real people without their verifiable consent. Custom voice features, when offered, are gated by identity checks.

Safety and content moderation

  • Automated filters block obviously prohibited content (CSAM, mass harassment, scam scripts)
  • Hard rate limits prevent automated abuse
  • Our Trust & Safety team reviews flagged accounts

Accuracy and bias

OCR and TTS models can make mistakes, especially on low-quality scans, mixed scripts or low-resource languages. We continuously benchmark quality across languages and accents and publish improvements in our changelog.

Disclosure

If you publish AI-generated audio, you should disclose that fact to your audience where it could otherwise be misleading.
