Configuration

The Aloud TTS plugin's settings let you choose a TTS provider and control playback and caching. To open them, go to Settings → Aloud Text To Speech.

These settings control the core behavior of the plugin.

  • TTS Provider: Choose the text-to-speech service you want to use. Each provider offers different voices and pricing. Supported providers include OpenAI, Google Gemini, Hume AI, and any OpenAI-compatible API.
  • Playback Speed: The speed at which audio plays by default. The default is 1.0x. You can also change the speed from the player UI.
  • Audio Folder: The directory in your vault where exported audio files are saved. The default is aloud/.

These settings control the audio cache behavior.

  • Cache Storage:
    • Device: (Default) Stores audio on your local device’s storage (IndexedDB). The cache is not synced across devices.
    • Vault: Stores audio in a .tts folder inside your vault. This allows the cache to be synced, but increases your vault’s size.
  • Cache Duration: How long audio files are kept in the cache before being automatically deleted. The default is 7 days.
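
The Cache Duration rule can be pictured as a simple age check. The sketch below is illustrative only and is not the plugin's actual code: `is_expired`, `prune`, and `DEFAULT_CACHE_DAYS` are hypothetical names, and it assumes that an entry's age is measured from the time it was created, which the settings description does not specify.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical constant mirroring the 7-day default described above.
DEFAULT_CACHE_DAYS = 7


def is_expired(created_at: datetime, now: datetime,
               max_age_days: float = DEFAULT_CACHE_DAYS) -> bool:
    """Return True when a cached entry is older than the allowed age."""
    return now - created_at > timedelta(days=max_age_days)


def prune(entries: dict, now: datetime,
          max_age_days: float = DEFAULT_CACHE_DAYS) -> dict:
    """Keep only the entries that are still within the cache duration.

    `entries` maps a cache key to the (assumed) creation time of its audio.
    """
    return {
        key: created_at
        for key, created_at in entries.items()
        if not is_expired(created_at, now, max_age_days)
    }


# Example: with the 7-day default, a 9-day-old entry is dropped
# and a 5-day-old entry is kept.
now = datetime(2024, 1, 10, tzinfo=timezone.utc)
cache = {
    "old-note": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "recent-note": datetime(2024, 1, 5, tzinfo=timezone.utc),
}
remaining = prune(cache, now)
```

Under these assumptions, a shorter Cache Duration limits how much storage the cache uses, at the cost of regenerating audio for notes you revisit after it has expired.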

Each TTS provider has its own specific settings. You only need to configure the provider you have selected.

OpenAI

  • API Key: Your API key from OpenAI.
  • Model: The TTS model to use (e.g., tts-1, tts-1-hd, gpt-4o-mini-tts).
  • Voice: The voice to use for playback.

Google Gemini

  • API Key: Your API key for the Gemini API.
  • Model: The Gemini model to use.
  • Voice: The voice to use for playback.

Hume AI

  • API Key: Your API key from Hume AI.
  • Voice: The Hume AI voice to use.

OpenAI-Compatible API

Use these settings if you self-host a TTS service or use a third-party provider with an OpenAI-compatible API.

  • API Key: The API key for your service.
  • API Base URL: The URL of your API endpoint (e.g., http://localhost:8020/v1).
  • Model: The name of the model your service uses.
  • Voice: The name of the voice to use.
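
Before entering these values in the plugin, it can help to check that your service answers an OpenAI-style speech request. The following is a minimal sketch, assuming the service implements the OpenAI `POST /audio/speech` endpoint under the base URL and accepts a bearer token. The function name `build_speech_request` and the model, voice, and key values are placeholders, and the source does not confirm that the plugin sends exactly this request.

```python
import json
import urllib.request


def build_speech_request(base_url: str, api_key: str, model: str,
                         voice: str, text: str) -> urllib.request.Request:
    """Build (but do not send) a POST request to an OpenAI-style speech endpoint."""
    # The endpoint path is appended to the API Base URL, e.g. .../v1/audio/speech.
    url = base_url.rstrip("/") + "/audio/speech"
    body = json.dumps({"model": model, "voice": voice, "input": text}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Some self-hosted services ignore the key; it is sent only when provided.
        headers["Authorization"] = f"Bearer {api_key}"
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Placeholder values: substitute the ones from your own service.
request = build_speech_request(
    base_url="http://localhost:8020/v1",
    api_key="sk-placeholder",
    model="my-tts-model",
    voice="my-voice",
    text="Hello from Aloud.",
)

# To actually send the request and save the audio, uncomment:
# with urllib.request.urlopen(request) as response:
#     open("test.mp3", "wb").write(response.read())
```

If the request succeeds and returns audio, the same base URL, key, model, and voice should be reasonable values to try in the four fields above.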