VoiceSmith lets you train and run inference on both single-speaker and multispeaker models without any coding experience. It fine-tunes a two-stage text-to-speech pipeline, based on modified versions of DelightfulTTS and UnivNet, on your own dataset. Both models were pretrained on a proprietary dataset of 5,000 speakers. It also provides dataset preprocessing tools such as automatic text normalization.

VoiceSmith runs on Windows (CPU only for now) or any Linux-based operating system. To run it on macOS, you have to follow the build-from-source steps to create the installer. This is untested since I don't currently own a Mac. An NVIDIA GPU with CUDA support is highly recommended. You can train on a CPU instead, but it will take days, if not weeks.

Features

  • Runs on Windows (CPU only for now) or any Linux-based operating system
  • Train and infer on both single and multispeaker models
  • No coding experience needed
  • Fine-tunes a text-to-speech pipeline based on modified versions of DelightfulTTS and UnivNet
  • Both models were pretrained on a proprietary dataset of 5,000 speakers
  • Provides dataset preprocessing tools such as automatic text normalization
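
To give a rough idea of what automatic text normalization means for a text-to-speech dataset, here is a minimal sketch in Python. It is not VoiceSmith's own normalizer, which this page does not describe. The function names, the abbreviation table, and the choice to spell out only integers up to 999 are all assumptions made for illustration.

```python
import re

# Hypothetical abbreviation table for illustration; not taken from VoiceSmith.
ABBREVIATIONS = {"dr.": "doctor", "mr.": "mister", "st.": "street"}

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
        "eighty", "ninety"]


def number_to_words(n: int) -> str:
    """Spell out an integer in the range 0..999."""
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"
    hundreds, rest = divmod(n, 100)
    words = f"{ONES[hundreds]} hundred"
    return words if rest == 0 else f"{words} {number_to_words(rest)}"


def normalize(text: str) -> str:
    """Lowercase, expand known abbreviations, and spell out small integers."""
    tokens = [ABBREVIATIONS.get(tok, tok) for tok in text.lower().split()]
    text = " ".join(tokens)

    def spell(match: re.Match) -> str:
        n = int(match.group())
        # Assumption: numbers above 999 are left untouched in this sketch.
        return number_to_words(n) if n <= 999 else match.group()

    text = re.sub(r"\d+", spell, text)
    # Collapse any leftover runs of whitespace.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(normalize("Dr. Smith has 42 cats"))
```

A real normalizer would also need to handle dates, currencies, ordinals, and language-specific rules. The sketch only shows the general shape of the step: written forms are mapped to the words a speaker would actually say.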

Categories

Voice Cloning

License

Apache License V2.0

Additional Project Details

Operating Systems

Mac, Windows

Programming Language

Python

Related Categories

Python Voice Cloning Software

Registered

2023-03-23