Website Speech Synthesis

Website to automatically speak the text

Released: Website Speech Synthesis

Text to Speech

Today is 2nd October, celebrated as the birthday of Mahatma Gandhi. One of the most iconic figures born in India, he fought against the oppression of people in the colonies and against slavery, and gave a voice to the world. I was thinking about the great soul when it struck me: what could I do to give a voice to people who cannot speak, not only figuratively but also physically? In this blog, I will explain how I went about creating a website for exactly that!

My first thought was that many people must already have thought about and solved this problem. Sure enough, a quick search led me to the Speech Synthesis API supported by the Google Chrome browser. Yay! My work had just become much simpler. I was now determined to make a demo and show the power of this insanely awesome API to the world. Honestly, for some time I had been daydreaming about breaking the barriers of devices, OSes, languages, and what not! Before I drifted too far into my dreamy world, I created a simple website where users can enter their text and have it spoken aloud by the machine.

Demo Video

Before I jump into the implementation details, enjoy the demo video:

Presenting a website to speak your text with No censorship and No tracking!

I searched the internet and stumbled on the Google Speech Synthesis API announcement. The API is meant to be a single point where all the speech-related work is done. This is powerful!
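
Since the API lives in the browser, a page can check that it is present before wiring up any UI. Below is a minimal sketch of such a check, not code from the demo site; `supportsSpeechSynthesis` is a hypothetical helper name introduced here for illustration.

```javascript
// Hypothetical helper (not part of the original site): returns true when
// the given global object exposes both pieces of the speech synthesis API
// used in this post.
function supportsSpeechSynthesis(globalObject) {
    return Boolean(globalObject) &&
        'speechSynthesis' in globalObject &&
        typeof globalObject.SpeechSynthesisUtterance === 'function';
}

// In a browser, pass `window`. Outside a browser (for example Node.js)
// there is no `window`, so null is passed and the check reports false.
var hasSpeech = supportsSpeechSynthesis(
    typeof window !== 'undefined' ? window : null
);
console.log('Speech synthesis available:', hasSpeech);
```

When the check fails, the page could hide the Speak button or show a short notice instead of throwing on first use.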

Goals

In short, these are the goals I set for this website:

  • Allow users to enter their text

  • Show all possible voice options

  • Allow easy and fast speech synthesis

As a POC, I made a simple website to solve this problem: WebsiteSpeechSynthesis. I could always make it better by improving typing (spell checks) or making the UI fancier, but I decided to rest my case and release it on the birthday of the Mahatma.

Flow

  1. Allow users to enter their text:

    
     <textarea id="txtInput" style="width: 200px;height: 100px;word-wrap: break-word;word-break: break-all;"></textarea>
     
  2. Show all possible voice options:

    
     var synth = window.speechSynthesis;
     var voiceList = document.querySelector('#voiceList');
     // Shared with the click handler in step 3.
     var voices = [];

     function PopulateVoices() {
         voices = synth.getVoices();
         // Remember the current selection (falls back to the first entry).
         var selectedIndex = voiceList.selectedIndex < 0 ? 0 : voiceList.selectedIndex;
         voiceList.innerHTML = '';
         for (var index = 0; index < voices.length; index++) {
             var voice = voices[index];
             var listItem = document.createElement('option');
             listItem.textContent = voice.name;
             listItem.setAttribute('data-lang', voice.lang);
             listItem.setAttribute('data-name', voice.name);
             voiceList.appendChild(listItem);
             // Prefer a Hindi voice when one is available (last match wins).
             if (voice.lang === 'hi-IN') {
                 selectedIndex = index;
             }
         }

         voiceList.selectedIndex = selectedIndex;
     }

     PopulateVoices();
     // Voices may load asynchronously, so repopulate once they arrive.
     if (synth.onvoiceschanged !== undefined) {
         synth.onvoiceschanged = PopulateVoices;
     }
     
  3. Allow easy and fast speech synthesis:

    
     var txtInput = document.querySelector('#txtInput');
     var voiceList = document.querySelector('#voiceList');
     var btnSpeak = document.querySelector('#btnSpeak');
     var synth = window.speechSynthesis;
     btnSpeak.addEventListener('click', () => {
         var toSpeak = new SpeechSynthesisUtterance(txtInput.value);
         // Name of the voice currently selected in the dropdown.
         var selectedVoiceName = voiceList.selectedOptions[0].getAttribute('data-name');
         // `voices` is the list filled in by PopulateVoices() in step 2.
         voices.forEach((voice) => {
             if (voice.name === selectedVoiceName) {
                 toSpeak.voice = voice;
             }
         });
         synth.speak(toSpeak);
     });
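
Step 2 preselects a voice by hard-coding `hi-IN`. As a hedged sketch of a more general variant (not what the demo site does), the choice can be pulled into a small function that takes the preferred language as a parameter and falls back to the voice the browser marks as default. `pickDefaultVoiceIndex` is a hypothetical name, and unlike the loop in step 2 it returns the first matching voice rather than the last.

```javascript
// Hypothetical helper (an assumption, not part of the original site):
// choose which entry of the voice list to preselect.
// `voices` is an array of objects shaped like SpeechSynthesisVoice;
// only the `lang` and `default` properties are read here.
function pickDefaultVoiceIndex(voices, preferredLang) {
    var fallback = -1;
    for (var i = 0; i < voices.length; i++) {
        if (voices[i].lang === preferredLang) {
            return i; // first exact language match wins
        }
        if (fallback < 0 && voices[i].default) {
            fallback = i; // remember the browser's default voice
        }
    }
    if (fallback >= 0) {
        return fallback;
    }
    // No match and no default: use the first voice, or -1 for an empty list.
    return voices.length > 0 ? 0 : -1;
}
```

Inside `PopulateVoices()`, this could be used as `voiceList.selectedIndex = pickDefaultVoiceIndex(voices, 'hi-IN');`, or with `navigator.language` in place of `'hi-IN'` to follow the visitor's browser language.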
     

I felt it was straightforward to implement and did not have many dependencies to slow me down. A simple website that speaks text works just fine for my use cases. It would be super awesome to have the following features as a value add:

  • Spell checks
  • Auto expand input box while typing
  • Speak audio samples while changing voices
  • Export of Audio

For my demo, the source is hosted in the GitHub repo WebsiteSpeechSynthesis. A quick sneak peek is below:

WebsiteSpeechSynthesis Preview

