> This project is no longer actively maintained by the Google Creative Lab but remains here in read-only archive mode so that it can continue to assist developers who may find the examples helpful. We aren't able to address all pull requests or bug reports, but outstanding issues will remain in read-only mode for reference purposes. Also, please note that some of the dependencies may not be up to date and there hasn't been any QA done in a while, so your mileage may vary.
>
> For more details on how archiving affects GitHub repositories, see this documentation. We welcome users to fork this repository should there be more useful, community-driven efforts that can help continue what this project began.
One Button for Voice Input is a customizable web component built with Polymer 3+ that makes it easy to include speech recognition in your web-based projects. It uses the Web Speech API's SpeechRecognition interface, and for unsupported browsers it falls back to a client-side Google Cloud Speech API solution.
With npm installed, run the following in the root of this repo:

```sh
npm install
npm start
```
As of Polymer 3, all dependencies are managed through npm and loaded with module script tags. You can add obvi to your project with:

```sh
npm install --save obvi-component
```
And then use it in your page like so:

```html
<!DOCTYPE html>
<html>
<head>
  <script src="node_modules/@webcomponents/webcomponentsjs/webcomponents-bundle.js"></script>
</head>
<body>
  <voice-button id="voice-button" cloud-speech-api-key="YOUR_API_KEY" autodetect></voice-button>
  <!-- An element to display the transcription in (referenced by the script below) -->
  <p id="transcription"></p>
  <script type="module">
    import './node_modules/obvi/voice-button.js';

    var voiceEl = document.querySelector('voice-button'),
        transcriptionEl = document.getElementById('transcription');

    // Check the supported flag, and do something if speech
    // recognition is disabled / not supported in this browser.
    console.log('does this browser support WebRTC?', voiceEl.supported);

    // Clear the previous transcription when the button is pressed.
    voiceEl.addEventListener('mousedown', function(event){
      transcriptionEl.innerHTML = '';
    });

    var transcription = '';
    voiceEl.addEventListener('onSpeech', function(event){
      transcription = event.detail.speechResult;
      transcriptionEl.innerHTML = transcription;
      console.log('Speech response: ', event.detail.speechResult);
      transcriptionEl.classList.add('interim');
      if(event.detail.isFinal){
        transcriptionEl.classList.remove('interim');
      }
    });

    voiceEl.addEventListener('onStateChange', function(event){
      console.log('state:', event.detail.newValue);
    });
  </script>
</body>
</html>
```
Note: You must run your app from a web server for the module imports and polyfills to load properly; opening the page directly from the filesystem (file://) won't work.
Also note: if your app is served over SSL (https://), the microphone access permission will be persistent; users won't have to grant/deny access every time.
Static hosting services like GitHub Pages and Firebase Hosting don't support serving different files to different user agents. If you're hosting your application on one of these services, you'll need to serve a single build like so:
```html
<script type="module" src="node_modules/obvi/voice-button.js"></script>
```

or

```js
import './node_modules/obvi/dist/voice-button.js';
```

You can also customize the polymer build command in package.json and create your own build file to further suit your needs.
Basic usage:

```html
<voice-button cloud-speech-api-key="YOUR_API_KEY"></voice-button>
```
| Name | Description | Type | Default | Options / Examples |
|---|---|---|---|---|
| `cloudSpeechApiKey` | Cloud Speech API is the fallback when the Web Speech API isn't available. Provide this key to cover more browsers. | String | `null` | `<voice-button cloud-speech-api-key="XXX"></voice-button>` |
| `flat` | Whether or not to include the shadow. | Boolean | `false` | `<voice-button flat>` |
| `autodetect` | By default, the user needs to press & hold to capture speech. If set to `true`, the button auto-detects silence to finish capturing speech. | Boolean | `false` | `<voice-button autodetect>` |
| `language` | Language for the SpeechRecognition interface. If not set, defaults to the user agent's language setting. See here for more info. | String | `'en-US'` | `<voice-button language="en-US">` |
| `disabled` | Disables the button from being pressed and capturing speech. | Boolean | `false` | `<voice-button disabled>` |
| `keyboardTrigger` | How the keyboard will trigger the button. | String | `'space-bar'` | `<voice-button keyboard-trigger="space-bar">` — `space-bar`, `all-keys`, `none` |
| `clickForPermission` | If `true`, only asks for browser microphone permissions when the button is clicked (instead of immediately). | Boolean | `false` | `<voice-button click-for-permission="true">` |
| `hidePermissionText` | If `true`, the warning text about browser access to the microphone will not appear. | Boolean | `false` | `<voice-button hide-permission-text="true">` |
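The attributes above can be combined on one element. For example (the API key value here is a placeholder, and the attribute choices are purely illustrative), a button that auto-detects silence, recognizes British English, ignores the keyboard, and only asks for mic permission on first click might look like:

```html
<voice-button
  cloud-speech-api-key="YOUR_API_KEY"
  autodetect
  language="en-GB"
  keyboard-trigger="none"
  click-for-permission="true">
</voice-button>
```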
You can customize the look of the button using these CSS variables (default values shown):
```css
voice-button {
  --button-diameter: 100px;
  --mic-color: #FFFFFF;
  --text-color: #666666;
  --button-color: #666666;
  --circle-viz-color: #000000;
}
```
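For instance, overriding a few of these variables in your page's stylesheet restyles the button; the specific values below are just illustrative:

```css
/* Illustrative override: a smaller, darker button */
voice-button {
  --button-diameter: 64px;
  --button-color: #1A1A2E;
  --mic-color: #E94560;
}
```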
You can listen for the following custom events from the voice button:
| Name | Description | Return |
|---|---|---|
| `onSpeech` | Result from the speech handler | `detail: { speechResult: String, confidence: Number, isFinal: Boolean, sourceEvent: Object }` |
| `onSpeechError` | The raw event returned from the SpeechRecognition `onerror` handler | See here |
| `onStateChange` | When the button changes state | `detail: { newValue: String, oldValue: String }` — see below for listening states |
Listening states:
```js
IDLE: 'idle',
LISTENING: 'listening',
USER_INPUT: 'user-input',
DISABLED: 'disabled'
```
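One common use of `onStateChange` is showing a status message near the button. A minimal sketch, assuming the state strings above (the `statusMessage` helper and its message text are hypothetical, not part of the component):

```javascript
// Hypothetical helper: map the button's listening states to status text.
// The state names come from the component; the messages are illustrative.
function statusMessage(state) {
  switch (state) {
    case 'idle': return 'Press and hold to talk';
    case 'listening': return 'Listening...';
    case 'user-input': return 'Processing speech';
    case 'disabled': return 'Voice input unavailable';
    default: return '';
  }
}

// In the page, you might wire it up like:
// voiceEl.addEventListener('onStateChange', function (event) {
//   statusEl.textContent = statusMessage(event.detail.newValue);
// });
```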
When the component is loaded, microphone access is checked (unless click-for-permission="true" is set, in which case it will ask once the button is clicked). If the host's mic access is blocked, a warning will be shown. The language of the warning text matches the language attribute of the component (defaults to "en-US"). If you need to customize the color of the text, use the --text-color CSS variable.
This component defaults to using the Web Speech API. If the browser does not support that, it falls back to WebRTC to capture audio on the client and post it to the Google Cloud Speech API. Make sure to create an API key and set the cloud-speech-api-key attribute (see Options above) in order to use this fallback. You can check the button's supported property once it has loaded to see whether the browser has native support.
When the fallback is used, there is no streaming speech recognition; the speech result comes back all at once.
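Because of this difference, a handler that only acts on final results behaves the same in both modes. A sketch, assuming the `onSpeech` event shape from the table above (the `makeTranscriptCollector` helper is hypothetical):

```javascript
// Sketch: keep only final results, so the same handler works whether results
// stream in incrementally (Web Speech API, interim results with isFinal=false)
// or arrive all at once (Cloud Speech fallback, a single final result).
function makeTranscriptCollector() {
  var finalText = '';
  return function (detail) {
    if (detail.isFinal) {
      finalText += detail.speechResult;
    }
    return finalText;
  };
}

// Usage (in the page):
// var collect = makeTranscriptCollector();
// voiceEl.addEventListener('onSpeech', function (event) {
//   transcriptionEl.textContent = collect(event.detail);
// });
```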
| Browser | Support | Features |
|---|---|---|
| Firefox | Stable / Aurora / Nightly | Cloud Speech fallback |
| Google Chrome | Stable / Canary / Beta / Dev | Web Speech API |
| Opera | Stable / NEXT | Cloud Speech fallback |
| Android | Chrome / Firefox / Opera | Cloud Speech fallback |
| Microsoft Edge | Normal Build | Cloud Speech fallback |
| Safari 11 | Stable | Cloud Speech fallback |
- Fork it!
- Create your feature branch: `git checkout -b my-new-feature`
- Commit your changes: `git commit -am 'Add some feature'`
- Push to the branch: `git push origin my-new-feature`
- Submit a pull request :D
This component was authored by @nick-jonas, and was built atop some great tools - special thanks to the Polymer team (esp. Chris Joel), Jonathan Schaeffer for testing help, @jasonfarrell for fallback help, @GersonRosales & @danielstorey for showing a working recording example in iOS11 early days.

