One of the things we’ve noticed when developing the first phonics-based adult literacy app is that one of the four skills is not as easy as it sounds…
The app covers the four main skills: reading, writing, listening and speaking. It is the speaking part that has an issue with accents.
Getting a computer to understand speech is not simply solved by installing speech recognition as “plug and play”. There are basic issues with single words spoken with no context, such as you would have in a sentence, and beyond that, there are accents!
Try your browser’s speech recognition and say “one” or “won”, or “to”, “too” or “two”: what does the browser come back with? It will not get every version right… and neither would you if someone said single words to you. We need context to make sense of words, and so do machines:
“She won first prize”: the browser should get that easily.
So, overcoming single word recognition is a challenge which we can accommodate using our own coding and logic. However, the next issue, accents, isn’t so easy.
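Going back to single words for a moment, the sort of logic we mean can be sketched like this. It is only an illustration, not our production code: the homophone table, the function names and the assumption that the recogniser hands back a list of alternative transcripts (sometimes as digits) are all made up for the example. The point is that the APP already knows which word it asked for, so anything that sounds the same can count as a match.

```typescript
// Hypothetical homophone groups; a real app would need a much fuller table.
// Digits are included on the assumption that a recogniser may return "1" for "one".
const HOMOPHONE_GROUPS: string[][] = [
  ["one", "won", "1"],
  ["to", "too", "two", "2"],
];

// Normalise a transcript: trim, lower-case, keep only letters, digits and apostrophes.
function normalise(word: string): string {
  return word.trim().toLowerCase().replace(/[^a-z0-9']/g, "");
}

// All words that sound the same as `word`, including the word itself.
function soundAlikes(word: string): Set<string> {
  const target = normalise(word);
  const group = HOMOPHONE_GROUPS.find((g) => g.indexOf(target) !== -1);
  return new Set(group ?? [target]);
}

// True if any alternative returned by the recogniser sounds like the target word.
function matchesTarget(target: string, alternatives: string[]): boolean {
  const accepted = soundAlikes(target);
  return alternatives.some((alt) => accepted.has(normalise(alt)));
}
```

With this table, `matchesTarget("won", ["one"])` is true, while `matchesTarget("pin", ["pen"])` is false, because pin and pen are different words rather than homophones.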
Here in the UK we have many different accents: Glasgow, Newcastle, Liverpool etc. Speech recognition can work with most generic English, but not where accents are strong or where someone just has a different ability with pronunciation.
So imagine the computer’s speech recognition just can’t predict the word being said by a person. How frustrating is that…
Like this example of two Glaswegians in a lift/elevator saying “eleven”: https://www.youtube.com/watch?v=sAz_UvnUeuU
This is exactly what we are talking about. ‘Perception is reality’, as “they” say; so if an APP can’t understand you, then the APP has failed.
This is where machine learning can come in handy. There are many ways to get into machine learning. One way, for example, is ML5 [https://ml5js.org/], which sits on top of TensorFlow.js [https://www.tensorflow.org/].
We’re currently looking just at TensorFlow.js. TensorFlow is an open source machine learning infrastructure on which you can develop customised models for your own use. With it you can:
- Run existing ML models
- Retrain those existing models
- Develop your own ML model with JS
The idea is to use ML to train the APP on any areas needing a bit of tweaking, e.g. for pin & pen or tin & ten. For us this meant creating a custom sound model which can be trained to predict what a user has said. The user journey would go something like this:
- A person uses the APP.
- The APP fails multiple times to predict the word they are saying.
- The APP activates the AI to retrain the ML model and add a new word to it.
- The APP prompts the user to say the word the APP needs to train itself on.
- The AI trains itself to the person’s pronunciation for that word.
- The APP has now evolved specifically to the person, almost like a digital evolution, making it more fit for purpose in its environment and giving it a better chance of success.
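The first half of that journey, deciding when to stop asking and start retraining, can be sketched roughly as below. This is a guess at the shape of the logic rather than what the APP actually does: the threshold of three failures, the class name `WordAttemptTracker` and the step names are all assumptions for illustration.

```typescript
// Hypothetical threshold: how many failed attempts at a word before retraining is offered.
const MAX_FAILURES = 3;

type NextStep = "success" | "keep-trying" | "start-training";

class WordAttemptTracker {
  // Failed attempts so far, counted per word.
  private failures = new Map<string, number>();

  // Record one attempt at `word` and decide what the app should do next.
  record(word: string, recognised: boolean): NextStep {
    if (recognised) {
      this.failures.delete(word); // a success clears the count for this word
      return "success";
    }
    const count = (this.failures.get(word) ?? 0) + 1;
    if (count >= MAX_FAILURES) {
      this.failures.delete(word); // counting starts afresh once training is triggered
      return "start-training";
    }
    this.failures.set(word, count);
    return "keep-trying";
  }
}
```

Three failed attempts in a row at “eleven” would return "keep-trying", "keep-trying" and then "start-training", at which point the APP would prompt the user to record the word for retraining.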
You could try it here https://micro-phonics.com/wip/citizenliteracy/ai/lift.html
It’s still very early days, but our roadmap is developing as we progress and find out which technologies are dead ends and which are fit for our purpose.
One thing is very apparent: the Digital Evolution of APPs for individuals is necessary when it comes to education. Using AI and machine learning is a new step in making technology work at a more granular level for education, to help us all improve our lives and environment, which is what we are all about here at Citizen Literacy…
First of all you need to take a prebuilt model from TensorFlow and add to it using a process called transfer learning [https://en.wikipedia.org/wiki/Transfer_learning].
This is where we add new labels to the pre-existing model. More later…
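In the meantime, the idea of adding new labels on top of an existing model can be shown with a toy example. To be clear about what this is: it does not use TensorFlow.js and it does not process audio. The `embed` function stands in for the frozen, pre-trained part of a model, and the new “head” is a simple nearest-average classifier that we have swapped in for a trained neural layer to keep the sketch short. All names here are hypothetical.

```typescript
// Stand-in for the frozen, pre-trained part of the model: maps raw input to a feature vector.
type Embed = (input: number[]) => number[];

class NewLabelHead {
  // Running sum of feature vectors and number of examples seen, per label.
  private sums = new Map<string, number[]>();
  private counts = new Map<string, number>();

  constructor(private readonly embed: Embed) {}

  // "Training": fold one labelled example into that label's running sum.
  addExample(label: string, input: number[]): void {
    const features = this.embed(input);
    const sum = this.sums.get(label) ?? features.map(() => 0);
    this.sums.set(label, sum.map((v, i) => v + features[i]));
    this.counts.set(label, (this.counts.get(label) ?? 0) + 1);
  }

  // Predict the label whose average feature vector is closest to the input's features.
  // Returns undefined if no examples have been added yet.
  predict(input: number[]): string | undefined {
    const features = this.embed(input);
    let best: string | undefined = undefined;
    let bestDistance = Infinity;
    this.sums.forEach((sum, label) => {
      const n = this.counts.get(label) ?? 1;
      // Squared Euclidean distance between the label's mean and the input's features.
      const distance = sum.reduce((acc, v, i) => acc + (v / n - features[i]) ** 2, 0);
      if (distance < bestDistance) {
        bestDistance = distance;
        best = label;
      }
    });
    return best;
  }
}
```

The base (`embed`) is never changed; only the small head learns from the user’s examples. That is why a handful of recordings of one person saying “pin” and “pen” could, in principle, be enough to add those labels.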