Keeping it Simple

Early on, the developers agreed that it would be ideal to use as much ‘pure and simple’ JavaScript, SQL and PHP etc. as possible, rather than making extensive use of software libraries. The rationale was that such libraries (particularly those associated with server-side processes, such as those using Node.js) can be productive for getting results in a short time but act as ‘black boxes’ to the developers, which can make future maintenance problematic – even if they are open source. There is also the issue of such libraries becoming ‘deprecated’, i.e. no longer supported by their owners, again introducing maintenance problems for the future. Currently the developers are using the popular Bootstrap library for the front-end styling, including icon animations and synching with audio, while the underlying logic is all ‘hand-coded’.

Technically, some time was initially spent (as planned) in an exploratory phase evaluating different approaches to creating the App for the iOS platform. As expected, this was tricky, and after trying some technical options the developers have settled on using the Cordova framework (https://cordova.apache.org/) together with some new Cordova software library components that give access to the iOS on-device speech recognition system. This means the code base for both App platforms is substantially the same and is based on the web App version.

Going forward, the web version will be used to prototype and test basic functionality, which will then be tested in the App versions.

Media Sourcing and Creation

  • Text-to-voice services for instruction and feedback – Amazon Polly for the tutor voice – is working well
  • Now using Bootstrap animations for interface icons (from various sources including Google Material Design icons and fonts) with time-synch cues to MP3s produced by Amazon Polly – working really well
  • Some screen movies of the interface may be needed for user training

  • Top-down video of the tutor at a table speaking and rearranging cardboard letters – early tests are very promising and bring a human touch into the app. This rough test shows it could work, though we will need to work on audio quality and image framing / lighting: https://www.youtube.com/watch?v=Gl3V_WDVmC8

  • Images for a sound alphabet – using Pixabay etc. – they are proving surprisingly good (must make sure to add credits!)
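As a rough illustration of the time-synch cues mentioned in the list above, the sketch below shows one way an icon animation could be tied to the playback position of an MP3. It is not the project's actual code: the cue times, the icon ids and the `is-animating` class name are all hypothetical, and the class is assumed to be defined in the page's own CSS.

```javascript
// Hypothetical cue list: each cue names the icon that should animate
// from `start` (seconds into the MP3) until the next cue begins.
const cues = [
  { start: 0.0, iconId: "icon-listen" },
  { start: 1.8, iconId: "icon-read" },
  { start: 4.2, iconId: "icon-speak" },
];

// Return the cue that is active at `currentTime`, or null before the
// first cue. Assumes the list is sorted by ascending `start`.
function activeCue(cueList, currentTime) {
  let active = null;
  for (const cue of cueList) {
    if (cue.start <= currentTime) {
      active = cue;
    } else {
      break;
    }
  }
  return active;
}

// Browser wiring: on every `timeupdate` event from the <audio> element,
// switch the (hypothetical) animation class on for the active icon only.
function attachCues(audioEl, cueList, className = "is-animating") {
  audioEl.addEventListener("timeupdate", () => {
    const cue = activeCue(cueList, audioEl.currentTime);
    for (const { iconId } of cueList) {
      const icon = document.getElementById(iconId);
      if (icon) {
        icon.classList.toggle(className, cue !== null && cue.iconId === iconId);
      }
    }
  });
}
```

In a page, `attachCues(document.querySelector("audio"), cues)` would be called once after the audio element has loaded. Keeping the lookup in the separate `activeCue` function means the timing logic can be checked without a browser.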

Earlier Prototypes

Way back in 2018 we were fortunate to have the assistance of students from the Digital Skills Academy of Digital Skills Global (https://digitalskillsglobal.com/). They helped us work out some crucial ideas about interface design, logos etc., along with some supporters working pro bono (Mr. Ll we owe you!) and our two software companies, ReachWill Ltd and Micro-phonics Ltd. Here are some screen movies of that earlier work:

An early demo of the web app version:

Video of the app working on an Android phone – early test version

Video of the Beta 1.2 interface with the Amazon Polly voice, using Bootstrap icon animation synched to the audio

Project Unexpected Outcomes

It is useful to have a placeholder to record unexpected outcomes in a project. So far we have identified these:

  1. The web app can be a separate product in its own right – and as progressive web apps become more mature they might replace the traditional forms
  2. The web app as a teaching tool for classroom use with extra features looks very promising and could be a powerful publicity tool
    1. The use of web sockets also opens up interesting collaboration opportunities in the future
    2. Google Cloud services for voice are developing and we need to keep an eye on them
  3. User accounts – initially we were not going to have them, relying instead on standalone installs on the device, partly because of GDPR concerns. We have now decided that we can have anonymous accounts and provide monitoring and feedback on user performance, as that will be useful to learners and could provide useful analytics in the future
  4. Creating a ‘White Paper’ has been very effective at clarifying thinking between the partners and as a basis for future collaboration
  5. Chatbots and AI are worth exploring in other related projects

Word Recognition Limitations

Our app features a limited number of very short words (3-4 letters) that learners need to be able to read and then speak into the app to confirm that they have read them correctly. We have found that the existing on-device natural language processing (a form of AI) tends to break down when users input short words (a lack of contextual words seems to be the problem), and this is exacerbated when the user has a regional accent. Accuracy is crucial for these learners to progress. The accent issue features in numerous YouTube videos about voice recognition failures. We have become aware of the Google AI initiative Teachable Machine, which uses the open source TensorFlow software library.

It seems that recognising a single short word is tricky for these systems. When you speak a sentence, like a command to Alexa or Google voice search, the voice recognition system has a set of words to work on that provide a context, and the performance seems much better. But there are still recognised accent problems for these systems – there are some hilarious examples on YouTube – see the ’11’ one.
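To make the single-word check concrete, here is a minimal sketch of how an app might compare what a recogniser heard with the word on screen. It assumes the recogniser hands back a list of alternative transcripts as plain strings (many do, but the exact shape depends on the plugin or service used); the function names are hypothetical and this is not the project's actual code.

```javascript
// Normalise a transcript: lower-case it, strip anything that is not a
// letter or a space, and trim. Assumes English target words (a-z only).
function normalise(text) {
  return text.toLowerCase().replace(/[^a-z\s]/g, "").trim();
}

// True if any alternative is exactly the target word after normalising.
// A phrase that merely contains the word ("the cat") is not accepted,
// because the learner is expected to say the single word on its own.
function matchesTarget(alternatives, targetWord) {
  const target = normalise(targetWord);
  return alternatives.some((alt) => normalise(alt) === target);
}
```

For example, `matchesTarget(["Cat.", "cut"], "cat")` accepts the attempt, while `matchesTarget(["cut", "cap"], "cat")` rejects it. The second case is exactly where short words cause trouble: with no surrounding context the recogniser has little to separate "cat" from "cut" or "cap", and an unfamiliar accent makes the wrong alternatives more likely.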

As luck would have it, we have stumbled on a possible solution called Teachable Machine from Google, which uses an open source AI software library called TensorFlow. We did some basic testing over the break and so far the results are impressive. It looks like, once ‘trained’, the system could be personalised to individual users’ accents. In our tests it could distinguish between four people with different accents saying the same word with a high degree of accuracy. This opens up some intriguing possibilities.
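A trained classifier of this kind typically returns a score for each class it was trained on. The sketch below shows one way such scores could be turned into a decision, accepting the top class only when it clears a confidence threshold. The labels, the scores and the 0.8 threshold are illustrative assumptions, not figures from our tests, and the function is hypothetical rather than part of Teachable Machine or TensorFlow.

```javascript
// Pick the highest-scoring label from an object of { label: score } pairs.
// Returns null when there are no scores or the best score is below the
// threshold, so the app can ask the learner to try again instead of guessing.
function pickLabel(scores, threshold = 0.8) {
  let best = null;
  for (const [label, score] of Object.entries(scores)) {
    if (best === null || score > best.score) {
      best = { label, score };
    }
  }
  if (best === null || best.score < threshold) {
    return null;
  }
  return best.label;
}
```

With made-up scores such as `{ cat: 0.91, cut: 0.06, background: 0.03 }` the function returns `"cat"`; with `{ cat: 0.5, cut: 0.45, background: 0.05 }` it returns `null`, since neither word is a confident match.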

We think this could make our app much more effective: an example of using AI to improve AI. Update: we have recently been successful in getting funding to explore this option – more details to follow.

The classic ’11’ lift sketch about voice recognition and Scots accents


© Citizen Literacy 2020 | info@citizenliteracy.com