Since 2005, members of the open-source community have gathered in Chicago each year for FreeSWITCH ClueCon. This annual conference is “a conference for developers, by developers” focused on FreeSWITCH, the open-source voice over IP technology that powers many telephony platforms (including GreenKey). Our Senior Software Engineer, Patrick Kuca, has been attending the conference since 2016. Here’s a Q&A about his experience at this year’s conference. How do you approach your time at ClueCon?
Voice recognition is only as good as the inputs used to train the technology. That’s why exposing Scribe to hundreds of accents and dialects has been crucial to ensuring its accuracy for transcription services. We started with this line: Please call Stella. Ask her to bring these things with her from the store: six spoons of fresh snow peas, five thick slabs of blue cheese, and maybe a snack for her brother Bob.
Voice assistants are everywhere, especially in people’s homes. We use Alexa, Siri, and Google Home to control our lights, answer trivia questions, and play music. Notably, sales of voice assistant devices have more than doubled in the last year.1 Beyond the home, more people are using voice commands on their phones. By 2020, nearly half of all internet searches are predicted to be voice driven.2 Why type when you can talk?
Most AI researchers I know are addicted to Westworld. Here’s why: it’s a representation of how far technology could go, while staying within the realm of plausible reality. The seamless interaction between humans and AI portrayed by HBO is something that I and other AI researchers strive to achieve. When Logan Delos (a potential investor) meets a room full of Hosts (what Westworld calls robots) and doesn’t realize it until he’s told, his expression is one of amazement and wonder.
At GreenKey, our data science team is constantly focused on one question: How do we make a machine recognize speech as well as humans do? Several companies have shown computers outperforming humans at speech recognition, but these tests normally use specific types of audio and don’t involve noisy environments. The fact is that speech recognition engines have a hard time understanding all speech as well as humans do. Instead, many speech recognition engines are trained to perform well on specific content.
In my 20+ year career, I’ve been lucky enough to be on the cusp of two technology paradigm shifts. In 1997, I got into web development, and in 2010 I began working with mobile. I love the rush of creating something new, when everything you build is breaking ground. In those moments, you feel the great possibility ahead and the rewarding struggle of no easy answers. In 2017, I began to search the market for my next big leap.
Over the last couple of years at GreenKey, I’ve been part of the team building Scribe, a deep-learning AI for nuanced industries. Scribe has been dubbed the “Alexa for Wall Street” by Forbes, and we’re now expanding its focus to emergency services, such as police and fire departments and emergency medical technicians. Scribe allows brokers, traders, analysts, and others at financial institutions to transcribe their telephone conversations in real time and extract important data like quotes and trades with high accuracy.
After the many new rules and regulations promulgated over the last few years, one of the ongoing challenges for most market participants is how to capture, store, analyze and retrieve the vast quantities of data produced by daily communications. All major global regulators (from the US to Europe to Asia) have “record keeping” requirements, many of which have been expanded during the process of financial market regulatory reform.
Humans have been talking for 100,000 years – and now, with the latest developments in machine learning and computing power, machines are smart enough to listen. Not only can machines recognise speech, they can understand the meaning of it. We are on the cusp of a profound change in behaviour. GreenKey offers insights into the world of speech recognition and how we are leveraging ASR to digitize voice in the financial markets and to change the way the markets communicate.
Who is GreenKey really? We are a bunch of telephony, algo trading and web development geeks who wanted to trade energy, which is largely a “call-around” OTC market. We were shocked when everyone told us we needed to set up private lines and purchase $10,000 telephones to talk to the market. So we decided to start a company to fix this problem. At this point, you might be thinking – why does voice matter?