Engineering student's AI model turns American Sign Language into English in real time
Engineering student Priyanjali Gupta has no tall tales about the inspiration behind her AI model, which translates American Sign Language (ASL) into English in real time.
Instead, the driving factor was her mum, who asked her "to do something now that she's studying engineering", a statement echoed by most Indian mums. Gupta is a third-year computer science student specializing in data science at the Vellore Institute of Technology, Tamil Nadu.
That was in February 2021.
"She taunted me. But it made me contemplate what I could do with my knowledge and skillset. One fine day, amid conversations with Alexa, the idea of inclusive technology struck me. That triggered a set of plans," Gupta, from Delhi, told Interesting Engineering.
Fast-forward to February 2022, a year after her mum's gibe, and Gupta had built an AI model using the TensorFlow Object Detection API. It relies on transfer learning from a pre-trained model called ssd_mobilenet. Her LinkedIn post went viral, with more than 58,000 reactions and 1,000 people appreciating her idea, which aims to bridge the communication gap and has created a ripple in inclusive technology.
"The dataset is made manually by running the Image Collection Python file that collects images from your webcam for or all the mentioned below signs in the American Sign Language: Hello, I Love You, Thank you, Please, Yes and No," says her GitHub post.
Gupta credits data scientist Nicholas Renotte's video on Real-Time Sign Language Detection as the inspiration for her model.
"The dataset is manually made with a computer webcam and given annotations. The model, for now, is trained on single frames. To detect videos, the model has to be trained on multiple frames for which I'm likely to use LSTM. I'm currently researching on it," Gupta says. Long short-term memory networks, or LSTMs, are widely regarded as an effective approach to sequence prediction problems in data science.
Gupta acknowledges that building a deep learning model for sign detection from scratch isn't easy. "Making a deep neural network solely for sign detection is rather complex," she told IE. Replying to one commenter in the same vein, she wrote, "I'm just an amateur student but I'm learning. And I believe, sooner or later, our open source community, which is much more experienced than me will find a solution."
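For readers curious what the data side of such a project might look like, here is a minimal Python sketch of the two steps Gupta describes: collecting webcam images for each sign, and grouping single frames into the multi-frame sequences an LSTM would need. It is an illustration, not her code. The folder names, file-naming scheme, default counts and all function names (`plan_image_paths`, `collect_images`, `sliding_windows`) are assumptions, and the capture step assumes OpenCV is installed.

```python
import time
from pathlib import Path
from typing import Dict, List, Sequence

# The six signs named in the GitHub post; these folder-name spellings are assumptions.
SIGNS = ["hello", "iloveyou", "thankyou", "please", "yes", "no"]


def plan_image_paths(root: str, signs: Sequence[str], per_sign: int) -> Dict[str, List[Path]]:
    """Return, for each sign, the file paths where captured frames would be saved."""
    return {
        sign: [Path(root) / sign / f"{sign}_{i:03d}.jpg" for i in range(per_sign)]
        for sign in signs
    }


def collect_images(root: str, signs: Sequence[str] = SIGNS, per_sign: int = 15,
                   camera_index: int = 0, pause_s: float = 2.0) -> None:
    """Capture frames from a webcam and save them under one folder per sign.

    Hypothetical helper: the counts and the pause between shots are assumptions.
    """
    import cv2  # imported lazily so the other helpers work without OpenCV

    cap = cv2.VideoCapture(camera_index)
    try:
        for paths in plan_image_paths(root, signs, per_sign).values():
            for path in paths:
                path.parent.mkdir(parents=True, exist_ok=True)
                time.sleep(pause_s)  # give the signer time to change pose
                ok, frame = cap.read()
                if not ok:
                    raise RuntimeError("could not read a frame from the webcam")
                cv2.imwrite(str(path), frame)
    finally:
        cap.release()


def sliding_windows(frames: Sequence, length: int, step: int = 1) -> List[list]:
    """Group consecutive per-frame items into fixed-length sequences.

    A single-frame model sees one item at a time; a sequence model such as
    an LSTM would instead be trained on windows like these.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [list(frames[i:i + length])
            for i in range(0, len(frames) - length + 1, step)]
```

Calling `sliding_windows([1, 2, 3, 4, 5], 3)` yields three overlapping windows, `[1, 2, 3]`, `[2, 3, 4]` and `[3, 4, 5]`. In a real pipeline each item would be a frame or a feature vector extracted from it, and the collected images would still need the annotations Gupta mentions before training.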
A small step towards inclusivity
Though ASL is claimed to be the third most commonly used language in the United States, behind English and Spanish, applications and technologies that translate it into other languages have yet to catch up. However, the Zoom boom, accelerated by the pandemic, has put sign language in the spotlight. A case in point: Google AI researchers have presented a real-time sign language detection model that can identify people who are signing with up to 91 percent accuracy.
"According to me, researchers and developers are trying their best to find a solution that can be implemented. However, I think the first step would be to normalize sign languages and other modes of communication with the specially-abled and work on bridging the communication gap," Gupta says.