Only a few decades ago AI - Artificial Intelligence - seemed like the stuff of science fiction. Although the field has been around since the 1940s, the use cases that sci-fi writers dreamed of have only started to become reality in the past decade, making AI far more tangible for people other than researchers. Let's take a look at the progress that cemented AI's place in our lives and our future.
2012 can be considered one of the most important years in the history of deep learning. That was the year the true power of convolutional neural networks (CNNs) was recognised at the ImageNet competition, where participants had to build a system that could accurately label the object in an image. The winning network, "AlexNet", was designed by Alex Krizhevsky and published together with Ilya Sutskever and Geoffrey Hinton. It was revolutionary because it slashed the error rate in image recognition: the network mislabelled images only 15.3% of the time (top-5 error), roughly ten percentage points better than the runner-up.
Around the same time, a Google Brain network trained on frames from YouTube videos taught itself to detect cats with 74.8% accuracy and human faces with 81.7% accuracy. This line of work is one of the main reasons we now have facial recognition in our phones. The improved accuracy also allowed researchers to deploy medical imaging models with much greater confidence: CNNs have proved extremely useful in medicine, where they are used for diabetic retinopathy screening, cancer diagnosis, kidney disease detection and AR-assisted surgery, among other things.
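To make the idea concrete, here is a minimal sketch of a convolutional image classifier in PyTorch. It is not AlexNet: the layer sizes, input resolution and number of classes are purely illustrative, but it shows the convolve-pool-classify pattern these networks share.

```python
import torch
import torch.nn as nn

class TinyConvNet(nn.Module):
    """A toy CNN classifier; sizes are illustrative, not AlexNet's."""
    def __init__(self, num_classes: int = 10):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, padding=1),  # learn local image filters
            nn.ReLU(),
            nn.MaxPool2d(2),                             # halve spatial resolution
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.MaxPool2d(2),
        )
        self.classifier = nn.Linear(32 * 8 * 8, num_classes)  # assumes 32x32 inputs

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.features(x)
        return self.classifier(x.flatten(1))

# Classify a batch of 4 random 32x32 RGB "images".
logits = TinyConvNet()(torch.randn(4, 3, 32, 32))
print(logits.shape)  # torch.Size([4, 10])
```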
What CNNs were to image processing and computer vision, attention-based networks are to natural language processing (NLP). The idea of attention in NLP already existed, but in 2017 the paper "Attention Is All You Need" by Vaswani et al. set off a domino effect that enabled machines to understand language like never before. The architecture it introduced, the Transformer, relies solely on the attention mechanism, dispensing with recurrence and convolutions entirely.
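The core of the Transformer can be written in a handful of lines. Below is a sketch of scaled dot-product attention (a single head, no masking), assuming PyTorch; the tensor sizes are arbitrary and chosen only for illustration.

```python
import math
import torch

def scaled_dot_product_attention(q, k, v):
    """Scaled dot-product attention from "Attention Is All You Need": softmax(QK^T / sqrt(d_k)) V."""
    d_k = q.size(-1)
    scores = q @ k.transpose(-2, -1) / math.sqrt(d_k)  # how similar each query is to every key
    weights = torch.softmax(scores, dim=-1)            # how much each position attends to the others
    return weights @ v                                  # weighted sum of the values

# Self-attention over 5 tokens with 16-dimensional representations.
x = torch.randn(1, 5, 16)
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # torch.Size([1, 5, 16])
```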
Thanks to the Transformer, AI can now write convincing fake news and tweets, giving it the potential to cause political instability. On the brighter side, the same advance made language-related tasks such as translation much easier for AI, resulting in noticeably better translation accuracy.
Following this, Google released its BERT model, which the search engine now uses to better interpret queries and rank results. BERT soon became the go-to for NLP models and was quickly adopted by other big companies such as Microsoft and NVIDIA.
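Anyone can experiment with BERT today. The sketch below assumes the Hugging Face transformers library and the publicly released bert-base-uncased checkpoint, and simply extracts contextual embeddings for a sentence.

```python
# Requires the Hugging Face `transformers` library; the checkpoint is downloaded on first use.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("Transformers changed natural language processing.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# One 768-dimensional contextual embedding per token (including [CLS] and [SEP]).
print(outputs.last_hidden_state.shape)  # torch.Size([1, number_of_tokens, 768])
```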
Computers have been beating humans at chess for quite some time now, but over the past decade AI ventured into more sophisticated games such as Jeopardy! and Go.
AlphaGo became the first AI to beat a professional player at Go, widely considered the most difficult board game out there. Earlier in the decade, IBM's Watson outwitted two Jeopardy! champions in a three-day match, winning $77,147 in prize money while its two human opponents collected $24,000 and $21,600. More recently, DeepMind introduced MuZero, a single agent that can master chess, shogi and even Go without being told the rules in advance.
AI has also been instrumental in decoding and understanding the constituents of living organisms. At its core, much of a living organism's behaviour can be traced to its proteins, and these proteins hold the key to fighting many diseases, including pandemics like COVID-19.
Proteins, however, fold into complicated three-dimensional structures, so simulating them computationally can take an enormous amount of time. Google's DeepMind tackled this problem with AlphaFold, which was trained on protein sequences and structures mapped out by scientists across the world. While computer vision helped in diagnosing illnesses, AlphaFold proved essential in assisting researchers with drug discovery.
The Generative Adversarial Network (GAN) was introduced by Ian Goodfellow and his colleagues in 2014. Over the next few years these networks became extremely popular and enabled machines to go beyond the realm of cold logic and experiment with creativity. GANs can generate entirely new faces, swap faces in images and videos, and create paintings; in fact, one painting created by a GAN sold at auction for roughly $432,500.
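The adversarial idea itself fits in a short script. The sketch below shows one toy GAN training step in PyTorch: a generator tries to produce convincing "images" while a discriminator learns to tell them apart from real ones. The network sizes, the random stand-in data and the hyperparameters are all illustrative.

```python
import torch
import torch.nn as nn

# Toy generator and discriminator for flattened 28x28 grayscale images.
G = nn.Sequential(nn.Linear(64, 256), nn.ReLU(), nn.Linear(256, 28 * 28), nn.Tanh())
D = nn.Sequential(nn.Linear(28 * 28, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)

real = torch.rand(32, 28 * 28)   # stand-in for a batch of real images
fake = G(torch.randn(32, 64))    # generator maps random noise to fake images

# Discriminator step: label real images 1 and generated ones 0.
d_loss = bce(D(real), torch.ones(32, 1)) + bce(D(fake.detach()), torch.zeros(32, 1))
opt_d.zero_grad()
d_loss.backward()
opt_d.step()

# Generator step: try to fool the discriminator into outputting 1 for the fakes.
g_loss = bce(D(fake), torch.ones(32, 1))
opt_g.zero_grad()
g_loss.backward()
opt_g.step()
```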
The downside to this creative power is the rise of so-called deepfakes: doctored videos in which people's faces, and even their voices, are swapped onto something completely different. Deepfakes were quickly put to malicious use, and the situation became serious enough that big companies like Adobe had to develop techniques to spot the fakes.