Automation: we come across this word in the media all the time.
On one side, the science sections of newspapers write about automation with excitement, emphasizing how technological progress has led us to a point where human workers could be replaced by fully automated machines. On the other side, the economy sections treat the word with great concern: robots could take over our jobs!
Inside automated machines runs software equipped with algorithms that enable robots to learn and take control with very little input from human operators. Indeed, machine learning is at the core of Industry 4.0, the digital revolution.
This post first explains how machine learning regained the limelight after a few decades of downtime since the 1980s, and then gives a brief introduction to its categories and the ways machine learning powers real-world applications.
Social impact and consequences of automation
According to social science research, it seems evident that robots will replace some existing human jobs. It is predicted that 6% of jobs in the U.S. will be eliminated by automation within the next five years. The scene of robots and machines replacing human workers in various industry sectors is nothing new: this kind of replacement already happened in the 19th century. So why do the media frame this unsurprising change under the name of Industry 4.0, the digital revolution?
In the era of Industry 4.0, computers and automated machines play an entirely different role than before. Automation in earlier decades was about manufacturing companies using machines to make their production lines more efficient, or high-tech companies using supercomputers to compute given tasks millions of times faster than ordinary personal computers could. However, the automated systems and computers prevalent in the Industry 4.0 era are capable of more than high-performance computing. Now we have machines that can learn by themselves. This capability of self-learning creates a huge potential to replace human workers with robots.
Self-learning is considered one of the key aspects of human intelligence, and we used to believe that machines would not acquire this skill easily or any time soon. However, when DeepMind’s AlphaGo defeated the human Go champion Lee Sedol in March 2016, it alerted the world that the human-level intelligence we considered exclusively ours may not be so exclusive anymore.
Since then, the fear of Artificial Intelligence has been amplified. In fact, job replacement by machines might happen not only for low-wage jobs, but also for highly skilled professions. The financial, medical, legal and accounting sectors all face the prospect of jobs being decimated in the next decade. One Oxford study concluded that accountants have a 95% chance of losing their jobs to automation in the future. In a nutshell: it is coming!
What is Machine Learning and what can you do with it?
Although there are many definitions of machine learning out there, my favorite is the one from Prof. Tommi Jaakkola (who teaches Machine Learning in Big Data and Text Processing at MIT):
“Machine learning is not necessarily about solving problems; rather, it is about finding a good representation of the given data. The ultimate reason for finding a good representation is to generalize it: hopefully, we can then predict the pattern of unseen data.”
In general, we categorize machine learning into four subcategories: supervised learning, unsupervised learning, transfer learning and reinforcement learning. (Some add semi-supervised learning as an independent category; however, I consider it a type of supervised learning.)
Supervised and Unsupervised Learning
Supervised learning and unsupervised learning are the traditional learning types. The difference between them is whether the training dataset is labelled or not: if the training data used to build the model is labelled, it is supervised learning; otherwise, it is unsupervised learning.
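The distinction becomes concrete with a toy sketch. Everything below is invented for illustration (no real library is assumed): a tiny nearest-neighbour predictor stands in for supervised learning, and a crude two-means clustering stands in for unsupervised learning.

```python
def nearest_neighbor_predict(train, query):
    """Supervised: training points come WITH labels, so we can predict one."""
    point, label = min(train, key=lambda pl: (pl[0] - query) ** 2)
    return label

def two_means(points, iterations=10):
    """Unsupervised: no labels, so we can only discover structure (two clusters)."""
    c1, c2 = points[0], points[-1]          # crude initial centroids
    for _ in range(iterations):
        g1 = [p for p in points if abs(p - c1) <= abs(p - c2)]
        g2 = [p for p in points if abs(p - c1) > abs(p - c2)]
        c1 = sum(g1) / len(g1)
        c2 = sum(g2) / len(g2)
    return sorted([c1, c2])

# Labelled data -> supervised learning: we can answer "what is 1.1?"
labelled = [(1.0, "small"), (1.2, "small"), (9.8, "large"), (10.1, "large")]
print(nearest_neighbor_predict(labelled, 1.1))   # small

# The same numbers without labels -> unsupervised learning: we only find groups
unlabelled = [1.0, 1.2, 9.8, 10.1]
print(two_means(unlabelled))                     # two cluster centres, near 1.1 and 9.95
```

Note how the same four numbers support two very different questions: with labels we predict, without labels we merely group.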
Reinforcement learning is a different breed from supervised or unsupervised learning. The learning system is called an agent; it observes the environment and can select and perform actions. Depending on the actions the agent takes, it receives rewards or penalties in return. It must learn by itself the sequence of actions that reaches the ultimate goal, and the best sequence leading to the goal becomes the best strategy (the policy).
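The agent-reward-policy loop can be sketched with tabular Q-learning on a made-up toy problem: a five-cell corridor where the agent starts at cell 0 and receives a reward only upon reaching cell 4. All numbers and names here are assumptions for illustration, not a production algorithm.

```python
import random

N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]                       # step left or step right
alpha, gamma, epsilon = 0.5, 0.9, 0.3    # learning rate, discount, exploration

# Q-table: expected long-term reward of taking action a in state s
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

random.seed(0)
for episode in range(300):
    s = 0
    while s != GOAL:
        # epsilon-greedy: mostly exploit the best known action, sometimes explore
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s_next = min(max(s + a, 0), N_STATES - 1)
        reward = 1.0 if s_next == GOAL else 0.0
        # update: move Q(s,a) toward reward + discounted best future value
        best_next = max(Q[(s_next, act)] for act in ACTIONS)
        Q[(s, a)] += alpha * (reward + gamma * best_next - Q[(s, a)])
        s = s_next

# The learned policy (best action per state) should be "go right" everywhere
policy = [max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(GOAL)]
print(policy)
```

After training, the best strategy (the policy) emerges purely from trial, reward and penalty: the agent was never told that "go right" is correct.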
Transfer learning is a machine learning type that is rather recent compared to the others. It builds on the idea of induction: it focuses on storing the knowledge gained while solving one problem and applying it to a different but related problem. Transfer learning is getting attention because of its “economic” approach: a proven model’s knowledge can be transferred to solve different problems without spending much time on training or tweaking parameters, which can save a lot of cost.
Why Machine learning now?
Machine learning theory has been around for several decades already. Linear regression, neural networks and reinforcement learning are nothing new. Why, then, is machine learning suddenly getting all this limelight?
1. Big Data
One important reason is the volume of available data. Terabytes of Big Data can be accessed with a few clicks. Granted, we were already used to having a lot of data, but the data our applications used to handle was mostly structured. In other words, data was stored in limited formats and in limited storage types such as databases and data warehouses.
Thanks to Big Data technology, we can now store any type of data, regardless of whether it is structured or unstructured, text or non-text. Storage locations can be either on-premise or cloud-based.
2. Computing Power
The second reason is advanced computing power, especially the Graphics Processing Unit (GPU). A GPU is a specialized electronic circuit designed to rapidly manipulate and alter memory to accelerate the creation of images in a frame buffer. It was originally intended for output to a display device, but modern machine learning algorithms such as deep learning require high-throughput computation because their models have millions of parameters to be computed in parallel. The GPU’s highly parallel structure therefore makes it more efficient than a general-purpose CPU for algorithms that process large blocks of data in parallel.
3. Large Model with Deep Layers
The third reason is that large machine learning models are relatively easier to train than smaller models. “Large” models have more parameters and are designed with a more complex hierarchy in their architecture.
If a model has a limited parameter set and shallow layers, it is relatively difficult to find the optimal parameter values. This reason is linked with the second one.
When computing power was not strong enough to handle complex machine learning models, scientists had to build smaller models and, as a consequence, the problem sets that could be solved were not very complex. Machine learning theory was therefore considered impractical for solving real-world problems.
Applications of Machine learning
Some of the most remarkable advances in machine learning in recent years are happening in the field of Natural Language Processing (NLP). Behind Google Translate, a deep learning algorithm runs to learn all types of human languages. The performance of NLP applications such as Siri and Alexa improves the more users use them.
Image recognition techniques are an essential part of self-driving cars. The machine learning algorithms for image recognition are usually implemented using neural networks, more precisely, deep learning.
Fraud detection is a crucial part of the finance and security business. Machine learning models for fraud detection are usually built with logistic classification techniques.
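To give a feel for logistic classification, here is a deliberately tiny sketch in the spirit of fraud scoring. The single feature (a normalised transaction amount), the labels and the threshold are all invented for illustration; real fraud models use many features and far more data.

```python
import math

def sigmoid(z):
    """Squash a score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))

# Toy training set: (normalised transaction amount, is_fraud label)
data = [(0.1, 0), (0.3, 0), (0.4, 0), (0.8, 1), (0.9, 1), (1.0, 1)]

w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):                       # batch gradient descent
    grad_w = grad_b = 0.0
    for x, y in data:
        err = sigmoid(w * x + b) - y        # prediction error on this example
        grad_w += err * x
        grad_b += err
    w -= lr * grad_w / len(data)
    b -= lr * grad_b / len(data)

def predict_fraud(amount, threshold=0.5):
    """Flag the transaction if its predicted fraud probability crosses the threshold."""
    return sigmoid(w * amount + b) >= threshold

print(predict_fraud(0.2))    # small transaction: looks legitimate
print(predict_fraud(0.95))   # large transaction: flagged
```

The model outputs a probability rather than a hard yes/no, which is exactly why logistic classification suits fraud work: the threshold can be tuned to trade false alarms against missed fraud.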
Prediction and forecasting problems, such as weather forecasting or supply chain management, can be solved with regression models.
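As a minimal example of regression-based forecasting, the sketch below fits a straight line to made-up weekly demand figures by ordinary least squares and extrapolates one week ahead. The numbers are invented; a real supply chain model would account for seasonality and far more variables.

```python
# Invented demand history: units sold over six weeks, roughly linear growth
weeks  = [1, 2, 3, 4, 5, 6]
demand = [12.0, 14.1, 15.9, 18.2, 20.0, 21.8]

n = len(weeks)
mean_x = sum(weeks) / n
mean_y = sum(demand) / n

# Ordinary least squares for the line y = slope * x + intercept
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(weeks, demand))
         / sum((x - mean_x) ** 2 for x in weeks))
intercept = mean_y - slope * mean_x

forecast_week7 = slope * 7 + intercept
print(round(forecast_week7, 1))   # about 23.9: demand grows roughly 2 units/week
```

The same closed-form fit underlies far bigger forecasting systems; what changes in practice is the number of input variables and the model family, not the basic idea of fitting past data to predict the future.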
There is huge potential in the health industry for applying machine learning to diagnose diseases with high precision, to predict diseases early enough, and to invent new personalized medicines.
Last but not least, one of the most prevalent applications of machine learning in our daily lives is advertising. The main technology behind it is machine-learning-driven recommendation systems, which companies like Netflix, Amazon and YouTube have all implemented.
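As a taste of how a recommendation system works, here is a bare-bones similarity-based sketch: find the user whose ratings look most like yours, then suggest what they liked and you have not seen. The ratings matrix is invented, and real systems at the companies mentioned above are vastly more sophisticated.

```python
import math

# Invented user-item ratings (1-5 stars)
ratings = {
    "alice": {"Matrix": 5, "Titanic": 1, "Inception": 4},
    "bob":   {"Matrix": 4, "Titanic": 2, "Inception": 5, "Up": 4},
    "carol": {"Matrix": 1, "Titanic": 5, "Notebook": 5},
}

def cosine(u, v):
    """Cosine similarity over the items both users have rated."""
    common = set(u) & set(v)
    if not common:
        return 0.0
    dot = sum(u[i] * v[i] for i in common)
    return dot / (math.sqrt(sum(u[i] ** 2 for i in common)) *
                  math.sqrt(sum(v[i] ** 2 for i in common)))

def recommend(user):
    # Pick the most similar other user and suggest their unseen items
    others = [u for u in ratings if u != user]
    nearest = max(others, key=lambda u: cosine(ratings[user], ratings[u]))
    return [item for item in ratings[nearest] if item not in ratings[user]]

print(recommend("alice"))   # bob rates films most like alice does, so she gets his picks
```

This "people like you also liked" intuition is the seed of collaborative filtering, which the next post explores in detail.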
In the next post we will dive into the details of Collaborative Filtering, one of the most widely used recommendation algorithms.