In the digital society we increasingly live in, it is no exaggeration to say that machine learning (ML) is everywhere around us. Social media, marketing, route navigation, autopilot systems, and banking or healthcare mobile apps are all software solutions built using artificial intelligence (AI). As Michael Kanaan explains in his 2020 book, ML apps replicate various natural human processing and evaluative skills and get better with each passing day. ML helps us navigate a world full of AI and offers a way to mimic aspects of the human mind.
Before the information age, terms like intelligence, thinking or learning were associated only with human or animal brains. Today, technology surrounds us and is an integral part of our lives, which is why customers crave meaningful solutions that bring added value. From my experience working in a bespoke software development company, I know that technology plays a vital part in the success of modern businesses. One way to future-proof your products or services is to design them using the capabilities of AI and ML. Here is a short machine learning guide for beginners as an orientation in this trendy and complex area.
Difference Between AI and ML
The two terms overlap, and some people even use them interchangeably. Machine learning is a branch of artificial intelligence and, as such, shares many of its principles. For example, AI enables computers to gather data, access it, solve problems, or predict outcomes based on facts. To determine whether a statement is true or false, appropriate or not, and to make logical choices, computers need to possess specific field knowledge.
How can a machine achieve this expert-level knowledge and be capable of making informed decisions? The simple answer is through the mathematical algorithms that power machine learning. Some of the basic and most popular machine learning algorithms are K-Nearest Neighbours (KNN), Support Vector Machines (SVM), linear regression, K-means and decision trees. Which algorithms and techniques you need depends on your specific area of work and the data you have gathered. As always, if you’re new to machine learning, start with the very basics and gradually immerse yourself deeper with practical projects.
Machine Learning Approaches
There are four major types of ML:
Supervised learning encompasses models trained on previously labelled data, which consists of independent variables (features) and dependent variables (labels). As training progresses, the model learns to predict labels for new data from the labelled examples it has already seen. Some of the most common algorithms used in supervised learning are Stochastic Gradient Descent, Random Forests and Neural Networks.
A practical example from sports analytics: researchers at Vrije University in Amsterdam used supervised learning to classify limbs and techniques in a kickboxing experiment. Beginners and professionals struck a boxing bag from various distances, producing a dataset of nearly 4000 limb trajectories. Then, using KNN and SVM classifiers, the researchers grouped the trajectory data and successfully trained the software to classify and predict the limbs and techniques used for the strikes (e.g. uppercut, front kick, spinning kick, superman punch) with 86% accuracy.
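To make the KNN idea concrete, here is a minimal sketch in plain Python. It is not the researchers' code: the two features (strike height and peak speed), the toy values and the function name `knn_predict` are all assumptions made for illustration. A new sample simply receives the majority label of its k closest labelled neighbours.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Pair every training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(point, query), label)
        for point, label in zip(train_features, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical toy data (NOT from the study): (strike height in m, peak speed in m/s).
features = [(1.6, 9.0), (1.5, 8.5), (1.7, 9.5), (0.9, 14.0), (1.0, 13.0), (0.8, 15.0)]
labels = ["punch", "punch", "punch", "kick", "kick", "kick"]

prediction = knn_predict(features, labels, query=(1.55, 9.2), k=3)
print(prediction)
```

In a real project the features would usually be scaled first, because KNN is sensitive to features measured in different units.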
Unsupervised learning, by contrast, assumes that there is no labelled training data. The algorithm has to discover structure in the data on its own, typically by grouping similar data points (clustering). Algorithm examples here are Hierarchical Clustering, K-Means Clustering and Affinity Propagation.
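As an illustration of how a clustering algorithm finds groups without any labels, the following is a small K-Means sketch in plain Python. The data points, the choice of starting centroids and the function name `k_means` are assumptions for this example; production code would normally rely on a tested library and a smarter initialisation.

```python
import math


def k_means(points, initial_centroids, iterations=10):
    """Alternate between assigning points to the nearest centroid and recomputing centroids."""
    centroids = list(initial_centroids)
    assignments = []
    for _ in range(iterations):
        # Assignment step: index of the closest centroid for every point.
        assignments = [
            min(range(len(centroids)), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for c in range(len(centroids)):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster ends up empty
                centroids[c] = tuple(sum(axis) / len(members) for axis in zip(*members))
    return centroids, assignments


# Hypothetical 2-D points forming two visually separate groups.
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2), (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]
centroids, assignments = k_means(points, initial_centroids=[points[0], points[3]])
print(assignments)
```

Note that no labels are passed in: the cluster indices in `assignments` come purely from distances between the points.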
Semi-supervised learning is a mixture of the first two types. It is often employed when only a small amount of the available data is labelled and labelling the rest would be expensive. Essentially, this approach aims to label the unlabelled data by exploiting the patterns and structure within the dataset. Widely used methods here are label propagation, low-density separation and graph-based methods.
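The idea of spreading a few known labels to unlabelled data, usually called semi-supervised learning, can be sketched as follows. This is a deliberately simplified nearest-neighbour variant, not the full graph-based label propagation algorithm: on each pass it labels the one unlabelled point that lies closest to an already labelled point. The function name `propagate_labels` and the toy data are assumptions.

```python
import math


def propagate_labels(points, labels):
    """Fill in None labels by repeatedly copying the label of the closest labelled point."""
    labels = list(labels)
    while None in labels:
        best = None  # (distance, index of unlabelled point, label to copy)
        for i, label_i in enumerate(labels):
            if label_i is not None:
                continue
            for j, label_j in enumerate(labels):
                if label_j is None:
                    continue
                d = math.dist(points[i], points[j])
                if best is None or d < best[0]:
                    best = (d, i, label_j)
        if best is None:
            break  # nothing is labelled, so there is nothing to propagate
        labels[best[1]] = best[2]
    return labels


# Hypothetical 1-D data: only the two end points carry a label.
points = [(0.0,), (1.0,), (2.0,), (8.0,), (9.0,), (10.0,)]
partial = ["a", None, None, None, None, "b"]
full = propagate_labels(points, partial)
print(full)
```

Because newly labelled points can pass their label on, labels travel along dense regions of the data, which is the intuition behind the methods named above.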
Last but not least, we have reinforcement learning, where the computer program needs to learn a control strategy (policy) in a dynamic environment (e.g. self-driving cars) in order to reach a certain goal state. The ML system continuously gets feedback from its surroundings and receives rewards or punishments for desired or undesired actions. Among the most widely used frameworks and methods here are the Markov decision process (MDP), dynamic programming (DP) and Monte Carlo methods.
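To show how a policy can be derived from rewards, here is a small dynamic programming sketch (value iteration) on a toy MDP. The environment is an assumption invented for this example: a corridor of five cells where the agent moves left or right, the last cell is the goal, and reaching it yields a reward of 1. The function name and the discount factor `gamma` are likewise illustrative.

```python
def value_iteration(n_states=5, gamma=0.9, tolerance=1e-9):
    """Compute state values and a greedy policy for a deterministic corridor MDP."""
    goal = n_states - 1
    actions = {"left": -1, "right": +1}
    values = [0.0] * n_states  # the terminal goal state keeps value 0

    def step(state, move):
        # Moves are deterministic; walls clamp the agent inside the corridor.
        next_state = min(max(state + move, 0), goal)
        reward = 1.0 if next_state == goal else 0.0
        return next_state, reward

    def action_value(state, name):
        next_state, reward = step(state, actions[name])
        return reward + gamma * values[next_state]

    while True:
        delta = 0.0
        for state in range(goal):
            # Bellman optimality update: best achievable value over all actions.
            best = max(action_value(state, name) for name in actions)
            delta = max(delta, abs(best - values[state]))
            values[state] = best
        if delta < tolerance:
            break

    # The greedy policy picks the action with the highest value in each state.
    policy = {
        state: max(actions, key=lambda name: action_value(state, name))
        for state in range(goal)
    }
    return values, policy


values, policy = value_iteration()
print(values, policy)
```

States closer to the goal end up with higher values because the reward is discounted less, so the greedy policy moves right in every cell.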
Core Tools for Machine Learning
Take your time before starting your very first machine learning project and ask yourself what problems you wish to solve. Will the solution you’re building save you or your clients precious time and effort and allow you to focus on more important business aspects? Entering the field of AI involves a steep learning curve, so bear in mind the challenges that await your in-house software development team if ML is not yet in their professional skill set.
Nowadays, learning resources and practical tools for building ML projects are abundant online. You can easily find self-study guides suited to your previous field experience and technical expertise. Google also provides great free resources on ML engineering best practices for those who are just switching to the field. For example, Google Colab is a cloud-based environment for creating ML apps that lets you use its GPUs to reduce processing time significantly. Still, what are some of the most important concepts and tools to jumpstart your next ML project?
Pattern Recognition
Pattern recognition is an ML method for defining and refining algorithms. For example, in image recognition, computer vision takes signals from the visual world, extracts tangible patterns from them, and responds accordingly. Because it is based entirely on data, a pattern recognition system is used to present data and theoretical hypotheses that other branches of machine learning later build upon. Most importantly, ML-based software needs to learn to recognise patterns from a different angle, at a distance, at night and even when an object is partially covered. Among the best tools for pattern recognition in ML are Amazon Lex, Google Cloud AutoML and Microsoft Azure Machine Learning Studio.
Clustering
Clustering, as mentioned above, is used mainly in unsupervised learning and occasionally in reinforcement learning. Theoretically speaking, data points in the same group share similar characteristics or properties, but each also has its own individual features. Cluster analysis is a mechanism for building data clusters based on similar criteria, e.g. detecting spam emails based on header attributes or text length. The crucial pre-clustering steps are identifying missing data and deciding what to do with it, reducing the number of features to counter the curse of dimensionality, and normalising the data, which includes deciding how to handle categorical variables.
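Some of these preparation steps can be sketched in a few lines of plain Python. The example below covers mean imputation of missing values, min-max normalisation and one-hot encoding of a categorical variable; it does not cover dimensionality reduction. The email attributes, the values and the helper names are hypothetical and only serve the illustration.

```python
def impute_with_mean(column):
    """Replace missing values (None) with the mean of the known values."""
    known = [v for v in column if v is not None]
    mean = sum(known) / len(known)
    return [mean if v is None else v for v in column]


def min_max_normalise(column):
    """Rescale values linearly to the range [0, 1]."""
    low, high = min(column), max(column)
    if high == low:
        return [0.0 for _ in column]  # a constant column carries no information
    return [(v - low) / (high - low) for v in column]


def one_hot(column):
    """Turn a categorical column into one 0/1 indicator per category."""
    categories = sorted(set(column))
    return [[1.0 if v == c else 0.0 for c in categories] for v in column]


# Hypothetical email attributes: text length (one value missing) and sender type.
text_length = [120, None, 480, 300]
sender_type = ["known", "unknown", "unknown", "known"]

length_clean = min_max_normalise(impute_with_mean(text_length))
sender_encoded = one_hot(sender_type)

# One numeric feature row per email, ready for a distance-based clustering step.
rows = [[length] + sender for length, sender in zip(length_clean, sender_encoded)]
print(rows)
```

Mean imputation is only one possible way to handle missing data; dropping incomplete rows or using a model-based estimate are common alternatives.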
Computational Learning Theory
Computational learning theory (CoLT) is mainly applied to supervised learning and brings formal mathematical approaches to ML. It aims to quantify learning problems by characterising tasks according to their level of difficulty. A famous CoLT framework is Probably Approximately Correct (PAC) learning, created by Leslie Valiant, which seeks to approximate an unknown mapping function from inputs to outputs. This model examines how much data and computational power are required to find a suitable hypothesis (a fitted model) that is close to the unknown target function.
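One well-known PAC result can be turned into a short calculation. For a finite set of candidate hypotheses H and a learner that always returns a hypothesis consistent with the training data, m ≥ (ln|H| + ln(1/δ)) / ε examples are enough to guarantee, with probability at least 1 − δ, an error of at most ε. The sketch below evaluates that bound; it addresses the amount of data rather than computation, and the function name and the example numbers are assumptions.

```python
import math


def pac_sample_bound(hypothesis_count, epsilon, delta):
    """Sufficient number of examples for a consistent learner over a finite hypothesis set.

    epsilon: maximum tolerated error ("approximately correct")
    delta:   maximum tolerated failure probability ("probably")
    """
    if hypothesis_count < 1 or not (0 < epsilon < 1) or not (0 < delta < 1):
        raise ValueError("need hypothesis_count >= 1 and 0 < epsilon, delta < 1")
    return math.ceil((math.log(hypothesis_count) + math.log(1.0 / delta)) / epsilon)


# Hypothetical setting: 1000 candidate hypotheses, 10% error, 95% confidence.
needed = pac_sample_bound(hypothesis_count=1000, epsilon=0.1, delta=0.05)
print(needed)
```

The bound grows only logarithmically with the number of hypotheses but linearly with 1/ε, so demanding a smaller error is what drives the data requirement up fastest.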
Biography Aleksandrina Vasileva
Aleksandrina is a Content Creator at Dreamix, a custom software development company, and is keen on innovative technological solutions with a positive impact on our world. Her teaching background, mixed with interests in psychology, drives her to share knowledge. She is an avid reader and enthusiastic blogger, always looking for the next inspiration.