The new digital revolution: deep learning

(To Andrea Piras)
07/02/22

Recognizing a photo, a song, or a user's habits: with artificial intelligence this is already possible. But why does it matter, and how is it changing our way of life?

Before answering this question we need to take a step back and explain the difference between Artificial Intelligence (AI), Machine Learning (ML) and Deep Learning (DL), terms that are often confused but each have a precise meaning.

To explain the basic idea, let's use an image (shown at the opening of this article) taken from the NVIDIA website.

The image makes clear that AI is a general concept that contains ML, which in turn is more general than DL. And that is not all: as we can see, the first deep learning algorithms were born just over ten years ago, whereas artificial intelligence dates back to around the 1950s, with the first languages such as LISP and PROLOG, created with the aim of imitating the capabilities of human intelligence.

The first artificial intelligence algorithms were limited to carrying out a finite number of possible actions according to a logic defined by the programmer (as in the game of checkers or chess).

With machine learning, artificial intelligence evolved through so-called supervised and unsupervised learning algorithms, whose aim is to build mathematical models that learn automatically from large amounts of input data, data that constitute the "experience" of the artificial intelligence.

In supervised learning, to create the model it is necessary to train the AI by assigning a label to each element: for example, if I want to classify fruit, I will take pictures of many different apples and label each one "apple", and do the same for pears, bananas, and so on.

In unsupervised learning the process works the other way around: the model is built starting from many different images of fruit, and it must derive the groupings itself, based on the characteristics that apples, pears and bananas have in common.
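To make the difference concrete, here is a minimal sketch in Python with scikit-learn; the fruit features (weight in grams, a "yellowness" score) and all the values are invented for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Toy fruit data: [weight in grams, yellowness 0-1] -- invented values.
X = np.array([[150, 0.1], [170, 0.2],   # apples
              [120, 0.9], [130, 0.8]])  # bananas

# Supervised: we supply the labels ourselves.
y = ["apple", "apple", "banana", "banana"]
clf = LogisticRegression().fit(X, y)
print(clf.predict([[160, 0.15]]))        # likely ['apple']

# Unsupervised: no labels; the algorithm groups similar fruit on its own.
km = KMeans(n_clusters=2, n_init=10).fit(X)
print(km.labels_)                        # e.g. [0 0 1 1] -- clusters, not names
```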

Supervised machine learning models are already used in antivirus software and spam filters, but also in marketing, for example for the products suggested by Amazon.

The example of the spam filter

The idea behind an email spam filter is to train a model that "learns" from hundreds of thousands (if not millions) of emails, each labeled as spam or legitimate. Once the model has been trained, classifying a new email involves the following steps (sketched in code after the list):

1. Extract distinctive characteristics (called features), such as the words of the text, the sender of the email, the source IP address, etc.

2. Assign a "weight" to each extracted feature (for example, among the 1000 words of a text, some are more discriminating than others: words like "viagra" or "porn" will carry a different weight than "good morning" or "university").

3. Execute a mathematical function which, taking the features (words, sender, etc.) and their respective weights as input, returns a numeric value.

4. Check whether this value is above or below a certain threshold to determine if the email is legitimate or to be considered spam (classification).
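As a toy illustration of these four steps, here is a minimal Python sketch; the words, weights, bias and threshold are invented here, whereas a real filter learns them from labeled emails.

```python
import math

# Hypothetical weights "learned" during training: spammy words get
# positive weights, harmless ones negative (invented values).
weights = {"viagra": 4.0, "winner": 2.5, "university": -1.5, "meeting": -2.0}
bias = -1.0  # baseline score before any word is seen

def classify(text, threshold=0.5):
    words = text.lower().split()                        # 1) feature extraction
    z = bias + sum(weights.get(w, 0.0) for w in words)  # 2-3) weighted sum...
    p = 1 / (1 + math.exp(-z))                          # ...squashed into 0..1
    return p, ("spam" if p >= threshold else "legitimate")  # 4) threshold check

print(classify("cheap viagra winner"))       # high score -> spam
print(classify("university meeting today"))  # low score  -> legitimate
```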

Artificial neurons

As mentioned, deep learning is a branch of machine learning. The difference lies in the computational complexity: huge amounts of data come into play, with a "layered" learning structure made of artificial neural networks. To understand this concept, let's start from the idea of replicating a single human neuron, as in the figure below.

As seen previously for machine learning, we have a series of input signals (on the left of the image) to which we associate different weights (Wk); we add a "bias" (bk), a sort of offset; and finally we apply an activation function, i.e. a mathematical function such as the sigmoid, hyperbolic tangent, ReLU, etc., which takes the weighted inputs plus the bias and returns an output (yk).
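In Python, the single neuron boils down to a few lines; the input values, the weights Wk and the bias bk below are invented, and the sigmoid plays the role of the activation function.

```python
import numpy as np

def neuron(x, w, b):
    """A single artificial neuron: weighted inputs, bias, activation."""
    z = np.dot(w, x) + b          # weighted sum of inputs plus bias b_k
    return 1 / (1 + np.exp(-z))   # sigmoid activation -> output y_k

x = np.array([0.5, 0.3, 0.2])     # input signals
w = np.array([0.4, -0.6, 0.9])    # weights W_k (invented values)
b = 0.1                           # bias b_k
print(neuron(x, w, b))            # y_k, a value between 0 and 1 (about 0.574)
```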

This is the single artificial neuron. To create a neural network, the output of each neuron is connected to the inputs of the neurons in the next layer, forming a dense network of connections, as shown in the figure below, which represents an actual Deep Neural Network.

Deep Learning

As we can see from the figure above, we have a set of inputs supplied to the neural network (the input layer), then intermediate levels called hidden layers, which represent the "layers" of the model, and finally an output layer capable of discriminating (or recognizing) one object from another. We can think of each hidden layer as adding learning capacity: the higher the number of intermediate layers (i.e. the deeper the model), the more accurate the understanding, but also the more complex the calculations to be performed.

Note that the output layer represents a set of output values with associated probabilities: for example, 95% apple, 4.9% pear, 0.1% banana, and so on.
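To make the layered structure and the probabilistic output concrete, here is a minimal NumPy sketch of a forward pass; the layer sizes are arbitrary and the weights are random, standing in for a trained model.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(z):
    return np.maximum(0, z)

def softmax(z):
    e = np.exp(z - z.max())   # subtract the max for numerical stability
    return e / e.sum()

# Random weights stand in for a trained model:
# 4 inputs -> 8 hidden -> 8 hidden -> 3 outputs (apple, pear, banana).
W1, b1 = rng.normal(size=(8, 4)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 8)), np.zeros(8)
W3, b3 = rng.normal(size=(3, 8)), np.zeros(3)

x = rng.normal(size=4)            # input layer
h1 = relu(W1 @ x + b1)            # first hidden layer
h2 = relu(W2 @ h1 + b2)           # second hidden layer
probs = softmax(W3 @ h2 + b3)     # output layer: a probability per class
print(dict(zip(["apple", "pear", "banana"], probs.round(3))))
```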

Let's imagine a DL model in the field of computer vision: the first layer recognizes the edges of an object, the second layer uses the edges to recognize shapes, the third layer uses the shapes to recognize complex objects composed of several shapes, the fourth layer uses the complex shapes to recognize details, and so on. In defining a model there is no fixed number of hidden layers; the limit is imposed by the computing power required to train the model in a reasonable time.
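As a hedged sketch of such a vision model, here is a tiny convolutional network in PyTorch; the layer sizes are arbitrary, and the comments map each convolutional layer to the edges, shapes, objects intuition, which in real networks holds only approximately.

```python
import torch
import torch.nn as nn

# Each convolutional layer can learn increasingly complex patterns.
model = nn.Sequential(
    nn.Conv2d(3, 16, kernel_size=3, padding=1), nn.ReLU(),   # ~edges
    nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),  # ~shapes
    nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),  # ~objects
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(64, 3),                 # 3 classes: apple, pear, banana
)
image = torch.randn(1, 3, 64, 64)     # one fake 64x64 RGB image
print(model(image).softmax(dim=1))    # class probabilities
```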

Without going into too many specifics, training a neural network aims to calculate all the weights and biases to be applied to every single neuron in the model: it is therefore evident that the complexity grows rapidly as the number of hidden layers increases. For this reason, graphics processors (GPUs) are used for training: unlike CPUs, these cards can perform thousands of operations in parallel using SIMD (Single Instruction Multiple Data) architectures, as well as modern technologies such as Tensor Cores, which perform matrix operations in hardware.
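A minimal PyTorch training loop on fake data gives an idea of what "calculating all the weights and biases" looks like in practice; the backward pass computes a gradient for every parameter, and the code moves to the GPU automatically when one is available.

```python
import torch
import torch.nn as nn

device = "cuda" if torch.cuda.is_available() else "cpu"  # use the GPU if present

model = nn.Sequential(nn.Linear(4, 8), nn.ReLU(), nn.Linear(8, 3)).to(device)
opt = torch.optim.SGD(model.parameters(), lr=0.1)
loss_fn = nn.CrossEntropyLoss()

X = torch.randn(100, 4, device=device)          # fake training data
y = torch.randint(0, 3, (100,), device=device)  # fake labels

for epoch in range(50):
    opt.zero_grad()
    loss = loss_fn(model(X), y)  # how wrong the current weights are
    loss.backward()              # gradients for every weight and bias
    opt.step()                   # adjust all weights and biases
print(loss.item())               # the loss decreases as training proceeds
```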

Deep Learning Applications

Because they are trained on huge amounts of data, these models are highly tolerant of faults and noise, even with incomplete or inaccurate data. They therefore provide fundamental support in many fields of science. Let's look at some of them.

Image classification and security

In criminal investigations, deep learning allows a face to be recognized from an image captured by a surveillance camera by comparing it against a database of millions of faces: done manually, this operation could take days, months or even years. Moreover, through image reconstruction, some models can colorize missing parts of an image, recovering colors now very close to the originals.

Natural Language Processing

This is the ability of a computer to understand written and spoken language much as humans would. Among the most famous systems are Alexa and Siri, able not only to understand questions of various kinds but also to answer them.

Other models perform sentiment analysis, again using techniques that extract opinions from text or individual words.
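A minimal sentiment analysis sketch with scikit-learn, trained on a handful of invented sentences; real systems learn from millions of reviews.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Tiny invented training set.
texts = ["great product, I love it", "terrible, waste of money",
         "excellent quality", "awful experience, very bad"]
labels = ["positive", "negative", "positive", "negative"]

# Bag-of-words features plus a linear classifier.
model = make_pipeline(CountVectorizer(), LogisticRegression())
model.fit(texts, labels)
print(model.predict(["I love the excellent quality"]))  # likely ['positive']
```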

Medical diagnoses

In the medical field, these models are now used to support diagnoses, including the analysis of CT or MRI scans. When the output layer reports a confidence of 90-95%, in some cases the model can suggest a therapy for the patient without human intervention. Able to work 24 hours a day, every day, these systems can also provide support in the patient triage phase, significantly reducing waiting times in an emergency room.

Autonomous driving

Self-driving systems require continuous real-time monitoring. The most advanced designs envision vehicles able to handle every driving situation without a driver on board at all, carrying only passengers.

Forecasts and Profiling

In finance, deep learning models allow us to form hypotheses about future market trends, or to estimate the insolvency risk of an institution more accurately than humans can today with interviews, studies, questionnaires and manual calculations.

In marketing, these models make it possible to learn people's tastes and propose new products, for example based on associations with other users who have a similar purchase history.
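A minimal sketch of this idea using cosine similarity over invented purchase histories; production recommenders are far more sophisticated, but the "users like you also bought" logic is the same.

```python
import numpy as np

# Rows = users, columns = products; 1 means "already bought" (invented data).
purchases = np.array([[1, 1, 0, 1],
                      [1, 1, 1, 0],
                      [0, 0, 1, 1]])

def recommend(user, others):
    # Cosine similarity between this user and every other user.
    sims = others @ user / (np.linalg.norm(others, axis=1) * np.linalg.norm(user))
    nearest = others[np.argmax(sims)]
    # Suggest what the most similar user bought that this user has not.
    return np.where((nearest == 1) & (user == 0))[0]

print(recommend(purchases[0], purchases[1:]))  # -> [2], the third product
```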

Adaptive evolutions

Based on the "experiences" it accumulates, the model is able to adapt to situations that arise in the environment or from user input. Adaptive algorithms update the neural network based on new interactions with the model. Think, for example, of how YouTube offers videos on a certain theme depending on the period, adapting day after day and month after month to our changing personal tastes and interests.
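A minimal sketch of this incremental (online) updating with scikit-learn's SGDClassifier, whose partial_fit method adjusts an existing model with new data instead of retraining from scratch; the data here is randomly generated, and the loss name assumes a recent scikit-learn version.

```python
import numpy as np
from sklearn.linear_model import SGDClassifier

model = SGDClassifier(loss="log_loss")  # "log" in older scikit-learn versions
classes = np.array([0, 1])              # e.g. "not interested" / "interested"

# Initial training on past interactions (random stand-in data).
X_old = np.random.rand(100, 5)
y_old = np.random.randint(0, 2, 100)
model.partial_fit(X_old, y_old, classes=classes)

# Later, each new interaction updates the same model incrementally.
x_new = np.random.rand(1, 5)
model.partial_fit(x_new, np.array([1]))  # adapts without full retraining
```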

Finally, deep learning is a research field still expanding rapidly. Universities are updating their teaching programs accordingly; the subject still requires a solid foundation in mathematics.

The advantages of applying DL to industry, research, healthcare and everyday life are beyond doubt.

However, we must not forget that these systems should support humans, and only in a few limited and very specific cases can they replace them. To date, in fact, there are no "general purpose" models capable of solving any type of problem.

Another aspect is the use of these models for illegal purposes, such as creating DeepFake videos (see the linked article): techniques used to superimpose original images and videos onto other images or videos, with the aim of producing fake news, scams or revenge porn.

Another illicit use involves techniques aimed at compromising a computer system, such as adversarial machine learning. Through these techniques it is possible to cause the model to misclassify (and thus induce it to make a wrong choice), to obtain information about the dataset used (raising privacy issues), or to clone the model (raising copyright problems).
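A minimal sketch of the misclassification idea, in the spirit of the fast gradient sign method (FGSM), applied to a toy linear classifier with invented weights: a small, targeted nudge to the input flips the model's decision.

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# A toy linear classifier standing in for a trained model (invented weights).
w, b = np.array([1.5, -2.0, 0.8]), 0.2
x = np.array([0.6, 0.1, 0.4])      # a legitimate input
print(sigmoid(w @ x + b))          # about 0.77 -> class "positive"

# Adversarial nudge: move each input coordinate against the sign of the
# gradient of the score (which, for a linear model, is just w).
eps = 0.4
x_adv = x - eps * np.sign(w)
print(sigmoid(w @ x_adv + b))      # about 0.38 -> flipped to "negative"
```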

References

https://blogs.nvidia.com/blog/2016/07/29/whats-difference-artificial-int...

https://it.wikipedia.org/wiki/Lisp

https://it.wikipedia.org/wiki/Prolog

https://it.wikipedia.org/wiki/Apprendimento_supervisionato

https://www.enjoyalgorithms.com/blog/email-spam-and-non-spam-filtering-u...

https://foresta.sisef.org/contents/?id=efor0349-0030098

https://towardsdatascience.com/training-deep-neural-networks-9fdb1964b964

https://hemprasad.wordpress.com/2013/07/18/cpu-vs-gpu-performance/

https://it.wikipedia.org/wiki/Analisi_del_sentiment

https://www.ai4business.it/intelligenza-artificiale/auto-a-guida-autonom...

https://www.linkedin.com/posts/andrea-piras-3a40554b_deepfake-leonardodi...

https://arxiv.org/abs/1712.03141