“Machine learning” now turns up in almost every corner of the IT industry and modern technology. But what is it really? How do machines accumulate skills and knowledge? Time for a journey through the world of learning and self-learning devices.

Machines are becoming “wiser”. They can recognize a cat in a picture they have never seen before, learn to play a game they do not know, and beat a human: first at checkers, later at chess, and now at Go. But there are different ways in which machines (and by that term we also mean supercomputers) get to know the world and learn, and the results differ as well.

Promises of artificial intelligence in the style of HAL 9000 from Stanley Kubrick’s “2001: A Space Odyssey” have been heard since the 1960s. Despite considerable progress in artificial intelligence research, only recently has it become possible to develop AI-based solutions that are not exceptionally “silly”. Machines still do not match general intelligence, but that is not their fault at all. They can learn quite quickly, which does not mean that they become rational.

The core of machine learning is the ability of a machine to develop useful skills without having to understand all the nuances of the task entrusted to it. In other words, machine learning is a collection of techniques by which machines find patterns that make sense not to the machines themselves but to people. Based on an analysis of the collected data, a trained machine can predict future occurrences of such a pattern. Sounds complicated? Then let’s take an example.

Bricks and feathers – how does machine learning generally work?

Many articles about machine learning, in trying to bring the subject closer to readers, focus on listing learning methods (e.g. Bayesian, top-down, supervised/unsupervised, decision trees, deep learning, etc.) or on defining the types of knowledge representation in a computer. We too briefly introduce the different methods of machine learning (a summary of methods is at the end of the article), but our aim is to explain the general idea of machine learning rather than the complex terminology that students of AI and machine learning face, and to show that machine learning is something other than the way people learn.

Suppose we want to “teach” a computer to answer the question: which of two items dropped from the same height will fall first? People solve this problem by knowing the laws of physics: the law of universal gravitation, air resistance, etc. A machine can give a correct answer without carrying the baggage of human experience.

Imagine that we are standing, like Galileo, at the top of the Leaning Tower of Pisa and dropping two objects from it, such as a white bird feather and a red brick. Of course, we know which object will hit the ground first and which, because of air resistance, will drift in the air much longer before finally reaching the ground. We understand this phenomenon and can explain it. But the machine, the computer that we have set to observe our experiment and record the parameters of the objects, has not the faintest idea. It does not know why things happened as they did. It has only registered the properties of the dropped objects: their weight, size, color, and shape.

We move on to the second phase of learning: we drop another white feather and another red brick. In accordance with the laws of physics, they reach the ground at different times, depending on the resistance the air puts up against them. Repetition lets the computer “understand” the pattern: the red brick falls first, the white feather always after it. However, at this stage the machine, although it correctly answers the question “what will fall first?”, does not yet know which characteristics of the observed objects matter. Surface? Weight? Or maybe color? Therefore, even though the computer has already learned something, it still does not know enough to help people predict the fall time of arbitrary objects, and not just white feathers and red bricks.

Narrow effectiveness was a common problem of the first artificial intelligence systems in the middle of the last century. They coped with only a very narrow set of problems, and attempts to extrapolate their skills to a wider range always ended in failure, most often because of insufficient data and computing power, and because of combinatorial explosion. (Combinatorial explosion is the enormous increase in the number of required calculations caused by even a slight increase in the number of inputs and significant variables describing the reality in which the AI is supposed to work.)

The next phase of teaching the computer is ahead of us: in addition to the white, light feathers and red, heavy bricks dropped earlier, we also drop black, light sheets of paper and green, heavy medicine balls, and in the next step other objects differing in mass, color, and surface. The machine observing our actions and recording the parameters of the objects begins to “perceive” what is important. The computer begins to “understand” that color is irrelevant. If it has enough data, it will also “notice” that, in this simplified picture, weight is not important either. Only the surface of the dropped object matters. This is enough for the machine to correctly answer the question “what will fall first?”. In this way, we have taught the computer to perform a certain action: to predict the behavior of objects solely on the basis of their relevant parameters. Because a computer processes data much faster than a human, it can analyze more objects in a much shorter time, and in that sense it will be more efficient than the best physicists.
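
The step in which the machine “notices” which property matters can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any particular system works: the six observations, the feature names (color, weight, surface) and the helper rule_accuracy are all hypothetical and invented for this example. For each feature, the sketch checks how well a simple rule of the form “objects with this value fall fast (or slow)” reproduces the recorded outcomes.

```python
from collections import Counter, defaultdict

# Hypothetical observations recorded by the machine (made-up data):
# properties of each dropped object and how it fell.
observations = [
    ({"color": "white", "weight": "light", "surface": "large"}, "slow"),  # feather
    ({"color": "red",   "weight": "heavy", "surface": "small"}, "fast"),  # brick
    ({"color": "black", "weight": "light", "surface": "large"}, "slow"),  # sheet of paper
    ({"color": "green", "weight": "heavy", "surface": "small"}, "fast"),  # medicine ball
    ({"color": "white", "weight": "light", "surface": "small"}, "fast"),  # crumpled paper
    ({"color": "red",   "weight": "heavy", "surface": "large"}, "slow"),  # load on a parachute
]


def rule_accuracy(feature, data):
    """Share of observations explained by the best one-feature rule.

    For every value of the feature, the rule predicts the outcome seen
    most often with that value; the result is the fraction predicted right.
    """
    outcomes_by_value = defaultdict(Counter)
    for features, outcome in data:
        outcomes_by_value[features[feature]][outcome] += 1
    correct = sum(max(counts.values()) for counts in outcomes_by_value.values())
    return correct / len(data)


scores = {f: rule_accuracy(f, observations) for f in ("color", "weight", "surface")}
best_feature = max(scores, key=scores.get)
print(scores)
print("most informative feature:", best_feature)
```

On this made-up data only the surface reproduces every outcome, while color and weight each get some observations wrong. The machine still “knows” nothing about air resistance; it has merely found the property that lines up with the result.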

That’s what machine learning is all about: we teach the computer to do a job correctly within a limited field. But it is easy to see that the machine still does not understand what air resistance is and does not know the law of gravitation. It does not need to know them to do its job properly. The computer that beat the world champion at Go is certainly phenomenal at this classic and difficult game, but it cannot answer a question that is simple for every representative of homo sapiens: what will fall first, a feather or a brick?

But the effectiveness of machine learning has allowed us to build artificial intelligence that effectively recognizes speech, handwriting, and images, defines effective strategies in demanding games (chess, Go), controls autonomous cars, navigates in unknown environments, classifies different objects, finds relevant data in an ocean of information (see Google; yes, the most popular search engine is also based on AI), analyzes the financial markets, and does many other things that machines do much faster than people, without understanding what they are doing.

Faster does not mean better. In the trap of algorithms

A trained machine will perform the tasks for which it was designed much faster than a human. Otherwise, designing such a system would make no sense. That is why computers are increasingly responsible for assessing the creditworthiness of bank customers, analyzing market prices, and assessing job candidates. Machines simply do it faster. But speed is not the only point. The scale of the operation and its complexity also matter. Contemporary banks and insurance companies use algorithms that make far more decisions, in a much shorter time, and with far more complex sets of analyzed factors. But there is a problem. Although decisions made by machines may seem optimal, since they take huge amounts of input data into account, what often proves important and dominant in them are the views of the people who created the given AI system. Dr. Cathy O’Neil, an American mathematician with a doctorate from Harvard, is keenly aware of the dangers of badly designed AI systems that, instead of actually helping humanity, reinforce stereotypes and, being machines, do so much faster and on a mass scale. In her talk at TED.com, she warns that blind trust in algorithms and big data sets can end badly.

The problem is that the decisions are made by machines, mindlessly, on the basis of learned characteristics and values. In the continuous flood of media reports about the successes of artificial intelligence, we can forget about this and fall into the trap of a reality limited by the decisions of the “intelligent” binary systems around us. But life is not binary; it escapes all schemas. That is certainly not the future we would like.

Big data and correlation traps

It is commonly believed that machine learning is the more effective, the more input data we feed to the system we want to train. This is an oversimplification. Machine learning and the analysis of large data sets allow machines to find correlations where people using traditional statistical methods would not find them for a very long time. Success? Not necessarily. Finding a correlation does not in itself mean that we have “raised” a system that will let us look deeper into the reality from which the vast collections of data were drawn. Many people confuse two concepts: correlation and causality. Yet they are not the same! The existence of a relationship between variables in the analyzed data sample tells us that one variable may affect another (the first may affect the second, the second the first, or they may affect each other). However, it may also mean that there is another, undetected variable that affects both correlated variables or, most interestingly, that the relationship found is pure chance.

The latter can clash with our intuition. If one variable changes almost exactly as another does, then there must be a relationship, right? No. In classical, “pre-big-data” statistics, chance correlations were practically never detected, mainly because the statistical studies we conducted were largely intentional. For example, we looked for a link between the age of citizens and the incidence of various illnesses, between the level of education and the level of earnings, etc. Such cases intuitively suggest themselves as worth investigating. Machines that process gigantic data sets (big data), however, are governed by other laws. They detect strong correlations between variables that our human minds are sure have absolutely no connection.
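
The effect is easy to reproduce with purely random numbers. The sketch below is a minimal illustration in plain Python; the sizes (10 observations, 2000 candidate variables), the seed, and the helper names pearson and strongest are arbitrary choices made for this example. It generates one random “target” series and many unrelated random series, then looks for the candidate that correlates most strongly with the target.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


rng = random.Random(0)  # fixed seed so the run is repeatable
N_OBSERVATIONS = 10     # e.g. ten yearly values
N_CANDIDATES = 2000     # unrelated variables we blindly compare against

target = [rng.gauss(0, 1) for _ in range(N_OBSERVATIONS)]
candidates = [[rng.gauss(0, 1) for _ in range(N_OBSERVATIONS)]
              for _ in range(N_CANDIDATES)]


def strongest(k):
    """Strongest absolute correlation among the first k candidates."""
    return max(abs(pearson(target, c)) for c in candidates[:k])


few = strongest(5)
many = strongest(N_CANDIDATES)
print(f"best match among 5 candidates:    {few:.2f}")
print(f"best match among {N_CANDIDATES} candidates: {many:.2f}")
```

None of the series has anything to do with the target, yet the more candidates are searched, the stronger the best “match” becomes. A system that scans thousands of variables will find impressive correlations by chance alone.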

I suggest you visit the “Spurious Correlations” website, where you will find plenty of absurd and funny examples (one of them in the illustration above). For example, the divorce rate in Maine shows a very strong correlation (99.26%) with margarine consumption, and the number of people who drowned by falling into a pool is correlated with ... the number of films in which Nicolas Cage appeared. Trusting algorithms blindly, one could conclude that we would reduce the number of pool accidents by sending Mr. Cage into retirement. You are certainly aware of how absurd that is.

Although the examples from this site may be amusing, on deeper reflection, once we understand the scale on which machine learning and “trained” computers influence our lives, a gloomy, dystopian vision comes to mind of a world overwhelmed by intelligent but also misleading algorithms. Intelligence without reason? It sounds terrible, but it is not so bad as long as we keep full control over it.

Should we teach machines?

Definitely yes. In the reality around us we find many positive examples in which a properly trained computer (or a network of them) becomes a very useful tool in the hands of people who would otherwise need much longer to reach the same results. Take genome sequencing, for example. Thanks to their performance, machines help us develop a whole new field of medicine: precision medicine, also called personalized medicine.

It is about treatment methods developed for a particular patient on the basis of that patient’s individual genetic code. The first sequencing of the human genome, started in 1990 (the very idea of the Human Genome Project was put forward in 1984), lasted thirteen years and cost about $3 billion. Today QIAGEN, using a machine learning system built on the Intel Scalable System Framework, is able to offer a human genome for just $22. Reducing the cost of sequencing makes it possible to develop special cancer therapies based on specific tumor genomes, which can eliminate the threat without the destructive effects on the patient of classic cancer treatments such as radiotherapy or chemotherapy. Precision medicine would not be possible without machine learning. It is similar with personalized medical diagnostics. AI systems based on machine learning help physicians notice significant correlations in patients’ test results or lifestyle data and predict which individuals are at increased risk for specific diseases.

Another example of machine learning in the service of humanity is Intel’s partnership with the National Center for Missing & Exploited Children. This organization is dedicated to the search for missing children and to counteracting the abuse and sexual exploitation of minors. Michelle DeLaune, senior vice president of the National Center for Missing & Exploited Children, says: we have 25 analysts. There is no way for them to check 8 million reports on their own. But thanks to machine learning and artificial intelligence it is possible, and it is much easier to find missing children or to stop the perpetrators of abuse. As you can see, a machine does not need to understand the idea behind the activities it undertakes in order to achieve the goals set for it.

Machine learning methods

Finally, we present, in very general outline and without delving into the intricacies of the mathematics behind machine learning algorithms, the methods by which machines are taught. We already know what machine learning is in general, but what are its methods? Below are some definitions. They are, we admit, simplified, but we did not want to present complicated mathematical formulas and difficult concepts in a popular science article.

Supervised learning – in this mode, the computer receives both the input data and the result, decision, or action that should be taken on the basis of that data. This is one of the most commonly used methods of machine learning. In this case, the data should be carefully selected by a human. That person is responsible for formally describing the training data (e.g. labeling particular messages as spam and others as proper correspondence). Incorrect input data will lead to poor learning results.

Unsupervised learning – in this case, the machine receives only raw input without a well-defined output, and the task of the algorithm is to find specific patterns that may be useful in further analysis of the data. This type of machine learning is often used to create AI systems that solve specific problems. This mode of learning is used when we want to find something in the data that we cannot predict in advance. However, it is important to remember that the results generated by unsupervised learning may be misleading and require careful verification.
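
As an illustration of finding patterns in unlabeled input, here is a minimal sketch of one well-known unsupervised technique, k-means clustering, reduced to one dimension. The article does not name a specific algorithm, so this choice is an assumption, as are the sample values, the starting centers, and the function name kmeans_1d.

```python
def kmeans_1d(values, centers, iterations=10):
    """Group numbers around the given starting centers (1-D k-means)."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        # Assignment step: each value joins the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: each center moves to the mean of its cluster
        # (an empty cluster keeps its previous center).
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return centers, clusters


# Made-up measurements with no labels attached.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centers, clusters = kmeans_1d(values, centers=[0.0, 5.0])
print(centers)
print(clusters)
```

Nobody told the program that there are “small” and “large” values; it separated them on its own. Whether such groups mean anything is for a human to verify, as noted above.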

Decision tree learning (DTL) is a machine learning method that is intuitive for humans, mainly because acyclic graphs in the form of decision trees are readable to our minds.

Bayesian learning – a learning method that uses the so-called naive Bayes classifier and Bayes’ theorem. This learning method is based on probabilistic inference.
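
To make this more concrete, here is a minimal sketch of a naive Bayes classifier applied to the spam example mentioned under supervised learning. The four training messages, the labels “spam” and “ham”, and the function names train and classify are hypothetical; add-one (Laplace) smoothing is a common choice assumed here so that unseen words do not produce zero probabilities.

```python
import math
from collections import Counter, defaultdict

# Hypothetical, hand-labeled training messages (supervised learning).
training = [
    ("win money now", "spam"),
    ("cheap money offer", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
]


def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def classify(text, model):
    """Return the label with the highest posterior probability."""
    word_counts, label_counts, vocabulary = model
    total_messages = sum(label_counts.values())
    scores = {}
    for label in label_counts:
        # Bayes' theorem in log form: log P(label) + sum of log P(word | label).
        log_prob = math.log(label_counts[label] / total_messages)
        words_in_label = sum(word_counts[label].values())
        for word in text.split():
            # Add-one smoothing; the "naive" part is treating words as independent.
            log_prob += math.log((word_counts[label][word] + 1)
                                 / (words_in_label + len(vocabulary)))
        scores[label] = log_prob
    return max(scores, key=scores.get)


model = train(training)
print(classify("cheap money", model))
print(classify("monday meeting", model))
```

The classifier has no idea what spam is; it only compares how probable the words of a message are under each label it was shown.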

Learning from examples – one of the simplest forms of machine learning, used, among other things, to train perceptrons (the first type of neural network). This method automatically adjusts the weights based on the incoming examples.
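
The weight adjustment can be shown with the classic perceptron learning rule. The sketch below is illustrative only: the task (the logical AND function), the learning rate of 1, the number of epochs, and the function names are assumptions made for this example.

```python
def predict(weights, bias, inputs):
    """Perceptron output: 1 if the weighted sum exceeds zero, else 0."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0


def train_perceptron(examples, epochs=20, learning_rate=1):
    """Adjust weights after every example the perceptron gets wrong."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)  # -1, 0 or +1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Training examples for logical AND: output 1 only when both inputs are 1.
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(and_examples)
predictions = [predict(weights, bias, inputs) for inputs, _ in and_examples]
print(weights, bias, predictions)
```

Each wrong answer nudges the weights toward the correct one; after a few passes over the examples the perceptron reproduces the AND function without ever being given its definition.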

Learning a set of rules is basically a variant of machine learning with decision trees, which in this case are presented as a set of conditional rules (if [condition] then [category]).

Reinforcement learning – a method modeled on behavioral psychology. In general, the search for the optimal solution is guided by the equivalents of “rewards” (high values) and “punishments” (low values). “Rewards” reinforce the path being optimized.
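
How “rewards” reinforce a path can be sketched with tabular Q-learning, one common reinforcement learning technique (the article does not name a specific one, so this is an assumption). The environment is hypothetical: a corridor of five cells with a reward of 1 for reaching the rightmost cell. For simplicity and repeatability, the sketch sweeps through every state and action in turn instead of exploring at random, which is a simplification of how such agents are usually trained.

```python
N_STATES = 5            # corridor cells 0..4
GOAL = N_STATES - 1     # reaching cell 4 ends the episode with a reward
ACTIONS = (-1, +1)      # step left or step right
ALPHA, GAMMA = 0.5, 0.9 # learning rate and discount factor (arbitrary choices)


def step(state, action):
    """Move within the corridor; reward 1.0 only for reaching the goal."""
    next_state = min(max(state + action, 0), GOAL)
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward


# Q-table: estimated long-term value of each action in each non-goal state.
q = {(s, a): 0.0 for s in range(GOAL) for a in ACTIONS}

for _ in range(200):
    for (state, action) in q:
        next_state, reward = step(state, action)
        # Value of the best follow-up action (nothing follows the goal).
        future = 0.0 if next_state == GOAL else max(
            q[(next_state, a)] for a in ACTIONS)
        # Q-learning update: move the estimate toward reward + discounted future.
        q[(state, action)] += ALPHA * (reward + GAMMA * future - q[(state, action)])

# The learned policy: in every cell, pick the action with the highest value.
policy = {s: max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)}
print(policy)
```

The reward at the end of the corridor propagates backwards through the table, so that in every cell the action leading toward the goal ends up with the higher value.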

Deep learning – a method that is essentially a combination of many of the methods mentioned above, applied to multi-layer neural networks (hence the term “deep”) in which each layer is pre-trained. Machines using this technique achieve the most spectacular results. Example: the artificial intelligence called AlphaGo, which defeated the world champion at Go.