
Probability: The backbone of artificial intelligence



What is the probability of winning the lottery twice in a row? What are the chances that it will rain on October 19th? Will I happen to meet a friend from elementary school today? All of these questions are quick to ask but not easy to answer. Probability calculation lets us quantify the chance that a random process produces a particular result.

As the word “calculation” suggests, mathematical formulas are used to determine probabilities as accurately as possible. The basis is data, for example records of the weather on October 19 over the past ten years. The more reliable and representative this data is, the more trustworthy the estimate for the coming October 19 will be. The quality of the data therefore determines the quality of the prediction.

Let's say there is a 30 percent chance it will rain on October 19th. Does this mean that it won't rain that day - because the value is below 50 percent? Not necessarily. It is unlikely, but not impossible. So, in a way, probability is a measure of uncertainty.
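To make "unlikely, but not impossible" concrete, here is a minimal simulation sketch. It assumes each day can be treated as an independent random draw with a 30 percent chance of rain, which is a simplification; the function name `simulate_rain_days` and the fixed seed are illustrative choices only.

```python
import random

def simulate_rain_days(p_rain, n_days, seed=0):
    """Return the fraction of simulated days on which it rained."""
    rng = random.Random(seed)  # fixed seed so repeated runs give the same result
    rainy = sum(1 for _ in range(n_days) if rng.random() < p_rain)
    return rainy / n_days

fraction = simulate_rain_days(p_rain=0.3, n_days=100_000)
print(f"Share of rainy days: {fraction:.3f}")
```

Over many simulated days, the share of rainy days settles close to 0.3: rain is the less likely outcome on any single day, yet it still happens on roughly three days out of ten.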

How are probability theory and AI related?

Prediction is the bread and butter of machine learning and AI. The goal of an AI model is always to make a decision. Does the picture show a cat or a dog? Which move is the most promising in a game of chess? What is the most appropriate translation of a word in a given sentence or context?

Machine learning frames questions like these so that the answer is a probability. For image classification, this works roughly as follows:

Animal picture => (calculation of the deep learning model) => Result: [75% dog, 20% cat, 3% dolphin, 2% fish]

In probability notation, where C stands for the predicted class, this can be written as: P(C = dog | animal picture) = 0.75.
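How a model arrives at such percentages is not described here. A common choice in classification models is the softmax function, which turns raw scores into probabilities that sum to 1. The sketch below assumes softmax and uses hypothetical raw scores, picked so that the output comes out close to the example above; they do not come from a real model.

```python
import math

def softmax(scores):
    """Turn raw model scores into probabilities that sum to 1."""
    # Subtracting the maximum improves numerical stability
    # without changing the result.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

labels = ["dog", "cat", "dolphin", "fish"]
raw_scores = [3.62, 2.30, 0.40, 0.0]  # hypothetical values for illustration

probabilities = softmax(raw_scores)
for label, p in zip(labels, probabilities):
    print(f"P(C = {label} | animal picture) = {p:.2f}")
```

Whatever the raw scores are, the resulting values are non-negative and add up to 1, so they can be read as a probability distribution over the classes.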

So the picture shows a dog with a 75% probability. In a game of chess, the “considerations” of a deep learning model would be similar: the first candidate move has a success probability of 0.3 and the second a success probability of 0.7. Based on these results, the computer decides and makes its move.
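The decision step in the chess example can be sketched as picking the option with the highest estimated probability. The move names `move_1` and `move_2` are hypothetical placeholders; the probabilities are the ones from the text. Real chess engines are considerably more involved, so this only illustrates the final selection.

```python
def choose_move(success_probabilities):
    """Return the move with the highest estimated probability of success."""
    return max(success_probabilities, key=success_probabilities.get)

# Hypothetical move names; the probabilities are taken from the example.
candidate_moves = {"move_1": 0.3, "move_2": 0.7}

best = choose_move(candidate_moves)
print(best)  # move_2
```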

A correct answer cannot always be guaranteed. As a rule, AI models have to be trained in order to minimize their error rates. Because both learning and prediction are statistical in nature, the ML and AI systems in use today are probabilistic.

This is also because decisions often have to be made with incomplete information. That requires a mechanism for quantifying uncertainty, which is exactly what probability theory offers. With its help, uncertain quantities, such as the risk in financial transactions and many other business processes, can be modeled.

So a solid understanding of probability theory is essential for understanding machine learning on a deeper level. It is just as essential for recognizing what is happening inside a model, so that errors can be identified and corrected.

The math behind the calculus of probability

A probability is written mathematically as “P”, short for “probability”. It is defined for an event E. The probability that event E occurs is written as P(E) = x, where x is a value between 0 and 1. It is often quoted as a percentage, so P(E) = 0.3 corresponds to 30%.

Probability theory includes four important concepts:

  1. The event - an outcome, or a set of outcomes, to which a probability is assigned.
  2. The sample space - the set of all possible outcomes.
  3. The probability function - assigns each event in the sample space a probability between 0 and 1.
  4. The probability distribution - describes how the probability is spread across all outcomes in the sample space.
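The four concepts can be illustrated with one roll of a fair six-sided die. The die is an example chosen for this sketch, not taken from the text, and the names `sample_space`, `distribution` and `probability` are illustrative.

```python
from fractions import Fraction

# Sample space: all possible outcomes of one roll of a fair six-sided die.
sample_space = {1, 2, 3, 4, 5, 6}

# Probability distribution: every outcome is equally likely.
distribution = {o: Fraction(1, len(sample_space)) for o in sample_space}

def probability(event):
    """Probability function: add up the probabilities of the outcomes in the event."""
    return sum((distribution[o] for o in event if o in sample_space), Fraction(0))

even = {2, 4, 6}  # an event is a subset of the sample space

print(probability(even))          # 1/2
print(probability(sample_space))  # 1
```

Using `Fraction` keeps the results exact, which makes it easy to see that the probabilities of all outcomes add up to 1.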

When all outcomes are equally likely, the probability of an event can be calculated by counting the outcomes in which the event occurs and dividing by the total number of possible outcomes. The same idea applies to observed data: count how often the event occurred and divide by the total number of observations. A probability is a value between 0 and 1, where 0 means the event is impossible and 1 means it is certain.
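Applied to observed data, the counting rule is a one-line calculation. The sketch below uses made-up weather records for October 19; the list `past_october_19` is hypothetical and stands in for real observations.

```python
def relative_frequency(observations, event):
    """Share of observations in which the event occurred."""
    if not observations:
        raise ValueError("need at least one observation")
    hits = sum(1 for obs in observations if obs == event)
    return hits / len(observations)

# Hypothetical records for October 19 over the past ten years.
past_october_19 = ["rain", "dry", "dry", "rain", "dry",
                   "dry", "dry", "rain", "dry", "dry"]

p_rain = relative_frequency(past_october_19, "rain")
print(p_rain)  # 0.3
```

With only ten observations the estimate is rough; more data, and more reliable data, makes it more trustworthy.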

There are two ways to interpret probability:

  • The frequentist interpretation treats probability as the relative frequency with which an event actually occurs. It is based on counts or samples of existing data.
  • The Bayesian interpretation treats probability as the degree to which we believe an event will occur. It starts from assumptions or personal beliefs and updates them as data arrives. Bayesian techniques can therefore be used to model events that have never or only rarely occurred before (i.e. where little or no data exists yet).
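The difference between the two interpretations shows up most clearly when there is no data. The sketch below compares a plain relative frequency with a simple Bayesian estimate. It assumes a Beta prior over the unknown probability, a standard textbook choice that is not taken from this article; the default values `prior_a = 1` and `prior_b = 1` are illustrative and encode "no preference either way".

```python
def frequentist_estimate(successes, trials):
    """Relative frequency; undefined (None) when there is no data."""
    if trials == 0:
        return None
    return successes / trials

def bayesian_estimate(successes, trials, prior_a=1.0, prior_b=1.0):
    """Posterior mean under a Beta(prior_a, prior_b) prior after the observed data."""
    return (prior_a + successes) / (prior_a + prior_b + trials)

# No observations yet: only the Bayesian estimate gives a number (the prior mean).
print(frequentist_estimate(0, 0))  # None
print(bayesian_estimate(0, 0))     # 0.5

# After 3 rainy days out of 10, both estimates are driven by the data.
print(frequentist_estimate(3, 10))
print(f"{bayesian_estimate(3, 10):.3f}")
```

As more observations arrive, the influence of the prior shrinks and the two estimates move closer together.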

All of this may sound logical so far, but the math behind these processes is complex and can quickly become demanding.

Fortunately, there are numerous YouTube videos and online courses that explain the topic, for example at Udemy. If you feel confident with the mathematical basics and speak fluent English, there are also various courses from international universities available.

My conclusion: Probability theory is a very interesting but also very complex mathematical topic. It is also a very broad field, and not every part of it is needed to understand machine learning in depth. If you want to learn AI and probability theory together, you should therefore look for courses that specifically combine the two.


