Are you interested in data science?

What does a data scientist do?

You analyze huge, often unstructured volumes of data, recognize patterns, and make predictions and decisions on that basis. Today what counts is no longer just who can collect the most data, but who can evaluate it best: big data becomes smart data!

You use what is known as advanced analytics. This is a further development of business intelligence (BI), a branch of business informatics that deals with processes and procedures for analyzing a company. BI tools primarily examine historical data, whereas advanced analytics is not only more technologically sophisticated but also often focuses on predicting the future. Predictive analytics is one of these advanced methods. It lets you assess what effects certain changes will have in the future. The approach is used in health care and in risk management at insurance companies, and it is in demand in other fields as well. Who wouldn't want to be one step ahead of their competitors and know what's going to happen next? With your analysis, you provide important information for making the best decision.

But before you get to the analysis, you first make sure you have a solid data basis, because the quality of your statements and the probability that your predictions will come true stand or fall with it. The first question you have to ask yourself: Which data is important for the decision and, above all, where do you get it from? One example is the entries in the search field of a website. They can provide helpful information on user behavior, for example: Which new products are most customers looking for? Which products are returned frequently (e.g. "xy defective" or "unsatisfied with xy") and thus cause additional costs?
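To make the search-field example concrete, here is a minimal sketch in Python. The sample entries, the complaint keywords and the function names are all made up for illustration; a real analysis would read the entries from your site's logs and would need a far more careful way of spotting complaints.

```python
from collections import Counter

# Hypothetical search-field entries (assumption: real ones come from site logs).
SEARCH_LOG = [
    "wireless headphones",
    "Wireless Headphones",
    "xy defective",
    "standing desk",
    "unsatisfied with xy",
    "wireless headphones ",
]

# Assumed keywords that hint at a complaint or a return.
COMPLAINT_HINTS = ("defective", "unsatisfied")


def normalize(query: str) -> str:
    """Lowercase the query and collapse surrounding/repeated whitespace."""
    return " ".join(query.lower().split())


def summarize(queries):
    """Split queries into counted product searches and complaint-like entries."""
    normalized = [normalize(q) for q in queries]
    complaints = [q for q in normalized if any(h in q for h in COMPLAINT_HINTS)]
    product_searches = Counter(q for q in normalized if q not in complaints)
    return product_searches, complaints


product_searches, complaints = summarize(SEARCH_LOG)
print(product_searches.most_common(1))  # most frequent product search
print(complaints)                       # entries that look like complaints
```

Even a simple count like this answers both questions from the text: what customers are looking for most often, and which products keep showing up in complaint-like searches.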

Once you have your raw data, you prepare it for your specific use case. You shouldn't mind working with unstructured data: especially at the beginning of a data cycle, data is often unstructured, meaning it has not yet been put into a particular schema. Your job is to extract the relevant data, filter out the unimportant data and map the data. You also convert the cleaned data set into the appropriate format.
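The preparation steps described above (extract, filter, map, convert) can be sketched roughly like this. The raw records, the field names and the choice of CSV as the target format are assumptions made for the example, not something the job prescribes.

```python
import csv
import io

# Hypothetical raw records with inconsistent field names and messy values.
RAW_RECORDS = [
    {"Date": "2023-01-05", "product": " XY ", "qty": "3", "note": "n/a"},
    {"Date": "2023-01-06", "product": "", "qty": "1", "note": "test entry"},
    {"Date": "2023-01-07", "product": "abc", "qty": "two", "note": ""},
    {"Date": "2023-01-08", "product": "Abc", "qty": "2", "note": ""},
]


def clean(records):
    """Extract the relevant fields, drop unusable rows, map to one schema."""
    cleaned = []
    for rec in records:
        product = rec.get("product", "").strip().lower()
        try:
            quantity = int(rec.get("qty", ""))
        except ValueError:
            continue  # filter out rows whose quantity is not a number
        if not product:
            continue  # filter out rows without a product name
        # Map to the target schema; the free-text "note" field is dropped.
        cleaned.append(
            {"date": rec["Date"], "product": product, "quantity": quantity}
        )
    return cleaned


def to_csv(rows):
    """Convert the cleaned rows into CSV text (the assumed target format)."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["date", "product", "quantity"], lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = clean(RAW_RECORDS)
csv_text = to_csv(rows)
print(csv_text)
```

Of the four sample records, only the two with a product name and a numeric quantity survive the cleaning step.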

To recognize patterns reliably, you usually need data covering the last 3 years. Smaller time windows are not recommended, because the deviations would otherwise be too great. You check the mathematical models you build on this basis with tests and so-called training runs. Only then can you be sure that your forecast is meaningful.
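One common way to check a model with training and tests is to fit it on the older part of the data and measure its error on the most recent part. The sketch below assumes 36 made-up monthly values (3 years), a split into 24 training months and 12 test months, and a plain straight-line fit as the model. None of these choices come from the text above; they only illustrate the idea.

```python
# Hypothetical monthly values for 3 years: an upward trend plus a small wobble.
series = [100 + 2 * i + (5 if i % 2 == 0 else -5) for i in range(36)]

# Chronological split (assumption): first 2 years for training, last year for testing.
train = series[:24]
test = series[24:]


def fit_line(values):
    """Ordinary least-squares fit of value = intercept + slope * month_index."""
    n = len(values)
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    var = sum((x - mean_x) ** 2 for x in range(n))
    slope = cov / var
    intercept = mean_y - slope * mean_x
    return intercept, slope


intercept, slope = fit_line(train)

# Forecast the held-out months and compare with what actually happened.
predictions = [intercept + slope * (len(train) + k) for k in range(len(test))]
mae = sum(abs(p - actual) for p, actual in zip(predictions, test)) / len(test)

print(f"slope per month: {slope:.2f}")
print(f"mean absolute error on the test year: {mae:.2f}")
```

If the error on the held-out year is small compared with the values themselves, the forecast is at least plausible; if it is large, the model or the data basis needs another look.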