Are data scientist jobs competitive

What does a data scientist actually do?

As a data scientist, I am part of the dev team at karriere.at and have turned my passion for data into a job. Unsurprisingly, and probably like every data scientist, I spend most of my time processing data (data munging). Because, as the saying goes: "Real world data is messy". That sounds unexciting and dry, but it isn't: the professional field of a data scientist is very varied.

Data science and data scientist are two terms that are heard more and more frequently in the IT industry today. Trends from job portals show that the search for data scientists is steadily increasing. And sometimes data science is even referred to as the “sexiest job of the 21st century”. But what do these two terms actually mean?

"Here’s a lot of data, what can you make from it?"

Corporate data has enormous potential, and both the amount of data and its complexity are increasing. It is becoming increasingly clear to companies that they need to harness this potential in order to remain competitive. With every additional data source (keyword: social media platforms), the opportunities for analyzing and using this data in the interest of your own business model increase. Data science is now more than just data analysis. In short: it includes all methods and approaches for gaining useful insights from huge amounts of data by means of intelligent analysis.

Data science is a very interdisciplinary professional field and a data scientist therefore has a balanced mix of different skills and knowledge.

"The modern data scientist seems more like a unicorn than an actual individual"

I think the term data science describes quite well what data scientists actually do: a combination of programming, data analysis and problem solving. So a data scientist should

  1. find the right questions and the appropriate data (application knowledge, database experience, data preparation),
  2. analyze the data (mathematical, but above all statistical knowledge),
  3. apply appropriate models (statistics, machine learning, data mining),
  4. and integrate the knowledge gained into a product / productive system (programming knowledge, tool knowledge).

In addition, a data scientist should be able to think analytically and have communication and presentation skills, because non-technicians also have to be brought on board and the findings have to be communicated to the decision-makers.

"But what does a data scientist do all day?"

Search. Clean up. Process. Aggregate. Interpret. Search. Ask. Question. Communicate. And then: analyze. Model. Evaluate. Question. Interpret. Question again. And finally: present findings.

Customer Profiling

But what exactly does this look like at karriere.at? In the last few months I have been working on customer profiling. karriere.at is not only a portal for job seekers but also a portal for companies, so the acquisition of new customers is of course of great importance. In the course of optimizing new customer acquisition, questions such as the following arose: How likely is it that a company will become a customer of karriere.at? And what does our typical customer look like?

First of all, such questions require so-called customer profiling, that is, the creation of a customer signature. This signature is a snapshot of each customer: it characterizes the customer and enables subsequent analysis. Creating a signature is generally more difficult for companies that have not become customers in the past few years.

To create such a signature, a large number of transformation steps (compression of the data, standardization, cleansing) and interpretations are necessary first. Compression means, for example, that data is only analyzed in summary form per quarter. Standardization also reduces the complexity of the data by assigning values to defined categories. Interpreting the data should show, among other things, where there is still potential for optimization in data preparation. For example, one checks whether there are still features with a large number of different values, and above all one asks what exactly missing entries (so-called NULLs) mean in a specific feature.
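The preparation steps described above can be sketched in a few lines of Python. This is only a minimal illustration using the standard library, not the actual karriere.at pipeline: the records, the field layout, the `INDUSTRY_MAP` categories and all function names are assumptions made up for the example.

```python
from collections import defaultdict
from datetime import date

# Hypothetical raw records: (company, date, job ads posted, industry label)
RAW = [
    ("Acme GmbH", date(2015, 1, 14), 2, "IT / Software"),
    ("Acme GmbH", date(2015, 2, 3), 1, "it"),
    ("Acme GmbH", date(2015, 5, 20), 4, "Software"),
    ("Beta AG", date(2015, 4, 2), 1, None),
    ("Beta AG", date(2015, 6, 30), 3, "Retail"),
]

# Hypothetical mapping of free-text labels to a small set of defined categories
INDUSTRY_MAP = {"it / software": "IT", "it": "IT", "software": "IT", "retail": "Trade"}


def quarter(d):
    """Return a label such as '2015-Q1' for a date."""
    return f"{d.year}-Q{(d.month - 1) // 3 + 1}"


def standardize(label):
    """Map a raw label to a defined category; missing values stay None."""
    if label is None:
        return None
    return INDUSTRY_MAP.get(label.strip().lower(), "Other")


def compress_by_quarter(rows):
    """Compression: sum the job ads per (company, quarter)."""
    totals = defaultdict(int)
    for company, day, ads, _ in rows:
        totals[(company, quarter(day))] += ads
    return dict(totals)


def profile_feature(values):
    """Count distinct non-missing values and missing entries (NULLs)."""
    present = [v for v in values if v is not None]
    return {"distinct": len(set(present)), "missing": len(values) - len(present)}


per_quarter = compress_by_quarter(RAW)
raw_profile = profile_feature([row[3] for row in RAW])
clean_profile = profile_feature([standardize(row[3]) for row in RAW])
```

Comparing `raw_profile` with `clean_profile` shows how standardization reduces the number of distinct values, while the count of missing entries stays visible. What a NULL actually means in a given feature still has to be decided by a person.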

Once both signatures are available, methods such as decision trees can be used to reveal patterns of company characteristics that make new customer acquisition more likely.
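To give an idea of how a decision tree finds such a pattern, here is a small sketch of its core step: choosing the feature whose values best separate customers from non-customers, measured by Gini impurity. The company data, the feature names `size` and `industry`, and the helper functions are invented for illustration. In practice one would probably use a library implementation rather than hand-written code.

```python
from collections import Counter

# Hypothetical company signatures; 'customer' is the label to explain.
COMPANIES = [
    {"size": "large", "industry": "IT", "customer": True},
    {"size": "large", "industry": "Trade", "customer": True},
    {"size": "large", "industry": "IT", "customer": True},
    {"size": "small", "industry": "IT", "customer": False},
    {"size": "small", "industry": "Trade", "customer": False},
    {"size": "small", "industry": "IT", "customer": True},
]


def gini(labels):
    """Gini impurity of a list of class labels (0.0 means pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_impurity(rows, feature, target):
    """Weighted Gini impurity after grouping rows by one categorical feature."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[target])
    return sum(len(g) / len(rows) * gini(g) for g in groups.values())


def best_split(rows, features, target):
    """Return the feature with the lowest weighted impurity (the tree's root)."""
    return min(features, key=lambda f: split_impurity(rows, f, target))


def customer_rate(rows, feature, target):
    """Share of positive labels for each value of a feature."""
    counts = {}
    for row in rows:
        hits, total = counts.get(row[feature], (0, 0))
        counts[row[feature]] = (hits + int(row[target]), total + 1)
    return {value: hits / total for value, (hits, total) in counts.items()}


root = best_split(COMPANIES, ["size", "industry"], "customer")
rates = customer_rate(COMPANIES, root, "customer")
```

In this toy data set, `root` is the feature that separates the two groups most cleanly, and `rates` shows the share of customers for each of its values. A full decision tree repeats this step recursively within each group.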

My career

In the course of my computer science studies at the JKU Linz, I became more and more interested in knowledge processing and therefore also completed my studies in the field of information extraction (obtaining information from natural language documents, such as CVs).

At the time, I found scientific work and experimenting with new methods and approaches very exciting, so I stayed at the university for almost eight years to specialize further in this subject area. Although my research continued to cover information extraction, data mining and data analysis, the emphasis shifted more and more towards data quality. My dissertation was also in this area.

The many years at university were very instructive and intensive, but I didn't want my knowledge to always end in prototypes. I also wanted to see it implemented in a product. So I submitted an unsolicited application to karriere.at and, lo and behold, now I can work with a lot of real data.

If you would also like to know whether you would make a good data scientist and which type you would be, you can take a look at the current SAS study "The Data Scientist: Types, Talents, Trends ...".