What's the hardest thing about data science?

What is the job of a data scientist?

Yasmeen Ahmad: A data scientist is a data enthusiast, problem solver, and storyteller rolled into one. His task is to examine a company's data using various analysis techniques and thereby help answer specific questions and solve specific problems. A data scientist first dissects all the data available in the company, reveals gaps, and integrates and organizes data from a wide variety of sources - in short, he brings order to the data chaos.

In the second step, he searches for patterns in the data. To do this, he extracts and analyzes the information in a variety of ways to determine which algorithm will yield the best insights and answers. In the last step, he must be able to pass on the knowledge gained to the decision-makers and employees in the company using visualization and storytelling techniques, because if others do not understand his analysis, his work is worthless.

Does a "data scientist" have to specialize in a particular industry?

Ahmad: In principle, the role of the data scientist is not industry-specific; the same analysis techniques are used in all industries and business areas. On the other hand, data scientists who specialize in an industry or business area have an advantage, because their deep understanding of the business context offers the potential for better problem solving. For this reason, many companies opt for a hybrid structure in which a central team of data scientists is active throughout the company, while additional specialized data scientists work in the individual business areas. Companies benefit both from specialists who really understand individual business areas and from generalists who can share best practices and case studies company-wide.

How do companies benefit from the data scientist's analyses?

Ahmad: Digitization has led to massive upheavals in every industry. New competitors who challenge the status quo with innovative and flexible business models are shaking the strong position of former market leaders. Traditional products and services have lost their differentiation due to the rapid pace of innovation. Every product and every service can be recreated within a maximum of six months. The data that a company has collected over decades, however, cannot be reproduced, and certainly not the knowledge gained from it, which makes it a real competitive advantage. Since companies in a digital world base their business decisions more and more on data and analyses, the data scientist is needed because he generates essential knowledge. As organizational complexity and the pressure to go to market quickly both increase, automation based on data and analytics is becoming increasingly important to business survival. Data science is not only used for customer-oriented projects. Back-end operations must also be efficient and function optimally. Data and its analysis can play a critical role in enabling greater efficiency through automation. For all these reasons, employing data scientists is no longer optional, but a must for companies that think and act in a forward-looking manner.

What training or special skills do you need as a data scientist?

Ahmad: Data scientists face very different individual requirements. That is why there are no generally applicable rules for the professional profile of "data scientist". The skills and character traits we look for at Teradata are primarily curiosity, problem-solving skills, experience with data and its analysis, excellent communication and, above all, an insatiable thirst for knowledge and a desire to keep learning new things. These traits define the data scientist who has to develop solutions to the most difficult problems in highly complex areas of the company. We find these skills in people from diverse backgrounds such as math, computer science, physics, linguistics, and engineering. By contrast, specific programming languages (R, Python, Scala, SAS etc.) or technologies (Spark, TensorFlow, Hadoop etc.) are not a priority. Smart, curious, agile learners will quickly pick up new technologies and languages as needed. Plus, the field is moving so fast that the tools and technology we use today could be out of date in six or twelve months.

Can you be a data scientist and a manager at the same time?

Ahmad: Data science is a skill that does not begin or end with a specific function. However, my experience as a data scientist can make me a better manager, because I understand from the outset how data and analyses can be used sensibly in companies. I use data and facts to make decisions, and that's exactly what I encourage in my team. As a manager, I look for data and evidence that will show me how I can improve my part of the company and that will give me the right answers to strategically important questions such as: What actions can we take to improve our products and services? What do our customers really want? Which trends do we have to react to? What global, economic or social challenges do we as a company have to be prepared for?

How does the job differ from the usual clichés?

Ahmad: It is always assumed that a data scientist writes and programs algorithms all day. In truth, however, the analytics algorithms are the easiest part of the job. Today there is a multitude of libraries full of algorithms that can be found with a simple Google search. The difficulty is not in writing algorithms or executing them, but in knowing and understanding which algorithm to use in each case. Analytics algorithms exist in many variants. Only an experienced data scientist understands their characteristics and limitations, and can choose the right algorithm for a specific business problem. This often involves iterating through multiple algorithms to compare and combine results before the right algorithm can be selected. But that's by far the fastest part of a data scientist's entire work. Many people underestimate the effort it takes to define and research the real problems and challenges facing a company, share results, form hypotheses and determine actions. The time that a data scientist spends integrating complex data types is also often overlooked. With all of that, there is very little time left for the algorithms.

How has the job profile changed in recent years and where is it going?

Ahmad: Just five years ago, companies hired an "all-purpose data scientist", someone who could do almost anything and did everything. There is now a better understanding of the role and how it must function as part of a cross-functional team. We are no longer looking for a single data scientist who is an expert in understanding and documenting business requirements, setting up real-time data feeds and complex data collection pipelines, automating tests, integrating code and managing multiple development and production environments in large companies. Instead, we hire several employees from different backgrounds who work as a team. Through the cooperation of data scientists, automation engineers, business analysts, visualization and UI experts, software engineers and architects, we can design and implement real end-to-end solutions. Without the various skills in the team, the data scientist would not have time for new analyses because he would be busy configuring, setting up and maintaining the first solutions he put into operation. We ensure that our data science team stays focused on R&D and innovation and is not used for operations and maintenance. In a mixed team with different roles and areas of expertise, the data scientist can do what he is particularly good at, namely developing new analysis solutions for business problems.

What aspect of the job do you like most?

Ahmad: Discovering new worlds every day. For me, data science opens many doors to new industries, areas of expertise and business challenges from which I can constantly learn. My core competencies have helped me to travel the world as a consultant and offer added value to companies with complex challenges. Using my skills to solve highly complex business problems is very satisfying. My particular passion is the visualization and communication of data. I worked on Teradata's "The Art of Analytics" exhibition, which demonstrates the power of combining data science and storytelling, how important convincing "data telling" is when talking to top management, and how data can also be translated into other forms of representation.