The strange death of the Data Scientist


2024-05-01 - 7 min read


Ten years ago, data scientist was considered the job of the century. With the consolidation of Web 2.0, users were no longer only data consumers but also data producers. The need to manage, process and interpret this new form of gold was paramount, and as such the Data Scientist became a high-priority hire for any company that consumed large amounts of data. These large volumes of data on user behaviour, network structures, content, demographics, psychographics and much more allowed companies to understand and predict human behaviour at levels never seen before. Data Scientists were at the front of the data stream process that brought “life” to data. By “life”, I am metaphorically referring to Generative AI, the ultimate emulation of human behaviour that AI has achieved.


However, curiously, as in Mary Shelley's Frankenstein, the newborn creature involuntarily kills its creator. Between 2022 and 2023, over 500,000 layoffs occurred across all tech jobs, with over 30% of those layoffs affecting engineering and data science roles. This was caused not solely by high interest rates and their impact on the tech industry, but also by the need for specialization in this area, driven by the rapid evolution of the available AI technologies.

What happened to the role of the Data Scientist?

Typically, the Data Scientist role revolved around extracting insights from data, building predictive models, and optimising business processes using statistical analysis and machine learning algorithms. This involved an initial learning curve in the technicalities of predictive analysis and the Extract, Transform, Load (ETL) process. Statistics, math, programming, machine learning, data engineering and visualisation were the key skills for a successful career in Data Science.

However, with the spread of generative AI, most of these tasks can now be handled at new levels of abstraction, with less code and a seamless infrastructure. Tools like Pecan, Altair and Rivery streamline the whole process of turning data into comprehensible business insight. This makes the technical side of the Data Scientist less relevant than intuition and business vision.

Big tech companies like Google, Microsoft and Amazon make it very easy for anyone with a superficial knowledge of programming and statistics to deliver powerful analyses that 10 years ago would have required much deeper knowledge. This is a paradigm shift for those invested in such a career path.

Pronouncing the Data Scientist dead might be a stretch, but a lot will certainly change.

Let's see what the career landscape will look like for them.

A smaller middle ground - specialisation

Typically, the data scientist sat between a business analyst and a machine-learning/AI engineer. With some knowledge of ML algorithms and AI principles, the Data Scientist would apply these methods to produce insights for the business strategy. However, this superficial knowledge can now be supplied much more easily by GenAI, which leaves the Data Scientist to decide on a new path to stay relevant within the organisation.


The original middle-ground position between a very technical team (ML/AI/data engineering) and a not so technical one (business) will require far fewer people, and as such the new data scientist will have to move to one side of the balance or the other.

The AI Data Scientist

A logical choice for reinventing the role is to specialise in the technicalities of model creation and model testing. Machine learning, data engineering and AI development will be in strong demand over the next years. This will most probably be a job more confined to companies specialized in AI and not so much to classical industry companies.

This means re-skilling in machine learning principles and methodologies, machine learning ops, data engineering and specific AI fields such as computer vision, NLP, deep learning and so on.

This path requires the capacity to re-skill in a highly theoretical and technical field, with a mix of computer science and math, and it might turn out to be very difficult if the original data science background is grounded in a more superficial understanding of statistics and functional programming.

The Data Strategist

On the other side, an alternative path for reinventing the Data Scientist role focuses on data strategy and its application to a company's operations.

This new data scientist would act as a data enabler, envisioning the full potential of data acquisition across a company's operations and ensuring that it is applied across the whole infrastructure to better serve the company's purposes. With a very strong evangelization stance, this person would enable all employees to work with data and to act as data translators for the overall company data strategy.

Companies now act not only on classical user data, which comes in on a more or less regular basis, but also on their own data to gather insights for improving operations.

Concepts like Process Mining are becoming more popular, where internal processes are converted into data points that can then generate insights.
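To make the idea concrete, here is a minimal sketch of what "converting a process into data points" could look like. It is not a real process mining tool: the event log, the activity names and the `transition_durations` helper are all made up for illustration, and it only uses the Python standard library.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log: (case id, activity, timestamp). All values are invented.
EVENT_LOG = [
    ("order-1", "received", "2024-04-01 09:00"),
    ("order-1", "approved", "2024-04-01 11:00"),
    ("order-1", "shipped", "2024-04-02 09:00"),
    ("order-2", "received", "2024-04-01 10:00"),
    ("order-2", "approved", "2024-04-01 16:00"),
    ("order-2", "shipped", "2024-04-03 10:00"),
]


def transition_durations(log):
    """Return the average number of hours spent on each activity-to-activity step."""
    # Group the events of each case (e.g. each order) together.
    cases = defaultdict(list)
    for case_id, activity, timestamp in log:
        parsed = datetime.strptime(timestamp, "%Y-%m-%d %H:%M")
        cases[case_id].append((parsed, activity))

    # Collect the elapsed hours for every pair of consecutive activities.
    hours_per_step = defaultdict(list)
    for events in cases.values():
        events.sort()  # chronological order within the case
        for (start, src), (end, dst) in zip(events, events[1:]):
            hours_per_step[(src, dst)].append((end - start).total_seconds() / 3600)

    return {step: sum(h) / len(h) for step, h in hours_per_step.items()}


if __name__ == "__main__":
    durations = transition_durations(EVENT_LOG)
    bottleneck = max(durations, key=durations.get)
    print(durations)
    print("Slowest step:", bottleneck)
```

Even on this toy log, the slowest step of the process stands out immediately, which is the kind of insight the approach is after.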

The Data Driven Employee

As AI gets more and more ingrained within business operations, data becomes the new fuel for performance. This means that data can no longer be seen as the property of a small team, but as a commodity used by everyone in the company. Hence, Data Science will have to become a company-wide discipline.

New AI models keep coming out that allow us, with very little effort, to transform data that was previously hard to evaluate because of its qualitative nature into a comprehensive categorization or evaluation. Basically, we can now harvest almost all types of data (events, content, audio, image, video, etc.) to produce insights.

Each employee will be not only a data consumer but also an avid data producer, and as such it will be part of their responsibilities to curate their own data pipeline to its full potential.


The impact on future product development

In the evolving landscape of business operations, every product is now compelled to integrate data considerations beyond the classical event-based Key Performance Indicators (KPIs). No longer can product development rely solely on conventional metrics; instead, it must delve into the intricate layers of user behavior, preferences, and feedback to stay competitive. For instance, an e-commerce platform might need to analyze not just the number of purchases but also browsing patterns, wishlists, and even the sentiment of customer reviews to optimize its offerings.
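As a rough illustration of the e-commerce example, the sketch below combines a purchase count with browsing, wishlist and review signals for a single product. Everything in it is an assumption made for the example: the function names, the input numbers, and especially the tiny word lists, which are only a stand-in for a real sentiment model.

```python
# Toy word lists standing in for a real sentiment model (illustrative only).
POSITIVE = {"great", "love", "fast"}
NEGATIVE = {"broken", "slow", "refund"}


def review_sentiment(text):
    """Score a review from -1.0 (negative) to 1.0 (positive) by counting known words."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    total = pos + neg
    return 0.0 if total == 0 else (pos - neg) / total


def product_signals(views, wishlist_adds, purchases, reviews):
    """Combine the classical purchase KPI with behaviour and feedback signals."""
    scores = [review_sentiment(r) for r in reviews]
    return {
        "conversion_rate": purchases / views if views else 0.0,
        "wishlist_rate": wishlist_adds / views if views else 0.0,
        "avg_sentiment": sum(scores) / len(scores) if scores else 0.0,
    }


if __name__ == "__main__":
    # Hypothetical numbers for one product.
    print(product_signals(
        views=200,
        wishlist_adds=30,
        purchases=10,
        reviews=["Love it, great quality", "Arrived broken, want a refund"],
    ))
```

A product with a low conversion rate but a high wishlist rate tells a very different story from one that is simply ignored, and that difference is invisible if only purchases are counted.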

The new AI data scientist emerges as a pivotal figure in supporting product teams to envision these core layers. With their expertise in machine learning, data engineering, and AI development, they can guide product development by leveraging advanced analytics to understand user needs and predict market trends. By harnessing the power of generative AI and data-driven insights, product teams can iterate faster, tailor offerings to individual preferences, and ultimately enhance user experiences.


Companies poised for success are those that can adeptly collect, manage, and profit from their own generated data. With the aid of AI-driven analytics and the strategic guidance of data experts, product development can transcend traditional boundaries, leading to more innovative and competitive offerings.

Conclusion

The trajectory of the Data Scientist has undergone a profound shift, evolving from a mere data insight generator to a crucial enabler of organizational success in the data-driven era. As the demand for data-driven decision-making permeates every aspect of business, Data Scientists must navigate between embracing a more theoretical role or adopting a strategic stance in delivering value to a workforce increasingly versed in data literacy.

While the traditional notion of the Data Scientist may seem obsolete, the role will resurface within a different data paradigm. Armed with new skills and perspectives, these professionals stand at the forefront of driving innovation, shaping strategies, and empowering organizations to thrive in the data-centric future.

© Vasco Magellan 2024