Mine the Gap: How the Role of Data Scientist Fills a Need in the Pharmaceutical Industry
Authors: Michael S Rimler, FMD K&L, Cincinnati, Ohio, US; Jorine Putter, Grünenthal GmbH, Aachen, Germany
SUMMARY OF PAPER:
Attend any industry conference, pharmaceutical or otherwise, and you are bound to hear the new buzzwords and catchphrases driving innovative thought at that time. In the new world of ‘big data’, one expression that continues to float to the top is ‘data science’ or the ‘data scientist’. Although this field and role may be well defined in industries such as marketing and finance, it has not yet found a home in the pharmaceutical industry (‘pharma’), at least outside of the largest pharmaceutical companies.
The objective of this paper is to present ideas on how the data scientist can make positive enterprise-wide contributions to small- to mid-sized pharmaceutical companies, whether directly employed, contracted via a CRO vendor, or working as an independent contractor. One of the key characteristics of the data scientist in other industries is that one person performs all data science activities. We propose that pharma either develop this “one person” skillset or create multifaceted teams to perform the analyses. Without this advancement, we do not believe that pharma can truly adapt to the paradigm shift occurring across industries and leverage it to be fit for the future. For the purposes of this paper, we will refer to the combination of traditional analyses in pharma and applications of data science as “pharma data science”.
We aim to motivate members of the pharmaceutical industry to think about how best we can achieve this and ensure that we stay as current as possible in a very fast-changing “data and technology” world. We will discuss the skills and knowledge that a pharma data scientist would ideally have, as well as the types of cross-functional projects that leverage the pharma data scientist’s expertise. For example, the pharma data scientist may search for unidentified efficacious subpopulations, detect patterns in preclinical expression data, or perform data mining and clustering analyses.