10/06/2020 | Research meets practice

A common good? The trend towards data sharing

It's catching on in research and industry

Data sharing is not a completely new concept, but in recent years, the trend has left the abode of science – and for concepts such as the digital twin, it is a mandatory prerequisite.

For academic research, the idea of open access and data sharing is not new. It originated around the turn of the century and was instigated by the runaway subscription cost for scientific serials. But soon, the resulting movement – which benefitted greatly from the evolution of the internet – developed a more profound rationale: In 2002 and 2003 respectively, the Budapest Open Access Initiative and the Berlin Declaration on Open Access set the tone for the ensuing development: “Removing access barriers to this literature will accelerate research, enrich education, share the learning of the rich with the poor and the poor with the rich, make this literature as useful as it can be, and lay the foundation for uniting humanity in a common intellectual conversation and quest for knowledge.”

Today, open access has become a widely accepted standard in publicly funded research, and it extends beyond publications: proponents of open access call for the underlying scientific data itself to be made available. Funding bodies such as the National Science Foundation or the EU research program Horizon 2020 have made the sharing of research data mandatory. This allows for quality assurance and transparency, but also for greater efficiency, cost savings and the opportunity to conduct secondary analyses of existing data.

But a fundamental problem remains: Even though the data may be available, they often do not follow any common standards – from data acquisition through file formats to the way the data are stored. In some disciplines, shared infrastructures have already been established, for example in the fields of enzymes, genomics or microbes.

This is where an ambitious new initiative in Germany comes into play: the development of a National Research Data Infrastructure (“Nationale Forschungsdateninfrastruktur”, NFDI). It was set in motion in 2018 by a joint agreement between the Federal Government and the Federal States: “To turn research data into scientifically broadly usable data treasures with added value for society, Germany needs an NFDI.”

The NFDI will be responsible for systematically unlocking, securing and providing access to the data pools of science and research. It will also pursue international interlinkage. The establishment of this infrastructure is driven by science itself, organized as a network of independent consortia. The final funding of up to 85 million euros per year can be split between up to 30 consortia. In the first wave, nine consortia have been selected for funding. One of them is NFDI4Cat, the National Research Data Infrastructure for Catalysis.

As an interdisciplinary field, catalysis research is of great strategic importance for the economy and for society as a whole: It is a core technology for mitigating climate change and providing sustainable energy and materials. Examples are the reduction of CO2 emissions, the valorization of plastic waste or CO2 in chemical production, sustainable hydrogen generation, fuel cell technology and the sustainable provision of food for more than 7 billion people on earth. All of this requires revolutionary progress in catalysis science and technology, which will only be achieved by a fundamental change in catalysis research, chemical processing and process engineering. One challenge is to bring together the different fields of catalysis research with data scientists and mathematicians, with the ultimate goal of creating a “digital catalysis” aligned with the data value chain from the molecule to the chemical process.

Slowly, the “share culture” is also seeping into the industrial sphere. This is reflected by industry involvement in NFDI4Cat: Catalyst developer hte GmbH takes a leading role, while six other renowned industry players will support NFDI4Cat as advisors. In other industries, especially the pharma industry, sharing pre-competitive data is already widely accepted – and for good reasons: Pharmaceutical studies are very expensive and take a long time. Repeating failed approaches is costly, and apart from economic considerations, it may even cost lives.

The chemical industry is following suit, especially in the area of safety data. This may have been pushed by safety regulation such as the European REACH, which has not only caused a common “pain” of providing large amounts of data that need to be shared along the value chain, but also the cure in the shape of joint initiatives to collect and share data and thus lessen the individual burden.

Other motives are becoming equally urgent: The automation of laboratories and plants requires common data formats. Initiatives like NAMUR bring competitors together to agree on common standards and thus enable technologies that benefit all without distorting competition. The integration of value chains takes this one step further: To exploit the full potential, data needs to be shared between companies at different steps of the value chain. In a report published in April 2020, McKinsey and Fraunhofer list potential barriers and benefits of “data sharing in industrial ecosystems”. They also point out that there is no such thing as a digital twin without data sharing. Thus, all players in the industrial ecosystem need to do a bit of soul-searching and remember: Knowledge is the only resource that grows when it is put to use.



Kathrin Rübberdt


