
Can AI help us manage information overload?

The Link
By: Saskia Hoving, Mon Jul 25 2022

The digital age has given us unprecedented access to information. Researchers can now access far more research in their subject areas than ever before. But how much is too much? And could AI hold the key to tackling information overload in research? In the first of two blogs, we look at the role of publishers in this issue and the development of machine-generated books.

There are now millions of academic articles published every year. Even within niche subject areas, the sheer volume of papers, pre-prints, and data published is far too great for an individual researcher to stay abreast of. The first wave of the Covid-19 pandemic in 2020 is particularly illustrative of this problem. During the first six months of 2020, the number of articles published about Covid-19 grew from zero to 28,000. In mid-May, nearly 3,000 papers were published in a single week. It would be impossible for a researcher to read this many papers – and still manage to do their own research. So how can they make sure they’re reading the seminal papers and most important findings? And is there more that publishers can do to support them?

We first looked at this subject in a webinar with Dr. Stephanie Preuss, Senior Editor at Springer Nature, and Markus Kaindl, Springer Nature’s Group Product Manager for Research Intelligence. Here, we review what they covered and further developments that have taken place since.

The role of publishers in tackling information overload

Stephanie explained that while publishers can be part of the problem, they can also be part of the solution. And this is why Springer Nature is investing in technology that uses artificial or ‘augmented’ intelligence to offer solutions to the ‘information overload’ challenge.

“As a large publisher, we have a lot of different brands, and, of course, those brands publish a lot of research,” said Dr. Stephanie Preuss, speaking at the webinar. “We’re very proud of that research, but it also means we are part of the problem of information overload.”

So what exactly can publishers do? Stephanie laid out some of the key development areas:

  1. Structuring existing content: In order for researchers to find the content they need, we’re looking at tools and new technology that will help us to structure existing content to make it easier to find and easier to digest. This includes developing and offering new products like machine-generated books, reports, or apps, as well as auto-clustering of content by, for example, subject area.
  2. Supporting researchers: Clustering and structuring research in the way discussed above can help researchers in several ways. For example, for researchers entering a new field, publishers can create machine-generated overviews or clustered content that allow them to get up to speed with the latest research fast. This is also true for those wanting to stay up to date with research in their field. More than this, because AI can cluster research by a topic – such as climate change – it can also help to overcome research silos. Researchers will see any research on that topic, regardless of which discipline the research originates from, thus broadening their perspective.
  3. New tools for authors: When it comes to writing papers, there are also a number of ways in which AI technology can be put to good use to support authors. For example, automated Table of Contents generation, and even text generation using generative transformers and large language models to overcome the dreaded ‘writer’s block’ (more on this in our second blog).
  4. Shaping the future: As respected publishers, we believe our role is to engage with our communities to understand how best to use these technologies to shape the future of research.
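The auto-clustering mentioned in point 1 can be illustrated with a toy sketch. The snippet below is a minimal illustration, not Springer Nature's actual system: it groups article abstracts by bag-of-words cosine similarity using a greedy single-pass strategy. The abstracts, the similarity threshold, and the clustering strategy are all invented for the example.

```python
from collections import Counter
import math

def vectorize(text):
    """Bag-of-words count vector for a short abstract."""
    words = [w.strip(".,").lower() for w in text.split()]
    return Counter(words)

def cosine(a, b):
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def cluster(abstracts, threshold=0.3):
    """Greedy single-pass clustering: each abstract joins the first
    existing cluster whose seed it resembles, else starts a new one."""
    clusters = []  # list of (seed_vector, member_list) pairs
    for text in abstracts:
        vec = vectorize(text)
        for seed, members in clusters:
            if cosine(seed, vec) >= threshold:
                members.append(text)
                break
        else:
            clusters.append((vec, [text]))
    return [members for _, members in clusters]

abstracts = [
    "Lithium-ion battery anode materials improve battery capacity",
    "Novel anode materials for lithium-ion battery storage",
    "Climate change drives shifts in evolutionary biology",
]
groups = cluster(abstracts)
# The two battery abstracts land in one cluster; the climate one stands alone.
```

A production system would use richer representations (embeddings rather than raw counts) and a proper clustering algorithm, but the principle — group content by textual similarity so readers can browse by topic — is the same.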

"We think that there are some important questions around the role of artificial intelligence and publishing,” said Stephanie. We think that artificial intelligence will shape the future of our industry."

Making it a reality: machine-generated books

Stephanie went on to explain the development and release of the first-ever machine-generated academic book. The book, Lithium-Ion Batteries: A Machine-Generated Summary of Current Research, was published in April 2019 following a collaboration between Springer Nature, the Applied Computational Linguistics lab of Goethe University Frankfurt, and Digital Science.

This innovative book prototype provided a compelling machine-generated overview of the latest research on lithium-ion batteries, automatically compiled by an algorithm dubbed “Beta Writer”. The launch of the book generated significant media attention – with ge.com naming it one of their “coolest things on Earth this week”.

In the three years leading up to the book’s publication, more than 53,000 papers and articles were published about research being conducted in the lithium-ion battery field. But staying on top of all that research would be near impossible.

As Andrew Liszewski put it, “It’s a firehose of data that Springer Nature has turned into a manageable trickle through this machine-generated publication.”

The algorithm uses machine learning to first analyze thousands of publications to ensure that only those relevant are selected for the book. It then parses, condenses, and organizes those pre-approved, peer-reviewed publications from Springer Nature’s online database into coherent chapters and sections that each focus on a different aspect of battery research.

The algorithm produces no new results – it’s not new research output – but it accurately provides an unbiased summary of all known facts on a subject to provide a new perspective.
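The “parses, condenses, and organizes” step described above is, in spirit, extractive summarization: selecting the most representative sentences rather than writing new ones. As a hedged sketch — not the Beta Writer pipeline itself, whose internals are not described here — a frequency-based extractive summarizer fits in a few lines; the stopword list, scoring, and sample text are simplistic placeholders.

```python
from collections import Counter
import re

# Tiny placeholder stopword list; real systems use much larger ones.
STOPWORDS = {"the", "a", "an", "of", "in", "and", "to", "is", "are", "for"}

def summarize(text, k=2):
    """Frequency-based extractive summary: keep the k highest-scoring
    sentences, preserving their original order."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    words = [w for w in re.findall(r"[a-z\-]+", text.lower()) if w not in STOPWORDS]
    freq = Counter(words)  # document-wide word frequencies

    def score(sentence):
        tokens = re.findall(r"[a-z\-]+", sentence.lower())
        return sum(freq[t] for t in tokens if t not in STOPWORDS)

    ranked = sorted(sentences, key=score, reverse=True)[:k]
    return " ".join(s for s in sentences if s in ranked)

doc = ("Lithium-ion batteries power most portable electronics. "
       "Research on lithium-ion batteries has grown rapidly. "
       "Weather was pleasant yesterday.")
summary = summarize(doc, k=2)
# The off-topic "Weather" sentence scores lowest and is dropped.
```

Because sentences about the document's dominant topic reuse its most frequent words, they score highest, so the summary stays on topic without generating any new text — consistent with the point that the algorithm produces no new results.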

What’s next for machine-generated texts?

The book published in 2019 was only the start of our work looking at machine-generated texts. In 2021, we published over 500 machine-generated literature overviews, and offered a new book format – AI-based literature overviews.

The new product is a mixture of human-written text and machine-generated literature overviews. An author places these machine-generated reviews, created from a large set of previously published articles in Springer Nature journals, into book chapters and provides the scientific perspective.

"This is an exciting step in our innovation journey that started with the first machine-generated book, as this is effectively a new type of book format that resembles a kind of dialogue between the author (now editor) and the machine."

The first publication of this kind was edited by Guido Visconti. Professor Visconti devised a series of questions and keywords related to different aspects of climate studies, examining their most recent developments and their most practical applications. These were queried, discovered, collated, and structured by the machine using AI clustering, with the results presented in a series of book chapters for Professor Visconti to put into scientific context. The same model was used in 2022 for a volume edited by Ziheng Zhang, Ping Wang, and Ji-Long Liu.

"We are looking forward to seeing how this joint journey of authors, publishers, and machines helps advance science and show authors surprising new opportunities for future research. We hope others will be inspired and invite the submission of new ideas to produce similar publications in other research areas."

Look out for our second blog on this topic, where we’ll consider how AI can help support the research community during times of crisis, from ‘TLDR’ abstracts to automating scientific content generation.

Author: Saskia Hoving

In the Dordrecht office, Marketing Manager Saskia Hoving is chief editor of The Link Newsletter and The Link Blog, covering trends & insights for all facilitators of research. Focusing on the evolving role of libraries regarding SDGs, Open Science, and researcher support, she explores academia's intersection with societal progress. With a lifelong passion for sports and recent exploration into "Women's inclusion in today's science", Saskia brings dynamic insights to her work.