AI - Digital Science
https://www.digital-science.com/blog/tags/ai/
Advancing the Research Ecosystem

Machine-First FAIR: Realigning Academic Data for the AI Research Revolution
https://www.digital-science.com/blog/2025/11/machine-first-fair-academic-data-for-the-ai-research-revolution/
Mon, 17 Nov 2025 12:16:40 +0000

The post Machine-First FAIR: Realigning Academic Data for the AI Research Revolution appeared first on Digital Science.

The best way for humankind to benefit from research is to prioritize machines over people when sharing data. Here’s why.

We often push the line that academic research needs to be Findable, Accessible, Interoperable and Reusable (FAIR) for humans and machines. That framing suggests humans and machines deserve equal priority when it comes to FAIR. They do not: we should prioritize the machines, because machine-generated knowledge will accelerate knowledge discovery.

While humans can infer insights from sparse information in academic literature and datasets – thanks to our ability to find more context online – machines currently cannot. To go further, faster in knowledge discovery, we need to move past human-powered discovery alone. For that, the machines need structure and pattern. Every research-generating organization should be prioritizing this.

Academia is Ignoring Decades of Advancement

Academic research generates more than 6.5 million papers and over 20 million datasets annually, each representing a potential training signal for the artificial intelligence systems reshaping discovery. Yet most institutional data remains locked in formats optimized for human consumption rather than computational processing.

While most stakeholders know the theoretical merits of making data FAIR (Findable, Accessible, Interoperable, Reusable) for both humans and machines, the practical reality is starker: in an era where language models can process orders of magnitude more literature than any human researcher, we are still organizing our most valuable research assets for the wrong consumer.

The economic implications are substantial:

  • Organizations like the Chan Zuckerberg Initiative (CZI) have committed over $3.4 billion toward AI-powered biology, funding projects ranging from their 1,024-GPU DGX SuperPOD cluster for computational biology research to the Virtual Cell Platform, which aims to create predictive models of cellular behavior.
  • The Navigation Fund, with its $1.3 billion endowment, has invested in AI infrastructure through its Voltage Park subsidiary, while simultaneously funding open science initiatives focused on machine-actionable intelligence and metadata enhancement.
  • Astera Institute has deployed portions of its $2.5 billion endowment to support projects like its $200 million investment in Imbue’s AI agent research and its Science Entrepreneur-in-Residence program, which specifically targets scientific publishing infrastructure.
  • Meanwhile, the Allen Institute for AI demonstrates the practical returns on machine-first approaches through its OLMo series of fully open language models, whose complete training datasets, code, and methodologies are published in computational formats, and its Semantic Scholar platform, which processes millions of academic papers to extract structured, machine-readable knowledge graphs.


Yet the vast majority of academic institutions continue to publish their findings in PDFs or as poorly described datasets. While LLMs are getting better at ingesting multi-modal content, the PDF remains surprisingly resistant to reliable automated extraction, despite decades of advancement in natural language processing. This is not merely a technical limitation: modern large language models struggle with PDFs because these documents prioritize visual presentation over semantic structure, so critical information becomes trapped in figures, tables, and formatting that computational systems cannot reliably parse. A reaction scheme embedded as an image, a dataset described in paragraph form, or experimental parameters scattered across multiple tables represent precisely the kind of structured knowledge that could accelerate discovery – if only machines could access it consistently.

The Architecture of Computational Research Infrastructure

The solution requires a fundamental reorientation toward machine-first data architecture. Rather than retrofitting human-readable outputs for computational consumption, we can take inspiration from pharma and industry writ large, which are designing their data flows to serve algorithms from the ground up, with human-friendly interfaces emerging as downstream products of that computational foundation.

Consider the transformation pathway implemented by teams working with Digital Science’s suite of computational research tools. We’re building workflows in our tools for automated knowledge extraction at scale. The extracted knowledge gains semantic coherence through integration into domain-specific knowledge graphs. Platforms like metaphacts (metaphactory) provide the infrastructure to align these signals with established ontologies while enforcing quality constraints through SHACL validation integrated into continuous deployment pipelines. The result is not merely a database of facts, but a queryable intelligence system that can answer novel questions through automated reasoning over validated relationships.
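As a simplified illustration of such a quality gate: SHACL itself operates on RDF graphs, and a production pipeline would run a real SHACL engine (as metaphactory does) inside continuous integration. The pure-Python sketch below is only a stand-in for that idea, checking required properties over a handful of triples and reporting violations that would fail a build. All entity names and constraint sets here are hypothetical.

```python
# Simplified stand-in for a SHACL-style quality gate in a CI pipeline.
# A production deployment would run a real SHACL engine over RDF; here
# triples are plain tuples and each "shape" is a set of required predicates.
REQUIRED_PREDICATES = {
    "Compound": {"hasName", "hasInChIKey"},  # hypothetical shapes
    "Assay": {"hasTarget", "hasProtocol"},
}

def validate(triples):
    """Return violation messages; an empty list means the graph passes the gate."""
    subjects = {}
    for subj, pred, obj in triples:
        entry = subjects.setdefault(subj, {"type": None, "preds": set()})
        entry["preds"].add(pred)
        if pred == "rdf:type":
            entry["type"] = obj
    violations = []
    for subj, entry in sorted(subjects.items()):
        missing = REQUIRED_PREDICATES.get(entry["type"], set()) - entry["preds"]
        for pred in sorted(missing):
            violations.append(f"{subj}: missing required property {pred}")
    return violations

graph = [
    ("ex:aspirin", "rdf:type", "Compound"),
    ("ex:aspirin", "hasName", "acetylsalicylic acid"),
    # hasInChIKey deliberately omitted, so the gate should flag it
]
violations = validate(graph)  # a non-empty list would fail the CI build
```

The point of wiring such a check into deployment, rather than running it ad hoc, is that no knowledge ever enters the graph without passing the declared constraints.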

Simultaneously, the operational requirements of research continue through dedicated literature management systems. Tools like ReadCube maintain the audit trails and conflict resolution workflows that regulatory environments demand, while ensuring that every screening decision and data extraction connects to persistent identifiers. The curated evidence flows directly into the computational infrastructure rather than terminating in isolated spreadsheets.

The critical innovation lies in packaging. While human researchers expect PDFs and narrative summaries, machine learning pipelines require structured metadata that specifies exactly what each dataset contains, where to retrieve it, and how to interpret every field.

The Metadata Multiplier Effect on Repository Platforms

Academic data repositories like Figshare occupy a unique position in the machine-first FAIR ecosystem: they serve as the critical junction between human research practices and computational discovery. When researchers publish datasets with comprehensive, structured metadata, these platforms transform from simple storage services into computational assets that can feed directly into AI research pipelines. The difference lies entirely in how authors describe their work at the point of deposit.

The REAL (Real-world multi-center Endoscopy Annotated video Library) – colon dataset on Figshare: https://doi.org/10.25452/figshare.plus.22202866.v2

Consider two datasets published on the same platform: one uploaded with a generic title like “experiment_data_final.xlsx” and minimal description, the other with machine-readable field descriptions, standardized vocabulary terms, and explicit links to ontologies and methodologies. The first requires human interpretation before any computational system can make sense of its contents. The second can be discovered, validated, and integrated into training pipelines automatically. Figshare’s API can surface the rich metadata to computational systems, but only if researchers have provided it in the first place.

The platform infrastructure already supports the technical requirements for machine-first FAIR. Persistent DOIs ensure stable identifiers, while structured metadata fields can accommodate everything from ORCID researcher identifiers to detailed provenance information. When authors invest time in describing their data using controlled vocabularies, specifying units of measurement, documenting collection methodologies, and linking to relevant publications, they create computational assets rather than digital archives. The same dataset that might languish undiscovered with poor metadata becomes a valuable training resource when described with machine-readable precision.

This creates a powerful feedback loop. Datasets with excellent metadata get discovered and reused more frequently, driving citation counts and demonstrating impact, while poorly described data remains computationally invisible regardless of its scientific value. Platforms like Figshare could amplify this effect by providing better authoring tools that encourage structured metadata entry – perhaps even using AI to suggest appropriate ontology terms or validate metadata completeness before publication. The infrastructure for machine-first FAIR already exists; it simply requires researchers to embrace metadata as a first-class research output rather than an administrative afterthought. But this is an evolving field, and new standards are emerging that repositories need to engage with.
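A deposit-time completeness check of the kind described here could be quite simple. The sketch below is illustrative only: the field names, scoring rules, and thresholds are hypothetical, not any repository's actual schema, but they show how "experiment_data_final.xlsx" with a two-word description would be flagged before publication.

```python
import re

# Illustrative deposit-time metadata checks. Field names and rules are
# hypothetical, not any repository's actual schema.
CHECKS = {
    # generic filenames make poor titles
    "title": lambda v: bool(v) and "final" not in v.lower() and not v.endswith(".xlsx"),
    # require at least a paragraph-length description
    "description": lambda v: bool(v) and len(v.split()) >= 25,
    # ORCID iDs follow a fixed 16-character pattern
    "orcid": lambda v: bool(re.fullmatch(r"\d{4}-\d{4}-\d{4}-\d{3}[\dX]", v or "")),
    # links to controlled vocabularies / ontologies
    "ontology_links": lambda v: bool(v),
    # author confirms units of measurement are documented
    "units_documented": lambda v: v is True,
}

def completeness_report(record):
    """Map each check to pass/fail so authors can fix gaps before publication."""
    return {name: check(record.get(name)) for name, check in CHECKS.items()}

deposit = {
    "title": "experiment_data_final.xlsx",  # the generic title from the example above
    "description": "Raw data.",
    "orcid": "0000-0002-1825-0097",
}
report = completeness_report(deposit)  # title, description, links, units all fail
```

A repository could surface such a report in the upload form, turning metadata quality from an afterthought into a visible gate.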

The Croissant format, a lightweight JSON-LD descriptor based on schema.org, provides this computational bridge. A single Croissant file enables any training pipeline to hydrate datasets without custom loaders while simultaneously supporting discovery through standard web infrastructure. 
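To make this concrete, the snippet below hand-builds a minimal Croissant-style descriptor. The property names follow the public Croissant vocabulary at a high level, but the dataset, its URLs, and its fields are entirely hypothetical, and a real descriptor should be generated or validated with the official Croissant tooling rather than written by hand like this.

```python
import json

# A hand-written sketch of a Croissant-style JSON-LD descriptor.
# The dataset and all URLs below are hypothetical placeholders.
descriptor = {
    "@context": {"@vocab": "https://schema.org/",
                 "cr": "http://mlcommons.org/croissant/"},
    "@type": "Dataset",
    "name": "example-assay-readouts",
    "description": "Plate-reader measurements with explicit units per field.",
    "url": "https://doi.org/10.0000/hypothetical.12345",
    "distribution": [{
        # where a training pipeline actually fetches the bytes
        "@type": "cr:FileObject",
        "name": "readouts.csv",
        "contentUrl": "https://example.org/readouts.csv",
        "encodingFormat": "text/csv",
    }],
    "cr:recordSet": [{
        # how to interpret every field, so no custom loader is needed
        "name": "readouts",
        "cr:field": [
            {"name": "compound_id", "dataType": "Text"},
            {"name": "absorbance_450nm", "dataType": "Float"},
        ],
    }],
}

croissant_json = json.dumps(descriptor, indent=2)
```

Because the file is plain JSON-LD, the same descriptor serves both consumers: a crawler indexes it through schema.org, and a training pipeline reads the `distribution` and `recordSet` entries to hydrate the dataset automatically.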

Practical Implementation in Institutional Contexts

The transition to machine-first FAIR follows a predictable arc when properly resourced. Initial implementations focus on proving the fundamental workflow with narrowly scoped pilot projects. A team might select a single dataset and one sharply defined outcome – perhaps drug-target interaction prediction or materials property modeling – and implement the complete pipeline from literature extraction through validated knowledge graph construction to machine-readable packaging.

The critical insight from successful implementations is the importance of automation as the second phase. Manual processes that work for pilot projects become bottlenecks at scale. The most effective teams invest heavily in converting their proven workflows into tested, continuous integration pipelines that enforce quality gates automatically. This includes SHACL validation for knowledge graphs, automated license checking, and provenance tracking.

Production deployment requires infrastructure investments that many academic institutions are not yet considering. Successful implementations provide stable, resolvable URLs for every dataset and descriptor, enable content negotiation so that both machines and humans receive appropriate formats, and implement comprehensive monitoring of data quality trends and usage patterns. This is the stack that Digital Science can provide.
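The content-negotiation rule described above can be sketched in a few lines: the same persistent identifier returns JSON-LD to computational clients and an HTML landing page to humans, based on the HTTP Accept header. The record and landing page below are placeholders, and a real service would sit behind the identifier's resolver with full Accept-header parsing rather than this simplified substring check.

```python
import json

# Sketch of content negotiation for a dataset landing URL: machines that ask
# for JSON-LD get structured metadata; browsers get a human-readable page.
RECORD = {"@context": "https://schema.org/",
          "@type": "Dataset",
          "name": "example-dataset"}  # placeholder record

def respond(accept_header):
    """Pick a representation from the Accept header (simplified matching)."""
    if "application/ld+json" in accept_header or "application/json" in accept_header:
        return "application/ld+json", json.dumps(RECORD)
    return "text/html", "<html><body><h1>example-dataset</h1></body></html>"

content_type, body = respond("application/ld+json")  # a pipeline's request
```

The design choice matters: one stable URL per dataset, with the representation chosen per request, means human and machine consumers never diverge onto separate, drifting copies of the metadata.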

Quantifying Institutional Success

Organizations can assess their progress toward machine-first FAIR through several concrete indicators. Successful implementations demonstrate that every significant dataset resolves to a persistent identifier that returns structured JSON-LD for computational consumers while maintaining readable landing pages for human users. Knowledge graphs pass automated validation, maintain stable URI schemes, and support catalogued query patterns rather than requiring ad hoc exploration.

Literature workflows leave complete audit trails with PRISMA-compliant reporting that can be generated automatically rather than assembled manually. Licensing and provenance information becomes verifiable through computational means rather than requiring human interpretation. Most importantly, the time taken from initial hypothesis to trained model decreases as institutional infrastructure matures and teams spend more of their time on discovery rather than data preparation.

The research organizations that define the next decade will not necessarily be those with the largest datasets, but rather those whose data infrastructure works most effectively at computational scale. Every day spent optimizing publishing workflows for human-readable reports while leaving data computationally inaccessible represents lost ground in an increasingly competitive landscape.

The funders backing this transformation, from CZI’s investments in computational biology to Astera’s focus on AI-native research infrastructure, are betting that machine-first approaches will determine which institutions can effectively leverage artificial intelligence for discovery. The technical architecture exists today. The standards are stable. The remaining barrier is institutional commitment to prioritizing computational accessibility over familiar but inefficient human-centered workflows.

Academic research stands at yet another technology-driven inflection point. The institutions that embrace machine-first FAIR will see greater impact for their research and their researchers.

AI in drug discovery: Key insights from a computational biology roundtable https://www.digital-science.com/blog/2025/10/ai-in-drug-discovery-key-insights/ Thu, 02 Oct 2025 09:59:50 +0000 https://www.digital-science.com/?p=94608 Experts from across the pharmaceutical and biotechnology landscape share trends, challenges, and opportunities for using AI in drug discovery.

The post AI in drug discovery: Key insights from a computational biology roundtable appeared first on Digital Science.

This article distills key insights from the expert roundtable, “AI in Literature Reviews: Practical Strategies and Future Directions,” held in Boston on June 25, where R&D professionals brought perspectives from across the pharmaceutical and biotechnology landscape. Attendees included senior scientists, clinical development leads, and research informatics specialists, alongside experts in translational medicine and pipeline strategy. Participants represented both global pharmaceutical companies and emerging biotechs, providing a balanced view of the challenges and opportunities shaping innovation in drug discovery and development.

Discussions covered real-world use cases, challenges in data quality and integration, and the evolving relationship between internal tooling and external AI platforms.

If you’re in an R&D role – whether in computational biology, informatics, or scientific strategy – and looking to scale literature workflows in an AI-enabled world, keep reading for practical insights, cautionary flags, and ideas for future-proofing your approach.

Evolving roles and tooling strategies

Participants emphasized the diversity of AI users across biopharma, distinguishing between computational biologists and bioinformaticians in terms of focus and tooling. While foundational tools like Copilot have proven useful, there’s a growing shift toward developing custom AI models for complex tasks such as protein structure prediction (e.g., ESM, AlphaFold).

AI adoption is unfolding both organically and strategically. Some teams are investing in internal infrastructure like company-wide chatbots and data-linking frameworks while navigating regulatory constraints around external tool usage. Many organizations have strict policies governing how proprietary data can be handled with AI, emphasizing the importance of controlled environments.

Several participants noted they work upstream from the literature, focusing more on protein design and sequencing. For these participants, AI is applied earlier in the R&D pipeline before findings appear in publications.


Data: Abundance meets ambiguity

Attendees predominantly use public databases such as GenBank and GISAID rather than relying on the literature. Yet issues persist: data quality, inconsistent ontologies, and a lack of structured metadata often force teams to retrain public models with proprietary data. While vendors provide scholarly content through large knowledge models, trust in those outputs remains mixed. Raw, structured datasets (e.g., RNA-seq) are strongly preferred over derivative insights.

One participant described building an internal knowledge graph to examine drug–drug interactions, highlighting the challenges of aligning internal schemas and ontologies while ensuring data quality. Another shared how they incorporate open-source resources like Kimball and GBQBio into small molecule model development, with a focus on rigorous data annotation.

Several participants raised concerns about false positives in AI-driven search tools. One described experimenting with ChatGPT in research mode and the Rinsit platform, both of which struggled with precision. Another emphasized the need to surface metadata that identifies whether a publication is backed by accessible data, helping them avoid studies that offer visualizations without underlying datasets.

A recurring theme was the frustration with the academic community’s reluctance to share raw data, despite expectations to do so. As one participant noted:

“This is a competitive area—even in academia. No one wants to publish and then get scooped. It’s their bread and butter. The system is broken—that’s why we don’t have access to the raw data.”

When datasets aren’t linked in publications, some participants noted they often reach out to authors directly, though response rates are inconsistent. This highlights a broader unmet need: pharma companies are actively seeking high-quality datasets to supplement their models, especially beyond what’s available in subject-specific repositories.

Literature and the need for feedback loops

Literature monitoring tools struggle with both accuracy and accessibility. Participants cited difficulties in filtering false positives and retrieving extractable raw data. While tools like ReadCube SLR allow for iterative, user-driven refinement, most platforms still lack persistent learning capabilities.

The absence of complete datasets in publications, often withheld due to competitive concerns, remains a significant obstacle. Attendees also raised concerns about AI-generated content contaminating future training data and discussed the legal complexities of using copyrighted materials.

As one participant noted:

“AI is generating so much content that it feeds back into itself. New AI systems are training on older AI outputs. You get less and less real content and more and more regurgitated material.”

Knowledge graphs and the future of integration

Knowledge graphs were broadly recognized as essential for integrating and structuring disparate data sources. Although some attendees speculated that LLMs may eventually infer such relationships directly, the consensus was that knowledge graphs remain critical today. Companies like metaphacts are already applying ontologies to semantically index datasets, enabling more accurate, hallucination-free chatbot responses and deeper research analysis.

What’s next: Trust, metrics, and metadata

Looking forward, participants advocated for AI outputs to include trust metrics, akin to statistical confidence scores, to assess reliability. Tools that index and surface supplementary materials were seen as essential for discovering usable data.

One participant explained:

“It would be valuable to have a confidence metric alongside rich metadata. If I’m exploring a hypothesis, I want to know not only what supports it, but also the types of data, for example, genetic, transcriptomic, proteomic, that are available. A tool that answers this kind of question and breaks down the response by data type would be incredibly useful. It should also indicate if supplementary data exists, what kind it is, and whether it’s been evaluated.”

Another emphasized:

“A trustworthiness metric would be highly useful. Papers often present conflicting or tentative claims, and it’s not always clear whether those are supported by data or based on assumptions. Ideally, we’d have tools that can assess not only the trustworthiness of a paper, but the reliability of individual statements.”

There was also recognition of the rich, though unvalidated, potential in preprints, particularly content from bioRxiv, which can offer valuable data not yet subjected to peer review.

Conclusion

The roundtable reflected both enthusiasm and realism about AI’s role in drug discovery. Real progress depends on high-quality data, strong governance, and tools designed with scientific nuance in mind. Trust, transparency, and reproducibility emerged as core pillars for building AI systems that can support meaningful research outcomes.

Digital Science: Enabling trustworthy, scalable AI in drug discovery

At Digital Science, our portfolio directly addresses the key challenges highlighted in this discussion.

  • ReadCube SLR offers auditable, feedback-driven literature review workflows that allow researchers to iteratively refine systematic searches.
  • Dimensions & metaphacts offers the Dimensions Knowledge Graph, a comprehensive, interlinked knowledge graph connecting internal data with public datasets (spanning publications, grants, clinical trials, etc.) and ontologies—ideal for powering structured, trustworthy AI models that support projects across the pharma value chain.
  • Altmetric identifies early signals of research attention and emerging trends, which can enhance model relevance and guide research prioritization.

For organizations pursuing centralized AI strategies, our products offer interoperable APIs and metadata-rich environments that integrate seamlessly with custom internal frameworks or LLM-driven systems. By embedding transparency, reproducibility, and structured insight into every tool, Digital Science helps computational biology teams build AI solutions they can trust.

How experts are redefining research visibility beyond traditional metrics https://www.digital-science.com/blog/2025/09/research-visibility-beyond-traditional-metrics/ Thu, 25 Sep 2025 09:43:04 +0000 https://www.digital-science.com/?p=94573 A panel of experts explores publication success, new measures of impact, and how digital transformation and AI are reshaping the game.

The post How experts are redefining research visibility beyond traditional metrics appeared first on Digital Science.

]]>
On-Demand Webinar: The Future of Research Visibility: Beyond Traditional Metrics

Introduction

Success in scientific publishing has long been measured by citations and impact factors. Yet in today’s Medical Affairs landscape, the definition of value is shifting rapidly. This article recaps insights from the recent panel discussion The Future of Research Visibility: Beyond Traditional Metrics, where experts from across the field explored how publication success is evolving, which new measures of impact matter most, and how digital transformation and AI are reshaping the game.

Bringing a wealth of diverse perspectives, the panel featured Shehla Sheikh, Head of Medical Communication & Publications at Kyowa Kirin; Kim Della Penna, Scientific Communications Director for Lymphoma, Myeloid, and Multiple Myeloma at Johnson & Johnson; Myriam Cherif, Founder of Kalyx Medical and former Regional Medical Director at GSK; and Carlos Areia, Senior Data Scientist at Digital Science. The discussion was moderated by Natalie Jonk, Enterprise Marketing Segment Lead, who guided the conversation through the critical challenges and opportunities shaping the future of research visibility.

Success: Still a moving target

Defining success remains one of the greatest challenges. For some organizations, it’s still as simple as getting the data published. For others, success means shaping clinical guidelines or influencing real-world decision-making.

Kim explained:

“A lot of these tools help us see who is engaging with our publication. Are they sharing the publication, did they find it important enough to share? Where is the data being incorporated? Is it being used in policy and guidelines, cost data, real-world healthcare data or by population health decision makers for access?”

Myriam emphasized how the lens has broadened over the past decade:

“A decade ago, people just looked at impact factors and citations. Now, we discuss with HCPs how data applies to patients. Sometimes a paper may be more practical for certain regions. We’ve moved toward a more holistic approach.”

Metrics beyond the traditional

Today, a wealth of data is available, but the challenge is deciding which metrics are truly meaningful. Downloads, mentions, and social media shares are only part of the story.

Carlos noted the complexity:

“Things are changing quite fast with data. How do you track success when different publications have different goals? Sometimes the goal is to see how quickly new studies get into clinical guidelines. Other times, it’s about reaching a very specific group of oncologists in one country.”

Sentiment analysis is also emerging as a key tool:

“We can now see if a publication has been well or badly received by, for example, a group of cardiologists. Medical Affairs is adapting rapidly to what real-time data can offer,” Carlos added.

The discoverability dilemma

Shehla raised a critical issue: ensuring publications are findable by the right stakeholders.

“Discoverability is super important. A lot of data ends up in supplementary indices, which aren’t always accessible. If it’s not directly available through the paper, that’s problematic. It raises the question: how much do we include in the main publication versus holding back for supplementary materials?”

The difficulty, she argued, isn’t just in publishing but in making materials trackable. Without DOIs or identifiers, measuring performance across channels becomes impossible.

Carlos emphasized that when any content type – including supplementary data, infographics, and plain language summaries – is uploaded to Figshare and assigned a DOI, it becomes both accessible and trackable. This is a critical step that several Digital Science customers are already taking to monitor and demonstrate the impact of their materials, and to gain deeper insight into who is engaging with their content.

Formats and channels that resonate

Visual and digital formats are transforming scientific communication. With tools like Altmetric and Figshare, it’s now possible to track which content resonates with different audiences – for example, whether visual abstracts work best for patients, short videos for junior doctors, or news platforms such as Medscape for senior clinicians.

Key takeaways from the discussion included:

  • Infographics and visual abstracts help make complex data more digestible for both HCPs and patients.
  • Social media engagement, accelerated since COVID-19, has expanded the demographic reach of publications.
  • Podcasts, YouTube, and blogs are emerging as alternative channels for research dissemination.

Shehla summarized the opportunity:

“Data visualization has been a game changer. It helps people understand complex results without dumbing them down. But it has to be a true representation of the data.”

Strategic decision-making with engagement data

Engagement data is no longer just descriptive – it’s strategic.

Myriam explained:

“This data helps us know which publications to amplify and in what format. If a subgroup analysis is relevant for Asia or South America, we integrate it into the regional strategy. Affiliates want to know how to use this data locally, whether in slides or field medical materials.”

Carlos added an example of reverse engineering success:

“We worked with a partner who had two trials presented at the same congress. One made it into a guideline in a specific country much faster than the other. By looking back at the local attention it had on social media, news and others, we tried to understand why.”

The future: AI, social media, and trust

Looking ahead, AI and digital platforms are set to further disrupt how success is measured.

Myriam highlighted new challenges:

“Citations and downloads will matter less. AI tools are already being used by HCPs to answer questions on diseases and treatments. But a recent study showed less than 15% overlap in references across Google, ChatGPT, and Perplexity when asked the same question. Metadata and referencing are going to be critical to ensure our publications are being picked up correctly.”

Kim added:

“We need to optimize what we create so AI can pick up data through correct tagging. Who is engaging, what types of data they’re engaging with, and what channel they use – these are all factors we have to plan for.”

Carlos cautioned on the risks:

“AI is a wonderful tool if used correctly – but like computer scientists used to say: it’s ‘garbage in, garbage out’. AI is very confident even when it’s wrong. The real value comes from using the right data together with AI to help people understand it better and extract the needed insights from it, whilst mitigating its potential for misuse and misinformation.”

Conclusion: Toward a holistic, dynamic view of impact

As the panel made clear, measuring publication performance can no longer be reduced to a single number. Success is multi-dimensional, context-specific, and evolving alongside technology and stakeholder expectations.

Traditional metrics such as citations and impact factors remain useful, but they are no longer sufficient. Engagement data, sentiment, and discoverability are now central to understanding whether a publication truly resonates and reaches its intended audience. At the same time, AI, social media, and new digital formats are reshaping how – and by whom – research is consumed. And sometimes, the most meaningful measures are the informal ones: when medical scientific liaisons hear healthcare professionals discussing a paper, when KOLs reference it unprompted, or when data directly influences patient care.

A call to reframe success

The future of publication success will depend on Medical Affairs teams embracing this broader, more dynamic definition of impact. By combining rigorous traditional metrics with innovative digital measures, and by ensuring content is discoverable, trackable, and presented in accessible formats, organizations can create lasting value. Most importantly, reframing success around real-world influence and patient outcomes ensures that research doesn’t just get published, it makes a difference.

Continue the conversation

At Digital Science, we’re committed to helping Medical Affairs professionals thrive in an era where research visibility and impact are being redefined. To deepen the insights shared in this panel, we invite you to explore our latest white paper, “Empowering Medical Affairs in the Digital Age,” authored by thought leader Mary Ellen Bates. Inside, you’ll find practical strategies to navigate evolving challenges, demonstrate value, and drive measurable outcomes.

Mary Ellen Bates will also be leading our upcoming webinar, “From Data Chaos to Strategic Impact: Transforming Medical Affairs in the Digital Age” (Tuesday 28 October 2025).

Altmetric adds Sentiment Analysis to social media tracking https://www.digital-science.com/blog/2025/09/altmetric-adds-sentiment-analysis-to-social-media-tracking/ Tue, 02 Sep 2025 14:26:53 +0000 https://www.digital-science.com/?p=94323 Altmetric has introduced a new AI-powered sentiment analysis feature, providing research teams with deeper insights into the public response and impact of their work on selected social media platforms.

The post Altmetric adds Sentiment Analysis to social media tracking appeared first on Digital Science.

]]>
AI-powered Sentiment Analysis to provide deeper insights into how research is being received

Tuesday 2 September 2025

Digital Science is pleased to announce that Altmetric, which captures the online attention of research, has introduced a new AI-powered sentiment analysis feature, to provide research teams with deeper insights into the public response and impact of their work on selected social media platforms.

Now available in Altmetric Explorer, Altmetric’s AI-powered Sentiment Analysis has been robustly refined to explore the sentiment towards the use of research, thanks to the work of Digital Science Senior Data Scientist Dr Carlos Areia and Head of Data Insights Mike Taylor, in consultation with the research community.

Mike Taylor said: “Impactful research deserves the best possible insights. Our new Sentiment Analysis feature gives some meaning to numbers, leveraging advanced technology to interpret and visualize the sentiment behind mentions on key social media platforms, and brings the potential to turn raw data into actionable insights for members of the research community.”

By using AI to assign scores to mentions, Altmetric can create a spectrum of sentiment for given research outputs. By capturing a whole range of reactions and discourse on social media, sentiment analysis supports research teams in better understanding how their work is being received and engaged with online across different audiences.

“There are many potential benefits from these new insights, including the opportunity for research teams to refine their approach to research publication, communication and dissemination plans,” Taylor said.

Key Features of Altmetric Sentiment Analysis

  • Sentiment Scoring: Automatically assigns a sentiment score to individual social media mentions (ranging from strong negative to strong positive).
  • Sentiment Breakdown Charts: Visualize sentiment trends with clear and concise graphical representations. Research teams can quickly identify changes in perception and respond accordingly.
  • Filtering by Sentiment: Narrow down results in the Altmetric Explorer by sentiment type, allowing users to focus on specific aspects of discussions most relevant to their strategy or goals.
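Altmetric has not published the internals of its scoring, but the spirit of these features can be sketched in a few lines of Python. Everything below is illustrative: the score range, thresholds, and field names are assumptions for the sake of the sketch, not Altmetric’s actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Mention:
    source: str
    text: str
    sentiment: float  # assumed scale: -1.0 (strong negative) to 1.0 (strong positive)

def sentiment_label(score: float) -> str:
    """Map a numeric score onto a coarse sentiment spectrum (thresholds are illustrative)."""
    if score <= -0.6:
        return "strong negative"
    if score < -0.2:
        return "negative"
    if score <= 0.2:
        return "neutral"
    if score < 0.6:
        return "positive"
    return "strong positive"

mentions = [
    Mention("bluesky", "Fantastic study!", 0.9),
    Mention("x", "Methodology seems weak.", -0.4),
    Mention("bluesky", "Interesting results.", 0.1),
]

# Filtering by sentiment, in the spirit of the Explorer feature: keep only
# mentions whose label falls on the positive end of the spectrum.
positive = [m for m in mentions if sentiment_label(m.sentiment).endswith("positive")]
```

Once each mention carries a label like this, the breakdown charts described above amount to counting mentions per label over time.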

Amye Kenall, Chief Product Officer, Digital Science, said: “The inclusion of Sentiment Analysis into Altmetric data is an important step in helping users get real insight from Altmetric data, enabling researchers and organizations to understand how their publications are being received, discussed and used. Digital Science is committed to using AI responsibly and ethically in ways that drive more value to our users but also protect the community we serve. We’re pleased to bring this feature to our Altmetric Explorer users.

“Medical affairs professionals, academic researchers, scholarly publishers, and R&D specialists alike can fully explore the ‘how and why’ behind their impact, leveraging these insights to maximize the visibility and effectiveness of their published research.”

Introducing Altmetric Sentiment Analysis

About Altmetric

Altmetric is a leading provider of alternative research metrics, helping everyone involved in research gauge the impact of their work. We serve diverse markets including universities, institutions, government, publishers, corporations, and those who fund research. Our powerful technology searches thousands of online sources, revealing where research is being shared and discussed. Teams can use our powerful Altmetric Explorer application to interrogate the data themselves, embed our dynamic ‘badges’ into their webpages, or get expert insights from Altmetric’s consultants. Altmetric is part of the Digital Science group, dedicated to making the research experience simpler and more productive by applying pioneering technology solutions. Find out more at altmetric.com and follow @altmetric on X and @altmetric.com on Bluesky.

About Digital Science

Digital Science is an AI-focused technology company providing innovative solutions to complex challenges faced by researchers, universities, funders, industry and publishers. We work in partnership to advance global research for the benefit of society. Through our brands – Altmetric, Dimensions, Figshare, IFI CLAIMS Patent Services, metaphacts, OntoChem, Overleaf, ReadCube, Symplectic, and Writefull – we believe when we solve problems together, we drive progress for all. Visit digital-science.com and follow Digital Science on Bluesky, on X or on LinkedIn.

Media Contact

David Ellis, Press, PR & Social Manager, Digital Science: Mobile +61 447 783 023, d.ellis@digital-science.com

The post Altmetric adds Sentiment Analysis to social media tracking appeared first on Digital Science.

]]>
New report shows China dominates in AI research – and is western world’s leading collaborator on AI https://www.digital-science.com/blog/2025/07/new-report-shows-china-dominates-in-ai-research/ Thu, 10 Jul 2025 10:46:07 +0000 https://www.digital-science.com/?p=93459 A new report from Digital Science shows China is outstripping the rest of the world in AI research at a critical time.

The post New report shows China dominates in AI research – and is western world’s leading collaborator on AI appeared first on Digital Science.

]]>
Data reveals AI researchers in US, UK and EU all have China as their strongest collaborator

Thursday 10 July 2025

China is outstripping the rest of the world in artificial intelligence research at a time when AI is becoming a “strategic asset” akin to energy or military capability, according to a new report released today by research technology company Digital Science.

The report – entitled DeepSeek and the New Geopolitics of AI: China’s ascent to research pre-eminence in AI – has been authored by Digital Science CEO Dr Daniel Hook based on data from Dimensions, the world’s largest and most comprehensive database describing the global research ecosystem.

Dr Hook has analyzed AI research data from the year 2000 to 2024, tracking trends in research collaborations and placing these within geopolitical, economic, and technological contexts.

His report says AI research has grown at an “impressive rate” globally since the turn of the millennium – from just under 10,000 publications in 2000, to 60,000 publications in 2024.

Dr Hook’s key findings include:

  • China has become the pre-eminent world power in AI research, leading not only in research volume, but also in citation attention and influence, rapidly increasing its lead over the rest of the world over the past seven years.
  • The US continues to have the strongest AI startup scene, but China is catching up fast.
  • In 2024, China’s AI research publication output matched the combined output of the US, UK, and European Union (EU-27), and now commands more than 40% of global citation attention.
  • Despite global tensions, China has become the top collaborator for the US, UK, and EU in AI research, while needing less reciprocal collaboration than any of them.
  • China’s AI talent pool dwarfs its rivals – with 30,000 active AI researchers and a massive student and postdoctoral population.
  • The EU benefits from strong internal AI collaboration across its research bloc.
  • China dominates AI-related patents – patent filings and company-affiliated AI research show China outpacing the US tenfold in some indicators, underscoring its capacity to translate research into innovation.

“AI is no longer neutral – governments are using it as a strategic asset, akin to energy or military capability, and China is actively leveraging this advantage,” Dr Hook says.

“Governments need to understand the local, national and geostrategic implications of AI, with the underlying concern that lack of AI capability or capacity could be damaging from economic, political, social, and military perspectives.”

Dr Hook says China is “massively and impressively” growing its AI research capacity. Unlike Western nations with clustered AI hubs, he says China boasts 156 institutions publishing more than 50 AI papers each in 2024, supporting a nationwide innovation ecosystem. In addition, “China’s AI workforce is young, growing fast, and uniquely positioned for long-term innovation.”

He says one sign of China’s rapidly developing capabilities is its release of the DeepSeek chatbot in January this year. “The emergence of DeepSeek is not merely a technological innovation – it is a symbol of a profound shift in the global AI landscape,” Dr Hook says.

“DeepSeek exemplifies China’s technological independence. Its cost-efficient, open-source LLM demonstrates the country’s ability to innovate around US chip restrictions and dominate AI development at scale.”

Dr Hook’s report comments further on the AI research landscape in the US, UK and EU.

He says the UK remains “small but globally impactful”. “Despite its modest size, the UK consistently punches above its weight in attention-per-output metrics.”

However, the EU “risks falling behind in translation and visibility”. “The EU shows weaker international collaboration beyond its borders and struggles to convert research into applied outputs (e.g., patents), raising concerns about its future AI competitiveness.”

About Dimensions

Part of Digital Science, Dimensions hosts the largest collection of interconnected global research data, re-imagining research discovery with access to grants, publications, clinical trials, patents and policy documents all in one place. Follow Dimensions on Bluesky, on X and LinkedIn.

About Digital Science

Digital Science is an AI-focused technology company providing innovative solutions to complex challenges faced by researchers, universities, funders, industry and publishers. We work in partnership to advance global research for the benefit of society. Through our brands – Altmetric, Dimensions, Figshare, IFI CLAIMS Patent Services, metaphacts, OntoChem, Overleaf, ReadCube, Symplectic, and Writefull – we believe when we solve problems together, we drive progress for all. Visit digital-science.com and follow Digital Science on Bluesky, on X or on LinkedIn.

Media contact

David Ellis, Press, PR & Social Manager, Digital Science: Mobile +61 447 783 023, d.ellis@digital-science.com

The post New report shows China dominates in AI research – and is western world’s leading collaborator on AI appeared first on Digital Science.

]]>
Digital Science launches new cutting-edge AI writing tools for 20+ million Overleaf users https://www.digital-science.com/blog/2025/06/digital-science-launches-new-cutting-edge-ai-writing-tools-for-20-million-overleaf-users/ Tue, 24 Jun 2025 08:45:00 +0000 https://www.digital-science.com/?p=92707 More than 20 million research writers worldwide now have immediate access to powerful new AI features from Digital Science through an optional add-on for Overleaf: AI Assist.

The post Digital Science launches new cutting-edge AI writing tools for 20+ million Overleaf users appeared first on Digital Science.

]]>
Overleaf’s AI Assist provides advanced language feedback and LaTeX code help

London, UK – Tuesday 24 June 2025

More than 20 million research writers worldwide now have immediate access to powerful new AI features from Digital Science through an optional add-on for Overleaf.

The add-on, called AI Assist, helps researchers write in LaTeX faster and smarter by combining the power of advanced language feedback with cutting-edge LaTeX AI tools.

Overleaf users can explore the new AI features with a limited number of free uses and upgrade at any time for unlimited access to AI Assist.

Overleaf is the world’s leading scientific and technical writing platform. A LaTeX editor, Overleaf was developed by researchers to make scientific and technical writing simpler and more collaborative. With the launch of AI Assist, Digital Science is bringing powerful AI features from its Writefull service to the global Overleaf community.

With the AI Assist add-on, Overleaf users can take advantage of:

Language and writing tools

  • AI-powered language feedback: Context-aware suggestions to improve grammar, spelling, word choice, and sentence structure, all tailored to the nuances of academic and research writing.
  • Contextual editing tools: Paraphrase selected text, summarize lengthy paragraphs, check synonyms in context, or even generate abstracts and titles with just a few clicks.

LaTeX tools

  • LaTeX error assistance: Instantly identify and fix LaTeX coding errors, to get documents compiling smoothly.
  • LaTeX code generation: Generate LaTeX code, including tables and equations, from simple prompts or even images, saving hours of manual coding.
  • TeXGPT: Ask TeXGPT to help with formatting, figure generation, custom commands, and much more.
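To make the LaTeX code generation concrete: a prompt such as “a two-column results table with a caption” might produce markup along these lines. This is a hypothetical illustration of the kind of output such tools return, with made-up values, not actual AI Assist output:

```latex
\begin{table}[h]
  \centering
  \caption{Results by condition (illustrative values)}
  \begin{tabular}{lcc}
    \hline
    Condition & Mean & N \\
    \hline
    Control   & 4.2  & 120 \\
    Treatment & 5.1  & 118 \\
    \hline
  \end{tabular}
\end{table}
```

Error assistance works in the opposite direction: starting from a failing compile, the tool helps locate problems such as an unbalanced environment or a missing package, rather than leaving the user to decode LaTeX’s log output alone.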

Overleaf co-founder Dr John Lees-Miller, Senior VP of B2C Products at Digital Science, said: “The combination of language and writing tools within our AI Assist add-on means millions of Overleaf users can now write their research papers, theses, and technical documents more efficiently and effectively than ever before.

“These AI features will ensure they’ll spend less time wrestling with LaTeX code and perfecting their prose, and more time focusing on groundbreaking research. Users will be able to write with greater confidence, ensuring their documents are error-free, polished, and ready for publication, thanks to the AI Assist add-on.”

Digital Science CEO Dr Daniel Hook said: “Overleaf AI Assist is another example of how Digital Science is bringing tools to our community that save them time and help them to do more research. Responsibly developed AI tools are going to be at the core of giving time back to researchers over the next few years. We are pleased that users can now focus on the important tasks of communicating their research results to the world.”

Find out more about AI Assist and simplify your research writing today.

screenshot of Overleaf AI Assist interface
Overleaf’s AI Assist: Generate equations from simple prompts or images.

About Overleaf

Overleaf is the market-leading scientific and technical writing platform from Digital Science. It’s a LaTeX editor that’s easy enough for beginners and powerful enough for experts. Loved by over 20 million users, it’s trusted by top research institutions and Fortune 500 companies around the world. Users can collaborate easily with colleagues, track changes in real-time, write in LaTeX code or a visual editor, and work in the cloud or on-premises. With Overleaf, anyone can write smarter—creating complex, beautifully formatted documents with ease. Visit overleaf.com and follow Overleaf on X, or on LinkedIn.

About Writefull

Writefull is a Digital Science solution that helps researchers write better, faster, and with confidence, with AI tools that deliver everything from advanced English language edits to research-tailored paraphrasing. It also enables publishers to improve efficiencies across their submission, copy editing, and quality control workflows, and is trusted by some of the world’s leading scholarly publishers. Visit writefull.com and follow @Writefullapp on X.

About Digital Science

Digital Science is an AI-focused technology company providing innovative solutions to complex challenges faced by researchers, universities, funders, industry, and publishers. We work in partnership to advance global research for the benefit of society. Through our brands – Altmetric, Dimensions, Figshare, IFI CLAIMS Patent Services, metaphacts, OntoChem, Overleaf, ReadCube, Symplectic, and Writefull – we believe when we solve problems together, we drive progress for all. Visit digital-science.com and follow Digital Science on Bluesky, on X or on LinkedIn.

Media contact

David Ellis, Press, PR & Social Manager, Digital Science: Mobile +61 447 783 023, d.ellis@digital-science.com

The post Digital Science launches new cutting-edge AI writing tools for 20+ million Overleaf users appeared first on Digital Science.

]]>
AI-powered solutions to transform your research https://www.digital-science.com/blog/2025/05/ai-powered-solutions-to-transform-your-research/ Sun, 11 May 2025 23:43:02 +0000 https://www.digital-science.com/?p=90464 At Digital Science, we recognize that the journey toward AI adoption is as unique as the organizations and individuals we support. From bench researchers to medical affairs professionals to research offices, our approach is grounded in collaboration and deep understanding.

The post AI-powered solutions to transform your research appeared first on Digital Science.

]]>
At Digital Science, we recognize that the journey toward AI adoption is as unique as the organizations and individuals we support.

From bench researchers to medical affairs professionals to research offices, our approach is grounded in collaboration and deep understanding.

Since 2013, we’ve been investing in advanced AI methodologies, expanding our technical and analytical capabilities, and assembling a global team of AI experts. To us, AI isn’t a one-size-fits-all solution; it encapsulates a range of both new and existing capabilities and approaches that, when thoughtfully applied, can significantly enhance and streamline research workflows. Our commitment continues to be focused on working closely with our partners, deeply understanding their unique challenges and aspirations, to deliver innovative and responsible AI capabilities that enhance human intelligence, drive progress, and unlock the full potential of the research community.

Our capabilities

For more than a decade, we have focused on machine learning innovations with Dimensions.ai, investment in Writefull, and the development of different LLMs. Building on this AI lineage, we’ve seen a continuous flow of new releases, such as Dimensions Research GPT Enterprise and Dimensions Research GPT.

Dimensions in ChatGPT

Available via OpenAI’s GPT Store, these products aim to provide users looking to use ChatGPT for research-related questions with generative answers they can trust – grounded in scientific evidence from Digital Science’s Dimensions database.

Key features of Dimensions Research GPT Enterprise – available to Dimensions customers with a ChatGPT Enterprise licence – include: 

  • Answers to research queries with publication data, clinical trials, patents and grant information
  • Set up in the client’s private environment and only available to client’s end users
  • Notifications each time content generated is based on Dimensions data, with references and citation details
  • Possible for clients to have custom features (following prior discussion with Dimensions).

For Dimensions Research GPT, answers to research queries are linked to tens of millions of Open Access publications, and access to the solution is free to anyone with a Plus or Enterprise subscription to OpenAI’s GPT Store.

Next-generation search experience

Dimensions has a summarization feature to support the user in their discovery process for publications, grants, patents and clinical trials. We’ve integrated AI-driven summarization capabilities into the Dimensions web application to enable all users to accelerate the identification of the most relevant content for their research questions. Short, concise summaries are now available for every record in a given search result list with a single click, providing users with AI-generated insights quickly. The Dimensions team has used feedback from members of the research community – including academic institutions, industry, publishers, government, and funders – to develop this summarization feature in the Dimensions web app.

Smarter searching in Dimensions

Digital Science continues to develop other AI solutions, all of which surface AI capabilities to support users with specific, relevant functionalities where AI can offer genuinely improved results. Just as importantly, they have been developed with a grounding in reliability and responsibility – and in consultation with our community – so that users can trust them as they do all our other products.

Connecting your data

AI-powered writing support

Deeper understanding of scholarly papers

More on Digital Science and AI

The post AI-powered solutions to transform your research appeared first on Digital Science.

]]>
How does Digital Science use AI? We ask ChatGPT https://www.digital-science.com/blog/2025/05/how-does-digital-science-use-ai-we-ask-chatgpt/ Sun, 11 May 2025 23:34:24 +0000 https://www.digital-science.com/?p=90460 How is AI being used in the research ecosystem? We asked AI itself for some answers.

The post How does Digital Science use AI? We ask ChatGPT appeared first on Digital Science.

]]>
AI has finally broken into the mainstream over the past few years. But in the research ecosystem, what do we mean by AI and how is it already being used? We did what everyone has been doing and asked AI itself for some answers.

Anyone would be forgiven for thinking that artificial intelligence (AI) had only just been invented, given the huge interest that has suddenly been generated in it, starting in 2023. However, AI actually has a long history, being first conceived of in the 1950s and gradually becoming a reality in the decades after. Nevertheless, a real step forward has been made with Generative AI, which can create seemingly flawless text or images in just seconds. But is it really that new?

Anyone who writes blog posts – or even an SMS message – may be used to accepting one or two suggestions from the predictive text function, itself a form of AI. Once you realize that the words you’ve chosen haven’t been 100% your own, you can see how the lines have started to blur with the more advanced capabilities of Generative AI. However, caution must still be used, as AI is more than capable of getting things wrong – just read Digital Science CEO Daniel Hook’s post on ‘The lone banana problem’. Moreover, a lack of context or key detail can lead to misrepresentation or redundancy in the content being created.

Digital Science and AI

Digital Science has been using AI and related technologies such as Machine Learning (ML) and Natural Language Processing (NLP) for several years – indeed, Dimensions has had https://www.dimensions.ai as its URL since its launch in 2018 – and it has become a trusted partner for hundreds of organizations that depend on its AI expertise. These are the ways Digital Science and its portfolio companies currently use AI-related technologies:

Featured products

Ask ChatGPT

So what happens if we ask a Generative AI model how Digital Science uses AI? Well, here’s some text we received courtesy of OpenAI’s ChatGPT:

“Digital Science, as a company that provides scientific software and research solutions, utilizes AI in various ways. Here are a few examples:

  • Text and data mining: AI algorithms are employed to analyze large volumes of scientific literature, extracting valuable insights, identifying trends, and accelerating the discovery process.
  • Natural Language Processing (NLP): Digital Science uses NLP techniques to enhance the understanding and processing of scientific texts, enabling advanced search capabilities, automatic categorization, and semantic analysis.
  • Predictive analytics: AI models are trained on scientific data to predict outcomes, such as the success or impact of research projects, funding opportunities, or potential collaborations, helping researchers make informed decisions.
  • Recommendation systems: By employing machine learning algorithms, Digital Science can provide personalized recommendations to researchers, suggesting relevant articles, journals, or research resources based on their interests and previous activities.
  • Data visualization and exploration: AI-powered tools enable the creation of interactive visualizations and data exploration platforms, allowing scientists to gain deeper insights from complex datasets and facilitate data-driven decision-making.

These are just a few examples of how Digital Science incorporates AI into its offerings. The field of AI is rapidly evolving, and new applications are continually emerging to support scientific research and discovery.”
Source: ChatGPT

While all of the information presented by the AI is correct, it is broken down by technical functionality, presenting only ‘examples’ of how the technology is used and not what it is applied to. As such, it is technically accomplished, but of little interest to most audiences, who would miss the context of how the technology is actually being applied.

AI vs humans

In 2023, while the excitement around Generative AI was at its peak, Digital Science launched a campaign called ‘Mind the trust gap’, which sought to underline the importance of trust in research among those who carry out the research itself and the rest of us who depend on its integrity and quality. Developments in AI technology are often compared to humans as if it were some sort of competition or even a battle, but it is perhaps more sensible to view it as a relationship entering a new phase, and trust is key to that relationship. As long as we use AI with the transparency and openness of the examples above, we can build a better understanding of the world around us.

More on Digital Science and AI

The post How does Digital Science use AI? We ask ChatGPT appeared first on Digital Science.

]]>
Publisher Day 2025: Key insights on integrity, disruption, and innovation in scholarly publishing https://www.digital-science.com/blog/2025/03/publisher-day-2025-key-insights-on-integrity-disruption-and-innovation-in-scholarly-publishing/ Tue, 25 Mar 2025 03:18:00 +0000 https://www.digital-science.com/?p=90639 This year’s Publisher Day offered valuable insights into the evolving world of scholarly publishing, with experts from the industry exploring themes of research integrity, AI disruption, Open Access, and social media strategies. Centered around the theme ‘Insight to Impact,’ the event featured a series of keynotes, panels, and lightning talks that addressed the current challenges […]

The post Publisher Day 2025: Key insights on integrity, disruption, and innovation in scholarly publishing appeared first on Digital Science.

]]>
This year’s Publisher Day offered valuable insights into the evolving world of scholarly publishing, with experts from the industry exploring themes of research integrity, AI disruption, Open Access, and social media strategies. Centered around the theme ‘Insight to Impact,’ the event featured a series of keynotes, panels, and lightning talks that addressed the current challenges and opportunities in scholarly communication.

Jon Treadway, Director at Great North Wood Consulting, delivered the opening keynote, outlining the challenges facing scholarly communication. He highlighted the shift towards isolationism in the geopolitical landscape, impacting scientific collaboration. Treadway emphasized that scholarly communication remains inefficient and fragmented, with AI advancements posing both opportunities and risks. He cautioned that AI’s rapid progress demands vigilance and noted that Open Access has yet to reach its full potential due to siloed discussions rather than collaborative efforts. 

The first panel focused on ensuring transparency and trustworthiness in research. Panelists included Leslie McIntosh, Vice President of Research Integrity at Digital Science; Laura Wilson, Head of Research Integrity & Ethics at Taylor & Francis; Jennifer Wright, Head of Publication Ethics & Research Integrity at Cambridge University Press; and Doug Melville, Technical Product Manager at Sage. Key concerns discussed were AI’s impact on research, the need for transparent licensing conditions, and the challenge of ensuring research security. The panel emphasized that publishers play a critical role in upholding standards, holding the “gold” in narratives and research culture. Collaboration across the industry, especially in disambiguating author identities and spotting bad actors, was stressed as vital to safeguarding integrity.

Ann Campbell, Technical Solutions Manager at Digital Science, and Katie Davison, Insights Analyst at Emerald Publishing, presented a case study on how they’ve worked together to leverage Dimensions on GBQ for data-driven insights. They demonstrated how combining Emerald Publishing’s strategic goals with Dimensions’ extensive dataset allowed them to uncover valuable insights about research trends, author collaborations, and institutional impact. By harnessing these insights, Emerald was able to identify key growth areas, refine their editorial strategies, and better align their publishing efforts with the needs of the academic community.

The second panel addressed how publishers are adapting their strategies in response to evolving social media platforms. The panel included Lou Peck, CEO & Founder of The International Bunch; Jitske de Vries, Head of Marketing at The Company of Biologists; Rowena Gordon, Senior Managing Editor at the British Ecological Society; Daisy Veysey, Social Media Manager at eLife; and Marion Schnelle, Social Media Manager at De Gruyter Brill. Discussions highlighted the emergence of platforms like Mastodon and Bluesky as alternatives to X (formerly Twitter). It was noted that in a poll of 6,000 readers of Nature, 70% had moved over from X to Bluesky, declaring the platform nicer, kinder and less antagonistic. The panel also highlighted a growing emphasis among publishers on community-building strategies over traditional marketing tactics.

Next, Sarah Greaves, Director and Publishing Consultant at Sarah Greaves STEM Consulting, discussed the ongoing trend of consolidation in the publishing industry. She explored how mergers and acquisitions are reshaping the landscape, the potential benefits of improved efficiencies, and concerns about reduced diversity and potential monopolistic behaviors. 

Our lightning talks provided valuable insights into our latest innovations for publishers. Nicholas Bailey, our Senior Product Manager, introduced Dimensions Author Check, our new research integrity tool designed to help publishers verify potential authors, editors, and reviewers. Meanwhile, Mike Taylor, Head of Data Insights, showcased the Altmetric Journal Benchmark dashboard – a powerful resource for publishers seeking to measure success, benchmark their journals against competitors, and identify growth opportunities.

In the closing keynote, Jo Wixon, Director of External Analysis at Wiley, explored how publishers can actively contribute to advancing the United Nations’ Sustainable Development Goals (SDGs). She emphasized the growing importance of aligning publishing practices with global sustainability efforts, showcasing strategies for integrating SDG-related content and metrics into editorial workflows, peer review processes, and publication strategies. By adopting these practices, publishers can amplify research that addresses critical global challenges while enhancing their impact within the academic community.

Our 2025 Publisher Day highlighted both the opportunities and challenges publishers face in adapting to AI advancements, strengthening research integrity, and responding to shifts in social media landscapes. With increasing consolidation and technological disruption, the industry must work collaboratively to ensure the integrity, accessibility, and credibility of scholarly communication.

The post Publisher Day 2025: Key insights on integrity, disruption, and innovation in scholarly publishing appeared first on Digital Science.

The perpetual research cycle: AI’s journey through data, papers, and knowledge https://www.digital-science.com/blog/2025/03/the-perpetual-research-cycle-ais-journey-through-data-papers-and-knowledge/ Fri, 21 Mar 2025 04:28:00 +0000 https://www.digital-science.com/?p=90658 If AI can help to generate, mine, and refine knowledge faster than human researchers, what does the future of academia look like?

The post The perpetual research cycle: AI’s journey through data, papers, and knowledge appeared first on Digital Science.

Academics hypothesize, generate data, make sense of it, and then communicate it. If AI can help to generate, mine, and refine knowledge faster than human researchers, what does the future of academia look like? The answer lies not in replacing human intellect but in enhancing it, creating a collaborative synergy between AI and human researchers that will define the next era of scientific progress. I’ve been playing around with ChatGPT, Google Gemini and Claude.ai to see how well they each do at creating academic papers from datasets.

AI can also serve as a tool to aid humans in data extraction from many papers. Consider a scenario where AI synthesizes information from hundreds of studies to create a refined dataset. That dataset then feeds back into the system, sparking new research papers.

This cycle—dataset to paper, paper to knowledge extraction, knowledge to new datasets—propels an accelerating loop of discovery. Instead of a linear research pipeline, AI enables a continuous, self-improving knowledge ecosystem.

From data to papers

I looked for interesting datasets on Figshare. The criteria were a) that I knew they would be re-usable, as they had been cited several times, and b) that the files were relatively small (<100MB), so as not to hit the upload limits of the common AI tools.

This one fit the bill:

American Rivers (2019). American Rivers Dam Removal Database. figshare. Dataset. https://doi.org/10.6084/m9.figshare.5234068.v12

[Image: database user interface]

From there I asked Claude 3.7 Sonnet: “Based on the attached files, can you create a full-length academic paper with an abstract, methods, results, discussion and references?”, followed by “Can you convert the whole paper to LaTeX so I can copy and paste it into Overleaf?”

The resulting paper needs a little tweaking in the layout of the results and graphs, but other than that, Claude has done a great job.

[Image: animated scroll of the generated research paper]
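Workflows like this need not stay in the chat interface. Below is a minimal sketch of how the same dataset-to-paper prompt could be scripted; the `build_paper_request` helper, the model name, and the token limit are my own assumptions, and the commented-out client call assumes the official `anthropic` Python package and an API key.

```python
# Sketch: scripting the dataset-to-paper prompt instead of using the chat UI.
# Only the request payload is built here; sending it (commented out below)
# would require an API key and the `anthropic` package.

def build_paper_request(dataset_text: str,
                        model: str = "claude-3-7-sonnet-latest") -> dict:
    """Assemble a Messages-API-style payload asking for a full academic paper."""
    prompt = (
        "Based on the attached files, can you create a full-length academic "
        "paper with an abstract, methods, results, discussion and references?"
    )
    return {
        "model": model,
        "max_tokens": 8192,
        "messages": [{
            "role": "user",
            "content": f"{prompt}\n\n<dataset>\n{dataset_text}\n</dataset>",
        }],
    }

# To actually send the request:
#   import anthropic
#   client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment
#   reply = client.messages.create(**build_paper_request(csv_text))

request = build_paper_request("dam_name,state,year_removed\nExample Dam,WA,2019")
print(request["model"])  # -> claude-3-7-sonnet-latest
```

A follow-up message in the same conversation could then request the LaTeX conversion, exactly as in the chat session described above.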

Papers to new data/knowledge

A single paper is just the beginning. The real challenge is synthesizing knowledge from the ever-growing volume of research: how do we effectively mine it? This is where specialized knowledge extraction tools such as ReadCube become crucial. ReadCube helps researchers manage and discover scholarly literature, but its real power lies in its knowledge extraction capabilities. Imagine it as a powerful filter, sifting through countless pages to extract the nuggets of wisdom.

Tools like ReadCube can then analyze vast collections of papers, uncovering patterns and relationships that human researchers might miss. This process involves:

  • Text and citation mining: AI can analyze papers to identify emerging trends, inconsistencies, or knowledge gaps.
  • Automatic synthesis: AI can compare findings across thousands of studies, synthesizing insights into new, high-level conclusions.
  • Hypothesis generation: By recognizing correlations between disparate research areas, AI can propose new research directions.
[Image: view of a research paper with a highlighted section]
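As a toy illustration of the first of these capabilities, even a few lines of Python can surface a term’s rising frequency across years of abstracts. This is a deliberately crude stand-in for real text-mining pipelines, and the example abstracts are invented:

```python
from collections import Counter

def term_trend(abstracts_by_year: dict, term: str) -> dict:
    """Count how often a term appears in each year's abstracts -- a crude
    proxy for spotting an emerging research trend."""
    return {
        year: sum(Counter(a.lower().split())[term] for a in abstracts)
        for year, abstracts in abstracts_by_year.items()
    }

corpus = {
    2021: ["dam removal restores fish passage"],
    2023: ["sediment transport after dam removal", "dam removal and river recovery"],
}
print(term_trend(corpus, "removal"))  # -> {2021: 1, 2023: 2}
```

Rising counts across years hint at a growing topic; production tools would of course normalise for corpus size and use far richer NLP than whitespace tokenisation.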

The flywheel effect: How the cycle accelerates

The true magic happens when this extracted knowledge becomes the input for the next iteration. Each cycle follows this pattern:

  1. Raw data is processed by AI to generate initial research outputs
  2. Knowledge extraction tools mine these outputs for higher-order insights
  3. These insights form a new, refined dataset
  4. AI processes this refined dataset, generating more precise analyses
  5. The cycle continues, with each rotation producing more valuable knowledge
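The steps above can be sketched as a loop. Each function below is a toy stand-in that merely tags its input; the point is the shape of the cycle, with each rotation feeding the next:

```python
def ai_generate_outputs(dataset: list) -> list:
    """Step 1: AI turns raw records into research outputs (toy stand-in)."""
    return [f"paper({record})" for record in dataset]

def extract_insights(outputs: list) -> list:
    """Step 2: knowledge-extraction tools mine outputs for higher-order insights."""
    return [f"insight({o})" for o in outputs]

def refine_dataset(insights: list) -> list:
    """Step 3: the insights form a new, refined dataset."""
    return [f"refined({i})" for i in insights]

def flywheel(dataset: list, turns: int) -> list:
    """Steps 4-5: iterate, each rotation building on the last."""
    for _ in range(turns):
        dataset = refine_dataset(extract_insights(ai_generate_outputs(dataset)))
    return dataset

print(flywheel(["raw"], turns=1))  # -> ['refined(insight(paper(raw)))']
```

Run with more turns and each element accumulates another paper→insight→refined layer, a (very) literal picture of the compounding the next paragraph describes.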

With each turn of this flywheel, the insights become more refined, more interconnected, and more actionable. The initial analyses might focus on direct correlations in the data, while later iterations can explore complex causal relationships, predict future trends, or suggest optimal intervention strategies.

This AI-driven, data-to-knowledge cycle represents a paradigm shift in research. Imagine the possibilities in fields like medicine, climate science, and economics. We’re moving towards a future where AI and human researchers work in synergy, pushing the boundaries of discovery. Rather than replacing researchers, AI acts as a force multiplier, enabling deeper, faster knowledge generation.

Digital Science adds AI-powered summaries to Symplectic Elements to drive research discoverability https://www.digital-science.com/blog/2025/02/digital-science-adds-ai-powered-summaries-to-symplectic-elements/ Thu, 27 Feb 2025 03:08:00 +0000 https://www.digital-science.com/?p=90638 Symplectic Elements will now offer the ability to embed AI-generated summaries for publication abstracts within a researcher’s public profile.

The post Digital Science adds AI-powered summaries to Symplectic Elements to drive research discoverability appeared first on Digital Science.

Thursday 27 February 2025

Digital Science has today announced new enhancements to Symplectic Elements, which will now offer the ability to embed AI-generated summaries for publication abstracts within a researcher’s public profile. 

Symplectic Elements, a leading research information management system (RIMS), enables the creation of comprehensive public profiles. These profiles are hosted on sleek, modern, and intuitive online portals that offer advanced search and discovery capabilities while ensuring alignment with organizational branding. Profiles can be made available not only for researchers and faculty but also for equipment, services, and other institutional resources. 

Symplectic Elements currently underpins the profiling portals for over 70 organizations, including academic institutions (such as University College London, University of Oxford and Virginia Tech), consortia (Ohio Innovation Exchange) and governmental organizations (Australian Nuclear Science and Technology Organisation).

With the introduction of cutting-edge AI functionality, abstract summaries can now be displayed within a researcher’s public profile. These AI summaries can be generated on-demand by a visitor to provide a concise, at-a-glance synopsis of research outputs. Using the publication title and abstract as a source, the Dimensions-powered system generates a summary, key highlights, and top keywords, helping visitors quickly assess the relevance of an output to their research and increasing the discoverability of research work without adding extra administrative burden to faculty members. This new AI summarization functionality operates in a private and secure environment, and opt-out is available at both the institutional or organizational level and the individual level.
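The announcement does not describe the underlying implementation, but the contract is clear: title and abstract in; summary, highlights, and keywords out. The toy extractive stand-in below illustrates only that input/output shape; the function name, stopword list, and heuristics are mine, and the actual Dimensions-powered system is generative rather than anything this naive:

```python
import re
from collections import Counter

# Tiny, illustrative stopword list (an assumption, not from any product).
STOPWORDS = {"the", "a", "an", "of", "and", "in", "to", "we", "for", "on", "is"}

def summarize(title: str, abstract: str, n_keywords: int = 3) -> dict:
    """Toy extractive summary: first sentence of the abstract as the synopsis,
    most frequent non-stopword terms as keywords."""
    sentences = re.split(r"(?<=[.!?])\s+", abstract.strip())
    words = [w for w in re.findall(r"[a-z]+", (title + " " + abstract).lower())
             if w not in STOPWORDS]
    keywords = [w for w, _ in Counter(words).most_common(n_keywords)]
    return {"summary": sentences[0], "keywords": keywords}

result = summarize(
    "Dam removal and river recovery",
    "Dam removal accelerates river recovery. We analyze 1,200 removals.",
)
print(result["summary"])  # -> Dam removal accelerates river recovery.
```

Whatever the real mechanism, the value proposition is the same: a visitor gets a scannable digest without the researcher lifting a finger.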

“AI is transforming the way we manage and share information, and Digital Science is proud to lead the way with solutions that enable the academic community to maximize its research impact,” said Jonathan Breeze, EVP Academic Markets at Digital Science. “Adding AI-powered summaries to Symplectic Elements public profiles will aid research discoverability and demonstrates Digital Science’s commitment to making use of AI technologies to support the research community.”

“This is another step in Digital Science’s program of AI-based product enhancements that work synergistically with research workflows,” said Daniel Hook, CEO of Digital Science. “Part of being a responsible contributor to infrastructure in the scholarly space is to ensure that new functionality is augmentative and safe to use, and we pride ourselves on an incremental approach designed to support researchers at all stages of their career.”

About Symplectic

Symplectic works in pursuit of the advancement of knowledge, delivering flexible research management solutions that help government and federal departments, funding organisations and academic institutions achieve their research goals. Its flagship products include Symplectic Elements, a research information management system that captures, analyses and showcases scholarly activities, and Symplectic Grant Tracker, providing specialist tools to streamline the grants management lifecycle.

About Digital Science

Digital Science is an AI-focused technology company providing innovative solutions to complex challenges faced by researchers, universities, funders, industry and publishers. We work in partnership to advance global research for the benefit of society. Through our brands – Altmetric, Dimensions, Figshare, IFI CLAIMS Patent Services, metaphacts, OntoChem, Overleaf, ReadCube, Symplectic, and Writefull – we believe when we solve problems together, we drive progress for all. Visit digital-science.com and follow Digital Science on Bluesky, on X or on LinkedIn.

Media contact

David Ellis, Press, PR & Social Manager, Digital Science: Mobile +61 447 783 023, d.ellis@digital-science.com

Deep minds: Reflections from the AI for Science Forum https://www.digital-science.com/blog/2024/11/ai-for-science-forum/ Thu, 28 Nov 2024 14:00:00 +0000 https://www.digital-science.com/?post_type=tldr_article&p=74440 Last week, Suze attended the AI for Science Forum, a gathering of incredible minds from across disciplines, each sharing perspectives on how AI is transforming research and impacting society. Organised by Google DeepMind and The Royal Society, the event brought together invited guests from across all segments of the research community to share their experiences and expertise, while also giving opportunities for attendees to meaningfully discuss how we can best wrangle this novel technology to increase the impact and reach of research with the resources currently at our disposal. Here’s a reflection on the day’s proceedings.

The post Deep minds: Reflections from the AI for Science Forum appeared first on Digital Science.

Last week, I attended the AI for Science Forum, a gathering of incredible minds from across disciplines, each sharing perspectives on how AI is transforming research and impacting society. Organised by Google DeepMind and The Royal Society, the event brought together invited guests from across all segments of the research community to share their experiences and expertise, while also giving opportunities for attendees to meaningfully discuss how we can best wrangle this novel technology to increase the impact and reach of research with the resources currently at our disposal.

The power of AI in research

Fresh-faced after the previous evening’s reception at The Royal Society, James Manyika, Senior Vice President at Google Alphabet, set the tone for the conference with a powerful opening session on the tangible benefits of AI. From AlphaFold’s protein-folding revolution to AI’s role in flood forecasting in Bangladesh, which has already impacted millions of people, his talk was a staggering reminder that AI is already being used in various innovative and impactful ways beyond those we are more familiar with. He also discussed AI-enabled solutions through a public health research lens, reminding attendees of advancements including diabetic retinopathy detection in low-resource settings and groundbreaking atmospheric simulations that give agricultural workers advance warning of what interventions may be required. This reminded me of our recent TL;DR Shorts episode with Dr Danny Hillis of Applied Invention, who talked about the potential impact that automated research could have in supporting non-traditional researchers. But James’s outlook for AI wasn’t all rosy: he underscored the limitations of this tool and emphasised the need for responsible approaches and equitable access to AI-powered tools, echoing his colleague Dr Astro Teller’s thoughts on this.

CRISPR meets AI

Nobel Prize-winning Chemist Professor Jennifer Doudna, Professor of Biochemistry, Biophysics and Structural Biology at UC Berkeley, and James Manyika picked up on this theme as they explored the synergy between CRISPR and AI. They discussed how CRISPR’s one-and-done gene therapies are accelerating in application thanks to AI’s ability to identify the genetic changes that drive outcomes, and democratising treatment options by providing a range of more affordable therapeutic options. Beyond healthcare, they also chatted about the impressive potential impact that AI will have on climate-related research, from drought-resistant crops to better carbon storage systems. What resonated most with me was their call to reduce the barriers – financial, technical, and geographic – to accessing the outcomes of this technology, making it truly global, and reflecting the recent thought we shared from Professor Lord Martin Rees.

Collaborating across disciplines

The focus shifted to the future of collaboration in a panel led by Eric Topol, author and Executive Vice President of Scripps Research, and featuring Fiona Marshall, President of Biomedical Research at Novartis, Alison Noble, Oxford University Technikos Professor of Biomedical Engineering, and Vice President & Foreign Secretary, The Royal Society, and Pushmeet Kohli, Vice President of Science, Google DeepMind. From AI revealing 2.2 million new stable inorganic crystals for potential use in everything from energy to electronics, to revolutionising biomedical imaging through natural image processing, the discussion highlighted how AI forces us to rethink and redefine what collaboration and trust look like. Alison’s comment on the importance of training scientists to understand AI’s errors stood out for me. The panel emphasised the shifting hierarchies and power dynamics in research, with data scientists increasingly leading labs – a significant cultural change, given that they were often seen as collaborators and service providers who were rarely even named on papers that couldn’t have been published without their expertise.

Climate, complexity, and community

Thomas Friedman gave an evocative talk on “climate weirding”, highlighting how we’re hitting many tipping points all at once (AI’s massive breakthroughs, climate change chaos, and global instability), while also pointing to AI’s potential to help us address carbon emissions and societal disorder. His call for politics to embrace science felt especially urgent amid discussions of mass migration and global conflict. His optimism about how AI can solve big problems, like making farming more efficient or cutting healthcare costs, came with a warning that we also need solid ethics and politics to go hand in hand with these developments to keep things on track, something that our recent Speaker Series guest Professor Jenny Reardon touched upon.

The next panel discussion on building research infrastructure echoed these sentiments. Chaired by Paul Hofheinz, President and Co-Founder of the Lisbon Council, it featured Asmeret Asefaw Berhe, Professor of Soil Biogeochemistry and Falasco Chair in Earth Sciences, University of California, Merced; Bosun Tijani, Minister of Communications, Innovation & Digital Economy, The Federal Republic of Nigeria; and Fabian J. Theis, Director of the Institute of Computational Biology and Professor at TUM Mathematics & Life Sciences. Both Paul and Asmeret stressed the need for equity and inclusion to be at the forefront of people’s agendas as they develop solutions using AI, to ensure that advancements don’t deepen the digital divide. Bosun Tijani’s discussion of Africa’s talent acceleration programs was inspiring, and proof that we can nurture talent globally if we commit to the cause. We recently heard from Joy Owango about how important it is to build infrastructure that ensures the persistence and visibility of research contributions from all across the globe, and how impactful this has already been in Africa and other parts of the Global South.

This theme continued as Lila Ibrahim, Chief Operating Officer at Google DeepMind, chaired a conversation about collaborating for impact with Dame Angela McLean, UK Government Chief Scientific Adviser; Ilan Gur, CEO of the Advanced Research and Invention Agency (ARIA); and Sir Paul Nurse, Director of the Francis Crick Institute, Nobel laureate, and returning President of the Royal Society. The panellists discussed the importance of thinking big and including diverse perspectives through better community engagement in science, something that X’s Dr Astro Teller talked about in a previous TL;DR Shorts episode. Dame Angela talked about how the government needs to aim higher, pushing for more thoughtful use of AI and predictive models, while Sir Paul stressed the need for mixing disciplines to boost innovation. Ilan shared his excitement about creating spaces where scientists from different fields can cross paths, sparking unexpected ideas. In an episode recorded at Sci Foo, itself a perfect example of a catalyst for collaboration, Dr Etosha Cave echoes this sentiment and the need for interdisciplinarity in innovation. The panel also discussed building trust, with Angela and Ilan both emphasising the importance of transparency in science and technology. All panellists highlighted the role of public engagement in encouraging people to engage with and trust these cutting-edge advances.

Public engagement and trust

The conference ended with a final discussion featuring recent Nobel Prize winners in Chemistry, Sir Demis Hassabis and John Jumper, as well as former winners Professor Jennifer Doudna, and Sir Paul Nurse. Their reflections on public engagement were poignant: how do we bridge the gap between experts and the public? Sir Paul’s call for deliberate public dialogue reminded me how crucial it is to address fears and misconceptions about AI before they grow into barriers. However, one issue that continually cropped up, and one I may have mentioned once or a million times in the past, is that as it stands, the framework within which we reward research success does not make space for valued and impactful public engagement, or even innovation and entrepreneurship. Mariette DiChristina had a few thoughts on this, and we’ll be hearing more from her in 2025 about the value of effective communication of, and engagement with research in the age of open research, research integrity, and novel technology such as AI.

Some key takeaways

  • AI isn’t just transforming research; it’s reshaping the cultures around it. We’re seeing shifts in leadership, collaboration, and the ethical frameworks underpinning research.
  • Accessibility remains a challenge. Whether it’s CRISPR or AI infrastructure, we need to ensure the benefits reach everyone, not just the privileged few.
  • Collaboration is more vital than ever. From breaking disciplinary silos to engaging the public, success hinges on our ability to connect diverse voices.

In a world increasingly shaped by AI, this conference left me both hopeful and reflective. Science thrives when it’s inclusive, transparent, and collaborative – and AI could give us a chance to embrace those ideals like never before, provided we build research methods and applications in thoughtful, considerate, trustworthy and community-minded ways. My teammates recently authored a report that, in true Digital Science style, was informed by reflections from our own research community. The report looks at the changing research landscape in the age of AI and echoes the many challenges and opportunities discussed at this conference. AI is an exciting development that is already changing the way we do research. However, we must hold each other accountable to ensure that its development and application are open to all.

With thanks to Google DeepMind and The Royal Society for hosting this event. You can watch all sessions on Google DeepMind’s YouTube channel.
