Investigating Document Type, Language, Publication Year, and Author Count Discrepancies Between OpenAlex and Web of Science

Preprint

paper
In sum, we would argue that assessing the validity of a metadata element can only be done by comparing it to the full range of possible truths and considering the metadata’s consistency with the database’s internal policy.
Authors

Phillipe Mongeon

Madelaine Hare

Poppy Riddle

Summer Wilson

Geoff Krause

Rebecca Marjoram

Rémi Toupin

Published

August 26, 2025

Abstract

Bibliometrics, whether used for research or research evaluation, relies on large multidisciplinary databases of research outputs and citation indices. The Web of Science (WoS) was the main supporting infrastructure of the field for more than 30 years until several new competitors emerged. OpenAlex, a bibliographic database launched in 2022, has distinguished itself through its openness and extensive coverage. While OpenAlex may reduce or eliminate barriers to accessing bibliometric data, one of the concerns that hinder its broader adoption for research and research evaluation is the quality of its metadata. This study aims to assess metadata quality in OpenAlex and WoS, focusing on document type, publication year, language, and number of authors. By addressing discrepancies and misattributions in metadata, this research seeks to enhance awareness of data quality issues that could impact bibliometric research and evaluation outcomes.

Citation

BibTeX citation:
@misc{mongeon2025,
  author = {Mongeon, Phillipe and Hare, Madelaine and Riddle, Poppy and
    Wilson, Summer and Krause, Geoff and Marjoram, Rebecca and Toupin,
    Rémi},
  title = {Investigating {Document} {Type,} {Language,} {Publication}
    {Year,} and {Author} {Count} {Discrepancies} {Between} {OpenAlex}
    and {Web} of {Science}},
  date = {2025-08-26},
  url = {https://doi.org/10.48550/arXiv.2508.18620},
  doi = {10.48550/arXiv.2508.18620},
  langid = {en}
}
For attribution, please cite this work as:
Mongeon, Phillipe, Madelaine Hare, Poppy Riddle, Summer Wilson, Geoff Krause, Rebecca Marjoram, and Rémi Toupin. 2025. “Investigating Document Type, Language, Publication Year, and Author Count Discrepancies Between OpenAlex and Web of Science.” arXiv. https://doi.org/10.48550/arXiv.2508.18620.