Unraveling the Ai2 Asta Scholarly Research Assistant Citation System

Main Article Content

Enrique Orduña-Malea
https://orcid.org/0000-0002-1989-8477
Carlos Lopezosa
https://orcid.org/0000-0001-8619-2194

Abstract

Despite the growing integration of Deep Research tools into academic workflows, empirical evidence on the operation, stability, and potential biases of their citation systems remains scarce. This study addresses this gap by evaluating the intensity, consistency, and bibliographic characteristics of references cited in the literature reports generated by Ai2 Asta, with the aim of understanding how its citation system operates and assessing its implications for scholarly communication. To this end, ten domain-specific queries were submitted to Asta’s Summarise Literature feature, and two independent rounds of data collection were conducted. From each report, in-text citations, cited references, and other metrics related to the response process were extracted and examined. The results reveal high citation intensity, with reports integrating numerous in-text citations grounded in retrieved evidence and a diverse yet concentrated set of venues. However, notable instability is observed in the composition of cited references across identical queries, alongside a lack of concordance between retrieved documents and those ultimately cited, suggesting additional opaque selection mechanisms during report generation. These findings indicate that, while Ai2 Asta produces well-structured, high-quality reports, its instability and opacity in the citation process pose challenges for quantitative science studies due to the resulting lack of reproducibility and transparency. Despite the restricted number of queries and disciplinary scope, the results offer valuable insights for researchers, bibliometricians, developers, and research evaluators seeking to understand, use, or regulate AI-based scholarly assistants responsibly.
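To illustrate the kind of consistency analysis described above, the following minimal Python sketch (not taken from the article; the identifiers and data are hypothetical) shows one way to quantify the overlap between the references cited in two rounds of the same query, and between retrieved and cited documents, using Jaccard similarity.

```python
# Minimal sketch (not the authors' code): one way to quantify the stability of
# Asta's cited references across two collection rounds for the same query.
# Assumes cited references have already been extracted from each report and
# are identified by a stable key such as a DOI or Semantic Scholar corpus ID.

def jaccard(a: set[str], b: set[str]) -> float:
    """Jaccard similarity between two sets of reference identifiers."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)

# Hypothetical data: cited-reference DOIs returned for one query in two rounds.
round_1 = {"10.1000/a", "10.1000/b", "10.1000/c", "10.1000/d"}
round_2 = {"10.1000/b", "10.1000/c", "10.1000/e"}
print(f"Cited-reference overlap across rounds: {jaccard(round_1, round_2):.2f}")  # 0.40

# The same measure can compare the documents retrieved by the system with those
# ultimately cited in the report, probing the concordance issue noted above.
retrieved = {"10.1000/a", "10.1000/b", "10.1000/x", "10.1000/y"}
print(f"Retrieved-vs-cited concordance: {jaccard(retrieved, round_1):.2f}")  # 0.33
```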

Article Details

Section

Miscellany

How to Cite

Orduña-Malea, E., & Lopezosa, C. (2025). Unraveling the Ai2 Asta Scholarly Research Assistant Citation System. Revista Panamericana De Comunicación, 7(2). https://doi.org/10.21555/rpc.v7i2.3675

References

Binz, M., Alaniz, S., Roskies, A., Aczel, B., Bergstrom, C. T., Allen, C., Schad, D., Wulff, D., West, J. D., Zhang, Q., Shiffrin, R. M., Gershman, S. J., Popov, V., Bender, E. M., Marelli, M., Botvinick, M. M., Akata, Z., & Schulz, E. (2025). How should the advancement of large language models affect the practice of science? Proceedings of the National Academy of Sciences, 122(5). https://doi.org/10.1073/pnas.2401227121

Chen, T. J. (2023). ChatGPT and other artificial intelligence applications speed up scientific writing. Journal of the Chinese Medical Association, 86(4), 351-353. https://doi.org/10.1097/JCMA.0000000000000900

Codina, L. (2025, June 11). Revisiones de la literatura con el uso de inteligencia artificial: propuesta de un nuevo marco de trabajo. Lluis Codina [Blog]. https://www.lluiscodina.com/revisiones-literatura-ia/

Gao, Y., Xiong, Y., Gao, X., Jia, K., Pan, J., Bi, Y., Dai, Y., Sun, J., Wang, H., & Wang, H. (2024). Retrieval-augmented generation for large language models: A survey. ArXiv [preprint]. https://doi.org/10.48550/arXiv.2312.10997

Grossmann, I., Feinberg, M., Parker, D. C., Christakis, N. A., Tetlock, P. E., & Cunningham, W. A. (2023). AI and the transformation of social science research. Science, 380(6650), 1108-1109. https://doi.org/10.1126/science.adi1778

Hassan-Montero, Y., De-Moya-Anegón, F., & Guerrero-Bote, V. P. (2022). SCImago Graphica: A new tool for exploring and visually communicating data. Profesional de la información, 31(5). https://doi.org/10.3145/epi.2022.sep.02

Hu, K. (2023). ChatGPT sets record for fastest-growing user base - analyst note. Reuters [media]. https://www.reuters.com/technology/chatgpt-sets-record-fastest-growing-user-base-analyst-note-2023-02-01

Huang, Y., Chen, Y., Zhang, H., Li, K., Zhou, H., Fang, M., Yang, L., Li, X., Shang, L., Xu, S., Hao, J., Shao, K., & Wang, J. (2025). Deep research agents: A systematic examination and roadmap. ArXiv [preprint]. https://doi.org/10.48550/arXiv.2506.18096

Jansen, B. J., Jung, S. G., & Salminen, J. (2023). Employing large language models in survey research. Natural Language Processing Journal, 4. https://doi.org/10.1016/j.nlp.2023.100020

Kinney, R. M., Anastasiades, C., Authur, R., Beltagy, I., Bragg, J., Buraczynski, A., Cachola, I., Candra, S., Chandrasekhar, Y., Cohan, A., Crawford, M., Downey, D., Dunkelberger, J., Etzioni, O., Evans, R., Feldman, S., Gorney, J., Graham, D.W., Hu, F., Huff, R., King, D., Kohlmeier, S., Kuehl, B., Langan, M., Lin, D., Liu, H., Lo, K., Lochner, J., MacMillan, K., Murray, T.C., Newell, C., Rao, S.R., Rohatgi, S., Sayre, P., Shen, Z., Singh, A., Soldaini, L., Subramanian, S., Tanaka, A., Wade, A.D., Wagner, L. M., Wang, L. L., Wilhelm, C., Wu, C., Yang, J., Zamarron, A., van Zuylen, M., & Weld, D.S. (2023). The Semantic Scholar Open Data Platform. ArXiv [preprint]. https://doi.org/10.48550/arXiv.2301.10140

Kousha, K., & Thelwall, M. (2024). Artificial intelligence to support publishing and peer review: A summary and review. Learned Publishing, 37(1), 4-12. https://doi.org/10.1002/leap.1570

Lewis, P., Perez, E., Piktus, A., Petroni, F., Karpukhin, V., Goyal, N., Küttler, H., Lewis, M., Yih, W.-T., Rocktäschel, T., Riedel, S., & Kiela, D. (2020). Retrieval-augmented generation for knowledge-intensive NLP tasks. In H. Larochelle, M. Ranzato, R. Hadsell, M. F. Balcan, & H. Lin (Eds.), Advances in Neural Information Processing Systems 33 (pp. 9459-9474). Curran Associates. https://proceedings.neurips.cc/paper_files/paper/2020/file/6b493230205f780e1bc26945df7481e5-Paper.pdf

Nejjar, M., Zacharias, L., Stiehle, F., & Weber, I. (2025). LLMs for science: Usage for code generation and data analysis. Journal of Software: Evolution and Process, 37(1). https://doi.org/10.1002/smr.2723

Ng, D. T. K., Leung, J. K. L., Chu, S. K. W., & Qiao, M. S. (2021). Conceptualizing AI literacy: An exploratory review. Computers and Education: Artificial Intelligence, 2. https://doi.org/10.1016/j.caeai.2021.100041

Orduña-Malea, E., & Cabezas-Clavijo, Á. (2023). ChatGPT and the potential growing of ghost bibliographic references. Scientometrics, 128(9), 5351-5355. https://doi.org/10.1007/s11192-023-04804-4

Rane, N. L., Tawde, A., Choudhary, S. P., & Rane, J. (2023). Contribution and performance of ChatGPT and other Large Language Models (LLM) for scientific and research advancements: A double-edged sword. International Research Journal of Modernization in Engineering Technology and Science, 5(10), 875-899. https://doi.org/10.56726/IRJMETS45312

Rossi, L., Harrison, K., & Shklovski, I. (2024). The problems of LLM-generated data in social science research. Sociologica: International Journal for Sociological Debate, 18(2), 145-168. https://doi.org/10.6092/issn.1971-8853/19576

Scherbakov, D., Hubig, N., Jansari, V., Bakumenko, A., & Lenert, L. A. (2025). The emergence of large language models as tools in literature reviews: A large language model-assisted systematic review. Journal of the American Medical Informatics Association, 32(6), 1071-1086. https://doi.org/10.1093/jamia/ocaf063

Silva, N., & Wickramaarachchi, D. (2025). Enhancing systematic literature reviews: Evaluating the performance of LLM-based tools across key systematic literature review stages. In 2025 5th International Conference on Advanced Research in Computing (ICARC) (pp. 1-6). IEEE. https://doi.org/10.1109/ICARC64760.2025.10963273

Singh, A., Chang, J. C., Anastasiades, C., Haddad, D., Naik, A., Tanaka, A., Zamarron, A., Nguyen, C., Hwang, J. D., Dunkelberger, J., Latzke, M., Rao, S. R., Lochner, J., Evans, R., Kinney, R., Weld, D. S., Downey, D., & Feldman, S. (2025). Ai2 Scholar QA: Organized literature synthesis with attribution. ArXiv [preprint]. https://doi.org/10.48550/arXiv.2504.10861

Sor, J. (2025). Sam Altman touts ChatGPT’s 800 million weekly users, double all its main competitors combined. Business Insider [media]. https://www.businessinsider.com/chatgpt-users-openai-sam-altman-devday-llm-artificial-intelligence-2025-10

Sun, Z. (2025). Large language models in peer review: challenges and opportunities. Scientometrics, 130, 5503–5546. https://doi.org/10.1007/s11192-025-05440-w

Tay, A. (2025). The rise of agent-based deep research: Exploring OpenAI’s Deep Research, Gemini Deep Research, Perplexity Deep Research, Ai2 ScholarQA, STORM, and more in 2025. Aaron Tay’s Musings About Librarianship [blog].

Walters, W.H., & Wilder, E.I. (2023). Fabrication and errors in the bibliographic citations generated by ChatGPT. Scientific Reports, 13(1). https://doi.org/10.1038/s41598-023-41032-5

Xian, J., Teofili, T., Pradeep, R., & Lin, J. (2024). Vector search with OpenAI embeddings: Lucene is all you need. In Proceedings of the 17th ACM International Conference on Web Search and Data Mining (pp. 1090-1093). ACM.

Xu, R., & Peng, J. (2025). A comprehensive survey of deep research: Systems, methodologies, and applications. ArXiv [preprint]. https://doi.org/10.48550/arXiv.2506.12594

Xu, T., Lu, P., Ye, L., Hu, X., & Liu, P. (2025). ResearcherBench: Evaluating deep AI research systems on the frontiers of scientific inquiry. ArXiv [preprint]. https://doi.org/10.48550/arXiv.2507.16280

Zheng, Y., Koh, H. Y., Ju, J., Nguyen, A. T., May, L. T., Webb, G. I., & Pan, S. (2023). Large language models for scientific synthesis, inference and explanation. ArXiv [preprint]. https://doi.org/10.48550/arXiv.2310.07984
