Paper Search Results


AuthorId: 3458736
Limit: 10
Sort by: score
Embedding: s2_recommendations

authorId(s): 3458736
Author(s): Dirk Groeneveld
Citation count | Paper | Authors | Year
378 | Construction of the Literature Graph in Semantic Scholar | Waleed Ammar, Dirk Groeneveld, ..., Oren Etzioni | 2018
355 | Documenting Large Webtext Corpora: A Case Study on the Colossal Clean Crawled Corpus | Jesse Dodge, Ana Marasovic, ..., Matt Gardner | 2021
210 | OLMo: Accelerating the Science of Language Models | Dirk Groeneveld, Iz Beltagy, ..., Hanna Hajishirzi | 2024
143 | Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research | Luca Soldaini, Rodney Kinney, ..., Kyle Lo | 2024
94 | From 'F' to 'A' on the N.Y. Regents Science Exams: An Overview of the Aristo Project | Peter Clark, Oren Etzioni, ..., Michael Schmitz | 2019
59 | What's In My Big Data? | Yanai Elazar, Akshita Bhagia, ..., Jesse Dodge | 2023
43 | Documenting the English Colossal Clean Crawled Corpus | Jesse Dodge, Maarten Sap, ..., Matt Gardner | 2021
40 | A Simple Yet Strong Pipeline for HotpotQA | Dirk Groeneveld, Tushar Khot, ..., Ashish Sabharwal | 2020
29 | DataComp-LM: In search of the next generation of training sets for language models | Jeffrey Li, Alex Fang, ..., Vaishaal Shankar | 2024
23 | IKE - An Interactive Tool for Knowledge Extraction | Bhavana Dalvi, Sumithra Bhakthavatsalam, ..., Dirk Groeneveld | 2016
