Paper Search Results


AuthorId: 10780897
Limit: 10
Sort by: score
Embedding: s2_recommendations

authorId(s): 10780897
Author(s): Damai Dai
Columns: score, citationCount, Paper, Authors, year
345
Knowledge Neurons in Pretrained Transformers
Damai Dai, Li Dong, ..., Furu Wei
2021
334
A Survey for In-context Learning
Qingxiu Dong, Lei Li, ..., Zhifang Sui
2023
288
A Survey on In-context Learning
Qingxiu Dong, Lei Li, ..., Zhifang Sui
2022
233
Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers
Damai Dai, Yutao Sun, ..., Furu Wei
2023
169
DeepSeek LLM: Scaling Open-Source Language Models with Longtermism
DeepSeek-AI, Xiao Bi, Deli Chen, ..., Yuheng Zou
2024
154
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Zhihong Shao, Damai Dai, ..., Huajian Xin
2024
130
Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning
Lean Wang, Lei Li, ..., Xu Sun
2023
111
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models
Damai Dai, Chengqi Deng, ..., W. Liang
2024
109
Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers
Damai Dai, Yutao Sun, ..., Furu Wei
2022
79
On the Representation Collapse of Sparse Mixture of Experts
Zewen Chi, Li Dong, ..., Furu Wei
2022
