Paper Search Results

AuthorId: 10780897
Limit: 10
Sort by: score
Embedding: s2_recommendations

authorId(s): 10780897
Author(s): Damai Dai

score citationCount Paper Authors year

345

Knowledge Neurons in Pretrained Transformers

Damai Dai, Li Dong, ..., Furu Wei
2021

334

A Survey for In-context Learning

Qingxiu Dong, Lei Li, ..., Zhifang Sui
2023

288

A Survey on In-context Learning

Qingxiu Dong, Lei Li, ..., Zhifang Sui
2022

233

Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers

Damai Dai, Yutao Sun, ..., Furu Wei
2023

169

DeepSeek LLM: Scaling Open-Source Language Models with Longtermism

DeepSeek-AI, Xiao Bi, Deli Chen, ..., Yuheng Zou
2024

154

DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

Zhihong Shao, Damai Dai, ..., Huajian Xin
2024

130

Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning

Lean Wang, Lei Li, ..., Xu Sun
2023

111

DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models

Damai Dai, Chengqi Deng, ..., W. Liang
2024

109

Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers

Damai Dai, Yutao Sun, ..., Furu Wei
2022

79

On the Representation Collapse of Sparse Mixture of Experts

Zewen Chi, Li Dong, ..., Furu Wei
2022
