Paper Search Results
Recommendations from ProNE (based on Citation Graph)
AuthorId: 10780897
Limit: 10
Sort by: score
Embedding: s2_recommendations
Author(s): Damai Dai
score | citationCount | Paper | Authors | year
 | 345 | Knowledge Neurons in Pretrained Transformers | Damai Dai, Li Dong, ..., Furu Wei | 2021
 | 334 | A Survey for In-context Learning | Qingxiu Dong, Lei Li, ..., Zhifang Sui | 2023
 | 288 | A Survey on In-context Learning | Qingxiu Dong, Lei Li, ..., Zhifang Sui | 2022
 | 233 | Why Can GPT Learn In-Context? Language Models Secretly Perform Gradient Descent as Meta-Optimizers | Damai Dai, Yutao Sun, ..., Furu Wei | 2023
 | 169 | DeepSeek LLM: Scaling Open-Source Language Models with Longtermism | DeepSeek-AI Xiao Bi, Deli Chen, ..., Yuheng Zou | 2024
 | 154 | DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model | Zhihong Shao, Damai Dai, ..., Huajian Xin | 2024
 | 130 | Label Words are Anchors: An Information Flow Perspective for Understanding In-Context Learning | Lean Wang, Lei Li, ..., Xu Sun | 2023
 | 111 | DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models | Damai Dai, Chengqi Deng, ..., W. Liang | 2024
 | 109 | Why Can GPT Learn In-Context? Language Models Implicitly Perform Gradient Descent as Meta-Optimizers | Damai Dai, Yutao Sun, ..., Furu Wei | 2022
 | 79 | On the Representation Collapse of Sparse Mixture of Experts | Zewen Chi, Li Dong, ..., Furu Wei | 2022