Mixture-of-Depths: Dynamically allocating compute in transformer-based language models
