DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model

preprint arXiv:2110.14168, 2021. Y. Cui, T. Liu, W. Che, L. Xiao, Z. Chen, W. Ma, S. Wang, and G. Hu. A span-extraction dataset for Chinese machine reading comprehension. In K. Inui, J. Jiang, V. Ng, and X.

52 pages | 1.23 MB













