DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
MoE-related communication costs. When expert parallelism is employed, the routed experts will be distributed across multiple devices. For each token, its MoE-related communication frequency is proportional selection of routed experts, we additionally ensure that the target experts of each token will be distributed on at most ? devices. To be specific, for each token, we first select ? devices that have experts model serving with pagedattention. In Proceedings of the ACM SIGOPS 29th Symposium on Operating Systems Principles, 2023. G. Lai, Q. Xie, H. Liu, Y. Yang, and E. H. Hovy. RACE: large-scale reading comprehension
0 credits | 52 pages | 1.23 MB | 1 year ago
OpenAI "A practical guide to building agents"
Advances in reasoning, multimodality, and tool use have unlocked a new category of LLM-powered systems known as agents. This guide is designed for product and engineering teams exploring how to build to perform the same workflows on the users’ behalf with a high degree of independence. Agents are systems that independently accomplish tasks on your behalf. A workflow is a sequence of steps that must be and transfer control back to the user. 02 It has access to various tools to interact with external systems—both to gather context and to take actions—and dynamically selects the appropriate tools depending
0 credits | 34 pages | 7.00 MB | 6 months ago
Trends Artificial Intelligence
next layers of AI infrastructure: agentic interfaces, enterprise copilots, real-world autonomous systems, and sovereign models. Rapid advances in artificial intelligence, compute infrastructure, and global implications are just starting to emerge. AI agents could reshape how users interact with digital systems – from customer support and onboarding to research, scheduling, and internal operations. Enterprises For AI = Artificial General Intelligence Artificial General Intelligence, or AGI, refers to systems capable of performing the full range of human intellectual tasks – reasoning, planning, learning
0 credits | 340 pages | 12.14 MB | 4 months ago
OpenAI - AI in the Enterprise
will increasingly do their best work with sophisticated, complex, interconnected workflows and systems. We’re seeing AI deliver significant, measurable improvements on three fronts: 01 Workforce performance our own work. An example: Our support teams were getting bogged down, spending time accessing systems, trying to understand context, craft responses, and take the right actions for customers. So we built an internal automation platform. It works on top of our existing workflows and systems to automate rote work and accelerate insight and action. Our first use case: working on top of Gmail to
0 credits | 25 pages | 9.48 MB | 5 months ago
Tsinghua University: DeepSeek+DeepResearch, Making Research as Simple as Chatting
snails distributed across a vertical rocky intertidal gradient. Functional Ecology 25:177-185 Bourdeau PE (2011) Constitutive and inducible defensive traits in co-occurring marine snails distributed across
0 credits | 85 pages | 8.31 MB | 8 months ago
OctoML OSS 2019 11 8
Intel, Microsoft, Apple, Qualcomm. 40+ years of combined experience in computer systems design and machine learning. Open Source at OctoML
0 credits | 16 pages | 1.77 MB | 5 months ago
TVM: Where Are We Going
Subsystem TPUs Tensorization Challenge Compute primitives scalar vector tensor Challenge: Build systems to support emerging tensor instructions Tensorization Challenge C = tvm.compute((m, n),
0 credits | 31 pages | 22.64 MB | 5 months ago
7 results in total