OpenAI 《A practical guide to building agents》
… Introduction: Large language models are becoming increasingly capable of handling complex, multi-step tasks. Advances in reasoning, multimodality, and tool use have unlocked a new category of LLM-powered … where smaller models succeed or fail. In summary, the principles for choosing a model are simple: 01 Set up evals to establish a performance baseline. 02 Focus on meeting your accuracy target with the best … documents to create LLM-friendly routines. In customer service, for example, routines can roughly map to individual articles in your knowledge base. Prompt agents to break down tasks: providing smaller …
0 码力 | 34 pages | 7.00 MB | 6 months ago
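The first principle in the snippet above, setting up evals to establish a performance baseline, is straightforward to make concrete. A minimal sketch, not taken from the guide; `call_model`, the containment-based scoring, and the example names are illustrative assumptions:

```python
from typing import Callable

def evaluate(call_model: Callable[[str], str], cases: list[tuple[str, str]]) -> float:
    """cases: (prompt, expected answer) pairs; returns the fraction answered acceptably."""
    if not cases:
        return 0.0
    correct = 0
    for prompt, expected in cases:
        answer = call_model(prompt)
        # Crude containment check; swap in whatever grading fits your task.
        correct += int(expected.strip().lower() in answer.strip().lower())
    return correct / len(cases)

# Establish the baseline with the most capable model first, then try smaller
# models and compare their scores against your accuracy target, e.g.:
#   baseline = evaluate(best_model_client, eval_cases)
#   candidate = evaluate(smaller_model_client, eval_cases)
```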
XDNN TVM - Nov 2019
… Xilinx Cloud DPU Processor (xDNNv3): configurable overlay processor; DNN-specific instruction set (convolution, max pool, etc.); any network, any image size; high frequency & high compute efficiency. [Block-diagram labels: scheduler, PE array, dispatcher, external memory, instruction fetcher, decoder, register map, WB/WR scheduler, control signals, misc calc, avg/max/ROI pool, element-wise.] … Xilinx Inference Flow [flow-diagram labels: MxNet, CPU layers, FPGA layers, runtime, image, model weights, calibration set, quantizer, compiler, tensor graph optimization, framework tensor graph to Xilinx tensor graph, frontend] …
0 码力 | 16 pages | 3.35 MB | 5 months ago
Trends Artificial Intelligence
… Artificial Intelligence (AI), May 30, 2025. Mary Meeker / Jay Simons / Daegwon Chae / Alexander Krey … Context: We set out to compile foundational trends related to AI. A starting collection of several disparate datapoints … Public Launch (Google = 9/98, ChatGPT = 11/22) … In 1998, tapping emerging Internet access, Google set out to ‘organize the world’s information and make it universally accessible and useful.’ Nearly … For TiVo, we use the launch of consumer sales on 3/31/99, when TiVo charged $499 for its 14-hour box set. We do not count TiVo subscription costs. We also use the iPhone 1’s 4GB entry-level price of $499 …
0 码力 | 340 pages | 12.14 MB | 4 months ago
Bring Your Own Codegen to TVM
… Dispatch | Codegen | Built Shared Library …
    runtime::PackedFunc DNNLModule::GetFunction(
        const std::string& name,
        const std::shared_ptr<ModuleNode>& sptr_to_self) {
      if (name == "init") {
        return PackedFunc([sptr_to_self, this](TVMArgs args, TVMRetValue* rv) {
          this->Init(args[0]);
          …
        });
      } else {
        std::string curr_id = GetSubgraphID(name);
        return PackedFunc([sptr_to_self, curr_id, this](TVMArgs args, TVMRetValue* rv) {
          auto out = reinterpret_cast<…>(args[args.size() - 1] …->data);
          std::string encoded_name = kDnnlPrefix + curr_id;
          …
          auto func_s = reinter…
0 码力 | 19 pages | 504.69 KB | 5 months ago
DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
… and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly … boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement … Contents: Introduction (p. 4); 2 Architecture (p. 6); 2.1 Multi-Head Latent Attention: Boosting Inference Efficiency (p. 6); 2.1.1 Preliminaries: Standard Multi-Head Attention …
0 码力 | 52 pages | 1.23 MB | 1 year ago
Deploy VTA on Intel FPGA
… Moore's Law is Slowing Down (Motivation) … Multi-Vendor Support (Motivation) … Terasic DE10-Nano … Software - Driver: Cyclone V & Arria V SoC HPS Physical Memory Map … Hardware Configure … (slides © 2019 Harman International Industries)
0 码力 | 12 pages | 1.35 MB | 5 months ago
OctoML OSS 2019-11-08
… truncating division … Unified Object and Node system for the TVM runtime: lays groundwork for improved multi-language support for exposing the runtime and IRs … Unified Object Protocol: vm::Object …
0 码力 | 16 pages | 1.77 MB | 5 months ago
DeepSeek图解10页PDF
… stronger ability to model long-range dependencies. A Transformer is built from several key components: 1. Self-Attention: while processing text, the model automatically attends to the important words in a sentence and captures the relationships between different words. 2. Multi-Head Attention: multiple attention heads analyze different semantic information in parallel, giving the model a stronger ability to understand. 3. Feed-Forward Network (FFN): a non-linear transformation module that raises the model's expressive power. 4. Positional Encoding (Positional …
0 码力 | 11 pages | 2.64 MB | 8 months ago
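The components listed in that snippet are easiest to see in code. A minimal NumPy sketch, not taken from the booklet, of single-head scaled dot-product self-attention and the multi-head variant built on it; shapes, weight layout, and function names are illustrative assumptions (the FFN and positional encoding from the same list would wrap around this block in a full Transformer layer):

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(x, w_q, w_k, w_v):
    """x: (seq_len, d_model); w_*: (d_model, d_head). One attention head."""
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    # Each token scores every other token, so important words get more weight.
    scores = q @ k.T / np.sqrt(k.shape[-1])
    return softmax(scores, axis=-1) @ v          # (seq_len, d_head)

def multi_head_attention(x, heads):
    """heads: list of (w_q, w_k, w_v) triples; heads run in parallel and are concatenated."""
    return np.concatenate([self_attention(x, *h) for h in heads], axis=-1)
```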
开源中国 2023 大模型(LLM)技术报告
… technology has also played a key role. Beyond that, it has shown strong general-purpose ability in tasks such as code generation, text summarization, and translation. Written from a practitioner's point of view, this report takes a close look at the background of LLM technology, its infrastructure, the current state of applications, and the related tools and platforms. … LLM Tech Map: vector databases; vector support in databases; LLM frameworks and fine-tuning; LLM training platforms and tools; infrastructure; LLM agents; Chinese LLMs registered and launched; well-known LLMs; well-known LLM applications …
0 码力 | 32 pages | 13.09 MB | 1 year ago
Google 《Prompt Engineering v7》
… tokens and what the LLM has seen during its training. When you write a prompt, you are attempting to set up the LLM to predict the right sequence of tokens. Prompt engineering is the process of designing … • If you set temperature to 0, top-K and top-P become irrelevant: the most probable token becomes the next token predicted. If you set temperature extremely high (above 1, generally … predicted token. • If you set top-K to 1, temperature and top-P become irrelevant. Only one token passes the top-K criteria, and that token is the next predicted token. If you set top-K extremely high …
0 码力 | 68 pages | 6.50 MB | 6 months ago
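A minimal sketch, not taken from the guide, of a sampler that makes those interactions visible: temperature 0 or top-K = 1 collapses to greedy decoding, so the remaining settings stop mattering. The function name and default values are illustrative assumptions:

```python
import numpy as np

def sample_next_token(logits, temperature=1.0, top_k=50, top_p=0.95, rng=None):
    """Pick the next token id from raw logits using temperature, top-K and top-P."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64)

    # Temperature 0 (or top-K = 1) degenerates to greedy decoding: the most
    # probable token always wins, so the other settings no longer matter.
    if temperature == 0 or top_k == 1:
        return int(np.argmax(logits))

    z = logits / temperature
    z -= z.max()                      # numerical stability
    probs = np.exp(z)
    probs /= probs.sum()

    # Top-K: keep only the K most probable tokens.
    keep = np.argsort(probs)[::-1][:top_k]

    # Top-P: further truncate to the smallest prefix whose cumulative mass reaches top_p.
    cutoff = int(np.searchsorted(np.cumsum(probs[keep]), top_p)) + 1
    keep = keep[:cutoff]

    kept_probs = probs[keep] / probs[keep].sum()
    return int(rng.choice(keep, p=kept_probs))
```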
14 items in total













