AI Performance - IT文库 (IT Library): free downloads of programming and IT e-books and documents


Performance Matters

PERFORMANCE MATTERS (joint work with Charlie Curtsinger, Grinnell College). Emery Berger, College of Information and Computer Sciences, UMass Amherst; emeryberger.com, @emeryberger. A short time ago: un.bmp. Ogle is too slow! OGLE’84 is too slow! [Chart: Transistors (millions) and Clock Speed (MHz) by year, titled "Performance used to be easy"] gle loading… No mojitos for me… Back to the present… [Chart: Transistors (millions) and Clock Speed (MHz) by year, titled "Performance not easy anymore"]

0 credits | 197 pages | 11.90 MB | 6 months ago
Performance Engineering: Being Friendly to Your Hardware

Being Friendly to Your Hardware. Performance Engineering: a gentle introduction to hardware for software engineers. Where does C++ run? On an abstract C++ machine. On an abstract C++ machine? … In most practical cases at boot time only. Same capacity, different composition => different performance profile (from the JESD 79-4 DDR4 specification). Memory: • Memory system is in the uncore • Cores act … Multiple instructions resulting in fewer operations • ISA restrictions may have an impact on performance. Imaginary ARM: mov r20, 0x123456789abcdef0. Register renaming. Branching: Fetch, Decode, Queue

0 credits | 111 pages | 2.23 MB | 6 months ago
Modern C++ for Parallelism in High Performance Computing

Poster submission: Modern C++ for Parallelism in High Performance Computing. Victor Eijkhout, CppCon 2024. Introduction: This poster reports on ‘D2D’, a benchmark that explores elegance of expression and … context of a High Performance Computing ‘mini-application’. The same code has been implemented using a number of different approaches to parallelism. Implementations are discussed with performance results. Relevance: … multi-dimensional arrays through ‘mdspan’, it is interesting to explore what C++ can offer for lower level performance critical operations. Scientific computing is an interesting test case since many algorithms are

0 credits | 3 pages | 91.16 KB | 6 months ago
High-Performance Numerical Integration in the Age of C++26

Outline: Introduction, First steps, Context, Theoretical foundations, Outline of an implementation, Conclusion. High-Performance Numerical Integration in the Age of C++26. Vincent Reverdy, Laboratoire d’Annecy de Physique des … past, other languages do far better in terms of everything: functionality, ease of use, and even performance. This talk: The goal is NOT to revolutionize everything or show a library that beats everything … algorithms. Runge-Kutta Methods (RK): $y_{n+1} = y_n + h \sum_{i=1}^{s} b_i k_i$, with $k_i = f\left(t_n + c_i h,\; y_n + (a_{i1} k_1 + a_{i2} k_2 + \cdots + a_{i,i-1} k_{i-1}) h\right)$. Linear Multistep Methods (LMM): $y_{n+s} + a_{s-1} \cdot y_{n+s-1} + a_{s-2} \cdot y_{n+s-2} + \cdots$

0 credits | 57 pages | 4.14 MB | 6 months ago
Powered by AI: A Cambrian Explosion for C++ Software Development Tools

University of Massachusetts Amherst. Powered by AI: A Cambrian Explosion for C++ Software Development Tools. Emery Berger. Cretaceous–Paleogene (K-Pg) extinction event … [Figure labels: ALLOCATED MEMORY USAGE GPU UTIL %, PEAK MEMORY (MB/s) MEMORY PYTHON NATIVE] AI-powered optimizations! AI-powered optimizations... COMING SOON! evolve … profiler that suggests optimizations

0 credits | 128 pages | 23.40 MB | 6 months ago
3. Cloud-Native Edge-Cloud Collaborative AI Framework in Practice

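Cloud-Native Edge-Cloud Collaborative AI Framework in Practice. Pu Jie (普杰), Senior Engineer, Huawei Cloud Edge Cloud Innovation Lab; KubeEdge SIG AI Tech Lead. Contents: 01 Edge AI: current state and trends; 02 Sedna: an edge-cloud collaborative AI framework; 03 Sedna-GM: K8S Operator; 04 Case studies. Part 1: Edge AI, current state and trends. Why Edge AI? • The cloud-centric AI computing paradigm cannot meet the demands of on-device AI applications for real-time response, accuracy, and strong interactivity • With the development of large models, the compute demand of AI workloads is growing sharply and rapidly. AI is being applied to more and more edge scenarios. Distributed collaborative AI, concept: technology that deploys part of the AI-related tasks to edge devices and, building on edge devices, edge servers, and cloud servers, realizes AI in a distributed or even distributed-collaborative manner. Data is generated at the edge; the edge side is gradually acquiring AI capabilities. Core driving forces of distributed collaborative AI: • As edge-side compute gradually strengthens, edge AI continues to evolve into distributed collaborative AI. Technical challenges of distributed collaborative AI: 1. Fragmented edge resources 2. Edge data silos 3. Few edge samples 4. Heterogeneous edge data. Part 2: Edge-cloud collaborative AI framework. Sedna, the first open-source project for distributed collaborative AI: built on the edge-cloud collaboration capabilities provided by KubeEdge, it supports moving existing AI applications seamlessly down to the edge and serves distributed collaborative machine learning ✓ Lower build and deployment costs ✓ Better model performance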

0 credits | 37 pages | 2.36 MB | 1 year ago
Nim - the first high performance language with full support for hot code-reloading at runtime

Nim - the first high performance language with full support for hot code-reloading at runtime, by Viktor … Simply Nim: statically typed; high performance (compiles to native binaries - comparable to C/C++); very clean & elegant - no, beauty is NOT subjective … next big thing: - Andrei Alexandrescu … Nim is one of the most logical paths forward: on-par performance with C/C++ (compiles to them); some of the easiest interop with C/C++ (compiles to them)

0 credits | 63 pages | 2.91 MB | 1 year ago
Writing Python Bindings for C++ Libraries: Easy-to-use Performance

… volume in terabytes ● Program analysis research and functional programming in a past life ● Love performance, software abstractions, and clean APIs. Why Python? ● Writing extensive APIs in Python - low boilerplate … We’re at CppCon :) Why Python? Why C++? ● Why? ○ Avoid reimplementing complex code for Python ○ Performance ○ Back and forth with user’s Python code ○ Interoperability with data structures in Python - … things: ○ Deal with actual pointers and C++ data types ● The compiled program keeps most of the performance and dynamism of an interpreted language, and: ○ is now a C++ .so ○ is not an interpreted script. Cython

0 credits | 118 pages | 2.18 MB | 6 months ago
High-Performance Cross-Platform Architecture: C++20 Innovations

… is moved into general-purpose registers for computations • Depending on the platform, may see a performance gain at this stage. Quat Functions: template inline

0 credits | 75 pages | 581.83 KB | 6 months ago
Symbolic Calculus for High-Performance Computing: From Scratch Using C++23

Outline: Binding, Constraints, Architecture, Substitution, Construction, Conclusion. Symbolic Calculus for High-Performance Computing from Scratch using C++23. Vincent Reverdy, Laboratoire d’Annecy de Physique des Particules … all know about optimization, performance, parallelism, ... What this talk is not about: Complicated maths (you are smart people, you can do it yourself); High-performance computing (you all know about … concepts; Symbolic calculus (derivatives, integrals); Full blown custom rule-based rewriting. High-performance: Since formulas have the entire information on the mathematical AST, it’s possible to generate

0 credits | 70 pages | 1.80 MB | 6 months ago

