Building a Coroutine-Based Job System Without Standard Library
    [Slide diagram labels: isa::await_suspend, Scheduler, Worker (x4), Job Queue (x3), Job, coroutine_handle, Job::Job, *isa == initial_suspend_awaitable, get_return_object, token<>, isa, Initial_suspend]
    0 credits | 120 pages | 2.20 MB | 6 months ago
Performance Engineering: Being Friendly to Your Hardware
    ... a pattern • And logically combine them • Multiple instructions resulting in fewer operations • ISA restrictions may have an impact on performance ... Imaginary ARM: mov r20, 0x123456789abcdef0 ... Register renaming, MEM, Retirement • Execution of operations • With some exceptions • Memory operations • Within ISA memory model restrictions ... What is speculative? Branching, Fetch, Decode, Queue, Allocation, Scheduling ... 0f 1f 84 data16 data16 nop WORD PTR cs:[rax+rax*1+0x0] 129b: 00 00 00 00 00 ... Scalar base ISA only, no vectorization. Memory operations noticeably suboptimal. Caching system will step in, with some ...
    0 credits | 111 pages | 2.23 MB | 6 months ago
Just-in-Time Compilation - J F Bastien - CppCon 2020
    ... binary translation: execute the program from one Instruction Set Architecture in another (or the same) ISA, performing the translation dynamically. In other words, disassemble the binary as you try to execute ... isn’t actually x86. * Hardware that can get faster through firmware updates. * A stable-seeming ISA, with hardware that can radically change at each generation. The VLIW’s native instruction set bears ... 2017: The marriage of PNaCl and Emscripten / asm.js. With a strong execution model: the virtual ISA is well defined. It pretends to be a modern CPU, but is actually portable. There’s still much more ...
    0 credits | 111 pages | 3.98 MB | 6 months ago
CppCon 2021: Persistent Data Structures
    ... provides instructions to ensure durability and ordering. Example: ▶ clwb: x86 ISA cacheline writeback ▶ sfence: x86 ISA fence ... A Persistent Hash Map for Graph Processing Workloads and a Methodology for ...
    0 credits | 56 pages | 1.90 MB | 6 months ago
simd: How to Express Inherent Parallelism Efficiently Via Data-Parallel Types
    ... the target architecture. Use native_simd<T> with the TS! • The ABI tag enables support for future ISA extensions without breaking existing code. The dreaded ABI break becomes an ABI addition… Consequence ...
    0 credits | 160 pages | 8.82 MB | 6 months ago
5 results in total