A New Dragon in the Den: Fast Conversion From Floating-Point Numbers
Benchmarks (Intel i7 (10510U), gcc 13.2.1). Centred (b): teju x dragonbox: wins 99.5%, ties 0.0%, losses 0.5%; teju x ryu: wins 100.0%, ties 0.0%, losses 0.0%. Uncentred (a): teju x dragonbox: wins 38.2%, ties 0.0%, losses 61.8% ...
0 码力 | 171 pages | 6.42 MB | 6 months ago
Template-Less Meta-Programming
... Benchmarks: https://qlibs.github.io/mp ... Circle-lang meta model is the fastest to compile all around ... recursive template instantiations ( ) ... boost.mp11, std::tuple ...
0 码力 | 130 pages | 5.79 MB | 6 months ago
Designing a Slimmer Vector of Variants
... candidate designs, refining as we go, and then presents interesting implications of the approach, benchmarks, and lessons learned • I implemented the data structure mostly for the fun of it, as well as to ... Benchmarks! © 2024 Bloomberg Finance L.P. All rights reserved. ...
0 码力 | 64 pages | 1.98 MB | 6 months ago
When Lock-Free Still Isn't Enough: An Introduction to Wait-Free Programming and Concurrency Techniques
... implications • An example of an elegant wait-free algorithm and wait-free design • Some simple benchmarks. Some assumed knowledge • You know what std::atomic does and what it is used for • You’ve heard ... performance. Never guess. The rest of this talk: • How to guess about performance • We’ll do some benchmarks too, I promise. Progress guarantees • Progress guarantees are a way to theoretically categorize ... this step. Solution: One and only one decrement must “take credit” for zeroing the counter. Benchmarks • My atomic- implementation using the wait-free counter versus the lock-free counter
0 码力 | 33 pages | 817.96 KB | 6 months ago
MACRO-FREE TESTING WITH C++20
... BENCHMARKS - 10'000 TESTS, 20'000 ASSERTS, 100 CPP FILES ... SUITE+ASSERT+STL ... BENCHMARKS - ...
0 码力 | 53 pages | 1.98 MB | 6 months ago
CppCon 2021: Persistent Data Structures
... to store everything else (e.g. code) ▶ The OS is Ubuntu 18.04 LTS ▶ The application and micro-benchmarks were compiled using gcc 7.4 with the -O3 optimization flag and C++14 standard flags ... Experimental Setup: Micro-benchmarks ▶ Operation ratio for write-dominated workload ▶ Lists: 40% Insert, 40% Delete, 20% Find ▶ Map: ... Demonstration Settings: Micro-benchmarks ▶ Operation ratio: 33% Insert, 33% Delete, 34% Find ▶ Number of Transactions: 10K ▶ Key Range: ...
0 码力 | 56 pages | 1.90 MB | 6 months ago
Performance Matters
... across the whole benchmark suite ... evaluation of LLVM’s optimizations with STABILIZER ... first, build benchmarks with STABILIZER ... Build programs with STABILIZER: > szc main.c ... now run the benchmarks ... [plot: Percent of Observed Runtimes vs. Time (s)] ... Run benchmarks as usual ... A′, A, ×30 ... drop the results ...
0 码力 | 197 pages | 11.90 MB | 6 months ago
Hidden Overhead of a Function API
How will we compare performance? ● Benchmarks at this low level are not too reliable, and also don’t represent performance in large projects well. ... Dynamic instruction count is more reliable on modern CPUs. ...
0 码力 | 158 pages | 2.46 MB | 6 months ago
Algorithmic Complexity
... to perform the two operations inside a single loop? It might be better due to data locality; see benchmarks with std::list and std::vector (and see also SO discussion with additional alternatives). ... prediction / other accelerations: This is one of the most famous questions in SO. On the other hand, benchmarks are quite confusing… - a benchmark without optimization (not a good way to benchmark) - a benchmark ...
0 码力 | 52 pages | 1.01 MB | 6 months ago
Continuous Regression Testing for Safer and Faster Refactoring
... student.gpa, touca::decimal_rule::min_absolute(3)); ... Aurora Innovation ... Tracking performance benchmarks: $ touca plugin add plugins://google_benchmark $ touca google_benchmark output.json ... #include ... 40, "mhz_per_cpu": 2801, "cpu_scaling_enabled": false, "build_type": "debug" }, "benchmarks": [ { "name": "BM_String", "iterations": 94877, "real_time": 29275, ...
0 码力 | 85 pages | 11.66 MB | 6 months ago
77 results in total













