OpenMetrics - Standing on the shoulders of Titans
People Acknowledgements Main work has been done by Prometheus team Ben Kochie Brian Brazil myself Google Sumeer Bhola Uber Jerome Froelich Rob Skillington Richard Hartmann, RichiH@{freenode,OFTC,IRCnet} OpenMetrics Outro People First commitments, too many for full list Cloudflare CNCF at large GitLab Google Grafana InfluxData Prometheus ;) RobustPerception SpaceNet Uber Richard Hartmann, RichiH@{freenode support since 0.4.0 Test your own OM output: robustperception.io/checking-openmetrics-output-is-valid Google and Uber want to create another reference parser to weed out bugs Richard Hartmann, RichiH@{freenode
21 pages | 84.83 KB | 1 year ago

Prometheus Deep Dive - Monitoring. At scale.
Prometheus Deep Dive Introduction Intro 2.0 to 2.2.1 2.4 - 2.6 Beyond Outro Prometheus 101 Inspired by Google’s Borgmon Time series database int64 timestamp, float64 value Ecosystem of instrumentation & exporters What do? We are spinning out Prometheus’ exposition format Face-to-face kick-off last August at Google London Independent CNCF member project, IETF RFC, test suite, etc We are writing code in Prometheus Outro OpenMetrics First committers to adopt, too many to list all Cloudflare CNCF at large GitLab Google Grafana InfluxData Kausal.co Oath.com / Yahoo / Verizon RobustPerception SpaceNet Uber Richard Hartmann
34 pages | 370.20 KB | 1 year ago

Intro to Prometheus - With a dash of operations & observability
Prometheus Introduction Background Operations & observability Outro Prometheus 101 Inspired by Google’s Borgmon Time series database uint64 millisecond timestamp, float64 value Instrumentation & exporters
19 pages | 63.73 KB | 1 year ago

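1.6 Building a comprehensive monitoring system with Nightingale's extension capabilities
If your company's business depends heavily on IT and IT failures directly affect revenue, you must take your stability framework seriously, and monitoring is a critical part of that framework. Sources of operations monitoring requirements 01. The original need for monitoring comes from business stability. The picture on the left is a 2013 news item about the impact of a Google outage. In 2020 there was also a large-scale AWS outage, whose impact went well beyond US$550,000 and directly affected more than half of the internet! In 2018, a US research firm reported that one minute of server downtime costs banks US$270,000 and manufacturers US$420,000
40 pages | 3.85 MB | 1 year ago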