File name: Web Operations: Keep data on time (Chinese title: 网站运维)
File size: 12.12 MB
File format: PDF
Last updated: 2014-06-16 16:51:55
Web Operations (网站运维)
Description: A web application involves many specialists, and web operations staff must ensure that every part of the application works properly throughout its life cycle. This is the expertise you need when a start-up is hit by an unexpected spike in traffic, or when a new feature causes a mature application to fail. In this collection of essays and interviews, web operations veterans Theo Schlossnagle, Baron Schwartz, and Alistair Croll offer their insights into this rapidly changing field. You will also learn what it takes to make a site thrive, with first-hand accounts from the builders of some of the largest websites.
· Learn the skills of web operations, and why they come from experience rather than schooling
· Understand why it is important to gather metrics from both the application and the infrastructure
· Consider common approaches to database architecture and the pitfalls that come with growing scale
· Learn how to handle the human factors of outages and degradations
· Find ways to avoid disaster after a sudden flood of traffic
· Work out what went wrong after a problem occurs, and prevent it from happening again

Contents
Foreword
Preface
1 Web Operations: The Career (Theo Schlossnagle)
  Why Does Web Operations Have It Tough?
  From Apprentice to Master
  Conclusion
2 How Picnik Uses Cloud Computing: Lessons Learned (Justin Huff)
  Where the Cloud Fits (and Why!)
  Where the Cloud Doesn't Fit (for Picnik)
  Conclusion
3 Infrastructure and Application Metrics (John Allspaw, with Matt Massie)
  Time Resolution and Retention Concerns
  Locality of Metrics Collection and Storage
  Layers of Metrics
  Providing Context for Anomaly Detection and Alerts
  Log Lines Are Metrics, Too
  Correlation with Change Management and Incident Timelines
  Making Metrics Available to Your Alerting Mechanisms
  Using Metrics to Guide Load-Feedback Mechanisms
  A Metrics Collection System, Illustrated: Ganglia
  Conclusion
4 Continuous Deployment (Eric Ries)
  Small Batches Mean Faster Feedback
  Small Batches Mean Problems Are Instantly Localized
  Small Batches Reduce Risk
  Small Batches Reduce Overhead
  The Quality Defenders' Lament
  Getting Started
  Continuous Deployment Is for Mission-Critical Applications
  Conclusion
5 Infrastructure as Code (Adam Jacob)
  Service-Oriented Architecture
  Conclusion
6 Monitoring (Patrick Debois)
  Story: "The Start of a Journey"
  Step 1: Understand What You Are Monitoring
  Step 2: Understand Normal Behavior
  Step 3: Be Prepared and Learn
  Conclusion
7 How Complex Systems Fail (John Allspaw and Richard Cook)
  How Complex Systems Fail
  Further Reading
8 Community Management and Web Operations (Heather Champ and John Allspaw)
9 Dealing with Unexpected Traffic Spikes (Brian Moon)
  How It All Started
  Alarms Abound
  Putting Out the Fire
  Surviving the Weekend
  Preparing for the Future
  CDN to the Rescue
  Proxy Servers
  Corralling the Stampede
  Streamlining the Codebase
  How Do We Know It Works?
  The Real Test
  Lessons Learned
  Improvements Since Then
10 Dev and Ops Collaboration and Cooperation (Paul Hammond)
  Deployment
  Shared, Open Infrastructure
  Trust
  On-Call Developers
  Avoiding Blame
  Conclusion
11 How Your Visitors Feel: User-Facing Metrics (Alistair Croll and Sean Power)
  Why Collect User-Facing Metrics?
  What Makes a Site Slow?
  Measuring Delay
  Building an SLA
  Visitor Outcomes: Analytics
  Other Metrics Marketing Cares About
  How User Experience Affects Web Ops
  The Future of Web Monitoring
  Conclusion
12 Relational Database Strategy and Tactics for the Web (Baron Schwartz)
  Requirements for Web Databases
  How Typical Web Databases Grow
  The Yearning for a Cluster
  Database Strategy
  Database Tactics
  Conclusion
13 How to Make Failure Beautiful: The Art and Science of Postmortems (Jake Loomis)
  The Worst Postmortem
  What Is a Postmortem?
  When to Conduct a Postmortem
  Who to Invite to a Postmortem
  Running a Postmortem
  Postmortem Follow-Up
  Conclusion
14 Storage (Anoop Nagwani)
  Data Asset Inventory
  Data Protection
  Capacity Planning
  Storage Sizing
  Operations
  Conclusion
15 Nonrelational Databases (Eric Florenzano)
  NoSQL Database Overview
  Some Systems in Detail
  Conclusion
16 Agile Infrastructure (Andrew Clay Shafer)
  Agile Infrastructure
  So, What's the Problem?
  Communities of Interest and Practice
  Trading Zones and Apologies
  Conclusion
17 Things That Go Bump in the Night (and How to Sleep Through Them) (Mike Christian)
  Definitions
  How Many 9s?
  Impact Duration Versus Incident Duration
  Datacenter Footprint
  Gradual Failures
  Trust Nobody
  Failover Testing
  Monitoring and History of Patterns
  Getting a Good Night's Sleep
Contributors
Index