Never Have I Valued Infrastructure More ▪ Things I detest now ▪ Everything outside of my application ▪ Connecting anything to anything ▪ Updating dependencies ▪ Secrets management ▪ Bash ▪ YAML ▪ Patching ▪ Building Kubernetes deployment files (mostly by Googling) ▪ Why my cloud costs are so high @RealGeneKim
The Value Of Platforms ▪ Enable developer productivity ▪ Self-service ▪ On-demand ▪ Immediacy and fast feedback ▪ Focus and flow ▪ Joy ▪ Monitoring, deployment, environment creation, security scans, orchestration… @RealGeneKim
There’s Never Been A Better Time for Infrastructure and Operations @RealGeneKim
Flow: Dr. Mihaly Csikszentmihalyi ● State of Flow ● Two types of learning ● Procedural Learning ● One-shot Learning @RealGeneKim
“What is your lead time for changes?” “How long does it take to go from code committed to code successfully running in production?” @RealGeneKim
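One way to answer this question from data is to pair each change's commit timestamp with the timestamp of its successful production deployment, then summarize the differences. The sketch below is a minimal illustration only: the (commit time, deploy time) record shape, the function name `lead_times`, and the sample values are all assumptions, not something taken from the talk.

```python
from datetime import datetime, timedelta
from statistics import median


def lead_times(changes):
    """Return the lead time (deploy time minus commit time) for each change."""
    return [deployed - committed for committed, deployed in changes]


# Hypothetical sample data: (code committed, code successfully running in production)
changes = [
    (datetime(2019, 6, 3, 9, 0), datetime(2019, 6, 3, 9, 45)),
    (datetime(2019, 6, 3, 13, 0), datetime(2019, 6, 4, 13, 0)),
    (datetime(2019, 6, 5, 10, 0), datetime(2019, 6, 5, 12, 0)),
]

# The median is less sensitive to a single slow change than the mean.
median_lead_time = median(lead_times(changes))
print(median_lead_time)
```

Reporting the median rather than the mean is a design choice here, so that one unusually slow deployment does not dominate the figure.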
Product Design and Development ▪ Create new products and services that solve customer problems using hypothesis-driven delivery, modern UX, design thinking ▪ Feature design and implementation may require work that has never been done before ▪ Estimates are highly uncertain ▪ Outcomes are highly variable
Product Delivery (Build, Test, Deploy) ▪ Enable fast flow from development to production and reliable releases by standardizing work, reducing variability and batch sizes ▪ Integration, test and deployment must be performed continuously, as quickly as possible ▪ Cycle times should be well-known and predictable ▪ Outcomes should have low variability
Change Committed Into Version Control @RealGeneKim Source: The DevOps Handbook
What Is The One Question That Predicts Performance With Startling Accuracy? @RealGeneKim
“To what degree do we fear doing deployments?” @RealGeneKim Source: Puppet Labs 2015 State Of DevOps: https://puppetlabs.com/2015-devops-report
The Second Ideal: Focus and Flow ▪ Ideal: when you can implement and test your feature on your Dev laptop, and learn whether it worked in seconds ▪ Not Ideal: when the only way you can determine whether your feature worked is waiting minutes, hours, or days… or weeks… @RealGeneKim
The Second Ideal: Focus and Flow ▪ Ideal: trunk based development ▪ Not Ideal: 5 days merging, 50 people in conference rooms @RealGeneKim
Ideal #3: Improvement Of Daily Work @RealGeneKim
Third Ideal: Improvement of Daily Work ▪ Not Ideal: TWWADI ▪ “The Way We’ve Always Done It” ▪ Ideal: MTBTT ▪ “Make Tomorrow Better Than Today” (Google SRE Principle #2) @RealGeneKim
Not Ideal “In manufacturing, the absence of effective feedback often contributes to major quality and safety problems. In one well-documented case at the General Motors Fremont manufacturing plant, there were no effective procedures in place to detect problems during the assembly process, nor were there explicit procedures on what to do when problems were found. “As a result, there were instances of engines being put in backward, cars missing steering wheels or tires, and cars even having to be towed off the assembly line because they wouldn’t start.” @RealGeneKim Source: DevOps Handbook
Ideal Create as much feedback in our system as possible, from as many areas as possible, sooner, faster, and cheaper, with as much clarity between cause and effect as possible. Why? Because the more assumptions we can invalidate, the more we learn, improving our ability to fix problems and innovate. @RealGeneKim Source: DevOps Handbook
How many times is the andon cord pulled in a typical day at a Toyota manufacturing plant? 3,500 times per day @RealGeneKim Source: http://www.gembapantarei.com/2008/04/how_many_times_do_you_pull_the_andon_cord_each_day.html
Greatness Isn’t Free… The Need To Pay Down Technical Debt @RealGeneKim
Fast Push To Market ▪ Diagram labels: Features, Debts & Risks, Defects, Quality @RealGeneKim
Fast Push To Market (Continued) ▪ Defect fixing dominates work ▪ Site reliability tanks ▪ Slower and slower velocity ▪ Customers leave ▪ Morale plunges ▪ Devs leave because everything is hard ▪ Diagram labels: Features, Defects, Debts & Risks, Quality @RealGeneKim
Who hasn’t felt this? You hire a bunch of developers, but you still can’t ship the features you promised… …and maybe you even have the feeling that things are slowing down… @RealGeneKim Source: https://twitter.com/johncutlefish/status/1046169469268111361
Risto Siilasmaa, NOKIA @RealGeneKim Source: The Unicorn Project (2019) / Transforming NOKIA (2019)
Near Death Experiences ● eBay (1999) ● Microsoft (2002): Bill Gates memo ● Google (2005): Automated testing culture ● Amazon (2004): Jeff Bezos memo ● Twitter (2008) ● LinkedIn (2009) ● Etsy (2009) @RealGeneKim
2002 Microsoft Security Standdown ▪ Famously, Microsoft after SQL Slammer required every product group to freeze feature @RealGeneKim Source: https://www.wired.com/2002/01/bill-gates-trustworthy-computing/
The Feature Freeze / Standdown Features Quality Debt Features Defects @RealGeneKim
Quote from Marty Cagan, from his book Inspired: “The deal [between product owners and] engineering goes like this: Product management takes 20% of the team’s capacity right off the top and gives this to engineering to spend as they see fit. They might use it to rewrite, re-architect, or refactor problematic parts of the code base…whatever they believe is necessary to avoid ever having to come to the team and say, ‘we need to stop and rewrite [all our code].’ If you’re in really bad shape today, you might need to make this 30% or even more of the resources. However, I get nervous when I find teams that think they can get away with much less than 20%.”
Cagan notes that when organizations do not pay their “20% tax,” technical debt will increase to the point where an organization inevitably spends all of its cycles paying down technical debt. At some point, the services become so fragile that feature delivery grinds to a halt because all the engineers are working on reliability issues or working around problems. @RealGeneKim
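As a rough illustration of the “20% tax” arithmetic, the sketch below splits a team's capacity between engineering-directed work and feature work. The function name `split_capacity`, the use of engineer-days as the unit, and the sample team size are assumptions made for illustration; only the 20% and 30% figures come from the quote.

```python
def split_capacity(total_days, tax_percent=20):
    """Split capacity into (engineering-directed days, feature days)."""
    if not 0 <= tax_percent <= 100:
        raise ValueError("tax_percent must be between 0 and 100")
    engineering_days = total_days * tax_percent / 100
    return engineering_days, total_days - engineering_days


# Hypothetical team: 10 engineers in a 10-day iteration = 100 engineer-days
engineering, features = split_capacity(100)
print(engineering, features)

# A team "in really bad shape" might raise the share to 30%
engineering_bad_shape, features_bad_shape = split_capacity(100, tax_percent=30)
print(engineering_bad_shape, features_bad_shape)
```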
The Third Ideal: Enabling Greatness ▪ Ideal: 3-5% of developers dedicated to improving developer productivity ▪ Google: likely 1,500+ devs ($1B+) ▪ Microsoft: likely over 3,000 devs ▪ Not ideal: assigned to summer interns and “people not good enough to be developers” @RealGeneKim
There cannot be a more important thing for an engineer, for a product team, than to work on the systems that drive our productivity. So I would, any day of the week, trade off features for our own productivity. I want our best engineers to work on our engineering systems, so that we can later on come back and build all of the new concepts we want. - Satya Nadella @RealGeneKim Source: Satya Nadella, CEO, Microsoft (@satyanadella)
Breaking The Bottlenecks In The Flow ▪ Environment creation ▪ Code deployment ▪ Test setup and run (mention @rohansingh) ▪ Overly tight architecture ▪ Development ▪ Product management @RealGeneKim
Google Dev And Ops (2013) ▪ 15,000 engineers, working on 4,000+ projects ▪ All code is checked into one source tree (billions of files!) ▪ 5,500 code commits/day ▪ 75 million test cases are run daily "Automated tests transform fear into boredom." -- Eran Messeri, Google @RealGeneKim
The Third Ideal: Improvement ▪ Not Ideal: No one cares if someone breaks the build, or checks in code that breaks our tests ▪ Ideal: When someone breaks our build or our tests, fixing it becomes the most important work of the moment @RealGeneKim
The Third Ideal: Improvement ▪ Not ideal: When someone needs a peer review, that person has to wait until someone else frees up ▪ Ideal: Whatever I’m working on, if someone needs a peer review, I drop whatever I’m doing to help @RealGeneKim
Ideal #4: Psychological Safety @RealGeneKim
DevOps Enterprise: Lessons Learned ▪ In 2019, we’ll hold the sixth year of the DevOps Enterprise Summit, a conference for horses, by horses ▪ Over the years, we’ve had nearly 350 leaders from: ▪ Capital One, KeyBank, Barclays, GE Capital, ING Bank, Fidelity, PNC, ADP, BofA, Western Union, BBVA ▪ Nationwide Insurance, Zurich Insurance, Allstate, Hiscox, Aviva, LV= ▪ Walmart, Nordstrom, Target, Macy’s, Marks and Spencer ▪ Nike, Adidas, Sherwin Williams ▪ Verizon, Telstra, T-Mobile, Orange, CSG ▪ Raytheon, Lockheed Martin, Northrop Grumman, CSRA, Jaguar Land Rover, Fiat/Chrysler, Cisco ▪ Disney, Ticketmaster, NBC/Universal, Comcast ▪ Kaiser Permanente ▪ US Citizenship & Immigration Services, UK HM Revenue Collection, DISA Forge.mil, NZ Ministry of Social Development, UK Welfare and Pensions, US Joint Warfare Analysis Center ▪ Amazon PrimeNow, CA, Compuware, Google Search, IBM, MicroFocus, Microsoft, SAP @RealGeneKim
@RealGeneKim Source: Puppet/DORA: 2017 State Of DevOps Report: https://puppet.com/resources/whitepaper/state-of-devops-report
One Of The Highest Predictors Of Performance @RealGeneKim Source: Typology Of Organizational Culture (Westrum, 2004)
Google: Project Aristotle, Oxygen, re:Work @RealGeneKim Source: https://rework.withgoogle.com/blog/five-keys-to-a-successful-google-team/
Great Practices Enabled ▪ Blameless post-mortems ▪ Chaos Monkeys @RealGeneKim
Modeling Continual Learning ▪ “When adult learners start trying to learn a new skill, they will often do it in private, because of the embarrassment associated with doing something they’re not good at.” ▪ We can help by saying “I don’t know” @RealGeneKim
Ideal #5: Customer Focus @RealGeneKim
@RealGeneKim Courtesy: Compuware (Chris O’Malley, @chris_t_omalley; Jim Bryan, @jimbryan82)
The Fifth Ideal: Focus On The Customer ▪ Core vs. Context ▪ Enabled reallocation of $8MM back into R&D @RealGeneKim
The Fifth Ideal: Focus On The Customer ▪ Not ideal: Functional silo managers prioritize silo goals over business goals ▪ Ideal: Functional silo managers make decisions based on what the customer values, and help ensure their teams have the skills to thrive in the long term @RealGeneKim
Why Do I Think This Is Important? @RealGeneKim