DevOps
Culture, Agile Development, & Cloud Native Technologies
DevOps is a recognition that Development and Operations need to stop working alone in their "siloed" towers and start working together.
To do this we need:
A culture of collaboration valuing openness, trust, and transparency
An application design that does not require entire systems to be redeployed just to add a single function
A dynamic software-defined, programmable platform to continuously deploy onto
Traditional Waterfall Development
Requirements -> Design -> Code -> Integration -> Test -> Deploy
In traditional waterfall development: 1. Each step ends before the next begins; 2. Mistakes found in the later stages are more expensive to fix; 3. There are no provisions for changing requirements
Problems with this approach:
No provisions for changing requirements
Because all of the teams worked separately, the development team was not always aware of operational roadblocks that might prevent the program from working as anticipated
The people the furthest from the code who knew the least about it were deploying it into production
Extreme Programming
In 1996, Kent Beck introduced Extreme Programming, based on an iterative approach to software development. It was intended to improve software quality and responsiveness to changing customer requirements. It was one of the first agile methods.
Agile Manifesto
In 2001, seventeen software developers met at a resort in Snowbird, Utah to discuss these lightweight development methods. Together, they published the Manifesto for Agile Software Development.
Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan
That is, while there is value in the items on the right, we value the items on the left more.
Agile Development
Cycle: ... -> Requirements -> Plan -> Design -> Develop -> Release -> Track & Monitor -> ...
Requirements and solutions evolve through the collaborative effort of self-organizing and cross-functional teams and their customers
It advocates adaptive planning, evolutionary development, early delivery, and continual improvement
It encourages rapid and flexible response to change
Agile Dilemma / Why is Agile alone not good enough?
While Agile improved the speed and accuracy of software for developers, it did nothing for operations. Many development teams just got frustrated by ops not being able to deliver at the speed of development.
How Agile are your teams?
Working as an Agile Team
Iterative Sprints
Groomed Backlogs
Customer Stories
2 Week Deliverables
Agile Antipatterns
You will fail if...
You lack a real Product Owner
Your teams are too large
Your teams are not dedicated
Your teams are geographically distributed
Your teams are siloed
Your teams are not self-managing
The history reminds us that DevOps is:
From the practitioners, by practitioners
Not a product, specification, or job title
An experience-based movement
Decentralized and open to all
Goal of Microservices is Agility
Smart experimentation
Moving in-market with maximum velocity and minimum risk
Gaining quick valuable insight to continuously change the value proposition and quality
DevOps Thinking / Tenets
Social Coding
Behavior and Test Driven Development
Working in small batches
Build Minimum Viable Products for gaining insights
Failure leads to understanding
Think Cloud Native
The Twelve-Factor App describes patterns for cloud-native architectures which leverage microservices
Applications are designed as a collection of stateless microservices
State is maintained in separate databases and persistent object stores
Resilience and horizontal scaling are achieved through deploying multiple instances
Failing instances are killed and re-spawned, not debugged and patched (cattle not pets)
DevOps pipelines help manage continuous delivery of services
Think Microservices
The microservice architectural style is an approach to developing a single application as a suite of small services, each running in its own process and communicating with lightweight mechanisms, often an HTTP resource API. These services are built around business capabilities and independently deployable by fully automated deployment machinery.
Design for Failure
Embrace failures: they will happen! Move from "How to avoid" -> "How to identify & what to do about it". Move from "Pure operational concern" -> "developer concern".
External calls to other services that you don’t control are especially prone to problems:
Use separate thread pools
Time out quickly
Circuit breaker pattern: identify problem and do something about it to avoid cascading failures
Bulkhead pattern: isolate components from the start to limit the scope of failure (separate thread pools)
Monkey testing: test by breaking (yes, on purpose! see: Netflix Chaos Monkey and Simian Army)
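The circuit breaker idea above can be sketched in a few lines. This is a minimal illustration, not a production library: the class name `CircuitBreaker`, the parameters `max_failures` and `reset_after`, and the injectable `clock` are all assumptions made for this example.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical minimal circuit breaker around calls to an external service."""

    def __init__(self, max_failures=3, reset_after=30.0, clock=time.monotonic):
        self.max_failures = max_failures  # consecutive failures before opening
        self.reset_after = reset_after    # cool-down in seconds before a trial call
        self.clock = clock                # injectable so the sketch is testable
        self.failures = 0
        self.opened_at = None             # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_after:
                # Fail fast instead of waiting on a service known to be unhealthy.
                raise CircuitOpenError("circuit open; failing fast")
            # Cool-down elapsed: let one trial call through (half-open state).
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            # Open on reaching the threshold, or re-open if the trial call failed.
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()
            raise
        # Success closes the circuit and clears the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

Wrapping each external dependency in its own breaker keeps one failing service from tying up callers and cascading the failure. A real deployment would normally combine this with short timeouts on the wrapped call.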
Think Continuous Automation
Automation is speed + repeatability.
Continuous Integration (CI)
Continuous Delivery (CD)
Build Automation
Canary Rollouts / Blue-Green
Failing Forward
Application Release Automation
Working DevOps
Facilitate a culture of teaming and collaboration
Establish agile development as a shared discipline
Automate relentlessly to enable rapid DevOps response
Push smaller releases faster, measure and remediate impact
Taylorism
Named after the US industrial engineer Frederick Winslow Taylor (1856-1915), who in his 1911 book 'The Principles of Scientific Management' laid down the fundamental principles of large-scale manufacturing through assembly-line factories.
Adoption of Command and Control Management: the dominant method of management in the Western world
Organizations divided into (ostensibly) independent functional silos
Decision-making is separated from work: managers do the planning and decide what workers should do; workers mindlessly do the tasks they are asked to accomplish
Taylorism is not appropriate for Craft Work:
Taylorism may have been good during the industrial revolution, but not so much in the technology revolution
The people power requirements have been lowered
Taylorism is not appropriate for "knowledge" work like software development
Required DevOps behaviors
Organizational silos and hand-offs -> Shared ownership and high collaboration
Fear of change -> Risk management by embracing change
Built-once, hand-crafted "snowflakes" -> Ephemeral infrastructure as code
Manual fulfillment -> Automated self-service
Alarms, call-backs, and escalations -> Feedback loops and data-driven responses
How DevOps Manages Risk
Deployment is king
Deployment must be painless
You have deployed the same thing several times before it gets to production
Deployment is decoupled from activation
Risk is managed via activation controls (blue-green deploys with canary testing)
10% of the user base, waves of activation, etc.
Deployment is not “one size fits all”
It is a rail yard of interconnecting steps and microprocesses
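Decoupling deployment from activation, with a first wave such as 10% of the user base, can be sketched as a stable percentage rollout. This is only an illustration under assumptions: the function names `in_canary` and `route`, the environment labels "blue" and "green", and the choice of hashing the user ID are not from the original material.

```python
import hashlib


def in_canary(user_id: str, percent: int) -> bool:
    """Return True if this user falls inside the current activation wave."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    # Map the user to a stable bucket in 0..99 so the answer never flips
    # between requests for the same user.
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < percent


def route(user_id: str, percent: int) -> str:
    """Send canary users to the new (green) version, everyone else to blue."""
    return "green" if in_canary(user_id, percent) else "blue"
```

Because each user always lands in the same bucket, raising `percent` in waves (10, 25, 50, 100) only ever adds users to the new version, and setting it back to 0 deactivates the release without redeploying anything.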
Ephemeral Infrastructure
Cattle, not Pets
Servers are built on demand via automation
Logging into the server is seen as failure
Release through parallel infrastructure
Build the new version on new infrastructure; stage transition between environments (zero downtime)
Transient Infrastructure
Throw away when it is no longer needed
This eliminates entropy - a major source of failure
Immutable Delivery
Applications are packaged in containers
The same container that a developer runs on their laptop runs in production
Rolling updates with immediate roll-back
Eliminating variance limits side effects
Dependencies are contained
Zero-Downtime Deployments
Blue-green deployment is a zero-downtime deployment technique that consists of two nearly identical production environments, called Blue and Green.
They differ by the artifacts that the developer has intentionally changed, typically by the version of the application. At any given time, at least one of the environments is active.
Using the blue-green deployment technique, you can realize the following benefits:
Take software quickly from the final stage of testing to live production.
Deploy a new version of an application without disrupting traffic to the application.
Rollback rapidly. If there is something wrong with one of your environments, you can quickly switch to the other environment.
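The blue-green switch described above can be modeled in a few lines. This is a toy sketch, not a real load balancer API: the `BlueGreenRouter` class, its method names, and the `health_check` callable are all hypothetical.

```python
class BlueGreenRouter:
    """Toy model of two environments with exactly one receiving live traffic."""

    def __init__(self, active="blue"):
        if active not in ("blue", "green"):
            raise ValueError("active must be 'blue' or 'green'")
        self.versions = {"blue": None, "green": None}
        self.active = active

    @property
    def idle(self):
        """The environment that is not currently serving traffic."""
        return "green" if self.active == "blue" else "blue"

    def deploy(self, version):
        """Install a new version on the idle environment; live traffic is untouched."""
        self.versions[self.idle] = version
        return self.idle

    def switch(self, health_check):
        """Point traffic at the idle environment only if it passes its health check."""
        target = self.idle
        if not health_check(target):
            return False
        self.active = target
        return True

    def rollback(self):
        """Point traffic back at the other environment, which still holds the old version."""
        self.active = self.idle
```

The design choice worth noting is that `deploy` never touches the active environment, so a bad release is caught by the health check before any user sees it, and `rollback` is just another switch rather than a redeploy.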
Spotify Case Study
Organizational Structure
Squads are grouped into Tribes (light-weight matrix)
Chapters of competency areas are formed across Squads
Guilds are informal, lightweight communities of interest across the company
Autonomous Squads
Each Squad has its own mission aligned with the business
Feels like a "mini-startup"
Self Organizing / Cross-functional
5-7 engineers, fewer than 10
Squads have end-to-end responsibility for what they build
Build, commit, deploy, maintenance, operations, EVERYTHING!
With a long term mission usually around a single business domain
DevOps Organizational Objective
Shared Consciousness with Distributed (local) Control
Actions vs. Consequences: Functional Silos Breed Bad Behavior
Bad behavior arises when you abstract people away from the consequences of their actions.
Functional silos abstract people away from the consequences of their actions.
For example: By adding a QA Team, developers are abstracted away from the consequences of writing buggy code.
Actions have Consequences
Make people aware of the consequences of their actions
Create cross-functional teams - or -
Have developers rotate through operations teams
Have operations people attend developer standups and showcases
Make people responsible for the consequences of their actions
Put developers on pager duty, or have them own the SLA for the products and services they build
DevOps Measurement / Metrics
DevOps changes the objective of measurement from Mean Time To Failure (MTTF: make sure you never go down) -> Mean Time To Recovery (MTTR: you will go down, so make sure you can recover quickly)
Metrics:
A BASELINE provides a concrete number for comparison as you implement your DevOps changes:
It currently takes six team members 10 hours to deploy a new release of our product.
This costs us $X for every release
Metric GOALS allow you to reason about these numbers and judge the success of your transition process:
Reduce deployment time from 10 hours to 2 hours.
Increase percentage of defects detected in testing from 25% to 50%
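The baseline and goal above can be turned into numbers with simple arithmetic. In this sketch the hourly rate is a made-up placeholder (the source only says "$X"), the function names are hypothetical, and the goal is assumed to involve the same six people.

```python
def release_cost(people: int, hours: float, hourly_rate: float) -> float:
    """Cost of one release: people x hours x rate."""
    return people * hours * hourly_rate


def relative_improvement(baseline: float, goal: float) -> float:
    """Fraction by which the goal improves on the baseline (0.8 means 80%)."""
    return (baseline - goal) / baseline


HOURLY_RATE = 100.0  # hypothetical figure; substitute your own loaded rate

baseline_cost = release_cost(6, 10, HOURLY_RATE)  # six people, 10 hours each
goal_cost = release_cost(6, 2, HOURLY_RATE)       # same team, 2 hours each
```

With these inputs the baseline is 60 person-hours per release and the goal is 12, so hitting the goal would cut both deployment time and per-release cost by 80%.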
Actionable Metric Examples
Reduce time-to-market for new features.
Increase overall availability of the product.
Reduce the time it takes to deploy a software release.
Increase the percentage of defects detected in testing before production release.
Make more efficient use of hardware infrastructure.
Provide performance and user feedback to the product manager in a more timely manner
Top 4 Actionable Metrics
Mean Lead Time: How long does it take from idea to production?
Release Frequency: How often can you deliver changes?
Change Failure Rate: How often do changes fail?
Mean Time to Recovery (MTTR): How quickly can you recover from failure?
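The four metrics above can be computed from a simple log of deployments. The sketch below assumes a hypothetical `Deployment` record and a `summarize` helper; the field names are illustrative, recovery time is measured from the failing deployment, and the list of deployments is assumed to be non-empty.

```python
from dataclasses import dataclass
from datetime import datetime
from statistics import mean
from typing import Optional


@dataclass
class Deployment:
    idea_at: datetime                        # when the change was first requested
    deployed_at: datetime                    # when it reached production
    failed: bool = False                     # did the change fail in production?
    recovered_at: Optional[datetime] = None  # when service was restored


def summarize(deployments, period_days):
    """Compute the four metrics for deployments made over period_days."""
    lead_times = [d.deployed_at - d.idea_at for d in deployments]
    failures = [d for d in deployments if d.failed]
    recoveries = [
        d.recovered_at - d.deployed_at
        for d in failures
        if d.recovered_at is not None
    ]
    return {
        "mean_lead_time_hours": mean(t.total_seconds() for t in lead_times) / 3600,
        "releases_per_week": len(deployments) / (period_days / 7),
        "change_failure_rate": len(failures) / len(deployments),
        "mttr_hours": (
            mean(r.total_seconds() for r in recoveries) / 3600 if recoveries else None
        ),
    }
```

Tracking these as a time series, rather than as one-off numbers, is what makes them actionable: the trend shows whether a process change helped.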
Culture Measurements
On my team information is actively sought
On my team failures are learning opportunities, and those who report them are not punished
On my team responsibilities are shared
On my team cross functional collaboration is encouraged and rewarded
On my team failure causes inquiry
On my team new ideas are welcomed
Key metric: Cycle Time
Cycle time is a key metric for Agile kanban teams.
Cycle time is the amount of time it takes for a unit of work to travel through the team’s workflow, from the moment work starts to the moment it ships.
By optimizing cycle time, the team can confidently forecast the delivery of future work.
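As a rough illustration of how cycle time supports forecasting, the sketch below computes cycle times from start and ship timestamps and reads off a percentile. The helper names and the nearest-rank percentile method are choices made for this example, not something the source prescribes.

```python
import math
from datetime import datetime
from statistics import mean


def cycle_times_days(items):
    """items: iterable of (started_at, shipped_at) pairs; returns days per item."""
    return [(shipped - started).total_seconds() / 86400 for started, shipped in items]


def percentile(values, pct):
    """Nearest-rank percentile: the smallest value covering pct% of the items."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical history: five work items started on the same day.
start = datetime(2024, 1, 1)
history = [
    (start, datetime(2024, 1, 2)),
    (start, datetime(2024, 1, 3)),
    (start, datetime(2024, 1, 4)),
    (start, datetime(2024, 1, 5)),
    (start, datetime(2024, 1, 11)),
]
times = cycle_times_days(history)
```

A team might quote the 85th percentile rather than the mean when forecasting, since a single slow item (the 10-day one here) pulls the mean up while still being a real possibility for the next piece of work.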
Keys to High Performance
You MUST do all three:
Technical Practices
Lean Processes (Agile)
Culture
Busted DevOps Myths
You cannot buy DevOps In-A-Box
You cannot order 20 units of DevOps for this quarter
You cannot sprinkle DevOps on something to make it better
You cannot become DevOps without changing your culture
You can’t change your company’s culture just by adopting new tools …but they can help reinforce it
Using Containers won't fix your broken culture
You cannot maintain your current organizational structure and become DevOps
DevOps Summary
A Cultural Movement
Emphasizing Collaboration, Sharing, and Transparency
Promoting Automation and Infrastructure as Code
Achieving Continuous Integration and Delivery of Changes
Immutable Delivery
With One set of Metrics to rule them all
DevOps Maturity Matrix
Level 1:
Ad Hoc
Silo based
Blame and finger-pointing
Dependent on experts
Lack of accountability
Manual processes
Tribal knowledge the norm
Unpredictable and reactive
Manual builds and deployments
Manual testing
Environmental inconsistencies
Level 2:
Repeatable
Processes established within silos
No standards
Can repeat what is known, but can’t react to unknowns
Automated builds
Automated tests written as part of story development
Painful but repeatable releases
Level 3:
Defined
Collaboration exists
Shared decision making
Shared accountability
Process automated across the software life cycle
Standards across organization
Automated build and test cycle for every commit
Push button deployments
Automated user and acceptance testing
Level 4:
Measured
Collaboration based on shared metrics with a focus on removing bottlenecks
Proactive monitoring
Metrics collected and analyzed against business goals
Visibility and predictability
Build metrics visible and acted on
Orchestrated deployments with automatic rollbacks
Nonfunctional requirements defined and measured
Level 5:
Optimized
A culture of continuous improvement permeates through the organization
Self-service automation
Risk and cost optimization
High degree of experimentation
Zero downtime deployments
Immutable infrastructure
Actively enforce resiliency by forcing failures
Key Takeaways
DevOps is about breaking down the silos and working as a Single Agile Team
Culture is the #1 success factor in DevOps. Building a culture of shared responsibility, transparency and faster feedback is the foundation of every high performing DevOps team
DevOps starts with learning how to work differently. It embraces cross-functional teams with openness, transparency, and respect as pillars
Being able to recover quickly from failure is more important than having failures less often
Measurements should encourage innovation and collaboration, and not punish failure (blameless culture)