The 15-point DevOps Checklist
DevOps is a culture that requires certain practices and a new vision. Its overall aim is to unite people and organizations around shared goals.
The DevOps Checklist is neither static nor unique. There is no manifesto that describes DevOps, so the checklist should be adapted to the organization's needs, human interactions, and other specific criteria.
In other words, the checklist can help you set up a DevOps culture, but don’t consider it the only way to proceed with your organization's transformation.
You are not obliged to implement all of these points fully. Start with what you can do and what you should do, and work on improving them continually.
These points are cultural, process-related or technical.
In all cases, there is something I have never or rarely found in other checklists: reliability, not just of your live environments but also of your processes. The first thing reliability requires is simplicity.
Simplicity is not an item in the following checklist because it must be demonstrated in each of these points, so keep things simple.
01 - A Cross-Functional Team
A cross-functional team is a group of people with different functional expertise (marketing, operations, development, QA, account management, etc.) working toward the same goals and projects.
Individuals with various backgrounds and expertise are brought together to collaborate better and solve problems faster.
As said in Wikipedia: The growth of self-directed cross-functional teams has influenced decision-making processes and organizational structures.
Although management theory likes to propound that every type of organizational structure needs to make strategic, tactical, and operational decisions, new procedures have started to emerge that work best with teams.
In a DevOps context, the dev and ops teams should not live in separate silos.
Each team should provide support and advice so that everyone's skills are put to use.
According to some management studies, like Peter Drucker’s on management by objectives, cross-functional teams are less goal-dominated and less unidirectional, which stimulates productivity and the ability to deal with fuzzy logic.
02 - Communication Culture & Global Thinking
When working together on the same products, communication is essential to achieve better results and reach valued goals.
DevOps is mainly a culture of communication and cross-functional collaboration.
Individuals and departments may speak different professional languages, which creates different styles of communication, as is the case between developers and operations teams.
Every team has its own goals: operations engineers seek stability, while developers tend to make changes that may affect stability in some cases.
This is not just the case for developers and operations engineers. Every department in a non-DevOps environment has its own goals and gives almost 100% of its effort and time to achieving them.
Obviously, with different goals, different responsibilities, and different professional languages, communication becomes difficult.
From that point on, departments start shifting responsibility onto each other for the problems they either create or encounter.
Being rigid in setting goals for each department is not a good idea in many cases.
Having goals shared between departments, and making managers, executives and all team members aware of them, can reduce the gap between two or more teams.
Setting departmental and local goals brings up the “not-my-job” problem, where each department shifts responsibility onto another, while establishing common, global goals encourages people to work together.
The following hacks could help your organization improve communication:
Motivate through Gamification
Gamification is a good way to keep your team motivated while playing.
When I used HipChat, I integrated several chatbots, but the one I liked the most was the Karma bot: instead of saying “Thank you”, everyone on my team could give one or several karma points to a colleague.
In a way, this builds a symbolic “meritocracy”.
The GNOME Foundation, Apache Software Foundation, Mozilla Foundation, and The Document Foundation are examples of (open source) organizations that officially claim to be meritocracies.
You can find other tools that help gamify workspaces and daily professional life, like Game Effective, which gamifies sales, customer service, and employee training.
The Smiley Board
At the end of each workday, everyone is required to draw a picture of their face in one of three modes:
- Happy face
- Blah face
- Sad face
The Smiley Board is also called the Niko-Niko Calendar (or Smiley Calendar).
It is a Japanese creation in which each team member puts a smiley on their own calendar at the end of the day, before leaving the office.
This gives a view of the well-being and motivation of each member of the project.
Open & shared spaces
Open spaces arranged in the right way are a key to valuable communications and collaboration.
Chat Rooms
Internal chat applications are widely used in many organizations. Generally, chat rooms are specific to a team or a project.
Adding chatbots to tools like HipChat, Slack or any of their alternatives helps teams communicate about recurring events and stay transparent and aware of:
- Deployments
- Incidents
- Builds
- Developers’ commits to the version control server
In my company I have even set up bots to post daily motivational quotes. We also have some chat rooms where we talk about non-work-related stuff.
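As a rough illustration of the deployment, incident, build and commit notifications listed above, here is a minimal Python sketch that formats an event and prepares a webhook request. The webhook URL, the project name and the single-field `text` JSON payload are assumptions made for the example; check your chat tool's documentation for the exact format it expects.

```python
import json
import urllib.request

# Hypothetical webhook URL: replace it with the one your chat tool provides.
WEBHOOK_URL = "https://chat.example.com/hooks/devops-room"


def format_event(kind, project, detail):
    """Turn a pipeline event into a one-line chat message."""
    labels = {
        "deployment": "Deployment",
        "incident": "Incident",
        "build": "Build",
        "commit": "Commit",
    }
    label = labels.get(kind, "Event")
    return f"[{label}] {project}: {detail}"


def build_request(message, url=WEBHOOK_URL):
    """Prepare (but do not send) an HTTP POST carrying the message as JSON."""
    # Assumption: the chat tool accepts a JSON body with a "text" field.
    body = json.dumps({"text": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


message = format_event("deployment", "shop-api", "v1.4.2 deployed to staging")
request = build_request(message)
print(message)
# To actually post the message: urllib.request.urlopen(request)
```

Keeping the formatting separate from the sending makes it easy to reuse the same messages from a build server, a deployment script or a monitoring alert.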
Hackathons
A hackathon, hackfest, codefest or hackday is an event in which individuals involved in software or hardware development, design, UX and project management collaborate intensively on software/hardware projects.
Hackathons tend to have a specific focus on a technology, a theme or a project, but some hackathons are open and participants have full freedom.
Hackathons are a great way to build communication and allow people from the same organization to work together toward the same goal.
Internal hackathons are also something that companies like Netflix, Facebook, Google, Microsoft and Hewlett Packard organize to promote new product innovation by their engineering staff.
For example, Facebook’s Like button was conceived during a hackathon.
Team building
Team building means helping a team become more efficient in how it operates, more cohesive, more consistent in its results, and more competent.
This type of support is particularly relevant when the professional situation is internally complex, such as during a reorganization or a profound change in the business, or externally complex, such as under competitive pressure or in a changing market.
It helps the team co-build the solution and develops its collective intelligence and autonomy.
Playing Games
Playing games is a good way to create good communication habits.
A good example is the getKanban Board Game, a Kanban simulation that simulates a variable workflow for a SaaS company. It is a physical game designed to teach the concepts and mechanics of Kanban for software development in a class or workshop setting.
Outlets, group outing, Friday drinks
03 - Customer-Oriented Culture
A product-centric culture will not work in most cases, especially if you are a startup with an MVP (or even more) trying to be competitive in the market.
No one knows better than the customers themselves what a product should be.
To build a customer-oriented culture, you should:
- Not build the best product but create the best solution for your customer.
- Stop worrying about creating new products or features and instead look for new customer needs to fill.
- Hire people who fit, and reward people who have deep insights into customers instead of rewarding new features or product development.
- Share your customers’ needs explicitly with your developers and operations engineers.
Being transparent about business needs and customers’ needs is the first step toward creating a customer-oriented culture.
Actually, being focused on customers is the best way to align teams towards the same valuable goals without creating an inter-department war.
The DevOps feedback loop is also a good way to keep your systems stable and scalable at the same time, and this is what customers need: stability and scalability.
04 - Source Control & Revisions
According to Wikipedia:
A component of software configuration management, version control, also known as revision control or source control, is the management of changes to documents, computer programs, large web sites, and other collections of information.
Source control can be critical to your success.
When developing a digital product, it is important to give developers a technical tool equivalent to project management tools but from a purely technical view.
Source control was created to resolve real problems that developers encountered during the coding process.
It allows them to manage:
- The code modification history, which makes change management and getting back to older versions of working code easier.
- Concurrent file editing, when multiple developers work on the same code.
- Tagging
- Branching
- Merging
Version control is not just for developers since one of the best practices in startups is versioning documentation.
Documentation versioning will help you:
- Track incremental backups and recover
- Record any change and revert documentation to an earlier version
- Track co-authoring and collaboration and individual contributions
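To make these three benefits concrete, here is a toy in-memory revision history in Python. It is only a sketch of the idea, not how Confluence, a wiki engine or git actually stores revisions; the class name, the author names and the sample text are all hypothetical.

```python
import difflib


class RevisionHistory:
    """Keeps every saved version of a document in memory."""

    def __init__(self, initial_text, author):
        self.revisions = [(author, initial_text)]

    def save(self, text, author):
        """Record a new revision and return its number (starting at 0)."""
        self.revisions.append((author, text))
        return len(self.revisions) - 1

    def text(self, number=-1):
        """Return the text of a revision (the latest one by default)."""
        return self.revisions[number][1]

    def revert(self, number, author):
        """Revert by saving an old revision again as the newest one."""
        return self.save(self.text(number), author)

    def diff(self, old, new):
        """Unified diff between two revision numbers."""
        return list(difflib.unified_diff(
            self.text(old).splitlines(),
            self.text(new).splitlines(),
            fromfile=f"rev{old}", tofile=f"rev{new}", lineterm="",
        ))

    def contributions(self):
        """Number of revisions saved by each author."""
        counts = {}
        for author, _ in self.revisions:
            counts[author] = counts.get(author, 0) + 1
        return counts


doc = RevisionHistory("Install with apt.", "alice")
doc.save("Install with apt.\nThen restart nginx.", "bob")
print("\n".join(doc.diff(0, 1)))
doc.revert(0, "alice")
print(doc.contributions())
```

Reverting adds a new revision instead of deleting the later ones, so the full history stays available for recovery.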
Most modern documentation software supports versioning, such as Atlassian Confluence or open source wiki software (MediaWiki, DokuWiki, etc.).
05 - Infrastructure As Code
Infrastructure management and provisioning are moving to the next big thing: Infrastructure As Code.
Infrastructure as Code (IaC) is the usage of definition and configuration files to create, start, stop, delete, terminate and restart virtual or bare-metal machines.
By mastering IaC, organizations can reduce the cost and time of infrastructure management and focus more on product development.
With the rise of the DevOps movement, enabling the Continuous Configuration Automation approach is becoming a key step in the life cycle of a product.
I published a post on Medium where I explain how Infrastructure As Code can work using a configuration management tool.
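The core idea of IaC, declaring the machines you want and letting a tool work out what to create, start, stop or terminate, can be sketched in a few lines of Python. This is a simplified illustration and not the behavior of any particular tool; the machine names and the shape of the definitions are hypothetical.

```python
# Desired infrastructure, declared as data (hypothetical machine names).
DESIRED = {
    "web-1": {"state": "running"},
    "web-2": {"state": "running"},
    "db-1": {"state": "running"},
}


def plan(desired, actual):
    """Return the ordered actions needed to move `actual` to `desired`."""
    actions = []
    for name, spec in sorted(desired.items()):
        current = actual.get(name)
        if current is None:
            # The machine does not exist yet.
            actions.append(("create", name))
            if spec["state"] == "running":
                actions.append(("start", name))
        elif current["state"] != spec["state"]:
            # The machine exists but is in the wrong state.
            action = "start" if spec["state"] == "running" else "stop"
            actions.append((action, name))
    for name in sorted(actual):
        if name not in desired:
            # Anything not declared is removed.
            actions.append(("terminate", name))
    return actions


actual = {
    "web-1": {"state": "running"},
    "db-1": {"state": "stopped"},
    "old-1": {"state": "running"},
}
for action, name in plan(DESIRED, actual):
    print(action, name)
```

Because the plan is computed from the difference between the two states, running it again once the infrastructure matches the definition produces no actions at all.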
06 - Routine Automation
The DevOps philosophy can be described in different ways, but I once saw on Twitter a good definition of DevOps from an automation perspective: “Using Things You Can Program, and Programming the Things You Use”.
Automation in the DevOps philosophy is about making tasks faster, interfacing tools with each other, and creating automated pipelines.
Almost everything could be done manually, but in order to focus on product development and to create continuous pipelines (continuous integration, delivery, testing, deployment, etc.) and feedback loops, everything starts with automation:
- Automate infrastructure
- Automate integration
- Automate delivery
- Automate feedback
- Automate scalability
- Automate bug hunting
- etc.
The continuous processes rely on already automated tasks.
07 - Self-Service Configuration
Using cloud technologies with configuration management software allows automated provisioning of infrastructure and services.
Generally, operations engineers listen to what developers need for product development, then install and configure software on production-like servers.
Self-service configuration builds on tools like Chef, Puppet, Ansible or SaltStack, cloud infrastructures like AWS and DigitalOcean, version control hosting like GitHub or Bitbucket, container technologies like Docker, and continuous integration and deployment servers like Jenkins or Rundeck.
- Your system configuration should always be kept in a source control service or server.
- Developers should be able to create systems with production-like configurations and data.
- Developers should have access to continuous integration tasks to build software and test artifacts in a short period:
I prefer working with git hooks, where an automatic build is launched as soon as a developer pushes a change to the development branch.
- At a certain stage of maturity, developers could be able to deploy a change to production:
This can be complex to achieve. In some cases, it is preferable to let operations engineers deploy to production.
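The git hook approach mentioned above can be sketched as follows. A server-side `post-receive` hook receives one line per updated ref on standard input, in the form `<old-sha> <new-sha> <ref-name>`. The Python functions below pick out pushes to the development branch; `trigger_build` is a hypothetical placeholder for whatever call starts a build on your CI server.

```python
def branches_to_build(hook_input, watched=("development",)):
    """Return the watched branches updated by a push.

    `hook_input` is an iterable of lines in the format git passes to a
    post-receive hook on stdin: "<old-sha> <new-sha> <ref-name>".
    """
    prefix = "refs/heads/"
    branches = []
    for line in hook_input:
        parts = line.split()
        if len(parts) != 3:
            continue
        _old, new, ref = parts
        # Skip tags and other refs, and skip branch deletions
        # (git reports a deleted ref with an all-zero new id).
        if not ref.startswith(prefix) or set(new) == {"0"}:
            continue
        branch = ref[len(prefix):]
        if branch in watched and branch not in branches:
            branches.append(branch)
    return branches


def trigger_build(branch):
    """Hypothetical placeholder: a real hook would call the CI server here."""
    return f"build requested for {branch}"


push = [
    "a" * 40 + " " + "b" * 40 + " refs/heads/development",
    "a" * 40 + " " + "c" * 40 + " refs/heads/feature-x",
    "a" * 40 + " " + "d" * 40 + " refs/tags/v1.0",
]
for branch in branches_to_build(push):
    print(trigger_build(branch))
```

In a real hook script, the same function would be fed `sys.stdin` instead of the sample list.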
08 - Automated Builds
With the increasing number of Jenkins plugins, automating builds is becoming easier.
An example for web applications is using automated build tools like Ant with package managers like npm and dependency managers like PHP Composer.
The main types of automated builds are:
- On-demand automated builds, where the user runs a script to launch the build when it is needed:
This is not really fully automated, and it is used in cases where scheduled and triggered builds are complex or useless.
- Scheduled automated builds, as is the case with continuous integration servers (such as Jenkins) running nightly builds.
- Triggered automated builds, where builds on a continuous integration server are launched right after a commit to a git repository.
09 - Continuous Integration
Continuous integration uses automated builds to create a process where the integration server builds a project whenever changes are made, or periodically.
The process could also include automated tests and automated delivery.
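Here is a minimal Python sketch of that decision and of the pipeline it starts: build when the code has changed, build on a schedule otherwise, then run the build, test and delivery steps in order. The 24-hour schedule, the commit identifiers and the placeholder steps are assumptions made for the example.

```python
from datetime import datetime, timedelta


def build_reason(head_commit, last_built_commit, last_build_time, now,
                 max_age=timedelta(hours=24)):
    """Return why a build is due: "change", "schedule", or None."""
    if head_commit != last_built_commit:
        return "change"
    if now - last_build_time >= max_age:
        return "schedule"
    return None


def run_pipeline(steps):
    """Run (name, callable) steps in order and stop at the first failure."""
    results = []
    for name, step in steps:
        passed = bool(step())
        results.append((name, passed))
        if not passed:
            break
    return results


now = datetime(2024, 1, 2, 3, 0)
reason = build_reason("abc123", "abc123", datetime(2024, 1, 1, 2, 0), now)
if reason:
    # Placeholder steps: real ones would call the build tool and test runner.
    steps = [
        ("build", lambda: True),
        ("test", lambda: True),
        ("deliver", lambda: True),
    ]
    print(reason, run_pipeline(steps))
```

Stopping at the first failed step ensures that a broken build is never tested and that a failing test run is never delivered.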
Continuous integration is an important building block of a DevOps setup and the weak link in the automation process, since it sits between development and operations in order to automate the flow and smooth the passage of an application from development to post-development operations.
10 - Continuous Delivery
Continuous delivery uses both continuous integration and automated builds to deliver software to other teams as fast as possible, ideally right after a code change.
The product is then delivered to an artifact server or at least an FTP server. As an example, QA teams could access the latest delivery to run their tests.
If the transition from the test phase to the production deployment phase is automatic, we call this continuous deployment. The big difference between continuous delivery and continuous deployment is the automated deployment to production.
A stable continuous delivery process is a sign of success for a large part of a DevOps journey. If you have already implemented this in your DevOps pipeline, you are mastering the art of DevOps.
11 - Incremental Testing & Test-Driven Development
Incremental testing builds on the concepts of integration testing and relies on continuous integration and delivery.
Incremental testing is continuous and repetitive: tests are run as new functionality is added.
The incremental approach has the advantage of detecting a defect soon after it is introduced.
Because finding bugs in an early stage will help your organization spend less money and stabilize production environments, incremental testing is one of the best practices in DevOps.
Different testing approaches could be adopted:
- Top-down: Testing takes place from top to bottom, following the control flow. Example: starting from the GUI to the program core.
- Bottom-up: Testing takes place from bottom to top. Example: starting from the system components to the GUI.
- Functional incremental: Testing functions and features as described in the functional specification.
Test-driven development (TDD) is one of the concepts of extreme programming (a software development methodology intended to improve software quality and responsiveness, and one of the agile software development practices).
TDD is one of the best practices to adopt in a DevOps culture. It is based on incremental testing and the repetition of a very short development cycle.
Kent Beck, who is credited with having developed or rediscovered the technique, stated in 2003 that TDD encourages simple designs and inspires confidence.
Test-driven development is a test-first concept that developers can apply in order to improve and debug their code or even legacy code developed with older techniques and methodologies.
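A very short test-first cycle might look like the following Python sketch, using the standard `unittest` module. The tests are written first and describe the expected behavior, then the simplest implementation that makes them pass is added. The `slugify` function is a hypothetical example chosen only for illustration.

```python
import re
import unittest


# Step 1: write the tests first. They fail until slugify() exists.
class SlugifyTest(unittest.TestCase):
    def test_spaces_become_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_punctuation_is_dropped(self):
        self.assertEqual(slugify("  DevOps: 15 Points! "), "devops-15-points")


# Step 2: write the simplest code that makes the tests pass.
def slugify(text):
    """Lowercase the text and join its alphanumeric runs with hyphens."""
    return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")


# Step 3: run the tests, then refactor while keeping them green.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyTest)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

Each new requirement starts the cycle again with a new failing test, which is what keeps the design simple and the development cycle short.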
12 - Automated Release Management
Product development is not just code.
The product has a whole life cycle, from development, version control, builds, repositories and artifact delivery, tests and acceptance, server provisioning, application configuration to production deployment.
This cycle is what release management covers. Since automation is one of the pillars of the DevOps culture, release management should be automated for better business results.
The automation of release management relies on automating all of its stages, which is why automating releases requires setting up a continuous delivery strategy.
13 - Shorter Development Cycles & Time-To-Market
Being a customer-oriented organization while using the techniques mentioned above will reduce the development cycle length and, as a result, the time-to-market.
Time-to-market (TTM) is the length of time from the conception stage to the product being deployed and running on production servers.
TTM is important in all industries where products become outdated quickly, as happens in organizations working on web applications, SaaS or PaaS products.
To know where you stand and improve on it, measuring the time-to-market is a good practice. One of the simplest ways to measure it is to count the days from the conception of a feature to the production deployment of the stable release containing it.
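Counting those days is simple date arithmetic, as this small Python sketch shows. The feature names and dates are made up for the example.

```python
from datetime import date
from statistics import mean


def time_to_market_days(conceived, deployed):
    """Days between a feature's conception and its production deployment."""
    if deployed < conceived:
        raise ValueError("deployment date precedes conception date")
    return (deployed - conceived).days


# Hypothetical features: (conception date, production deployment date).
features = {
    "search": (date(2024, 1, 8), date(2024, 1, 19)),
    "export": (date(2024, 1, 10), date(2024, 1, 24)),
}
ttm = {name: time_to_market_days(*dates) for name, dates in features.items()}
average_ttm = mean(ttm.values())
print(ttm, average_ttm)
```

Tracking the average over successive releases shows whether the development cycle is actually getting shorter.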
Agile methodologies and the DevOps culture advocate working in development sprints. When routine tasks are automated and integration and tests are continuous, a sprint may take one to two weeks, and so does the time-to-market.
14 - Key Performance Indicators
In my current job, I am leading a DevOps transformation. It can be painful sometimes, but measuring your success is motivating and rewarding. From my first month, I identified major problems and picked out some indicators, and the first thing I did was measure them. After ten months (almost a year), I gathered the same indicators again. Well, things evolved:
- Uptime vs downtime
- Errors ratio
- Responsiveness
- Reject ratio
- Load capacity
- Many other specific indicators
And of course the time-to-market.
As said above, measuring the time-to-market is a good practice, and it is one of the business-oriented indicators. In order to reduce a product's TTM, we should first measure its current value, and the same goes for other improvements.
Key performance indicators (KPIs) are the key indicators of efficiency, performance and good results. They can be collective or personal and, to promote the sharing of global goals, departmental KPIs are less important than global KPIs.
A key performance indicator can meet the following objectives:
- evaluation
- diagnostic
- communication
- information
- motivation
- continuous progress
Indicators are a key concept in continuous improvement, which relies on:
- Choosing your indicators, or rather choosing the right indicators: this simply depends on your organization's specificities, priorities, and goals.
- Automating measurements: Working in improvement sprints is one of the best practices.
Improvement is not a stage or a list of tasks to complete after deploying a product, but continuous work that requires continuous measurement, which can be automated.
The number of errors and bugs occurring in your live or integration environments is a good example.
I have always thought that having a screen with live KPI measurements in an open space is a genius idea.
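Two of the indicators listed earlier, uptime versus downtime and the error ratio, are easy to compute automatically. The Python sketch below uses invented figures and compares two measurement periods; the function names and the numbers are assumptions made for illustration.

```python
def uptime_ratio(period_seconds, downtime_seconds):
    """Share of the period during which the service was up."""
    return (period_seconds - downtime_seconds) / period_seconds


def error_ratio(total_requests, failed_requests):
    """Share of requests that failed (0.0 when there was no traffic)."""
    return failed_requests / total_requests if total_requests else 0.0


def kpi_change(before, after):
    """Difference per indicator between two measurement periods."""
    return {name: after[name] - before[name] for name in before if name in after}


THIRTY_DAYS = 30 * 24 * 3600  # seconds

# Invented figures for two measurement periods.
first_month = {
    "uptime": uptime_ratio(THIRTY_DAYS, 25920),
    "errors": error_ratio(20000, 400),
}
tenth_month = {
    "uptime": uptime_ratio(THIRTY_DAYS, 2592),
    "errors": error_ratio(50000, 125),
}
print(kpi_change(first_month, tenth_month))
```

A script like this can run on a schedule and feed a dashboard, so the measurement never depends on someone remembering to do it.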
15 - The DevOps Feedback Loop
To achieve an incremental workflow, a constant feedback loop between operations and software development teams is important.
The feedback is the description of the current state of the software across its life cycle.
Feedback loops are one of the most important DevOps principles. They capture the goals of both DevOps and the Lean methodology: continuous improvement with a fail-fast workflow between development and IT.
This feedback could be a real-time flow in its most advanced forms.
Monitoring, log gathering and log analysis are among the practices that form this feedback loop. If you are used to these practices, you will understand that this is about communication and transparency between two teams: dev and ops.
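As a small illustration of log analysis feeding this loop, the Python sketch below counts error lines per service and flags the services that developers should hear about. The log format (`<timestamp> <LEVEL> <service> <message>`), the service names and the threshold are assumptions made for the example.

```python
from collections import Counter


def error_counts(lines):
    """Count ERROR lines per service.

    Assumes the hypothetical format "<timestamp> <LEVEL> <service> <message>".
    """
    counts = Counter()
    for line in lines:
        parts = line.split(maxsplit=3)
        if len(parts) >= 3 and parts[1] == "ERROR":
            counts[parts[2]] += 1
    return counts


def services_to_flag(counts, threshold):
    """Services whose error count reached the threshold, sorted by name."""
    return sorted(name for name, n in counts.items() if n >= threshold)


logs = [
    "2024-01-02T10:00:00 INFO checkout order created",
    "2024-01-02T10:00:05 ERROR checkout payment timeout",
    "2024-01-02T10:00:09 ERROR checkout payment timeout",
    "2024-01-02T10:01:00 ERROR search index unavailable",
    "2024-01-02T10:01:30 WARN search slow query",
]
counts = error_counts(logs)
print(services_to_flag(counts, threshold=2))
```

Sending the flagged services to the team chat room or a shared dashboard closes the loop between what operations observes and what development changes.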
In startups, change is the rule, so high deployment rates will often result in an overload for IT operations.
Clyde Logue, the founder of StreamStep, said: “Agile was instrumental in development regaining the trust in the business, but it unintentionally left IT operations behind.
DevOps is a way for the business to regain trust in the entire IT organization as a whole.”
The book “The Phoenix Project: A Novel About IT, DevOps, and Helping Your Business Win” describes the fundamental principles of DevOps as “The Three Ways”, the principles from which all of the DevOps patterns can be derived.
Systems Thinking
Systems thinking emphasizes the performance of the entire system, so that no departmental goal is considered more important than the global goals. The focus is on all business value streams that are enabled by IT.
Amplifying Feedback Loops
This way is about creating feedback loops and amplifying them in order to drive improvements. This helps in understanding customers' needs and in responding to all internal needs (those of developers, system administrators, etc.).
Amplifying the feedback loop is a way to reduce the global degradation caused by local changes.
Some local changes are just optimizations intended to achieve an individual or a local goal, and putting amplified feedback to work helps ensure that a defect is never passed to downstream work centers.
Culture Of Continual Experimentation And Learning
This way is about continuous experimentation, taking risks and continuous learning.
Repetition is the prerequisite to mastery, and experimentation is a path toward improvement that is part of daily work. Feedback can then be obtained in a shorter time.