Arc-E-Tect: DevOps


March 28, 2019

Blame prevention through devops

This post is a follow-up to my previous post, The question that takes away all blame.
Blameless postmortems, or blameless RCAs, are supposed to be the new normal in devops organisations, but all too often we see that first the team, and sometimes the person, to blame is sought, and then we tell them to ‘fix it’.

You might’ve noticed that I wrote devops in all lower case in this post's title. I did that on purpose.

Devops sans capital 'D'

You typically see DevOps instead of devops, but I think that DevOps implies that it’s about Development and Operations engineers working together. In my opinion, devops is about combining the responsibility of the development team and the responsibility of the operations team, turning them into the responsibility of a single team. This would make devops a matter of responsibility and less of an organisational concern.

The organisational aspect would then be in the form of ‘Product Teams’, responsible for a product with a Product Owner that is accountable for that product.
Something for another day. This article is about blameless post-mortems and root cause analysis driven through devops-tinted glasses.
Blameless post-mortems come more naturally in environments of shared responsibility: environments where the same people are responsible for both the quality of the product and its usage, and where devops is considered the combined responsibility of development and operations within a single team.

Silos vs Accountability

I have observed in a number of organisations that one of the main reasons for considering the move towards devops is the concept of shared responsibility, and with it the idea that silos prevent this sharing of responsibility. It is a misconception though. Silos don't prevent shared responsibility, although culturally they'll probably inhibit it. The real problem is the lack of accountability in a siloed organisation. Or quite the opposite: too many persons are accountable for different or conflicting objectives.

In a siloed organisation, each silo is primarily responsible, and even accountable, for its own output: its immediate contribution to the process of product delivery, but not the full process of delivery itself. This means that when the outcome of the process is of the unwanted kind (an incident), either one of the silos’ outputs caused the problem (who is to blame?) or, when no single silo can be blamed, nobody can be held accountable.

This can go so far that a sales team is responsible for selling a product, and a signed contract is considered a success. The development team is responsible for changing the product, and the release of the change into production is considered a success. The operations team is responsible for 'running' the product and is considered successful when there are no incidents. This would be for SaaS vendors. For more traditional software companies, i.e. those that require implementations of a product at the customer site, the operations team is part of the customer's organisation, or at least the operations accountability is typically with the customer. It'll be more complicated, because there will likely be an implementation team that is successful when the product is implemented according to the contract sold.
Success of the product is defined as selling/changing/operating/implementing. With different persons accountable for each of these successes, conflicts are bound to arise. So when a problem happens anywhere in the delivery, each of the accountable persons will argue that they're not to blame, because they are successful. Actually, sales and development were successful, and operations and implementation were given something that prevented them from being successful. The contract was signed on the promise that missing features would be available by the time the implementation project reached completion. Future releases are feature complete and their functionality is fully tested, but the product is unmanageable, not performant and definitely not secure. And it can't be implemented, because integrations with other systems are not available until the end of the project.

Siloed organisations are structured around tasks, competencies and expertise. By centralising capabilities, they can be shared across products. Siloed organisations are built on shared service centres. The reason behind these structures is cost reduction through utilisation optimisation. I'm not a fan, see: Perish or Survive, or being Efficient vs being Effective.

Output vs Outcome

It’s the difference between output and outcome that often drives ‘blaming’ in a post-mortem.

In siloed organisations, each silo’s focus is on output, its own output. The silos are in many cases the result of centralising the responsibility for specific aspects of the delivery process, with a lack of accountability for the full process. Specialists are responsible for doing their 'thing' as efficiently as possible.

Often responsibility is mistaken for accountability, so these task-optimised teams, teams of experts, are held accountable for what they deliver, which is not the outcome of the process, but the output of their effort. Because they are the experts, they perform their task for different products, i.e. they participate in various delivery processes, and they are held accountable for the number of tasks that they completed in total.

The problem, therefore, is that the silos operate truly independently of each other. Each silo services several product delivery processes, because servicing only one (product delivery) process would mean that the silo would be idle for a significant amount of time. Since the silos are there to optimise resource utilisation, idle time is undesired. Idle time is considered wasted time by many non-Lean'ers. In Lean, wasted time is time spent on something that is not immediately needed for the delivery of a product.

Every silo will work hard to meet its numbers. Meet its targets. And when the target is the number of tasks performed instead of the number of products delivered, we're doing a lot and contributing nothing.

Within a context of blameless post-mortems, a context where blaming should be prevented, we need to make sure that responsibility is shared on the (product delivery) process outcome and that accountability is set to manage that outcome. This means that development and operational responsibilities are both defined to contribute to the outcome of the process. Something devops shines at.

Thanks once again for reading my blog. Please don't be reluctant to Tweet about it, put a link on Facebook or recommend this blog to your network on LinkedIn. Heck, send the link of my blog to all your WhatsApp friends and everybody in your contact list. But if you really want to show your appreciation, drop a comment with your opinion on the topic, your experiences or anything else that is relevant.

Arc-E-Tect

The text very explicitly communicates my own personal views, experiences and practices. Any similarities with the views, experiences and practices of any of my previous or current clients, customers or employers are strictly coincidental. This post is therefore my own, and I am the sole author of it and am the sole copyright holder of it.

March 21, 2019

The question that takes away all blame

Blameless postmortems, or blameless RCAs, are supposed to be the new normal in devops organisations, but all too often we see that first the team, and sometimes the person, to blame is sought, and then we tell them to ‘fix it’.

It’s for a large part a remnant of the siloed organisation and the culture that stems from it. And it is a matter of asking the wrong questions. This is, I would say, about 80% of the reason why we are unable to prevent incidents from recurring.

The cool part is in the ‘asking the wrong questions’ thing.

Now, before going into it, let me emphasise that postmortems and RCAs are about preventing incidents from occurring again. You want to know the root cause, because you want to prevent something like the incident from ever happening again. Stop reading if you disagree.

‘What?’ is wrong

All too often, when we are dealing with the aftermath of an incident, we wonder ‘what caused this incident?’. Which is a valid question, but not one that is very valuable. The issue I have with this approach is that when we have an answer to this question, we think we found the root cause of the incident. Which we haven’t.

The answer to the question ‘What caused the incident?’ is typically something technical. There was not enough memory in the server. There was a bandwidth problem. There was a bug in the software that resulted in the incident.

There is something very satisfying in the answer to the question ‘What caused the incident?’. It is the gratification you get from knowing where the weakness in the product is. It’s in the software, in the hardware configuration, in the network infrastructure. Because when we know where the weakness was, we know who is responsible for the weak component. It’s the development team, the automation team, the network team. And when we know who is responsible for that component, we know who to blame and who to tell to go fix it.

In just about all RCAs I’ve been involved in, putting the blame on somebody was not about punishing that person; it was about identifying who should fix the problem.

The problem is in ‘responsibility’, because the person being held responsible is not necessarily the person that is accountable for the incident. Often, especially in a siloed organisation, there is no one accountable.

Although it is important to understand what went wrong, and what caused the impact, we need to realise that this is not the same as understanding what caused the incident. We’re not at the root cause just yet. But we want to be, because this investigation is painful. Colleagues are to blame, and the responsible persons must be brought to justice. They must be told that we can never ever feel that impact again. And so, we make sure that next time the impact will be bearable. We increase memory in the server, increase the bandwidth in our network, fix the bug in our software. All holes are plugged. Ready to go.

If only we had addressed the root cause, it all would be hunky-dory.

‘Why?’ is right

The question that should be asked is not so much about what caused the incident, it’s about why the incident could occur in the first place.

That’s a tough question to answer. Why was there a bug in the software? Why was there not enough bandwidth? Why was there not enough memory in the server?

And that’s only the first ‘Y’.

A very common, tried-and-tested, effective way of identifying the ‘real’ root cause of an incident is to apply the 5-Y (five whys) method. In this approach you ask ‘Why could the previous answer happen?’ five times. Experience has taught us that going five levels deep will get you to the root cause of the problem; sometimes fewer levels suffice, and hardly ever are more than five needed.

Let’s assume that the incident was due to insufficient bandwidth and let’s start asking ‘Why?’

Why was there not enough bandwidth? Too many customers accessed the newly released API.
Why did too many customers access the new API? Because we announced it prematurely in our global newsletter.
Why was it announced in our global newsletter? Because the marketing manager wasn’t aware that the API was to be released following the ‘soft-launch protocol’.
Why was the marketing manager not aware of the fact that the release was to follow the soft-launch protocol? Because she was not in the meeting in which it was decided to follow the soft-launch protocol.
Why wasn’t she in the meeting in which it was decided that the API was going to follow the soft-launch protocol? Because she was on vacation and didn’t appoint a delegate.

Now we know why the incident could occur. Not what caused it, but why it could be caused. Making sure that the marketing manager or a delegate attends the meetings in which product launch strategies are decided will prevent this incident from occurring in the future.
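The chain of questions above can also be written down as a small data structure, which is sometimes handy when recording a postmortem. This is only a minimal sketch in Python: the names Why and root_cause are made up for this illustration and are not part of any established tool, and the answers are simply the ones from the example above.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Why:
    """One level of the 5-Y chain: a 'Why?' question and its answer."""
    question: str
    answer: str


def root_cause(chain: List[Why]) -> str:
    """Return the answer to the deepest 'Why?' in the chain."""
    if not chain:
        raise ValueError("ask at least one 'Why?'")
    return chain[-1].answer


# The bandwidth incident from the example, one entry per 'Why?'.
chain = [
    Why("Why was there not enough bandwidth?",
        "Too many customers accessed the newly released API."),
    Why("Why did too many customers access the new API?",
        "We announced it prematurely in our global newsletter."),
    Why("Why was it announced in our global newsletter?",
        "The marketing manager wasn't aware of the soft-launch protocol."),
    Why("Why was the marketing manager not aware of it?",
        "She was not in the meeting in which it was decided."),
    Why("Why wasn't she in that meeting?",
        "She was on vacation and didn't appoint a delegate."),
]

for depth, why in enumerate(chain, start=1):
    print(f"{depth}. {why.question} {why.answer}")
print("Root cause:", root_cause(chain))
```

Note that only the first answer in the chain is technical. Everything below it is about people and process, which is exactly why the whole delivery chain needs to be represented in the postmortem.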

Of course, the above is only an example, but it shows that by asking ‘What?’ the solution would be a costly technical solution and by asking ‘Why?’ the solution is better meeting attendance.

Another important conclusion you might’ve drawn is that asking ‘What?’ only involves technical people, which leads the search for a solution further down the costly technical path. The ‘Why?’ question, on the other hand, requires all parties involved in the delivery of the product (the API) to attend the postmortem. Getting to the bottom of the incident’s cause requires a multi-disciplinary team. Just like delivering a product requires many disciplines.

It makes no sense to think that creating a success requires many disciplines, while preventing a failure requires only a single one. There is no difference between delivering something that works and something that doesn’t. Not from a product delivery perspective.

Product Owner or Problem Owner?

The question is of course: Who would go through all this trouble and assemble all these people that are involved in delivering a product into the hands of our customers? It’s the one that is held accountable for the incident. More importantly, it’s the person that is held accountable for the fact that the incident doesn’t occur again.

The Product Owner would be my preferred role: ownership of the product implies ownership of the success of the product and all challenges that come with it.

There is a follow-up story that you can find here.

January 14, 2018

The Arc-E-Tect predictions for 2017 - In hindsight [2/2]

Last year, like every year, I made some predictions on what would be in and what would be out in 2017. But unlike other years, last year I actually posted those predictions on the internet.
Before I start with my predictions for 2018, I will go back to my predictions for 2017 and see how things turned out.
This is part two, and part one you can find here.

#6: KVI in, KPI out

"Forget about performance. Performance, in the end, means nothing when it comes to an organisation’s bottomline. What matters is value. However you want to cut it, unless value is created, it’s not worth the effort. And by value being created I mean that the difference between cost and benefit increases."

The first prediction that I am looking at in this post is a bust. Although more and more teams and organisations are transforming into agile adopters, the value-driven aspects of agility are still undervalued by most. I hardly come across organisations, departments or even just teams where success is measured in terms of realised value. Vanity metrics are pretty much still the norm. It's a shame, because it also means that the promise of applying agile concepts is still a long way from being realised.
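The definition of value in the quoted prediction, the difference between benefit and cost increasing, can be made concrete with a bit of arithmetic. The sketch below is in Python, and the names (Snapshot, net_value, value_created) as well as all the numbers are hypothetical, chosen purely for illustration.

```python
from typing import NamedTuple


class Snapshot(NamedTuple):
    """Benefit and cost of a product at one moment, in the same (made-up) unit."""
    benefit: int
    cost: int


def net_value(snapshot: Snapshot) -> int:
    """The difference between benefit and cost."""
    return snapshot.benefit - snapshot.cost


def value_created(before: Snapshot, after: Snapshot) -> int:
    """Positive only when the difference between benefit and cost has grown."""
    return net_value(after) - net_value(before)


before = Snapshot(benefit=100, cost=80)

# Benefit grows faster than cost: value is created.
print(value_created(before, Snapshot(benefit=130, cost=90)))

# More is delivered, but cost grows faster than benefit: value is destroyed.
print(value_created(before, Snapshot(benefit=110, cost=95)))
```

In the second case a performance indicator such as 'features delivered' would still go up, which is what makes it a vanity metric in the sense used above.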

#7: Products in, Projects out

"It shouldn't surprise you, but I'm not a big proponent of projects and instead love to see it when organisations switch to a product focused approach. But in 2017 it will turn out that I'm not the only one."

This is happening big time in a lot of environments I've been in. The main reason why organisations transition from a project perspective towards a product perspective is because of CI/CD (Continuous Integration/Continuous Delivery). With reduced cycle times as a result of automation of the software delivery process, it is almost impossible to not release a product early and keep on working on it. Hence, delivery to production does not result in the end of a team.
My main concern in these situations is the lack of a Product Owner who has mandate over scope. The Project Manager typically does not have that mandate. It is the next step.

#8: Heterogeneous in, Homogeneous out

"In 2017 we’ll truly face the uprising of new and more technologies, concepts, architectures, models, etc. And in order to be able to manage this we will finally understand that we need to embrace the fact that our environments consist of a multitude of everything. In many smaller organisations that are at the forefront of technology and that are working in agile environment it is a given, but now that large organisations have also set out to adopt the ‘Spotify’ concept and thus teams have a huge amount of autonomy, polyglot is key."

Yes! Most organisations have dropped their need for huge standardisation efforts. Instead I see that architecture principles are becoming highly popular. With that and the gradual move towards autonomous teams, I do see a shift in mindset where homogeneous environments are no longer considered the answer to all IT problems. This is also a mindset shift from efficient towards effective.

#9: Activities in, Roles out

"The thing is, we’re moving, as an industry, in the direction where we want be able to get feedback as early in the process as possible, which means that every person concerned with creating and delivering a products will be involved in everything needed to create that product and ensure that it works as intended and more importantly as needed. In this setup, everybody is what we in 2016 called a full-stack developer."

In 2017 this didn't happen. The T-shaped employee and the full-stack developer are found in small organisations. Large enterprises still have a culture based on decades of functional hierarchies. Contracts are still based on roles, in which T-shaped and full-stack have yet to find their spot. As long as agile transformations are considered to be merely an IT thing, or even just a software development thing, it will remain very hard to get to cultures where teams are considered the atomic entity in product development and where, instead of roles and responsibilities, tasks are performed as activities.

#10: Agile in, Waterfall uhm... also in

"Well, agile is finally in and is going to replace waterfall projects in those organisations where there is an active movement towards agile. Which nowadays are the majority of enterprises. These organisations are heavily invested in dropping the traditional practices and adopting new, more business value oriented practices."

In 2017 I saw more and more organisations realising that the typical waterfall projects can actually be done in agile ways. This notion is actually causing the existence of waterfall to be questioned. Do we still need waterfall? No, not at all. But we still need large projects. In 2017 I saw a realisation by many managers as well as architects that large projects and waterfall are not different words for the same behemoth. Instead, there is now a clear tendency to actually do large projects by applying agile practices, and waterfall seems to be relegated to only tiny projects. Ironic, but pretty awesome.

This was part two of a two-part quick glance at my predictions for 2017. Yesterday, I posted part one of the series, in which I looked at how the first 5 predictions turned out. Next week will be about my predictions for 2018.


January 11, 2018

The Arc-E-Tect predictions for 2017 - In hindsight [1/2]

#1: Microservice in, SOA out

"In 2017 people will start looking at Microservices as something that is useful and way better to have in your architecture than services. So a Microservices Architecture will replace Service Oriented Architectures in 2017."

With a massive transition towards agile practices and organisations embracing scaled agile frameworks, it has been inevitable: the Microservice Architecture (MSA) has been broadly embraced. Or has it?

In 2017 I've seen that those organisations that require true agile concepts in practice in order to be(come) sustainable also embraced MSA as the architecture of choice. The change in mindset that is required for MSA to thrive in an IT landscape and an organisation itself for that matter turns out to be more encompassing than mostly thought. I've seen it fail in those organisations that merely do agile, and succeed in those situations that are agile. Yes, MSA and Agile are going hand in hand.

#2: API's in, Webservices out

"Okay, in 2017 we'll feel ashamed when we talk about web-services and SOA. Instead we'll talk about API's. This is closely related to my first prediction on Microservices, which you can read here."

Here I can be short: There's hardly any talk about web-services anymore. It's all about API's nowadays and that has been the case for the better part of 2017. Over the course of 2017 the notion of API's also shifted from merely glorified web-services towards true business services.

#3: Application Architecture in, Application Model out

"Yes, in 2016 I've been confronted with application models. Again and again I have been slapped with models of applications and yes, I've been on the other end of the slapin' stick as well. Shoving application models into other people's faces. Stuffing it down their throats, making them, no forcing them to understand."

Unfortunately this prediction didn't come true at all. Although it depends on how you look at it. In 2017 I've been in more discussions than before about Application Architectures, although in most cases people were actually talking about models. I guess the terminology is out of vogue, but a lot of architects still have a hard time using the correct terminology. Still, to me the Application Model isn't out and the Application Architecture isn't in. Just yet. Probably with a more widespread adoption of MSA, we're bound to ditch the model and embrace the architecture.

#4: Internet in, Intranet out

"So the internet will be in, and no longer will we consider the intranet as the context in which our software is running. Talk with any cyber security firm and they will tell you that security has become a real issue since computers got connected. Networks are the root of all evil when it comes to viruses and the likes. The larger the networks, the bigger the problems. And with heterogeneity the number of threats only grew, probably exponentially."

This so turned out to be a correct prediction, and like I envisioned, one of the main drivers has been security. And the lack of it, in many cases.

In most environments I've been working in and with over the course of 2017, there was a real notion that it was no longer affordable to ignore security on an application level, and that we have to assume that applications can be accessed from the internet. Even when that isn't supposed to happen. Finally we know that assuming the network to be secure is an assumption that really does make an ass of u and me (assume -> ass-u-me).

The good if not best aspect of this is a security-by-design mindset in most if not all people involved in product development.

#5: DevOps in, Scrum out

"I can be very short about this. Business has finally come to understand that IT is not something that enables them to deliver new products to their customers but instead IT is what they deliver to their customers. IT has become a product, and therefore an immediate business concern."

In 2017 it turned out to be not that short, unfortunately. What I've seen happening is that unless agility is a true business concern, a matter of business sustainability, DevOps is not something organisations want to embrace. This is primarily a matter of large enterprises though, those with seemingly enough money in the bank to linger a while longer before feeling the need to be able to ward off the threats of start-ups and their agility.

This was part one of a two-part quick glance at my predictions for 2017. In the next couple of days, possibly tomorrow, I will post part two of the series and look at how the remaining 5 predictions turned out. Next week will be about my predictions for 2018.


December 13, 2017

What makes an ideal agile team?

Summarising

On many occasions I am asked by clients to share my thoughts on how they are forming, or planning to form, their Scrum teams. My recurring advice is to keep the team as small as possible, stable and consisting of full-time individuals, maxing out at 9 persons. Have one or more coaches linked to the teams and make sure that there is a mandated Product Owner.
Make sure that relationships are based on trust and ensure that accountability and responsibility are two aspects addressed organisationally with full support of upper management.

The other day I was asked by a manager of one of my clients to comment on a proposal of one of his vendors. A key vendor that has been underdelivering for a very long time, but is still considered to be a strategic partner in most of my client's business. The vendor is currently experimenting with a different way of addressing its customers by better aligning its services to the needs of its customers. The vendor is proposing to set up a cross-functional team that will be tasked to deliver relevant products in short cycles.

My client is a little sceptical but willing to give the vendor the benefit of the doubt. As such he's agreeing to become one of the first customers to work according to the newly proposed process and is now asking me what to look for in the proposal. Being the first customer, and thus investing time and effort in amending the current mode of operations with this vendor, there's some headroom to 'ask' for specific adjustments to the proposal.

The vendor is going to deliver new products and services in a variety of categories to my client and is proposing to set up a Scrum team. My client now wants to know if there are specific aspects of this team that need to be addressed. The role of my client with regard to this team is also a point of interest in the negotiations. And obviously so is the delivery of products and services to my client.

Basically the question is: How should my vendor organise its team in order to better meet our requirements? The answer is reasonably straightforward.

In principle you want to keep a Scrum team as small as possible; that is actually a basic principle. The maximum size for a well-functioning team is 9 people. This was already discovered by the ancient Romans, no kidding.

I would argue for starting with 6 or 7 people. Rather 6 than 7, by the way. The reason for this is that otherwise you will get specialisations within the team that come to lie with one person, who will immediately become the single point of contact or failure (SPOC/SPOF) and who cannot get sick or go on holiday.
The strength of a well-collaborating team lies mainly in the knowledge spread throughout the team as well as in the complementary personalities within the team. This allows members to challenge, supplement and assist each other. It is a common mistake to first decide what kind of techniques and technologies you want to use and then find the experts for those. It is a Team and not a Group we're trying to forge. Understand the difference, it's important.

I find the use of FTEs in resource planning of Scrum teams confusing and dangerous. I've seen this plenty of times and it never resulted in anything good. One FTE can be filled in by 1 or more people, and that is what you want to prevent. You want the team to be stable, not only within a sprint but also over sprints. Almost at any cost. Again, I'm not talking about a group but about a Team. And they must all have an engineering mindset: people who dislike doing things by hand and by definition want to automate everything.

Recently I was asked on the subject within the context of an initiative to start treating an operating environment as a platform, which was going to be treated like a product. With a Product Owner at the helm. And a group of my client's experts was assembled to form the 'Scrum team'. I was asked to provide my thoughts on this setup.

We're talking about server provisioning, networking, identity and access management, firewalls, certificates etc. This team is also going to be responsible for operating the platform they're developing. My initial thought is therefore to look for both functional expertise (provisioning, networking, identity management, load balancers, firewalls) and an engineering mindset (automating, monitoring, everything as code); if possible someone who is senior in one or more functional areas and medior in engineering, or vice versa. Depending on the team composition, there may also be a medior/medior combination when there's enough seniority in the overall team. Attitude is more important than knowledge in my opinion. I would therefore always prefer someone who wants to automate, will always test and always asks for insight into production.

I do not see why one would need a full-time Scrum Master, that's probably overkill. Having said that, it seems wise to let the team choose their Scrum Master from the team itself. And with experienced teams that have been working together for a considerable time this is likely a viable option. But when you're just starting with Scrum in your organisation or when the team is just starting with Scrum and Agile concepts, I always prefer a Scrum Master from outside the team. Since the Scrum Master is the team's conscience, and at times will have to use strong words to get focus back on the team's goals and principles... it will be challenging for a Scrum Master from within the team.
Add a coach for roughly the first 6 sprints to coach the Scrum Master. Even though the Scrum Master might be experienced, it is still good to have a coach. The team chooses the Scrum Master because it will ensure that it's their choice, a choice supported by the team. You want good chemistry between team and conscience.
The role of the Scrum Master is all about holding people to the standards and values of the team, but also facilitating them in doing their work. I have seen in various situations that it can work when the Scrum Master also has another supportive role, e.g. as Tool integrator in Continuous Deployment environments. I'm not a proponent of this though.

I told the client that I was missing the Product Owner (PO) in his approach. The PO is relevant because the PO determines what the team is going to work on. That would be the person who talks to the customers about what is needed, etc. And to the users about what was delivered. Therefore the PO is accountable for what the team will deliver. The team is responsible for what it creates, the PO is accountable. These are the two most important aspects.

Keep in mind that responsibility can be delegated and shared, accountability cannot be delegated nor shared. So a delegated PO doesn't exist, as it would mean delegated accountability.

So a stable team and a PO (outside the team) defining the team's priorities. No one else but the PO is mandated (i.e. empowered) to determine what the team is working on, because the PO defines the priorities of the products and features to be developed. The team plans the work. So there has to be, by definition, a lot of trust between the PO and the Team, because the PO must be able to rely on the team's commitment to what they are going to do in a sprint. This is where the Scrum Master comes into the picture again, because the Scrum Master has to make sure that the team does not overcommit.

Here's an interesting aspect, the trust aspect. An aspect I will address in a future post, where I will cover more on metrics and KPI's and the trouble of the user story in this regard.

In my opinion, the customer should take on not only the role of the customer, but also that of the user. This will allow a user-centric approach to developing the product, while at the same time staying very customer aware.

You should look for a future post on stakeholder management and agile processes to get some more insights on this topic.

In another case, I told another client of mine that it makes perfect sense to start treating their Scrum teams as internal startups. Basically, consider the PO of these teams as the CEO of the startup, and the management team as the Venture Capitalists. By doing so, there's ample opportunity to experiment and evolve into a value-driven organisation.
In this particular case I suggested seeing whether it would be feasible to go along these same lines.

The PO will need to determine what an MVP would be: something small, delivered in short increments, so as to quickly find out whether something is usable. Start cultivating a mindset where 'Done' means 'In use at a customer!' (Done = Live).
Agile means being able to deal with change in a timely manner. So here comes the point of view that the PO is a person who has a strong personality and is met with a lot of respect. Somebody that knows what the product should look like, with a product vision. And the PO will have to be mandated/empowered. It should never be the case that the customer can circumvent the PO to get things prioritised. This is for internal and external customers. And managers.

Again, I told my client to get a coach involved here as well. Even if it is an experienced PO, it is important to be coached. In the beginning intensive, but perhaps later (even after 6 sprints) it may be a bit less. It might be an opportunity to invest in an agile-coach, who will take care of all the coaching work (team, Scrum Master and PO).
Just like the PO is mandated to define what is and what is not going to be in the product, the Team and the PO should be granted autonomy, independence and self-reliance. The PO is positioned external to the Team, and in case of scaling up from one team to two teams I would argue that both teams get the same PO.

In this particular setting, one could even consider starting with three small teams of 3 to 4 people who all work on their own part within the platform, three products if you will: a team for provisioning servers, one that works on network and connectivity (including DNS, firewalls, load balancers) and a team that focuses on IAM Automation (Directory services for example) and certificates. Combine these teams with a single Scrum Master and a single PO for all three Teams. Obviously an Agile coach for the whole bunch.

So, concluding: A maximum of 9 individuals, all full-time, in the Team, one of whom will assume the role of Scrum Master. Alternatively, form 3 teams of 3 to 4 full-time individuals with an overarching Scrum Master. In addition, a PO with knowledge of the subject matter and full mandate, and an agile-coach.

Arc-E-Tect

August 4, 2017

The "Eat your own dog-food" fallacy

Why having people eat their own dog-food doesn't accomplish anything sustainable.

Summarising

Having somebody eat their own dog-food doesn't necessarily make them improve its taste, but might make them get used to the taste instead. Awareness of the (lack of) quality in one's work is important when transitioning from a Dev/Ops organisation towards a DevOps organisation. But experiencing a lack of quality first hand is often not the way to do so, or even an option. Agile transformations are only possible by means of cooperation.

There's a common understanding in the world of organisations that are transforming from a Dev/Ops towards a DevOps organisation that this works best when the Devs are forced to eat their own dog-food. It's a reaction from the Ops people towards the Dev people.

Basically it means that once you've got the Devs supporting their own products, they'll make sure that those products are of a high quality. Common belief is that they, those Devs, don't want to get called Sunday night at 3 AM because their software crashed. Which makes perfect sense: who does want to get called Sunday night at 3 AM because something they created crashed? I wouldn't, would you?

So, if you want those Devs to focus more on quality, on less crashing software, you have to make them support that same software themselves. In other words, turn those Devs into Ops people and that'll show them.

About a year ago, I joined a team at one of my customers to help them transform selected development teams into more agile teams utilising Continuous Delivery mechanics and move towards, lord forbid, DevOps. One of the slogans we used to get the necessary buy-in was:

"Eat Your Own Dog-Food"

And later we added "And Clean Your Own Shit!". Totally convinced that this is how it works. Make people feel the pain they're causing and they'll become better persons. If you would do a little time-travel and rewind about 2 years, you'll hear me saying that "one should feel the pain they're inflicting".

I think you'll recognise this when you've been in that situation where you want to move from Dev/Ops towards DevOps. Or become more agile. Or maybe you need to convince people that agile or DevOps is the way to go.

It makes total sense. People with kids know this. You want them to stay away from the fire, let them burn their fingers. Or those people with dogs shitting all over the place... let them clean that shit every time their dogs drop their poop in the playground. It works, really does.

But there's a huge problem in this; I'll get to that. First I want to ask you if you've noticed that slightly grumpy undertone in my mentioning of the "Eat your own dog-food" slogan and everything associated with it.

Agile and Scrum in particular are developer-driven ways of working. It's the developers that want to change things. The reason, obviously, is that developers want to develop software that people are using, so they want to create something that is as close to what is usable as possible. Meaning that they want to put something in the hands of the user as quickly as possible and then make adjustments and put that into those same hands. And again. And again. And again.
Our organisations are such that specialised Ops people are managing those applications when in the hands of those users. And they need to keep up with all those adjustments... You understand the predicament those Ops people are in.
So you tell these Ops people, whom you want on your side while transforming into an agile organisation, that those Devs should eat their own dog-food. Ever tasted dog-food? So how do you think that this resonates with those Ops people? Pretty awesome, don't you think?
That connotation of dog-food gets the nay-sayers called Ops on your side, shouting "Yay" instead of "Nay". Mission accomplished. You're done.

Well, forget it. That's the "Eat your own dog-food" fallacy. It's a fallacy because it doesn't solve anything and it certainly won't help you in your agile transformations. Consider that your organisation has separate Dev and Ops teams, and that the reason for this is a more efficient Ops team, because it supports many products and supporting a quality product is not a full-time job. Read my post on this topic. There's no way that having the Devs eat their own dog-food will improve quality. Which was the premise in the first place, remember? And now you should say that in fact it doesn't make sense at all, because there's an Ops team, and for a good reason. So what makes you think that anybody in the organisation will just allow the Devs to take over the Ops jobs? Ain't gonna happen. No way. Meaning that whatever you're trying, the Devs won't even get the opportunity to eat their own dog-food, even if they wanted to. You can fix it by job-rotation... in certain situations, in certain organisations. Fallacy explained.

Considering the above, how would you need to address the agile transformation? How to move towards a DevOps setting? The answer is quite simple and probably extremely hard to implement. The more alluring the "eat your own dog-food" approach is, the harder it will be to do the correct thing. The sustainable approach, let's call it.

If you want the developers to develop software of a higher quality, you need to make them aware of the problems they are causing because of the lack of quality in the software. And you do this, by introducing them to their operations colleagues. Let them work closely together. Geographically close. As in side by side. Not pair programming close, but across the desk close.
What will happen is that the colleague from Ops will complain to his Dev colleague about a problem instead of to his Ops colleague. Feedback loop is tiny and closed. And most likely, it's friendly constructive feedback, because it's directed to the immediate colleague from across the desk. That person with whom lunch is shared. Not bottled up feedback. It becomes a feedback loop in which the Dev can immediately ask the Ops person why it's such a huge problem, this tiny thingy.
Getting the Dev and the Ops to work at the same desk allows them to become aware of each other's work, and generates understanding of their respective worlds. It creates an atmosphere where the Dev will improve the quality of the product, because otherwise the Ops colleague is called Sunday morning at 3 AM, and that's not something the Dev wants.

As a parting gift, another tip: Make sure that Devs and Ops don't huddle together when you put them in the same room. Instead put the Dev next to the Ops, side by side, hand in hand. When you allow them to huddle together, you should put them in separate rooms. Just to make sure that their respective complaining is not affecting them during work hours.

By allowing those two blood types in your organisation to become aware of each other and befriend each other, you'll pave the way to become a truly agile organisation with a smooth transition towards a DevOps mindset.

Here's an exercise for you: See how it works the other way around. Feel free to use the comments to discuss.

Arc-E-Tect
