Monthly Archives: November 2022

The 6 Signs of a Tipping Point in Software Engineering Organizations

Like most people, I haven’t spent my career with a single team or company. Looking back over the last 20 years in software, I realized that I have seen the same patterns repeat across teams, organizations, and companies, yet a list of warning signs isn’t readily available.

The tower of Pisa began to lean during construction in the 12th century, due to soft ground which could not properly support the structure’s weight. It worsened through the completion of construction in the 14th century. By 1990, the tilt had reached 5.5 degrees.

As I went through my notes on the peaks and valleys of my career, these are the things that really stood out to me.

Over the course of many months, a foundation that was once stable can start to lean with each misstep. Over time, the accumulated faults create a leaning tower that people see each day when they come into work.

1. Consistent Feedback that Management isn’t Listening to Feedback

From my experience, when this happens over ~2 years, it’s a sign that something is wrong.

If employees don’t trust the feedback system in place, it’s hard to create a flywheel to improve everything around them. One consistent thing that I noticed within organizations that fail is that employees don’t believe meaningful action will be taken on the surveys that they pour their feedback into. While there are several reasons for this, one key DevOps practice that can be borrowed here is making work visible.

Make action items public for the effort your leadership team is dedicating to making things better. Overlay feedback on monthly retrospectives to get continual feedback from your teams.

2. Last Minute Budget Cancellations

I’ve typically seen Q4 Travel and Education (T&E) budgets shrink to zero as the company panics to make its most important quarter successful. While the T&E budget usually doesn’t represent a significant portion of the Profit and Loss (P&L) statement, when the company announces out of the blue that all travel is cancelled unless approved by a VP, things aren’t looking good. If this happens in successive quarters, or every year for 2 years, it’s a symptom of poor budgeting.

In tech companies, the largest share of Operational Expenditure (OpEx) goes to people, running infrastructure, and licences, whereas Capital Expenditure (CapEx) goes to buildings and infrastructure procurement. If software is only using 30% of the available hardware capacity, the remaining 70% of that capacity, and the cost of running it, is just sitting on the datacenter floor.

This is what reactive finance looks like
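
The utilization arithmetic above can be sketched in a few lines. This is only an illustration: the function name `idle_spend` and the $1,000,000 annual infrastructure cost are hypothetical, and it assumes cost scales linearly with provisioned capacity, which real datacenter costs rarely do exactly.

```python
def idle_spend(total_infra_cost: float, utilization: float) -> float:
    """Estimate the spend backing unused capacity.

    Assumes (for illustration only) that cost is proportional to capacity.
    `utilization` is the fraction of capacity in use, from 0.0 to 1.0.
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0 and 1")
    return total_infra_cost * (1.0 - utilization)


# Hypothetical annual infrastructure cost, paired with the 30% figure from the text.
annual_infra_cost = 1_000_000.0
idle = idle_spend(annual_infra_cost, 0.30)
print(f"Idle spend: ${idle:,.0f} of ${annual_infra_cost:,.0f}")
```

Even a rough number like this gives finance something to act on before the year-end panic, instead of reaching for the travel budget.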

3. Significant Attrition

When a Vice President, a Senior Director, Senior Managers, and Senior Engineers leave an organization within a few months, you likely have a confluence of problems that has finally reached a tipping point.

With stocks down post-COVID, employees no longer have RSU handcuffs that allowed them to deal with constant heartache at work. This normalizes all other tech companies and allows the veil to be lifted to find greener pastures where they might be valued and treated better. Now, human kindness and emotional quotients take priority over a devalued company.

We lost our senior engineer tech lead, followed by our director, then another director, then a few talented engineers – all within 6 months. This started a downward spiral because we were also blocked from backfilling some of these positions: the culture shifted, morale declined, burnout was high, and empathy went away. Fast-forward another year, and there was significant attrition at all levels. Attrition numbers usually aren’t shared, in order to protect the company, so most people won’t even realize it’s bad until it’s too late.

Another late-breaking “solution” that didn’t work here is the concept of a Stay Interview – for us, this was too little, too late because RSU value dropped, the organizational track record wasn’t positive, and asking people why they would leave shows that the leadership team lacks context. Invest in people from start to finish, not only when things are in shambles.

The positive side of a forest fire is that the forest usually grows back denser. In the case of software engineering teams, I haven’t yet seen this.

4. Doing Scrum Wrong

  1. Excessively long agile meetings with more than 20 😱 people
  2. Agile without outcomes
  3. Project Managers that only focus on getting Stories closed and not the context of the work
  4. Months without (start, stop, continue) retrospectives
  5. Retrospective feedback, when given, isn’t acted upon
  6. Teams not tracking their work
  7. Tracking defects in Excel or Slack to avoid being tracked in Jira/GitHub
  8. An excessively long backlog (e.g. 50-200 Stories) that never diminishes

When this has occurred, I’ve noticed that engineers don’t feel empowered to change the status quo because the people who run the meetings aren’t able to take stock and course-correct based on feedback. This means that 6 months go by, until an official performance cycle when feedback is gathered, before a correction is made.

The best tool we have to correct errant agile is a focused and honored retrospective, performed at least every month, with clear action items and clear progress made towards them.

5. Lack of Standard Hiring Bar / Backbone

If your organization struggles to define the role of a Software Engineer, it will struggle to hire the right people for the job. We had engineers who did Data Analytics and Data Engineering, managed Airflow jobs, performed System Administrator and Network Engineering roles, or acted as Scrum Master, and others who wrote code. We didn’t have a common definition of what it meant to be a Software Engineer, nor did each manager have a common way to ensure we were hiring top talent from a diverse candidate pool (ideally n > 5).

This creates a pay gap where people are paid for the time they work and the criticality of their role (e.g. managing a batch cluster is more critical than writing code for an internal tool), but not the work that they do. There were times when someone would be under-performing as a software engineer, but would be saved from a “does not meet” rating because they were available 24×7 and responsive on Slack (so they were more DevOps than Software Engineer). When you don’t have consistent expectations for roles, and managers aren’t calibrated to those expectations, merit increases seem random. Other companies work to calibrate managers and provide guidance that engineers must at least meet the expectations on every axis of their growth framework (see Square’s, for example).

Stacking another level on an already skewed tower, some of the organizations I’ve worked in were risk averse. So much so that people were managed out by transferring them to other teams, when it was obvious they were under-performing overall. This allows a manager to pass the buck and receive a backfill, instead of solving the root cause of the problem. Worse still, when an under-performer left, we had no control in place to ensure they could not rejoin. Worst of all, an under-performer could rejoin another team within the same company and receive a promotion.

Non-standard practices create inconsistent pizzas with olives only on one side

6. Referring to Employees as Fungible

The tough years came when we started to determine how many people in San Jose could be replaced with people in Chennai, with a focus on complete cost savings without regard for people. The going rate was about 3 software engineers in Chennai for 1 in the Bay Area. Finance’s perception was that a software engineer at one level in one location is equivalent to an engineer at the same level in another location.

What they failed to realize is:

  1. The emotional toll that layoffs have on an organization
  2. Late night meetings for teams become a new normal that no one wants
  3. Performance and potential differ greatly between engineers and regions; a senior software engineer with 1 month of domain experience is not the same as a senior engineer with 4 years of domain experience
  4. Team meetups and bonding are restricted (see travel budget cancellations above)
  5. Work becomes transactional, and you lose connectivity between people and their work
  6. You aren’t fixing the aforementioned upstream problems of: (1) management not acting on feedback, (2) last minute budget cancellations, (3) significant attrition, (4) doing scrum wrong, (5) the lack of a standard hiring bar – this just magnifies the problem by three.

I want to be short and succinct with these opinions I write, so we’ll stop here for now. In the future, I’ll be writing about these other areas:

  • Forced Curve Calibrations
  • Promotions Limited by Available Budget
  • Solving Failures with More Compliance Training
  • Not Trusting Employees to Take on a Challenge
  • Delayed Merit Increases
  • Constant Layoffs
  • Continual Ask for Managers to Calibrate Employees
  • How to Create Spaghetti Software with Top-Down Feature Requests
  • Focus on Moving Employees to Low Cost Areas
  • No Significant Achievements, but a lot of PowerPoints
  • The Long Jumper Problem: Human Tracking of Key Results

Not all of Pisa is leaning – just the tower. If you work in a place that’s been leaning for a couple of years, there are other options out there.