Happy New Year! 2022 was crazy - let’s see how 2023 pans out! I’ve already booked my flight to San Diego 😎.
I managed to post every week last year and I hope to continue at this cadence - it feels very much part of my existence now, but I know that some day I will want to stop. I keep wondering if I will run out of stuff to say, but my backlog of posts has never been so long…
Thank you for persisting with me for a whole year. I don’t have a plan for how much I intend to write about tools, data leadership or practice, but I will try to strike a good balance.
If there is a topic you’d like me to write about, please add it to the thread on my Substack chat, comment on this post or come find me on data-folks.masto.host.
Katie Bauer recently posted about how data teams can force themselves into positions where they can influence their companies:
I want to be hopeful about some of the possible outcomes of this post. Nevertheless, I feel the post assumes that everyone at the organisation does want to serve it well, but may have different ideas about how best to do this. Sadly, on many occasions in my professional life, I’ve worked with people who would undermine good ideas and people because it suits them, even if it is to the detriment of the organisation. This is where I feel the logic breaks down - but to be fair, nothing works well under these circumstances other than incentivising these self-interested folks to be aligned with their organisation… or better yet, rooting them out.
However, it’s a new year so let’s be hopeful and assume the best of each other - good will to all men and all that stuff. So, I’m writing a letter to our hypothetical stakeholders and colleagues outside of data, listing the things we need from you in order for us to be more effective:
When you ask us for help, tell us what you’re trying to achieve. Don’t just say “I need this piece of data”… tell us the higher-level goal you’re working towards, the outcome you believe will advance it, and where data could help you make a better decision or enrich your product. Treating us like data monkeys will make us look for somewhere we will be treated like data people, or will dehumanise us so that we behave as we are treated.
Where we’ve set up systems for structured work requests (e.g. Jira, Linear etc), please try to stick to your original request. This is easier when the request spells out the specific details and outcomes you need. We do understand that doing the work will surface requirements that need clarifying, but those clarifications should stay within the clear constraints of the original request.
If our team has taken the time to make self-serve possible, do try to use it. There genuinely are people in an org who may not be tech savvy, OR don’t have the time to use data tooling themselves, OR whose time is so valuable that self-serving doesn’t make sense. This is not the case most of the time, though. If my CEO were rushing between back-to-back meetings, I’d build them what they needed and make it easy to find and use.
There are also scenarios where you can try to self-serve too much and you end up building Frankenstein’s monster of a dashboard, with 107 merged queries etc. Self-serve data is, by nature, meant to deal with relatively simple requests - if the question you are trying to answer in one table or graph goes beyond “I want to see these metrics split by these dimensions, possibly with some table calculations on top like running totals”, then the chances are you need help, and we will build you something more advanced.
What is clearly wrong to you is not always so to data folks. Often when data folks get things wrong and seem puzzled, it’s because they feel like they’ve followed the right process. Even if it’s glaringly obvious that it’s wrong to you, as you know what the business norms are, bear in mind there is probably only a small switch to be made or upstream problem solved that would eliminate the data issue. Data stacks can be sensitive to pretty small issues.
Not every data person knows the business like you do - they have to know a lot of other things. Some Data Analysts and Analytics Engineers may know some parts of the business very well, perhaps as well as you, but it’s rare for any data person to know their whole business in great detail. Data folks are trying to align the world in the data systems AND the actual world. This is rarely (read never) straightforward.
Involve us in strategy. If data can be used in making decisions/strategy, then we should be involved, end of. Yes, we should be enabling others to use data for this purpose, but there are nuances and expanding circumstances that people who are not data professionals will likely miss. This is why you include Finance, Product and Marketing folks in these discussions - you don’t understand their fields as well as they do. You don’t understand the data field as well as data folks do. Our perspective is as valuable as that of folks from these other fields - take our input. Just like any other input, it doesn’t have to be followed fully or at all, but it is valuable enough to be added into the equation.
Question whether the work you are asking for is worth the total cost. Data resource is scarce, and often what you think is a low-cost piece of work actually costs much more. Be clear about whether this is a one-off piece of work or something that will need to live on. Please be honest about this: most of the time, if a piece of data work is valuable, it lives on for some time - often enough that good rigour around data & analytics engineering is worthwhile. This means, most of the time, the total cost of Data work = Data Engineering + Analytics Engineering + Analytics/Data Science + Maintenance + Upgrades. Bear in mind that almost any piece of work is therefore likely to cost thousands of dollars.
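To make the total-cost sum concrete, here is a rough sketch in Python. Every number in it is made up purely for illustration (the day counts, the 12-month lifetime and the day rate are my assumptions, not real benchmarks), and the `WorkEstimate` class is a hypothetical helper rather than anything from a real library. It only shows how quickly a “small” request adds up once maintenance and upgrades are counted.

```python
from dataclasses import dataclass


@dataclass
class WorkEstimate:
    """Hypothetical effort estimate for one data request, in person-days."""

    data_engineering_days: float
    analytics_engineering_days: float
    analytics_days: float              # analytics / data science effort
    maintenance_days_per_month: float  # ongoing upkeep while the work lives on
    upgrade_days: float                # later rework, migrations, extensions
    lifetime_months: int               # how long the output is expected to be used

    def total_days(self) -> float:
        # Up-front build effort across the three disciplines
        build = (
            self.data_engineering_days
            + self.analytics_engineering_days
            + self.analytics_days
        )
        # Cost of keeping the work alive over its lifetime
        upkeep = (
            self.maintenance_days_per_month * self.lifetime_months
            + self.upgrade_days
        )
        return build + upkeep

    def total_cost(self, day_rate: float) -> float:
        return self.total_days() * day_rate


# Illustrative assumption: a "quick" dashboard that lives for a year
estimate = WorkEstimate(
    data_engineering_days=1,
    analytics_engineering_days=1,
    analytics_days=2,
    maintenance_days_per_month=0.25,
    upgrade_days=1,
    lifetime_months=12,
)

# Illustrative assumption: a fully loaded day rate of 600 dollars
print(estimate.total_days())
print(estimate.total_cost(day_rate=600))
```

In this made-up case, half of the effort sits in maintenance and upgrades rather than in the original build, which is why being honest about whether a request is truly a one-off matters so much.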
If you don’t have a data team or data person, do you really need them? There aren’t enough people to go around in the industry - they want to be where they are the most needed, will have the highest impact and will have the most ability to influence their orgs. Hiring data people to do work that isn’t that important when there is so much interesting and important work to be done, is a good way to churn through hiring and waste budget. This is unwise: even if you’re lucky enough to have ample budget in these circumstances, when you do come to need data people… bear in mind they read Glassdoor.
Do challenge priorities, but don’t ask data teams to do more than is reasonable or to look at your work above the rightful top priorities. We’ve all had that unreasonable stakeholder who wanted their work done regardless of whatever workload we had, or to be wrongly prioritised above current work… don’t be that person, it won’t work in your favour. I’ve had amazing stakeholders too, who’ve said things like: “Oh please don’t worry about this before priority X, which is related to the main OKR this quarter”. We’re humans and primates and goodwill with us is an asset - somehow those great stakeholders get their problems solved earlier than they expected. Data folks usually want to help and especially help those stakeholders who are reasonable and kind to us. 🤷♂️
Expecting data people to succeed in isolation is unreasonable: there are unicorns out there who can do this, but they are rare. Usually, data folks will hope to have at least a Head of/VP/CDO level person leading Data in the org, as well as some support from other data disciplines, as most data folks aren’t strong in all areas.
Data infra work can, should and often does have a long-term positive ROI in being a multiplier on future capacity or capability. It also increases work satisfaction in the data team - in my experience, data folks are neat creatures… they want their repos and workspaces to be as clean as possible. Ignoring data infra work for the long-term is perilous, as it results in lower efficiency, lower satisfaction in the data team… a recipe for turnover and failure. There has to be balance between this work and the top priorities of the business at any given time. This is why you will see roles such as Data Platform Engineer become prevalent in the future: these folks purely focus on data infra work that others often are forced to neglect.
Everything I’ve asked of stakeholders above could be turned around: the data team could elbow its way in or push for these improvements itself, and at times it should, but it’s much better if it isn’t always a fight.
In a recession, where profitability is more important than top line growth, things are more constrained and I believe the impact of data practitioners is more profound. The maths starts to be constrained - you’re not guessing about where the edge of diminishing return is, you’re trying to find the optimal point between known performance boundaries.
Despite it being a recession, data folks are still in high demand, especially at a senior level. Whilst this isn’t a reason to coddle them, you should bear in mind that you don’t want the grass to seem greener…
If we seem forceful (with pointy elbows) in making our way into discussions, it’s probably not us being nosy or disrespectful. We want to serve our organisations as best we can. It’s no fun being a prophet in the wilderness.
Excellently put David! I feel like you've put words to many of the issues I've felt (but often failed to identify) in prior projects and work. I think this is a nice complement to Katie's post as well, and you do a great job framing the other side of the equation. Ultimately, the best data orgs are the ones who both advocate for themselves, and have the best partners.
I also appreciate the notes you made about practitioners rarely being the best in all areas of data, and about the importance of infrastructure work. Hopefully those can be more talked about over the coming years.
Really enjoyed this one!