I had been meaning to tell this story for a while, but Lindsay’s post on LinkedIn a few weeks ago has refocused me.
Lindsay is Head of Data @ Secoda, and we worked together recently on organising MDS Fest (V2 of the conference is running from April 8th to 12th!). My reply to Lindsay may sound a bit terse in writing, but it’s not meant to be - Lindsay is amazing and I love what she does in the global data community. However, I have a lived experience to share that is the root of me saying this, plus a repeatable, usable method.
From what I’ve seen, two forms of diversity are a particular challenge in tech and data: female representation and black representation. I say this having travelled a lot over the last couple of years and having worked in data for a long time. I’ve met or seen thousands of data folks at conferences like Snowflake Summit, dbt Coalesce and Big Data London.
It’s really hard to split socio-economic reasons for under-representation in tech from poorer outcomes for minority groups in the population more generally. That’s not to say we shouldn’t do anything, I’m just saying I’m not sure what we should do. It feels like there should be a wider and larger societal play, but we have great folks in the tech and data space trying nonetheless.
The part of the population that identifies as female is not a minority… in fact it’s slightly over 50% and therefore a majority. For the sake of the rest of this post, let’s just say it is 50%, for statistical ease. From my time at university, I remember that about 50% of the year on the Mathematics course was female, but less than 5% of the Computer Science course was. There is no difference in the type of talent required for the two courses - therefore, the reasons for the variance in diversity are purely environmental. I hope that there aren’t men out there who think women don’t have the required natural talent to work in tech and data… although there almost certainly are...
So, essentially, if talent is shared evenly across biological sexes, then what sex a data person is should be able to be modelled by a fair coin. As we hire and build teams, over time they should end up 50% female… a Bernoulli process with p=0.5. Yet, as we know and have experienced, this is not the case. Throughout the time I’ve been hiring data teams, it has been a struggle to attract female candidates to apply for roles.
There are many reasons for this:
Shopping-list job descriptions, which have been proven to put off female candidates, who tend to think they need all of the skills mentioned and not just some. Male candidates don’t generally think this way. I’ve always just applied for roles I want anyhow, even if I only have half the list.
Office-first workplaces systematically discriminating against women who take the bulk of caring responsibility within society. (I am worried about the current pushes towards office-first by big tech, undermining progress in diversity of all kinds.)
Heavily or solely male teams being unattractive places for female candidates, who will have experienced poor cultures within these kinds of teams in the past.
Not enough candidates being produced by the education system who would naturally go into data roles, e.g. those with STEM backgrounds. I say naturally with purpose here: people don’t need STEM backgrounds to work in data. I have one and it’s been useful, but what I’ve learned can be learned later in life, too.
All of the above resulting in fewer female candidates in the overall talent pool to choose from. Harnham published their diversity survey results recently - it shows some worrying things, and even where there has been progress, it shows a long road ahead.
The UK saw a small rise in the percentage of female data professionals in the industry from 27% to 29%. Although just a modest increase, we are hopeful that this marks a turn in the stagnation of last year.
The US saw a 14% decrease from last year in the number of female data professionals, meaning that women now account for 22% of the US data workforce. This drop was more severe for women in their first role in data, decreasing by 36%, which is mirrored by the lack of female representation (12%) in entry level positions, explored later in this guide.
This roughly 80/20 split has been that way since I started my career; it hasn’t really improved. As the report alludes to (“the stagnation of last year”), progress doesn’t just move one way with the occasional pause; in some years it is reversed. That is why, over a long time, there hasn’t really been any progress at a macro level.
I think what we’ve seen are organisations who have chosen to address the issue head on, and had success doing so. I recently had the privilege of being part of a panel at Bumble, talking about data careers, and we touched on this topic there. Bumble has a data team that is greater than 50% female, which I think illustrates another point I have seen before: teams which have good diversity find it easy to maintain it and also are much more attractive to female candidates.
Let’s drill into how it goes wrong:
The first two times I built data teams, they were small teams. The first was a team of three (excluding me) and I had one female team member - no solid evidence of skew, but she was the most junior member of the team. The second time, it was a team of four and I managed to have one female team member most of the time - not great, but not terrible either.
The next time I built a data team was at a company that had a dichotomy where engineering (including data within it) was mostly male and every other department was mostly female. This led to the company, as a whole, being balanced, because engineering was about half the company. However, being balanced at the top level but unbalanced at department level can cause a big gender pay gap; engineers are often paid more than double what other roles are in tech companies. This illustrates another reason why balance in all departments/professions matters. Globally, women will be significantly financially worse off if they don’t have a fair share of tech and data roles.
The data team there was a team of two when I joined, both male. Things were challenging, though: I ended up needing to replace a senior team member quickly… I proactively chose someone from my first data team, which had only had senior male team members (compounding effect already starting). Then we started to hire in the more usual way, using LinkedIn and our careers page. What we found was that about 95% of applicants were male. Needing to bulk up the team, I made the next two hires from the same candidate funnel, both male.
Suddenly, we were a team of five people who were all male. If you flip a coin five times and it’s always heads, you can be pretty sure that it’s not a fair coin. This was not my intention and even though it was something I noticed, it wasn’t my primary preoccupation at work… I had lots of other things going on - not an excuse, but the reality. However, we had reached a team size where we weren’t painfully stretched by bursts of work and I had a bit more time to breathe and think. One of my team faithfully kept pointing out to me that we needed to address diversity in the future. It’s good to have people who are willing to persistently bring up issues that are being forgotten!
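The coin comparison above can be put into numbers. This is only a rough sketch: it assumes each hire is an independent random draw, which real hiring is not, and the function name `prob_all_male` is made up for illustration. It compares a fair coin with drawing straight from a funnel that is about 95% male, as ours was.

```python
from fractions import Fraction


def prob_all_male(team_size: int, p_male: Fraction) -> Fraction:
    """Probability that every one of `team_size` independent hires is male.

    Assumption (illustrative only): each hire is an independent draw
    with the same probability `p_male` of being male.
    """
    return p_male ** team_size


# Fair coin: talent is evenly spread and the funnel reflects that.
fair = prob_all_male(5, Fraction(1, 2))

# Skewed funnel: roughly 95% of applicants are male.
skewed = prob_all_male(5, Fraction(95, 100))

print(f"Fair coin, five male hires in a row: {fair} (~{float(fair):.0%})")
print(f"95% male funnel, five male hires in a row: ~{float(skewed):.0%}")
```

With a fair coin, five in a row happens 1 time in 32, or about 3%. With a 95% male funnel it is the most likely outcome by far, at roughly 77%. The skew is in the funnel, not in the individual hiring decisions.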
The fifth hire really made it hit home that something was wrong and we needed to fix it. I didn’t and don’t believe in all-female pipelines: if you reduce your candidate pool by half or more, it will reduce your quality, too, in time. What I wanted was a balanced pool, and it didn’t matter if I had to force it to be. I spoke to my Director of Recruitment about the problem and suggested an idea: if we didn’t have a balanced candidate pool making it to interview stage, we would use external agencies to provide only female candidates to compensate for our funnel being unbalanced. This is fair, because it brings the overall funnel back into balance.
There is extra cost with this forced funnel method, but it’s not for every hire, it’s only for the instances where the candidate hired is female and provided by agency. Let’s imagine this is for one third of hires and the agency usually charges 15% commission - this is an average agency fee per hire of 5%. 5% is not huge, and good agencies can often find you hires for slightly lower salaries than you might find via other means, as was the case the first time I used this method. The relevant male candidates who had applied were all asking for at least 10% more - as female candidates are, on average, paid less than male at the same skill level, it’s possible to hire with agency fees without necessarily paying more. I am speaking without ethical comment, as an engineer, who is applying engineering to improve diversity - I’m making the numbers work to solve the problem, in every sense. You might say it is wrong to knowingly hire a woman for less than equivalent male candidates are demanding. I would say that I’ve always tried to meet a candidate’s expectations when hiring, and that those are defined by them and not by the market or our salary bands. However, once they have been hired and come round to a pay review cycle, making sure they are paid appropriately to peers kicks in. The part of the budget used for agency fees in their first year can be used to increase their pay in the second year if needed, to put them in the right place.
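The fee arithmetic above, as a small sketch. The one-third share and 15% commission are the illustrative figures from the paragraph; the salary figure, the function names, and the assumption that commission is charged once on first-year salary are hypothetical and only there to make the sums concrete.

```python
def average_agency_fee(agency_share: float, commission: float) -> float:
    """Average agency fee per hire, as a fraction of salary, across all hires."""
    return agency_share * commission


def first_year_cost(salary: float, commission: float = 0.0) -> float:
    """Salary plus a one-off agency fee (assumed to be a % of first-year salary)."""
    return salary * (1 + commission)


# One third of hires come via agency at 15% commission.
avg_fee = average_agency_fee(agency_share=1 / 3, commission=0.15)
print(f"Average agency fee per hire: {avg_fee:.0%} of salary")

# Hypothetical salaries: the agency candidate's ask, and an organic
# candidate asking 10% more for the same role.
agency_salary = 100_000
organic_salary = agency_salary * 1.10

agency_first_year = first_year_cost(agency_salary, commission=0.15)
organic_first_year = first_year_cost(organic_salary)
print(f"Agency hire, first year:  {agency_first_year:,.0f}")
print(f"Organic hire, first year: {organic_first_year:,.0f}")
```

With these made-up inputs the one-off fee is a little larger than a 10% salary difference in year one, but the fee is paid once and the salary difference recurs every year. The larger the gap in salary expectations, the sooner the fee is covered.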
By the time I was able to use this method, we had managed to hire one female candidate - she had applied for an analytics engineer role, but we had an analyst role she was a better fit for, that hadn’t been opened yet. This is a way to use opportunistic hiring to improve diversity rather than harm it, as I’d done before. You could choose to only allow opportunistic hiring where it improves diversity. The role I first got to try the forced funnel method on was a lead analytics engineer role in my team.
The team had got larger by this point and I was looking to have three leads: a product manager, a lead analytics engineer and a director of analytics. All three, apart from the lead AE, had been filled by organic promotion from the original five male team members I had. The lead AE role was a new vacancy and so was the perfect moment to try the forced funnel method, especially as every other leadership role in data was held by a male team member.
As expected, nearly all of the organic candidates I had for the role were male. I had one female candidate through this route who was promising, but she didn’t want to do our code exercise - this was at a point where it was very much a candidate’s market. We had one more great female candidate, but, partway through the process, she realised she would rather not line manage and didn’t want to focus on analytics engineering. She had a great data + specific industry background and was more attracted to our treasure trove of industry-specific data than she was to the role. Once again, I made an opportunistic hire for a senior analyst role that wasn’t live yet. Doing this can be controversial at companies, but the cost of hiring is huge, so if you can hire multiple candidates from the same funnel, whilst also improving diversity, I would highly recommend it. With long notice periods in the UK being common, very few hires start soon enough to materially affect budgets for the year.
So, I reached out to my contacts who are recruitment consultants - there weren’t too many who specialised in analytics engineering at this point and I ended up speaking to Harry Gollop, who you may have met at London Analytics Engineering Meetups. I thought he may think it strange that I was asking for only female candidates, but I explained the whole funnel and how we had plenty of male candidates applying organically, and he quickly understood and got working.
Harry ended up sharing a couple of great candidates, but there was one that really stood out. Harry had been tracking her as a rising star for a while, and she had worked at a well-known MDS consultancy in London that I had nearly ended up working for in the past. I won’t go through every detail of the hiring process, but as you may have guessed - we hired her!
I loved having her in my team and we’ve stayed in touch since I left the company to enter startup land. She has recently been promoted to a Director-level role at the company 🫡🙌. I only found her because of using the forced funnel method. I have shown her this post and she is happy for it to go out, but to save legal complexities and approvals we decided to keep her name, and the company name, out of the post.
Once teams achieve diversity, as the team above had, this becomes self-sustaining. This was almost immediately evident when a further internal applicant to our team said she wouldn’t have joined if not for the diverse team we already had. Lack of diversity is like an inertia you need to overcome. Energy needs to be expended to overcome the inertia, and, in this case, the energy store is money. This is actually relatively convenient and easy to access - all recruitment has budget.
Why am I telling you this story? It’s repeatable. Anyone can do it. Take it, use it. If your HR/CXO won’t allow some agency fees to help overcome the inertia - diversity isn’t a priority for them 🤷‍♂️. The numbers work, because of the existing gender pay gap plus time saved in internal recruitment processes. There may not even be a true net cost.
This is more important than having diverse speakers at conferences… this is the industry. If we get this right, everything else will follow - the share of heads from a fair coin always converges on 50% over time. Then we will have a larger pool of female data leaders who have the authority, seniority and experience to draw upon to speak at conferences. I’ve found that, in order to improve diversity in speakers at conferences, speaking slots can end up being given to female speakers who can be relatively new to the field - there is nothing wrong with this, but it would be better if there was a balance of female speakers of varying seniority, across both IC and EM roles, rather than a skew. If we work towards more women having leadership positions in data, we automatically build this balance and we stop struggling to find great female speakers for conferences - they will be as abundant as male ones.
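To make the fair-coin point concrete, here is a minimal sketch that computes exactly how likely it is that the share of "heads" lands between 40% and 60% as the number of flips grows. The 40-60% band and the function name `prob_share_within` are arbitrary choices for illustration, and treating hires as independent fair coin flips is the same simplification used throughout this post.

```python
from math import comb


def prob_share_within(n: int, low: float, high: float, p: float = 0.5) -> float:
    """Exact binomial probability that the share of heads in n flips is in [low, high]."""
    total = 0.0
    for k in range(n + 1):
        if low <= k / n <= high:
            # P(exactly k heads in n flips) for a coin with P(heads) = p
            total += comb(n, k) * p**k * (1 - p) ** (n - k)
    return total


for n in (10, 100, 1000):
    share = prob_share_within(n, 0.4, 0.6)
    print(f"{n:>4} flips: P(40% <= share of heads <= 60%) = {share:.4f}")
```

For ten flips the probability is about two in three; by a hundred flips it is above 95%, and by a thousand it is all but certain. An industry-sized population that stays at roughly 80/20 is therefore not what a fair coin produces.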
As technologists, we’ve solved everything else with mathematics… why not this, too? We don’t appeal to a data pipeline to fix itself naturally, we tackle it with force and intent, a problem to be resolved, and soon. The first step to addressing diversity is acknowledging that it is a problem to be solved; a problem that isn’t to be solved is just one bemoaned and forgotten about. It needs to be as real as a written company goal, OKR or Jira Epic.
I also genuinely believe that if we tackle female representation, we will go a long way to solving for other diversity, too. Teams that have good gender diversity will be less likely to unfairly discriminate against minorities than all-male teams. When the team is more diverse, the pool of people who will “fit in” with the rest of the team is wider. I have no evidence to hand for this, but I’m sure research exists to support it, and I would bet my bottom dollar on it.
I haven’t strayed into ethics much in this post, because time has shown it’s not very effective as a reason to persuade company leadership to change policy. Profit is a much better reason, and one they can’t ignore - it’s the purpose of their existence. 15% might not sound like a lot to people in the VC space, but a 15% increase in efficiency, revenue, margin or any competitive edge is huge. It can be the difference between going under and winning the market in many tightly-fought industries. I wish that this reasoning was used more for an investment case into diversity initiatives. We want that 15% to be rattling around CXO minds, we want them to be thinking: “I want my 15%”. In most cases, the money carrot is greater than the ethical stick.
I asked Lindsay if she would like to add her perspective:
Firstly, kudos to David for inviting me to be a guest on this blog post. I’m happy that a LinkedIn rant in a moment of frustration could turn into a longer form of discussion that hopefully is a little more productive :)
I also appreciate David for sharing his experiences about working to improve the critical issue of diversity in the data industry. The tactics he shared are not ones I have tried myself before, so it's refreshing to see methods used to actively address this problem head-on that have worked. This kind of work is important to arm other hiring managers looking to grow a diverse data team. Outside of the tactics shared, I resonate with many points in David’s post, and it's clear that improving diversity is a shared concern.
While I agree that hiring managers and HR teams can invest more to help hire a diverse team as a way to work through this issue, I’d argue that only addresses the issue at one end of the funnel (from the bottom up). Unfortunately, being able to hire diversely requires balance at the entry level. Based on the Harnham study that David shared above, we’re not seeing that balance–the US saw a 14% decrease from last year in the number of female data professionals, meaning that women now account for 22% of the US data workforce. This drop was more severe for women in their first role in data, decreasing by 36%, which is mirrored by the lack of female representation (12%) in entry-level positions (Harnham Diversity in Data Report 2023).
As my ranty post suggests–I feel pretty strongly that conferences, networking events, and podcasts play a pivotal role in shaping industry narratives. This is because these types of events give the industry access to role models. Role models play a crucial role in helping to improve representation, by giving underrepresented groups someone to look up to who looks like them and they can relate to. David mentioned how Bumble has a data team with over 50% female representation–I don’t find this surprising at all, as Bumble’s founder and original CEO, Whitney Wolfe Herd, infused DEI principles into how the company operates. Bumble is known for being one of the most representative companies in tech–which not only speaks to the importance of women in leadership roles and their impact on diversity within tech but also how these individuals can serve as role models for the rest of the industry. If the industry focuses more on showcasing diverse groups of people at events like conferences, this serves as a top-down approach to improving the lack of diversity in the data industry. By highlighting role models at conferences, we drive more awareness and visibility–and hopefully inspire the industry about what our world could look like.
Yes, it may be difficult to fill conference speaker spots with a diverse group of speakers when there is underrepresentation in the industry. So difficult in fact, that some conference organizers just made women up so they could appear to be more inclusive. However, conferences like DataConnect, hosted by Women in Analytics, demonstrate that it is very possible to run a conference that has exclusively people from underrepresented groups as speakers–and has an impressive lineup of very qualified speakers each year. Just as David has called out in his diverse hiring strategies, DataConnect takes a similar approach of actively investing to get an all-female speaker lineup: they reach out to and recruit women, they provide speaker training and support to ensure high-quality talks, and they sponsor speaker accommodations to make attendance more feasible. If other large data conferences invested in building diverse speaker lineups in the same way, as well as choosing more accommodating destinations (*cough cough*–not Las Vegas), I’m sure we would see more equitable representation.
It's heartening to see more discussions about improving diversity in data–my new podcast, "Women Lead Data," is diving directly into the gender gap we see in the data industry, specifically targeted at the leadership level. I encourage you to take a listen and share with your peers in the data industry. I’m optimistic to see what 2024 brings and I hope to see more championing of diversity at all levels in the data industry.
Lindsay’s logic around role models and a top-down approach is compelling, and I do believe we should do this at the same time as tackling diversity inside data teams.
Professions like accounting are doing much better than data on diversity, despite there being no inherent difference in the talent needed to be good at either. I think data should aspire to be on par with accounting here.
Please feel free to reach out to either of us if you have questions or comments on this topic!