State of the Union All

Some 1st party data about data folks

Sep 27, 2023

I hosted London Analytics Meetup #5 today and had some great talks, which I’ll cover later in this post.

When I created the Meetup invite this time, I created a Google Form survey to get to know the people who attend my Meetup a bit better than I can just by mingling. This is the form and if you work in data, please feel free to fill it out.

The questions on the form cover some topics which I am using to help me make future Meetups even more relevant for the people attending.

I think the data collected is very interesting and telling about the state of the data industry and tooling landscape.

First of all, for context, I asked which company they worked for.

There was a real mix of tech companies of various sizes from Google, Dropbox, Skyscanner, Checkout, Spotify, Zilch… all the way to corporates like JPMorgan, Lloyds Banking Group, Experian and Reckitt Benckiser. An even mix of people from both groups is an over-representation of tech and startups, but I do feel there is decent representation in the group.

I also asked about what their job titles were:

Forms response chart. Question title: What is your job title?. Number of responses: 59 responses.

Essentially, what I found from this is that there is a broad range of people in the Meetup group from all walks of life, but also that there is a huge number of data roles in existence today including: Data Engineer, Analytics Engineer, Data Analyst, Data Platform Engineer, Data Scientist, Head of Data, Product Analyst, Marketing Analyst, BI Developer… plus many more. I wonder if more standardisation in data role titles could be helpful? Do we need both BI Developer and Analytics Engineer titles to exist in the future?

The other interesting thing from the roles is that most SWE roles you could think of are represented here. Is this because there is a draw into Data from SWE? There certainly has been in the past. Could it also represent a convergence between some DE roles and SWE roles?

When I asked about the biggest problem they currently face at work, I offered a fairly broad range of answers:

Forms response chart. Question title: Broadly related to data, what is the biggest problem you're facing at the moment at work?. Number of responses: 59 responses.

Data quality remains a pervasive problem, larger even than ad-hoc data monkey work, but I’m glad that we’re already trying to solve the no. 2 problem!

Poor tooling is also a big problem, which aligns with the huge amount of VC investment into the data space over recent years.

Lack of mentoring is equally large, and something I feel is so important. I stay closely in touch with the last data team I built - it’s very easy to become the most senior person in data but without being senior in your org. It can be a lonely place, as I have experienced.

I’ve decided to become a mentor at Data Action Network to do more to alleviate this problem. Thanks to Sebastian Hewing for setting up this organisation! This is one problem where I don’t think tech can help, beyond providing us a framework for being organised and connected.

You don’t need to commit loads of time; it can be as little as an hour a month.

Finally, lack of budget is, as expected, an issue. Whether it’s the current economic climate or otherwise… it’s hard to say without having also run this survey a couple of years ago. However, I’m sure it’s a factor, with many orgs I know tightening budgets.

I also asked which BI tool they use:

Forms response chart. Question title: What BI/notebook tools do you use?. Number of responses: 59 responses.

The responses given here persuade me that the group is quite representative, as we know that Power BI and Tableau are by far the most popular BI tools in the world, with over 50% market share together - this is illustrated here. Looker’s high share (and even Hex’s 5%) is probably a skew caused by Modern Data Stack teams being more likely to be members.

I also asked which data warehouse/s they use:

Forms response chart. Question title: What do you use for a data warehouse at work?. Number of responses: 59 responses.

This is pretty much what I expected, although it was traumatic to see how many teams still have to use SQL Server (and probably on-prem to boot). It was the “data warehouse” I used the most in my career before using MDS tools. I will not go back to it.

Finally, I asked if they use a semantic layer and, if so, which one:

Forms response chart. Question title: Do you use a semantic layer and if so which one?. Number of responses: 62 responses.

This was so telling and relevant to what we’re building at Delphi… most teams don’t use a semantic layer. I met a very advanced large data team with great data and analytics engineers who don’t use one. I think it shows huge opportunity and room to proliferate in the future. Over 37% of teams said they use Power BI, and less than 2% use MDX with it.

It was also interesting that the dbt Semantic Layer (this will be the legacy one, as there hasn’t been time to adopt MetricFlow) is as popular as LookML. I wonder if there is some conflation with dbt-core going on here. Also, the LookML % here is less than half of the % that said they used Looker as a BI tool. This suggests that there are many users or analysts using Looker in the group who are unaware of LookML. What will these numbers look like next year, as dbt Labs make a big push to distribute MetricFlow?

London Analytics Meetup #5 talks:

Maiara Reinaldo of Funding Circle

Maiara spoke about the data architecture at Funding Circle, including their compartmentalised running of dbt models as individual nodes in Airflow. She went further, to speak about their use of DataHub (an OSS data catalog) to help colleagues, both technical and not, to discover data assets they need for work.
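For anyone wondering what "dbt models as individual nodes" might look like in practice, here is a minimal sketch of the general idea. It is not Funding Circle's implementation, which I haven't seen. It assumes a trimmed-down, made-up version of the `nodes` section of dbt's `manifest.json` (the project name `demo` and the model names are hypothetical), and it only works out one task per model plus its upstream tasks, using Python's standard `graphlib`. Wiring the result into an actual Airflow DAG is left out.

```python
from graphlib import TopologicalSorter

# Hypothetical stand-in for the "nodes" section of a dbt manifest.json.
# Only the fields this sketch needs are included.
MANIFEST_NODES = {
    "model.demo.stg_loans": {
        "resource_type": "model",
        "name": "stg_loans",
        "depends_on": {"nodes": ["source.demo.raw.loans"]},
    },
    "model.demo.stg_borrowers": {
        "resource_type": "model",
        "name": "stg_borrowers",
        "depends_on": {"nodes": ["source.demo.raw.borrowers"]},
    },
    "model.demo.fct_repayments": {
        "resource_type": "model",
        "name": "fct_repayments",
        "depends_on": {
            "nodes": ["model.demo.stg_loans", "model.demo.stg_borrowers"]
        },
    },
}


def model_dependencies(nodes):
    """Map each model's unique id to the set of upstream models it depends on."""
    models = sorted(
        uid for uid, node in nodes.items() if node["resource_type"] == "model"
    )
    model_set = set(models)
    # Sources and other non-model nodes are dropped: they are not tasks here.
    return {
        uid: {dep for dep in nodes[uid]["depends_on"]["nodes"] if dep in model_set}
        for uid in models
    }


def per_model_tasks(nodes):
    """Return (task_id, command, upstream_task_ids) tuples in a valid run order."""
    deps = model_dependencies(nodes)
    tasks = []
    for uid in TopologicalSorter(deps).static_order():
        name = nodes[uid]["name"]
        upstream = sorted(nodes[dep]["name"] for dep in deps[uid])
        tasks.append((name, f"dbt run --select {name}", upstream))
    return tasks


if __name__ == "__main__":
    for task_id, command, upstream in per_model_tasks(MANIFEST_NODES):
        print(task_id, "|", command, "| after:", upstream)
```

Each tuple would become one task in the orchestrator, with the upstream list giving its dependencies. The appeal of this shape is that a single failed model can be retried on its own rather than re-running the whole dbt project.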

Adam Timlett of Turing Meta

This was probably one of the most esoteric talks I’ve heard at a data meetup or conference. Adam’s talk about organisational analytics was truly unique! I don’t think I can do it justice in summary, so watch out for the recording here.

George Apps of TravelPerk

George spoke about the differences between the individual contributor and manager path in data, and his own experiences of both. The key takeaway for me is that not everyone in data is suited to be a manager or an IC. Know thyself and choose what you want to do accordingly.

Thanks to everyone who came, it was great fun to host and to attend! Thanks also to Funding Circle for being our gracious hosts!

davidj.substack
