This week’s guest is Ilan Man, VP of Data & Analytics at Brooklyn Data. Ilan is also a principal contributor/editor of Locally Optimistic, of which many of you will be a part.
I started speaking to Ilan at the end of 2021 on the Locally Optimistic Slack workspace, and we’ve since managed to meet in person in New Orleans, at dbt Coalesce 2022.
Ilan’s experience, both in-house and as a consultant at Brooklyn Data, has been very helpful to me, and I’ve asked him questions about operating data at scale, or in difficult scenarios, that I haven’t personally experienced.
Introduction
Education
I graduated with a statistics and actuarial science degree from the University of Toronto (I grew up in Toronto). I originally enrolled in computer engineering, but quickly found out I wasn't nearly as good at programming as my peers, though I was always good at math. Someone introduced me to the actuary profession and it seemed like a reasonable enough career path. Fast forward 8 years into my career, and I took a break to get a Master's in statistics from Duke University. So, my background is much more on the stats side than on engineering or product or management.
[David Jayatillake - What made you want to go do the Master's? Especially after being relatively established in your career?
Ilan Man - A couple of things:
As I learned more about data science (that pivot came about 6 years into my actuarial career), I realized that the statistical knowledge and applications I had used up until then weren't the same, and, frankly, were much more superficial than what data scientists were doing in the early 2010s.
Data scientists were building things from scratch, as certain libraries (like scikit or statsmodels) were nascent and, in general, stakeholders weren't familiar with concepts that some of us take for granted today, such as A/B testing, or Bayesian thinking, or how to think statistically (distributions, tails, etc). I thought that if I were to grow my career in this direction, I'd be spending a bunch of time developing junior data scientists, and also explaining to non-technical folks some of these core concepts.
Knowing myself, I need to deeply understand something if I'm going to put my name down beside it, so I wanted to learn from the experts in academia and then apply that to the business domain. Sort of a 1 step back, 2 steps forward thing. The jury is still out on whether that plan worked!
I wanted a temporary change of scenery from New York, and some things in my personal life aligned timing-wise to make space for a new experience. So, this was also a catalyst for that.]
How you came to data in your career
I started my career as an actuary, which had a nice blend of stats, a business domain (insurance, finance, risk) and lots and lots of spreadsheets. After a few years, it was clear that this wasn’t a career path I was interested in (too corporate, stuffy, and much less technical than I was hoping it would be) and that’s when I happened upon data science, around 2012. Once I learned more about what data scientists do (and at that time there weren’t too many), I realized it was very similar to what I did as an actuary, but they worked in code, used MacBooks, and worked at startups, unlike the actuarial world I was familiar with. I decided this was definitely the next step for me. A little “fake it till you make it” later… and, well, I’m still not sure what’s on the other side of that, but it’s been a fun ride!
An overview of the places you have worked in data and what roles you have held
Since pivoting out of being an actuary (after spending a few years at a large German reinsurer and a couple as a consultant at EY), I have held data positions of increasing scope and complexity as an analyst, a data scientist and a head of data - the latter across 3 different tech startups. Aside from a few poor decisions here and there, I spent meaningful time as a data professional at Squarespace, Paperless Post, Trialspark, and presently at Brooklyn Data. It’s funny that my last job as an actuary was as a consultant at EY, and my current job is also at a consultancy, though my jobs and the companies are polar opposites.
If you’ve moved on from a data practitioner role - when, where and why?
This part is kind of interesting, because while I certainly am working in the business of capital “D” Data, I’ve never been further away from the data itself. This does relate to an interesting question around where folks who have been in data leadership roles go next, but that’s a conversation for another post, and one I know you’ve explored.
[David Jayatillake - I know you say that the conversation around the "missing data executive" is for another post/conversation, but how do you feel personally about moving away from doing the data work yourself and being so far away from it? What does your current role actually entail on a day to day basis?
Ilan Man - I feel conflicted. I really love digging into data, in particular on the statistical modeling side. I care less about the algorithms (it's all LLMs going forward anyway, right?) and much more about translating those X matrices and y vectors into the application's domain. I love it when I explain that a certain variable means ‘cost per unit’, and the stakeholder has a lightbulb moment and says "Oh, that's ‘affordability’!" and ‘affordability’ is a term that's important for them, and means something. And then they start to engage with the model in a way that helps build their intuition (assuming the model is built well).
That said, this is probably about 10% of my day. The rest is helping to build Brooklyn Data into the best data consultancy out there. This looks like internal meetings and strategic alignment sessions, business development and sales, executive relationship management, and being a good partner for my peers and team by supporting them however they need it. I really enjoy a lot of the aspects of my job, even if they aren't digging into the data, since I think I have one of the most experientially-diverse (and in that way rewarding) jobs out there.]
Human Interfaces
Broadly speaking (and this is a cliché, I’m sure), I’ve been thinking more and more about the end results of our work as data people than ever before. There are 3 versions of this that we can discuss:
Focusing on the ‘what’ instead of the ‘how’
A simple framework for decision making (consultants love frameworks)
Focusing on outputs instead of inputs
The ‘what’ instead of the ‘how’
When I was a junior professional, focused on technical rigor and velocity, I spent a lot of time considering how things are made (see above: going back to get a Master's to re-learn regression). It was critical for me to explain why I chose the models I chose, why the formulas were what they were, some other approaches I would experiment with, etc. Really paint the picture and instill confidence that my approach was sound. And my presentations would over-index on the model development process, the EDA I did, anything interesting I found, etc. And at the end of the presentation, there might be some findings, next steps, perhaps even a recommendation. Later in my career, I realized that my audience, increasingly senior non-technical leaders, wasn’t interested in 90% of the content, which was focused on how I got to where I did and why they should care. They wanted to know what the output was, and what it meant for the business.
Another thing I learned – don’t assume people understand what the implications of your model or project are. You should always guide them, and frame it as “here’s what this means for you”, balancing TMI on the one hand, and talking down to people on the other; a tricky needle I’m continuing to learn to thread.
[David Jayatillake - How does this translate to a typical post-engagement presentation at Brooklyn Data? Do you ever just start at the end - "show me the money" - then explain, if wanted, from the appendices?
Ilan Man - I wouldn't say we have nailed the template just yet as a company, as we're always iterating through new challenges and environments. But personally, I like to open with the upshot / tldr / exec summary, keep it high level throughout, and go to appendices if needed.
But before the presentation, it’s critical to understand the audience, their expectations of the presentation, and then for you to set their expectations ahead of time. There is so much to be gained from early alignment, pre-presentation development, that will make it much easier to produce the right presentation or artifact. The appendices, then, are just support in case things go sideways, not because you intend to use them. If it's a technical talk, great, let's get into it. If it’s not, then don't plan on it. As with many things, when you try to do it all, you end up doing it all poorly.]
Decision making framework
A couple of jobs ago, I was developing OKRs for the Data team (my second stint as a head of data) and got really good feedback from an Operations leader on how they produce their team’s OKRs. They effectively used 3 dimensions (Growth, Cost and Quality) to slot every large initiative into. I started to think more deeply about this simple (and, to me, novel) framework, and decided that everything we did in Data should be framed this way. Each of these dimensions will mean something different depending on your industry or organizational maturity, but it’s a nice framework that strikes the balance between simple and exhaustive. Growth is related to things like revenue, market share, subscribers, etc. Top line stuff. Cost is related to margin, efficiency, time to market, etc. And Quality is sort of the “how are you doing it?”: security, risk management, % uptime, customer NPS, etc. This doesn’t cover the universe of goals, but the dimensions are largely distinct (though they obviously overlap at the edges) and everybody can reason about them.
[David Jayatillake - I understand the dimensions, but how did you measure these for an individual data project on the backlog? How did you use some kind of balanced score for prioritising what to work on?
Ilan Man - I never actually used the framework to prioritize individual projects, but I could have. It was a way to organize OKRs, so that others in the org could reason about our work. It's a big laddering exercise: does the project you're working on ladder up, at some point, into a larger, org-wide objective?
For example, paying back tech debt¹ rarely gets prioritized, because it’s never obviously urgent and it’s only theoretically important (and even then, only to folks on the technical team). Developers don't often quantify the impact of paying back tech debt. But if you did, and you could point to performance optimizations that decrease consumption (which may ladder into a cost OKR) or a refactor that increases developer velocity and can help expedite a new feature (which may ladder into a growth OKR), then you may get that tech debt epic prioritized.
At the very least, you could use the framework as a way to communicate the impact of your projects to folks outside your team in a narrative format which is often what gets initiatives funded.]
Focusing on outputs
Finally, another framework, or mental model, that I’ve been using for the past four years or so has been to focus on outputs as opposed to inputs; again, very consistent with the items above. This is a controversial opinion to hold in the data community, where folks are really invested in understanding the inputs to their models, because that’s where they have control. That’s what they can explain. There are tighter feedback loops and they can more easily build a roadmap for their team. I totally get it. I used to focus my efforts on nailing the inputs, too, and, if I did my job well, improving the inputs should have translated to better outputs. However, in more conversations and organizational navigation with higher-ups, it became clear that they were all laser-focused on the outputs / outcomes.
In general, executives care less about what you have control over (which often leads to excuses about why targets were missed or who is responsible), and they care more that you get the job done. After all, the leadership team is the one cutting the checks, signing off on budgets and initiatives, and raising funds for your VC-backed company.
What’s more, focusing only on your “span of control” is a great way to get boxed into a small scope and surface area at your company. That’s exactly where leadership will put you, and then look elsewhere for a partner who can look beyond their scope to achieve the business’s targets. Instead of focusing on what you can control, align yourself and your team to an output metric - like your Sales, Marketing and Product colleagues do. Force yourself into the conversation that can expedite your company’s roadmap by tying the team more closely to a company-level objective. Earn your right to be at the table, not because you own the data, but because your leaders trust that you can get it done. In this way, you can transcend being a “data leader” into being a “leader”, who also happens to be able to leverage the organization’s most valuable asset: its data.
Using the above framing has helped me grow myself professionally in the direction I wanted, while also elevating the data team's profile.
[David Jayatillake - This is definitely something I can identify with and have written about before - data is your input to drive output from. How do you feel data folks can get comfortable with this idea and start actually living it for the mutual benefit of them and their company? You and I both have some hybrid business and data training in the form of accountancy and actuarial training, respectively. Benn's post the week before last was interesting in this regard, too. Do you think data folks need some business training to do this, or can it be better learned another way?
Ilan Man - The best data people I've ever managed got deep with their business counterparts. They sat beside the customer experience team, they attended product sprint planning sessions, they shadowed operators as they navigated their work. In addition, I think it's critical for data people to learn the basics of finance and accounting. It's shocking how seemingly simple FP&A models are compared to some fancy data science models, and yet these models are easily the most important to the organization. Benn did a nice presentation at last year's Coalesce about learning the operating model of your company. I think that's a great starting point to learn what actually matters.]
One last thing I’d add that I want data professionals to start doing more of is being the translation layer from tech to business. In particular, folks who are steeped in the modern data stack often talk about improved data quality, optimized performance, and flexible tooling, all of which need to be translated into something that matters for the folks allocating budgets (i.e. CFOs and other C-levels).
Take data quality, for example – I often hear data folks talking up their favorite tools or new development methods in order to improve data quality, as if that’s something that is self-evidently good and, more to the point, worth investing in. What impact does improved data quality have on a company trying to raise a Series B next quarter, which is where everyone's head is at? Maybe don't bring up new tooling at this time, though most VC-backed companies are in some ways always raising, so you might as well make it relevant. When C-levels want to share with their board or peers that their company is data-driven (sorry, people outside of data really like saying that), how does improving data quality help tell that story? If they didn’t invest in data quality, how much slower would their development velocity be, and what product features would be at risk of not being delivered as a result? I’d like more data professionals to extend their thinking and framing in a way that their business leaders will grok. When we get better at that, then I believe more folks outside of data will better understand what we’re doing here and why we deserve a seat at the table.
[David Jayatillake - Data quality is still the biggest issue facing data teams (I recently surveyed my meetup group and it came out on top). How would you or Brooklyn Data provide a measurable, actionable business case to invest in or protect data quality?
Ilan Man - Aside: I wonder if data quality is the biggest issue facing data teams, or just the biggest issue as far as they know. Would non-data folks at their company also rate data quality as the biggest issue? Or would they rate something else, perhaps a derivative of data quality, or something else entirely (e.g. “my data team is so slow to respond to my urgent requests”, to which the data leader adds, “as a result of poor data quality”).
To motivate a data quality investment, I would use the framework above. How can better data quality improve data trustworthiness and, for example, democratize it more? What if we picked a few measures, such as “# of people using key dashboards”, “internal stakeholder NPS” and “# of developer hours spent debugging bad code”, and tracked them over the course of 6 months, before and after implementing some tool or system?
That said, the business context and audience matter. For example, some leaders care mostly about changing their data cultures. This is harder to quantify, but we can make the case that data quality helps to improve the software development culture by making it easier to debug DAGs, which means developers are less afraid to touch code they aren't familiar with and are more motivated to fix errors, and so on. There's a really compelling story you can tell here.]
What are you doing now in your current role, that helps make human interfaces in data better?
One of the things that I really like in this role, working with customers across different industries, sizes and maturity levels, is that I get to work with a really smart team to pick and choose the tools and processes and frameworks that fit specific client needs. And it turns out there is no one size fits all. There might be a tool that, on average, is the best one in the market, but might be the wrong fit for a given client. So, the job is always evolving. What’s more, something that was the gold standard 12 months ago might not be the gold standard today, so you have to always be on top of best practices, to really understand the nuances and, ultimately, and most importantly, to understand what your customers want and what gets them there. There are many instances where what the client needs is not necessarily the best (whatever ‘best’ means in this context) tool for the job; the world is much messier than that, and that's what makes this job fun!
¹ Side note: I appreciate folks are trying to do away with tech debt as a catch-all description of those "bad" bits of engineering work (shout out to Raphi, a former senior engineer on my team, for helping to raise my consciousness) but I'll use it here for simplicity.