Last year, I wrote a post by this name with very different hopes for the Snowflake Summit.
I was hoping for:
Query Acceleration Service or some other form of automated efficiency to be released
Some true product innovation
I certainly got some of the latter, but not much of the former.
It never even occurred to me that AI would be front and centre of the agenda this year, let alone that I would be a co-founder at an AI-native startup. I thought we’d be hearing more about Snowpark, lower-latency Unistore, a broader Data Cloud accessing backend data from more platforms… maybe even Query Acceleration Service and some other features to help drive efficient use of Snowflake, given the economic climate. Yet here we are:
There is probably no-one more enthusiastic about AI on this planet than Jensen Huang, and no wonder, given it’s made Nvidia more valuable than Facebook, able to climb in a market trending down to a market cap of over $1Tn. Is FAANG over? Is this the era of NAANG?
You might say that Nvidia is quite overcooked at the moment and that, with AMD looming with a CUDApetitor soon, it’s peaked. I wouldn’t bet on that though - every analyst out there is still saying buy buy buy on NVDA. However, even without Jensen Huang on the stage, there is plenty more LLM enthusiasm to go around.
As far as I know, Snowflake does not use Nvidia GPUs in any of their warehouses (at least the ones mere mortals have access to) - is this about to change?
How will this take shape?
Will there be new GPU warehouses in public or private preview? What will their use case be? I would probably bet against this: it’s much more of a Databricks play to provide the infrastructure for customers to build what they want, rather than the Snowflake way of providing a service.
Could there be LLMs available on Snowflake, usable with special GPU Snowflake credits (or just regular ones)? Could you also fine-tune LLMs on your company data to meet the typical product and marketing use cases that Snowflake has always supported? How would this be implemented?
Will Snowflake compete with text-to-SQL companies, by providing the very best text-to-SQL experience for Snowflake?
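If Snowflake did ship a text-to-SQL experience, the core loop would presumably look something like the sketch below: hand an LLM the schema plus the user’s question, get SQL back. This is purely my own guess at the shape of it — `build_prompt`, `call_llm`, and the `orders` table are all hypothetical names, and the stub stands in for whatever model endpoint would actually be used.

```python
def build_prompt(schema_ddl: str, question: str) -> str:
    """Combine the table DDL and the user's question into one LLM prompt."""
    return (
        "You write Snowflake SQL.\n"
        f"Schema:\n{schema_ddl}\n"
        f"Question: {question}\n"
        "SQL:"
    )

def text_to_sql(schema_ddl: str, question: str, call_llm) -> str:
    """Ask the model (via the injected `call_llm` callable) for a query."""
    prompt = build_prompt(schema_ddl, question)
    return call_llm(prompt).strip()

# Usage with a canned stub standing in for a real model endpoint:
stub = lambda prompt: "SELECT region, SUM(amount) FROM orders GROUP BY region"
sql = text_to_sql(
    "CREATE TABLE orders (region TEXT, amount NUMBER);",
    "Total sales by region?",
    stub,
)
print(sql)
```

The hard parts in practice — schema selection for wide warehouses, validating the generated SQL, and grounding in business definitions — are exactly where vendors would differentiate.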
I’ll check back in next week with my post conference write-up.
It is possible that, whilst I believe AI will have a positive impact in the grand scheme, we will see AI disruption in the next year that renders this kind of conference less important in 2024. Is this the last big conference of the post Big Data/pre AI era in Data? Let’s make it a good one, then.
Hopes
This year, I have a different set of hopes to last year:
To catch up with as many of you as possible! If you’re coming and want to catch up, ping me!
To meet folks interested in what we’re building at Delphi.
To meet our great partners, and perhaps also some new ones.
Here’s where I’ll be:
Saturday/Sunday - chilling around Vegas
Monday day - meeting up with partners, hoping to catch the Keynote of the day
Monday night - Ethan Aaron’s not-so-low-key Data Happy Hour at Beer Park with Portable, Metaplane, RudderStack, Brooklyn Data Co. (a Velir company), Striim and DataGalaxy. Sign up here: https://docs.google.com/forms/d/e/1FAIpQLSdQVtx2cS5sWd6jxViZBS76MqnPew-k49bg-DvpgvHD_2v0dg/viewform
Tuesday morning - keynote, but I’m resigned to being in the overflow tent again in all likelihood.
Tuesday afternoon:
Tuesday night:
I can’t not stop by Metaplane, Secoda and Explo’s Chill and Grill happy hour! Sign up here: https://forms.gle/zV29NMgSZiNndab27
dbt Labs, Hex and pH Data’s rooftop social at Beer Park - sign up here: https://www.getdbt.com/resources/rooftopsocial/
Wednesday - I’m purposefully leaving a lot of time to meet folks on Wednesday, but also:
2:15 to 3pm PT - Michael and I will be doing a Delphi demo at Cube’s booth, #1753
Then perhaps one last thing in the evening, before I head to the airport!
London Analytics Meetup #4
I hosted the fourth Meetup of this series with Andrew Jones of GoCardless, at their office near Old Street.
We had some great talks and here are the slides and the video of the presentations.
Emanuela Ciotti - Analytics Engineering Lead at Dojo - Centralised Behavioural Models: Empowering Analytics Engineers & Product Insights
This was a really interesting talk that blended organisational structure with data modelling according to domain, custom dbt macros (that were requested to be open-sourced during the talk!), event tracking and triples (who, where, what) and even a Sankey chart on product user flow!
Andrew Jones, hero of Data Contracts, on Data Contracts. The four principles he went through were really helpful to frame this topic. Andrew’s book on Data Contracts is out now!
Last up was Vignesh Ganesan of e6data. This was a really topical vendor presentation in this era of cloud and DWH cost saving. Vignesh ran a live demo in which e6data absolutely smoked a slightly higher-resourced Databricks Photon cluster - so much so that he answered multiple audience questions between when e6data finished the query set and when the Databricks cluster did. It was over 95% faster and about 4% of the cost for the same workload! 🤯
e6data doesn’t do writes/merges (yet) - it’s read-only, but it can work on any data lake that uses typical file formats and partitioning, most commonly Parquet. Firebolt’s lack of these features caused issues for modern ELT, as storage had to be in its proprietary F3 format. I don’t think this is as problematic for e6data: you can use whatever you want for ELT on your data lake, and e6data can then serve high-concurrency, fast and cheap reads without needing to move data. BI tools like Looker generate a huge number of exactly these kinds of queries.