Soon-Ee Cheah on data & AI in 2024 and beyond
"In every ML/AI product I’ve developed, off-the-shelf models with state of the art performance on public benchmarks have produced unusable results internally."
One of the best hires I made at Xero was luring Soon-Ee Cheah away from an amazingly productive stint at Zendesk to lead my AI Products team. Originally trained as a pharmacist, Soon-Ee eats complicated AI research papers for breakfast and communicates terribly complicated mathematical concepts in a lucid, understandable and business-focussed way that is unparalleled. Just ask him to tell you how AI is like a pink banana. If you ever get the chance to work with him, jump at it. Enjoy our conversation!
Soon-Ee, 2024 was another busy year for data & AI. What’s one development / milestone / news story that really caught your eye?
Witnessing the GenAI arms race between OpenAI, Google, Anthropic, and the open-source community has been a source of fascination for me. Coming from the world of drug development, I’ve found the parallels in how research and development races unfold in technology and pharma uncanny, with some key differences (e.g. enforceable patents, mature regulatory frameworks). But what’s really caught my eye is the long-term advantage of vertical integration in both fields: that is, owning the machinery to take an interesting idea and scale it to meet the commercially relevant demands of consumers and regulators.
In drug development, biotechnology companies with great ideas and interesting molecules come and go, but it almost invariably takes the might of big pharma to navigate the enormous logistics exercise of having a drug trialled (and trialled, and trialled), registered, made at scale, distributed, marketed, and ultimately administered to patients. So it seems with AI. Entering 2024, the narrative was that Google had been caught flat-footed and was running well behind OpenAI. As we enter 2025, the playing field has evened out and the impact of vertical integration is starting to be felt. In video generation, Google has emerged as a leader, showing the might of owning data centres, AI-specific silicon (TPUs), and AI research teams (DeepMind), coupled with billions of users globally across its suite of products and services.
You’ve been working in and around data & AI for a while now. Many things have changed! But tell us about something that was true when you started out in this space and is still important today.
System evaluation (or “Evals”) on data from your specific domain is both critical and hard. In every ML/AI product I’ve developed, off-the-shelf models with state-of-the-art performance on public benchmarks have produced unusable results internally. It’s also entirely possible that I am desperately unlucky when it comes to picking problems!
In this era of GenAI product development, designing “good” evaluation suites has become even more critical due to the degrees of freedom in the outputs of these models. Often, the goodness of outputs is subjective, and not even reproducible, thanks to the way temperature and sampling affect the results. Where previously teams could (somewhat) rely on precision, recall, etc. to score iterations of their ML models, with GenAI we have to contend with nebulous concepts like “vibe checks” or “groundedness”. Add to this the fact that evals are becoming incredibly expensive, often requiring human scorers to build confidence in a result. From a commercial perspective, evals becoming trickier is a bad thing: every dollar spent understanding an eval is a dollar not spent improving the product – but you can’t improve a product without a good eval.
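To make the reproducibility point concrete, here is a minimal, purely illustrative Python sketch. The model, prompt, and candidate answers are invented for the example (nothing here reflects a real system or benchmark); the point is simply that once sampling is stochastic, two “identical” eval runs can disagree, in a way precision/recall over a fixed classifier never did.

```python
import random

def fake_generative_model(prompt: str, temperature: float, rng: random.Random) -> str:
    """Stand-in for an LLM call: higher temperature means more output variety."""
    candidates = ["Paris", "Paris, France", "The capital is Paris", "Lyon"]
    if temperature == 0.0:
        return candidates[0]  # greedy decoding is deterministic
    # At non-zero temperature, sample among plausible completions.
    return rng.choices(candidates, k=1)[0]

def exact_match_eval(temperature: float, seed: int, n: int = 100) -> float:
    """Score n generations with a strict exact-match check against 'Paris'."""
    rng = random.Random(seed)
    hits = sum(
        fake_generative_model("Capital of France?", temperature, rng) == "Paris"
        for _ in range(n)
    )
    return hits / n

# Deterministic decoding: the eval reproduces exactly across runs.
print(exact_match_eval(temperature=0.0, seed=1), exact_match_eval(temperature=0.0, seed=2))

# Stochastic decoding: the "same" eval gives different numbers on each run,
# even though the task, model, and metric haven't changed.
print(exact_match_eval(temperature=1.0, seed=1), exact_match_eval(temperature=1.0, seed=2))
```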
It’s been a heady couple of years with 2024 almost as frothy as 2023. What's one common misconception about AI that you wish would go away?
A pretty tricky misconception that’s proven quite popular in the media is that replicating one or many facets of human intelligence (e.g. taking over a particular job, or achieving artificial general intelligence) is the most important goal of AI. It’s more helpful to think of AI as tools – not dissimilar to the GPS navigation in your phone, or the humble hammer. Both of these technological marvels allow us to exceed human performance on tasks like navigation and hitting nails. Similarly, the goal of AI shouldn’t be to replicate human intelligence: humans are by nature beautifully flawed creatures, and we have a tenuous definition of intelligence to begin with. Instead, we should focus on using AI to deliver tools that help us achieve superhuman results.
Who do you follow to stay up to date with what’s changing in the world of data & AI?
The majority of my reading in AI tends to be on arXiv, following published work from industrial labs like Meta FAIR or Google DeepMind. But I also find myself drawn to reading about how data is used in sport (F1, soccer, etc.) – I’m a big believer in cross-disciplinary reading, because more often than not someone has already solved some or all of a problem I’m working on.
Leaning into your dystopian side for a moment, what’s your biggest fear for/with/from AI in 2025?
My biggest fear is that we will delegate too much decision-making power to AI algorithms with poorly defined objective functions. More concretely, every decision-making algorithm needs a mathematical definition of correctness, or an objective, that shapes a decision. For example, the recommendation algorithms of most social media platforms are designed to maximise engagement (i.e. their objective function is to show the content that maximises view time and clicks). But we’ve seen evidence that the objective of maximising engagement can lead to the polarisation of societal views, as users are rarely challenged with recommended content that differs from the perspectives they already hold.
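As a purely illustrative sketch (the posts, scores, and weights below are invented, not taken from any real platform), a recommender whose only objective is predicted engagement will happily rank divisive content first, simply because nothing in the objective function penalises polarisation:

```python
from dataclasses import dataclass

@dataclass
class Post:
    title: str
    predicted_watch_time: float   # minutes, from some hypothetical engagement model
    predicted_clicks: float
    challenges_user_views: bool   # a quality society might care about, absent from the objective

def engagement_objective(post: Post) -> float:
    # The objective function: maximise view time and clicks, nothing else.
    return post.predicted_watch_time + 0.5 * post.predicted_clicks

feed = [
    Post("Outrage bait", predicted_watch_time=9.0, predicted_clicks=12.0, challenges_user_views=False),
    Post("Balanced explainer", predicted_watch_time=4.0, predicted_clicks=3.0, challenges_user_views=True),
]

# Ranking purely by the engagement objective ignores the trade-off entirely:
# "Outrage bait" wins, and "challenges_user_views" never enters the decision.
for post in sorted(feed, key=engagement_objective, reverse=True):
    print(post.title, round(engagement_objective(post), 1))
```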
At the heart of my fear is that it’s hard to mathematically define the best outcome for society. Almost every impactful decision we make involves trade-offs. Worse still, mathematically elegant and simple objectives like the profit motive (i.e. maximise profits) or driving engagement have been shown to produce suboptimal outcomes. AI algorithms today can’t make moral judgements – and as market forces drive the integration of AI into all sorts of business workflows, the probability of unintended societal consequences increases.
And now channeling your inner optimist, what’s one thing you hope to see for/with/from AI in 2025?
I look forward to the breakthroughs AI will enable in biology and drug development. AlphaFold is incredibly powerful, but it’s still only one small piece of the puzzle that is developing therapeutic drugs. Much of the early work in drug discovery takes years and happens behind closed doors, so my hope is that in 2025 we will start to see the fruits of researchers working with tools like AlphaFold since its release in 2021.
You can follow Soon-Ee on LinkedIn