Andrew John on data & AI in 2024 and beyond
"We are dealing with new technologies: it’s still very unclear how much value they will create and—crucially—who will manage to capture that value."
Sometimes you just sit next to the right person at breakfast. For me, that breakfast was at a joint University of Melbourne & Melbourne Business School event on the rise of AI late last year, where I gave a talk.
The amiable-looking man assigned to sit next to me turned out to be Andrew John, a professor of economics who teaches macroeconomics and managerial ethics/ethical leadership in the Business School MBA and Executive Programs. He looked mildly shocked when I told him I was delighted to be seated next to an economist. Luckily, he was equally delighted to be seated next to an AI practitioner, because he has a particular research interest in AI Ethics. The rest, as they say, is history.
Andrew, 2024 has been another busy year for data & AI. What’s one development / milestone / news story that really caught your eye?
I'm intrigued by the so-called reasoning models of OpenAI's o1 suite, but so far I am less than impressed. I've done a fair amount of experimenting with prompts that give unfamiliar versions of familiar scenarios, such as classic puzzles or riddles. My experience is that the models still largely end up defaulting to the standard versions that they presumably encountered multiple times in their training data. But I'll add the caveat that I haven't played much yet with the full o1 model that OpenAI just released. My first experiments do suggest it's better than o1-preview and o1-mini, but that it can still be fooled by these twists on classic scenarios. [Update: the above was written before OpenAI demoed their o3 model. I look forward to learning how well that model performs once researchers have the opportunity to investigate it.]
Second, as an economist (who apparently finds it difficult to count to one), I find it fascinating to look at how the economics of the industry is currently playing out. I’m not saying anything particularly original if I observe that it’s far from clear where the ROI is going to come from for the staggering investments we are seeing, particularly as some AI products are being commoditised. But then, I remember when the conventional expert wisdom was that Amazon would never be profitable. We are dealing with new technologies: it’s still very unclear how much value they will create and—crucially—who will manage to capture that value.
You’ve been working in and around data & AI for a while now. Many things have changed! But tell us about something that was true when you started out in this space and is still important today.
Unlike most (all?) of your other interviewees, I'm not someone who can claim decades of expertise. Sure, I had been paying casual attention to developments in machine learning, but I only started digging deeper a couple of years ago, when I realised that I wanted to be able to teach about the new ethical issues arising from GenAI. (To be clear, sometimes the "new" problem is simply an old problem that can now arise at a completely different scale. People have been creating fake images since the very early days of photography, but a process that used to be time-consuming and require skill is now cheap and easy.)
So I’m going to give you an answer that reflects my short-term perspective. When I was first organising my thoughts on the ethical challenges arising from AI use (as distinct from, say, those arising from the building of foundation models), I found it helpful to think of a typology that distinguished legitimate use, misuse (meaning situations where there is no malicious intent, but people use the technology in inappropriate ways through ignorance), and abuse (where the technology is used by bad actors with malicious intent). Some uses naturally fall in multiple categories, so I now have a very full Venn diagram that I keep adding to whenever I learn of a new problem. I continue to find this typology a good way to organise my thinking about a messy space.
It’s been a heady couple of years with 2024 almost as frothy as 2023. What's one common misconception about AI that you wish would go away?
That it can write poetry! In the early days of ChatGPT, people were constantly claiming it could write poetry, but my experiments back then led me to conclude it was really bad at this task, at least for any even moderately sophisticated definition of poetry. I haven’t changed my mind since. Recently, there was a study that received a lot of attention showing that people are not good at distinguishing AI-generated from human-generated poetry, and that people on average prefer the AI-generated content. That’s fine, and not necessarily surprising, but it doesn’t mean that LLMs are producing good poetry.
I recently tried an experiment where I asked LLMs to write short poems about Flinders Street Station in Melbourne in the style of various poets. The verses they produced were mostly similar in terms of metre, rhyme, and content, and the distinctions among the different poets were relatively minor (including the occasional “doth” is not enough to make a poem sound like it has been written by John Donne!). It seems that LLMs still have a very narrow sense of what poetry means: in LLM world, poetry is pretty much just iambic verse with rhymes.
What (I hope) elevates this above an old-man-shaking-his-fist-at-clouds complaint is that I find it very curious that LLMs are so bad in this domain. My intuition is that producing decent pastiches of famous poets should be something that LLMs are very good at. But they’re not, and I don’t know why.
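If you want to try a version of this experiment yourself, the sketch below shows one way to run it with the OpenAI Python client. The model name, poet list, and prompt wording are illustrative choices, not a record of the exact prompts used here.

```python
# Minimal sketch: ask an LLM for poet-style pastiches of the same subject.
# Requires the openai package and an OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()

poets = ["John Donne", "Emily Dickinson", "Walt Whitman", "Sylvia Plath"]

for poet in poets:
    response = client.chat.completions.create(
        model="gpt-4o",  # illustrative model choice
        messages=[{
            "role": "user",
            "content": (
                "Write a short poem about Flinders Street Station in Melbourne "
                f"in the style of {poet}."
            ),
        }],
    )
    print(f"--- {poet} ---")
    print(response.choices[0].message.content)
```

Compare the outputs across poets and judge for yourself how much of the variation goes beyond the occasional "doth".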
The festive season is almost upon us, so many readers will have a bit of extra time to read / learn / reflect. Who do you follow to stay up to date with what’s changing in the world of data & AI?
You mean, other than Kendra Vant?
I may be relatively new to this space, but I do spend a lot of time trying to keep up. Which is, of course, impossible. I try to keep track of the technical advances in a superficial way, while focusing more heavily on the topics that are relevant for my ethics teaching and research.
The list of podcasts and substacks that I subscribe to is much too long, and I probably need to become more selective. I enjoy Ethan Mollick for positive practical suggestions about AI use; at the other extreme I find Gary Marcus’s cynicism to be (mostly) entertaining. I follow the Ethical Tech Project and the AI Ethics Brief for specific content about ethics. For discussion of AI from an economics perspective, Joshua Gans’ substack is a must-read. I often listen to Last Week in AI for a general overview of what’s happening across multiple domains, even if I don’t always agree with their editorial stance. As an academic, I also like sites that direct me to original research papers; Davis Summarizes Papers and Sebastian Raschka’s Ahead of AI are great resources in this regard. But I’m also sure I am missing some really good stuff, so I’m greatly interested in seeing what your other interviewees suggest.
A specific recommendation I have is the recent series on the Nature of Intelligence in the Complexity podcast, where Melanie Mitchell and Abha Eli Phoboo bring an interdisciplinary focus to the question of machine intelligence by interviewing psychologists, linguists, philosophers, and neuroscientists as well as experts in computer science and AI. It’s not heavy listening and the multiple perspectives are refreshing and informative.
Leaning into your dystopian side for a moment, what’s your biggest fear for/with/from AI in 2025?
That it will learn to write poetry.
And now channeling your inner optimist, what’s one thing you hope to see for/with/from AI in 2025?
My answer is based more on curiosity than optimism. I’ve long been interested in linguistics—indeed, some of my own economic research has crossed over into that space—and I’m really intrigued by some of the linguistic questions about LLMs. “They’re just glorified autocomplete” does not do justice to the effective command of language that these models display. Understanding how LLMs internally represent language is inherently interesting and may even eventually teach us something about human language.
More specifically, I'm really interested to see what we are going to learn from the continuing work on sparse autoencoders in LLMs. This was a hot topic about six months ago (anyone who somehow missed it should just google "Anthropic Golden Gate Bridge"). I may have only the barest understanding of superposition and the Johnson-Lindenstrauss lemma, but I think it is really cool that the dimensionality of the "concept space" within LLMs is potentially so much larger than one would have expected, and thus that fine-grained interpretability might one day be possible. I am hoping there is more interesting work to come in this area.
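For readers who want an intuition for that larger-than-expected concept space, the toy numpy sketch below is an illustration of the general idea, not something drawn from the interpretability papers themselves: in a high-dimensional space you can pack far more nearly-orthogonal directions than there are dimensions, which is the Johnson-Lindenstrauss flavour of the superposition argument.

```python
# Toy illustration of the superposition intuition: a d-dimensional space can
# hold far more than d random directions that are all *nearly* orthogonal,
# so a model could in principle represent many more "concepts" than it has
# neurons. All numbers are arbitrary demo choices, not measurements of an LLM.
import numpy as np

rng = np.random.default_rng(0)
d = 512     # dimensionality of the activation space
n = 2000    # number of random "concept" directions, with n >> d

# Random unit vectors in R^d
vectors = rng.standard_normal((n, d))
vectors /= np.linalg.norm(vectors, axis=1, keepdims=True)

# Pairwise cosine similarities, ignoring each vector paired with itself
sims = vectors @ vectors.T
off_diag = np.abs(sims[~np.eye(n, dtype=bool)])

print(f"max |cosine similarity| across {n} directions: {off_diag.max():.3f}")
print(f"mean |cosine similarity|: {off_diag.mean():.3f}")
# Expect a max of roughly 0.2 and a mean near 0.035: two thousand directions
# packed into 512 dimensions, yet every pair stays close to orthogonal.
```

Sparse autoencoders are, loosely speaking, an attempt to recover individual interpretable directions out of exactly this kind of crowded space.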