Book notes: AI Snake Oil by Arvind Narayanan and Sayash Kapoor

30 December 2025


This book is written by two computer scientists in the AI field. The seeds of this book were planted before the likes of ChatGPT took off, so I found the first portion quite enjoyable: it covers some of the pitfalls of the AI technology that came before LLMs, like machine learning and predictive AI. For example, the concept of “data leakage” - how a tool can be trained on a set of data and work very well against just that set, but fall apart when you try to apply it more broadly.
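A minimal sketch of that failure mode (my own illustration, not from the book): if you evaluate a model on rows it has already seen, a model that simply memorises its training data looks near-perfect, while its performance on genuinely new data is much worse.

```python
import random

random.seed(0)

# Toy dataset: feature x in [0, 1], label y = 1 if x > 0.5, with 20% label noise.
def make_data(n):
    rows = []
    for _ in range(n):
        x = random.random()
        y = (x > 0.5) if random.random() > 0.2 else (x <= 0.5)
        rows.append((x, int(y)))
    return rows

data = make_data(200)

# Leaky evaluation: the "test" rows are drawn from the training set itself.
train = data
leaky_test = random.sample(data, 50)

# A model that just memorises every training row.
memorised = {x: y for x, y in train}

def predict(x):
    # 1-nearest-neighbour lookup: label of the closest memorised x.
    nearest = min(memorised, key=lambda t: abs(t - x))
    return memorised[nearest]

def accuracy(rows):
    return sum(predict(x) == y for x, y in rows) / len(rows)

fresh_test = make_data(50)  # genuinely new data the model has never seen

print(f"leaky eval:  {accuracy(leaky_test):.2f}")  # looks perfect
print(f"honest eval: {accuracy(fresh_test):.2f}")  # much lower
```

The leaky score is flattering only because the test rows sit verbatim in the model's memory; the honest score reveals how much of the apparent skill was memorisation.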

The book also details a darkly funny story about how an attempt at using predictive AI to triage patients in a hospital went wrong: the model found that patients with asthma were less likely to experience complications from pneumonia, and so suggested those patients should not be prioritised in triage. In reality, the asthmatic patients generally didn’t experience complications precisely because they were prioritised to begin with.
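The trap here is a confounder the model cannot see. A small simulation (my own illustration with made-up numbers, not data from the book) shows how a genuinely higher-risk group can look lower-risk in raw outcome statistics once treatment is hidden:

```python
import random

random.seed(1)

# Simulated pneumonia cohort. Asthmatic patients are routinely sent to
# intensive care, and intensive care sharply cuts the complication rate.
# A naive model sees only (asthma, complication), not the treatment.
patients = []
for _ in range(10_000):
    asthma = random.random() < 0.2
    intensive_care = asthma or random.random() < 0.1
    base_risk = 0.40 if asthma else 0.30        # asthma is genuinely riskier
    risk = base_risk * (0.25 if intensive_care else 1.0)
    complication = random.random() < risk
    patients.append((asthma, intensive_care, complication))

def rate(rows):
    return sum(c for _, _, c in rows) / len(rows)

asthmatic = [p for p in patients if p[0]]
non_asthmatic = [p for p in patients if not p[0]]

# Naive view: asthma appears protective, because the treatment that
# actually lowered the risk is missing from the data.
print(f"complications, asthma:    {rate(asthmatic):.1%}")
print(f"complications, no asthma: {rate(non_asthmatic):.1%}")
```

Even though the asthmatic group's untreated risk is higher by construction, their observed complication rate comes out lower, which is exactly the correlation the triage model latched onto.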

There are also other examples of how AI tends to reinforce the biases we humans already have, which makes it quite a dangerous technology in that sense. An easy example in today’s world: if you prompt an AI for a picture of a doctor and a nurse, it will tend to stereotype and provide a male doctor and a female nurse.

There was also an interesting detour into how scary the effects of social media can be. For example, Facebook hires its moderation team in a central location rather than having moderators per country, and without cultural context, moderators cannot accurately detect and remove hate speech, especially when dealing with local slang. As a result, Facebook has possibly inadvertently contributed to spreading violence in certain countries via its lack of moderation. I suppose this ties back to AI because the section argues we probably can’t use AI for moderation either, since it would be very difficult for AI to pick up on these cues.

It took me a long while to finish this book. I felt it start to drag, and I lost interest as it approached the present day and covered LLMs like ChatGPT. I think this is partly down to how fast the industry is moving at the moment: a book published in 2024 can very quickly go out of date. And I might be wrong, but possibly more care went into the first section, which the authors had a long time to write, before they rushed to finish the rest to capitalise on the recent AI boom.