Book Review: AI Snake Oil

I recently started reading this book, AI Snake Oil, and even though I’m only halfway through, some parts really caught my attention. I think I’m just gonna write my notes down before I get too lazy to do it.

This is from the perspective of a mom and a researcher (in training), but not someone who works in the AI/ML field.

Here’s one thing from the book that really struck me:

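I’m aware that reproducibility is a big issue in research. Take psychology, for example. The problem is serious enough that one large replication project re-ran about 100 published studies and found that only about 36% of the replications reproduced a statistically significant result.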

But I had no idea just how deep the reproducibility crisis goes in AI research until I came across this section from the book, “The Reproducibility Crisis in AI Research.”

“In a 2018 study, Odd Erik Gundersen and Sigbjørn Kjensmo from the Norwegian University of Science and Technology set out to investigate the reproducibility of AI research. They reviewed four hundred papers from leading AI publications to ascertain if they contain enough detail to be reproducible by an independent researcher. They found that none of the four hundred papers satisfied all of the criteria (such as sharing their code and data) for reproducibility. Most papers satisfied merely twenty to thirty percent of the reproducibility requirements they identified, making it hard to even investigate if the results were reproducible.”

Here’s the link to the original paper: https://doi.org/10.1609/aaai.v32i1.11503

This is a huge concern because reproducibility is essential in science. If scientists can’t run experiments multiple times and get the same results, how can they trust the findings?

But here’s where it gets even more worrying: AI and machine learning models are being used more and more across various scientific disciplines. Take neuroscience, for instance. There’s even a subfield called computational neuroscience, where researchers use computational models to test their hypotheses instead of working directly with brains in labs.

So, you can imagine how reproducibility issues in AI/ML research could easily snowball and impact other fields too, right? Just think about the ripple effects on medicine, psychiatry, genomics, and more. If the underlying AI research isn’t solid, it could lead to flawed conclusions, ineffective treatments, and wasted resources in these areas.

But hey, as a parent, you might be wondering… why should I care about a ‘technical’ issue like the reproducibility crisis in AI? I care more about tangible stuff like how AI will impact my kids’ education, the future of work… you know, things like that.

Wait a minute, we’ll get there. This realization hit me after I took a course on AI in Education alongside a classroom full of Finnish teachers who are pursuing their PhDs in education (a course which, by the way, I didn’t complete, since I never submitted the final assignment lol).

We hear about AI all the time from mainstream media. But the problem is, a lot of the discussions are full of hype and sensationalism that distract us from the real, pressing issues. The media often talks about things like superintelligence, robots taking our jobs, rogue AI, etc.

But rarely do we have the basic, necessary conversations about things like “How can we independently verify that these AI models are reliable?” (Which is at the heart of the reproducibility issue.)

Take this one example from the book: Mainstream media hardly ever educates us about data leakage in machine learning. It’s one of the most common pitfalls. Basically, it means the tool is tested on the same or similar data it was trained on, leading to exaggerated accuracy. It’s like teaching to the test (or worse, giving away the answers before an exam).
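To make that concrete, here’s a tiny Python sketch. It’s my own toy example, not something from the book, and every name in it (`make_data`, `predict_1nn`, and so on) is made up for illustration. The labels are pure coin flips, so there’s nothing real to learn. A “model” that simply memorizes its training data still looks perfect when it’s tested on that same data, and drops to roughly coin-flip accuracy on data it has never seen.

```python
import random


def make_data(n, rng):
    # Hypothetical toy data: two random features per example, and a label
    # that is a coin flip. Since labels are unrelated to the features,
    # an honest accuracy estimate should hover around 50%.
    return [((rng.random(), rng.random()), rng.randint(0, 1)) for _ in range(n)]


def predict_1nn(train, point):
    # A "model" that only memorizes: it returns the label of the
    # closest training example (1-nearest-neighbor).
    nearest = min(
        train,
        key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
    )
    return nearest[1]


def accuracy(train, test):
    # Fraction of test examples whose predicted label matches the true one.
    correct = sum(predict_1nn(train, x) == y for x, y in test)
    return correct / len(test)


rng = random.Random(0)  # fixed seed so the run is repeatable
data = make_data(400, rng)
train, held_out = data[:300], data[300:]

# Leaky evaluation: test on the very data the model was "trained" on.
leaky_score = accuracy(train, train)

# Honest evaluation: test on examples the model has never seen.
honest_score = accuracy(train, held_out)

print(f"Accuracy when tested on training data: {leaky_score:.0%}")
print(f"Accuracy when tested on held-out data: {honest_score:.0%}")
```

The gap between the two numbers is the exaggeration that leakage produces. Real cases are usually subtler (say, near-duplicate records ending up on both sides of the split), but the effect is the same.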

So, before we start worrying that AI tutor platforms will replace teachers, we should first ask: How reliable are these intelligent tutoring systems really?

And if they’re mostly built by commercial entities that don’t share their proprietary code and training data, how can we, the public, ever scientifically evaluate their models?

The problem is, we aren’t equipped with the knowledge and tools to scrutinize them. And teachers aren’t immune to this either. They hear about AI from mainstream media just like the rest of us. (Yup, even Finnish teachers, who are famously highly educated.)

There’s this section in the book that will hit us hard: “AI Snake Oil Is Appealing to Broken Institutions.” In many places, education is a struggling institution with complex inherent problems.

“Educational institutions, especially public schools and colleges, are often financially constrained, understaffed, and overburdened, making them seek solutions that promise efficiency and cost-cutting. Teachers face immense pressure with growing class sizes and shrinking resources, making them susceptible to quick-fix solutions.”

AI promises efficiency. But many inherent problems in education are not necessarily about inefficiency. So AI won’t be the solution.

Overall, from the book, I noticed that we need to remember these key points:

1. The AI community has a history of overoptimism and hype.

Remember back in the ’60s when AI legend Marvin Minsky asked his student Gerald Sussman to spend a summer hooking up a camera to a computer and getting it to describe what it saw? Well, Sussman didn’t quite get it working. It took half a century to get us even close.

2. The industry eagerly sparks the hype to sell products and make money.

Companies are always looking for the next big thing to market, and AI fits the bill perfectly. This drive for profit often leads to overpromising and underdelivering.

3. The media fans the flames to get us to click on the news.

Sensational headlines grab attention, but they rarely provide the nuanced understanding needed to grasp the real issues at play.

4. Our cognitive biases make us susceptible to AI hype.

One example is the illusion of explanatory depth. It’s a fancy way of saying that we often think we understand complex ideas better than we actually do. This overconfidence can stop us from asking the right questions or considering other viewpoints. With AI, it’s such a broad term that most of us don’t have the time to dig into the specifics or recognize the differences between various types of AI.

This ties into something called the halo effect, where we tend to evaluate a technology based on a few standout examples. For instance, when people hear about AI beating a world champion at Go, they might assume that AI is equally effective for completely different tasks, like predicting a student’s educational outcomes. This can lead to unrealistic expectations and misunderstandings about what AI can and cannot do.

I don’t think the book is trying to downplay how useful and transformative AI is, but to remind us that we might be missing the important discussions we should be having. There’s this thing called “criti-hype”—when people think they’re criticizing AI but end up hyping it instead, missing the core issues.

In a nutshell, here’s my key takeaway from the book. As parents and teachers, we can’t rely on mainstream media to educate ourselves about AI. We need to actively seek out information ourselves. Talk to researchers, especially those who are open-minded and critical in their own fields. Read books and scientific papers. Watch online lectures.

There’s no shortcut or easy way around it. We can’t teach our kids how to critically work and study with AI unless we understand what AI can and can’t do, what it’s good at, and where its limits are. (That includes generative AI like ChatGPT.)

And I think reading AI Snake Oil is a good place to start.
