Deep-Frying Data

Idle Words

I have been wading through the sewage we are all forced to wade through every day, looking for something worth reading. This I can recommend.

To get you started, consider this:

Today I’m here to talk to you about machine learning. I’d rather you hear about it from me than from your friends at school, or out on the street.

Machine learning is like a deep-fat fryer. If you’ve never deep-fried something before, you think to yourself: “This is amazing! I bet this would work on anything!”

And it kind of does.

In our case, the deep-fryer is a toolbox of statistical techniques. The names keep changing—it used to be unsupervised learning, now it’s called big data or deep learning or AI. Next year it will be called something else. But the core ideas don’t change. You train a computer on lots of data, and it learns to recognize structure.

These techniques are effective, but the fact that the same generic approach works across a wide range of domains should make you suspicious about how much insight it’s adding.

And in any deep frying situation, a good question to ask is: what is this stuff being fried in?

This guy has hit on the same thing that has bothered me.

If you feed garbage into a black box, you get garbage out of it. You can make it look pretty and smell pretty, but it is still crap.


