in Uncategorized

All models are wrong, but some models are useful!

George Box said that.

There is no best model that works for all of your data. Wolpert reiterates that as the No free lunch theorem.

Model predictive performance is domain specific. What works in one data domain has sometimes very little consequence in another one. Predictably, the rise of Domain Science: Data science needs to get closer to the business unlocking value.

Meanwhile, ensembles are here to stay!
Users want a buffet of algorithms that try to “lock-pick” the data for it's secrets.
Time is eventually the key limiter. Data science efforts have to make best out of the budget for experimentation and use some kind of co-evolutionary technique that picks the “Champion” model of models for your data.

Robust automation and fast analytics can speedup large parts of data smithy.
Still, discovery takes patience & ingenuity.