dataists » A Taxonomy of Data Science

The five steps of what a data scientist does, in roughly chronological order (Obtain, Scrub, Explore, Model, and Interpret), by Hilary Mason (bit.ly chief scientist):

Both within the academy and within tech startups, we’ve been hearing some similar questions lately: Where can I find a good data scientist? What do I need to learn to become a data scientist? Or more succinctly: What is data science?

We’ve variously heard it said that data science requires some command-line fu for data procurement and preprocessing, or that one needs to know some machine learning or stats, or that one should know how to ‘look at data’. All of these are partially true, so we thought it would be useful to propose one possible taxonomy, which we call the Snice* taxonomy, of what a data scientist does, in roughly chronological order: Obtain, Scrub, Explore, Model, and iNterpret (or, if you like, OSEMN, which rhymes with possum).

Different data scientists have different levels of expertise with each of these 5 areas, but ideally a data scientist should be at home with them all. We describe each of these steps briefly.

Since histograms of real-valued data are contingent on the choice of binning, we should remember that they are an art project rather than a form of analytics in themselves.
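To make the binning point concrete, here is a minimal sketch in plain Python. The helper `histogram_counts` and the sample `values` are made up for illustration and are not from the original article; the sketch assumes equal-width bins spanning the data's minimum to maximum.

```python
def histogram_counts(values, n_bins):
    """Count values in n_bins equal-width bins spanning min(values)..max(values).

    Hypothetical helper for illustration only.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("all values are identical; bin width would be zero")
    width = (hi - lo) / n_bins
    counts = [0] * n_bins
    for v in values:
        # Clamp so the maximum value falls in the last bin, not past it.
        idx = min(int((v - lo) / width), n_bins - 1)
        counts[idx] += 1
    return counts


# Illustrative, made-up sample: two loose clusters with a sparse middle.
values = [0, 2, 2.5, 3, 3.5, 5.5, 6.5, 8.5, 9, 9.5, 10, 12]

two_bins = histogram_counts(values, 2)    # [6, 6]: looks flat
three_bins = histogram_counts(values, 3)  # [5, 2, 5]: looks bimodal

print(two_bins)
print(three_bins)
```

The same twelve numbers look uniform with two bins and bimodal with three, which is why it helps to try several bin counts before reading anything into a histogram's shape.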

via dataists » A Taxonomy of Data Science.