What do I think being a data scientist is about?

I think being a data scientist is about bringing value to a business or a common goal, where that value can only be determined by evaluating the factors unique to that business or goal. Only one branch of statistics is needed to determine whether a pattern or behavior is concretely tied to an improvement in product or value. Only a portion of coding is needed to support the analysis of the information the decision is based on. And product knowledge, on its own, adds only a portion of the value.

It takes quite a special individual to use that part of statistics to back up their claims, code the work so that it reaches a discernible conclusion in a relevant time period, and draw on product knowledge to know what to look at in the first place, or which questions provide the most value when answered.

What do I see as the major duties and/or knowledge areas?

I think the major areas, as hinted at above, would be:

  1. Statistics/mathematics, to identify and verify patterns or behaviors
  2. Coding, to carry out that analysis effectively
  3. Knowledge of the specific area in question, to guide priorities

Major duties of the role would be to focus on:

  • Designing or evaluating the manner in which data is gathered
  • Normalizing and transforming data into a usable format
  • Determining the most appropriate analysis on the data
  • Performing the analysis and verifying accuracy
  • Identifying concrete takeaways and communicating them in a meaningful manner
  • Advising on specific actions to take in response to identified behavior
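
As a rough illustration of the middle of that list (normalizing, analyzing, and verifying), here is a minimal Python sketch. Everything in it is an assumption made for the example: the session-length numbers are invented, `z_scores` and `welch_t` are hypothetical helper names, and Welch's t statistic is just one possible choice of analysis, not a recommendation for every case.

```python
from statistics import mean, stdev


def z_scores(values):
    """Rescale values to mean 0 and sample standard deviation 1."""
    mu = mean(values)
    sigma = stdev(values)
    return [(v - mu) / sigma for v in values]


def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    var_a = stdev(a) ** 2 / len(a)
    var_b = stdev(b) ** 2 / len(b)
    return (mean(a) - mean(b)) / (var_a + var_b) ** 0.5


# Hypothetical session lengths (minutes) before and after a product change.
before = [12.0, 15.0, 11.0, 14.0, 13.0]
after = [16.0, 18.0, 15.0, 19.0, 17.0]

# Normalizing: put all observations on a common scale.
normalized = z_scores(before + after)

# Analysis: how large is the shift relative to the noise in the samples?
t_stat = welch_t(after, before)

# Verifying: a basic sanity check that the normalization did what it claims.
assert abs(mean(normalized)) < 1e-9

print(f"t statistic: {t_stat:.2f}")
```

The statistic alone is not the takeaway. The last two duties, communicating what it means and advising on what to do about it, are where the product knowledge comes in.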

What differences/similarities do you see between data scientists and statisticians? How do you view yourself in relation to these two areas?

The difference between data scientists and statisticians is primarily one of scope. Statisticians are much more focused on drawing analysis out of the information they are given. Data science can be seen as an adaptation of a statistician’s methods of gathering data: instead of spending time physically gathering samples and designing the study, that time is now spent coding tools that run inside normal processes and collect data as users go about their normal activities.

I’d view myself as really just a beginner in both. I hope to master both equally as I grow. I think both are just tools to be used, like a hammer and an axe. Rather than be in a forest with only a hammer, I strive to carry a travel-size version of both to handle most situations.

