There is significant overlap between a data analyst and a data scientist, but here’s what I see as the main responsibilities of each:
Data scientist: Mainly looking at estimating the unknown, e.g.
- Building statistical models that make decisions based on data. Each decision can be hard, e.g. block a page from rendering, or soft, e.g. assign a score for the maliciousness of a page, which is then used by downstream systems or humans.
- Conducting causal experiments that attempt to identify the root cause of an observed phenomenon. This can be done by designing A/B experiments or, if an A/B experiment is not possible, by applying an epidemiological approach to the problem, e.g. @
- Identifying new products or features that come from unlocking the value of data; being a thought leader on the value of data. A good example of that is the product recommendations feature that Amazon first made available to a mass audience.
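
To make the hard vs. soft distinction concrete, here is a minimal sketch, assuming a simple logistic scoring model. The feature names, weights, bias, and threshold are all hypothetical and chosen only for illustration; a real model would learn them from data.

```python
import math
from dataclasses import dataclass


@dataclass
class Decision:
    score: float  # soft decision: maliciousness score in [0, 1]
    block: bool   # hard decision: whether to block the page


def decide(features, weights, bias=0.0, threshold=0.5):
    """Score a page with a logistic model, then threshold the score."""
    # Weighted sum of the features (missing features count as 0).
    z = bias + sum(w * features.get(name, 0.0) for name, w in weights.items())
    score = 1.0 / (1.0 + math.exp(-z))  # squash into [0, 1]
    return Decision(score=score, block=score >= threshold)


# Hypothetical weights and features, for illustration only.
weights = {"has_obfuscated_js": 2.0, "domain_age_days": -0.01}
page = {"has_obfuscated_js": 1.0, "domain_age_days": 10.0}
decision = decide(page, weights, bias=-1.0)
```

The `score` field is what a downstream system or a human reviewer might consume, while `block` is the action taken automatically.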
Data analyst: Mainly looking at the known, i.e. historical data, from new perspectives, e.g.
- Writing custom queries to answer complex business questions.
- Conceiving and implementing new metrics that capture previously poorly understood parts of the business or product.
- Addressing data quality issues, such as data gaps or biases in data acquisition.
- Working with the rest of engineering to add instrumentation that incrementally acquires new data.
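
As one small illustration of a data quality check, the sketch below looks for gaps in a daily data feed. It assumes the data is expected to contain at least one record per calendar day; the function name and the sample dates are hypothetical.

```python
from datetime import date, timedelta


def find_missing_days(observed):
    """Return the calendar days between the first and last observed date
    that have no records at all."""
    if not observed:
        return []
    seen = set(observed)
    start, end = min(seen), max(seen)
    all_days = (start + timedelta(days=i) for i in range((end - start).days + 1))
    return [day for day in all_days if day not in seen]


# Hypothetical record dates, for illustration only.
record_dates = [date(2024, 1, 1), date(2024, 1, 2), date(2024, 1, 5)]
missing = find_missing_days(record_dates)
```

Here `missing` contains January 3 and 4, the days on which no data was acquired.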
Of course, the two roles overlap significantly. A data scientist always needs to write custom queries, and a data analyst may need to build a decision-making module, either with simple rules or by applying machine learning principles.