Is Code the Best Way to Represent a Data Analysis?


Key takeaways


  • While transparency has inherent value, I have found that it’s not exactly what people want when they see a data analysis. What they want is an answer to the question, “Is this data analysis any good?”
  • In my experience, most people get value out of the code (and the data) when they can go into the code, make modifications, and run alternate analyses on the data.
  • The fundamental problem with code as a representation of a data analysis is that code only shows the observed results. But when evaluating the quality of a data analysis, I think it is equally important to know what the expected results were and why the observed results differ from them (if they do).
  • There can be many reasons why an observed result differs from expectations, but I tend to group them at a high level into three categories:
      • Science
      • Data
      • Analysis
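
To make the expected-versus-observed idea concrete, here is a minimal sketch of how an analysis could record its expectations next to its results. Everything in it is an assumption made for illustration: the names `Finding`, `Source`, and `is_unexpected`, the numeric tolerance, and the example values are hypothetical and do not come from any particular library or from a real analysis.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Source(Enum):
    """High-level categories for why a result differed from expectations."""
    SCIENCE = "science"
    DATA = "data"
    ANALYSIS = "analysis"


@dataclass
class Finding:
    """One result, stored together with what the analyst expected to see."""
    name: str
    expected: float
    observed: float
    tolerance: float                  # how far apart still counts as "as expected"
    source: Optional[Source] = None   # filled in only once the gap is explained
    note: str = ""

    def is_unexpected(self) -> bool:
        # A result is unexpected when it falls outside the stated tolerance.
        return abs(self.observed - self.expected) > self.tolerance


# Made-up values, purely for illustration.
finding = Finding(
    name="slope of exposure on outcome",
    expected=0.5,
    observed=1.2,
    tolerance=0.25,
    source=Source.DATA,
    note="Hypothetical explanation: a subset of records used different units.",
)

if finding.is_unexpected():
    print(f"{finding.name}: expected {finding.expected}, observed {finding.observed}")
    print(f"attributed to: {finding.source.value} ({finding.note})")
```

The point of the sketch is only that the expectation and the explanation sit in the same record as the result, so a reader does not have to infer them from the code that produced the number.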