Between quantity and quality: competing views on the role of Big Data for causal inference
Abstract
When does more data help in the sciences, and when does it not? In the past decade, this question has become central because of the phenomenon of Big Data. While these discussions started as a result of somewhat naive ideas that have been closely analyzed and mostly rejected in the philosophy of data, the question of what epistemic difference more or less data makes still matters, especially in light of the impressive performance of data science and machine learning tools, which seem to improve their outcomes when they are trained on large volumes of data. In several areas of the sciences, having more data is also connected to methodological and epistemic benefits and is more generally seen as something that research should strive toward. More data is often equated with better science, which raises crucial questions about the epistemic value of the quantity of data. In this chapter, we discuss this problem in light of current discussions in the life and health sciences and the philosophy of data.