I received an interesting question from some good colleagues in the Global Open Data for Agriculture and Nutrition (GODAN) initiative: can we have access to data about what people eat? The question originally focused on whether there is a central database or a very large dataset (covering multiple years and countries) that captures what people eat. This means not just caloric consumption per person, but what they actually consume. For example: today I have eaten a small apple, half a small melon, and five pieces of bacon. This got me thinking: do we have access to such data? And from what types of sources?
I know for sure that we do have access to general statistics about health and nutrition in different countries. One example is the World Bank and its DataBank repository: it publishes data on Health Nutrition and Population Statistics per country that can give an indication of what people eat. However, this source provides information on very specific indicators (mostly related to malnutrition), and that information is not always complete (not all countries and not all years are covered, even for those indicators). I assume that this is the case with many datasets coming from governmental sources: you may find data that is missing, that needs cleaning, or that gains extra value only after you combine it with data from other sources.
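To make the coverage problem a bit more concrete, here is a minimal Python sketch of how one could list the country-year combinations that are missing for a single indicator. Everything in it is an assumption for illustration: the records, the placeholder country codes, the values, and the function name `coverage_gaps` are made up, and none of it comes from the World Bank or from any DataBank API.

```python
from itertools import product

# Hypothetical records for one indicator: (country, year, value).
# None marks a value that the source left blank. All values are invented.
records = [
    ("AAA", 2010, 12.5),
    ("AAA", 2011, None),
    ("BBB", 2010, 30.1),
]


def coverage_gaps(records, countries, years):
    """Return the sorted (country, year) pairs that have no usable value."""
    present = {(country, year) for country, year, value in records if value is not None}
    expected = set(product(countries, years))
    return sorted(expected - present)


gaps = coverage_gaps(records, ["AAA", "BBB"], [2010, 2011])
print(gaps)
```

A list like this only tells you where the holes are. Deciding whether to fill them from another source, or to leave them empty, is still a judgment call for whoever assembles the data.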
I think that the type of data my colleagues are looking for is logged in all those calorie-monitoring apps that are so popular in the “developed” world. Such applications (see examples here and here) help their users reduce their daily calorie intake (and therefore become fitter) by logging all their meals throughout the day. Sometimes the creators of these apps are also affiliated with academia, which lets them do research on the massive dataset they are collecting from their users. This got me wondering how you could convince such companies to open up (parts of) their data for open research purposes. And, of course, how such an open move should be carried out, thinking about issues like anonymising the data, securing a license from the users for the use of their data, etc.
And if we are into such a crowd-sourced way of collecting data about what people eat, how about thinking in terms of really open platforms where people proactively contribute and share their data? I imagine platforms like the one from Numbeo, which collects user-provided information on various indicators (for instance, more than 1.3 million prices of products and services, from more than 4.5 thousand cities, provided by more than 150 thousand people), again creating a massive data source that could be of tremendous value for research and innovation.
My conclusion is that I don’t have a definitive answer to the question “What do people eat?”. But I can think of different sources that use different collection mechanisms and models, and that each have their own value in assembling this information and generating this knowledge. And these are exactly the next steps needed once the relevant data sources are identified: “assembling information from various sources” and then “generating knowledge out of the data”…