A few days ago, I was invited by the iMarine project team to give a talk on the different types of services and tools that may be developed on top of e-infrastructures. The talk was part of their workshop "Ecosystem approach to marine and fishery data management", which took place as a satellite event of the 2014 edition of the International Conference on Research Infrastructures (ICRI’14).
I have to admit that a topic like “Specialised set of tools and services over an e-infrastructure” seemed too boring for a group of great people (like the amazing Christos Arvanitidis from the Hellenic Centre for Marine Research) who are trying to solve really difficult problems like the observation, modelling, documentation and preservation of marine biodiversity and fishery ecosystems. This is why I decided to change the focus of my presentation and give this audience my perspective on why and how e-infrastructures in general, and marine & maritime research data e-infrastructures specifically, can be relevant to small businesses like ours. The slides were heavily based on a presentation that I gave a couple of months ago, adapted to reflect on marine, fisheries and maritime research data and the corresponding sector. Here is the previous, longer (but agriculture-oriented) version of my slides.
The main arguments that I made during this talk were the following:
Recent studies and trends indicate that open data create an opportunity for entrepreneurship, since new businesses and new business models are beginning to emerge. Alex Howard’s blog post on the research report of Deloitte UK and the Open Data Institute briefly introduces the new business models of suppliers, aggregators, developers, enrichers and enablers; it also shows current sector trends (and hopefully future opportunities) by reusing the following very interesting diagram from this report:
Public cloud and grid facilities that governments and the European Commission are putting in place to manage, preserve, compute, enrich, visualise and share research data can be a very interesting source of such open and big data, especially in the light of developments like the opening up of all research data funded by donors like the European Commission or NSF (and all the relevant Research Data Alliance activities).
Still, the perception that we get (mainly from our exposure to relevant European projects on e-infrastructures) is that to access this massive data volume and use the public computing & hosting resources available, we need to go through very traditionally conceptualised Science Gateways (e.g. the agINFRA-produced Science Gateways of INFN Catania and SZTAKI, iMarine’s Gateway etc.). These environments have very carefully defined certification and security procedures which make sure that access to Europe’s valuable grid and cloud resources is controlled and monitored – but which can also scare the majority of start-ups or small businesses away. One approach is to move from user-driven access to these e-infrastructures to machine-driven access, by opening programmable interfaces that software applications may use to invoke services and discover/retrieve data. This is the approach that we tried to demonstrate with the RESTful API gateway of agINFRA.
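To make the idea of machine-driven access a bit more concrete, here is a minimal sketch of how a small application could query a RESTful data gateway. Everything specific in it is an assumption made for illustration: the base URL, the `/datasets` endpoint, the `q` and `size` parameters and the shape of the JSON response are hypothetical and do not describe the actual agINFRA or iMarine interfaces.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen

# Hypothetical base URL: a placeholder, not a real gateway address.
BASE_URL = "https://gateway.example.org/api/v1"


def build_search_url(keyword, page_size=10):
    """Build the URL for a (hypothetical) dataset search endpoint."""
    query = urlencode({"q": keyword, "size": page_size})
    return f"{BASE_URL}/datasets?{query}"


def extract_titles(payload):
    """Pull dataset titles out of a JSON response body.

    Assumes a response shaped like {"results": [{"title": ...}, ...]}.
    """
    return [item["title"] for item in json.loads(payload).get("results", [])]


def search_datasets(keyword):
    """Call the endpoint and return the titles (needs a live service)."""
    request = Request(
        build_search_url(keyword),
        headers={"Accept": "application/json"},
    )
    with urlopen(request, timeout=10) as response:
        return extract_titles(response.read().decode("utf-8"))
```

The point of the sketch is that a start-up would only need an HTTP client and a documented endpoint, rather than a certified user account on a gateway portal, to start building on top of the data.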
But in my view, we still have a long way to go. Even if such magnificent projects develop and expose the infrastructural components that can be used for large-scale data-powered research, they will probably not go beyond developing and testing some illustrative demonstrators of real services that researchers (or practitioners) will use. A very nice example is the iMarine Species App. Such demonstrators can show the market how the different components can be put together to deliver a nice software solution that addresses a real researcher need. But exploring whether someone should pay for such a solution (be it a donor/funder or the users themselves) is something that research and academic partners do not have the capacity to do in a dynamic and growing environment like the data economy. And the argument of studies like the one by Deloitte UK and the Open Data Institute is that there is business potential here for companies (and especially newly created ones) to take advantage of.
And what makes this really interesting is the fact that these companies may even invent new ways to contribute to the sustainability of such public data infrastructures. This is the business model that we follow at Agro-Know, where we work on the aggregation, enrichment and publication of metadata records for free, because we then offer services on top of them. In this way, we don’t need to get paid for the task of data aggregation. Neither do we need to restrict access to this data (unless the license of the data provider obliges us to). We even have a basic set of data discovery services that are offered for free to all the targeted institutions, because they are an essential part of creating a critical mass of users. New value is generated when our clients request a specific, customised, contextualised solution adapted to their needs – and this is where our work gets paid. This means that we have a good reason to take care of the data aggregation workflows of your e-infrastructure, for free.
By no means can I argue that this is an absolutely correct or rational way to look at the sustainability of data e-infrastructures. And we are still trying to figure out how to validate the assumption that we can build a sustainable and scalable business by adopting such a business model. Still, the feedback that I got from this audience is that this is at least a new and different view of such topics, and that it would be interesting to explore such symbiotic models between the public and private sectors further.
PS. What about the marine sector? Can the uptake of marine, fisheries & maritime research e-infrastructures and data create growth opportunities and social innovation? I believe so. I can imagine some great enhancements that data can bring to new, sexy platforms like Incrediblue. I think that there is space for companies that will work on data economy models (like aggregation, cleansing, enrichment & mashing up of various data types), companies that will develop new services/applications for researchers, as well as companies that will develop new data products and data-enhanced software solutions for the maritime & fishery industries. To make this topic attractive to young people, so that a generation of Blue Growth start-ups appears, we will need some blue start-up heroes to pave the way; and then we will need to keep the innovation pipelines open & busy through hackathons, incubators, demo days and all the rest.