INSIGHTS FROM THE DATA STARTUPS’ PERSPECTIVE

Context

To better understand the challenges of data sharing, governance and reuse among the many stakeholders in data value chains, the REACH project surveyed 15 of its data startup pioneers. Our findings are summarised below.

Trends & Insights

One positive trend is the growing use of AI and machine learning. Despite its potential to revolutionise the way companies do business, the use of advanced external data has been slow to take hold. This is pushing startups to develop AI- and machine-learning-based data-parsing applications that address the issue.

Big data and analytics are increasingly recognised as beneficial to companies, but executives are also becoming aware of the risks. Many businesses use a hybrid strategy, storing their data both in the public cloud and on-premises for reasons of compliance, security and control.

While Big Data and Data Value Chains may be seen as the answer to a lot of problems faced by organisations, they should not be viewed as a panacea. Plenty of other irritants exist – from lack of capital, through excessive regulation, to a shortage of skilled employees.

Most of the data that companies collect today is worthless. Without the right people, the right processes and the right technology coming together at the right time, it’s simply a pile of moldering bits and bytes gently decaying on tapes and hard drives, contributing to nothing.

Startups gave three reasons for not fully processing the data they collect: a lack of internal skills, a lack of proper processing tools, and the time-consuming nature of data processing.

Signals of disruption

59% of the surveyed startups reported that they had not noticed any signs of disruption; the remaining 41% pointed to several examples:

COVID-19 has changed the world – and many companies are gaining benefits by rethinking their analytics models and data management processes to keep pace with the new realities of business. This means that companies need to continuously refresh models, bring in new and modern data, and adapt more rapidly to changing events – or risk making decisions based on invalid projections. In addition to improving their models and data management, many companies have increased their use of practices designed to help them gain a better understanding of a more volatile and unpredictable world.

Startups have also learned that data spaces are a game changer for the data economy. Able to share data with unknown partners and data endpoints, companies can now co-create and exploit the true value of their data in a flourishing ecosystem.

One of the biggest challenges data executives face today is turning the immense amount of information created by their organisation, customers and partners (in effect, their whole ecosystem) into a competitive advantage. In that sense, databases alone are not enough to ensure a good multi-stakeholder data workflow, which is itself a signal of disruption. Some startups said that what matters most is industrial data transformation, and that work should focus on the data transformation workflow.

On the other hand, companies need to move at the speed of business, enable self-service and create delightful customer experiences. Artificial intelligence can automate and speed up data analytics, making it possible to act on data almost instantaneously.

Disruption also comes from companies that provide infrastructure-as-a-service. Code is increasingly open source; what matters, however, is not the specific code or model but how the pieces fit together to create value for customers.

Barriers

REACH asked businesses in its ecosystem what they thought held them back from performing at their best. From the startups' perspective, the main barriers to participating in data value chains are:

The quantity and quality of historical data. For early-stage companies with no more than three years of historical data, the quantity, density and variety of that data are a challenge. Poor data quality makes it difficult to extract insights and ultimately leads to poor decision-making.
The lack of trust. Trust in data and in the interpretation of data is essential at each step of the data value chain. Without trust, there can be no value. Startups must therefore be as transparent as possible about their work, adhering to appropriate methodologies and making metadata readily available. Users, for their part, must make the effort to understand the data and use it in a trustworthy manner.
Legal and Data Protection Regulations. Different policy practices and legislation in the Member States have a large impact on what can be done with data, and lead to fragmentation within the digital internal market.
A shortage of data experts. European companies struggle to find them, and demand for qualified data scientists is likely to continue to outstrip supply in the near future. The real challenge seems to be that the European data economy will need more and more people who are highly trained in data management and data analysis and who also understand business issues across a range of domains.
The lack of a constant flow of investment towards startups and growing firms active in data technologies and applications.

Opportunities for growth

When asked whether they had noticed any signs of opportunity within the data value chain space, and what those opportunities were, 80% of the surveyed startups answered yes, while 20% stated that they had not yet seen any new opportunities for growth.

The majority agreed that all business sectors are digitalising and that data will therefore become the nervous system of the economy. They also stated that, once collected, data can be exploited for multiple purposes, so that cycles of collection and exploitation become self-sustaining.

Several startups mentioned that big data can be applied to a variety of issues in environmental risk management and natural disasters, particularly given the increased frequency of erratic and extreme weather. Big data also has immense implications for climate modelling. Examples given are:

The use of low-cost numerical environmental models that assimilate environmental data (sensor data with heterogeneous uncertainty and remote sensing data) to provide affordable information at high spatial resolution.
More requests for data on air quality, climate and other environmental parameters from other business areas such as financial services, pharma and urban planning. For example, pharma companies ask for data on pollen, and cities ask about the impact of climate change mitigation actions.
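
To make the idea of assimilating data with heterogeneous uncertainty more concrete, the sketch below fuses several readings of the same quantity by inverse-variance weighting, so that more precise sources count for more. This is only an illustration and not the method of any surveyed startup: the `Observation` type, the `fuse` function and the numbers are hypothetical, and the approach assumes independent, unbiased errors with known variances.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    value: float     # measured quantity (hypothetical units)
    variance: float  # error variance of the measurement, must be > 0


def fuse(observations):
    """Combine observations by inverse-variance weighting.

    Assumes independent, unbiased errors (an illustrative simplification).
    Returns the weighted mean together with the variance of that mean.
    """
    if not observations:
        raise ValueError("need at least one observation")
    if any(o.variance <= 0 for o in observations):
        raise ValueError("variances must be positive")
    weights = [1.0 / o.variance for o in observations]
    total = sum(weights)
    mean = sum(w * o.value for w, o in zip(weights, observations)) / total
    return Observation(value=mean, variance=1.0 / total)


# Hypothetical example: a precise ground sensor and a noisier satellite estimate.
ground = Observation(value=10.0, variance=1.0)
satellite = Observation(value=20.0, variance=4.0)
fused = fuse([ground, satellite])
print(fused)
```

The fused estimate sits closer to the more precise source, and its variance is lower than that of either input, which is why combining cheap, noisy sources can still yield useful information.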

Startups also named other segments of the economy where they have found new opportunities for growth. One example is fraud prevention: the big data value chain space can be used to prevent investment fraud and scams and to help investors.

In healthcare, the rise of comprehensive electronic health records (EHRs) offers promising opportunities for application of existing risk scores and for development and validation of new models. The EHR data include large numbers of individuals, usually greatly exceeding sample sizes available in individual trials or registries.

Some startups saw an opportunity to strengthen data analytics based on customer data, especially GDPR-compliant cross-data analysis and the blending of transaction data with behavioural data, including data extracted from sensors.

In general, startups see a clear need for the standardisation of data. The long-standing challenge of applying the FAIR (Findability, Accessibility, Interoperability, and Reuse of digital assets) data principles concerns not only scientific data but any kind of data silo or data lake. Moreover, there is a strong opportunity for data harmonisation services that facilitate the development of system-to-system communication interfaces.

Lastly, startups stated that technology providers could collaborate to integrate their products and offer more holistic solutions. In addition, niche markets can benefit from data sharing and federated machine learning approaches. In these scenarios, however, data value is transferred to the technical partner.