While data has the potential to boost the UK economy significantly, the links between data and AI are not sufficiently understood.
If we are to seize this potential and position the UK as a global AI superpower, as the National AI Strategy aims to do, we must get a grip on data infrastructure. There are many challenges ahead, from a shortage of foundational data skills to concerns about the trustworthiness of data and worries about data sharing, all of which need urgent attention if we want to take advantage of AI’s opportunities.
Every organisation is a data organisation
In the 21st century, every organisation is a data organisation and needs to consider how it uses data and its role in its wider data ecosystems. As both the public and private sectors increasingly rely on data and new technologies – including AI – to drive efficiency and improve services and products, we all need more confidence in understanding the opportunities and limitations.
Disparities in access to data and information persist, creating a digital divide that hinders social progress and economic development. Around 30% of the UK population say they haven’t heard of any of the most prominent generative AI products, including ChatGPT. Those who are familiar with new generative AI tools aren’t always aware of how they should be used productively and responsibly.
A shortage of data skills
According to Peak’s AI Benchmarking report (2022), UK businesses lag significantly behind India and the US in data maturity, with fewer organisations in the UK using AI or having clear standards for data collection and processing. Lloyds Bank has found that ten million people in the UK lack the basic foundation digital skills for everyday life. According to The Industrial Strategy Council, ‘five million workers could become acutely under-skilled in basic digital skills by 2030’.
This lack of data skills could harm the UK’s competitiveness. Using AI tools as copilots for general office tasks, such as processing emails, writing documents or creating slides, requires understanding how these technologies work, the data they rely on, and their limitations. This is the only way workers can validate what AI generates, given that these tools are prone to making up facts.
The foundation stone of data literacy
The government recognised the importance of data skills ‘for a data-driven economy and data-rich lives’ by making them one of the four pillars of the National Data Strategy (NDS). However, the NDS points to ‘a fragmentation of leadership and a lack of depth in data skills at all levels’, which is preventing the development of ‘a mature data culture’. It also cites an overemphasis on the risks of misusing data, leading to ‘a chronic underuse of data and a woeful lack of understanding of its value’.
No wonder then that it says that ‘foundational data literacy will be required by all’. In its May 2021 response to the consultation on the NDS, the government highlighted that senior leaders – including politicians – need the data skills to ‘promote and champion’ data in their departments and ‘all civil servants and public sector workers should have a foundational level of data literacy’.
The Open Data Institute (ODI) defines data literacy as ‘the ability to think critically about data in different contexts and examine the impact of different approaches when collecting, using and sharing data and information.’ It goes beyond specialist roles like data analysts, scientists, engineers or ethicists to organisation-wide ones like data stewards, governance managers, and chief data officers.
Leaders must understand that they need to improve data literacy to help their organisations build effective data-focused business models and create good data governance processes and practices, which in turn will make those organisations more trusted with data. Non-technical workers must have access to tools and training that help them understand the links between data and AI, including how to use prompts effectively to get the best out of generative AI tools.
The opportunity of generative AI
Rather than replacing people, AI is an important and helpful tool for both skilled and inexperienced workers. AI models can be built and trained to develop accessible tools for everyone to find, publish and analyse data without having to learn to code. The opportunity is significant; for example, a study by GitHub showed that Copilot was of particular benefit to less experienced software developers. It also helped experienced workers: participants reported that it helped them stay in the flow (73%) and preserve mental effort during repetitive tasks (87%), and they completed tasks 56% faster than those without Copilot.
AI could improve productivity in industries that matter to the UK economy, such as services, by supporting less experienced or qualified workers. For example, call centre agents with access to an AI conversational assistant improved their productivity by 14% on average, with a 35% improvement for novice and low-skilled workers.
Understanding the limitations
Researchers at Harvard Business School found that while AI can provide real value, its unpredictable failure points and opacity about how best to use the tools mean that the value and risks of AI are unclear to many users and organisations. Large Language Models (LLMs) have well-documented drawbacks due to their early stage of development, from suffering hallucinations to a lack of accuracy.
We know that they are still liable to suggest fictional datasets, perform inaccurate analyses, and cite sources incorrectly (if they name any at all).
To get the best out of generative AI (GenAI) tools, people need to adapt and learn to work effectively with AI, using it as a companion tool rather than relying on its outputs without question. GenAI can even help create better education programmes, including data-related ones, helping its users to learn new AI skills that enable them to check its outputs and provide skilful prompts to improve its results.
Currently, most training is either on AI or data, but we also need training on data-centric AI to understand the specific data-related issues that affect some of the most popular models like ChatGPT.
The time to act is now
AI brings significant opportunities for efficiency, economic growth, and innovation. If we want to unlock the UK’s AI potential and secure our place as world leaders, we must place data at the heart of what we do. We urgently need to shift the AI narrative from an exclusive focus on model development to a wider understanding of data and the needs of the people using it. Ensuring everyone can access and use data effectively is critical for empowering people to make better decisions and create equitable outcomes for society.
There is much to do, and technology is moving at lightning speed. We need to work quickly to translate ideas into action before competitors overtake us.
Elena Simperl is a computer science professor from the Open Data Institute and King’s College London.