predictive analytics – Parental social licence for data linkage for service intervention

5th October 2022

Governments’ use of automated decision-making systems reflects systemic issues of injustice and inequality

By Joanna Redden, Associate Professor, Information and Media Studies, Western University, Canada

In 2019, former UN Special Rapporteur Philip Alston said he was worried we were “stumbling zombie-like into a digital welfare dystopia.” He had been researching how government agencies around the world were turning to automated decision-making systems (ADS) to cut costs, increase efficiency and target resources. ADS are technical systems designed to help or replace human decision-making using algorithms.

Alston was worried for good reason. Research shows that ADS can be used in ways that discriminate, exacerbate inequality, infringe upon rights, sort people into different social groups, wrongly limit access to services and intensify surveillance.

For example, families have been bankrupted and forced into crises after being falsely accused of benefit fraud.

Researchers have identified how facial recognition systems and risk assessment tools are more likely to wrongly identify people with darker skin tones and women. These systems have already led to wrongful arrests and misinformed sentencing decisions.

Often, people only learn that they have been affected by an ADS application when one of two things happens: after things go wrong, as was the case with the A-levels scandal in the United Kingdom; or when controversies are made public, as was the case with uses of facial recognition technology in Canada and the United States.

Automated problems

Greater transparency, responsibility, accountability and public involvement in the design and use of ADS are important to protect people’s rights and privacy. There are three main reasons for this:

these systems can cause a lot of harm;
they are being introduced faster than necessary protections can be implemented; and
there is a lack of opportunity for those affected to make democratic decisions about whether they should be used and, if so, how they should be used.

Our latest research project, Automating Public Services: Learning from Cancelled Systems, provides findings aimed at helping prevent harm and contribute to meaningful debate and action. The report provides the first comprehensive overview of systems being cancelled across western democracies.

Researching the factors and rationales leading to the cancellation of ADS helps us better understand their limits. In our report, we identified 61 ADS that were cancelled across Australia, Canada, Europe, New Zealand and the U.S. We present a detailed account of systems cancelled in the areas of fraud detection, child welfare and policing. Our findings demonstrate the importance of careful consideration and concern for equity.

Reasons for cancellation

A range of factors influence decisions to cancel uses of ADS. One of our most important findings is how often systems are cancelled because they are not as effective as expected. Another key finding is the significant role played by community mobilization and research, investigative reporting and legal action.

Our findings demonstrate there are competing understandings, visions and politics surrounding the use of ADS.

Table showing the factors influencing the decision to cancel an ADS. A range of factors influence decisions to cancel the uses of ADS. (Data Justice Lab), Author provided

Hopefully, our recommendations will lead to increased civic participation and improved oversight, accountability and harm prevention.

In the report, we point to widespread calls for governments to establish resourced ADS registers as a basic first step to greater transparency. Some countries, such as the U.K., have stated plans to do so, while others, like Canada, have yet to move in this direction.

Our findings demonstrate that the use of ADS can lead to greater inequality and systemic injustice. This reinforces the need to be alert to how the use of ADS can create differential systems of advantage and disadvantage.

Accountability and transparency

ADS need to be developed with care and responsibility by meaningfully engaging with affected communities. There can be harmful consequences when government agencies do not engage the public in discussions about the appropriate use of ADS before implementation.

This engagement should include the option for community members to decide areas where they do not want ADS to be used. Examples of good government practice can include taking the time to ensure independent expert reviews and impact assessments that focus on equality and human rights are carried out.

List of recommendations for governments using ADS. Governments can take several different approaches to implement ADS in a more accountable manner. (Data Justice Lab), Author provided

We recommend strengthening accountability for those wanting to implement ADS by requiring proof of accuracy, effectiveness and safety, as well as reviews of legality. At minimum, people should be able to find out if an ADS has used their data and, if necessary, have access to resources to challenge and redress wrong assessments.

There are a number of cases listed in our report where government agencies’ partnership with private companies to provide ADS services has presented problems. In one case, a government agency decided not to use a bail-setting system because the proprietary nature of the system meant that defendants and officials would not be able to understand why a decision was made, making an effective challenge impossible.

Government agencies need to have the resources and skills to thoroughly examine how they procure ADS systems.

A politics of care

All of these recommendations point to the importance of a politics of care. This requires those wanting to implement ADS to appreciate the complexities of people, communities and their rights.

Key questions need to be asked about how the uses of ADS lead to blind spots. These systems increase the distance between administrators and the people they are meant to serve, through scoring and sorting systems that oversimplify, infer guilt, wrongly target and stereotype people through categorizations and quantifications.

Good practice, in terms of a politics of care, involves taking the time to carefully consider the potential impacts of ADS before implementation and being responsive to criticism, ensuring ongoing oversight and review, and seeking independent and community review.

21st July 2022

Drawing parallels – the processing of data about children in education and social care

By Sarah Gorin, Ros Edwards and Val Gillies

During our research, we have been learning more about the ways that Government agencies such as health, social care and education collect, process and join up information about families. Schools, like other Government agencies, collect and process an increasing volume of information about children. Data is collected for administrative purposes, such as monitoring attendance, attainment, progress and performance; for safeguarding children; and to promote and support education and learning.

Information about children is captured not only by the school, for its own purposes and for purposes determined by the Government, but also by private educational technology (EdTech) companies, which gather data on children via their use of apps that may be free to download and recommended by teachers as promoting learning. These companies may sell on information for marketing or research purposes. Since the pandemic the use of EdTech has grown exponentially, meaning the data being gathered on children both through schools and by EdTech providers is greater still, raising the stakes in terms of the protection of children’s personal data.

A new report by The Digital Futures Commission (DFC) ‘Education Data Reality: The challenges for schools in managing children’s education data’ examines the views of professionals who work in or with schools on the procurement of, data protection for, or uses of digital technologies in schools. The report describes the range of EdTech used in schools and the complex issues that managing it presents.

In a blog about the report, the main author Sarah Turner highlights four key issues that constrain children’s best interests:

The benefits of EdTech and the data processed from children in schools are currently not discernible or in children’s best interests. Nor are they proportionate to the scope, scale and sensitivity of data currently processed from children in schools.
Schools have limited control or oversight over data processed from children through their uses of EdTech. The power imbalance between EdTech providers and schools is structured in the terms of the use they signed up to and exacerbated by external pressure to use some EdTech services.
There is a distinct lack of comprehensive guidance for schools on how to manage EdTech providers’ data practices. Nor is there a minimum standard for acceptable features, data practices and evidence-based benefits for schools to navigate the currently fragmented EdTech market and select appropriate EdTech that offers educational benefits proportionate to the data it processes.
Patchy access to and security of digital devices at school and home due to cost and resource barriers means that access to digital technologies to deliver and receive education remains inequitable.

The report is focused on the processing of education data about families; however, there are many interesting parallels with the findings from our project on the way data about families is collected, processed and used by local authorities:

Firstly, there is a lack of evidence about the benefits of the use of digital technologies in both schools and in local authorities and a lack of understanding about the risks to children’s data privacy.
There is a lack of government guidance for schools, as there is for local authorities, about the digital technologies that they employ, meaning that organisations are left individually responsible for ensuring that they are compliant with the General Data Protection Regulation (GDPR).
Schools, like local authorities, are time, resource and expertise poor. Often neither has the data protection expertise to understand and consider the risks versus the benefits of data processing for children’s best interests.
There is a lack of transparency in how data is collected, handled and processed by Government agencies as well as third parties who gain access to data about families, either through children using their apps for educational purposes or through local authorities employing them for the development of predictive analytics systems.
Public awareness and understanding of how data is collected and processed are low, and the risks of data sharing to children’s privacy are not well understood by parents and children.

We welcome this new report by the Digital Futures Commission and hope that it stimulates more discussion and awareness amongst professionals and families.

8th October 2021 (updated 1st November 2021)

Using Artificial Intelligence in public services – does it breach people’s privacy?

By Ros Edwards, Sarah Gorin and Val Gillies

As part of our research, we recently asked parents what they thought about the use of data linkage and predictive analytics to identify families to target public services.

They told us that they didn’t trust these processes. This was particularly the case among marginalised social groups. In other words, the groups of parents who are most likely to be the focus of these AI identification practices are least likely to see them as legitimate. Now a new report by the United Nations High Commissioner for Human Rights, Michelle Bachelet, highlights major concerns about the impact of artificial intelligence, including profiling, automated decision-making and machine-learning, upon individuals’ right to privacy.

The report makes a number of recommendations, including a moratorium on the use of AI systems that pose a serious risk to human rights and the banning of social scoring of individuals by Governments or AI systems that categorise individuals into groups on discriminatory grounds.

The right to privacy in the digital age: report (2021) builds on two previous reports by the High Commissioner looking at the right to privacy in the digital age and incorporates views of international experts at a virtual seminar, as well as responses to the High Commissioner’s call for input into the report from member states, including the U.K.

It examines the impact of digital systems such as artificial intelligence in four sectors, including public services. Artificial intelligence is used in public services such as social care, health, police, social security and education in a range of ways, such as decision-making about welfare benefits and flagging families for visits by children’s social care services.

Concerns are expressed about the linking together of, for example, large health, education and social care data sets with other data held by private companies, such as social media companies or data brokers who, the report says, may gather information outside protective legal frameworks. The involvement of private companies in the construction, development and management of public sector data systems also means they can gain access to data sets containing information about large parts of the population.

There are additional concerns about the potential inaccuracy of historic data and the implications of that for future decision-making. The report states that these systems unequally “expose, survey and punish welfare beneficiaries” and that conditions are imposed on individuals that can undermine their autonomy and choice.

A court in the Netherlands banned a digital welfare fraud detection system, ruling that it infringed individuals’ right to privacy. The system provided central and local authorities with the power to share and analyse data that were previously kept separately, including on employment, housing, education, benefits and health insurance, as well as other forms of identifiable data. The tool targeted low-income and minority neighbourhoods, leading to de facto discrimination based on socioeconomic background.

The recommendations in the report include:

using a human rights based approach
ensuring legislation and regulation are in line with the risk to human rights, with sectors including social protection to be prioritised
development of sector specific regulation requirements
drastic improvements to efforts regarding transparency, including use of registers for AI that contain key information about AI tools and their use, informing affected individuals when decisions are being or have been made automatically or with the help of automation tools, and notifying individuals when the personal data they provide will become part of a data set used by an AI system.

Given the concerns about the risks that data linkage and predictive analytics pose to the human rights of individuals and families, it is vital to pay heed to the UN High Commissioner’s call for a moratorium. Public authorities need to pay meaningful attention to the lack of social legitimacy for AI, as evidenced in our research, and to ask themselves if the risk of further distrust and disengagement from already marginalised social groups, and consequences for a cohesive and equal society, is worth it.

26th February 2021

Running focus groups with parents in a Covid-19 setting – how will we do it?

In this second project blog, the research team reflect on how Covid-19, and the restrictions it has placed on all our lives, have led to methodological, ethical and practical challenges in working with focus groups on parental buy-in for linking and analysing data about families. They outline the challenges they face and how they’re adapting their approach.

For the next stage of our project, we’re conducting focus groups to explore how particular social groups of parents understand and talk about their perspectives on data linkage and predictive analytics. Back in early 2020, we were optimistic about the possibility of being able to conduct these groups face-to-face by the time we reached this stage of our research. Now though, it’s clear we’ll need to move online, and we’ve been thinking about the issues we’ll face and how to deal with them.

Questions we’re grappling with include:

What might moving online mean for how we recruit participants?
How can we best organise groups and engage parents with the project?
How can we develop content for online groups that will, firstly, encourage parents to contribute and enjoy the research process and, secondly, be relevant to our research endeavour?

What will moving online mean for recruiting participants?

Our intention was, and still is, to hold focus group discussions with homogeneous groups of parents, to explore the consensus of views on what is and isn’t acceptable (social licence) in joining together and using parents’ administrative records.

We’re using the findings from our earlier probability-based survey of parents to identify social groups of parents whose views stand out. These include home-owning parents in professional and managerial occupations, who have stronger social licence, and mothers on low incomes, Black parents, and lone parents and parents in larger families living in rented accommodation, who tend to have weak or no social licence.

Our original plan was to recruit participants for our focus groups by contacting local community and interest groups, neighbourhood networks, services such as health centres and schools, workplaces and professional associations. We still plan to do this, but we’re concerned that the pandemic is placing huge pressures on community groups, services for families and businesses. We need to be prepared for the possibility that helping us to identify parents to participate in research may not be a priority or, as with schools, appropriate.

So we’ve also been considering recruitment through online routes, such as advertising on relevant Facebook groups, using Twitter and putting advertisements on websites likely to be accessed by parents. It’ll be interesting to see if these general reach-outs get us anywhere.

An important aspect of recruitment to our study is how to include marginalised parents. This can be a conundrum whether research is face-to-face or online. Face-to-face we would have spent quite a bit of time establishing trust in person, which is not feasible now. Finding ways to reach out and convince these parents to participate is going to be an additional challenge. Our ideas for trying to engage these parents include the use of advertising via foodbanks, neighbourhood support networks and housing organisations.

And there’s the additional problem for online methods, revealed in inequalities of online schooling, of parents who have limited or no online access. Further, Covid-19 is affecting parents living in poverty especially and we don’t want to add to any stress they’re likely to be under.

Enticing affluent parents working in professional and managerial occupations to participate may also be difficult under the current circumstances. They may be juggling full-time jobs and (currently) home schooling and feeling under pressure. Either way, marginalised or affluent, we think we’ll need to be flexible, offering group times in evenings and at weekends for example.

How should we change the way we organise groups and engage parents with the project?

We know from reading the literature that online groups can face higher drop-out rates than face-to-face. Will the pandemic and its potential effect on parents’ physical and mental health mean that we face even higher drop-out rates? One strategy we hope will help is establishing personal links, through contacting participants and chatting to them informally before the focus group takes place.

We’ve been mulling over using groups containing people who know each other, for example if they’re members of a community group or accessed through a workplace, and groups that bring together participants who are unknown to each other. Because we’re feeling a bit unsure about recruitment and organisation, we’ve decided to go down both routes as and when opportunities present themselves. We’ll need to be aware of this as an issue when we come to do the analysis though.

We’re also thinking of organising more groups and having fewer participants in each group than we would have done face-to-face (after all, we’re not going to be confined by our original travel and venue hire budget). Even in our online research team meetings we can cut across and interrupt each other, and discussion doesn’t flow in quite the same way. Reading participants’ body language and non-verbal cues in an online focus group is going to be more difficult. Smaller numbers in the group may help a bit, but it can still be difficult to see everyone if, for example, someone is using a mobile phone. We’ll just have to see how this goes and how best to handle it.

There’s also a dilemma about how many of the project team to involve in the focus groups. We’ll need to have a team member to facilitate the group, but previous research shows it might be useful to have at least one other to monitor the chat and sort out any technical issues. But with a group as small as 4-6 participants, will that seem off-putting for parents? It’s all hard to know, so it may be a case of trying it in order to find out!

What should we consider in developing content that’s engaging for parents and relevant to our research?

What we’ll miss by holding our group discussions online is the settling in and chatting and putting us and our participants at ease – how are you, would you like a drink, there’s some biscuits if you want, let me introduce you to … and so on. We don’t think that we can replicate this easily.

But we’ve been pondering our opening icebreaker – should we ask something like….

‘If you could be anywhere else in the world where would you be?’
or

‘What would be the one thing you’d pack in a lockdown survival kit?’

And we’re also planning to use a couple of initial questions that use the online poll function. Here’s an instance where we think there’s an advantage over in-person groups, because participants can vote in the poll anonymously.

After that, we’ll be attempting to open up the discussion to focus on the issues that are at the heart of our research – what our participants feel is acceptable and what’s not in various scenarios about the uses of data linkage and predictive analytics.

Ensuring the well-being of parents after focus groups is always important, but with online groups it may be harder if the participants are not identified through community groups in which there’s already access to support. We plan to contact people after groups via email, but it’s hard to know if parents would let us know even if groups presented issues for them. We have also given some thought to whether we could use online noticeboards for participants to post any further comments they may have about social licence after they’ve had time to reflect, but we do not know realistically if they would be used.

It’ll be interesting to see if the concerns we’ve discussed here are borne out in practice, and whether our hoped-for means of addressing them work. And also, what sort of challenges arise for our online focus group discussions that we haven’t thought of in advance!

If you have any ideas that might help us with our focus groups, please do get in touch with us via datalinkingproject@gmail.com

18th November 2020

A murky picture – who uses data linkage and predictive analytics to intervene in families’ lives?

In the first of a series of blogs discussing key issues and challenges that arise from our project, Dr Sarah Gorin discusses the problems encountered by our team as we try to find out which local authorities in the UK are using data linkage and predictive analytics to help them make decisions about whether to intervene in the lives of families.

As background context to our project, it seemed important to establish how many local authorities are using data linkage and predictive analytics with personal data about families and in what ways. To us this seemed a straightforward question, and yet it has been surprisingly hard to gain an accurate picture. Six months into our project and we are still struggling to find out.

In order to get some answers, we have been reaching out to other interested parties and have had numerous people get in touch with us too: from academic research centres, local authorities and independent foundation research bodies to government-initiated research and evaluation centres. Even government-linked initiatives are finding this difficult, not just us academic outsiders!

So what are the issues that have been making this so difficult for us and others?

No centralised system of recording

One of the biggest problems is finding information. There is currently no centralised way that local authorities routinely record their use of personal data about families for data linkage or predictive analytics. In 2018, the Guardian highlighted the development of the use of predictive analytics in child safeguarding and the associated concerns about ethics and data privacy. They wrote:

“There is no national oversight of predictive analytics systems by central government, resulting in vastly different approaches to transparency by different authorities.”

This means that it is very difficult for anyone to find out relevant information about what is being done in their own or other local authorities. Not only does this have ethical implications in terms of the transparency, openness and accountability of local authorities but also, more importantly, it means that families who experience interventions by services are unlikely to know how their data has been handled and what criteria have been used to identify them.

Several European cities are trialling the use of a public register for mandatory reporting of the use of algorithmic decision-making systems. The best way to take this forward is being discussed here and in other countries.

Pace of change

Another issue is the pace of change. Searching the internet for information about which local authorities are linking families’ personal data and using it for predictive analytics is complicated by the lack of one common language to describe the issues. A myriad of terms are being used and they change over time…‘data linkage’; ‘data warehousing’; ‘risk or predictive analytics’; ‘artificial intelligence’ (AI); ‘machine learning’; ‘predictive algorithms’; ‘algorithmic or automated decision-making’ to name but a few.

The speed of change also means that whilst some local authorities who were developing systems several years ago may have cancelled or paused their use of predictive analytics, others may have started to develop it.

The Cardiff University Data Justice Lab in partnership with the Carnegie UK Trust are undertaking a project to map where and why government departments and agencies in Europe, Australia, Canada, New Zealand and the United States have decided to pause or cancel their use of algorithmic and automated decision support systems.

General Data Protection Regulation (GDPR)

GDPR and the variation in the way in which it is being interpreted may be another significant problem that is preventing us getting to grips with what is going on. Under GDPR, individuals have the right to be informed about:

the collection and use of their personal data
information including the purposes for processing personal data
retention periods for data held
and with whom personal data will be shared

As part of their responsibilities under GDPR, local authorities should publish a privacy notice which includes the lawful basis for processing data as well as the purposes of the processing. However, the way that local authorities interpret this seems to vary, as does the quality, amount of detail given and level of transparency of information on privacy notices. Local authorities may only provide general statements about the deployment of predictive analytics and can lack transparency about exactly what data is being used and for what purpose.

Lack of transparency

This lack of transparency has been identified in a review by the Committee on Standards in Public Life, which published a report on Artificial Intelligence and Public Standards in February 2020. The report highlighted that Government and public sector organisations are failing to be sufficiently open. It stated:

“Evidence submitted to this review suggests that at present the government and public bodies are not sufficiently transparent about their use of AI. Many contributors, including a number of academics, civil society groups and public officials said that it was too difficult to find out where the government is currently using AI. Even those working closely with the UK government on the development of AI policy, including staff at the Alan Turing Institute and the Centre for Data Ethics and Innovation, expressed frustration at their inability to find out which government departments were using these systems and how.” (p.18)

Whilst some local authorities seem less than forthcoming in divulging information, this is not the case for all. For example, in Essex, a Centre for Data Analytics has been formed as a partnership between Essex County Council, Essex Police and the University of Essex. They have developed a website and associated media that provides information about the predictive analytics projects they are undertaking using families’ data from a range of partners including the police and health services.

So what are we doing?

As part of our project on parental social licence for data linkage and analytics, our team are undertaking a process of gathering information through internet searching and snowballing to put together as much information as we can find and will continue to do so throughout the course of the project. So far, the most useful sources of information have included:

the Cardiff University Data Justice Lab report that examines the uses of data analytics in public services in the UK, through both Freedom of Information requests to all local authorities and interviews/workshops with stakeholders
the WhatDoTheyKnow website which allows you to search previous FOI requests
internet searches for relevant local authority documents, such as commissioning plans, community safety strategies and Local Government Association Digital Transformation Strategy reports
media reports
individual local authority and project websites

It would seem we have some way to go yet, but it is a work in progress!

If you are interested in this area, we’d be pleased to hear about your experiences. If you’d like to contribute a blog on this or a related topic, do get in touch via our email datalinkingproject@gmail.com