For years, we’ve delivered AI products to our 1 billion LinkedIn members and our many customers so they can become even more productive and successful at work. This long history of LinkedIn AI has been at the heart of how we serve up job recommendations, help people connect with peers, and provide the most relevant content in people’s feeds. We know it matters to everyone using LinkedIn that we use AI responsibly, and we hold ourselves accountable through our Responsible AI Principles. These commitments help us make the promise of AI real, all while treating members equally and ensuring that we do not amplify unfair biases. 

Measuring effects across demographics is critical to meeting these commitments. The rest of this blog outlines how we are testing this kind of measurement using U.S. data while staying true to our principled approach. 

Challenges and Opportunities of Measuring Algorithmic Bias

Measurement is the key step in identifying and addressing any unfair bias present in our AI systems. At LinkedIn, unfair bias can mean, for example, equally qualified members receiving recommendations for opportunities of unequal quality, or experiencing unequal error rates. We measure these biases across different demographic groups. Upholding the fairness and inclusion principle in our Responsible AI Principles is important both as a social responsibility and as a business responsibility. By having our algorithms treat our members equally based on the skills they share with us (a standard we refer to as “Equal Treatment” in the rest of this blog), we can continue to advance our work to provide opportunities to a broader base of skilled members. 
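To make this kind of aggregate comparison concrete, here is a minimal, illustrative sketch in Python. The metric, column names, and synthetic data are hypothetical stand-ins, not our production measurement pipeline.

```python
# Illustrative only: compare an aggregate recommendation-quality metric across
# demographic groups. Column names, the metric, and the data are hypothetical.
import pandas as pd

def equal_treatment_ratios(df: pd.DataFrame, group_col: str, outcome_col: str) -> pd.Series:
    """Return each group's mean outcome relative to the overall mean.

    Values near 1.0 for every group are consistent with Equal Treatment on this
    single metric; large deviations flag a gap worth investigating.
    """
    overall_mean = df[outcome_col].mean()
    return df.groupby(group_col)[outcome_col].mean() / overall_mean

# Synthetic example: 1 = a qualified member received a relevant recommendation.
records = pd.DataFrame({
    "group": ["A", "A", "B", "B", "B"],
    "relevant_recommendation": [1, 0, 1, 1, 0],
})
print(equal_treatment_ratios(records, "group", "relevant_recommendation"))
```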

This leads us to the central problem we address: monitoring our algorithms for Equal Treatment requires knowledge of our members’ demographic groupings, but our data on these groupings is limited. A few years back we introduced Self-ID, a feature that allows our U.S. members to opt in to providing demographic information about themselves. To remain true to our Self-ID commitment to members, namely that we would use their data to identify bias on our platform, further action is necessary.  

Self-ID data has been a valuable resource for some of our equity analyses. For example, it has enabled LinkedIn researchers to analyze U.S. demographic differences in senior job titles, online networking behaviors, and the post-pandemic jobs “reshuffle.” However, these analyses relied solely on Self-ID demographic data provided by a portion of the LinkedIn member base. This means we have enough data to run Equal Treatment analyses1, such as comparisons of the relative quality of job recommendations or networking suggestions for members based on binary gender, but not enough to replicate such analyses for race/ethnicity groups. 

Deriving a Solution to the Key Challenge

Given the limited availability of race and ethnicity information, and after due consideration of possible approaches, we are pursuing a measurement of Equal Treatment that uses member personal data responsibly, including by providing members with appropriate transparency and controls. As detailed below, the diligence we designed into this measurement system also reflects our recognition of the sensitivity and complexity of using race and ethnicity information, especially at an individual level.

Principles

This work is focused on the U.S. Our design principles can be summarized as follows:

  • Effectiveness: Race/ethnicity demographic information must be comprehensive and useful enough to enable Equal Treatment measurements with respect to race and ethnicity at the aggregate level.
  • Privacy: The measurement test must have privacy by design at its core, including:
    • Data minimization: We must use (or “process”) the least amount of personal data needed to achieve our objective and respect the sensitivities members may have around race/ethnicity information, including avoiding LinkedIn “assigning,” saving, or disclosing race/ethnicity information. 
    • No individual race/ethnicity assignment: We will avoid individual assignment of a single race or ethnicity category to members.
    • Strong Security: We will implement privacy and security protections to prevent unauthorized access (internal or external) to this measurement workflow (e.g., efforts to prevent reassociation of an estimate calculation with an identifiable person). 
    • Transparency: We will continue to be transparent about the personal data we are using for race/ethnicity measurement tests.
    • Member Control: We will provide appropriate controls to members. Even though a single race/ethnicity will not be assigned to any member, members can opt out of having their personal data used for the purpose of this race/ethnicity measurement test.

Summary of Our Approach

We designed the privacy-preserving probabilistic race/ethnicity estimation (PPRE) system to adhere to these principles. This enables us to make meaningful progress towards our commitment to monitor LinkedIn’s use of AI for unintended bias without compromising member privacy or control over their race/ethnicity identity. Our system combines the following techniques from responsible AI and privacy:

(a) A U.S.-Census-normed Bayesian model for “race/ethnicity”2, combined with member-reported Self-ID data, to provide comprehensive and useful race/ethnicity demographic information (illustrated in the sketch below), and 

(b) Privacy-enhancing technologies, including secure two-party computation and differential privacy.
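As a rough illustration of (a), the sketch below shows a BISG-style Bayesian combination of Census-derived probability tables into a per-member probability distribution over race/ethnicity categories. The evidence used (surname and geography), the categories, the numbers, and the conditional-independence assumption are illustrative assumptions rather than a description of LinkedIn’s actual model; member-reported Self-ID data, where available, can complement or replace such estimates.

```python
# Illustrative BISG-style estimator: combine Census-derived probability tables
# into a per-member probability distribution over race/ethnicity categories.
# All tables, categories, and numbers below are hypothetical placeholders.
from typing import Dict

def race_ethnicity_posterior(
    p_race_given_surname: Dict[str, float],  # e.g., from Census surname tables
    p_race_given_geo: Dict[str, float],      # e.g., from Census tract statistics
    p_race_national: Dict[str, float],       # national base rates
) -> Dict[str, float]:
    """Apply Bayes' rule, assuming surname and geography are conditionally
    independent given race/ethnicity:

        P(r | surname, geo) is proportional to P(r | surname) * P(r | geo) / P(r)

    The result is a probability distribution, never a single assigned label.
    """
    unnormalized = {
        r: p_race_given_surname[r] * p_race_given_geo[r] / p_race_national[r]
        for r in p_race_national
    }
    total = sum(unnormalized.values())
    return {r: v / total for r, v in unnormalized.items()}

# Toy example with three made-up categories and made-up probabilities.
print(race_ethnicity_posterior(
    p_race_given_surname={"Hispanic or Latino": 0.70, "White": 0.20, "Other": 0.10},
    p_race_given_geo={"Hispanic or Latino": 0.30, "White": 0.50, "Other": 0.20},
    p_race_national={"Hispanic or Latino": 0.19, "White": 0.59, "Other": 0.22},
))
```

Note that the output is always a distribution: consistent with the principles above, no single category is ever assigned to a member, and in PPRE these distributions are ephemeral inputs to aggregate measurements only.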

Our system achieves the following:

  1. No individual member is assigned a race/ethnicity: We calculate only probabilistic estimates that are ephemeral at the member level; during each measurement, they are computed on demand, never stored on disk, always encrypted by design, and then aggregated. No member-level estimate is used to alter how our algorithms treat that specific member.
  2. Secure exchange of data: The different providers of relevant data within LinkedIn (the tester and the test client) exchange only encrypted data that cannot be decrypted by the other party or by an eavesdropper.
  3. No new disclosures of a member’s sensitive personal data: The secure two-party computation protocol ensures that, at any point during or after the execution of this measurement, neither the measurement executor nor the measurement client gains knowledge of the tuple (member ID, race/ethnicity probabilities, test features). 
  4. Supporting deniability against targeted race/ethnicity identification: Differential privacy helps prevent internal teams that perform algorithmic Equal Treatment testing from recovering members’ race/ethnicity estimates or Self-ID values by looking at the output from multiple test flows. In addition, we mandate a governance approval process before embarking on any new measurements. 
  5. Retain only aggregate data: We save only aggregate information at the end of the test and delete all individual-level (even encrypted) data as soon as it has been processed. We use only anonymized, aggregated data to measure our algorithms for Equal Treatment by race/ethnicity (a minimal sketch of this differentially private, aggregate-only release follows this list).
  6. Member Control: We enable members to opt out of having their personal data used for measurement through their data privacy settings.
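As a rough illustration of items 4 and 5, the sketch below releases only Laplace-noised, probability-weighted group aggregates. The epsilon value, the metric, and the array shapes are assumptions made for illustration, and the real PPRE protocol additionally runs inside a secure two-party computation over encrypted data, which this plain-Python sketch does not reproduce.

```python
# Illustrative differentially private release of group-level aggregates.
# Assumes each member's row of group_probs sums to 1 and the per-member metric
# is clipped to [0, metric_bound]; per-member inputs are discarded after use.
import numpy as np

rng = np.random.default_rng()

def dp_group_means(metric, group_probs, epsilon=1.0, metric_bound=1.0):
    metric = np.clip(np.asarray(metric, dtype=float), 0.0, metric_bound)
    group_probs = np.asarray(group_probs, dtype=float)  # shape (n_members, n_groups)

    weighted_sums = group_probs.T @ metric  # per-group probability-weighted metric totals
    weights = group_probs.sum(axis=0)       # per-group effective member counts

    # Adding or removing one member changes the sums vector by at most
    # metric_bound in L1 norm and the weights vector by at most 1, so Laplace
    # noise with scale sensitivity / (epsilon / 2) on each released vector
    # gives epsilon-DP overall for this single measurement.
    noisy_sums = weighted_sums + rng.laplace(scale=metric_bound / (epsilon / 2),
                                             size=weighted_sums.shape)
    noisy_weights = weights + rng.laplace(scale=1.0 / (epsilon / 2),
                                          size=weights.shape)

    # Only these noisy aggregates are retained; everything member-level is deleted.
    return noisy_sums / np.maximum(noisy_weights, 1e-9)

# Toy example: 3 members, 2 groups; metric is 1 if the recommendation was relevant.
print(dp_group_means(
    metric=[1.0, 0.0, 1.0],
    group_probs=[[0.8, 0.2], [0.1, 0.9], [0.5, 0.5]],
    epsilon=1.0,
))
```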

Our investment in this technology will serve as the foundation to execute on our commitment to ensure that we are providing fair and equitable experiences that help all of our members and customers be more productive and successful. 

For those who wish to learn more, we will also be publishing a scientific paper that outlines more details of the techniques we have built. 

Acknowledgements

This work was the result of a fruitful and massive collaboration between the data privacy, responsible AI and legal teams. We would like to thank Daniel Olmedilla, Souvik Ghosh, Ya Xu for their leadership support and Daniel Tweed-Kent, Matthew Baird, Sam Gong, Joaquin Quinonero Candela, Adrian Rivera Cardoso, Leonna Spilman, Gorki De Los Santos, Katherine Vaiente, Igor Perisic, Imani Dunbar, Greg Snapper, and other reviewers for their feedback. Finally, we would like to express our deep gratitude to those members who have trusted us with their Self-ID data. This work would not have been possible without your contribution.

1 See last year’s paper from LinkedIn Responsible AI showing such analyses.

2 “Race and ethnicity” in this discussion refers to a standardized composite categorization that includes both race and ethnicity (e.g., “Hispanic or Latino”) categories. Without opining on whether it is the optimal categorization, we use it because it is the most widespread standard in the U.S. government as well as in academia and, as such, the categorization standard with the most available data for estimation models.