Privacy Violations Using Our Faces to Train AI

Theodore F. Claypoole


Is Your Privacy Violated by Using Your Face to Train AI to Recognize Faces?

by: Theodore F. Claypoole of Womble Bond Dickinson (US) LLP - HeyDataData

Tuesday, July 21, 2020


If a picture of your face is used for a purpose that doesn’t identify you, is your privacy violated?

If the publicly available picture was used just to show a face, distinguished from some faces, similar to others, and fed into a computer so that the computer could learn various attributes of a human face, does this affect your privacy?

We may soon find an answer, at least as one judge interprets Illinois law.

A lawsuit was filed against Amazon, Alphabet and Microsoft alleging misuse of an Illinois resident’s picture under the Biometric Information Privacy Act (BIPA) when the companies used the picture, contained in a huge database of human face pictures, to train and/or test a machine learning program to distinguish features among faces. The complaint arose from use of a database that these companies didn’t build.

According to a story in CNET, “The photos in question were part of IBM’s Diversity in Faces database, which is designed to advance the study of fairness and accuracy in facial recognition by looking at more than just skin tone, age and gender. The data includes 1 million images of human faces, annotated with tags such as face symmetry, nose length and forehead height.” So the function of the facial information had nothing to do with identification of the people in the pictures, and the pictures showed only publicly available visages. Yet the plaintiffs claim that “ongoing privacy risks” to pictured individuals “would injure those residents and citizens within Illinois.”

I can understand why privacy is at stake when your picture is compared to others for identification purposes in an AI facial recognition database. I just wrote last month about those risks here, and here, and again here. But training or testing the same machine learning tool is a different story. The picture goes in, it is compared and contrasted with others, and the machine moves on. How does this threaten the ongoing privacy of Illinois residents?

Many people I talk to assume that there is a long-determined set of rules about how private companies (or even government entities) can use anybody’s name and likeness. There isn’t.

Rights of privacy in the U.S. are relatively new and generally do not protect “public information” like a name or face. Rights of publicity are statutory in some states, court-driven in others, and barely exist in many – and when they do exist they often only protect people in the public eye.

Relatively new biometric privacy laws in Texas, Washington and Illinois are poised to prevent fingerprints and retinal scans from being taken, stored and/or used by private companies without your permission. But pictures of your face? Are friends and relatives who post your picture from their wedding violating the law? No. They aren’t. And the wedding picture may actually identify you.

So how would Amazon be violating the law by using that same picture to train a machine learning system to distinguish between facial characteristics without even identifying you to the system? Especially if Amazon is just using the picture of a face in the same way it might use a picture of a maple leaf to learn what shapes a tree leaf may exhibit.

BIPA is clear in some respects. It requires private entities to obtain a person’s permission to collect, disclose, disseminate or profit from that person’s “biometric identifier” or “biometric information.” Neither biometric identifiers nor biometric information includes pictures. Further, BIPA makes clear that its proscriptions tie the use of biometric measurements to identification of the person described by the measurements. Amazon, Alphabet and Microsoft are not accused of using the pictures for identification, only for training AI.

While it is true that training AI is fraught with unexplored legal and business risks, separating a picture from an identification of the pictured person seems to eliminate most of the privacy risks, especially when that picture will just be shown to a computer program for measurements and comparisons with other pictures. No one is running an advertising campaign using these pictures. The process result is generalized knowledge, not specific identification.

In addition, the plaintiffs in the BIPA suit request that the defendant companies destroy any relevant facial data that has been saved. Exactly how is this supposed to happen? Assuming the subject pictures were used, along with tens of millions of other pictures, to train facial recognition AI, how do you delete the knowledge of one face from the multiple millions and remove the specific knowledge distinctions gained from these face samples? I suppose you could remove the pictures from the original database used to train the AIs, but the defendant companies don’t own the database, and pictures are not covered under BIPA. So what does this demand even mean in the context of training machine learning systems?

I am certain the plaintiffs’ lawyers in this case were driven (or at least encouraged) by Facebook’s $550,000,000 settlement of a BIPA class action this January following a U.S. 9th Circuit decision that under BIPA, “development of face template using facial-recognition technology without consent (as alleged here) invades an individual’s private affairs and concrete interests.” But the Facebook case involved a company actually using facial recognition AI to identify individuals, or distinguish them from others, using their faces and, necessarily, the face measurement templates. As stated above, the defendants in the AI training case are not accused of doing so.

So will courts within the same circuit find that every one of millions upon millions of people depicted in the Diversity in Faces database has a privacy interest threatened because their pictures were used to train commercial databases? Clearly, face templates were used, not in “using facial-recognition technology,” but to train the same technology to recognize faces at a later time, in different circumstances. How much privacy is at stake here? And will the defendant companies settle the matter, or have the stomach to stick it out and make an important distinction in the law?