Grid of 70 averaged faces for different face attribute categories, with overlaid heatmaps indicating significant regions.

Working with Faces

A journey into the art of face analysis and classification.

Kyle McDonald


Congrats on becoming a Faceworker! Our AI finds the perfect face for every job. Audition for each job by showing us you can make your face fit what the job needs. Ready to try out for your first job?

Facework is a game that imagines a world where face analysis is key to the latest gig economy app. As a Faceworker, the player is given an opportunity to interrogate in realtime how computer vision and machine learning tools work — to playfully grow an intuition for what it means to see like a machine, and to understand how machines can fail. The first audition challenge is to smile, the second asks the player to wear glasses, or maybe to look outdoorsy. As they choose their path, an unexpected interruption takes the game in a new direction, accompanied by increasingly uncomfortable criteria for job suitability.

Facework is built on a decade-long exploration of all kinds of automated face analysis — and is inspired by many other artists and researchers. In this article I’d like to share a few of those inspirations with you.

Row of screens in a gallery showing one woman per screen with a blue background and green status bar of varying height.
“Cheese” (2003) by Christian Moeller

“Cheese” and Smiles

My introduction to face attribute classification came from Christian Moeller’s Cheese (2003), where a sequence of actresses try to hold a smile as long as they can while being judged by a computer on the quality of their smile. Christian writes “The performance of sincerity is hard work.”

Face attribute classification stands in contrast to the more familiar technologies of face detection (drawing a box around a face), face landmark detection (finding key points like eyes, ears, or lips), and face recognition (matching a name to a face, but sometimes used as a catch-all term for all face analysis).

Identifying whether someone is smiling or not is treated as one “attribute”: a class, label, trait, or description of a face. A single, irreducible dimension, a binary: SMILING. But no attribute exists in isolation. And many, like facial expressions, have complex histories. Modern face expression classification has its origins in the late 60s, when American psychologist Paul Ekman developed the Facial Action Coding System. He broke down expressions into their constituent muscle movements, and tied them to emotions based on his personal intuition and a disputed hypothesis that a core set of emotions are culturally invariant.

Facework starts with SMILING as the first job because it is something most people can perform easily. But it’s also one of the first face attributes I personally worked with. My earliest work on face analysis was building Portrait Machine (2009) with Theo Watson. We analyzed everything we could: hair color, skin color, clothing color, face size, head tilt, sunglasses, etc. It was a very manual process full of heuristics, like color averages and carefully set thresholds. We had to write new code for every new attribute. Something like a smile seemed impenetrable.

But with new machine learning techniques it is possible to make a prediction based on “examples” (training data) instead of “explanations” (hard-coded heuristics). In 2010 Theo found some machine learning-based smile detection code from the Machine Perception Laboratory at UCSD (the same group who worked with Christian on Cheese). Theo built an app that inserts a smile emoji whenever you smile. I extended Theo’s work to share a screenshot whenever you smile.

Detecting smiles can feel innocuous and playful. But as Cheese shows: try holding that smile for more than a few moments, and it’s apparent that something is very wrong. Why are we asking a computer to judge us in the first place? And who is putting that judgement to use? Does the meaning of an expression change when we’re being judged by a machine instead of another human?

Six adjacent monitors mounted in portrait on a gallery wall, with one photo on each and a circle highlighting a face.
Vibe Check (2020) by Lauren McCarthy and Kyle McDonald.

“us+” and “Vibe Check”

In 2013 I worked with Lauren McCarthy to build us+, a video chat plugin that analyzes speech and facial expressions and gives feedback to “improve” conversations. How might the future feel if every expression, every utterance, was automatically analyzed? Can it be affirming or helpful to get automated advice, or is this future nothing but a dystopia to avoid? Are there some things that only humans should ever judge, and machines should always avoid?

In 2020 we created Vibe Check, a face recognition and expression classification system to catalog the emotional effect that exhibition visitors have on one another. Some visitors are identified as consistently evoking expressions of happiness, disgust, sadness, surprise, or boredom in others nearby. Upon entering the exhibition, visitors are alerted to who these people are, and as they leave, they may find they’ve earned this distinction themselves and found a place on the leaderboard.

With us+ and Vibe Check we are not expressing broad pessimism in response to this tech. Maybe we can find some place for it, if we build it consensually and reflect on it together? Maybe there is someone out there who could use the feedback from a machine instead of a human, or someone who needs help identifying face expressions? We try to create a space for that reflection.

Without reflection, we get companies like HireVue, who sell software that analyzes speech and facial expressions to estimate job interview quality, leading to complaints to the Federal Trade Commission.

Trevor Paglen, a bald white man, looking concerned in a webcam, with a green box around his face labeled “skinhead”.
Trevor Paglen testing ImageNet Roulette (2019).

“ImageNet Roulette”

In late 2019 Microsoft researcher Kate Crawford and artist Trevor Paglen published Excavating AI. Their essay examines the impossibility of classification itself: typically framed as a technical problem, they break it down to its political and social foundations.

This essay was preceded by “an experiment in classification” called ImageNet Roulette, developed by Leif Ryge for Trevor’s studio: upload a photo to the website, and it returns a copy with a box around you and a label. They write:

The ImageNet dataset is typically used for object recognition. But […] we were interested to see what would happen if we trained an AI model exclusively on its “person” categories. […] ImageNet contains a number of problematic, offensive, and bizarre categories. Hence, the results ImageNet Roulette returns often draw upon those categories. That is by design: we want to shed light on what happens when technical systems are trained using problematic training data.

In some ways this artistic “what would happen if” process mirrors the way that some face classification research is carried out:

Most or all of this work hopes to create a teachable moment: Kate and Trevor wanted to show the dangers of “problematic training data,” and classification more broadly, Michal and Yulin wanted to “expose a threat to the privacy and safety of gay men and women,” Xiaolin and Xi wrote a defense of their work saying they believe in the “importance of policing AI research for the general good of the society.”

When does the opportunity for discussion for some come at the expense of retraumatizing or harming others (“The viral selfie app ImageNet Roulette seemed fun — until it called me a racist slur”, “Excavating ‘Excavating AI’: The Elephant in the Gallery”)? When commenting on the potential dangers of technology, when is the comment outweighed by the potential for the work to be misused or misinterpreted?

Shu Lea Cheang pushes back differently in her work 3x3x6 (2019) where visitors to the installation send selfies to be “transformed by a computational system designed to trans-gender and trans-racialize facial data.” Instead of simply handing visitors a reflection of the machine’s gaze, the machine is actively repurposed to blur the categories into a queered heterogenous mass. Paul Preciado connects Shu Lea’s work back to Michal and Yulin’s research:

But if machine vision can guess sexual orientation it is not because sexual identity is a natural feature to be read. It is because the machine works with the same visual and epistemological regime that constructs the differences between heterosexuality and homosexuality: We are neither homosexual nor heterosexual but our visual epistemologies are; we are neither white nor black but we are teaching our machines the language of technopatriarchal binarism and racism.


Most of the jobs in Facework are trained using a dataset called Labeled Faces in the Wild Attributes+ by Ziwei Liu, et al (with the exception of WEARING MASK, POLICE and CEO). LFWA+ is not as well known as the larger CelebFaces Attributes Dataset (CelebA) though both were released with the paper Deep Learning Face Attributes in the Wild (2015). While CelebA consists of 200k images of celebrities with 40 attributes per image, LFWA+ is a smaller 18k images with 73 attributes each. Both include BIG LIPS, BUSHY EYEBROWS, and DOUBLE CHIN, and others. LFWA+ adds four racial groups, SUNGLASSES, CURLY HAIR, and more.

The logic or ontology of these categories is unclear, and the provenance of these labels is hard to trace. The accompanying paper only mentions briefly how the LFWA+ data was created. Based on my sleuthing, it seems to be related to an earlier dataset called FaceTracer. My impression is that the researchers were just trying to plug in anything they could get their hands on. One lesson seems to be: if the data exists, someone will find a use for it — appropriate or not. Some researchers critiqued ImageNet Roulette as a completely inappropriate use of the ImageNet labels, saying that no one has ever trained a classifier on the “person” category. But appropriating data regardless of origin or original intent seems to be pretty standard in machine learning. What classifier will be trained next? Here are some other labels I have seen describing people in datasets:


Mushon Zer-Aviv tackles this final category of “Normal” in the The Normalizing Machine (2019) where visitors are asked to point at a previous visitor’s portrait that looks “most normal.” Mary Flanagan address the absurdity of these labels in [help me know the truth] (2016) where visitors are asked to pick between two random variations of their own selfie that exemplify a ridiculous label like “Which is a banana?” or “Which is a martyr?”

In response to these and other labels I think of critic and writer Nora Khan, who encourages us to ask the same questions about “naming” that we ask in the arts:

Is what I’m seeing justifiably named this way? What frame has it been given? Who decided on this frame? What reasons do they have to frame it this way? Is their frame valid, and why? What assumptions about this subject are they relying upon? What interest does this naming serve?

In order to understand this research better, I try to replicate it. It’s easy to critique face analysis systems as fundamentally flawed for collecting images non-consensually, or for trying to fit the boundless potpourri of human expression into Paul Ekman’s seven core categories. But when I train these systems I get to discover all the other flaws: the messy details that the researchers don’t write about.¹

Replicating research gives me insight that I can’t get from using a toolkit, reading a paper, or studying historical precedents and theory. It helps put me in the mindset of a researcher. I have to solve some of the same problems, sit with an assumption of the validity of the categories and data in question for an extended duration. I get to let the data stare back at me. Every dataset has a first row, and there are always one or two faces I see over and over while debugging code.

“Who Goes There”

Some face attributes seem mundane: BANGS, SHINY SKIN, TEETH NOT VISIBLE. Others are a matter of life and death: like research on classifying Uyghur people which is sold by surveillance companies for use in western China where one million Uyghur people are sent for “re-education” each year.

I think this is one reason that some of the most detailed work on predicting race is framed as a completely different problem: “photo geo-localization.” In Who Goes There (2016) by Zach Bessinger, et al. they build on previous work GeoFaces (2015) by Nathan Jacobs, et al. to predict the geographic location of a photo based on the faces in that photo. This is a common sleight of hand: researchers rarely have the data they want, so they find a proxy. If they can’t find a large collection of photos with self-identified racial identities for each face, they use the GPS tags from 2 million Flickr photos.

In replicating Zach’s work I discovered how this sleight of hand can have subtle unexpected outcomes. After achieving similar accuracy to the original model, I went through the data to find examples of photos that looked “most representative” of each of the geographic subregions. I was curious who the model predicted to be “the most North American” or “the most East Asian,” and if this matched my personal biases or any other stereotypes. This is how I found JB.

Grid of 60 small faces of the one person in different lighting: light skinned with Caucasian features, mustache and beard.
Self portraits by Jean-Baptiste Dodane, CC-Attribution from Flickr, with permission.

For most regions of Africa, the model picked a variety of people with darker skin as being most representative. But in Central Africa, a massive region including around 170M people, it picked dozens of photos of only one guy with much lighter skin. I thought it was a bug at first, but I couldn’t find the mistake in my code. So I traced the faces back to Flickr and found him.

I’m JB and I completed a 26600 KM trip across Africa via the West coast, from Zürich to Cape Town, between 2012 and 2014. This website relates this wonderful and strenuous experience in geotagged posts, thousands of photos

To the classifier, JB’s face was so consistent and useful for prediction that it sort of fell in love and prioritized him above anyone else in this area. No one else was as similar looking to other folks in the region as JB looks similar to himself. If we zoom in to the researcher’s map we can actually find JB around Gabon.

Even after making a small change to only allow one example per face, I saw new bugs: hundreds of people photographing the same celebrity, or statue, or artwork.

Discovering JB helped me think about how we use proxies in research and daily life. In Who Goes There, appearance is a proxy for geolocation (or geolocation is a proxy for race). In everyday interaction, expression is a proxy for emotion. There is a place for some proxies: they can be an opportunity to communicate, to understand, to find your people. But they can also be a mechanism for prejudice and bias: gender-normative beliefs might treat clothing as a proxy for gender identity, or racist beliefs might treat skin color as a proxy for racial identity. In the end there is no substitute for someone telling you who they are. And no machine learning system can predict something that can only be intentionally communicated.

Face Apps

Facework is massively inspired by face filters and face-responsive apps. This includes popular face swap apps like Face Swap Live and Snapchat from 2015, which can be traced to work I made with Arturo Castro in 2011 (and to earlier experiments by Parag Mital). But Facework is more directly connected to face filters that change your age and gender, your “hotness,” or your race. The same datasets and algorithms that drive face attribute classification also drive face attribute modification.

So it makes sense that when researchers collect expressions, accessories, age, gender, and race under one generic umbrella as “face attributes,” that this uniform understanding of facial appearance is transferred to face filter apps. When the dataset treats SMILING, OLD, and ASIAN the same, face filter apps also treat them the same. FaceApp might filter someone’s face to make them look like they are smiling, or older, and not think twice about rolling out race-swapping filters

This is connected to a more general dehumanizing side effect of face analysis. Édouard Glissant writes that “understanding” someone is often based on making comparisons: either to an “ideal scale” or an analysis of differences without hierarchy. He calls this “transparency,” where each person is rendered “transparent” to the observer. The observer “sees through” the person to a system of classification and comparison. Édouard asks for “opacity”: to avoid reduction and understand each person as “an irreducible singularity.” But transparency is the essence of automated face analysis, and dehumanization is a natural consequence.

Dehumanization shows up in the terminology. The same way prison guards refer to prisoners as “bodies”, cops refer to suspects in their surveillance systems as “objects”, and popular face analysis libraries call aligned face photos “face chips.” The images no longer represent people, but are treated as more similar to color chips.

Dehumanization also manifests in privacy practices. The most popular face filter apps have been caught harvesting images from users, like Meitu in 2017 or FaceApp in 2019. Using face-responsive interaction as bait for collecting images goes back to nearly the beginning of automated face analysis itself: in 1970 Toshiyuki Sakai collected hundreds of faces from an installation showing visitors which celebrity they looked like, Computer Physiognomy at the Osaka Expo. When each face is seen as a replaceable and interchangeable variation on every other face, and large amounts of data are required to build and verify these systems, it makes perfect sense to surreptitiously build massive collections of faces. When the United States Customs and Border Patrol loses 184,000 images of travelers, we may find out about it. But it is unclear whether any face filter apps have ever had a similar breach, or whether Meitu’s data has been turned over to the Chinese government.

Large grid of faces on gallery wall, two people select small face chips from buckets on gallery wall.
“Face to Facebook” (2011) by Alessandro Ludovico and Paolo Cirio

“Obscurity” and “Capture”

Alessandro Ludovico and Paolo Cirio’s Face To Facebook (2011) picks up on these dynamics around the dehumanization of both analysis and collection, and how this intersects with issues of privacy and surveillance. Alessandro and Paolo created a fake dating website populated with the profiles of 250k people scraped from Facebook. The aesthetics of his installation reify these processes: visitors to the installation are confronted with a massive grid of literal “face chips”, and small buckets labeled with different attributes and full of these “face chips”.

In Obscurity (2016), Paolo highlights the millions of mugshots and arrest records available to the public through websites that collect this data and run extortion schemes. He scraped 15 million profiles and hosted them on his own site with obfuscated data and images, making it easy to search but difficult to find anything useful.

Mugshots are part of a common testing and verification step for new face recognition algorithms. In the United States, there is a kind of feedback loop from mugshots, to research, to surveillance products, and then these surveillance products are sold to police and used to aid in arrests supporting mass incarceration. By interrupting part of this process, I see Paolo critiquing the entire loop.

The research that leads to fascist surveillance products sometimes draws on face attribute analysis, like Xiaolin and Xi’s Automated Inference on Criminality using Face Images mentioned above. These algorithms can be traced to ideas from phrenology that assume “criminality” is an inherent trait that can be measured from appearance.

In Capture (2020), Paolo built a crowdsourced face recognition system populated with images of the French police, to comment on their ongoing turn towards brutality and use of mass surveillance. He extracted 4000 faces of police officers from public photos, posted them to a website, and asked visitors to identify them.

How can we position this as an artwork? Cultural theorist Dr. Daniela Agostinho identifies two broad categories of critical artwork responding to face analysis: “practical tools” and “epistemological tools.” Practical tools include Adam Harvey’s CV Dazzle (2011) face camouflage, recently popularized by Dainty Funk and others. As an epistemological corollary, Daniela suggests Face Cages (2013) by Zach Blas. I would add Infilteriterations (2020) by Huntrezz Janos, and Sondra Perry’s Lineage for a Multiple-Monitor Workstation: Number One (2015). A piece of Sondra’s work is in conversation with CV Dazzle in that it asks: when we wear camouflage, does it work the same for everyone?

I see Paolo’s Capture in a third category: it’s a tool for raising awareness and provoking action. And sure enough, after it got the attention of the press and the French Minister of the Interior (who demanded the site be taken down), the site was redirected to a petition calling for a ban on face recognition in the European Union. In some sense, Capture started as a tool for reflection and turned into a tool for action.

Face Recognition Bans

The idea of tracking people who have been arrested using photos can be traced to the mug shots of French police officer Alphonse Bertillon in the late 1800s. Bertillon carefully documented variations in ears, eyes, and sizes and colors of facial features. In 1963 researcher Woodrow Bledsoe secretly built the first face recognition system for the CIA. He hoped to automate this very manual matching process, and lamented the difficulty of the problem, stating “a simple minded approach may not be successful” and listing some ways in which recognition might fail — matching the wrong person under certain conditions.³

In the late 80s and early 90s, researchers returned to face recognition and scaled it up for real-world use. In the late 90s, NIST released FERET, the first major face recognition dataset and benchmark sponsored by the Department of Defense’s Counterdrug Technology Development Program through DARPA for over $6.5M.

In January 2001, we saw the first major face recognition rollout at the Super Bowl in Tampa, Florida using 32 cameras combined with thermal imaging. The ACLU wrote, “We do not believe that the public understands or accepts that they will be subjected to a computerized police lineup as a condition of admission.” The developers of the tech downplayed the privacy issues:

It’s not automatically adding people to the database. It’s simply matching faces in field-of-view against known criminals, or in the case of access control, employees who have access. So no one’s privacy is at stake, except for the privacy of criminals and intruders.

And yet, today, United States government face recognition systems include at least 117 million Americans, including people who are “neither criminals nor suspects.” Companies like ClearView provide face recognition services based on 3 billion images. Much larger systems like SkyNet based in China are expanding into other countries. In the US these algorithms have aided in at least one misidentified arrest — that we know about.

This massive expansion of face recognition has faced significant push back this year. We’ve seen bans on face recognition tech in Portland, Boston, and San Francisco — the Electronic Privacy Information Center is tracking bans across the US and around the world. The Facial Recognition and Biometric Technology Moratorium Act of 2020 bill seeks a ban on federal use of the tech.

Unfortunately most of these bans are worded in a way that still allows many kinds of face analysis. For example, a ban might disallow matching surveillance footage to a database of drivers license photos or mugshots, but still allow searching surveillance footage for a suspect based on a description like “young Black male.” Surveillance companies like Qognify call this “suspect search” and they tout this as a valuable feature — that their software does not maintain a database of identities and is completely “unlike face recognition.”

Even if “suspect search” were used equitably, it is not accurate. In 2018 warrior artist and computer scientist Joy Buolamwini and researcher Timnit Gebru published Gender Shades, demonstrating intersectional inaccuracies that negatively affect darker-skinned women. Joy has argued in Congress for a moratorium on the use of face analysis by law enforcement, leading to Amazon independently calling for a moratorium of use of its own software by police. But while Joy calls for “fair, transparent and accountable facial analysis algorithms,” others like Nabil Hassein speak out Against Black Inclusion in Facial Recognition, asking “whose interests would truly be served by the deployment of automated systems capable of reliably identifying Black people”? Hito Steyrl concurs, “I would prefer not to be tracked at all than to be more precisely tracked.” I see Mimi Onuoha asking this question in a different way with Classification.01 (2017), an installation of two neon brackets that light up when two visitors appear “similar” by some unknown, and unknowable, metric.

In the end, face attribute classification tech has the same results as face recognition tech: it reifies and reinforces systems of power built on racism, sexism, classism, ableism, colonialism. Is our ability to regulate this tech limited by our exposure to its more obscure capabilities? With all the emphasis on the narrow tech of face recognition, could we be suffering a failure of imagination and neglecting other kinds face analysis?

Three screenshots from the game showing customer tips, job selection, and face audition ratings.

Influences on Facework

All of this art and research affected Facework in some way. In some cases it was more direct: I wanted to hint at the relationship between the carceral system and face analysis, so I used Paolo Cirio’s Capture data to train the POLICE classifier. In most cases it was more indirect: the final challenge in Facework is inspired by companies like HireVue that believe it’s possible to determine what kind of job someone deserves based on how they look.

Some influences come from exploratory technical work. While replicating the LFWA+ classifier I built a tool to test my model in realtime. I sat in front of my laptop making faces, watching 73 sliders wobble. I realized that nearly all the categories could be “tricked” if you made the right expression, or cast shadow puppets on your face, or held up the right object in the right orientation. My model achieved similar accuracy to the original paper, but subject to this realtime adversarial attack all of my confidence in that “accuracy” vanished. This formed the basis of the game mechanic behind Facework.

Seeing the response to other work as cultural interventions helped me think about positioning my own project. The way ImageNet Roulette creates discussion for some at the expense retraumatizing others helped guide me toward gameplay where there is always an element of choice. Seeing face filter apps ignore privacy made me intent on deploying all analysis on-device, avoiding sending images to the cloud for processing or building any face databases.

But the biggest influence on Facework comes from examining my personal relationship to these issues.

“People” and Whiteness

Like a lot of American kids born in the 80s, I was fortunate to grow up with the big illustrated book “People” by Peter Spier (1980). Peter writes:

More than 4,000,000,000 people…and no two of them alike! Each and every one of us different from all the others. Each one a unique individual in his or her own right.

Excerpt of “People” with drawings showing examples of variations in body shape, skin color, eye color, noses, and more.

At the end of People Peter writes: “But imagine how dreadfully dull this world of our would be if everybody looked, thought, ate, dressed, and acted the same!”

Page from “People” showing a drab rectilinear cityscape with a street full of same-colored people in same-colored vehicles.

The idea for Facework started to take shape in late 2016 just after Donald Trump was elected President of the United States.

As a white American who never confronted how whiteness operates, I was surprised to see Trump win in spite of his overt racism. I’ve been learning more about race in America, and I am beginning to understand “the monopoly that whiteness has over the norm,” and the way I grew up believing “other people are raced, we are just people” (both quoted in The Whiteness of AI). This is hammered home in the documentary Color of Fear (1994):

There’s this way in which “American” and “white” and “human” become synonyms — that “why can’t we just treat each other as human beings?” to me when I hear it from a white person means, “Why can’t we all just pretend to be white people? I’ll pretend you’re a white person and then you can pretend to be white. Why don’t you eat what I eat, why don’t you drink what I drink, why don’t you think like I think, why don’t you feel like I feel?”

This influenced a backstory we wrote for Facework, imagining a CITY where life is better for SOME than OTHERS (borrowing the all-caps treatment from attribute classification research). A small group of well-intentioned SOME set out to fix CITY with APP. But instead of asking why this disparity exists, they deploy a tech-solutionist approach of collecting information, ranking, and sorting, all in order to make everything feel more familiar (to them).

There were still lots of places in OTHER’S neighborhood that got bad scores from the algorithm. There were neighborhood barber shops and aging clapboard churches and weed-riddled playgrounds. There were unlicensed bars and at-home nail salons and people selling t-shirts on the sidewalk. And every single one of them would increase its score by transforming into an upscale deli-bar-bowling alley hybrid concept.

In the words of Peter Spier, Facework customers “look, think, eat, dress and act the same”. You can see it in their avatars, and some of their reviews. They are “normal”. And they only allow you, the worker, to exist outside of their “normal” within the scope of carefully constructed attributes (and at their request).

What happens next?

This year we’ve seen some of the largest protests in American history in response to the killing of George Floyd and other Black Americans at the hands of police. Joining the protests in Los Angeles brought me hope, helped me connect with local groups, and helped me understand that I will not be the hero making a difference with the right APP (or artistic intervention)— but that I can be a small part of something much bigger than myself.

Facework is not that bigger movement. In many ways it still operates within the traditional critical media art paradigm of “making visible.” Artist and researcher Roopa Vasudevan suggests that this “revelatory” mode ultimately reinforces the power it critiques. Instead, Roopa calls for “strategic transparency,” favoring a “collective mode of practice” over the revelations of an individual artist.

I see this collective practice in the Stop LAPD Spying Coalition, with a network of activists pushing back directly against invasive police surveillance in Los Angeles.

I see this in Datasheets for Datasets (2020) by Timnit Gebru et al, encouraging distributed documentation of the way datasets are composed, collected, and what biases might be contained within.

I see teaching as one mode of collective practice, with classes like Dark Matters Expanded (2020) by American Artist diving into biometrics, surveillance, and algorithmic policing.

Even Paolo Cirio’s Capture, in its final act, switched gears from “revelation” (thousands of photos of police) to “collective practice” by pointing towards a petition calling for a full ban on face recognition in Europe.

In my understanding of Roopa’s proposal, one effective space for artists may be in collective exploitation of systems of control. One example is Mask.ID (2018) by The Peng! Collective to the where they used machine learning to produce passports with photos that allow the passports to be used by multiple refugees. At scale, this could make more difference than any “revelation.”

Mask.ID also has a personal connection for me. I’ve been looking at myself to begin dismantling my own acculturation into whiteness, to recognize my privilege while also pushing back against the system that gives me those privileges. I’ve been looking to my family to learn who we were before we were “white”. I’ve been thinking about my grandfather. In Mask.ID I am reminded of stories he told about forging documents for people escaping Poland during the Nazi occupation. I’ve been thinking about him working at the train yard, reporting movements of German troops, tanks, and artillery — as one of many spies — and I wonder about the relationship between spying and “making visible”. Maybe revelations still matter, when they are secret, and when they are at scale? I’ve been thinking about the time he was locked up by the gestapo, and convinced a prison guard to escape with him and four other prisoners.

None of this is simple. Face analysis will not be legislated away overnight, massive corporations will not completely pivot on their strategy, systems of control will not be dismantled in one go, and we certainly aren’t going to fix anything with more accurate models.

But I draw inspiration from this absolutely ridiculous idea: that we the prisoners might not only convince the guards to let us go, but that we the guards might also be convinced to run away with the prisoners. A utopian, impossible liberation — borne of necessity, imagination, but ultimately a belief in the recognition of our individual complicities, the possibility of redemption, and universal dignity.


Facework (2020) by Kyle McDonald
Game Design Greg Borenstein
Design Fei Liu
Development Evelyn Masso, Sarah Port
Production Keira Chang

Sponsored by Mozilla Creative Media Awards


  1. In the popular facial expression recognition challenge (FER) dataset there’s the cartoon Nixon labeled “anger,” the creepy doll labeled “happiness,” or the missing image icon labeled “fear”. The original dataset had so many bad samples that Microsoft released an update called FER+ that labels a large number of images “unknown” and “not a face”.
  2. Even apps like Snapchat — which operate on different principles but sometimes use the same data — are content shipping a Blackface filter until critics point out the problem.
  3. For an interactive example of what face recognition looks like today, see Erase Your Face by YR Media.

Thanks to Keira Chang for significant editing and feedback on this article, to Lauren McCarthy for feedback and support, and to A.M. Darke for insights and discussions on faces. Thanks to Jean-Baptiste Dodane for discussion about his self portraits. This article is indebted to the excellent essay On the Basis of Face (2020) by Devon Schiller.