Police body cams and Burmese language data: Facebook CTO says their AI moderation is "pushing the frontier"
Almost two years ago, Mark Zuckerberg told Congress that he was confident that AI would be able to identify content that breaks Facebook's policies -- be it violent video, hate speech, or fake news -- and remove it from his platform.
Just over a year later, Zuckerberg's suite of AI moderation tools failed to stop the Christchurch shooter from livestreaming his killing of over 50 people, uninterrupted, for 29 minutes.
Despite Facebook's efforts, events like this -- the police shooting of Philando Castile and the military-led campaign inciting violence in Myanmar, among others -- have plagued the site and its apps for years.
So how did the posts, which encourage and display the kind of violence and disinformation Zuckerberg was sure his AI dragnet would find, get through undetected?
In a recent FT piece, Facebook's CTO Mike Schroepfer said the answer to the Christchurch shooting is largely the same as it is for posts encouraging ethnic violence and manipulation campaigns elsewhere in the world: even for Facebook, there is a lack of trainable data to teach AI tools to recognize varied forms of hate speech, violence, and disinformation.
Take Facebook Live, where users broadcast live video. Facebook said that the Christchurch shooting went undetected because the nature of the video was unique -- filmed using a helmet camera, in the style of a first-person shooter -- and its moderation algorithms didn't recognize violence from that angle.
To automatically flag live videos, Facebook's AIs need to be trained on images and media that look like the acts they're supposed to detect.
Facebook's solution, according to Schroepfer, is a program to collect realistic body cam footage:
Facebook has now equipped London police with body cameras during terrorist training exercises to get more footage, having eschewed using footage of video game shoot-outs or paintballing.
When it comes to policing written posts, Schroepfer says, the challenges are similar. Take Myanmar, where posts inciting violence against the Rohingya Muslim minority went unchecked for months on the platform. It's again a problem of data:
"There’s not a lot of content in the world in Burmese, which means there’s not a lot of training data," said Schroepfer.
But the problem isn't just finding inappropriate posts. It's also keeping them from being reposted.
To take down reposts of the Christchurch shooting or fake news memes, Facebook developers play a cat-and-mouse game with reposters. Facebook teams create fingerprints, or hashes, of banned videos and images, which the system can use to make sure newly uploaded posts don't contain previously deleted content.
Many banned posts originate from tight-knit online groups, however, with dedicated followers that work to get around the fingerprinting. To repost the Christchurch shooting video, users added borders to the footage, changed the tint of the colors, or cropped the video to a different aspect ratio.
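To see why small edits can defeat a fingerprint, here is a minimal Python sketch. It is not Facebook's system, which the FT piece does not describe; every function name, the synthetic 32x32 grayscale "frame", and the `max_distance` threshold are assumptions made for illustration. It contrasts an exact hash (SHA-256), which changes completely when any pixel changes, with a toy "average hash" that tolerates a uniform brightness shift.

```python
import hashlib


def exact_fingerprint(pixels):
    """SHA-256 over the raw pixel bytes; any changed pixel changes the digest."""
    data = bytes(value for row in pixels for value in row)
    return hashlib.sha256(data).hexdigest()


def average_hash(pixels, size=8):
    """Toy perceptual hash: average the image into size x size blocks,
    then record whether each block is brighter than the overall mean."""
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // size, width // size
    blocks = []
    for by in range(size):
        for bx in range(size):
            total = sum(
                pixels[y][x]
                for y in range(by * block_h, (by + 1) * block_h)
                for x in range(bx * block_w, (bx + 1) * block_w)
            )
            blocks.append(total / (block_h * block_w))
    mean = sum(blocks) / len(blocks)
    return [1 if block > mean else 0 for block in blocks]


def hamming(a, b):
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def is_probable_repost(candidate, banned, max_distance=5):
    """Hypothetical matching rule: near-identical hashes count as a repost."""
    return hamming(average_hash(candidate), average_hash(banned)) <= max_distance


# Synthetic stand-in for a banned frame: a smooth diagonal gradient (values 0..186).
original = [[x * 4 + y * 2 for x in range(32)] for y in range(32)]
# A "tinted" copy: every pixel brightened by the same amount.
tinted = [[value + 20 for value in row] for row in original]
# A very different image: the gradient inverted.
inverted = [[186 - value for value in row] for row in original]

print(exact_fingerprint(original) == exact_fingerprint(tinted))  # exact match fails
print(hamming(average_hash(original), average_hash(tinted)))     # perceptual distance
print(is_probable_repost(inverted, original))
```

In this toy setup the tinted copy no longer matches the exact hash but still matches the perceptual one. Borders and crops move content between blocks, so they would shift a simple hash like this far more than a tint does, which is one way to picture the back-and-forth the article describes.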
"In a lot of cases, this is an adversarial game," Mr Schroepfer told the FT.
Tone and subtlety in text are difficult to detect as well -- especially in Myanmar, a country where, according to the New York Times, the vast majority of internet use takes place on Facebook.
"When the level of subtlety goes up, or context goes up, the technical challenges go up dramatically," he said.