Talk at Applause: Fighting Fake News with AI and Crowdsourcing at The Wall Street Journal

Wall Street Journal: Fighting Fake News with AI and Crowdsourcing

(from Applause’s website) In this showcase session at Applause, the Wall Street Journal’s CTO & CPO, Rajiv Pant, will explain how the media organization is using advanced technologies to combat fake news. You’ll hear about ultra-realistic voice and video cloning technologies that will make you question the legitimacy of recordings. You’ll also hear how machine learning, artificial intelligence and the crowd are being deployed to defend against this threat and others.

(30-minute video on YouTube)

Earlier this week, I spoke at the Applause conference about a problem that I deeply care about: deepfakes. Not the garden-variety Photoshopped images or misleading headlines we’ve dealt with for years, but something fundamentally more dangerous—AI-generated synthetic media so convincing that even experts struggle to identify it.

This isn’t a theoretical or future threat. The technology exists today, and it’s advancing faster than our ability to defend against it.

The Deepfake Threat

Let me start with what happened when I showed the audience a video of President Obama. He appeared on screen, speaking directly to the camera about a startup called Lyrebird that can create digital copies of anyone’s voice from just one minute of audio. The facial expressions matched perfectly. The delivery was natural. Everything looked authentic.

It was completely fake.

That video demonstrates a specific type of deepfake: lip-syncing from audio. Researchers at the University of Washington developed a tool that converts audio files into realistic facial expressions and mouth movements, which can then be grafted onto existing video footage of the same person. They took Obama’s voice from a talk show interview and made it appear as though he was saying those exact words during a White House speech.

The implications are staggering. Imagine videos of politicians in meetings with terrorists, soldiers committing war crimes that never happened, or CEOs making statements that tank their companies’ stock prices. These aren’t far-fetched scenarios—the technology to create them is available right now.

Understanding the Technology

Deepfakes come in several varieties, each with distinct characteristics and dangers:

Face Swapping has become disturbingly accessible. A program called FakeApp—which takes about 12 minutes to learn—allows anyone to swap faces in videos with compelling results. What’s remarkable is the efficiency: in 2016, the crew of Star Wars: Rogue One spent tens of thousands of dollars and used expensive Hollywood equipment to superimpose a young Carrie Fisher’s face onto another actress. By 2018, FakeApp users replicated the same effect with consumer hardware. The results are nearly indistinguishable.

Facial Reenactment represents a more sophisticated approach. This technique doesn’t just swap faces—it transfers one person’s complete facial expressions onto another person’s face in a target video. The algorithm traces the source face, detects every micro-expression (smiles, raised eyebrows, lip movements), and adjusts the target face accordingly. The latest generation, called Deep Video Portraits, goes further by manipulating entire body positions and limb movements, not just faces.

Video Alteration pushes boundaries even further. NVIDIA has demonstrated algorithms that can change weather conditions or time of day in existing video footage. Take a snowy winter road scene and transform it into summer. Convert daylight to nighttime. Make a sunny day rainy. This isn’t frame-by-frame manual editing—it’s AI that has learned what these different conditions look like by analyzing thousands of other videos.

Perhaps most insidious are selective edits: removing or replacing specific people or objects in otherwise authentic footage. These micro-modifications are harder to detect because 85-90% of the video remains genuine. Only a small portion has been altered, making it highly believable and extremely difficult to spot.

How Deepfakes Actually Work

The breakthrough that enables modern deepfakes came from Ian Goodfellow’s research on Generative Adversarial Networks (GANs). The architecture is elegant: two neural networks locked in competition with each other.

The first network, called the generator, creates fake content—images, video, audio. The second network, the discriminator, tries to detect whether that content is real or fake, assigning confidence scores to its assessments.

Here’s where it gets interesting: they improve iteratively. The generator creates a fake. The discriminator evaluates it and likely catches it initially. The generator tries again, learning from its failure. The discriminator gets better at detection. They continue this adversarial loop, each pushing the other to improve.

This is similar to how DeepMind’s AI learned to master chess and Go—by playing against itself millions of times. But instead of learning game strategy, these networks learn to create and detect synthetic media. The process continues until the generator produces fakes that the discriminator can no longer reliably identify. At that point, human observers have virtually no chance of detecting the manipulation.
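To make that adversarial loop concrete, here is a minimal training-step sketch in PyTorch. The tiny fully connected networks, image size, and hyperparameters are placeholders for illustration; real deepfake generators are far larger and convolutional, but the alternating update is the same idea.

```python
# Minimal GAN training loop sketch (PyTorch). Sizes and hyperparameters
# are illustrative only, not those of any production deepfake system.
import torch
import torch.nn as nn

LATENT_DIM, IMG_DIM = 100, 64 * 64  # 64x64 grayscale frames, flattened

generator = nn.Sequential(
    nn.Linear(LATENT_DIM, 256), nn.ReLU(),
    nn.Linear(256, IMG_DIM), nn.Tanh(),       # synthetic image in [-1, 1]
)
discriminator = nn.Sequential(
    nn.Linear(IMG_DIM, 256), nn.LeakyReLU(0.2),
    nn.Linear(256, 1), nn.Sigmoid(),          # confidence that input is real
)

opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4)
bce = nn.BCELoss()

def train_step(real_images: torch.Tensor) -> None:
    batch = real_images.size(0)
    real_labels = torch.ones(batch, 1)
    fake_labels = torch.zeros(batch, 1)

    # 1. Discriminator: score real frames high, generated frames low.
    fake_images = generator(torch.randn(batch, LATENT_DIM)).detach()
    d_loss = (bce(discriminator(real_images), real_labels) +
              bce(discriminator(fake_images), fake_labels))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2. Generator: learn to fool the (now slightly better) discriminator.
    g_loss = bce(discriminator(generator(torch.randn(batch, LATENT_DIM))),
                 real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()

# Each call tightens the adversarial loop: the discriminator gets better at
# catching fakes, which forces the generator to produce more convincing ones.
```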

The same datasets universities use to train detection algorithms can be weaponized to create better fakes. It’s an arms race with no clear end point.

Why This Matters for Journalism

At The Wall Street Journal, our brand depends on trust. Recent surveys rank us as the most trusted newspaper in America across the political spectrum. That trust, built over decades, could evaporate overnight if we published a story based on fake video evidence or fell victim to synthetic audio in a source interview.

The Jayson Blair case at The New York Times demonstrated how a single journalist fabricating stories could shake public faith in an entire institution. That was before deepfakes. Now, a well-meaning journalist could unknowingly base a story on AI-generated evidence, with far-reaching consequences.

This matters beyond media. The Wall Street Journal publishes market-moving information. A fake video or audio clip that we validate through publication could trigger economic damage. A manipulated video of a world leader could push countries toward conflict. The stakes are civilization-level, not just reputational.

Detection and Defense Strategies

Fighting deepfakes requires combining AI capabilities with human intelligence. Machine learning offers one line of defense. The Technical University of Munich built FaceForensics, a database of edited images and videos used to train detection algorithms. The challenge: the same training data helps bad actors create more sophisticated fakes.
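To give a rough sense of how a database like that gets used, here is a sketch of fine-tuning an off-the-shelf image classifier to label extracted video frames as real or manipulated. The folder layout, backbone choice, and hyperparameters are assumptions for illustration, not the FaceForensics reference pipeline.

```python
# Sketch: fine-tune a small CNN to classify frames as real vs. manipulated.
# Assumes frames have already been extracted into data/real/ and data/fake/.
import torch
import torch.nn as nn
from torchvision import datasets, models, transforms

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),
    transforms.ToTensor(),
])
frames = datasets.ImageFolder("data", transform=preprocess)  # real/, fake/
loader = torch.utils.data.DataLoader(frames, batch_size=32, shuffle=True)

model = models.resnet18(weights="IMAGENET1K_V1")   # pretrained backbone
model.fc = nn.Linear(model.fc.in_features, 2)      # two classes: real, fake

optimizer = torch.optim.Adam(model.fc.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

model.train()
for images, labels in loader:                      # one illustrative epoch
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
```

In practice, frame-level scores would be aggregated across an entire clip before anything gets flagged for human review.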

Forensic analysis reveals subtle artifacts invisible to human eyes. Digital cameras and sensors have unique imperfections—in lenses, chips, and circuitry—that create fingerprints in the images they produce. A forensics expert can often identify not just which camera model was used, but sometimes the specific serial number of an individual device. Current deepfakes typically lack these artifacts, though that gap is closing.
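For a flavor of what that fingerprinting looks like in code, here is a heavily simplified noise-residual sketch: denoise a frame, treat whatever the denoiser removed as sensor noise, and correlate it against a reference pattern averaged from images known to come from a particular camera. Production forensic tools (PRNU analysis) are far more sophisticated; the blur-based denoiser and function names below are illustrative assumptions.

```python
# Simplified sensor-fingerprint check: correlate a frame's noise residual
# against a camera's reference noise pattern. Assumes all images share the
# same resolution; real PRNU forensics is far more involved.
import cv2
import numpy as np

def noise_residual(gray: np.ndarray) -> np.ndarray:
    """Whatever the denoiser removes is treated as sensor noise."""
    denoised = cv2.GaussianBlur(gray, (3, 3), 0)
    return gray.astype(np.float64) - denoised.astype(np.float64)

def camera_fingerprint(reference_paths: list[str]) -> np.ndarray:
    """Average the residuals of images known to come from one camera."""
    residuals = [noise_residual(cv2.imread(p, cv2.IMREAD_GRAYSCALE))
                 for p in reference_paths]
    return np.mean(residuals, axis=0)

def fingerprint_correlation(frame_path: str, fingerprint: np.ndarray) -> float:
    """Normalized correlation; frames from the same camera tend to score higher."""
    residual = noise_residual(cv2.imread(frame_path, cv2.IMREAD_GRAYSCALE))
    a = (residual - residual.mean()).ravel()
    b = (fingerprint - fingerprint.mean()).ravel()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))
```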

Behavioral patterns offer another detection vector. Human eyes blink at predictable intervals; first-generation deepfake videos often missed this detail. Pulse detection from subtle skin color changes, natural body sounds, and hand movements provide additional signals that software can analyze but human observers usually miss.
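Blink-rate analysis is one of the simpler behavioral checks to prototype. The sketch below counts blinks using the common eye-aspect-ratio heuristic on dlib's 68-point facial landmarks; the landmark model file and the threshold value are assumptions for illustration.

```python
# Sketch: count blinks in a clip via the eye-aspect-ratio (EAR) heuristic.
# Assumes dlib's 68-point landmark model file is available locally.
import cv2
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")
LEFT_EYE, RIGHT_EYE = range(42, 48), range(36, 42)   # landmark indices
EAR_THRESHOLD = 0.21                                 # below this = eye closed

def eye_aspect_ratio(eye: np.ndarray) -> float:
    # Ratio of the two vertical eye openings to the horizontal eye width.
    v1 = np.linalg.norm(eye[1] - eye[5])
    v2 = np.linalg.norm(eye[2] - eye[4])
    h = np.linalg.norm(eye[0] - eye[3])
    return (v1 + v2) / (2.0 * h)

def count_blinks(video_path: str) -> int:
    capture = cv2.VideoCapture(video_path)
    blinks, closed = 0, False
    while True:
        ok, frame = capture.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        for face in detector(gray):
            landmarks = predictor(gray, face)
            pts = np.array([[landmarks.part(i).x, landmarks.part(i).y]
                            for i in range(68)], dtype=np.float64)
            ear = (eye_aspect_ratio(pts[list(LEFT_EYE)]) +
                   eye_aspect_ratio(pts[list(RIGHT_EYE)])) / 2.0
            if ear < EAR_THRESHOLD:
                closed = True
            elif closed:            # eye reopened: count one blink
                blinks, closed = blinks + 1, False
    capture.release()
    return blinks
```

A multi-minute clip of a talking head with few or no blinks is a red flag worth a closer look, not proof of manipulation on its own.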

The most promising approach combines AI detection with crowdsourced human investigation. Until we achieve artificial general intelligence, AI systems can only analyze the specific video or image in front of them. They lack broader contextual knowledge.

Humans excel at this contextual analysis. If someone created a deepfake video of me from two years ago, AI might find it flawless. But a human researcher would notice that two years ago, I had long curly hair. They could track down original source videos, check my public appearances from that time period, verify my location through other documentation, and cross-reference details that a computer wouldn’t know to check.

This is why partnerships between news organizations and crowdsourced testing platforms make sense. Our journalists focus on reporting; crowdsourced teams investigate the authenticity of visual and audio evidence with the healthy skepticism and research capabilities the situation demands.

The Legitimate Applications

I should note that this technology isn’t inherently evil. The movie industry uses these techniques for special effects and dubbing. Companies like Lyrebird aim to help people who’ve lost their voices to disease recover this crucial part of their identity. Adobe is developing VoCo, essentially Photoshop for audio, which could have valuable applications in content production.

NVIDIA’s video alteration tools could transform film production. The selective editing capabilities might help creators fix mistakes or update content efficiently. Like many powerful technologies, the tools themselves are neutral. What matters is how we deploy them.

What We’re Doing About It

At the Journal, we’re taking this threat seriously. We’re currently working with Cornell University on research partnerships to develop better detection methodologies. We’re investing in forensics capabilities and training our journalists to recognize the warning signs.

But technology alone won’t solve this problem. We need new verification protocols, stronger source authentication, and—most importantly—partnerships that combine AI capabilities with human insight and judgment.

The crowdsourced model offers a scalable solution. Just as our engineering teams work with external testing partners to find bugs in our applications that might otherwise go undetected, our newsroom could work with crowdsourced investigators to verify the authenticity of visual and audio evidence before we stake our credibility on it.

An Urgent Priority

Deepfakes represent information warfare. We’re approaching a world where seeing is no longer believing, where video evidence can be manufactured, and where trust in shared reality itself comes under attack.

The technology is advancing exponentially. The tools are becoming more accessible. The potential for misuse is expanding daily.

We can’t stop the technology from evolving—if universities and legitimate companies halt their research, bad actors will continue regardless. But we can build defenses, establish verification protocols, and create partnerships that help us maintain trust in an era of synthetic media.

This is about more than protecting news organizations or preventing market manipulation. It’s about preserving our ability to distinguish truth from fiction, to have shared facts as the foundation for democratic discourse, and to maintain trust in the information ecosystems that society depends on.

The fight against deepfakes has just begun. We need to take it seriously before the fakes become indistinguishable from reality.


Watch my full 30-minute presentation from the Applause conference below, including demonstrations of various deepfake techniques and a more detailed technical discussion of detection methodologies:
