A data journalism investigation analyzing how five major French media outlets covered sexual violence over seven years, revealing persistent problematic patterns despite increased attention post-#MeToo.
This investigation analyzes 2,321 articles published between January 2018 and December 2024 across five major French news outlets: Le Monde, Le Figaro, Libération, 20 Minutes, and France Info.
Key findings:
- Coverage of sexual violence nearly tripled between 2018 and 2024 (227 to 639 articles, +181%).
- 47.5% of articles contain victim-blaming language or euphemistic framing.
- Problematic framing rates vary little with political orientation (44.8% to 52.8% across outlets).
- Coverage mentions convictions far more often than dismissals, although most complaints are dismissed in practice.
- Elite celebrity cases dominate coverage patterns.
The full methodology, dataset structure, and reproducible code are provided in this repository.
| Year | Articles | Change |
|---|---|---|
| 2018 | 227 | Baseline (MeToo wave) |
| 2020 | 108 | -52% (COVID impact) |
| 2024 | 639 | +181% vs 2018 |
The 2024 surge was driven by actress Judith Godrèche's testimony and renewed attention to the cinema industry.
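The Change column can be reproduced from the article counts. A minimal sketch using only the three counts shown in the table; the helper name `pct_change` is illustrative and not taken from the notebook:

```python
# Article counts copied from the table above.
counts = {2018: 227, 2020: 108, 2024: 639}

def pct_change(value: int, baseline: int) -> float:
    """Percentage change of `value` relative to `baseline`."""
    return (value - baseline) / baseline * 100

baseline = counts[2018]
# Rounded percentage change of each year against the 2018 baseline.
changes = {year: round(pct_change(n, baseline)) for year, n in counts.items()}

for year, delta in changes.items():
    print(f"{year}: {delta:+d}% vs 2018")
```

An increase of 181% corresponds to about 2.8 times the 2018 volume, which is why the coverage is described as roughly tripling.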
- Surface level: 80.7% of articles use victim-centered framing
- Reality: 47.5% contain victim-blaming language
This contradiction reveals that even well-intentioned coverage perpetuates harmful narratives through word choice, hedging language, and implicit doubt.
By outlet:
| Outlet | Victim-Blaming % | Political Leaning |
|---|---|---|
| Libération | 52.8% | Left-wing |
| 20 Minutes | 49.2% | Centrist |
| Le Figaro | 46.3% | Right-wing |
| Le Monde | 45.1% | Center-left |
| France Info | 44.8% | Public service |
Notable: The most progressive outlet (Libération) has the highest rate of victim-blaming language.
Overall average polarity: -0.70 (scale: -1 to +1)
| Year | Polarity | Interpretation |
|---|---|---|
| 2019 | -0.61 | Post-MeToo optimism |
| 2022 | -0.75 | Peak negativity |
| 2024 | -0.68 | Slight recovery |
Coverage became more negative between 2019 and 2022 before easing slightly in 2024, suggesting growing frustration with slow systemic change.
| Metric | Media Coverage | Reality |
|---|---|---|
| Convictions mentioned | 41.6% | ~10% of cases |
| Dismissals mentioned | 4.9% | ~80% of cases |
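| Dismissals per conviction | 0.12 | 8.0 |
Media mention convictions about 8.5 times as often as dismissals, whereas in judicial statistics dismissals outnumber convictions roughly 8 to 1. This inversion creates a false impression of justice system effectiveness.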
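The ratio row in the table above is consistent with dividing the dismissal share by the conviction share. A short sketch of that arithmetic, assuming this is how the ratio was derived (the variable and function names are illustrative):

```python
# Percentages copied from the table above; the "reality" figures are approximate.
media = {"convictions": 41.6, "dismissals": 4.9}
reality = {"convictions": 10.0, "dismissals": 80.0}

def dismissal_ratio(shares: dict) -> float:
    """Dismissals per conviction."""
    return shares["dismissals"] / shares["convictions"]

media_ratio = dismissal_ratio(media)      # dismissals per conviction in coverage
reality_ratio = dismissal_ratio(reality)  # dismissals per conviction in court data

# Inverse view: how many conviction mentions per dismissal mention in coverage.
media_conviction_multiple = 1 / media_ratio

print(round(media_ratio, 2), reality_ratio, round(media_conviction_multiple, 1))
```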
| Type | % of Articles |
|---|---|
| Rape | 65.7% |
| Sexual Assault | 58.6% |
| Harassment | 45.0% |
| Cyber Violence | 19.7% |
| Incest | 16.1% |
| Child Sexual Abuse | 7.5% |
| Institution | Coverage % |
|---|---|
| Cinema/Entertainment | 76.0% |
| Family/Domestic | 70.3% |
| Politics | 44.6% |
| Workplace | 36.8% |
| Catholic Church | 23.5% |
Notable gap: The Catholic Church abuse scandal is significantly under-covered relative to its scale.
"Pédophilie" vs "Pédocriminalité" (problematic vs correct terminology):
| Year | Ratio (pédophilie : pédocriminalité) | Interpretation |
|---|---|---|
| 2018 | 53.0 | Problematic term used almost exclusively |
| 2022 | 0.77 | Correct term dominates |
| 2024 | 0.79 | Progress sustained |
This demonstrates that media can learn and improve when advocacy efforts raise awareness.
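A yearly terminology ratio of this kind can be computed by counting both word families per year. The sketch below is an assumption about the approach, not the notebook's code: it matches word stems (so "pédophile" and "pédophilie" both count), and the sample sentences are invented for illustration.

```python
import re
from collections import Counter

# Case-insensitive stem patterns for the two word families.
PROBLEMATIC = re.compile(r"\bpédophil\w*", re.IGNORECASE)
PREFERRED = re.compile(r"\bpédocrimin\w*", re.IGNORECASE)

def term_ratio_by_year(articles):
    """articles: iterable of (year, text). Returns {year: ratio or None}.

    The ratio is problematic / preferred; None when the preferred term never appears.
    """
    problematic, preferred = Counter(), Counter()
    for year, text in articles:
        problematic[year] += len(PROBLEMATIC.findall(text))
        preferred[year] += len(PREFERRED.findall(text))
    return {
        year: (problematic[year] / preferred[year]) if preferred[year] else None
        for year in sorted(set(problematic) | set(preferred))
    }

# Invented toy sentences, for illustration only.
sample = [
    (2018, "Une affaire de pédophilie. Le pédophile présumé."),
    (2024, "La pédocriminalité en ligne. Un réseau pédocriminel. Accusé de pédophilie."),
]
ratios = term_ratio_by_year(sample)
print(ratios)
```

A ratio above 1 means the problematic term dominates; below 1, the preferred term does.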
| Person/Case | Mentions | Context |
|---|---|---|
| Harvey Weinstein | 340 | International benchmark |
| Georges Tron | 303 | French politician |
| Richard Berry | 292 | Actor, incest accusation |
| Gabriel Matzneff | 237 | Writer, pedophilia |
| Gérard Depardieu | 177 | Actor, multiple accusations |
| Judith Godrèche | 173 | Actress, 2024 surge |
Coverage concentrates on high-profile cases at the expense of systemic analysis of violence affecting ordinary people.
- Manual URL collection: 2,321 article URLs gathered from five outlets
- Sources: Le Monde, Le Figaro, Libération, 20 Minutes, France Info
- Web scraping: Python with `newspaper4k` and `BeautifulSoup`
- Time period: January 2018 – December 2024
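The extraction step for a single URL might look like the sketch below. It is a hedged illustration rather than the project's scraper: `ArticleRecord`, `fetch_article`, and `share_missing_date` are hypothetical names, only the `newspaper4k` part is shown (no `BeautifulSoup` fallback), and `fetch_article` needs network access.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ArticleRecord:
    url: str
    title: str
    text: str
    published: Optional[str]  # None when no publication date could be extracted
    authors: List[str]

def fetch_article(url: str) -> ArticleRecord:
    """Download and parse one article (requires network and newspaper4k)."""
    from newspaper import Article  # imported lazily so this module loads without it

    article = Article(url, language="fr")
    article.download()
    article.parse()
    published = str(article.publish_date) if article.publish_date else None
    return ArticleRecord(url, article.title, article.text, published, list(article.authors))

def share_missing_date(records) -> float:
    """Fraction of records without an extractable publication date."""
    records = list(records)
    return sum(r.published is None for r in records) / len(records)
```

Keeping the missing date as `None` rather than guessing makes it possible to report how many articles lack a date.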
- Le Monde - Center-left, reference newspaper
- Le Figaro - Right-wing, oldest national daily
- Libération - Left-wing, progressive
- 20 Minutes - Free daily, centrist
- France Info - Public broadcaster
- Sentiment analysis: French-language polarity scoring
- Keyword detection: Custom dictionaries for:
  - Victim-blaming language (e.g., "allégué," "prétend," "affirme")
  - Euphemisms (e.g., "gestes déplacés," "comportement inapproprié")
  - Correct vs problematic terminology
- Entity extraction: Named entity recognition for people, institutions
- Temporal analysis: Trends over time
- Cross-outlet comparison: Statistical comparison between sources
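The dictionary-based keyword detection listed above can be sketched with the standard `re` module. The dictionaries here contain only the example terms quoted in this README, not the project's full lists, and the function names are illustrative:

```python
import re

# Illustrative dictionaries built only from the examples quoted above.
DICTIONARIES = {
    "victim_blaming": ["allégué", "prétend", "affirme"],
    "euphemism": ["gestes déplacés", "comportement inapproprié"],
}

def compile_patterns(dictionaries):
    """Build one case-insensitive, whole-word regex per category."""
    return {
        category: re.compile(
            r"\b(?:" + "|".join(re.escape(term) for term in terms) + r")\b",
            re.IGNORECASE,
        )
        for category, terms in dictionaries.items()
    }

PATTERNS = compile_patterns(DICTIONARIES)

def flag_article(text):
    """Return {category: list of matched terms, lowercased} for one article."""
    return {
        category: [match.lower() for match in pattern.findall(text)]
        for category, pattern in PATTERNS.items()
    }

result = flag_article("Elle affirme avoir subi des gestes déplacés, selon le viol allégué.")
print(result)
```

Whole-word matching means inflected forms such as "prétendu" are not caught unless they are added to the dictionary, and a match on a reporting verb like "affirme" says nothing about context, which is one reason manual validation matters.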
- Paywall impact: Some outlets had partial paywalls affecting article completeness
- Date extraction: 19.5% of articles lacked extractable publication dates
- Author attribution: 82% lacked identifiable bylines
- Sample bias: Limited to five major outlets; regional press not included
- Python 3.10+
- newspaper4k - Article extraction
- BeautifulSoup - HTML parsing
- pandas - Data manipulation
- matplotlib/seaborn - Visualization
- Jupyter notebooks - Analysis workflow
french-media-sexual-violence-analysis/
├── README.md
├── LICENSE
├── requirements.txt
├── french_media_coverage_analysis.ipynb # Complete analysis notebook
├── data/
│   ├── raw/ # Original curated URLs
│   └── processed/ # Final analyzed dataset
├── visualizations/ # Generated charts
└── docs/
    └── methodology.md # Detailed methodology
# Clone the repository
git clone https://github.com/eloiseelle/french-media-sexual-violence-analysis.git
cd french-media-sexual-violence-analysis
# Create virtual environment
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
# Open the analysis notebook
jupyter notebook french_media_coverage_analysis.ipynb

- Quantity ≠ Quality: Tripled coverage hasn't eliminated problematic framing
- Progressive paradox: Left-leaning outlets aren't immune to victim-blaming
- Justice distortion: Media creates false impression of system effectiveness
- Elite focus: Celebrity cases overshadow systemic issues
- Language evolves: Terminology can improve with sustained advocacy
- Establish editorial guidelines flagging victim-blaming language
- Train journalists on trauma-informed reporting
- Balance justice coverage by reporting dismissal rates
- Diversify sources beyond high-profile cases
- Cover systemic failures, not just individual incidents
- All data sourced from publicly available articles
- No victim identification information included
- Analysis focuses on media framing, not case details
- Project aims to improve media practices, not shame individual journalists
Eloise Bouton
Data Journalist | Senior Journalist (15 years) | Junior Data Scientist
- Combining journalism expertise with data science skills
- Focus: Social justice, women's rights, LGBTQ+ issues
- Works remotely
This project is licensed under the MIT License - see LICENSE file for details.
- French media outlets for public access to archives
- #MeToo movement for catalyzing this conversation
- Data science bootcamp instructors and peers
This repository includes article metadata (publication date, outlet, URL) and processed textual features used for analysis (e.g., keyword frequencies, framing classifications, named entity counts).
Full copyrighted article texts are not redistributed. Raw article content was accessed for research and analysis purposes only and is not publicly shared in this repository.
Automated classifications (e.g., victim-blaming language detection, framing patterns, terminology usage) rely on pattern-based natural language processing methods and may not capture full contextual nuance. To assess reliability, a random sample of 100 articles was manually reviewed and compared with the automated classifications.
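This README does not state which metrics were used for the manual review. One common way to summarize such a check is agreement, precision, and recall between manual labels and automated flags. The sketch below is an assumption about how that could be done, and the labels in the example are invented:

```python
def validation_metrics(manual, automated):
    """Compare boolean manual labels with automated flags for the same articles."""
    if len(manual) != len(automated):
        raise ValueError("label lists must have the same length")
    pairs = list(zip(manual, automated))
    tp = sum(1 for m, a in pairs if m and a)          # flagged and confirmed
    fp = sum(1 for m, a in pairs if not m and a)      # flagged but not confirmed
    fn = sum(1 for m, a in pairs if m and not a)      # missed by the automated pass
    tn = sum(1 for m, a in pairs if not m and not a)  # correctly left unflagged
    return {
        "agreement": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if tp + fp else None,
        "recall": tp / (tp + fn) if tp + fn else None,
    }

# Invented labels for five articles, for illustration only.
manual = [True, True, False, False, True]
automated = [True, False, True, False, True]
metrics = validation_metrics(manual, automated)
print(metrics)
```

Low precision would indicate that neutral uses of dictionary terms are being flagged; low recall would indicate that problematic phrasing falls outside the dictionaries.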
Justice system statistics referenced in this project are sourced from:
Ministère de la Justice, Violences sexuelles et atteintes aux mœurs : les décisions du parquet et de l'instruction,
Infostat Justice n°160, March 2018.
These figures refer to complaints processed by prosecutors and include cases dismissed before trial, primarily for insufficiently characterized offences.
If you reference this project in research, journalism, or academic work, please cite:
Bouton, Eloise (2025).
French Media Coverage of Sexual Violence (2018–2024): A Computational Analysis of 2,321 Articles.
GitHub repository: https://github.com/eloiseelle/french-media-sexual-violence-analysis
Eloise Bouton is a freelance journalist specializing in media analysis and gender issues.
Website: https://eloisebouton.com/
For commissions, collaborations, or data inquiries, please open an issue on this repository or contact the author directly.
