Replication Package: AI IDEs or Autonomous Agents? Measuring the Impact of Coding Agents on Software Development
This repository contains the replication package for the following paper:
Shyam Agarwal, Hao He, and Bogdan Vasilescu. 2026. AI IDEs or Autonomous Agents? Measuring the Impact of Coding Agents on Software Development. In 23rd International Conference on Mining Software Repositories (MSR ’26), April 13–14, 2026, Rio de Janeiro, Brazil. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3793302.3793589
Some additional files are available on Zenodo: [Zenodo DOI placeholder - to be added].
MSR Challenge 2026
This replication package contains all data, code, and analysis scripts necessary to reproduce the results presented in our paper for the MSR Challenge 2026: "AI IDEs or Autonomous Agents? Measuring the Impact of Coding Agents on Software Development."
This study investigates the causal impact of coding agents on software development outcomes by employing difference-in-differences (DiD) estimation methods to analyze how agent adoption affects various software quality and productivity metrics.
The replication package is organized into three main directories:
data/: Contains all datasets used throughout the analysis pipeline:
- panel_event_monthly.csv: Main panel dataset for difference-in-differences analysis, containing monthly aggregated metrics for treatment and control repositories
- repos_with_details.csv: Repository-level metadata including adoption dates, metrics, and classification flags
- matching.csv: Propensity score matching results linking treatment repositories to matched control repositories
- repo_events.csv / repo_events_control.csv: GitHub event-level data for treatment and control repositories
- ts_repos_monthly.csv / ts_repos_control_monthly.csv: Monthly time series data for treatment and control groups
- all_scraped_prs_final_list.csv: Pull request data collected during the study
- agent_first.txt: List of repositories that adopted agents directly (without prior AI traces)
- ide_first.txt: List of repositories that adopted agents after potentially using traditional AI tools
Note: Some data files may not be present in this repository due to size limitations. These additional data files can be found at: [Zenodo DOI placeholder - to be added]
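Before running anything, you can sanity-check the main panel in R. This read-only peek is a suggestion, not part of the original package, and assumes only that the CSV sits in data/:

```r
# Read-only peek at the main panel dataset.
library(readr)
library(dplyr)

panel <- read_csv("data/panel_event_monthly.csv")
glimpse(panel)  # column names, types, and sample values
dim(panel)      # number of repository-month rows and columns
```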
notebooks/: R Markdown notebooks that reproduce all analyses, tables, and figures:
- DiffinDiff.Rmd: Main difference-in-differences analysis for both AF (agent-first) and IF (IDE-first) groups; a minimal estimation sketch follows this list
  - Generates the static treatment effects table
  - Creates the dynamic treatment effects (event study) plot for six key outcomes
- AdoptionTimeAnalysis.Rmd: Analysis of agent adoption timing patterns
  - Generates the adoption time distribution plot comparing the AF and IF groups
- RepoMetricsAnalysis.Rmd: Descriptive statistics and repository metrics comparison
  - Generates a LaTeX table with summary statistics (mean, min, median, max) for both AF and IF groups
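For orientation, here is a minimal sketch of the kind of call DiffinDiff.Rmd builds on: the Borusyak et al. imputation estimator from the didimputation package. The column names below (repo_id, month, adoption_month, outcome) are illustrative assumptions, not necessarily the names used in the actual notebook.

```r
# Minimal DiD imputation sketch; column names are illustrative assumptions.
library(didimputation)
library(readr)

panel <- read_csv("data/panel_event_monthly.csv")

# did_imputation() imputes untreated counterfactual outcomes from
# not-yet-treated observations (Borusyak, Jaravel & Spiess estimator).
static <- did_imputation(
  data   = panel,
  yname  = "outcome",         # outcome metric (assumed column name)
  gname  = "adoption_month",  # period of agent adoption (assumed column name)
  tname  = "month",           # calendar time (assumed column name)
  idname = "repo_id"          # repository identifier (assumed column name)
)
print(static)
```

Passing horizon = TRUE to did_imputation() returns event-study (dynamic) estimates rather than a single static effect, which is the kind of output behind plots like dynamic_effects.pdf.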
plots/: Contains all figures generated by the notebooks:
- dynamic_effects.pdf: Event study plot showing dynamic treatment effects across six outcomes
- agent_adoption_time_combined.pdf: Bar chart showing the adoption timing distribution by group
Our data collection and processing workflows are based on adaptations of the scripts originally provided in the following replication package:
Hao He, Courtney Miller, Shyam Agarwal, Christian Kästner, and Bogdan Vasilescu. 2026. Speed at the Cost of Quality: How Cursor AI Increases Short-Term Velocity and Long-Term Complexity in Open-Source Projects. In 23rd International Conference on Mining Software Repositories (MSR ’26), April 13–14, 2026, Rio de Janeiro, Brazil. ACM, New York, NY, USA, 19 pages. https://doi.org/10.1145/3793302.3793349
Source: https://zenodo.org/records/18368661
Specifically, we modified their scripts to support our study goals, including the detection of AI tool traces, event data collection, propensity score matching, and repository metric aggregation. Our adaptations focus on distinguishing between repositories with and without prior AI tool usage (similar to their robustness check), and analyzing the differential impact of agent adoption across these groups. Please refer to the source above for the original codebase.
All analyses were performed using R 4.3.3. Required R packages:
```r
install.packages(c(
  "didimputation", # Borusyak et al. DiD imputation estimator
  "ggplot2",       # Plotting
  "dplyr",         # Data manipulation
  "data.table",    # Fast data operations
  "readr",         # Reading CSV files
  "tidyr",         # Data tidying
  "stringr",       # String manipulation
  "purrr",         # Functional programming
  "lubridate",     # Date/time manipulation
  "knitr",         # Dynamic report generation
  "kableExtra",    # Enhanced table formatting
  "Cairo"          # High-quality graphics device
))
```

- R: Version 4.3.3 or compatible
- Cairo graphics library: Required for PDF generation (install system dependencies as needed)
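As an optional pre-flight check (this snippet is a suggestion, not part of the original package), you can confirm the Cairo device is usable before knitting:

```r
# Optional: confirm Cairo-based PDF output is available on this system.
library(Cairo)
Cairo.capabilities()  # the "pdf" entry should be TRUE
```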
All required data files should be present in the data/ folder. Any files omitted from this repository due to size limitations will be available on Zenodo: [Zenodo DOI placeholder - to be added]
Please download the full dataset from Zenodo and place all files in the data/ folder to ensure complete reproducibility.
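A small helper like the one below (not part of the original package; file names taken from the data description above) flags any files still to be fetched from Zenodo:

```r
# Flag expected data files that are missing from data/ (helper sketch).
expected <- c(
  "panel_event_monthly.csv", "repos_with_details.csv", "matching.csv",
  "repo_events.csv", "repo_events_control.csv",
  "ts_repos_monthly.csv", "ts_repos_control_monthly.csv",
  "all_scraped_prs_final_list.csv", "agent_first.txt", "ide_first.txt"
)
missing <- setdiff(expected, list.files("data"))
if (length(missing) > 0) {
  message("Fetch from Zenodo into data/: ", paste(missing, collapse = ", "))
} else {
  message("All expected data files are present.")
}
```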
Install R 4.3.3 and the required packages listed above. Ensure Cairo graphics support is available for PDF generation.
Knit the notebooks in RStudio or with R's rmarkdown::render() function; a batch-rendering sketch follows the list below. The notebooks should be executed in the following order:
1. RepoMetricsAnalysis.Rmd: Generates the descriptive statistics table (LaTeX format)
   - Output: LaTeX table printed to the console
2. AdoptionTimeAnalysis.Rmd: Analyzes adoption timing patterns
   - Output: plots/agent_adoption_time_combined.pdf
3. DiffinDiff.Rmd: Main DiD analysis for the AF and IF control groups
   - Outputs: static treatment effects table (displayed in HTML) and plots/dynamic_effects.pdf (event study plot)
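The batch-rendering sketch referenced above is shown here; the notebooks/ directory name is an assumption, so adjust the paths to wherever the .Rmd files live in this package:

```r
# Render the three notebooks in the prescribed order (run from the repo root).
# rmarkdown::render() evaluates each .Rmd with the file's own directory as the
# working directory, so the relative paths ../data/ and ../plots/ resolve.
notebooks <- c(
  "notebooks/RepoMetricsAnalysis.Rmd",  # assumed directory name
  "notebooks/AdoptionTimeAnalysis.Rmd",
  "notebooks/DiffinDiff.Rmd"
)
for (nb in notebooks) {
  rmarkdown::render(nb)
}
```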
Each notebook reads data from ../data/ and saves outputs to ../plots/ (relative to the notebook location).
- HTML outputs: Each notebook generates an HTML file with embedded tables and plots
- PDF plots: High-quality PDF figures are saved in the plots/ directory
- LaTeX tables: Descriptive statistics are printed in LaTeX format to the console
- All data files are pre-processed and ready for analysis. The raw data collection scripts are available in the referenced Cursor AI study replication package.
- Results may vary slightly due to R package version differences, but the overall findings should be consistent.
- The notebooks are designed to be self-contained and reproducible with the provided data.
If you use this replication package, please cite:
Shyam Agarwal, Hao He, and Bogdan Vasilescu. 2026. AI IDEs or Autonomous Agents? Measuring the Impact of Coding Agents on Software Development. In 23rd International Conference on Mining Software Repositories (MSR ’26), April 13–14, 2026, Rio de Janeiro, Brazil. ACM, New York, NY, USA, 5 pages. https://doi.org/10.1145/3793302.3793589