Skip to content

Commit 3aae9c5

Browse files
authored
Merge pull request #35 from MIT-Emerging-Talent/findings
Milestone 5 - Findings folder update
2 parents 38c1a5c + 8e518bf commit 3aae9c5

Some content is hidden

Large Commits have some content hidden by default. Use the searchbox below for content that may be hidden.

44 files changed

+854
-0
lines changed

4_findings/README.md

Lines changed: 50 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,50 @@
1+
# Findings Folder
2+
3+
A cohesive overview of the study “Open-Source vs. Commercial AI: Comparing Performance
4+
and Quality,” including the finalized report, raw responses, and key links.
5+
6+
## Overview
7+
8+
- Experiment dates: November 21–December 3, 2025
9+
- Sample size: 42 responses
10+
- Design: Blinded, side-by-side comparison of open-source vs. commercial AI models
11+
across eight Apollo‑11–themed tasks (summarization, paraphrasing, reasoning,
12+
creative writing)
13+
14+
## Contents
15+
16+
- `README.md`
17+
Folder guide with overview, quick findings, and links
18+
- `findings_report.md`
19+
Final findings report (Markdown) with all charts embedded
20+
- `responses.csv`
21+
Raw Google Forms responses (CSV exported from the live Google Sheet)
22+
23+
## Quick findings
24+
25+
- Summarization and reasoning: commercial models generally rate higher, but margins
26+
are modest.
27+
- Paraphrasing and creative writing: preferences are more balanced; model identity
28+
is often hard to distinguish.
29+
- Uncertainty matters: “not sure / can’t tell” responses are informative and
30+
suggest convergence in perceived quality under blind conditions.
31+
- Identification difficulty: many participants struggled to correctly label outputs
32+
as open-source or commercial, reinforcing that style and quality can overlap
33+
depending on task and prompt.
34+
35+
## Data source (Google Sheets)
36+
37+
- Primary data source (cleaned headers, all responses):
38+
[Google Sheets](https://docs.google.com/spreadsheets/d/1Bm4geFzEUw9qNFrUuG4MAFSxkr_JFJY8HObnniywYFY/edit?usp=sharing)
39+
40+
## Report (PDF, external)
41+
42+
- Shareable PDF version of the findings report:
43+
[Findings Report — PDF](https://drive.google.com/file/d/1GsjNYVLDgeXjm95937c2C-epp8M82Ytc/view?usp=sharing)
44+
45+
## Notes
46+
47+
- Charts are embedded directly in `findings_report.md`.
48+
- `responses.csv` mirrors the Google Sheet at the time of export.
49+
- The study focuses on output-level evaluation under blind conditions; uncertainty
50+
and task dependence are treated as meaningful signals in interpreting quality.

0 commit comments

Comments
 (0)