| 1 | +# Findings Folder |
| 2 | + |
| 3 | +A cohesive overview of the study “Open-Source vs. Commercial AI: Comparing Performance |
| 4 | +and Quality,” including the finalized report, raw responses, and key links. |
| 5 | + |
| 6 | +## Overview |
| 7 | + |
| 8 | +- Experiment dates: November 21–December 3, 2025 |
| 9 | +- Sample size: 42 responses |
| 10 | +- Design: Blinded, side-by-side comparison of open-source vs. commercial AI models |
| 11 | +  across eight Apollo‑11–themed tasks spanning summarization, paraphrasing, |
| 12 | +  reasoning, and creative writing |
| 13 | + |
| 14 | +## Contents |
| 15 | + |
| 16 | +- `README.md` |
| 17 | + Folder guide with overview, quick findings, and links |
| 18 | +- `findings_report.md` |
| 19 | + Final findings report (Markdown) with all charts embedded |
| 20 | +- `responses.csv` |
| 21 | + Raw Google Forms responses (CSV exported from the live Google Sheet) |
| 22 | + |
| 23 | +## Quick findings |
| 24 | + |
| 25 | +- Summarization and reasoning: commercial models are generally rated higher, but |
| 26 | +  the margins are modest. |
| 27 | +- Paraphrasing and creative writing: preferences are more balanced; model identity |
| 28 | + is often hard to distinguish. |
| 29 | +- Uncertainty matters: “not sure / can’t tell” responses are informative and |
| 30 | + suggest convergence in perceived quality under blind conditions. |
| 31 | +- Identification difficulty: many participants struggled to correctly label outputs |
| 32 | + as open-source or commercial, reinforcing that style and quality can overlap |
| 33 | + depending on task and prompt. |
| 34 | + |
| 35 | +## Data source (Google Sheets) |
| 36 | + |
| 37 | +- Primary data source (cleaned headers, all responses): |
| 38 | + [Google Sheets](https://docs.google.com/spreadsheets/d/1Bm4geFzEUw9qNFrUuG4MAFSxkr_JFJY8HObnniywYFY/edit?usp=sharing) |
| 39 | + |
| 40 | +## Report (PDF, external) |
| 41 | + |
| 42 | +- Shareable PDF version of the findings report: |
| 43 | +  [Findings Report (PDF)](https://drive.google.com/file/d/1GsjNYVLDgeXjm95937c2C-epp8M82Ytc/view?usp=sharing) |
| 44 | + |
| 45 | +## Notes |
| 46 | + |
| 47 | +- Charts are embedded directly in `findings_report.md`. |
| 48 | +- `responses.csv` mirrors the Google Sheet at the time of export. |
| 49 | +- The study focuses on output-level evaluation under blind conditions; uncertainty |
| 50 | + and task dependence are treated as meaningful signals in interpreting quality. |