| 1 | +<!-- markdownlint-disable MD024 --> |
| 2 | +<!-- Disabled MD024 (Multiple headings with the same content) rule |
| 3 | +because repeated headings (Summary, Action Items) are |
| 4 | +intentionally used across multiple sections for structural clarity. |
| 5 | +--> |
| 6 | +# Milestone 4 Meeting Minutes |
| 7 | + |
| 8 | +## Meeting 18 |
| 9 | + |
| 10 | +**Date:** November 19, 2025 (Wednesday, 1:00 PM EST) |
| 11 | +**Attendees:** Amro, Aseel, Banu, Caesar, Reem, Safia |
| 12 | + |
| 13 | +### Summary |
| 14 | + |
| 15 | +- The research question was refined with Evan's help: |
| 16 | + - During the group conversation with Evan on Slack, the team initially drafted a |
| 17 | + precise question focusing on whether optimized open-source models (e.g., via |
| 18 | + recursive editing, distillation) could become environmentally and |
| 19 | + functionally viable alternatives to commercial models. |
| 20 | + - However, Evan advised that the ELO2 project should remain **open-ended**, |
| 21 | + shifting toward a broader guiding question: |
| 22 | + **“How can we achieve similar results to large private models on smaller |
| 23 | + devices and with less power consumption?”** |
| 24 | + - As a result, the final deliverable will be a **comprehensive portfolio** of |
| 25 | + experiments, benchmarks, comparisons, and promising directions—rather than a |
| 26 | + single definitive answer. |
| 27 | +- Based on Evan’s feedback, the upcoming **Google Form** will include both |
| 28 | + **commercial** and **open-source** model responses. |
| 29 | +- Initially, the plan was to pair open-source SLMs with commercial models |
| 30 | + of similar size, but this was not feasible due to limited access. The team |
| 31 | + instead decided to **use accessible commercial LLMs (ChatGPT, Claude, |
| 32 | + Gemini).** |
| 33 | +- **All questions** across all categories will be used in the **Google Form**. |
| 34 | +- Finalized model assignments: |
| 35 | + - **Aseel** → ChatGPT |
| 36 | + - **Caesar** → Claude Haiku 4.5 |
| 37 | + - **Amro** → Gemini Pro 3 |
| 38 | + - **Banu** → Gemini Fast (Flash 2.5) |
| 39 | + - **Reem** → Gemini Flash 2.5 Lite (via API/HuggingFace) |
| 40 | + |
| 41 | +### Action Plan |
| 42 | + |
| 43 | +- Each member will generate **responses for all question prompts** using their |
| 44 | + assigned model. |
| 45 | +- All responses must be uploaded to the |
| 46 | + [shared document](https://docs.google.com/document/d/1CBYpsLvkeE5aLKp1-6vPaiizDXVf80o2uH42XL5gXtw/edit?tab=t.0) |
| 47 | + **by tomorrow**. |
| 48 | +- In tomorrow’s meeting, the team will review all open-source and commercial |
| 49 | + model outputs and **select the final answers** to include in the Google Form. |
| 50 | + |
| 51 | +--- |
| 52 | + |
| 53 | +## Meeting 19 |
| 54 | + |
| 55 | +**Date:** November 20, 2025 (Thursday, 2:30 PM EST) |
| 56 | +**Attendees:** Amro, Aseel, Caesar, Reem, Safia |
| 57 | + |
| 58 | +### Summary |
| 59 | + |
| 60 | +- The team revisited the original plan for the evaluation form. Initially, |
| 61 | + **all 21 questions** across all task categories were intended to be included. |
| 62 | +- However, because a 21-question survey would be too long for participants, the |
| 63 | + group agreed to **select only two questions per category** to keep the form |
| 64 | + manageable. |
| 65 | +- During this selection process, the team decided to **exclude the |
| 66 | + Retrieval/RAG category** entirely, since its questions require factual lookup |
| 67 | + (dates, names, quantities), which does not align well with the survey’s goal |
| 68 | + of evaluating reasoning or generation quality. |
| 69 | +- As a result, the form will include **four categories**—Reasoning, |
| 70 | + Summarization, Creative Writing, Paraphrasing—with **two questions per |
| 71 | + category**, for a total of **8 questions**. |
| 72 | + _(The selected Q&As can be found |
| 73 | + [here](https://docs.google.com/document/d/1CBYpsLvkeE5aLKp1-6vPaiizDXVf80o2uH42XL5gXtw/edit?tab=t.ugqqnecewdh7), |
| 74 | + in the shared document.)_ |
| 75 | +- The group reaffirmed that **each task category will be represented by one |
| 76 | + model pair**, ensuring all models contribute to the study. |
| 77 | +- The team decided to **pair each open-source model with the closest commercial |
| 78 | + model** for comparative evaluation. |
| 79 | +- The final task–model pairings were confirmed as: |
| 80 | + - **Reasoning:** Gemma ↔ Claude Haiku 4.5 |
| 81 | + - **Summarization:** LaMini ↔ Gemini Flash |
| 82 | + - **Creative Writing:** Mistral ↔ Gemini Pro 3 |
| 83 | + - **Paraphrasing:** Qwen ↔ ChatGPT |
| 84 | + |
| 85 | +### Action Plan |
| 86 | + |
| 87 | +- Add the selected model responses to the **Google Form** initially created by |
| 88 | + Banu and finalize the form. |
| 89 | +- Submit the form to **Evan** for feedback, then incorporate any revisions. |
| 90 | +- **Publish** the finalized form to the cohort group and **collect responses |
| 91 | + until November 30**. |
| 92 | + |
| 93 | +--- |
| 94 | + |
| 95 | +## Meeting 20 |
| 96 | + |
| 97 | +**Date:** November 25, 2025 (Tuesday, 2:30 PM EST) |
| 98 | +**Attendees:** Amro, Caesar, Reem, Banu |
| 99 | + |
| 100 | +### Summary |
| 101 | + |
| 102 | +- While the form was still collecting responses, the team reviewed the |
| 103 | + remaining deliverables for both **ELO2/Graduation requirements** and **the |
| 104 | + project itself**. |
| 105 | + |
| 106 | + 1. For **ELO2 and graduation**, the deliverables were revisited and confirmed |
| 107 | + as: |
| 108 | + - Repository |
| 109 | + - Presentation |
| 110 | + - 1000-word final testimonial _(individual)_ |
| 111 | + - 1000-word ELO2 retrospective _(individual)_ |
| 112 | + - Exit Survey _(individual)_ |
| 113 | + |
| 114 | + 2. For the **Green AI project’s final outputs**, the deliverables were identified |
| 115 | + as: |
| 116 | + - Repository |
| 117 | + - Article |
| 118 | + - Presentation |
| 119 | + - Form analysis |
| 120 | +- The team also discussed the expected article format and structure. |
| 121 | + - The article should narrate the project process by explaining motivations and |
| 122 | + the overall journey, with roughly **5–10%** on initial ideas, most of the |
| 123 | + content focusing on the work done, and a concluding section with findings |
| 124 | + and potential future directions. |
| 125 | +- Reem volunteered to create an infographic or visual summary that can be used |
| 126 | + for both the article and the presentation. |
| 127 | +- It was also noted that **on November 28 (Friday)**, support may be requested |
| 128 | + from Evan to help boost form participation through an announcement. |
| 129 | +- Since the form was published later than anticipated, the team decided to |
| 130 | + **close it on December 2nd instead of November 30th.** |
| 131 | + |
| 132 | +### Action Plan |
| 133 | + |
| 134 | +- **Caesar and Reem** to begin drafting the article for team review. |
| 135 | +- **Amro** to work on repository updates and refine the main README draft |
| 136 | + prepared by Banu earlier. |
| 137 | +- **Reem** to create an infographic using a **visualization tool** to summarize |
| 138 | + project results for the article, presentation, and Medium or similar platforms |
| 139 | + _(after the form closes)_. |
| 140 | +- The survey form will be **closed on December 2nd**, after which data analysis |
| 141 | + will begin. |
| 142 | +- **Banu, Safia, and Aseel** to work on the presentation, building on the |
| 143 | + initial draft previously prepared by Banu. |
| 144 | +- An announcement request **may be sent to Evan on November 28 (Friday)** to |
| 145 | + encourage more form responses. |
| 146 | + |
| 147 | +--- |
| 148 | + |
| 149 | +## Meeting 21 |
| 150 | + |
| 151 | +**Date:** December 2, 2025 (Tuesday, 12:30 PM EST) |
| 152 | +**Attendees:** Caesar, Reem, Aseel |
| 153 | + |
| 154 | +### Summary |
| 155 | + |
| 156 | +- The team discussed the status of the survey form and agreed to close it by the |
| 157 | + end of the day. After reviewing its current performance, they noted that most |
| 158 | + initial insights and demographics could already be observed through the Google |
| 159 | + Forms visualization tools, which provided a general overview of respondent |
| 160 | + characteristics and early trends. |
| 161 | +- The group revisited the remaining requirements and clarified what still needs |
| 162 | + to be completed for the final deliverables. A key part of the discussion |
| 163 | + focused on how the results will be structured and presented in both the |
| 164 | + article and the visual summary. This included considering how to best |
| 165 | + translate the survey findings into a clear narrative and an accompanying |
| 166 | + infographic or visualization. |
| 167 | +- The team confirmed that an additional meeting would be held on December 3rd to |
| 168 | + examine the survey results in more depth. During that session, they will |
| 169 | + identify any notable or unexpected findings that may require special emphasis |
| 170 | + or separate formats within the final outputs. |
| 171 | + |
| 172 | +### Action Plan |
| 173 | + |
| 174 | +- Close the survey form by end of day on December 2. |
| 175 | +- Begin outlining how survey results will be integrated into the article and |
| 176 | + visual materials. |
| 177 | +- Continue exploring the data using Google Forms visualizations to prepare for |
| 178 | + deeper analysis. |
| 179 | +- Meet again on December 3rd to review detailed results and determine standout |
| 180 | + findings or sections that require special formatting. |