💖 Thanks for opening this pull request! 💖
@EmilyXinyi Could you maybe add a link to your Vlog or embed it here? I think it would be a nice addition to this well-written blog post about your internship experience.
I embedded the video for now, though I am not sure how good it will look on the blog because it's a portrait (vertical) video. Alternatively, I can post it on social media first and link it here.
| ### Open Source Development | ||
| I started my contributions by adapting certain metrics (Tweedie, mean absolute percentage error, etc.) to be Array API compatible under the guidance of my mentor, Olivier. The Array API standard is a cross-library API for array operations in Python, designed to improve interoperability and consistency across different array libraries. This also means that scikit-learn algorithms written with NumPy for CPU can run on other hardware (such as GPUs) with PyTorch or CuPy, which can greatly improve performance. As I gained more familiarity with the scikit-learn codebase and the Array API, I began adapting “larger” functions to be Array API compatible: functions that are a lot more fundamental, have a lot more dependencies, and are a lot more challenging and a lot more fun to work on. |
| I started my contributions by adapting certain metrics (Tweedie, mean absolute percentage error, etc.) to be Array API compatible under the guidance of my mentor, [Olivier](https://github.com/ogrisel). The Array API standard is a cross-library API for array operations in Python, designed to improve interoperability and consistency across different array libraries. This also means that scikit-learn algorithms written with NumPy for CPU can run on other hardware (such as GPUs) with PyTorch or CuPy, which can greatly improve performance. As I gained more familiarity with the scikit-learn codebase and the Array API, I began adapting “larger” functions to be Array API compatible: functions that are a lot more fundamental, have a lot more dependencies, and are a lot more challenging and a lot more fun to work on. |
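
For readers unfamiliar with the idea, here is a minimal sketch of what "Array API compatible" can look like for a metric such as mean absolute percentage error. It is only an illustration under stated assumptions, not scikit-learn's actual implementation: `namespace_of` and `mape` are hypothetical names, and the fallback to NumPy is a simplification (real projects typically rely on a compatibility layer to find the right namespace for PyTorch or CuPy arrays).

```python
import numpy as np


def namespace_of(x):
    """Hypothetical helper: return the array namespace that owns ``x``.

    Arrays implementing the Array API standard expose
    ``__array_namespace__``; anything else falls back to NumPy here.
    """
    if hasattr(x, "__array_namespace__"):
        return x.__array_namespace__()
    return np


def mape(y_true, y_pred):
    """Illustrative mean absolute percentage error.

    Every array operation goes through ``xp`` instead of calling NumPy
    directly, so the same code can run on any array type whose
    namespace provides ``abs``, ``maximum``, ``mean`` and ``finfo``.
    """
    xp = namespace_of(y_true)
    # Guard against division by zero with the dtype's machine epsilon.
    eps = xp.finfo(y_true.dtype).eps
    relative_error = xp.abs(y_true - y_pred) / xp.maximum(xp.abs(y_true), eps)
    return float(xp.mean(relative_error))


# Usage with plain NumPy arrays (floating-point dtype assumed).
y_true = np.asarray([1.0, 2.0, 4.0])
y_pred = np.asarray([1.1, 1.8, 4.4])
score = mape(y_true, y_pred)
```

The design point is that the function never names a specific array library in its body; it asks the input which namespace to use and stays within the operations that namespace provides.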
| ### Chinese Community Outreach | ||
| China has the second-largest user group of scikit-learn. As a community, we believe that we can be more inclusive, make contributing easier for people in China, and do what is necessary to recruit more Chinese contributors. Therefore, I need to find out who is using scikit-learn and where, check whether development is happening on platforms other than GitHub (which tends to be very slow in China), and establish scikit-learn’s official presence in the Chinese community. |
| China has the second-largest user group of scikit-learn according to documentation web analytics. As a community, we believe that we can be more inclusive, make contributing easier for people in China, and do what is necessary to onboard more Chinese contributors. Therefore, I need to find out who is using scikit-learn and where, check whether development is happening on platforms other than GitHub (which tends to be very slow in China), and establish scikit-learn’s official presence in the Chinese community. |
| I also had weekly Peer Programming sessions with Loïc and Stefanie, where my piled-up questions from the week outside of Array API would be answered, and I would almost always learn something new about developer tools or programming fundamentals. | ||
| On the Chinese community outreach side, I have always worked with the scikit-learn communications team. Here I must give a special shoutout to my manager François, who is also part of the communications team, for always being supportive and believing in my outreach efforts, especially because I was nervous about doing this kind of task and using Chinese in a professional context for the first time. I also got to interact with [Charlie](https://charlie-xiao.github.io/) (yes, the core-dev Charlie), who is located in China and helped me tremendously with tasks that require physical presence. |
| On the Chinese community outreach side, I have always worked with the scikit-learn communications team. Here I must give a special shoutout to my manager [François](https://www.linkedin.com/in/françois-goupil/), who is also part of the communications team, for always being supportive and believing in my outreach efforts, especially because I was nervous about doing this kind of task and using Chinese in a professional context for the first time. I also got to interact with [Charlie](https://charlie-xiao.github.io/) (yes, the core-dev Charlie), who is located in China and helped me tremendously with tasks that require physical presence. |

Blog post about Emily's summer internship
cc: @francoisgoupil