Setting up projects locally: my experience from an internship and open-source contribution
Hi folks, whenever you join a new company or want to contribute to a new open-source project, you need to set the project up and get it running on your own machine. This is the part where new joiners and contributors, including me, struggle most often.
Between contributing to the Pandas open-source project and my internship, I have picked up a few things worth sharing.
My Experience during an Internship
During my internship, I had to set up a project locally to submit PRs and bug fixes. It was the first time I had set up such a big project locally, and there was no documentation for the project setup.
We need two things to be up and running locally:
Project environment/code base
Database
Project environment setup:
Why do we need to set up the project environment locally and get it running?
To be able to experiment with the codebase and, in short, to see the effect of your changes without breaking production.
To set up the project locally I got help from my mentor, who was very kind. He helped me set up both the project and the database on my machine.
I wrote down every step and then created a PR documenting the project environment and database setup, so that new joiners can set up the project easily and the mentor can focus on other important tasks.
My advice
Ask your mentor to guide you, or find out whether setup documentation already exists.
For database setup, ask colleagues for the SQL schema (.sql) files instead of creating them on your own.
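To show what loading a shared schema can look like, here is a minimal sketch. It assumes SQLite purely for illustration (I am not saying which database the internship project used), and the table names and the load_schema helper are hypothetical. In practice the schema text would come from the .sql file a colleague sends you.

```python
import sqlite3

# Hypothetical schema text; in practice this would be read from the
# .sql file a colleague shares, e.g. Path("schema.sql").read_text().
SCHEMA_SQL = """
CREATE TABLE users (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
CREATE TABLE orders (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id)
);
"""


def load_schema(connection, schema_sql):
    """Run every statement in the schema and return the table names it created."""
    connection.executescript(schema_sql)
    rows = connection.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [name for (name,) in rows]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # throwaway local database
    print(load_schema(conn, SCHEMA_SQL))
    conn.close()
```

For another database such as PostgreSQL or MySQL, the idea is the same, but you would use that database's own client or driver to run the file.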
My Experience with contributing to Pandas
Until then, I had only contributed to open-source projects by fixing typos in documentation or code, so I never had to set up a project locally.
But this time I made up my mind to try contributing to Pandas. Before starting on code contributions or bug fixes, I first contributed to the documentation and got to know the process.
Things I followed
Understood the project by contributing to documentation, which does not require setting up the project locally
Read the contribution guidelines, which cover the whole code contribution process, such as the following
Set up the project environment locally
3 ways: through pip/meson, Docker, or Gitpod.
I chose pip/meson
Then, after installing the dependencies, build the project
If you can import pandas in a terminal and print its version
Congratulations, you have successfully built pandas locally
And I succeeded at it.
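The version check from the last step can be sketched like this. The installed_version helper is my own hypothetical name and not part of Pandas; it relies only on the fact that pandas exposes a `__version__` attribute once it imports cleanly.

```python
import importlib


def installed_version(module_name):
    """Return the module's __version__ string.

    Returns None if the import fails or the module has no __version__.
    """
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return getattr(module, "__version__", None)


if __name__ == "__main__":
    version = installed_version("pandas")
    if version is None:
        print("pandas could not be imported; the build probably did not succeed")
    else:
        print(f"pandas {version} imported successfully")
```

If this prints a version, the local build is importable and you can start making changes.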
Since then, as of publishing this blog, I have had 7 pull requests merged into the Pandas codebase.
This was the first time I contributed to such a big project, one used by thousands of machine learning and data science professionals.
Challenge I faced
In pandas, when certain files change, we have to rebuild Pandas using the C compiler. Partway through my contributions, the build process was switched to meson to make building Pandas faster.
So I had to build Pandas using the new meson process, but the build failed.
I tried to solve it for a whole day but couldn't.
I asked about it on Slack. Kudos to the members ♥,
I got many suggestions from them. But for some reason, I was still not able to build the project.
One of the members had already given the correct solution, but by then I was trying different methods. Originally I was getting errors with the pip method, and to try the other suggestions from Slack I moved on to building with Mamba, and then finally Gitpod.
Neither Mamba nor Gitpod worked for me, and I ended up spending 2 weeks just trying to build the project.
I almost lost hope that I could contribute any further to Pandas.
But I gave it one last chance and tried pip again, and this time it built successfully.
What was the solution?
As I mentioned earlier, I had already applied the solution given by one of the members, but I had applied it while trying a different method, Mamba.
I was also using a somewhat older version of Visual Studio and the Visual Studio Build Tools, so I updated to the latest version, tried pip again, and it worked.
My Advice
Read the documentation for the project environment setup
Join a communication channel, like I joined Slack
Ask your question in Slack, and make sure to explain it well in the discussion.
Have patience and try again
Thank you so much for reading this far.
If you found it valuable or useful, give it a like or a comment.
If you have any feedback, I'm more than happy to receive it.