Setting up projects locally: my experience from an internship and open-source contribution


Hi folks, whenever you join a new company or want to contribute to a new open-source project, you need to get the project running locally on your machine. This is the part where new joiners and contributors most often struggle, me included.

Having contributed to the open-source pandas project and completed an internship, I have a few lessons to share.

My Experience during an Internship

During my internship, I had to set up a project locally to submit PRs and bug fixes. It was the first time I had set up such a big project locally, and there was no documentation for the project setup.

We need two things to be up and running locally:

  1. Project environment/code base

  2. Database

Project environment setup:

Why do we need to set up the project environment and get it running locally?

To be able to experiment with the codebase: in short, to see the effect of your changes without breaking production.

To set up the project locally, I got help from my mentor, who was very kind. He helped me set up both the project and the database locally.

I wrote down every step and then opened a PR documenting the project environment and database setup, so that new joiners can set up the project easily and mentors can focus on other important tasks.

My advice

✅ Ask your mentor to guide you, or find out whether setup documentation exists.

✅ For database setup, ask colleagues for the SQL schema (.sql) files instead of creating them on your own.
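
As a rough illustration of the second tip, here is a minimal Python sketch that creates a local database from a schema file. It assumes SQLite (via Python's built-in `sqlite3` module) and hypothetical file names such as `schema.sql`. Your project's database engine will most likely differ, so treat this only as the general shape of the step.

```python
import sqlite3
from pathlib import Path


def load_schema(db_path: str, schema_file: str) -> list[str]:
    """Create a local SQLite database from a .sql schema file.

    Both arguments are hypothetical examples; use whatever paths and
    database engine your project actually needs.
    Returns the names of the tables that exist afterwards.
    """
    schema_sql = Path(schema_file).read_text(encoding="utf-8")
    conn = sqlite3.connect(db_path)
    try:
        # executescript runs every statement in the file, one after another
        conn.executescript(schema_sql)
        conn.commit()
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    finally:
        conn.close()
    return [name for (name,) in rows]
```

Calling `load_schema("local_dev.db", "schema.sql")` would then give you a quick list of tables to compare against what your colleagues expect.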

My Experience Contributing to pandas

Until then, my open-source contributions had only fixed typos in documentation or code, so I never had to set up a project locally.

But this time I made up my mind to contribute to pandas. Before starting on code contributions or bug fixes, I first contributed to the documentation and got to know the process.

Things I followed

  • Understood the project by contributing to documentation, which does not require setting up the project locally

  • Read the contribution guidelines, which cover the whole code contribution process, including the following

    • Set up the project environment locally

      • Three ways: pip/meson, Docker, or Gitpod.

      • I chose pip/meson

      • Then, after installing the dependencies, build the project

      • If you can import pandas in a terminal and print its version, the build worked

      • 🎉 Congratulations, you have successfully built pandas locally

      • And I succeeded at it 🎉.
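
The final check in the list above (import pandas and print its version) can be sketched in Python like this. The helper name `describe_build` is made up for this post. The idea that the printed location should normally point into your cloned repository for a development build is an assumption you should confirm against your own setup.

```python
import importlib


def describe_build(module_name: str = "pandas"):
    """Return the version and location of a module, or None if it cannot be imported."""
    try:
        module = importlib.import_module(module_name)
    except ImportError as exc:
        # A failed import usually means the build or install step did not finish
        print(f"Could not import {module_name}: {exc}")
        return None
    return {
        "version": getattr(module, "__version__", "unknown"),
        # Assumption: for a local development build this normally points
        # into your clone, not into a released copy in site-packages
        "location": getattr(module, "__file__", "unknown"),
    }


if __name__ == "__main__":
    print(describe_build())
```

If the function returns `None`, the import itself failed, which is the signal to go back to the build step.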

Since then, as of publishing this post, I have had 7 pull requests merged into the pandas codebase 🎉.

This was the first time I contributed to such a big project, one used by thousands of machine learning and data science professionals.

Challenge I faced

In pandas, when certain files change, the project has to be rebuilt with a C compiler. Partway through my contributions, the build process was switched to meson to make building pandas faster.

So I had to build pandas using the new meson process, but the build failed.

I tried for a whole day to solve it but couldn't.

I asked about it on Slack. Kudos to the members 🙏♥.

I got so many solutions from them. But for some reason, I was still not able to build the project.

One of the members had in fact already given the correct solution, but by then I was trying other methods. I had originally been getting errors with the pip method, so to try the different suggestions from Slack I moved on to building with Mamba, and finally with Gitpod.

Neither Mamba nor Gitpod worked for me, even after I had spent 2 weeks just trying to build the project.

I had almost lost hope of contributing any further to pandas.

But still, I gave it one last chance and tried pip again, and this time it built successfully 🎉😭.

What was the solution?

As I mentioned earlier, I had already applied the solution given by one of the members, but at the time I was trying it with a different method, Mamba.

I had been using slightly older versions of Visual Studio and the Visual Studio Build Tools, so I updated both to the latest versions, tried pip again, and succeeded 🙌🎉.

My Advice

✅ Read the project's environment setup documentation

✅ Join the project's communication channel, as I did with Slack

✅ Ask your question on Slack, and make sure to explain the problem well

✅ Have patience and try again
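
One way to explain a build problem well is to include basic details about your machine alongside the error message. Here is a small sketch that collects a few of them using only the Python standard library. The function name `environment_report` and the choice of fields are my own suggestions, not something the pandas project asks for.

```python
import platform
import sys


def environment_report() -> str:
    """Collect basic environment details to paste into a help request."""
    lines = [
        f"OS: {platform.system()} {platform.release()}",
        f"Machine: {platform.machine()}",
        f"Python: {platform.python_version()} ({sys.executable})",
        # This is the compiler that built the Python interpreter itself,
        # which is not necessarily the compiler installed on your machine
        f"Python built with: {platform.python_compiler()}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(environment_report())
```

Since my own fix turned out to be updating Visual Studio and its build tools, it is also worth adding the versions of those tools by hand, because this sketch does not detect them.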

🙏 Thank you so much for reading this far.

If you found it valuable or useful, give it a 👍 like or 📝 comment.

If you have any feedback, I'm more than happy to receive it 😀
