Transitioning to open source development
Written by Tim Boudreau, Open Source Software Engineer, G-Research
Marc Andreessen famously quipped that “software is eating the world,” but it’s becoming increasingly apparent that it’s actually open source software that’s eating the world. And people, as well as the companies they work for, are working to make the adjustment, accommodating new ways of writing code and new ways of doing business. I am one of them.
After writing proprietary enterprise and commercial software for my entire career, I finally took the plunge into open source development a few years ago when I joined G-Research’s Open Source team.
If you have a proprietary development background like mine and you’re also thinking of contributing to a major open source software (OSS) project, here are four tips you may find useful.
Tip 1 – Implement an already agreed feature
Naturally, you should work on features which are valuable for you (or your employer) and for which you have the requisite expertise. But if you’re proposing a new feature, be aware that OSS differs from in-house development, where the product decision makers are typically few in number and share the same vision for the product.
Large, open source projects are governed collaboratively by a set of important contributors and maintainers; all interested community members are welcome and encouraged to chime in.
That means that OSS programming requires developers to design within the bounds of what the community has determined that it needs. Often, that requires agreement, forged through negotiation and compromise.
In fact, “open design” is considered by some to be one of the principal pillars of a healthy, well-run OSS project. Although the overall scope of the project will be shared, each community member will naturally have a slightly different vision for what’s core to the project, what should be done next, how to prioritise the backlog of feature requests, and what’s undesirable.
It’s fair to say that achieving consensus on what should be built is at least as big a hurdle as agreeing on how it should be implemented technically. This rather flips on its head the in-house process where (at least in my general experience) it’s easy to agree on what should be built, while deciding how to build it can be contentious.
In my case, I got lucky for my first contribution inasmuch as the facility I helped to add to the Parquet C++ library had already been blessed by the community and had in fact been added already to the Parquet specification – it was just awaiting an implementer. I didn’t plan it that way, but this allowed me to dive right into the action (writing code!) without protracted discussion of the feature.
If you too just want to dive in – to get used to the development process, or the codebase, or the community – try to pick a feature in your project for which the community has already reached consensus.
Tip 2 – Submit design proposals
For my Parquet example above, although the feature had been specified at the format level, the implementation was somewhat greenfield. There were no existing abstractions or analogues in the codebase that were directly useful in building out the functionality. And the library was in C++ which (let’s just say politely) affords several alternate paradigms and idioms for designing solutions.
The codebase was also uneven – some written in a traditional style and some in very modern C++, some well-tested, some less so, and so on.
Uncertain of how to proceed, I picked what I thought was a reasonable approach, coded it up, wrote a lot of unit tests and opened a Pull Request (PR). That took time – both in deciding on design and implementing code and tests.
In retrospect, I would have done it slightly differently: I would have sketched out the proposed approach (the interface, the broad strokes of the planned implementation, maybe an outline of the testing strategy) much earlier and listed alternative approaches that I considered and rejected. In the end, I re-implemented the feature several times both on my own initiative and in response to insightful review comments.
This meant a good deal of the time I spent polishing the first implementation was wasted.
My takeaway from this is that there’s little point in trying to predict precisely what the current set of reviewers and committers will consider acceptable: ask the question and listen to feedback because too much communication is better than too little.
And that lesson applies to the feature suggestion process as well as the code submission process. Find the most economical way to express your intended contribution clearly and fully and solicit feedback as early as possible.
Tip 3 – Relax expectations on timelines
The proverb says: if you want to go fast, go alone, but if you want to go far, go together. Major OSS projects generally want to go far, so they willingly give up development speed to achieve stability and longevity.
There’s just no way around it: driving consensus among the large, heterogeneous, geographically dispersed set of volunteers that make up most OSS communities is sometimes difficult and almost always time-consuming.
You won’t enjoy the fast feedback loop that you’re used to for in-house development. At every stage of the process (feature adoption, design agreement and code review), things will likely go slower than you’re accustomed to.
Tip 4 – Stay involved
My initial Parquet contribution consisted of two PRs and after they landed, I moved to the sidelines on that project. But I kept watching the Parquet mailing list fairly attentively.
I’m glad I did. After the library version containing my changes was released, there was a hiccup related to my new functionality. The new feature was correctly implemented, but it turned out that some of the assumptions built into the feature’s specification about how users were using the pre-existing (now superseded) feature were wrong. This meant that Parquet files written previously by several projects were now being misinterpreted when read by the new version of the library.
I could have been disheartened, but the community – including me, now a small part of it, I suppose – fairly quickly introduced workarounds to resolve the issues. This provided a valuable reminder: even though you’re a volunteer in an OSS community, professional courtesy still includes standing behind your work product.
It’s possible to make the jump to OSS developer. Collaboration runs through every step of OSS development. It may feel inefficient at first, but the resulting community ownership of the product is a worthy goal.