Follow up: I have created another blog post to clarify some of these issues here.
The IPython project, through UC Berkeley and Cal Poly San Luis Obispo, just received a $1.15 million grant from the Alfred P. Sloan Foundation to develop the IPython Notebook over a two-year period. More details about this grant can be found on the IPython website. This is really exciting for us because, so far, we have mostly developed IPython in our spare time. But I think there is also a potential danger here. The danger is that we will add lots of new features. What, you say, lots of features will endanger IPython? What else are you going to do with a million dollars if you are not going to add lots of new features? The answer is simple: we are going to add as few features as possible and knock each of them out of the park. The future of the project depends on this.
This is a topic that I have been thinking about a lot lately: how do open source projects decide which features to implement? Most active open source projects I am involved in see a continual stream of new features. Just hop onto the GitHub pages for SymPy or IPython and watch the activity. Anyone with the right skill set can fork a project on GitHub and submit a pull request within a few hours. Amazingly, this is happening all the time; apparently people love to code. While each new feature is an asset for the project, it also brings a cost, or liability, with it. Ignoring those costs can have long-term, detrimental effects on the project. What are these liabilities and costs associated with new features?
- Each new feature adds complexity to the code base. Complexity makes a code base less hackable, maintainable, and extensible.
- Each new feature increases the “bug surface” of the project. When a feature also adds complexity, those bugs become harder to find and fix.
- Each new feature requires documentation to be written and maintained.
- Each new feature requires support over email or IRC.
- Endless feature expansion, or feature creep, requires developers to specialize. They can’t follow the entire project, so they have to focus on a subset that can fit into their brain and schedule.
- Each new feature has to be tested on a wide variety of platforms (Linux, Mac, Windows) and environments (PyPy, Python 2, Python 3).
- Each new feature adds complexity to the user experience. Sometimes it’s the documentation, other times the UI or configuration options.
- When you spend time on one feature, another feature or bug fix doesn’t get worked on. If you didn’t prioritize things beforehand, you just spent time on something less important to your users. Either that or you did shoddy work while trying to do it all.
- Features multiply like bunnies. How many times have you heard, “wow, that new feature is really cool, could you make it do X as well?”
- Features are easy to add, difficult to remove. Once you add a feature, you are stuck with the costs and liabilities.
For a typical open source project, what percentage of features get used regularly by most users? 15%? 40%? Let’s be really optimistic and say that number is 60%. That means that developers are spending roughly 40% of their time and effort on features that don’t really get used. Ouch. Then why do open source projects keep adding features without counting the cost? I think there are a number of factors that lead to this, but one in particular comes to my mind. It is hard to say no. When an end user submits a feature request over email or on GitHub issues, it is hard to tell them, “great idea, but we are not doing that.” It is even more difficult to say no after someone submits a pull request that implements a new feature. It is difficult to build a vibrant project community if you are always saying no in these contexts.
Clearly, we need a better way of limiting feature expansion in open source projects. What can we do to better evaluate the hidden costs of adding new features so we can make informed, strategic decisions about which features to add? Here are some ideas that have emerged out of my recent reading and conversations.
- Create a wiki page for your project, where you list all of the features you are not going to implement. Publicize this list, discuss it and make it an important part of the development workflow. Another way of phrasing this is to decide on a finite scope for the project. When you are going through this exercise, come up with an initial scope and then force yourself to reduce it.
- Make features fight hard to be accepted and implemented. Communicate to the community and developers that the default answer to feature requests is no (it’s not personal!) and don’t even consider implementation until much of the community is crying “we absolutely must have this.” Even then, you don’t have to say yes.
- Create a workflow that separates feature requests from other tickets/issues. When people submit new feature requests, encourage discussion, but don’t automatically promote the feature to the project’s todo list. Then you can promote them, as needed, to the project’s todo list in an informed and prioritized manner.
- When new feature requests appear, discuss the specific costs and liabilities associated with the feature. Build this thinking into your development DNA.
- Communicate to the community and its developers why it is important to fight against feature expansion. Focus on the benefits of waging this war: smaller, simpler code base, fewer bugs, more time to focus on important features, easier to support, etc.
- Remove features that have too great a cost or that few users actually use. Maybe even create a special exception you can raise (FeatureReductionWarning?) to help people transition away from them.
- Refactor the codebase to reduce complexity. While this doesn’t directly reduce the number of features, it can mitigate the cost of existing and future features. Extra bonus points if you can implement a new feature while dramatically reducing the complexity of the code base.
- Improve testing. Again, this is mitigation.
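The FeatureReductionWarning idea from the list above could look something like the following in Python. This is only a minimal sketch: the class and the `legacy_export` function are hypothetical names made up for illustration, and it assumes the warning is issued with `warnings.warn` rather than raised, so that existing user code keeps running during the transition.

```python
import warnings


class FeatureReductionWarning(FutureWarning):
    """Hypothetical warning category for features scheduled for removal.

    Subclassing FutureWarning means the warning is shown to end users
    under Python's default warning filters.
    """


def legacy_export(data):
    """Hypothetical feature that the project has decided to retire."""
    warnings.warn(
        "legacy_export() is scheduled for removal; "
        "see the project's list of features it will not support.",
        FeatureReductionWarning,
        stacklevel=2,  # point the warning at the caller, not this function
    )
    return list(data)
```

Users who want to find every remaining use of a retired feature could turn the warning into an error in their test suite with `warnings.simplefilter("error", FeatureReductionWarning)`.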
As you discuss and evaluate features, here are some questions you can ask yourself and the community:
- What fraction of your user base will use the feature? How often will they use it? If it won’t be used by most of your users, just say no.
- Can the feature be added as a third party plugin or library? This is especially useful if the new feature would increase the overall scope of the project, but make a great standalone project.
- How difficult will it be to test, debug, document, and maintain the feature? What fraction of your development team is capable or interested in doing this work? If the maintenance is huge and only one person is willing to do it, it is time to rethink it.
- Can you implement the functionality in a more limited, but much simpler manner? Developers absolutely love to implement features in the most general way possible. It requires developer discipline and focus to resist this temptation.
- One way that developers over-engineer things is by making every conceivable thing configurable. Can you simplify the feature by removing configurability and just choosing great defaults?
One thoughtful discussion of these issues is in the book Getting Real, by some of the folks at 37signals. They propose something quite radical for handling feature requests in commercial web applications. Here is what they say: “How do you manage them? You don’t. Just read them and then throw them away.” Their experience is that the important features will keep coming up. When this happens, you know they are important and you don’t have to write them down to keep track of them. This is definitely my experience in developing the IPython Notebook. The most important features, the ones that I am going to spend time on this year, are probably not written down anywhere, but everyone in the community is discussing them and couldn’t possibly forget them. So why on earth do we currently have 177 open new feature issues (we call them “enhancements”) on IPython’s GitHub site?
In an open source project, I don’t think it makes sense to literally throw away feature requests; sometimes the ensuing discussion is valuable and worth preserving. But what about allowing the discussion to occur, for example in the context of a GitHub issue, but then closing the issue? If someone wants to re-open the feature request at a later time to voice their support, they should be encouraged to do that. But again, once discussion stops, the issue should be re-closed. As this process repeats itself, the community is essentially voting for the features they want. This would also dramatically reduce the number of open issues, which helps developers to better manage the active work on the project.
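To make the "voting by re-opening" idea concrete, here is a rough sketch of how such votes could be tallied. The event log and issue numbers are invented for illustration, and nothing here talks to GitHub; it simply assumes you have some chronological record of when each feature request was opened, closed, and re-opened.

```python
from collections import Counter

# Hypothetical event log: (issue number, action) pairs in chronological order.
events = [
    (101, "opened"), (101, "closed"),
    (102, "opened"), (102, "closed"),
    (101, "reopened"), (101, "closed"),
    (101, "reopened"), (101, "closed"),
    (102, "reopened"), (102, "closed"),
]


def count_votes(events):
    """Count one vote each time a feature request is opened or re-opened."""
    votes = Counter()
    for issue, action in events:
        if action in ("opened", "reopened"):
            votes[issue] += 1
    return votes


votes = count_votes(events)
# List feature requests from most to least requested.
for issue, n in votes.most_common():
    print(f"issue #{issue}: {n} vote(s)")
```

Every request ends up closed, so the open issue count stays near zero, while the tally still shows which requests keep coming back.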
I don’t think this is the only way to manage feature requests intelligently in an open source project. I would love to hear other ideas. How are you managing these things in your own projects? I realize that I am far from the first person to write about these things. What other resources do you know of that address these problems?