How to select open source libraries

We use a lot of open source libraries and components in our daily business. Open source libraries give us a big advantage in time to market for our products. Whenever we face a problem in our software (anything from business domain questions to implementation difficulties), we first look to the open source world to see whether someone has already solved that problem, or at least part of it. SourceForge, CodePlex and Google Code (to name a few) are often the first sites we visit to look for code samples, libraries and frameworks. But how can we find the needle in the haystack?

When it comes to implementing a software architecture or user interface, every software development team reaches the point where it has to decide whether or not to reinvent the wheel. Before getting their hands dirty implementing their own solution, I encourage every team to first look around for open source libraries that could address some or, at best, all of the problems the team is currently facing. For me, the process of determining which open source library could dramatically increase the team’s velocity can be summarized as follows:

  1. What kind of open source license is feasible?
    The license is quite important, depending on the software and the customer you are building it for. For example, in a regulated environment often only the Apache License 2.0 is feasible.
  2. What kind of problem needs to be addressed?
    For example distribution, caching, persistence, loose coupling, Inversion of Control, MVVM, MVP, broker etc.
  3. Find meta tags or descriptive words for the problem you want to solve
    For example, to balance the workload optimally across different machines, the team would find words/tags like Genetic Algorithm, Job Shop Scheduling, Artificial Intelligence, Priority-based Scheduling, Scheduling Algorithm, Planning Problems, Plan Graphs, Shortest Path etc. Filter out projects which do not comply with the required license.
  4. Visit CodePlex, Google Code or SourceForge (lately github.com is also a great and valuable source) and create a list of the projects you find
    For example, for IoC/DI the list could contain Ninject, StructureMap, Autofac, Windsor etc.
  5. Visit the project homepages and look for recent activity; remove projects with low activity
    Look at news, releases, commits, site updates etc. It’s generally hard to say what a decent level of activity is. I personally remove every project from the list whose last activity is older than 6 months.
  6. Go to the source repository browser and look for the test assemblies
    Remove all projects from the list which have no or almost no unit tests. You may think this is a bit harsh, but how is the author of the library or framework going to assure you of its quality without unit tests? Is the quality still there after the author refactors the code? I don’t think so.
  7. Visit the documentation section of the project page
    Remove projects from the list which do not have documentation, or at least code samples or introductory tutorials. It’s always a struggle when you first use a new library or framework, so it’s good to have thorough documentation.
  8. Check out the source code from the version control system and look for extension points
    Good libraries or frameworks do not force you to use a certain logging framework or DI container; extension points should let you plug in your own custom behavior, logging or container. If the library or framework has no extension points but your infrastructure is already compatible with it, you can ignore this point; otherwise remove the project from the list.
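
As a rough illustration, the elimination steps above could be written down as a small Python filter. This is only a sketch under assumptions: the Candidate fields, the shortlist function, the project names, licenses and dates are all made up for the example, and the facts themselves would still be collected by hand while browsing each project.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Candidate:
    """Facts noted by hand while reviewing one project (hypothetical fields)."""
    name: str
    license: str
    last_activity: date
    has_unit_tests: bool
    has_documentation: bool
    has_extension_points: bool


def shortlist(candidates, allowed_licenses, today,
              max_inactive_days=183, need_extension_points=True):
    """Return the names of the candidates that survive every filter."""
    # Roughly six months; the threshold is a personal rule of thumb.
    cutoff = today - timedelta(days=max_inactive_days)
    survivors = []
    for c in candidates:
        if c.license not in allowed_licenses:
            continue  # license is not feasible
        if c.last_activity < cutoff:
            continue  # no recent activity
        if not c.has_unit_tests:
            continue  # no (or almost no) unit tests
        if not c.has_documentation:
            continue  # no documentation, samples or tutorials
        if need_extension_points and not c.has_extension_points:
            continue  # cannot plug in our own logging or container
        survivors.append(c.name)
    return survivors


# Invented example data, not real projects.
today = date(2024, 6, 1)
candidates = [
    Candidate("lib-a", "Apache-2.0", date(2024, 5, 1), True, True, True),
    Candidate("lib-b", "GPL-3.0", date(2024, 5, 20), True, True, True),
    Candidate("lib-c", "Apache-2.0", date(2023, 1, 15), True, True, True),
    Candidate("lib-d", "Apache-2.0", date(2024, 4, 10), False, True, True),
]
print(shortlist(candidates, {"Apache-2.0"}, today))  # prints ['lib-a']
```

Setting need_extension_points to False corresponds to the case where your infrastructure is already compatible with the library, so missing extension points do not matter.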

Some additional points which are also important to me but which can be left out depending on the team’s needs:

  • Does the project have a build script to build it from the sources?
    Building your own custom releases can be a real pain if you don’t have build scripts.
  • Does the open source team use a continuous integration environment?
    On every commit we get feedback on whether the build is still working. We can download artifacts immediately after a successful build.
  • Does the project have a road map?
    Road maps help us plan for when a new release is coming and when we need to upgrade the version of the project used in our software.
  • Does the open source team have an issue tracker?
    How else would we know what kind of issues are present and when the issues are closed?
  • How fast are community patches applied?
    Important patches should be applied quickly, especially fixes for bugs such as race conditions.
  • Response time in community (upon questions etc.)?
    When you are in trouble you want reasonably fast response times, so it’s important that the community is lively.
  • Are other developers happy with the project and is the knowledge spread in the community?
    Google for blog posts and tutorials. What is the “tenor” about the project? Are there success stories?
  • How many contributors are actively working on the project?
    If only one author is maintaining the project, what happens if he loses interest, or if the author’s wife (assuming the author is a man) is giving birth?
  • Versioning concept?
    Clear versioning helps me to identify the releases etc.
  • Is the developer team integrating everything the community requests, or are requests carefully reviewed and integrated only when it makes sense?
    Widely used projects receive a lot of feedback from the community. But not every community request should be integrated into the project. Changes should be well thought out and only integrated as needed.

I am sure there are many more questions like these you could ask yourself. Please give me feedback and I will add them to the list!