Quality of Open Source Software: how many eyes are enough?

Michael M. Lokshin

In 2004, my colleague Zurab Sajaia and I submitted a maximum likelihood routine to the Stata SSC archive. The Stata user community quickly propelled the program into the top 10 most downloaded Stata files, and it is still in use today. Five years after the program’s release, while experimenting with similar algorithms to develop test procedures, we uncovered an error in the routine. Hundreds, if not thousands, of econometricians had used our program and looked at our code, but no one had raised any concerns.

Open Source Software (OSS) is quickly gaining popularity in the corporate world as a practical alternative to costly proprietary software. 78% of companies are now using OSS extensively and open source components are found in more than half of all proprietary software. The rationale is simple: OSS lowers development costs, decreases time to market, increases developer productivity, and accelerates innovation.

Many OSS products are developed and supported by voluntary contributions where “people’s skills, level of involvement, and time commitments can vary” and “the quality assurance of such products can become challenging” (Thomas 2016). A recent survey asked 200 leading tech companies about their concerns regarding the use of OSS. The respondents primarily worried about security issues (53%) and legal issues (38%), with only a small minority (8%) concerned about the quality of OSS code.

Why were so few respondents concerned about the quality of OSS code? A possible explanation is what’s known as Linus’s Law: “given enough eyeballs, all bugs are shallow.” This ‘mantra’ of the Open Source community suggests that users need not worry about the quality of OSS as long as enough people look at the code. The principle seems to work for products developed by large and active OSS communities, but not always for smaller-scale development efforts.

Heartbleed, discovered in 2014, is one of the best-known cases of an OSS security bug, in this case in the OpenSSL cryptography library. The bug was introduced in 2011, when almost two thirds of the world's web servers were using the library. It took two years and a security expert from Google to discover the flaw. Another example, which took more than 20 years to fix, is the error in the binary search algorithm (Peterson 1957). Generations of computer science students used the algorithm without issue until someone tried searching through an array with more than 1 billion elements. The search revealed an error that is obvious in hindsight (Figure 1). There are many such examples of bugs found in widely used OSS code. Does proprietary software have fewer bugs or security issues? Not necessarily. What makes a difference is the amount of resources (both people and time) spent on testing and auditing the software.

Figure 1: C++ implementations of the Binary Search algorithm. Right panel – original algorithm. Left panel – corrected algorithm.
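
The binary search error is usually described as an integer overflow in the midpoint computation: `(low + high) / 2` exceeds the range of a 32-bit integer once the indices are large enough. The following is a minimal sketch of that problem and the usual fix, not the code from Figure 1. The function names are illustrative assumptions, and the buggy midpoint is emulated with 64-bit arithmetic because real signed overflow is undefined behavior in C++.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

static_assert(sizeof(int) == 4, "this sketch assumes a 32-bit int");

// Emulates what (low + high) / 2 typically yields with 32-bit wraparound.
// The sum is formed in 64 bits and wrapped by hand, so the sketch itself
// never triggers undefined behavior.
int midpoint_buggy_emulated(int low, int high) {
    std::int64_t sum = static_cast<std::int64_t>(low) + high;
    if (sum > INT32_MAX) {
        sum -= (std::int64_t{1} << 32);  // wrap to 32-bit two's complement
    }
    return static_cast<int>(sum / 2);
}

// Overflow-safe midpoint: high - low always fits in an int
// when 0 <= low <= high.
int midpoint_safe(int low, int high) {
    assert(0 <= low && low <= high);
    return low + (high - low) / 2;
}

// Binary search over a sorted vector using the safe midpoint.
// Returns the index of key, or -1 if key is not present.
int search_sorted(const std::vector<int>& a, int key) {
    assert(a.size() <= static_cast<std::size_t>(INT32_MAX));
    int low = 0;
    int high = static_cast<int>(a.size()) - 1;
    while (low <= high) {
        int mid = midpoint_safe(low, high);
        if (a[mid] < key) {
            low = mid + 1;
        } else if (a[mid] > key) {
            high = mid - 1;
        } else {
            return mid;
        }
    }
    return -1;
}
```

With `low = 1500000000` and `high = 2000000000`, the emulated buggy midpoint is negative, which would be an invalid array index, while `midpoint_safe` returns 1750000000. Code that targets C++20 could also use `std::midpoint` from `<numeric>` for the same purpose.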

The Coverity Scan Open Source Report, which measures the quality of OSS code, finds that the density of code defects (the number of bugs per 1,000 lines of code) is lower for OSS than for proprietary software. Interestingly, while small OSS projects have significantly fewer issues than proprietary software projects of comparable size, the quality of large proprietary software projects is higher than that of large OSS projects. Proprietary software companies have well-defined processes and teams of experts to identify and fix bugs, including security vulnerabilities. Meanwhile, large open source projects sometimes lack standardized processes for quality control, sufficient qualified code reviewers, and good inter-team release coordination.

OSS projects seem to be better at identifying usage, misuse and primitive bugs. But the “hard” bugs, those that are not obvious from code reviews and require a deeper understanding of the complex underlying logic, are more likely to be uncovered by teams of experienced, paid auditors or by researchers working for a bug bounty (Aumasson 2018).

A large share of bugs and vulnerabilities are found and fixed through bug bounty programs offered by Microsoft, Apple, Google, and other large software companies. Recognizing the importance of such initiatives for improving the quality of OSS, the EU started the Bug Bounties on Free and Open Source Software program (with a budget of about $1M) in January 2019.

The quality of OSS is also affected by the mode of updates. In most commercial software, updates are automatically pushed to users. In contrast, OSS relies on a pull model where users are responsible for updating the OSS they use. If an organization is not aware of the OSS in use within its IT environment and thus fails to deploy necessary patches, it exposes itself to potential cyber attacks. This helps explain why the Open Source Security and Risk Analysis report found that, in 2018, 17% of codebases contained at least one well-known vulnerability, such as Heartbleed.

The open source community has made great improvements in the quality of code it produces. To a large extent, these improvements are related to the increasing share of large software companies developing OSS under established quality control processes. Figure 2 shows that the top contributors to GitHub are commercial companies (Hoffa 2017).

Figure 2: Top OSS contributors

In many areas, the quality of OSS has surpassed the quality of proprietary software products. Nevertheless, the practices of code auditing and quality monitoring of OSS components should continue to be improved. Code quality could be an especially important problem for large organizations that distribute OSS products to their clients. Such organizations need to invest in procedures and systems to manage the use of OSS and to incorporate best practices of software development into the OSS lifecycle. Laporte et al. (2012) show that the cost of software quality assurance typically exceeds 30% of the overall project cost. Funds are also required to fix the issues found and to ensure that patches are quickly and widely deployed. Without such investments, organizations could expose themselves to security, legal and reputational risks that might be costly to mitigate.

The global economy is highly dependent on information technology. In this context, the security, reliability and quality of OSS products could be seen as a global public good. International development institutions should consider playing a regulatory role by creating standards, facilitating the establishment of OSS communities, and funding activities that seek to improve the overall quality of OSS.

Comments

Submitted by Kai-Alexander Kaiser on

Great contribution!

Among many of our counterparts and colleagues, OSS is often still seen through the lens of "free software" (Open Office or QGIS instead of, say, the "premium" commercial versions from Microsoft or ESRI).

With the growth of cloud computing and infrastructure/platform/software as a service (IaaS/PaaS/SaaS) models, what constitutes OSS is rapidly changing, and so, as you note, are the issues. Traditional econometrics as practiced in Stata is also giving way to growing use of OSS in big data analytics (think text mining and AI as related to GovTech). Your point about updates and service models in the public sector, and how these link to security, is critical, especially as governments increasingly shift the way they implement enterprise architectures to the cloud.

A key issue for governments and the Bank is the practicalities of procurement. The new procurement framework hopefully opens up some avenues, but clearly your blog points to some concrete ways to better many open and secure. Governments are also increasingly seeing that traditional "closed" vendor lock-in is not necessarily the safest or ultimately the most cost-effective way to proceed.

Submitted by Michael Lokshin on

A free puppy

Kai, thanks for your feedback.

Indeed, many people associate OSS with "free software" without realizing the costs and risks associated with the OSS model. I plan two follow-up blogs on the problems of OSS sustainability and licensing. These are two other issues that, in my view, are often overlooked by practitioners who want to integrate OSS into their projects.
