What's the Biggest Software Package by Lines of Code?
For the average user of any piece of software (or hardware, for that matter), the code running behind the scenes probably rarely enters their thoughts. But if you have even a little coding experience, you likely appreciate the work that goes into making even the simplest task possible.
When it comes to really complex pieces of software like operating systems, the amount of coding involved can seem overwhelming. In the world of coding, the size of some of these programs is downright mind-boggling.
Better programs don't mean more code
When it comes to measuring the lines of code in any software package, it's not necessarily the total number that matters, but the quality of the programming involved. In fact, most programmers pride themselves on developing a software program in as few lines of code as possible. It's really a matter of quality over quantity.
For example, imagine a program that prints the phrase "Hello World" 200,000 times. Such a program's length wouldn't be a measure of its complexity.
In most cases, programmers will endeavor to reuse chunks of code to cut down on the total work needed to develop a software program. The above would be a prime example. Such a program would consist of a single function that can be called whenever needed, rather than needing a separately written set of instructions each time the computer has to display the text.
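As a minimal sketch of that idea in Python (the function name repeat_phrase is made up for illustration), the whole "Hello World" program fits in a few lines because the repetition lives in one reusable function rather than in 200,000 separate print statements:

```python
def repeat_phrase(phrase: str, times: int) -> str:
    """Return `phrase` repeated on `times` separate lines."""
    return "\n".join([phrase] * times)


# One call replaces 200,000 hand-written print statements.
output = repeat_phrase("Hello World", 200_000)

# Show how many lines of text those few lines of code produced.
print(len(output.splitlines()))
```

The source file stays tiny no matter how large the number gets, which is why line count alone says little about what a program does.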
To this end, many software programmers tend to follow the principles of DRY (Don't repeat yourself) and KISS (Keep it simple, stupid) almost religiously.
Both of these, but more so the former, dictate that the best practice when developing code is to reduce its repetition wherever possible. More specifically, "every piece of knowledge or logic must have a single, unambiguous representation within a system."
Depending on the programming language used, many large programs will make heavy use of classes, functions, and other reusable blocks of code that work like little machines performing the same task over and over again. Generally speaking, if you need some code to do something more than once, you'll want to build a reusable block for it rather than repeating yourself.
However, it's important to bear in mind that good programmers also add plenty of notes or comments to their code. These technically count toward the line total but are ignored by the computer when the program runs. Instead, they are included specifically to explain what a particular piece of code is for. They are also invaluable for bug finding and code maintenance within a team of programmers.
Programmers will also leave blank "lines" of code between the actual written code, which can, in theory, constitute a large percentage of the total line count.
Without seeing the source code for a particular program, you can never really know exactly how many of its lines are explanatory notes, blank lines, or actual code.
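To make the distinction concrete, here is a rough Python sketch of how a line counter might separate the three kinds of lines. It is deliberately simplified: it assumes Python-style comments that start with "#" and occupy a whole line, and it ignores docstrings and comments that trail code on the same line. The function name classify_lines and the sample snippet are invented for illustration.

```python
from collections import Counter


def classify_lines(source: str) -> Counter:
    """Count blank, comment, and code lines in a piece of source text."""
    counts = Counter(blank=0, comment=0, code=0)
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            counts["blank"] += 1      # nothing but whitespace
        elif stripped.startswith("#"):
            counts["comment"] += 1    # a full-line comment
        else:
            counts["code"] += 1       # everything else is treated as code
    return counts


sample = '''# Greet the user.
def greet(name):

    # Build the message.
    return "Hello " + name
'''

counts = classify_lines(sample)
print(counts["blank"], counts["comment"], counts["code"])
```

In this five-line sample, only two lines are actual code, so a raw line count would overstate the amount of working code by more than half.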
For these reasons, among others, enormous code bases for programs tend to quickly become cumbersome and difficult for a single programmer to keep track of. And some of the most commonly used software packages can be made up of millions, or even billions, of lines of code.
These enormous code bases tend to be the work of whole teams of programmers, with each team working on smaller sections of code.
Another problem when trying to work out the number of lines of code in a particular piece of software is access to the code itself. For proprietary software, the source code is usually a closely guarded secret. However, with the growth of open-source software, more and more companies are beginning to disclose the size (but not necessarily the content) of the code behind their products.
That being said, let's take a look at the largest codebases of some common software programs you are probably familiar with, some of which are really breathtaking.
The software programs with the most lines of code
Looking at some of the biggest codebases currently in use around the world is instructive here.
To put the following figures into perspective, the 1982 Space Shuttle required somewhere in the region of 400,000 lines of code to make it work. A mouse's genome, according to some estimates, comes in at around 120 million lines of code. A million lines would be about 18,000 pages of text if printed out, which is roughly 14 times the length of Tolstoy's War and Peace.
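The page comparison is simple proportional arithmetic, sketched below in Python. The only inputs are the figures quoted above (18,000 pages per million lines, 14 copies of War and Peace, 400,000 lines for the Space Shuttle); the function name pages_for is made up, and the results are only as reliable as those rough estimates.

```python
PAGES_PER_MILLION_LINES = 18_000   # figure quoted in the text
WAR_AND_PEACE_MULTIPLE = 14        # figure quoted in the text


def pages_for(lines_of_code: int) -> float:
    """Estimate printed pages for a given number of lines of code."""
    return lines_of_code * PAGES_PER_MILLION_LINES / 1_000_000


# Lines that fit on one printed page under this estimate (about 55.6).
lines_per_page = 1_000_000 / PAGES_PER_MILLION_LINES

# Page count the comparison implies for War and Peace (about 1,286).
implied_novel_pages = PAGES_PER_MILLION_LINES / WAR_AND_PEACE_MULTIPLE

# The Space Shuttle's 400,000 lines, expressed as printed pages.
shuttle_pages = pages_for(400_000)

print(round(lines_per_page, 1), round(implied_novel_pages), round(shuttle_pages))
```

The same function can be applied to any of the estimates in the list below to get a feel for their scale.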
On the other end of the spectrum, a simple iPhone game app generally contains some tens of thousands of lines of code.
It's also important to note that this list includes standalone software packages (like operating systems), scientific research software, web-based services, social media sites, and applications, although these are often not directly comparable to each other in terms of functionality.
There are some rumors that the Human Genome Project amounts to over 3 billion lines of code. If true, this would make it the largest software program in existence. However, we could not find any reliable source to back up this claim.
We'll let you decide if you want to count the following as actual software programs or not. That being said, here are some of the largest software programs in existence by lines of code. Keep in mind that this list is not exhaustive and is presented in no particular order.
1. Google is one of the world's biggest software programs
Estimated lines of code: Roughly 2 billion
How many lines of code in Google? Put simply, more than you could ever read in your lifetime.
Google is one of the largest internet service platforms around today. It provides not only its famous search engine but also many other online services like Gmail, Google Drive, Google Calendar, Google Translate, Google Maps, and many more.
If we were to take all of these services as a whole, by some estimates, the code behind them amounts to somewhere in the region of 2 billion lines. Not only that, but Google is constantly adding new services and upgrades to older programs, further bloating the amount of code as time goes by.
Of those programs, the Google Chrome browser is thought to require something like 6-7 million lines of code alone.
Interestingly, unlike other companies like Microsoft, Google doesn't store all this code in Git repositories. Git is a version control system, a tool that stores files and tracks the changes made to them over time.
Google, on the other hand, has its own version control system that is specifically designed for the needs of tens of thousands of employees.
2. High-end car software is insane
Estimated lines of code: Roughly 100 million
You might be surprised to find out that the software used to run some high-end vehicles can run to around 100 million lines. This code is usually used to run and monitor various parts of a car's engine, but is also used in features like entertainment, dashboard, and security systems.
It also includes code for enabling modern cars to have sophisticated, often cloud-connected functionality.
Generally speaking, the higher the price tag for the car and the more features it has, the more lines of code are likely powering the entire thing. High-end BMWs, Mercedes, or even Tesla vehicles have some of the most complex software behind the scenes.
To put this into perspective, the Windows XP Operating System contained 40-50 million lines of code. It's amazing to think that a modern car could have twice or more that amount of coding.
As cars become smarter, the software codebase needed to keep everything working will likely bloat even more. No wonder modern cars seem to require more maintenance or appear to have more problems than older cars.
3. Mac OS X Tiger is a very large computer program
Estimated lines of code: Roughly 85 million
Apple's Mac OS X 10.4 Tiger is another of the world's largest software programs. Consisting of well over 80 million lines of code, this operating system is one of the largest ever written.
This operating system is the fifth major release of Mac OS X (now called macOS), Apple's desktop and server operating system for Mac computers. It was first released in 2005 and included some new features not seen in previous releases, like Spotlight, Dashboard, and a so-called unified theme.
The operating system came installed by default on new machines after its release but was also available to download and install on existing machines.
However, when it comes to operating systems, it's important to note that these can be very difficult to unpack. One of the reasons for this is that the source code is almost never released to the public. Also, it is hard to tell which parts of the code are purely for the operating system, and which are used purely for native applications.
That being said, you can be pretty confident that most Mac operating systems are at least 10 million lines of code long. For comparison, that is the approximate length often quoted for the Linux kernel. macOS itself is not built on Linux but on Apple's own Unix-based Darwin core and its XNU kernel. With the graphical user interface (GUI) of macOS on top, its size is likely close to the commonly quoted lines of code.
For newer macOS versions, like Big Sur, the number of lines of code is likely to dwarf even that of Tiger, but until the source code is made public, we can only guess how large it really is.
4. The Debian 5.0 codebase is massive
Estimated lines of code: Roughly 67 million
Another truly enormous software program is the open-source Debian 5.0 operating system. Free to download and install, Debian is a GNU/Linux-based system that was developed by a community of programmers through the self-titled Debian Project.
The Debian operating system runs on almost any personal computer. Each new version release generally allows the system to work on more and more computer systems as well. However, it's important to note that some hardware manufacturers do not release their specifications, which makes it difficult for Debian to support that hardware.
The initial version of the operating system (Version 0.01) was first released in September of 1993, with the first reliable version released in 1996. Today, Debian has many distributions and is used on personal computers and servers alike.
Other popular operating systems like Ubuntu are based on it, and it's one of the oldest operating systems based on the Linux kernel in the world.
Since its first release, Debian has undergone routine development, with version 10.10, named Buster, released in June of 2021.
5. Facebook has a lot of code behind the scenes
Estimated lines of code: About 61 million
Another of the world's largest software packages by lines of code is Facebook. Estimated to require over 60 million lines to operate, this social media giant has been repeatedly refined since its release in 2004.
According to various sources, the lines of code used to build and run Facebook include backend code as well as its user interface and features. This includes code written in a variety of languages, ranging from PHP, C++, Python, Hack, Java, Erlang, XHP, to Facebook's own Thrift, and others.
All of this code is used to run its main social media site, but also its very popular messenger, gaming app, events, and e-commerce services.
As Facebook continues to expand and refine its services and acquires other social media platforms to integrate with, these lines of code are only likely to expand further over time.
6. Microsoft Office requires a lot of code to work
Estimated lines of code: Roughly 45 to 50 million
If you use Microsoft Office on a regular basis, you might be surprised how many lines of code it takes to power the whole thing. According to some estimates, older versions like Office 2013 weigh in at a hefty 45 million lines.
Most of this is written in C++, which is supposedly one of the hardest software programming languages to master.
The Microsoft Office suite for macOS is pretty similar, with some older versions requiring around 3 million lines of code to work.
This amount of code includes not only the instructions for each individual component (Excel, Word, and others) but also includes code to allow each package to communicate and work with each other and various operating systems. If estimates of the amount of code are correct, all that functionality takes a serious amount of coding behind the scenes to make it all work seamlessly.
It also explains why, from time to time, Microsoft Office can be pretty buggy. Because it is proprietary software, we can never really be sure exactly how many lines of code it has.
7. Some Windows operating systems are gigantic
Estimated lines of code: Roughly 40 million
On the subject of Microsoft software, some of their operating systems are also pretty large when counting lines of code. This should come as no surprise.
According to some estimates, Windows XP and Windows 7 come in at upwards of about 40 million lines of code each. However, like other entries on this list, this likely includes whitespace and shared code between the OS and native Microsoft applications.
According to the Microsoft community, Windows 10 comes in at about 50 million lines of code. Love it or loathe it, all of this code helps millions of people worldwide use personal computers at home or in the workplace.
8. The software that powers the F-35 fighter jet is enormous
Estimated lines of code: Roughly 8-24 million
Moving down the scale just a little bit, the software installed on the F-35 fighter is also pretty substantial.
Used for everything from keeping the plane in the air to providing the pilot with targeting information, this software is critical to making this one of the most advanced and deadly fighter aircraft in the world.
Given that this kind of software is obviously a major national security concern, its code has never been released to the public. For this reason, only estimates are available: 24 million lines of code is often given as the upper figure, while other sources claim it's a fraction of that, at around 8 million lines or so.
Until such time as it's released, we can only really guess how many lines of code this marvel of modern engineering uses.
9. Android OS is one of the largest programs
Estimated lines of code: Roughly 12-15 million
The Android mobile device operating system also happens to be one of the largest software programs by lines of code. Coming in at an estimated 12 to 15 million lines of code, it's also one of the most widely used operating systems in the world.
Based on the Linux kernel (and other open-source software), it's primarily designed for use on touchscreen mobile devices like smartphones and tablets. It was developed by a consortium of developers under the sponsorship of Google and was first unveiled in 2007, with the first commercial devices following in 2008.
To this day, Android is free and open-source software, but it is usually shipped on devices with other proprietary software pre-installed too. For this reason, like other software programs listed here, it's fairly difficult to define exactly where an Android operating system starts and ends.
Since 2011, it has been one of the best-selling operating systems around the world and has over 3 billion users today.
10. The modest code behind the Hubble Space Telescope
Estimated lines of code: Between 50,000 and 2 million
The venerable and now sadly faltering Hubble Space Telescope is one of humanity's greatest technological achievements. First launched in 1990, it has paid for itself many times thanks to the ways it has allowed us to study and learn about the fundamental nature of the universe.
Estimates for the total length of the Hubble's code differ depending on the source, but it probably falls between 50,000 and 2 million lines in total. Most of this is written in the C and Assembly programming languages.
A highly complex piece of machinery, the software behind the scenes allows the telescope to capture and send high-definition images from the deepest reaches of space and, relatively speaking, time.
The Hubble Space Telescope has certainly earned its place in history, but its future is now seriously in doubt.
11. The pacemaker has a surprising amount of code behind it
Estimated lines of code: Roughly 80,000
Moving even further down the scale, a relatively simple device like a pacemaker requires quite a lot of lines of code to work. While the basic function of a pacemaker seems pretty simple (regulating a heartbeat), it's far from a simple set of instructions to replicate synthetically.
Depending on the functionality of any particular fitted pacemaker, the number of lines of code is likely to differ widely. To put this into perspective, more sophisticated medical devices like drug infusion pumps may require 170,000 lines of code. An MRI scanner, on the other hand, needs somewhere in the realm of 7 million lines.
These are only a few of the many truly enormous software packages out there.
It's incredible to think how much time and effort has gone into physically writing, testing, and refining these software programs. It took some serious teamwork and dedication from the code's programmers to achieve.