If you want to start a flame war, mention lines of code per day or hour in a public developer forum. At least that is what I found when I started investigating how many lines of code are written per day per programmer. Lines of code, or loc for short, are supposedly a terrible metric for measuring programmer productivity, and empirically I agree with this. There are too many variables involved, starting with the definition of a line of code and going all the way up to the complexity of the requirements. There are single lines that take a long time to get right and there are many lines that are mindless boilerplate code. All the same, this measurement does have information encoded in it; the hard part is extracting that information and drawing the correct conclusions. Unfortunately I don’t have access to enough data about software projects to provide a statistically sound analysis, but I got a very interesting result from measuring two very different projects that I would like to share.
The first project is a traditional client-server data mining tool for a vertical market, mostly built in VB.NET and WinForms. This project started in 2003 and has been through several releases and an upgrade from .NET 1.1 to .NET 2.0. It has server components but most of the half a million lines of code live on the client side. The team has always had around four developers, although not always the same people. The average for this project came in at around ninety lines of code per day per developer. I wasn’t able to measure the SQL in the stored procedures, so this number is slightly understated.
The second project is much smaller adding up to ten thousand lines of C# plus seven thousand lines of XAML created by a team of four that also worked on the first project. This project lasted three months and it is a WPF point of sale application thus very different in scope from the first project. It was built around a number of web services in SOA fashion and does not have a database per se. Its average came up around seventy lines of code per developer per day.
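As a rough sanity check on these averages, here is a small sketch of the arithmetic. The figure of 21 working days per month is my own assumption, not something measured on either project, and the function names are only for illustration.

```python
# Back-of-the-envelope check of the per-developer averages quoted above.
# ASSUMPTION (not from the post): about 21 working days per month.
WORKING_DAYS_PER_MONTH = 21


def loc_per_dev_day(total_loc, developers, working_days):
    """Average lines of code per developer per working day."""
    return total_loc / (developers * working_days)


def implied_working_days(total_loc, developers, rate):
    """Working days needed to reach total_loc at a given per-developer rate."""
    return total_loc / (developers * rate)


# Second project: 10,000 lines of C# plus 7,000 lines of XAML,
# four developers, three months.
second = loc_per_dev_day(10_000 + 7_000, 4, 3 * WORKING_DAYS_PER_MONTH)
print(round(second))  # 67

# First project: roughly 500,000 lines, four developers,
# about ninety lines per developer per day.
first_days = implied_working_days(500_000, 4, 90)
print(round(first_days))  # 1389
```

Under that assumption the second project lands close to the seventy lines quoted, and the first project's rate implies something like five and a half years of four-person effort.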
I am very surprised by the closeness of these numbers, especially given the difference in size and scope of the products. The commonalities between them are the .NET framework and the team, and one of them may be the key. Of these two, I am leaning toward the .NET framework being the unifier because, although the developers worked on both projects, three of the members of the second project’s team spent less than a year on the first project and did not belong to the core team that wrote the vast majority of that first product. Or maybe there is something more general at work here?
Helen Neely said:
This is a very nice write up. At least it shows on average how many lines of code some developers write per day. Again, I think it is quite relative in terms of the language and project complexity.
While you could write on average 100 lines in C# or Java, you could end up writing over 500 in PHP.
phat shantz said:
Your comments are very interesting and reminiscent of days gone by when Ph.D. candidates sought the holy grail of software development management so as to make the software development team into another assembly line of efficient mass production. Ironically, we call many people “doctor” even after their work has utterly failed.
Project managers — who lack the history and knowledge of the decades of systems engineering failure — still seek the mythical crystal ball to forecast productivity. I guess Lines Of Code Per Developer Per Day is as good as probing the entrails of chickens, although not quite as sanitary.
There are far too many human and project-related variables for any technique to be useful even across days or weeks — within the same team. Worse yet, summed alongside analysis, design, and testing, I wonder what the lines/day/person are (irrespective of job title).
I find it interesting that, after decades of systems engineering predictions regarding the vast productivity improvements of “next-generation” software, the community stands at about the same productivity level as they did two decades ago.
Either the systems engineers are wrong (gasp) or there is something sinister in the work slowdown of the modern developer. I’m voting for wrong software engineers.
You also noted an interesting variance in line-based productivity metrics between projects. There exists an equally discriminating variance between platforms and project complexity and team size. Variable upon variable; difference upon difference; apples upon oranges; heaven knows how to compare one to the other. (But heaven is silent.)
Poor project managers. I weep for them. They seek to control the uncontrollable and predict the future based on insufficient devices. Worse, the historic battle cry of the systems engineer has long been “Manage Without Knowledge, Lead Without Understanding.” They have long sought the form and technique to manage and lead programming projects without either knowledge of the language or experience in the problem domain. (As Dr. Phil would say, “how’s that workin’ out for ya?”)
Until that glorious systems engineering breakthrough (don’t hold your breath), lines of code/day/developer will remain one of their only countable metrics and, therefore, a sacrament at their altar.
I hope developers don’t fall prey to this artifice and remain committed to their art and experience, and produce the best code necessary at the best pace possible — without letting the metronome in some manager’s head impinge on our own standards and excellence.
jdbennetames Bennet said:
I average 100 to 400 lines per day in C++
Spockmonster said:
I’ve been programming since the TRS-80 Model I captured my heart in 1979. That’s a lifetime of coding, and not one line of COBOL I might add!!!!! Yay
Anyway, if I were a code monkey typing the same for loop over-and-over then I could code 1,000 lines per day. But someone doing that should be paid about $10 an hour.
With an industry that evolves and reinvents itself as much as Software Engineering does, a majority of my time is spent researching, figuring out how to accomplish something, prototyping it, etc., and then writing it in the finished project. Another significant amount of my time goes to Requirements Gathering, Design, Test-Design, testing, and reporting status.
I probably do 100 lines of code per day when I am forging ahead on new functionality. It usually *seems* like 1,000. Today, for instance, I was figuring out how to use the SourceForge library iTextSharp to create PDFs. Tweaking PDF tables, figuring out fonts, learning how to suppress borders, how to build tables, how to span rows and columns, how to get a logo image to scale correctly, etc., while laying out the finished document, came to 10 hours and about 100 lines of code.
On the other hand, if my day’s work consisted of writing stovepipe ADO.Net code for 20 tables, I could probably do 500 or 1,000, but I would be extremely agitated by the repetitiveness of that assignment.
And if I use Entity Framework on the 20 tables, then I could do 2,000 lines of code in 5 minutes. Hey, I only said that because I’ve read so many other ego-centric claims of over 1,000 LOC per day and *easily* over 100,000 per year, only to see them admitting it is generated. Once you count generated code, it is a meaningless statistic in the individual regard. Although in a national sense it might be something to think about: with today’s generators, our nation’s programmers are 1,000 times as productive as our COBOL ancestors.
Anyway, over the years, the consensus I usually see is that for a corporate developer, the average throughput is something like 50 lines per day, because of all the overhead of the SDLC – Requirements, Design, Documentation, Testing, Status Reporting, Source Control, H.R. stuff, etc.
Live Long and Prosper!
Mr.Monster said:
SpockMonster, that is probably the best write up I’ve read on Lines of Code.
n5ac said:
I’m actively working on a project that is blazing new ground. We have two developers working on the project (in C on an embedded platform). I tend to be something like 2/3 time rather than full time and we are averaging as a team 115 lines per person per day. We have been working some crazy hours too. It really does seem low, but the deal is that we stop for hours at a time and discuss how we’re going to code something. Then we may consult with others on how we have designed something (on paper) and finally we code it. Everything is not like this, of course. The last project I worked on I really had total say on it and I just coded what I knew it needed to do and I suspect my rates were higher, but I didn’t measure that project.
Also, I’m using a real LOC counter (cloc) so these numbers don’t include blank lines or comments. Currently after about 60 days, we have 4300 blank lines, 6800 comment lines and 18,500 LOC.
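To illustrate the blank / comment / code split that a counter like cloc reports, here is a deliberately simplified sketch for C-style source. It is not cloc's algorithm: it ignores comment markers inside string literals, counts a line with both code and a comment as code, and counts blank lines inside a block comment as comment lines. The sample input is made up.

```python
def count_lines(source):
    """Classify each line of C-style source as blank, comment, or code.

    Simplified sketch: comment markers inside string literals are not
    recognised, and a line mixing code and a comment counts as code.
    """
    counts = {"blank": 0, "comment": 0, "code": 0}
    in_block = False  # True while inside an unterminated /* ... */ comment
    for raw in source.splitlines():
        line = raw.strip()
        if in_block:
            counts["comment"] += 1
            if "*/" in line:
                in_block = False
            continue
        if not line:
            counts["blank"] += 1
        elif line.startswith("//"):
            counts["comment"] += 1
        elif line.startswith("/*"):
            counts["comment"] += 1
            if "*/" not in line[2:]:
                in_block = True
        else:
            counts["code"] += 1
            # A block comment may open after some code and continue below.
            start = line.find("/*")
            if start != -1 and "*/" not in line[start + 2:]:
                in_block = True
    return counts


# Hypothetical sample input, only for demonstration.
SAMPLE = """\
/* Toggle the status LED.
   Hypothetical example input. */

#include <stdio.h>

// entry point
int main(void) {
    puts("on");  /* trailing comments still count as code */
    return 0;
}
"""

print(count_lines(SAMPLE))  # {'blank': 2, 'comment': 3, 'code': 5}
```

Even this toy version shows why the definition matters: the same ten-line file is "10 lines" or "5 LOC" depending on what is counted.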
Wayne Beavers said:
When measuring loc, make sure your time includes unit test, integration test, system test, and writing all documentation. As I recall, this was all considered when “The Mythical Man-Month” was written. I have not read it since 1973, so my recollection might be fuzzy.
Most of what I code does not require much documentation, other than the on-screen help, which gets counted as lines. I use 50 loc for estimating. I have found that to be pretty accurate.
A colleague wrote 150kloc in 10 years. That works out to 40 loc per day.
From my observation it is the system test that kills the numbers. For every input field verify that the field validation is correct. Enter alpha into numeric fields. For numeric range use minimum, min minus 1, max, max plus 1. You can spend days, if not weeks, performing system test and during that time, for all of the defects that you find, many of them will be only a few inserted lines and many modified lines.
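The field checks described above (alpha into a numeric field, then minimum, min minus 1, max, max plus 1) can be sketched as follows. The validator and its 1 to 99 range are hypothetical, invented only to have something to test against.

```python
def boundary_inputs(minimum, maximum):
    """Inputs for a numeric field: just outside and on each boundary, plus alpha."""
    return [
        str(minimum - 1),  # min minus 1, should be rejected
        str(minimum),      # minimum, should be accepted
        str(maximum),      # maximum, should be accepted
        str(maximum + 1),  # max plus 1, should be rejected
        "abc",             # alpha into a numeric field, should be rejected
    ]


def validate_quantity(text, minimum=1, maximum=99):
    """Hypothetical field validator: accepts whole numbers within the range."""
    try:
        value = int(text)
    except ValueError:
        return False
    return minimum <= value <= maximum


for candidate in boundary_inputs(1, 99):
    print(candidate, validate_quantity(candidate))
```

Five inputs for a single field, multiplied by every field on every screen, goes some way toward explaining why system test eats so many days while adding so few lines.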
The largest software program document I ever wrote by myself was a product installation guide. It was around 100 pages. I no longer recall how long it took, but to write that and then install the product using the document takes a while.
msuworld said:
My point is to write 100 lines of new code every day and solve a problem with it. What do you think? I guess that will increase productivity a lot.
Max Lincoln said:
Some of my most productive days are when I *DELETE* 100 or more lines of code.
This usually has a lasting impact. Removing dead code, unnecessarily verbose code, or duplicate code makes the project easier to understand, enhance, and maintain. Similarly, writing an excessive amount of code every day can create an unwieldy, buggy beast of a program. (Though it will make another poor metric, defects/kloc, look great!)
Max Lincoln said:
I’m not the only one that likes removing code:
“One of my most productive days was throwing away 1,000 lines of code.” – Ken Thompson
“Perfection is achieved, not when there is nothing more to add, but when there is nothing left to take away.” – Antoine de Saint-Exupery
“Measuring programming progress by lines of code is like measuring aircraft building progress by weight.” – Bill Gates
“Ask a programmer to review 10 lines of code, he’ll find 10 issues. Ask him to do 500 lines and he’ll say it looks good.” – @girayozil
“Now this doesn’t mean that LOC is a completely useless measure, it’s pretty good at suggesting the size of a system. I can be pretty confident that a 100 KLOC system is bigger than a 10KLOC system. But if I’ve written the 100KLOC system in a year, and Joe writes the same system in 10KLOC during the same time, that doesn’t make me more productive. Indeed I would conclude that our productivities are about the same but my system is much more poorly designed.” – Martin Fowler
codefornothing said:
There is a reason why there’s a “mythical” in the title. I was surprised by the coincidence in those projects but LOC is not much more than a reference of the size of the system, it does not say anything about the quality or even the number of features involved.
Quite a nice group of quotes there, thanks!
Bill Abbott said:
No surprise Bill Gates knows as much about airplane engineering as he knows about software engineering:
“Measuring programming progress by lines of code is like measuring aircraft building progress by weight.” – Bill Gates
In the 50s and 60s, when making aluminum airplanes on assembly lines was approaching a mature form of manufacture, Douglas used gross weight of the plane as their first estimate of cost of manufacture ($x / pound of airplane) and for how much they should charge their customers.* It wouldn’t surprise me at all if airplane manufacturing tracked completion by weight, or by labor hours, or by value of finished goods produced.
Bill
*Ed Heinemann, Combat Aircraft Designer, by Edward H. Heinemann, Rosario Rausa
Max Lincoln (@devopsy) said:
I guess that makes Space Shuttle Enterprise, the too-heavy-for-orbit prototype, the most complete. Or that Space Shuttle Endeavour and others were “incomplete” at the end of their careers since NASA reduced their weight.
neuroxik said:
Often I merge my code lines as I go along. For instance, if I start a procedure which in the end just checks a factor or two, I’ll group them, such as:
$tsClass = (empty($i) || ($i>0 && is_int($i / $wholeBeatDivisor))) ? 'on_beat' : NULL;
versus the more LOC-friendly:
if(empty($i) || ($i>0 && is_int($i / $wholeBeatDivisor))) {
    $tsClass = 'on_beat';
}
else {
    $tsClass = NULL;
}
That’s 6 lines grouped into 1. There are so many examples where less code (and not only the deletion of dead/unused code) means fewer LOC, sometimes on the fly, sometimes by grouping afterwards, which doesn’t favor LOC but does favor more comprehensible code: easier on the eyes, easier to scroll through, and easier to maintain.
Also, when writing code, as soon as I find a procedure being used more than once or twice, I write a method/function for it (as most quality-coders do) and I just have to call that function in all other instances. It also makes sense, because it’s easier to maintain ONE function than to alter all the procedures going through the same scenarios across files, but that too diminishes the lines of code.
codefornothing said:
LOC is a very coarse measure indeed but on a large enough codebase these cases even out as far as quantity is concerned. LOC does not say much about code quality as you describe.
Max Lincoln (@devopsy) said:
I agree LOC is a very coarse measure. Here are some recent, relevant articles:
http://www.slate.com/blogs/future_tense/2013/10/21/healthcare_gov_problems_why_5_million_lines_of_code_is_the_wrong_way_to.html
or if you like charts:
http://www.alexmarchant.com/blog/2013/10/22/healthcare-dot-gov-lines-of-code-comparison.html
Although it’s an apples-to-oranges comparison, I think I can draw some valid conclusions knowing something is 10x larger than Windows XP. At that order-of-magnitude difference from a typical project I don’t even think programming language is a factor.
The first article has some more good LOC quotes.