Code Line Counter: language statistics

In this blog we have already talked about the application programmer, but what about the code we write? Are there tools that shed some light on it? That is what we are going to talk about today: a code line counter. Let’s see.

In chemistry there is qualitative analysis as well as quantitative analysis: the first tells us which elements or compounds a sample contains, and the second how much of each it contains. In automated computing, the field Pandora FMS works in, programming languages are like compounds, so we need to know which of them make up a project and in what proportion.

Programming languages

A look at the past

In the 1990s, Steve Ballmer (Microsoft®) talked about how IBM® measured projects in lines of code and encouraged its employees to boast about large, in some cases monstrous, counts (the unit of measure was the K-LOC, thousands of lines of code). We must understand that IBM® operating systems are designed to run at one hundred per cent use of each processor (yes, 100%, all the time, every day, every second), so the company was prepared, at least as far as hardware is concerned, for that model of software development.
Besides, very few programming languages were involved, mainly COBOL, so working out which languages made up a project was no great task.

Independent or installed?

Any modern operating system provides applications with common capabilities (exploring directories and files, for example), so a project needs only a few hundred lines for these tasks, which already saves time and effort. But this wasn’t always so. In the 1990s we had to deal with direct calls and communications to the hardware; operating systems recognized their limitations and therefore allowed it.
For those visiting our blog for the first time, eHorus is the tool embedded in Pandora FMS to connect remotely to computers. It works by installing a software agent, which gives direct communication for monitoring tasks, but it can also be used without Pandora FMS. Up to 5 devices are free.

On the remote machine we can install the agent permanently or use the other option, the independent (standalone) one. The standalone download naturally contains more files and data, because everything must live in a single folder, which we delete after use. The permanent installation, on the other hand, takes the version of the target operating system into account and relies on its libraries to interact with it. Because of this, the code line counter we use must be able, or be configurable, to detect whether our project bundles any collection of utilities (libraries), so that they are not counted or analysed as part of the project.

For example, if the project is written in C or C++, the header files must be counted separately, and in the particular case of Microsoft Visual C++ the precompiled header files (“.pch”), which are generated from the headers before the source code is compiled, should not be counted at all. File extensions can also be ambiguous: “.cs” is used by C# source files, but the Smalltalk language uses the same extension, so the counter has to tell the two apart.
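The idea of keeping libraries out of the count and tallying header files separately can be sketched in a few lines of Python. This is only an illustration under our own assumptions, not how any particular counter works: the directory names in `VENDORED_DIRS` and the extension sets are hypothetical choices for the example, and the function simply counts non-blank lines.

```python
import os

# Hypothetical names: adjust to whatever your project treats as third-party code.
VENDORED_DIRS = {"vendor", "node_modules", "third_party"}
# Assumed extension sets for a small C/C++ project.
SOURCE_EXTENSIONS = {".c", ".cpp"}
HEADER_EXTENSIONS = {".h", ".hpp"}


def count_project_lines(root):
    """Return (source_lines, header_lines): non-blank lines found under root,
    skipping vendored library directories."""
    source_lines = 0
    header_lines = 0
    for dirpath, dirnames, filenames in os.walk(root):
        # Prune library directories in place so os.walk never descends into them.
        dirnames[:] = [d for d in dirnames if d not in VENDORED_DIRS]
        for name in filenames:
            ext = os.path.splitext(name)[1].lower()
            if ext not in SOURCE_EXTENSIONS | HEADER_EXTENSIONS:
                continue  # not a file type this sketch knows about
            path = os.path.join(dirpath, name)
            with open(path, encoding="utf-8", errors="replace") as handle:
                non_blank = sum(1 for line in handle if line.strip())
            if ext in HEADER_EXTENSIONS:
                header_lines += non_blank
            else:
                source_lines += non_blank
    return source_lines, header_lines
```

Calling `count_project_lines("path/to/project")` returns the two totals as a tuple. A real counter also separates comments from code, which this sketch deliberately leaves out.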

Code Line Counter

We have already defined the task to be carried out, so we now move on to the choice of tool. To begin with, the programming language or languages involved most likely have their own options for such tasks, but always in a very basic way. We could also use Ohcount, which is free software written in C. Other options are:

  • SLOCCount.
  • Unified Code Count.
  • loc.
  • scc.
  • gocloc.
  • Sonar.
  • vsclc.
  • tokei.

But on this occasion we chose cloc, an “old” program first released in 2006 and updated in October 2018, licensed under the GNU GPL v2 and written in Perl… Yes, we know, it is a rough language for beginners, but once you use it you will notice its great power and, frankly, in our opinion it is better than the C language.

Another interesting detail is that cloc is also available for the Windows® platform and even has an extension for Visual Studio Team Services® (VSTS, now known as Azure DevOps®). It can analyse a single file, a directory and its subdirectories or, if necessary, a compressed file or an installer in .deb format, such as that of the minimalist web browser Min, and give us a summary like the following:

Code line counter applied to the Min web browser installer

We also applied it to the eHorus agent installer and to the Pandora FMS repository on GitHub:

Code line counter applied to eHorus agent installer

Code Line Counter applied to Pandora FMS 7.0 NG 731

What we are seeing is the result for the zip download of Pandora FMS 7.0 NG 731, hosted on GitHub. PHP leads, followed by JavaScript and Perl (the .po files are translations of the interface text; because they are written in natural language they reach high values). The code line counter can also summarize a specific commit, such as version 5.0, when the order was JavaScript first, then PHP and then Perl:

Code Line Counter applied to the “commit” of Pandora FMS version 5.0
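To rank languages automatically, as we did by eye in the summaries above, a small Python sketch can read cloc’s JSON output. It assumes the report has one object per language with a `code` field, plus a `header` entry and a `SUM` entry, which is how we understand cloc’s `--json` output; check it against your cloc version. The helper names and the figures in `SAMPLE_REPORT` are made up for illustration and are not real Pandora FMS counts.

```python
import json


def cloc_json_command(path):
    """Build the argument list for a cloc run that prints a JSON report."""
    return ["cloc", "--json", path]


def rank_languages(cloc_json_text):
    """Return (language, code_lines) pairs sorted by code lines, largest first."""
    report = json.loads(cloc_json_text)
    rows = [
        (language, stats["code"])
        for language, stats in report.items()
        # "header" holds run metadata and "SUM" the grand total; neither is a language.
        if language not in ("header", "SUM")
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


# Made-up figures, only to show the assumed shape of the report.
SAMPLE_REPORT = """
{
  "header": {"n_files": 3, "n_lines": 170},
  "Perl": {"nFiles": 1, "blank": 2, "comment": 3, "code": 15},
  "PHP": {"nFiles": 1, "blank": 10, "comment": 20, "code": 70},
  "JavaScript": {"nFiles": 1, "blank": 5, "comment": 5, "code": 40},
  "SUM": {"nFiles": 3, "blank": 17, "comment": 28, "code": 125}
}
"""

for language, code_lines in rank_languages(SAMPLE_REPORT):
    print(f"{language}: {code_lines}")
```

With cloc installed, the real report text could be obtained with `subprocess.run(cloc_json_command("project.zip"), capture_output=True, text=True, check=True).stdout` and passed to `rank_languages`.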

Now, cloc has a lot more to offer:

  • You can compare two compressed files holding different states or versions of a project and see, per programming language, how many lines of code, blank lines and comments were added and/or removed (option --diff; --diff-alignment=filename.txt additionally writes to a text file the list of files that were added, removed and compared).
  • You can export the results, as we saw, to a text file, to comma-separated values (CSV), JSON, YAML, etc., and these report files can in turn be merged and totalled again by cloc. The most important format, though, is SQL: it gives us ready-made statements to insert the values into any database, so that we can monitor them broadly through an agent and graphs.
  • With the --strip-comments option and an extension string that we supply, cloc copies every file with its comments removed. The code count must be the same as the one obtained without the option, and we may be able to reuse the comment-free copies, which carry the extension we provided.
  • If a language is not supported we can make our own definitions of it: the option --write-lang-def=misDefinitions.txt writes cloc’s language definitions to a file, which we can extend and load again with --read-lang-def.
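The idea of feeding counts into a database for monitoring can be sketched with Python’s built-in `sqlite3` module. This is a minimal illustration under our own assumptions: the `loc_history` table, its columns and the sample figures are hypothetical and do not reproduce the schema that cloc’s SQL output generates.

```python
import sqlite3


def store_counts(connection, version, counts):
    """Insert one row per language for the given project version."""
    # Hypothetical table for this sketch, not cloc's own SQL schema.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS loc_history ("
        "version TEXT, language TEXT, "
        "blank_lines INTEGER, comment_lines INTEGER, code_lines INTEGER)"
    )
    connection.executemany(
        "INSERT INTO loc_history VALUES (?, ?, ?, ?, ?)",
        [
            (version, language, s["blank"], s["comment"], s["code"])
            for language, s in counts.items()
        ],
    )
    connection.commit()


def code_delta(connection, old_version, new_version):
    """Return {language: code lines in new_version minus code lines in old_version}."""
    totals = {}
    rows = connection.execute(
        "SELECT version, language, code_lines FROM loc_history "
        "WHERE version IN (?, ?)",
        (old_version, new_version),
    )
    for version, language, code_lines in rows:
        sign = 1 if version == new_version else -1
        totals[language] = totals.get(language, 0) + sign * code_lines
    return totals


# Made-up figures for illustration only.
db = sqlite3.connect(":memory:")
store_counts(db, "old", {"PHP": {"blank": 5, "comment": 10, "code": 50}})
store_counts(db, "new", {"PHP": {"blank": 9, "comment": 30, "code": 120},
                         "Perl": {"blank": 1, "comment": 2, "code": 10}})
print(code_delta(db, "old", "new"))
```

A monitoring agent could then query `loc_history` periodically and graph how each language grows from one version to the next.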
Flexibility

Pandora FMS lets us reach even into the area of source code analysis, and it can help you with countless computing tasks, no matter how dissimilar they may seem. Get in touch!
