Heuristic Source Code Analyzer

This console Java application examines C source code and extracts various attributes that may be relevant to design decisions. Mainly, it counts the global, module, and local variables; the internally and externally used types; and the local functions and their average number of parameters. It also gives a breakdown of internal and external function calls.

It is a heuristic rather than a proper analyzer because the analysis is done through a step-wise reductionist algorithm that serially examines the source code using regular expressions. Yes, regular expressions. What was I thinking? At this point it seems doubtful that I will find the time to rewrite it completely around a tree-based parser.
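To give a rough idea of what step-wise reduction with regular expressions can look like, here is a minimal sketch. It is not the analyzer's actual code: the class `ReductionSketch`, its method names, and the `@` placeholder are all made up for this illustration, and it only tries to list function definitions. Each pass removes one kind of noise so that the next, simpler pattern has less to trip over.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; none of these names come from the analyzer.
class ReductionSketch {
    // Pass 1: string/char literals (group 1) or comments, matched in one scan
    // so that "//" inside a string is not mistaken for a comment.
    private static final Pattern NOISE = Pattern.compile(
        "(\"(?:\\\\.|[^\"\\\\])*\"|'(?:\\\\.|[^'\\\\])*')|/\\*.*?\\*/|//[^\\n]*",
        Pattern.DOTALL);

    // Pass 2: an innermost brace block, i.e. one with no braces inside it.
    private static final Pattern INNER_BLOCK = Pattern.compile("\\{[^{}]*\\}");

    // Pass 3: identifier, flat parameter list, then a collapsed body marker.
    private static final Pattern FUNCTION_DEF =
        Pattern.compile("\\b([A-Za-z_]\\w*)\\s*\\([^()]*\\)\\s*@");

    // Replace literals with an empty string literal and comments with a space.
    static String stripCommentsAndLiterals(String src) {
        Matcher m = NOISE.matcher(src);
        StringBuilder out = new StringBuilder();
        int last = 0;
        while (m.find()) {
            out.append(src, last, m.start());
            out.append(m.group(1) != null ? "\"\"" : " ");
            last = m.end();
        }
        out.append(src, last, src.length());
        return out.toString();
    }

    // Collapse brace blocks from the inside out until each top-level block
    // has been reduced to a single '@' placeholder.
    static String collapseBlocks(String src) {
        String prev;
        do {
            prev = src;
            src = INNER_BLOCK.matcher(src).replaceAll("@");
        } while (!src.equals(prev));
        return src;
    }

    // Run the passes in order and collect the names of function definitions.
    // Parameter lists containing nested parentheses (e.g. function pointers)
    // are missed, which is the kind of error a heuristic like this accepts.
    static List<String> functionNames(String src) {
        String reduced = collapseBlocks(stripCommentsAndLiterals(src));
        List<String> names = new ArrayList<>();
        Matcher m = FUNCTION_DEF.matcher(reduced);
        while (m.find()) {
            names.add(m.group(1));
        }
        return names;
    }

    public static void main(String[] args) {
        String src =
            "/* add(x) { } */\n"
            + "int add(int a, int b) { if (a) { return a + b; } return b; }\n"
            + "static void log_msg(const char *s) { puts(\"{ not a block }\"); } // tail\n"
            + "int proto(int);\n";
        System.out.println(functionNames(src)); // prints [add, log_msg]
    }
}
```

The prototype `proto` is skipped because it has no body, and the braces inside the comment and the string literal are gone before the block-collapsing pass ever sees them. The order of the passes matters more than the cleverness of any single pattern.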


Above all else, the idea here was to collect useful, mostly factual data about a C program that can be used to examine its design. For example, this was used to determine the relative level of cohesion and coupling in the modules of certain C compilers.

Whether you find it useful really depends on your purposes, your concept of what exactly a good design is, and how much accuracy you need. My previous sampling from some rather large C code bases suggested that the analyzer was more than 95% accurate on average, and that the errors tend to even out or cancel when comparing different programs.


Sadly, there is no documentation at this time. However, the package includes a number of example config files that may help.


This code is currently released under these licenses:

Choose one and only one for your usage.

I am considering an additional license, most probably either BSD or LGPL, but have not yet made a decision on that.


The most recent public version is 0.8.0, released on March 2nd, 2012. Provided here is a bzip2-compressed tarball of the code.