Automated Program Analysis for Enhanced Software Quality
What is Program Analysis?
Program analysis is the automated process of checking computer programs for measurable quality attributes. The focus is on automation that can be integrated into continuous integration and deployment (CI/CD) pipelines, ensuring checks are performed automatically on releases.
The analysis targets specifically measurable qualities, avoiding subjective assessments like user interface aesthetics. While crucial, such aspects are not easily quantifiable through automated means.
According to Wikipedia, “In computer science, program analysis is the process of automatically analyzing the behavior of computer programs regarding a property such as correctness, robustness, safety and liveness.”
Even though the process is automated, a manual aspect remains. The analysis provides results that require human interpretation, including triaging—deciding whether to act on or ignore discovered issues and determining their level of urgency.
Role in Software Quality Assurance
Program analysis is a core component of Quality Assurance. Software quality itself has several attributes, formally defined by the ISO/IEC 25010 standard. While analysis can address many of these attributes, it is most commonly used to assess security.
ISO/IEC 25010 Quality Attributes
- Functional Suitability: Functional completeness, functional correctness, and functional appropriateness.
- Performance Efficiency: Time behavior, resource utilization, and capacity.
- Compatibility: Co-existence and interoperability.
- Usability: Appropriateness recognizability, learnability, operability, user error protection, user interface aesthetics, and accessibility.
- Reliability: Maturity, availability, fault tolerance, and recoverability.
- Security: Confidentiality, integrity, non-repudiation (proof of origin), authenticity, and accountability.
- Maintainability: Modularity, reusability, analyzability, modifiability, and testability.
- Portability: Adaptability, installability, and replaceability.
Common Quality Checks
- Is the program correct with regard to a specification?
- Is the program secure? Does it contain known vulnerabilities or exposures (CVEs), or exhibit common weaknesses catalogued in the Common Weakness Enumeration (CWE)?
- Does the program have common bugs like deadlocks or null dereferences?
- Does the program follow coding best practices? (Note: This often generates false positives that require manual review.)
- Is the program scalable? Are the right data structures being used, and is it written to allow parallelization?
- Is the program energy-efficient? This is a growing trend, important for both embedded devices and large-scale computing.
Kinds of Program Analysis
There are two primary types of analysis, along with a third that combines their strengths.
Static Analysis
Static analysis involves building a model from the code without executing it. The process involves reasoning about, traversing, or querying this model and then optionally taking action, such as modifying the code and re-analyzing.
A classic example is the Java Compiler. It parses source code, checks compliance with the language grammar (e.g., correct keyword usage, brackets), and builds a model of types (hierarchy, types for method parameters, returns, fields, and variables). It then queries this model for violations, such as:
- Type incompatibilities in assignments (e.g., String f = new Object();)
- Circular dependencies in the subtype graph
- Visibility rule violations (e.g., a private method called from outside its class)
The rules are defined in the Java Language Specification, and the aggregate result is a boolean value: the code compiles or not, with errors and warnings providing provenance.
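The compiler-as-analysis idea can be tried out with the standard javax.tools API, which exposes the system Java compiler programmatically. The following is a minimal sketch, assuming it runs on a JDK (not a bare JRE); the class and method names (CompileCheck, StringSource, compileErrors) are invented for illustration. It compiles a source string and reduces the diagnostics to the boolean "compiles or not", keeping the error messages as provenance.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.tools.Diagnostic;
import javax.tools.DiagnosticCollector;
import javax.tools.JavaCompiler;
import javax.tools.JavaFileObject;
import javax.tools.SimpleJavaFileObject;
import javax.tools.ToolProvider;

final class CompileCheck {

    /** Holds source code in memory so that no input file is needed. */
    static final class StringSource extends SimpleJavaFileObject {
        private final String code;

        StringSource(String className, String code) {
            super(URI.create("string:///" + className + ".java"), Kind.SOURCE);
            this.code = code;
        }

        @Override
        public CharSequence getCharContent(boolean ignoreEncodingErrors) {
            return code;
        }
    }

    /** Compiles one class and returns its error messages; an empty list means it compiles. */
    static List<String> compileErrors(String className, String code) {
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        if (compiler == null) {
            throw new IllegalStateException("No system compiler found; a JDK is required");
        }
        Path out;
        try {
            // Send any generated class files to a throwaway directory.
            out = Files.createTempDirectory("compile-check");
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        DiagnosticCollector<JavaFileObject> diagnostics = new DiagnosticCollector<>();
        List<String> options = List.of("-d", out.toString());
        compiler.getTask(null, null, diagnostics, options, null,
                List.of(new StringSource(className, code))).call();

        List<String> errors = new ArrayList<>();
        for (Diagnostic<? extends JavaFileObject> d : diagnostics.getDiagnostics()) {
            if (d.getKind() == Diagnostic.Kind.ERROR) {
                errors.add("line " + d.getLineNumber() + ": " + d.getMessage(null));
            }
        }
        return errors;
    }

    public static void main(String[] args) {
        String bad = "class Bad { String f = new Object(); }";
        String good = "class Good { Object f = \"text\"; }";
        System.out.println("Good compiles: " + compileErrors("Good", good).isEmpty());
        System.out.println("Bad compiles:  " + compileErrors("Bad", bad).isEmpty());
        compileErrors("Bad", bad).forEach(System.out::println);
    }
}
```

The exact wording of the error messages depends on the JDK version and locale, so a build script would normally rely on the boolean result and show the messages only to humans.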
The input for static analysis is the program, which usually means the source code plus additional configuration files, metadata, and resources (e.g., web.xml for web application entry points or pom.xml for library dependencies). Analysis can also take compiled code (binary/bytecode) as input. This is often easier to analyze because references are resolved to their fully qualified names (e.g., class Foo becomes pck1.pck2.Foo), removing ambiguity. In this scenario, the compiler acts as a pre-analysis step. Some static analysis tools, like the Checker Framework, integrate directly with the compiler.
Dynamic Analysis
Dynamic analysis involves executing the program using a driver and observing its behavior. A driver is a mechanism to run the program, such as a Java main method, a unit test, or an HTTP request to a web app. Observations can include console output, logs, debug information, or test reports. A model is then built from these observations (e.g., JUnit reports) to be reasoned about or queried, followed by optional action.
JUnit is a prime example. It runs test cases on a program, with the tests acting as drivers. JUnit records the results, creating a simple model that maps each test to a result: pass, fail, error (e.g., an unexpected RuntimeException), or skip (e.g., an assumption was violated). This can be aggregated to a boolean value: tests succeed or not.
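The test-to-result map can be illustrated without JUnit itself. The sketch below is not JUnit's implementation; MiniTestRunner, Outcome, and SkipException are hypothetical names, and SkipException merely stands in for a violated assumption. Each Runnable plays the role of a driver, and the outcomes are aggregated into a single boolean.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class MiniTestRunner {

    enum Outcome { PASS, FAIL, ERROR, SKIP }

    /** Hypothetical stand-in for a violated assumption; not a JUnit class. */
    static final class SkipException extends RuntimeException { }

    /** Runs every driver and maps its name to an outcome. */
    static Map<String, Outcome> run(Map<String, Runnable> tests) {
        Map<String, Outcome> results = new LinkedHashMap<>();
        for (Map.Entry<String, Runnable> test : tests.entrySet()) {
            Outcome outcome;
            try {
                test.getValue().run();
                outcome = Outcome.PASS;
            } catch (SkipException e) {
                outcome = Outcome.SKIP;    // assumption violated
            } catch (AssertionError e) {
                outcome = Outcome.FAIL;    // expectation not met
            } catch (RuntimeException e) {
                outcome = Outcome.ERROR;   // unexpected exception
            }
            results.put(test.getKey(), outcome);
        }
        return results;
    }

    /** Aggregates the map to one boolean: no test failed or errored. */
    static boolean succeeded(Map<String, Outcome> results) {
        return results.values().stream()
                .noneMatch(o -> o == Outcome.FAIL || o == Outcome.ERROR);
    }

    public static void main(String[] args) {
        Map<String, Runnable> tests = new LinkedHashMap<>();
        tests.put("addition", () -> { if (1 + 1 != 2) throw new AssertionError(); });
        tests.put("wrongExpectation", () -> { if ("a".length() != 2) throw new AssertionError(); });
        tests.put("unexpectedException", () -> Integer.parseInt("not a number"));
        tests.put("assumptionViolated", () -> { throw new SkipException(); });

        Map<String, Outcome> results = run(tests);
        results.forEach((name, outcome) -> System.out.println(name + " -> " + outcome));
        System.out.println("Suite succeeded: " + succeeded(results));
    }
}
```

Treating skipped tests as neither success nor failure is a design choice in this sketch; a stricter build could count them separately.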
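A significant challenge in dynamic analysis is non-determinism. Rerunning the same tests on the same code may produce different results due to shared state, dependencies between tests, or race conditions. This phenomenon is known as test flakiness and is a widespread problem in industry. One common trigger is test order: unless an order is explicitly enforced, the execution order of tests is not guaranteed, so a test that silently relies on state left behind by another test can pass in one run and fail in the next.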
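Order dependence through shared state can be reproduced deterministically. The following minimal sketch uses invented names (OrderDependentTests, cache, runSuite) and two hand-written "tests" that return true on success; it only illustrates the mechanism and is not how any particular test framework schedules tests.

```java
import java.util.ArrayList;
import java.util.List;

final class OrderDependentTests {

    /** Shared mutable state that the tests never reset themselves. */
    static final List<String> cache = new ArrayList<>();

    /** Passes only if no other test has touched the cache yet. */
    static boolean testCacheStartsEmpty() {
        return cache.isEmpty();
    }

    /** Passes in any order, but leaves an entry behind. */
    static boolean testAddEntry() {
        cache.add("entry");
        return cache.contains("entry");
    }

    /** Runs both tests in the requested order and reports whether both passed. */
    static boolean runSuite(boolean emptyCheckFirst) {
        cache.clear(); // simulate a fresh process for each suite run
        boolean emptyOk;
        boolean addOk;
        if (emptyCheckFirst) {
            emptyOk = testCacheStartsEmpty();
            addOk = testAddEntry();
        } else {
            addOk = testAddEntry();
            emptyOk = testCacheStartsEmpty();
        }
        return emptyOk && addOk;
    }

    public static void main(String[] args) {
        System.out.println("empty check first: " + runSuite(true));
        System.out.println("add entry first:   " + runSuite(false));
    }
}
```

The same code yields a green suite in one order and a red suite in the other, which is why resetting shared state before each test is a common remedy.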
Hybrid Analysis
Hybrid analysis attempts to combine static and dynamic analysis to leverage their respective strengths and compensate for their weaknesses.
Key Concepts and Timings
Software Provenance
Software provenance is the metadata that records the origin, development, and delivery of software components. In program analysis, it is vital for understanding why and how a particular output was produced.
Analysis Timings
- Compile-time Analysis: Analysis performed when the program is compiled. This includes standard compiler checks (e.g., type checks) and analysis from compiler plugins.
- Build-time Analysis: A broader category that includes compilation and other scripted, automated analyses. This can be static (e.g., FindBugs, SpotBugs, PMD, NullAway configured as build plugins) or dynamic (e.g., test coverage analysis executed on regression tests).
- Runtime Analysis: Dynamic analysis performed on running code. Examples include logging, debugging, memory dumps, and monitoring via JMX MBeans.
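As a small illustration of runtime analysis, the JVM's built-in platform MXBeans can be read from inside the running program through java.lang.management. This is a sketch under the assumption that in-process access is sufficient; real monitoring setups usually connect remotely over JMX. The class name RuntimeSnapshot is invented for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;
import java.lang.management.ThreadMXBean;

final class RuntimeSnapshot {

    /** Bytes of heap currently in use, as reported by the platform memory MXBean. */
    static long heapUsedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage heap = memory.getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Number of threads currently deadlocked on object monitors. */
    static int deadlockedThreadCount() {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // Returns null when no monitor deadlock is detected.
        long[] ids = threads.findMonitorDeadlockedThreads();
        return ids == null ? 0 : ids.length;
    }

    public static void main(String[] args) {
        System.out.println("Heap used (bytes):  " + heapUsedBytes());
        System.out.println("Deadlocked threads: " + deadlockedThreadCount());
    }
}
```

The values are observations of one moment of one execution, so they describe the running system rather than the program in general.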
Models Used in Program Analysis
Analysis tools create various models of the program to reason about its properties.
Graphs
- Call Graphs: Model which methods or functions call each other. Nodes represent functions/methods, and edges represent invocations. This is useful for security analysis, such as ensuring “powerful” functions (e.g., reflection via Method.invoke() or process creation via Runtime.exec()) are not reachable from an application’s entry point.
- Data-flow Graphs: Model how data flows through the program. This can extend call graph analysis to determine if user-controllable data can be passed as an argument to a critical method.
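The reachability question on a call graph can be sketched as a breadth-first search over an adjacency map. The graph below is made up for illustration (App.main, RequestHandler.handle, ShellHelper.run, and AdminTool.reset are hypothetical methods), and a real tool would first have to extract the edges from source code or bytecode.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CallGraphReachability {

    /** Returns every method reachable from the entry point by following call edges. */
    static Set<String> reachableFrom(Map<String, List<String>> calls, String entry) {
        Set<String> visited = new HashSet<>();
        Deque<String> worklist = new ArrayDeque<>();
        visited.add(entry);
        worklist.add(entry);
        while (!worklist.isEmpty()) {
            String caller = worklist.remove();
            for (String callee : calls.getOrDefault(caller, List.of())) {
                if (visited.add(callee)) {   // true only the first time we see the callee
                    worklist.add(callee);
                }
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        // Hypothetical call graph: caller -> callees.
        Map<String, List<String>> calls = Map.of(
                "App.main", List.of("RequestHandler.handle"),
                "RequestHandler.handle", List.of("Parser.parse", "ShellHelper.run"),
                "ShellHelper.run", List.of("Runtime.exec"),
                "AdminTool.reset", List.of("Method.invoke"));

        Set<String> reachable = reachableFrom(calls, "App.main");
        System.out.println("Runtime.exec reachable:  " + reachable.contains("Runtime.exec"));
        System.out.println("Method.invoke reachable: " + reachable.contains("Method.invoke"));
    }
}
```

Here Runtime.exec is reachable through ShellHelper.run, while Method.invoke is only called from AdminTool.reset, which the entry point never reaches.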
Trees
The Abstract Syntax Tree (AST) is a natural tree model for source code, similar in concept to the Document Object Model (DOM) for websites.
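To make the tree model concrete, here is a deliberately tiny AST sketch with a depth-first traversal. The Node class and its kind labels ("BinaryOp", "Name", "Literal") are invented for illustration; real parsers use much richer node types.

```java
import java.util.ArrayList;
import java.util.List;

final class TinyAst {

    /** A node in the tree; leaves simply have no children. */
    static final class Node {
        final String kind;          // e.g. "BinaryOp", "Literal", "Name"
        final String text;          // operator, literal value, or identifier
        final List<Node> children;

        Node(String kind, String text, Node... children) {
            this.kind = kind;
            this.text = text;
            this.children = List.of(children);
        }
    }

    /** Pre-order traversal collecting the text of all nodes of the given kind. */
    static List<String> collect(Node root, String kind) {
        List<String> found = new ArrayList<>();
        visit(root, kind, found);
        return found;
    }

    private static void visit(Node node, String kind, List<String> found) {
        if (node.kind.equals(kind)) {
            found.add(node.text);
        }
        for (Node child : node.children) {
            visit(child, kind, found);
        }
    }

    public static void main(String[] args) {
        // AST for the expression: price * (1 + taxRate)
        Node ast = new Node("BinaryOp", "*",
                new Node("Name", "price"),
                new Node("BinaryOp", "+",
                        new Node("Literal", "1"),
                        new Node("Name", "taxRate")));

        System.out.println("Names used: " + collect(ast, "Name"));
        System.out.println("Operators:  " + collect(ast, "BinaryOp"));
    }
}
```

Queries such as "which identifiers does this expression read?" become simple traversals once the code is available as a tree.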
Maps
Maps are used to associate program elements (classes, tests, methods) with analysis results, such as test outcomes, compilation status, or code coverage data. Maven Surefire reports are an example of a map-based model.
Databases (Tables, Relations)
In this approach, analysis results are written into a relational database (or a similar format such as CSV), which can then be queried with SQL to find issues. This involves extracting extensive details from the program (variable assignments, method parameters, control structures) and storing them so that complex queries can be run against them. A variant of this approach uses Datalog, a declarative logic programming language in which the analysis is defined as a set of rules. Doop expresses its analyses in Datalog, and CodeQL's query language, QL, is based on it.
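The rule-based style can be imitated in plain Java to show the idea. The sketch below evaluates two Datalog-like rules over a calls relation with a naive fixpoint loop. It is an illustration only, not how Doop or CodeQL evaluate their rules, and the names ReachFacts, calls, and reach are assumptions made for this example.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

final class ReachFacts {

    /**
     * Naive fixpoint evaluation of the two rules
     *   reach(X, Y) :- calls(X, Y).
     *   reach(X, Z) :- reach(X, Y), calls(Y, Z).
     * Each fact is a two-element list [from, to].
     */
    static Set<List<String>> reach(Set<List<String>> calls) {
        Set<List<String>> reach = new HashSet<>(calls);    // first rule
        boolean changed = true;
        while (changed) {                                   // repeat until no new facts appear
            Set<List<String>> derived = new HashSet<>();
            for (List<String> r : reach) {
                for (List<String> c : calls) {
                    if (r.get(1).equals(c.get(0))) {        // join on the shared variable Y
                        derived.add(List.of(r.get(0), c.get(1)));
                    }
                }
            }
            changed = reach.addAll(derived);                // true if anything new was added
        }
        return reach;
    }

    public static void main(String[] args) {
        // Hypothetical extracted facts: calls(caller, callee).
        Set<List<String>> calls = Set.of(
                List.of("main", "handle"),
                List.of("handle", "run"),
                List.of("run", "exec"));

        Set<List<String>> reach = reach(calls);
        System.out.println("main reaches exec: " + reach.contains(List.of("main", "exec")));
        System.out.println("derived facts:     " + reach.size());
    }
}
```

The appeal of the declarative style is that the two rules in the comment are the whole analysis; the loop is generic machinery that a Datalog engine provides.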
