Catching Common Problems
Finding bugs and fixing them is more than a passion of mine; it’s a compulsion. Several years ago, as a QA developer, I created MUnit, a framework for authoring and running unit tests in the Wolfram Language. Since then, I’ve created more tools to help developers write better Wolfram Language code while seamlessly checking for bugs in the process.
Writing good tests requires a lot of knowledge and a great deal of time. Because we need to find and resolve bugs as quickly as possible in order to release new features on schedule, we also turn to static analysis.
What Is Static Analysis?
Static analysis is the process of examining source code before running it in order to try to predict its behavior and find problems. As a testing method, it’s incredibly useful. Finding problems while the code is running isn’t always viable. It can also be very expensive to run the code—all the more so if the code fails.
Considering the sheer volume of code that makes up the Wolfram Language (there are 1.2 million lines of kernel startup Wolfram Language code across 1,900 files and an additional 850,000 lines of paclet Wolfram Language code across 3,700 files), it’s imperative to have a strategy to test all of this code for bugs. Wolfram has tests dedicated to every square inch of the Wolfram Language—some of which I wrote!
The CodeInspector paclet is one of those vital static analysis tools that allow developers to do better work. Included in the recent release of Mathematica 12.2, CodeInspector scans Wolfram Language code and reports problems without requiring the user to manually run the paclet. CodeInspector, along with CodeParser and CodeFormatter, forms the CodeTools suite, which is used by both internal and external users to improve the quality of their Wolfram Language code.
In general, static analysis cannot find all possible bugs in a program. (That is a consequence of the undecidability of the halting problem by way of Rice’s theorem.) But static analysis can still provide plenty of important information!
For example, it is easy to see that && True is not needed in the test here:
This may be leftover debug code, or simply a mistake in logic. A static analysis tool may warn that the && True is not needed and could be removed or changed to something else. While static analysis tools cannot discern the intention of the author, they can find classes of “likely problems” that merit investigation.
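The same kind of check is easy to sketch in other languages. As a rough analogue (this is not how CodeInspector is implemented), the following Python snippet uses the standard `ast` module to flag a literal `True` operand inside an `and` test. The function name `find_redundant_true` is made up for this illustration.

```python
import ast

def find_redundant_true(source):
    """Return (line, column) pairs for literal True operands of `and`."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        # A chain such as `a and True` parses to one BoolOp with an And operator.
        if isinstance(node, ast.BoolOp) and isinstance(node.op, ast.And):
            for operand in node.values:
                if isinstance(operand, ast.Constant) and operand.value is True:
                    hits.append((operand.lineno, operand.col_offset))
    return hits

# Python counterpart of a test that contains a redundant `&& True`.
print(find_redundant_true("if a and True:\n    pass\n"))
```

Like any lightweight check, this only reports where the pattern occurs. Whether it is leftover debug code or a deliberate choice is still for the author to decide.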
Creating a static analysis tool to test for bugs in the Wolfram Language comes with a very specific set of challenges. The Wolfram Language is incredibly dynamic and flexible as a coding language. While this would usually be considered a bonus for developers, it does make abstract modeling very difficult. Functions can be redefined at runtime, and it’s complicated to define precisely the concept of a value in the Wolfram Language.
Given the limitations inherent in the language, CodeInspector does lightweight static analysis based on pattern matching of syntax trees. This is similar to the “linting tools” that exist for other languages. (In fact, the original name of the CodeInspector paclet was Lint! It quickly became apparent that it would be doing more than just linting, so it was renamed to CodeInspector.)
CodeInspector currently has around two hundred built-in rules that are applied to code under inspection. The rules range from common syntactical problems (such as missing commas) to more obscure ones (such as using Q functions in symbolic solvers). Many rules include suggestions for fixing the code.
Using CodeInspector
CodeInspector is included in Mathematica 12.2. If you have an older version of Mathematica, you can get CodeInspector by evaluating the following:
In order to programmatically get a list of all problems in the following code snippet:
…you can run this test:
To get a visual summary of all the problems found in the test, use CodeInspectSummarize (included in the CodeInspector paclet):
You can even use CodeInspectSummarize on the command line:
There are various ways to control the output of CodeInspectSummarize. In order to do so, we need to categorize problems, which is an interesting problem in and of itself! This is because we need to strike the right balance between exposing many properties of problems in a queryable way versus having a system that is easy for humans to consume and understand.
I use two dimensions, at least for now: severity and ConfidenceLevel. If the output shows that there are problems, severity denotes how severe each problem is. Will the problem ever impact users? Will it accidentally launch nuclear warheads? Knowledge is power, especially when you need to understand the impact of the problems at hand.
ConfidenceLevel denotes the level of confidence that the problem is actually a problem and not a false positive. ConfidenceLevel is a Real value between 0.0 and 1.0. ConfidenceLevel → 0.0 means no confidence at all in the problem being reported, while ConfidenceLevel → 1.0 means that there is definitely an issue at hand, such as a mismatched bracket in a function. A ConfidenceLevel of 0.5 would mean that roughly half the time this problem appears, it is a false positive. More experimental rules in CodeInspector have a lower ConfidenceLevel, and as I add heuristics to remove false positives, I increase the ConfidenceLevel for those problems. Re-appropriating the ConfidenceLevel symbol for my purposes may be an abuse of notation, but it is convenient.
Because the Wolfram Language is so dynamic, it’s difficult to tell when an alleged bug is actually a bug. Even in the previous examples, it is possible that the If statement was written deliberately. Only syntax errors such as:
… can be flagged with 100% certainty. Note that even “obvious” problems such as:
… don’t necessarily have ConfidenceLevel → 1.0. Thus, every problem reported by CodeInspector has an associated ConfidenceLevel that indicates the confidence that the problem is actually a problem.
CodeInspectSummarize, by default, reports issues with 95% confidence or higher.
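To make the thresholding concrete, here is a minimal Python sketch of the idea, not of CodeInspector’s internals. The `Problem` class, the `report` function, the rule names, the severity labels and the confidence values are all invented for this illustration. Only the 0.95 default comes from the text above.

```python
from dataclasses import dataclass

@dataclass
class Problem:
    tag: str           # hypothetical rule name
    severity: str      # hypothetical severity label
    confidence: float  # between 0.0 and 1.0

def report(problems, confidence_level=0.95):
    """Keep only problems at or above the confidence threshold."""
    return [p for p in problems if p.confidence >= confidence_level]

# Illustrative values only; real rules carry their own confidence.
problems = [
    Problem("MismatchedBracket", "fatal", 1.0),
    Problem("RedundantTrue", "warning", 0.95),
    Problem("SlotWithoutFunction", "error", 0.8),
]

print([p.tag for p in report(problems)])
print([p.tag for p in report(problems, confidence_level=0.8)])
```

Lowering the threshold surfaces more candidate problems at the cost of more false positives, which is the trade-off the two dimensions are meant to expose.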
There are also four different severities associated with problems:
These severities should be interpreted at the same time as ConfidenceLevel. Severities are only meaningful if the problem is not a false positive.
How CodeInspector Works
The Wolfram Language has a powerful built-in pattern matcher, and it can be used to do static analysis on expressions.
I designed CodeInspector’s rule engine to include knowledge of the relative position of the code under inspection, so we can move up the syntax tree to parent nodes and ask other questions. This is useful when writing a rule to make sure that some syntax occurs lexically within some other container syntax.
For example:
This illustrates a common mistake: forgetting the &.
Starting with the location of the #, we go up the tree, looking for a matching &:
No & is ever found, so a problem is reported. Notice that this rule has a lower confidence and I need to specify ConfidenceLevel → 0.8 to see it.
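The upward walk can be sketched with a toy tree that keeps parent links. This is a simplified Python illustration under assumed names (`Node`, `has_enclosing`, `slots_without_function`), not the actual rule engine, and the two sample trees are made-up stand-ins rather than the example from this post.

```python
class Node:
    """A toy syntax-tree node that remembers its parent."""
    def __init__(self, kind, *children):
        self.kind = kind
        self.parent = None
        self.children = list(children)
        for child in self.children:
            child.parent = self

def has_enclosing(node, kind):
    """Walk parent links upward; True if some ancestor has the given kind."""
    current = node.parent
    while current is not None:
        if current.kind == kind:
            return True
        current = current.parent
    return False

def slots_without_function(root):
    """Collect Slot nodes that are not lexically inside a Function node."""
    found, stack = [], [root]
    while stack:
        node = stack.pop()
        if node.kind == "Slot" and not has_enclosing(node, "Function"):
            found.append(node)
        stack.extend(node.children)
    return found

# Roughly Map[# + 1, list]: the Slot has no Function ancestor.
bad = Node("Map", Node("Plus", Node("Slot"), Node("Integer")), Node("Symbol"))
# Roughly Map[# + 1 &, list]: the Slot sits inside a Function.
good = Node("Map", Node("Function", Node("Plus", Node("Slot"), Node("Integer"))), Node("Symbol"))

print(len(slots_without_function(bad)), len(slots_without_function(good)))
```

Because the check depends on lexical containment rather than on the shape of the node itself, it needs position information that a plain pattern match on a subexpression would not have.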
You can choose from different rules depending on the syntax that you care about. For example, if you wanted a rule to find cases where a Real is being added to an Integer, then you do not care about the concrete syntax of 1.2+3 versus Plus[1.2, 3].
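The benefit of writing such a rule against abstract syntax can be shown with a Python analogue. Python has no counterpart to the `Plus[1.2, 3]` form, so this sketch varies whitespace and parentheses instead: differences in concrete syntax that the abstract tree discards. The helper name `float_plus_int` is an assumption for this example.

```python
import ast

def float_plus_int(source):
    """Return True if the source adds a float literal to an int literal."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
            # Collect the Python types of whichever operands are literals.
            kinds = {type(side.value) for side in (node.left, node.right)
                     if isinstance(side, ast.Constant)}
            if kinds == {float, int}:
                return True
    return False

# Different concrete syntax, same abstract tree, same answer.
print(float_plus_int("1.2+3"), float_plus_int("( 1.2 )  +  3"), float_plus_int("1 + 3"))
```

A rule about spacing or redundant parentheses, by contrast, would need the concrete syntax, since that information is gone by the time the abstract tree is built.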
There are three different levels of syntax:
Catching Common Problems
Example 1:
In this example, I forgot to put a semicolon at the end of the line, so the entire expression is treated as a=1*a+b. This is incorrect, and leads to infinite recursion when the code is run:
Example 2:
In this example, I forgot to insert a question mark for PatternTest.
CodeInspector catches cases when Q functions are being treated as a Head and suggests inserting a question mark:
Catching More Obscure Problems
Example 3:
In this example, I am trying to specify ImageSize using the output of ImageDimensions, but the two functions do not have the same units. The ImageSize option expects points, but ImageDimensions returns pixels:
Real-World Problems
CodeInspector is run regularly on the internal code written by developers at Wolfram Research. The following are two recently encountered problems that were found by CodeInspector and then fixed. These problems are subtle, and would have been hard to find by writing tests.
Problem 1:
Parentheses are needed to wrap the entire right-hand side. The original code was equivalent to:
This is certainly not what the author intended.
Problem 2:
The extra underscore _ after inc means that {__} was being treated as the Optional value of inc. But the intention was for inc to match the pattern {__}. CodeInspector was able to find these issues and get them fixed before releasing the code.
The CodeInspector Workflow
CodeInspectSummarize reports problems with a given File in the exact same way as it reports problems with a given String.
Because Wolfram Language code is interpreted, and therefore does not have a compilation step, it may not be clear when the best time to scan for problems is. In practice, I’ve found that the time when paclets are built is a good time to scan.
I have scripted CMake to scan each Wolfram Language file before building the paclet. Here is what it looks like when I have a typo in my code and I try to build the CodeInspector paclet itself:
This way, I can see the typo and fix it immediately in the source code. Otherwise, I would have built the paclet with bad code, and would have encountered strange errors while trying to run the code. This highlights one of the many reasons why it’s important to catch and fix problems as soon as possible, and it demonstrates the significance of CodeInspector by testing CodeInspector itself.
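The scan-before-build pattern is not tied to CMake or to the Wolfram Language. As a hedged sketch of the general idea, the Python script below scans every source file in a directory and refuses to "build" if any file fails to parse. The names `scan_tree` and `build` are hypothetical, and a plain syntax check stands in for a real inspector.

```python
import ast
import pathlib
import tempfile

def scan_tree(root, pattern="*.py"):
    """Return (file name, message) pairs for files that fail to parse."""
    failures = []
    for path in sorted(pathlib.Path(root).rglob(pattern)):
        try:
            ast.parse(path.read_text(encoding="utf-8"), filename=str(path))
        except SyntaxError as err:
            failures.append((path.name, f"line {err.lineno}: {err.msg}"))
    return failures

def build(root):
    """Scan first; return exit status 1 instead of building if anything was found."""
    failures = scan_tree(root)
    for name, message in failures:
        print(f"{name}: {message}")
    return 1 if failures else 0

# Demonstrate with a throwaway directory holding one good file and one typo.
with tempfile.TemporaryDirectory() as tmp:
    pathlib.Path(tmp, "good.py").write_text("x = [1, 2]\n", encoding="utf-8")
    pathlib.Path(tmp, "typo.py").write_text("x = [1, 2\n", encoding="utf-8")
    status = build(tmp)

print("exit status:", status)
```

Returning a nonzero status is what lets a build system stop before packaging the bad code.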
New rules are continually being added to CodeInspector, which you can check out in the CodeInspector repository on GitHub. Many of the current rules were inspired by suggestions from users, so please let me know in the comments section if you have any ideas or suggestions.