CERT Secure Coding

STR30-PL. Capture variables should be read only immediately after a successful regex match

Perl's capture variables ($1, $2, etc.) are assigned the values of capture expressions after a regular expression (regex) match has been found. If a regex fails to find a match, the capture variables are not reset: they keep the values set by an earlier successful match, or remain undefined if there was none. The perlre manpage [Wall 2011] contains this note:

NOTE: Failed matches in Perl do not reset the match variables, which makes it easier to write code that tests for a series of more specific cases and remembers the best match.

Consequently, the value of a capture variable can be indeterminate if a previous regex failed. The value can also be overwritten on subsequent regex matches. Always ensure that a regex was successful before reading its capture variables.
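
The overwriting hazard can be sketched as follows. This is a minimal illustration rather than part of the original rule; the function name parse_user_and_id and the "user=... id=..." input format are hypothetical. Each match is tested, and each capture is copied into a lexical variable before the next match runs.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (user, id) on success, or an empty list
# if either pattern fails to match.
sub parse_user_and_id {
    my ($line) = @_;

    return unless $line =~ /user=(\w+)/;
    my $user = $1;    # copy now; the next successful match replaces $1

    return unless $line =~ /id=(\d+)/;
    my $id = $1;      # $1 now holds the id, not the user name

    return ($user, $id);
}

my ($user, $id) = parse_user_and_id("user=alice id=42");
print "user is $user, id is $id\n" if defined $user;
```

Because every capture is saved immediately after a tested match, no later regex can change the values the caller receives.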

Noncompliant Code Example

This noncompliant code example demonstrates the hazards of relying on capture variables without testing the success of a regex.

Non-compliant code
my $data = "[    4.693540] sr 1:0:0:0: Attached scsi CD-ROM sr0";

my $cd;
my $time;

$data =~ /Attached scsi CD-ROM (.*)/;
$cd = $1;
print "cd is $cd\n";

$data =~ /\[(\d*)\].*/;  # this regex will fail
$time = $1;
print "time is $time\n";

This code produces the following output:

cd is sr0
time is sr0

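The surprising value for the $time variable arises because the second regex fails: the bracketed timestamp contains leading spaces and a decimal point, neither of which \d* can match. The failed match leaves the capture variable $1 still holding its previously assigned value, sr0.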

Compliant Solution

In this compliant solution, both regular expressions are checked for success before the capture variables are accessed.

Compliant code
my $data = "[    4.693540] sr 1:0:0:0: Attached scsi CD-ROM sr0";

my $cd;
my $time;

if ($data =~ /Attached scsi CD-ROM (.*)/) {
  $cd = $1;
  print "cd is $cd\n";
}

if ($data =~ /\[(\d*)\].*/) {  # this regex will fail
  $time = $1;
  print "time is $time\n";
}

This code produces the following output:

cd is sr0

This output might not be what the developer expected, but it clearly reveals that the latter regex failed to find a match.
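
Once the failure is visible, the pattern itself can be repaired. The following is a sketch under assumptions, not the rule's own code: the helper name extract_timestamp is hypothetical, and the pattern assumes the timestamp is always a decimal number, optionally padded with spaces, inside square brackets. The capture is still read only after a successful match, and the caller handles the failure case explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper: returns the bracketed timestamp, or undef if the
# line does not contain one in the assumed "[   1.234]" form.
sub extract_timestamp {
    my ($line) = @_;

    # \s* skips the padding spaces; \d+\.\d+ accepts the decimal point.
    if ($line =~ /\[\s*(\d+\.\d+)\]/) {
        return $1;    # read $1 only after a successful match
    }
    return;           # no match: undef in scalar context
}

my $data = "[    4.693540] sr 1:0:0:0: Attached scsi CD-ROM sr0";
my $time = extract_timestamp($data);

if (defined $time) {
    print "time is $time\n";
} else {
    warn "no timestamp found\n";
}
```

For the sample line this prints "time is 4.693540"; for a line without a matching timestamp it emits the warning instead of reusing a stale capture.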

Risk Assessment

Rule      Severity  Likelihood  Detectable  Repairable  Priority  Level
STR30-PL  Medium    Probable    Yes         No          P8        L2

Automated Detection

Tool          Diagnostic
Perl::Critic  RegularExpressions::ProhibitCaptureWithoutTest

Bibliography

[Conway 05] "Captured Values," p. 253
[CPAN] Elliot Shank, Perl-Critic-1.116, RegularExpressions::ProhibitCaptureWithoutTest
[Wall 2011] perlre