=head1 NAME

perlsec - Perl security

=head1 DESCRIPTION

Perl is designed to make it easy to program securely even when running
with extra privileges, like setuid or setgid programs. Unlike most
command line shells, which are based on multiple substitution passes on
each line of the script, Perl uses a more conventional evaluation scheme
with fewer hidden snags. Additionally, because the language has more
builtin functionality, it can rely less upon external (and possibly
untrustworthy) programs to accomplish its purposes.

Perl automatically enables a set of special security checks, called
I<taint mode>, when it detects its program running with differing real
and effective user or group IDs. The setuid bit in Unix permissions is
mode 04000, the setgid bit mode 02000; either or both may be set. You
can also enable taint mode explicitly by using the B<-T> command line
flag. This flag is I<strongly> suggested for server programs and any
program run on behalf of someone else, such as a CGI script. Once taint
mode is on, it's on for the remainder of your script.

While in this mode, Perl takes special precautions called
I<taint checks> to prevent both obvious and subtle traps. Some of these checks
are reasonably simple, such as verifying that path directories aren't
writable by others; careful programmers have always used checks like
these. Other checks, however, are best supported by the language itself,
and it is these checks especially that contribute to making a set-id Perl
program more secure than the corresponding C program.

You may not use data derived from outside your program to affect
something else outside your program--at least, not by accident. All
command line arguments, environment variables, locale information (see
L<perllocale>), results of certain system calls (readdir, readlink,
the gecos field of getpw* calls), and all file input are marked as
"tainted". Tainted data may not be used directly or indirectly in any
command that invokes a sub-shell, nor in any command that modifies
files, directories, or processes. (B<Important exception>: If you pass
a list of arguments to either C<system> or C<exec>, the elements of
that list are B<NOT> checked for taintedness.) Any variable set
to a value derived from tainted data will itself be tainted,
even if it is logically impossible for the tainted data
to alter the variable. Because taintedness is associated with each
scalar value, some elements of an array can be tainted and others not.
For example:

    $arg = shift;               # $arg is tainted
    $hid = $arg . 'bar';        # $hid is also tainted
    $line = <>;                 # Tainted
    $line = <STDIN>;            # Also tainted
    open FOO, "/home/me/bar" or die $!;
    $line = <FOO>;              # Still tainted
    $path = $ENV{'PATH'};       # Tainted, but see below
    $data = 'abc';              # Not tainted

    system "echo $arg";         # Insecure
    system "/bin/echo", $arg;   # Secure (doesn't use sh)
    system "echo $hid";         # Insecure
    system "echo $data";        # Insecure until PATH set

    $path = $ENV{'PATH'};       # $path now tainted

    $ENV{'PATH'} = '/bin:/usr/bin';
    delete @ENV{'IFS', 'CDPATH', 'ENV', 'BASH_ENV'};

    $path = $ENV{'PATH'};       # $path now NOT tainted
    system "echo $data";        # Is secure now!

    open(FOO, "> $arg");        # Not OK - trying to write

    open(FOO,"echo $arg|");     # Not OK, but...
    open(FOO,"-|")
        or exec 'echo', $arg;   # OK

    $shout = `echo $arg`;       # Insecure, $shout now tainted

    unlink $data, $arg;         # Insecure
    umask $arg;                 # Insecure

    exec "echo $arg";           # Insecure
    exec "echo", $arg;          # Secure (doesn't use the shell)
    exec "sh", '-c', $arg;      # Considered secure, alas!

    @files = <*.c>;             # Always insecure (uses csh)
    @files = glob('*.c');       # Always insecure (uses csh)

If you try to do something insecure, you will get a fatal error saying
something like "Insecure dependency" or "Insecure $ENV{PATH}". Note that you
can still write an insecure B<system> or B<exec>, but only by explicitly
doing something like the "considered secure" example above.
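
The fork-and-exec idiom shown in the examples can also be written with the
list form of a piped C<open>, which likewise bypasses the shell. What follows
is only a sketch, assuming Perl 5.8 or later on a Unix-like system that has
F</bin/echo>; the helper name C<run_echo> is hypothetical. Keep in mind that
avoiding the shell does not validate the arguments themselves.

```perl
use strict;
use warnings;

# Hypothetical helper: run /bin/echo without a shell and capture its output.
sub run_echo {
    my @args = @_;

    # Give the child a minimal, known environment for the duration of the call.
    local %ENV = (PATH => '/bin:/usr/bin');

    # List-form piped open: the arguments go straight to exec, so shell
    # metacharacters in @args are never interpreted.
    open(my $fh, '-|', '/bin/echo', @args)
        or die "Cannot run /bin/echo: $!";
    my $output = do { local $/; <$fh> };    # slurp everything the child wrote
    close($fh) or die "/bin/echo failed: $! (status $?)";
    return $output;
}

print run_echo('hello; ls');    # the semicolon is passed through as plain data
```

Because the command and its arguments are separate list elements, the string
C<hello; ls> reaches F</bin/echo> as a single argument instead of being split
into two shell commands.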

=head2 Laundering and Detecting Tainted Data

To test whether a variable contains tainted data, whose use would thus
trigger an "Insecure dependency" message, check your nearby CPAN mirror
for the F<Taint.pm> module, which should become available around November
1997. Or you may be able to use the following I<is_tainted()> function.

    sub is_tainted {
        return ! eval {
            join('',@_), kill 0;
            1;
        };
    }

This function makes use of the fact that the presence of tainted data
anywhere within an expression renders the entire expression tainted. It
would be inefficient for every operator to test every argument for
taintedness. Instead, the slightly more efficient and conservative
approach is used that if any tainted value has been accessed within the
same expression, the whole expression is considered tainted.
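
On a Perl that bundles the C<Scalar::Util> module, its C<tainted> function
may serve as a ready-made alternative to a hand-written test. The following
is a minimal sketch rather than a recommended implementation; the helper
name C<describe> is hypothetical. Run it with B<-T> and an argument to see
taint propagate through an expression; without B<-T> every value reports as
clean.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Hypothetical helper: label a value as tainted or clean.
sub describe {
    my ($label, $value) = @_;
    return sprintf '%s: %s', $label, tainted($value) ? 'tainted' : 'clean';
}

my $arg     = @ARGV ? $ARGV[0] : '';    # tainted under -T when an argument is given
my $derived = 'prefix-' . $arg;         # inherits any taint carried by $arg
my $data    = 'abc';                    # a literal, never tainted

print "taint mode is ", (${^TAINT} ? "on" : "off"), "\n";
print describe('arg',     $arg),     "\n";
print describe('derived', $derived), "\n";
print describe('data',    $data),    "\n";
```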

But testing for taintedness gets you only so far. Sometimes you just
have to clear your data's taintedness. The only way to bypass the tainting
mechanism is by referencing subpatterns from a regular expression match.
Perl presumes that if you reference a substring using $1, $2, etc., you
knew what you were doing when you wrote the pattern. That means using
a bit of thought--don't just blindly untaint anything, or you defeat the
entire mechanism. It's better to verify that the variable has only good
characters (for certain values of "good") rather than checking whether it
has any bad characters. That's because it's far too easy to miss bad
characters that you never thought of.

Here's a test to make sure that the data contains nothing but "word"
characters (alphabetics, numerics, and underscores), a hyphen, an at sign,
or a dot.

    if ($data =~ /^([-\@\w.]+)$/) {
        $data = $1;                     # $data now untainted
    } else {
        die "Bad data in $data";        # log this somewhere
    }

This is fairly secure because C</\w+/> doesn't normally match shell
metacharacters, nor are dot, dash, or at going to mean something special
to the shell. Use of C</.+/> would have been insecure in theory because
it lets everything through, but Perl doesn't check for that. The lesson
is that when untainting, you must be exceedingly careful with your patterns.
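
The whitelist test above can be wrapped in a small reusable helper. This is
a sketch under stated assumptions, not the only way to do it: the name
C<launder_word> is hypothetical, and the pattern is anchored with C<\A> and
C<\z> instead of C<^> and C<$>, a stricter choice that also rejects a
trailing newline.

```perl
use strict;
use warnings;

# Hypothetical helper: return an untainted copy of $value if it consists
# solely of word characters, hyphens, at signs, and dots; otherwise undef.
sub launder_word {
    my ($value) = @_;
    return unless defined $value;

    # \A and \z anchor to the very start and end of the string, so a
    # trailing newline is rejected rather than silently ignored.
    if ($value =~ /\A([-\@\w.]+)\z/) {
        return $1;    # text taken from a capture group is untainted
    }
    return;
}

for my $candidate ('user@example.com', 'rm -rf /; echo') {
    my $clean = launder_word($candidate);
    print defined $clean ? "accepted: $clean\n" : "rejected: $candidate\n";
}
```

Returning undef on failure leaves it to the caller to decide whether to die,
log the rejected input, or ask for it again.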

Laundering data using a regular expression is the I<only> mechanism for
untainting dirty data, unless you use the strategy detailed below to fork
a child of lesser privilege.

The example does not untaint $data if C