develooper Front page | perl.perl5.porters | Postings from July 2008

Removing files called minus (was: MAGIC MAGIC MAGIC)

From: Tom Christiansen
Date: July 29, 2008 06:58
Subject: Removing files called minus (was: MAGIC MAGIC MAGIC)
Message ID: 24692.1217339882@chthon
In-Reply-To: Message from Nicholas Clark <nick@ccl4.org> 
   of "Tue, 29 Jul 2008 10:16:10 BST." <20080729091610.GA45868@plum.flirble.org> 

> This whole thread of threads is depressing me. A lot of talk. Little else.

Although this is probably little solace for you, Nick, it *could* be worse:
imagine people wildly changing the core out from under us and breaking tons
of old code.  Better a lot of talk than that, eh?

I had hoped that my long message last night would have inspired those
flustered by magic open both to adopt their own measures and to understand
that these will never be enough in their hostile environments, and why.

Apparently it has done none of that.  So I will be more specific.

But first...  (skip to /history if you must)

TEN YEARS AGO I took some trouble to patiently explain the difference
between simplicity and specificity in Perl's open and sysopen functions.
This document came to be known as perlopentut.  Compared with other
writings of mine, this one appears to have suffered far less from the
mutilating effects of "helpful hands".

I showed the simple mapping of:

    $ myprogram file1 file2 file3
    $ myprogram    <  inputfile
    $ myprogram    >  outputfile
    $ myprogram    >> outputfile
    $ myprogram    |  otherprogram
    $ otherprogram |  myprogram

into straightforward statements like this:

   open(INFO,   "<  datafile") || die("can't open datafile: $!");
   open(RESULTS,">  runstats") || die("can't open runstats: $!");
   open(LOG,    ">> logfile ") || die("can't open logfile:  $!");

And continued to say:

   The other important thing to notice is that, just as in
   the shell, any whitespace before or after the filename is
   ignored.  This is good, because you wouldn't want these to
   do different things:

       open INFO,   "<datafile"
       open INFO,   "< datafile"
       open INFO,   "<  datafile"

   Ignoring surrounding whitespace also helps for when you
   read a filename in from a different file, and forget to
   trim it before opening:

       $filename = <INFO>;         # oops, \n still there
       open(EXTRA, "< $filename") || die "can't open $filename: $!";

   This is not a bug, but a feature.  Because "open" mimics
   the shell in its style of using redirection arrows to
   specify how to open the file, it also does so with respect
   to extra whitespace around the filename itself as well.
   For accessing files with naughty names, see "Dispelling
   the Dweomer".

at which point I wrote:

   Dispelling the Dweomer

   Perl is more of a DWIMmer language than something like
   Java--where DWIM is an acronym for "do what I mean".  But
   this principle sometimes leads to more hidden magic than
   one knows what to do with.  In this way, Perl is also
   filled with dweomer, an obscure word meaning an
   enchantment.  Sometimes, Perl's DWIMmer is just too much
   like dweomer for comfort.

   If magic "open" is a bit too magical for you, you don't
   have to turn to "sysopen".  To open a file with arbitrary
   weird characters in it, it's necessary to protect any
   leading and trailing whitespace.  Leading whitespace is
   protected by inserting a "./" in front of a filename that
   starts with whitespace.  Trailing whitespace is protected
   by appending an ASCII NUL byte ("\0") at the end of the
   string.

       $file =~ s#^(\s)#./$1#;
       open(FH, "< $file\0")   || die "can't open $file: $!";

   This assumes, of course, that your system considers dot
   the current working directory, slash the directory
   separator, and disallows ASCII NULs within a valid
   filename.  Most systems follow these conventions,

So pleading ignorance is no excuse: I disbelieve.  And even then it 
was hardly new knowledge, for YEARS BEFORE THAT, I wrote that very 
answer in the perlfaq.

(Sadly, you can no longer find it there, for a variety of reasons,
including that that document is now virtually unreadable, in part I believe
due to tab issues, but perhaps because people can't line up indented code
properly to save their lives.  It now looks like a messy tattercloth
that fits no one -- and that is the *nicest* thing I can say about it.)

And even *then* it was a well-known "issue" in Perl.  We simply dealt 
with it.  You can't plead ignorance.

And it's not just Perl.  What was always *THE* number-one FAQ for unix
shell users?  How to remove a file named "-".  The answer was to call it
"./-" and be done with it.  So too here.  People who feign surprise are
either disingenuous or else unread in the field--neither of whom would
I want writing code for me, and neither should you.

Ok, end of history lesson.

For those who don't want <ARGV> to handle special filenames, the
simplest, most straightforward, and least disruptive approach is 
for them to map the filenames in @ARGV using code along the lines
shown above.  Something like this would be easy enough:

    for (@ARGV) {
       s!^(?=\s)!./!;	# leading whitespace preserved
       s/^/< /;		# force open for input
       s/\z/\0/;	# trailing whitespace preserved & pipes forbidden
    }

I had hoped that my programmatic preprocessing of @ARGV would have pointed
people in this direction, but apparently it did not.  Just make sure to do
it AFTER getopt processing and BEFORE entering the while (<>) block.
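
As a rough sketch of that placement (not part of the original message):
the helper name literal_input_names and the --verbose option are made up
for illustration; the three substitutions are the ones from the loop
above.  One caveat, offered as an assumption to check against your own
perl: later releases are stricter about NUL bytes in pathnames, so the
trailing "\0" may be refused there.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite each name so that two-argument open
# treats it as a plain file to read, whatever characters it contains.
sub literal_input_names {
    my @names = @_;          # work on copies, leave the caller's list alone
    for (@names) {
        s!^(?=\s)!./!;       # leading whitespace preserved
        s/^/< /;             # force open for input
        s/\z/\0/;            # trailing whitespace preserved & pipes forbidden
    }
    return @names;
}

# Intended placement, left as comments so the sketch has no side effects:
#
#   use Getopt::Long qw(GetOptions);
#   GetOptions('verbose' => \my $verbose) or die "usage: $0 [--verbose] file ...\n";
#   @ARGV = literal_input_names(@ARGV);    # AFTER getopt processing
#   while (my $line = <>) {                # BEFORE the <> loop
#       print $line;
#   }
```

Keeping the rewrite in its own sub means programs that like magic open
never call it and are not affected.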

No rocket-science here: this is not new technology, has been known for a
long time, and doesn't affect people who don't care for such nonsense.

But guess what?  This isn't enough to make your program "safe" against
hostile entities.  That's a very much more difficult task.  That's why I
think this is all a tempest in a teapot, a tale told by /*CENSORED*/
full of sound and fury--and signifying nothing.  And yes, Nick, that
*is* depressing; I'm not going to disagree with you, because you're right.

There is this assumption that "people" are putting out little bombs for
your program to stumble on, and that you must protect yourself from
them.  I don't like that assumption, and I don't believe it relevant to the
bulk of Perl programming.  The "rm -rf / |" example filename is just some
fanciful bugbear to scare our children at night.  Who's writing that?  What
kind of system are you on?  And what the devil are you doing running around
executing code as the superuser THAT PEEKS INTO PEOPLE'S FILES?  There's
plenty of ethical miasma here on both sides of that equation.  And if
you're not running as the superuser, how much harm is it really going to
do?  And who would do that to you?

My examples last night also should have made people realize there are
perils that they haven't even thought of.  I don't need to create a file
named that to wreak havoc.  In fact, I can likely screw up your system 
with files whose names are, to all appearances, just fine.  Here's one way:

    % mkfifo /tmp/foo

There: now when your nosy little superscript that seems to think it should
be reading everybody's files tries to open it, it will hang forever.  Hence
my cleansing of @ARGV for {-f && -T} in that specific order.  Do we "fix"
this by building the -f logic into -T?  No, we do *not*.
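
For reference, a minimal sketch of that kind of cleansing; the helper
name plain_text_files is invented here, and this is only a guess at the
shape of the filter, not the code from last night's message.  The order
matters because -T has to open and read the file, which is exactly what
hangs on a FIFO, while -f only stats it.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only names that are plain files (-f) and
# whose contents look like text (-T).  -f is tested first, so -T is
# never attempted on a FIFO, a device, or a directory.
sub plain_text_files {
    return grep { -f $_ && -T $_ } @_;
}

# Intended use, left as a comment so the sketch has no side effects:
#
#   @ARGV = plain_text_files(@ARGV);
```

This still leaves a window between the test and the later open, so it
is a convenience filter and not a defense.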

Oh, and what about systems with mandatory locking on files with funny
permissions, like g+s a-x files?  I'll bet you could get some harassment
value out of those, too.

And even that's still the least of your worries in a hostile environment.  

Suppose a program is known to create a file named /tmp/bar, 
presumably through open >/tmp/bar or the equivalent.  

Fine.  

Merely symlink /etc/passwd to /tmp/bar and watch the fun begin.
Or to something in /dev.  It's lots of fun.  /dev/mem and /dev/kmem
are also fun link targets.  Anybody got a directory that isn't sticky
I can play with?  Yum.
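
To make that one hole concrete, here is a sketch of the traditional
exclusive-create idiom; the helper name create_fresh is invented for
illustration.  With O_CREAT and O_EXCL together the open fails if the
name already exists, and that includes a symlink somebody planted
there.  It closes this particular trick only, and says nothing about
the rest of the list.

```perl
use strict;
use warnings;
use Fcntl qw(O_WRONLY O_CREAT O_EXCL);

# Hypothetical helper: create $path only if nothing is there yet.
# Returns a write handle on success, or nothing on failure (check $!).
# Unlike open(FH, "> $path"), this does not follow an existing symlink
# and scribble on whatever it points at.
sub create_fresh {
    my ($path) = @_;
    sysopen(my $fh, $path, O_WRONLY | O_CREAT | O_EXCL, 0600)
        or return;
    return $fh;
}

# Intended use, left as a comment so the sketch has no side effects:
#
#   my $fh = create_fresh("/tmp/bar")
#       or die "refusing to reuse /tmp/bar: $!";
```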

These are all old, old, old hacks.  Ignorance of the past is no excuse.
The mythical bugbears of automatically running process pipes are far less
scary than any of these scenarios I've presented, and changing magic argv
will do NOTHING about them.

The lesson is that if you are writing code for eventual execution in a
hostile environment where evil people are trying to kill you/it, you
have to take a very, very different approach to matters.  Sometimes that
means taint mode or Safe compartments, but even these are only a small set
of tools; what's really needed is a lot of good, hard thought by security
experts.  Races and timing attacks are even more subtle.

You-against-the-world-code includes daemons, setID scripts, and CGI
scripts, but is hardly restricted to those alone.  It's a far scarier 
place than you think, and this will keep you awake at night, anxious
and depressed if you do think about it.  Perl should not be crippled
just because some very few people are paranoid, justly or otherwise.

Now do you see what I meant when I said this is not a real problem?
You can put my foreach preprocessing loop in before <ARGV>, and it
will do nothing at all to save you from any of those real problems of 
running in a hostile environment.  It may make you think it has, but
it hasn't.

And that's *not* a Perl problem, so Perl should not be expected
to solve it.  It's more of a social dysfunction.

So good luck at solving *that* problem, and have a nice day.

--tom
