
Near-FMTEYEWTK instructorial on ties, handles, and methods (was: How to tell whether readline got an error or EOF)

From: Tom Christiansen
Date: August 2, 2008 14:17
Subject: Near-FMTEYEWTK instructorial on ties, handles, and methods (was: How to tell whether readline got an error or EOF)
Message ID: 31470.1217711834@chthon
In-Reply-To: Ed Avis's message of "Sat, 02 Aug 2008 17:16:57 -0000."
             <loom.20080802T170825-535@post.gmane.org> 

> Tom Christiansen <tchrist <at> perl.com> writes:

>> Clearly you must 
>> 
>>     (1) clear errno before the getc / readline 
>> and (2) use IO::Handle and inspect STDIN->error() to determine 
>>        whether your undef means eof or error, and only if it does
>>       say there's an error might errno be of any use to you.

> Thanks.  I wasn't aware that it was sometimes necessary to assign 
> $! = 0 explicitly.

It isn't--provided that you are *very* careful about checking only on
syscall failure: your OWN syscall failures, in particular.  Because you
don't know for sure how and when Perl makes a kernel dive for whatever
reason, you don't know when errno is getting itself reset by libc behind
your back.
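
A minimal sketch of what "your OWN syscall failure" looks like in practice.
The path is made up for illustration and is assumed not to exist; the point
is only that $! gets looked at in the failure branch, and gets copied before
anything else has a chance to dive into the kernel.

```perl
use strict;
use warnings;

# Hypothetical path, assumed not to exist on this machine.
my $path = "/no/such/directory/nothing.txt";

my $saved_errno;
if (open(my $fh, "<", $path)) {
    # Success: $! holds nothing meaningful here, so don't look at it.
    close($fh) or die "can't close $path: $!";
}
else {
    # Failure of *our* syscall: copy $! right away, before any other
    # operation has a chance to overwrite errno.
    $saved_errno = $!;
    print "open failed: $saved_errno\n";
}
```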

> Also I hadn't expected errno to get set on EOF, since EOF is not a
> true error condition.  

s/a true/an/;

It's surely some side-effect.  This is the peril of errno,
which Larry fixed for $@.

> But if the rule is to never look at $! unless
> error() returns true, that's okay.

Right.  Look before you leap, like if (p && p->foo) in C.

>> Something like this should work better than what you're doing:
>>
>>    use IO::Handle;
>>    $! = 0;
>>    unless (defined($ch = getc(STDIN))) {
>>      print "STDIN has ", STDIN->error() 
>>                              ?  "had an error: $!\n"
>>                              :  "hit EOF\n";
>>    } 
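
The same rule applied to readline, as a sketch rather than a recipe.  It
assumes a scratch file made with File::Temp just so there is something real
to read; error() is consulted first, and $! only if error() says so.

```perl
use strict;
use warnings;
use IO::Handle;
use File::Temp qw(tempfile);

# Scratch file with two lines in it (illustration only).
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "one\ntwo\n";
close($tmp) or die "can't close $path: $!";

open(my $in, "<", $path) or die "can't open $path: $!";

my $count = 0;
while (1) {
    $! = 0;                      # clear errno before the read
    my $line = readline($in);
    if (defined $line) {
        $count++;
        next;
    }
    # undef: ask the handle whether that was an error or plain EOF
    die "read error on $path: $!" if $in->error();
    last;                        # no error flag set, so it was EOF
}
close($in) or die "can't close $path: $!";

print "read $count lines, then hit EOF\n";
```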

> How do error() and clearerr() work with tied filehandles?  They are
> not mentioned in perltie(1).

They don't.  You're missing something.  There's some conceptual
blockage in your head.

Ok, this may take a while.  Better go make some coffee.

To begin with, regarding tying handles, perltie states:

    This is partially implemented now.

    A class implementing a tied filehandle should define the
    following methods: TIEHANDLE, at least one of PRINT,
    PRINTF, WRITE, READLINE, GETC, READ, and possibly CLOSE,
    UNTIE and DESTROY.  The class can also provide: BINMODE,
    OPEN, EOF, FILENO, SEEK, TELL - if the corresponding perl
    operators are used on the handle.

It then explains in more detail the expected signatures:

    TIEHANDLE classname, LIST
    WRITE this, LIST
    PRINT this, LIST
    PRINTF this, LIST
    READ this, LIST
    READLINE this
    GETC this
    CLOSE this
    UNTIE this
    DESTROY this

On the other hand, the standard Tie::Handle module (not Tie::StdHandle; no
such beastie) says that "This module provides some skeletal methods for
handle-tying classes."  It implements the following methods:

   TIEHANDLE classname, LIST
   WRITE this, scalar, length, offset
   PRINT this, LIST
   PRINTF this, format, LIST
   READ this, scalar, length, offset
   READLINE this
   GETC this
   CLOSE this
-> OPEN this, filename
-> BINMODE this
-> EOF this
-> TELL this
-> SEEK this, offset, whence
   DESTROY this

Which is a bit odd, because in fact you find the following, 
which seems to contradict the documentation above:

    % perl -MTie::StdHandle -de 0

    DB<1> m Tie::Handle
        CLOSE
        GETC
        PRINT
        PRINTF
        READ
        READLINE
        TIEHANDLE
        WRITE
        carp
        confess
        croak
        new
        via UNIVERSAL: DOES
        via UNIVERSAL: VERSION
        via UNIVERSAL: can
        via UNIVERSAL: isa

    DB<2> m Tie::StdHandle
        BINMODE
        CLOSE
        EOF
        FILENO
        GETC
        OPEN
        READ
        READLINE
        SEEK
        TELL
        TIEHANDLE
        WRITE
        carp
        confess
        croak
        via Tie::Handle: PRINT
        via Tie::Handle: PRINTF
        via Tie::Handle: new
        via UNIVERSAL: DOES
        via UNIVERSAL: VERSION
        via UNIVERSAL: can
        via UNIVERSAL: isa
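
If you'd rather not fire up the debugger, the same question can be put to
the classes with can().  This is only a sketch; the list of method names
is the handful flagged with arrows above plus FILENO, and what gets printed
depends on the Tie::Handle and Tie::StdHandle your perl ships with.

```perl
use strict;
use warnings;
use Tie::StdHandle;    # pulls in Tie::Handle as well

my %found;
for my $class (qw(Tie::Handle Tie::StdHandle)) {
    # Keep only the method names this class can actually resolve.
    my @have = grep { $class->can($_) }
               qw(OPEN BINMODE EOF TELL SEEK FILENO);
    $found{$class} = \@have;
    print "$class: ", (@have ? "@have" : "(none of these)"), "\n";
}
```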

In any event, what you're missing is that a tied THING is not a THING.  It
is a proxy THING providing hooks that mimic (maybe) a real THING.  To mimic
a filehandle, as in STDIN or MY_SILLY_HANDLE_NAME_GOES_HERE, implementing
(at least some of) these should suffice.

However, you're mixing these around and accidentally now talking about
something else.  You're talking about miming the *SOME_FH{IO} object.  This
is of class IO::Handle, and this is quite different.  (I still don't know
where the dirent stuff happens, but anyway.)  Remember that regular
built-in functions, operators, and operations are expected to be called on
tied THINGs, which magically (=implicitly) trigger ALLCAP method calls.
They're in all-caps because they are invoked implicitly, not explicitly.

On regular object classes, one is expected to make method invocations on 
the object proper (though perhaps using imperative-dative-accusative syntax), 
not subroutine calls on the tie's proxy variable.
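
Here is a small sketch of that implicit dispatch.  Tie::Shout and SHOUT are
names invented for the demo; the class does nothing but record which
all-caps method each ordinary built-in ended up triggering.

```perl
use strict;
use warnings;

package Tie::Shout;    # hypothetical class, for illustration only

sub TIEHANDLE {
    my ($class) = @_;
    my @log;                           # the object is just a call log
    return bless \@log, $class;
}
sub PRINT {
    my $self = shift;
    push @$self, "PRINT(@_)";          # record the arguments we were given
    return 1;
}
sub READLINE {
    my $self = shift;
    push @$self, "READLINE";
    return "canned line\n";
}

package main;

my $obj = tie *SHOUT, "Tie::Shout";

print SHOUT "hello";     # a plain built-in; implicitly calls PRINT
my $line = <SHOUT>;      # a plain operator; implicitly calls READLINE

print "$_\n" for @$obj;  # prints PRINT(hello), then READLINE
```

Nowhere in main does anything spell out PRINT or READLINE; that is what
the all-caps convention is advertising.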

Methods on an IO::Handle object (note capitalization!):

  DB<3> x *STDIN{IO}
    0  IO::Handle=IO(0x3c03849c)

  DB<4> m *STDIN{IO}
    DESTROY
    SEEK_CUR
    SEEK_END
    SEEK_SET
    _IOFBF
    _IOLBF
    _IONBF
    _open_mode_string
    autoflush
    blocking
    carp
    clearerr
    close
    confess
    constant
    croak
    eof
    error
    fcntl
    fdopen
    fileno
    flush
    format_formfeed
    format_line_break_characters
    format_lines_left
    format_lines_per_page
    format_name
    format_page_number
    format_top_name
    format_write
    formline
    gensym
    getc
    getline
    getlines
    gets
    input_line_number
    input_record_separator
    ioctl
    new
    new_from_fd
    opened
    output_field_separator
    output_record_separator
    print
    printf
    printflush
    qualify
    qualify_to_ref
    read
    say
    setbuf
    setvbuf
    stat
    sync
    sysread
    syswrite
    truncate
    ungensym
    ungetc
    untaint
    write
    via Exporter: as_heavy
    via Exporter: export
    via Exporter: export_fail
    via Exporter: export_ok_tags
    via Exporter: export_tags
    via Exporter: export_to_level
    via Exporter: import
    via Exporter: require_version
    via UNIVERSAL: DOES
    via UNIVERSAL: VERSION
    via UNIVERSAL: can
    via UNIVERSAL: isa

See how very, very different the methods on tied handles 
are from those on IO::Handles?

It is uncommon but not unknown to mix explicit, visible mimed interfaces
with implicit, invisible direct interfaces on the same THING.

Consider, for example, DB_File's tied hashes of the BTREE flavor.
Usually--but not always--you need only deal with the proxy %hash and not
its implementing object.  The exceptions arise because that object supports
further operations that aren't in the TIEHASH interface.

For example, here I need the ->sync and the ->get_dup methods, but 
mostly I just want to deal with it as a normal hash.

    #!/usr/bin/env perl5.10.0

    # rulenick: demo of mergers, acquisitions, and inquisitions
    # Tom Christiansen <tchrist@perl.com>
    # Sat Aug  2 13:33:11 MDT 2008

    use 5.010_000;
    use strict;
    use warnings;
    use DB_File;  # also bequeaths O_* flags

    sub biggishly($$) {
        my($a,$b) = @_;

        length($b) <=> length($a)
                   ||
            uc($a) cmp     uc($b)
                   ||
               $a  cmp        $b 

    };

    my $DBASE_NAME = "BigStem.btree";

    $DB_BTREE->{flags}   = R_DUP();
    $DB_BTREE->{compare} = \&biggishly;

    my $trobj = tie( %ruler's_name, "DB_File", $DBASE_NAME, 
                     O_RDWR|O_CREAT, 0666, $DB_BTREE )
              || die "can't access/create $DBASE_NAME database: $!";

    # NB: creation *mode* is not 0666, but (0666 & ~umask()).
    printf("mode of %s is %#o\n", $DBASE_NAME, 
            07777 & (stat($DBASE_NAME))[2]);

    $ruler's_name{fred}  = "King Fernando II of Aragon";
    $ruler's_name{fred}  = "King Fernando V of Castile";
    $ruler's_name{liz}   = "Queen Isabel I of Castile and Leon";
    $ruler's_name{chuck} = "King Carlos I of Castile and Aragon ===> Spain";
    $ruler's_name{chuck} = "Emperor Charles V of the Holy Roman Empire";
    $ruler's_name{phil}  = "King Felipe II of Spain";
    $ruler's_name{phil}  = "King Felipe  I of Portugal";

    $trobj->sync();  # flush to disk

    my %seen_nick = ();
    for my $nick ( keys %ruler's_name ) {
        next if $seen_nick{$nick}++;  # keys also duped
        printf "\n%-10s =>   %s\n", ucfirst(lc($nick)), 
            join("\n  AND ALSO\t" => sort biggishly $trobj->get_dup($nick) );
    } 
    __END__
    ### Expected output of first run is:
    mode of BigStem.btree is 0644

    Chuck      =>   King Carlos I of Castile and Aragon ===> Spain
      AND ALSO      Emperor Charles V of the Holy Roman Empire

    Fred       =>   King Fernando II of Aragon
      AND ALSO      King Fernando V of Castile

    Phil       =>   King Felipe  I of Portugal
      AND ALSO      King Felipe II of Spain

    Liz        =>   Queen Isabel I of Castile and Leon


So while you can mix in method calls on the hidden object implementing the
tie, there's nothing that says that lurking beneath a tied handle there
should, must, or need be some sort of object that holds an open and valid
system file descriptor, despite whatever you may choose to have
fileno/FILENO report.
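
The split between the proxy and its hidden object doesn't need DB_File to
show up.  A stripped-down sketch, with Tie::Tally and its stores() method
both invented for the purpose: the hash interface goes through STORE and
FETCH, while stores() is reachable only through the object that tie() or
tied() hands back.

```perl
use strict;
use warnings;

package Tie::Tally;    # hypothetical class, for illustration only

sub TIEHASH {
    my ($class) = @_;
    return bless { data => {}, stores => 0 }, $class;
}
sub STORE {
    my ($self, $key, $value) = @_;
    $self->{stores}++;                 # count every assignment
    $self->{data}{$key} = $value;
}
sub FETCH {
    my ($self, $key) = @_;
    return $self->{data}{$key};
}
# Not part of the TIEHASH interface: an ordinary, explicit method.
sub stores { return $_[0]{stores} }

package main;

my $obj = tie my %h, "Tie::Tally";
$h{a} = 1;
$h{a} = 2;
$h{b} = 3;

print "a is $h{a}\n";                              # via FETCH: 2
print "stores so far: ", $obj->stores(), "\n";     # via the object: 3
print "tied() gives the same object: ",
      (tied(%h) == $obj ? "yes" : "no"), "\n";
```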

Furthermore, just because I can call STDIN->error() doesn't at all mean
that you can expect to call URHANDLE->error() meaningfully, where URHANDLE
is something you've tied.  You're assuming there's an underlying IO::Handle
object there, which is *not* a valid assumption for every tied-handle class,
or for any other tie class.

Consider this simple test program:

    #!/usr/bin/env perl5.10.0
    # randytester
    # Tom Christiansen <tchrist@perl.com>
    # Sat Aug  2 14:46:34 MDT 2008

    use 5.010_000;
    use strict;
    use warnings;
    use Tie::Randy;

    tie my $randy, "Tie::Randy";
    $randy = 10;
    say $randy for 1 .. 3;

    print "\n\nbegin again\n\n";
    $randy = 10;
    say $randy for 1 .. 5;

    print "\n\nnow with the handle version\n\n";
    tie *RANDY, "Tie::Randy";
    for (1 .. 10) {
        print scalar <RANDY>;
    } 

Here's the simple Tie::Randy class:

    package Tie::Randy;
    sub TIESCALAR {
        my $class = shift;
        bless \my $self, $class;
    }
    sub FETCH { rand() }
    sub STORE { 
        my($self, $seed) = @_;
        srand($seed)
    }
    sub TIEHANDLE { &TIESCALAR }
    sub READLINE  { rand() . "\n" }
    1;

As you see, there's no IO::Handle object you can 
invoke error() or clearerr() against.
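
You can ask the objects themselves.  In this sketch Tie::Noise is a made-up
stand-in shaped like Tie::Randy's handle half, and the comparison is simply
whether can("error") finds anything on the tie object versus on the real IO
object behind STDIN.

```perl
use strict;
use warnings;
use IO::Handle;

package Tie::Noise;    # hypothetical stand-in, for illustration only

sub TIEHANDLE {
    my ($class) = @_;
    my $self;                          # an empty scalar is all we hold
    return bless \$self, $class;
}
sub READLINE { return "noise\n" }

package main;

tie *NOISE, "Tie::Noise";
my $tie_obj = tied(*NOISE);            # the object behind the tie
my $real_io = *STDIN{IO};              # the IO object behind STDIN

my $tie_has_error  = $tie_obj->can("error") ? 1 : 0;
my $real_has_error = $real_io->can("error") ? 1 : 0;

print ref($tie_obj), " can error(): ", ($tie_has_error  ? "yes" : "no"), "\n";
print ref($real_io), " can error(): ", ($real_has_error ? "yes" : "no"), "\n";
```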

Even when there should happen to be such a handle lurking way down there,
hidden somewhere, the tie interface makes it hard to get to.  Consider
this, which does use real IO objects.

    #!/usr/bin/env perl5.10.0
    # bidirectional open2 tie demo
    # Tom Christiansen <tchrist@perl.com>
    # Sat Aug  2 14:50:08 MDT 2008
    use 5.010_000;
    use Tie::Open2;
    tie(*CALC, "Tie::Open2", "dc")
        || die qq(can't tie to "dc" command: $!);
    my $sum = 2;
    print CALC "$sum\n";
    for (1 .. 7) {
        print CALC "$sum * p\n";
        chomp($sum = <CALC>);
        print "$_: $sum\n";
    }
    close(CALC)   || warn "can't close CALC: $!";
    close(STDOUT) || die  "can't close STDOUT: $!";
    __END__
    ## Expected output:
    1: 4
    2: 16
    3: 256
    4: 65536
    5: 4294967296
    6: 18446744073709551616
    7: 340282366920938463463374607431768211456

How does that work?  Like this:

    package Tie::Open2;

    use strict;
    use Carp;
    use Tie::StdHandle;  
    use IPC::Open2;

    # do *NOT* use an @ISA here!

    sub TIEHANDLE {
	my ($class, @cmd) = @_;
	no warnings "once";
	my @fhpair = \do { local(*RDR, *WTR) };
	select((select($fhpair[1]), $|=1)[0] );  # unneeded?
	bless $_, "Tie::StdHandle" for @fhpair;
	bless(\@fhpair => $class)->OPEN(@cmd) || die;
	return \@fhpair;
    }

    sub OPEN {
	my ($self, @cmd) = @_;
	$self->CLOSE if grep {defined()} @{ $self->FILENO };
	return open2(@$self, @cmd);
    }

    sub FILENO {
	my $self = shift;
	return [ map { fileno $self->[$_] } 0,1 ];
    }

    for my $outmeth ( qw(PRINT PRINTF WRITE) ) {
	no strict "refs";  # play with my symbol table
	*$outmeth = sub {
	    my $self = shift;
	    return $self->[1]->$outmeth(@_);
	};
    }
    for my $inmeth ( qw(READ READLINE GETC) ) {
	no strict "refs";  # play with my symbol table
	*$inmeth = sub {
	    my $self = shift;
	    return $self->[0]->$inmeth(@_);
	};
    }
    for my $doppelmeth ( qw(BINMODE CLOSE EOF)) {
	no strict "refs";  # play with my symbol table
	*$doppelmeth = sub {
	    my $self = shift;
	    return $self->[0]->$doppelmeth(@_) 
			    && 
		   $self->[1]->$doppelmeth(@_);
	};
    }
    for my $deadmeth ( qw(SEEK TELL)) {
	no strict "refs";  # play with my symbol table
	*$deadmeth = sub {
	    croak("can't $deadmeth a pipe");
	};
    }
    1;

Just because I can say 

    print CALC "$sum\n"

with the object in dative position, and thanks to the tie, get the sub
Tie::Open2::PRINT invoked on the tied(CALC) object, does *NOT* mean I can
also turn around and call CALC->error() or CALC->clearerr() and expect
*that* to (miraculously) get at the IO::Handle methods of those names.
Why?  Lots of reasons, but for one, it's because *that* compiles into
"CALC"->error(), and there's not even a CALC class to be found, that's why.

Understand?
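
If you do want error() on something you've tied, the tie class has to offer
it on purpose, and the caller has to go through tied() to reach it.  What
follows is a sketch under those assumptions, not something Tie::Open2 above
does: Tie::Wrapped, WRAPPED, and the forwarding error() and clearerr()
methods are all invented for the example.

```perl
use strict;
use warnings;

package Tie::Wrapped;    # hypothetical class, for illustration only
use IO::Handle;

sub TIEHANDLE {
    my ($class, $fh) = @_;
    return bless { fh => $fh }, $class;    # keep the real handle inside
}
sub READLINE {
    my $self = shift;
    my $fh   = $self->{fh};
    return wantarray ? readline($fh) : scalar readline($fh);
}
# Explicit methods, deliberately forwarded to the real handle.
sub error    { return $_[0]{fh}->error() }
sub clearerr { return $_[0]{fh}->clearerr() }

package main;
use File::Temp qw(tempfile);

# Scratch file so there is a real descriptor underneath.
my ($tmp, $path) = tempfile(UNLINK => 1);
print {$tmp} "alpha\nbeta\n";
close($tmp) or die "can't close $path: $!";

open(my $real, "<", $path) or die "can't open $path: $!";
tie *WRAPPED, "Tie::Wrapped", $real;

my $n = 0;
$n++ while defined(my $line = <WRAPPED>);  # implicit READLINE calls

my $obj    = tied(*WRAPPED);               # explicit route to the object
my $status = $obj->error() ? "error" : "clean EOF";
print "read $n lines; $status\n";
```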

Are we having fun yet? :-)

--tom

-- 

		    +---------------------+
		    | Thought for the Day |
		    +---------------------+

    The ternary operator is unneeded in Perl, given that 
	    E1 ?  E2 :  E3
    is mainly syntactic sugar for
	    E1 && E2 || E3
    Sure, I wouldn't wanna stack'em, but so too with ?:, eh?
