develooper Front page | perl.perl5.porters | Postings from February 2008

Openvms largefile getpos/setpos

From: Hein, Nashua NH
Date: February 7, 2008 01:50
Subject: Openvms largefile getpos/setpos
Message ID: 62806115-634e-49e2-a5e6-70c03e4103bd@c4g2000hsg.googlegroups.com

Hope this is the right place to ask...  If not, feel free to tell me
to 'go away', as long as you indicate a better place
(http://www.cpanforum.com/ ?).

So I am experimenting with getpos/setpos on OpenVMS for large files
(> 2GB) and am having some trouble with Perl 5.8.6 on I64.

I realize I should try 5.8.10, and will install that ASAP.
And I should probably try the raw C-RTL fgetpos/fsetpos also.
But in the meantime...

My first issue, being an RMS hacker used to RFAs, is that getpos
essentially returns the position of the next record, which may not
exist (yet). I guess that just takes getting used to.

If I look at the returned position variable as 2 unsigned ints, then
I pretty much see a byte offset for sequential files, and an 'encoded'
RFA for indexed files. A cookie perhaps? Perl is probably just
passing along the C-RTL variable, typically declared as: fpos_t posit;

For sequential files, that value suggests it is used as a simple 32-bit
unsigned int, not the opaque 2*32-bit value I would expect to be needed
to handle files > 4GB.
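
To make the expectation concrete, here is a minimal sketch of how an
8-byte position could carry an offset beyond 4GB if it really were two
longwords, low half first. That layout is purely an assumption for
illustration (fpos_t is opaque and the C-RTL does not promise it), and
cookie_to_offset is a hypothetical helper, not part of VMS::Stdio. The
cookies are built with pack so the sketch runs anywhere.

```perl
use strict;
use warnings;

# Hypothetical helper: combine two unsigned 32-bit longwords (low first)
# into a single byte offset. ASSUMPTION: this layout is not documented;
# the real fpos_t is an opaque cookie.
sub cookie_to_offset {
    my ($cookie) = @_;
    my ($low, $high) = unpack('VV', $cookie);   # 'V' = 32-bit little-endian
    return $high * 2**32 + $low;
}

# Simulated cookies standing in for values returned by $fh->getpos.
my $first_mark  = pack('VV', 0x5F5E100,  0);    # 100,000,000 bytes
my $end_of_file = pack('VV', 0x2A05F200, 1);    # 5,000,000,000 bytes

printf "first mark:  %.0f\n", cookie_to_offset($first_mark);
printf "end of file: %.0f\n", cookie_to_offset($end_of_file);
```

With a non-zero second longword, the end of a 5GB file would be
representable. A second longword that is always 0 caps the offset at 4GB.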

For SETPOS, only values up to 2GB seem to work!
It seems something in there is using signed variables!
Or is it me, being clumsy?

Test below.
Comments?
Thanks!
Hein van den Heuvel

To test, I created a 5GB file of 50,000,000 records of 99 bytes + 1
newline. Do NOT use default I/O; it takes too long!
Pre-allocated, of course. Use sqo to prevent high-water marking
from going crazy. 4 large buffers are enough (I only ever saw 2 being
written from while another was being filled).
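
As a quick sanity check on the pre-allocation, the following sketch
works out how many 512-byte disk blocks the file needs and compares that
with the alq=10000000 used in create_big.pl. blocks_needed is a
hypothetical helper, and the sketch assumes exactly 100 bytes per record
with no extra per-record overhead.

```perl
use strict;
use warnings;

my $BLOCK_BYTES = 512;   # OpenVMS disk block size

# Hypothetical helper: blocks required for $records records of
# $record_bytes each, rounded up to a whole block.
sub blocks_needed {
    my ($records, $record_bytes) = @_;
    my $total = $records * $record_bytes;
    return int(($total + $BLOCK_BYTES - 1) / $BLOCK_BYTES);
}

my $needed    = blocks_needed(50_000_000, 100);
my $allocated = 10_000_000;   # alq= value from create_big.pl

printf "needed %d blocks, allocated %d, spare %d\n",
    $needed, $allocated, $allocated - $needed;
```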

--------------------- create_big.pl ------------------
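use VMS::Stdio (qw(&vmsopen));
# Pre-allocate 10,000,000 blocks; sqo = sequential-only access;
# 4 write-behind buffers of 112 blocks each.
my $fh = vmsopen(qw(>big.tmp alq=10000000 fop=sqo ctx=rec mbc=112 mbf=4 rop=wbh))
    or die "$!";
# 9-digit record number + 90-character right-justified '*' + newline = 100 bytes.
printf $fh qq(%09d%90s\n), $_, q(*) foreach (1..50000000);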

-------------------- test_big.pl -----------------
use VMS::Stdio (qw(&vmsopen));
use FileHandle;
my $i = 0;
my $records = 0;
my $position;
my @positions;

my $fh = vmsopen(qw(<big.tmp fop=sqo ctx=rec mbf=8 rop=rah)) or die "$!";
while (<$fh>) {
    # real work goes here...
    $records++; # just count for now.
    if (!($records % 1000000)) {
      $position = $fh->getpos;
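      # View the position as two unsigned 32-bit values
      # ($a/$b avoided: they are sort's special variables).
      my ($low,$high) = unpack('LL',$position);
      printf "$records records at (%0X,%0X) %s\n", $low, $high, substr($_,0,20);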
      $positions[++$i] = $position;
    }
}
print "$records records.\n";
undef $fh;
$fh = vmsopen(qw(<big.tmp ctx=rec mbf=1 mbc=1)) or die "$!";
foreach (1 .. $i) {
  $fh->setpos($positions[$_]);
  my $line = <$fh>;
  printf("%4d %s\n", $_, substr($line,0,20));
}

--------------- result ----------
1000000 records at (5F5E100,0) 001000000
2000000 records at (BEBC200,0) 002000000
3000000 records at (11E1A300,0) 003000000
:
21000000 records at (7D2B7500,0) 021000000
22000000 records at (83215600,0) 022000000
:
42000000 records at (FA56EA00,0) 042000000
43000000 records at (4CCB00,0) 043000000
44000000 records at (642AC00,0) 044000000
:
50000000 records.
   1 001000001
   2 002000001
   3 003000001
:
  21 021000001
  22
:
  42
  43 50328
:
  50 50328
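
The numbers above are consistent with a byte offset that is truncated
to 32 bits by getpos and treated as signed by setpos. The sketch below
redoes the arithmetic for a few checkpoints. It is an interpretation of
the output, not a confirmed diagnosis, and the helper names are
hypothetical.

```perl
use strict;
use warnings;

my $RECORD_BYTES = 100;        # 99 data bytes + 1 newline
my $TWO_32       = 2**32;      # range of an unsigned 32-bit value
my $SIGNED_MAX   = 2**31 - 1;  # largest signed 32-bit value

# Byte offset of the record that follows $n complete records.
sub true_offset { my ($n) = @_; return $n * $RECORD_BYTES }

# The same offset reduced modulo 2**32 (done in floating point so it
# also works on a Perl built with 32-bit integers).
sub wrapped_offset {
    my ($n) = @_;
    my $off = true_offset($n);
    return $off - $TWO_32 * int($off / $TWO_32);
}

for my $millions (21, 22, 42, 43, 44) {
    my $n       = $millions * 1_000_000;
    my $wrapped = wrapped_offset($n);
    my $record  = int($wrapped / $RECORD_BYTES) + 1;  # 1-based record number
    my $skip    = $wrapped % $RECORD_BYTES;           # bytes into that record
    printf "%8d records: true %.0f, 32-bit %X, signed ok: %s, "
         . "lands in record %d at byte %d\n",
        $n, true_offset($n), $wrapped,
        (true_offset($n) <= $SIGNED_MAX ? 'yes' : 'no'),
        $record, $skip;
}
```

At 21,000,000 records the offset (2,100,000,000) still fits in a signed
32-bit value; at 22,000,000 it does not, which is where setpos stops
returning data. At 43,000,000 records the true offset of 4,300,000,000
wraps to 4CCB00 (5,032,704), which is 4 bytes into record 50328 and so
reads back as "50328".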




