perl.perl5.porters

Re: Archive::Tar issue with VMS - M::B::ppm.t

From: John E. Malmberg
Date: May 29, 2008 22:44
Subject: Re: Archive::Tar issue with VMS - M::B::ppm.t
Message ID: 483F942D.5000202@qsl.net

Craig A. Berry wrote:
> On Tue, May 27, 2008 at 12:31 AM, John E. Malmberg <wb8tyw@qsl.net> wrote:
>> Hello Jos,
>>
>> I have been trying to track down an issue with why ppm.t is failing on VMS.
> 
>> It appears that Archive::Tar is creating a corrupt archive, which it can not
>> decode.
> 
> Do you have a small reproducer, preferably a one-liner, that
> demonstrates this problem?

Not yet.  But I think I understand what is happening.  The file that is 
causing the problem is an Alpha binary in both cases.

I just patched my copy of Archive::Tar as a test, and now ppm.t is only 
failing because the files in the archive are all in lower case, and it 
is looking for an exact case match.

What I think is happening is an artifact of the default way that the VMS 
C library does I/O, and my hack bears this out.

Some terms for those not familiar with VMS file formats.

Stream format is the name for how Unix and Windows structure files, 
where they are just a stream of bytes.

Record format is the native VMS type and there are many different 
standard record formats.  Fixed length binary record formats can 
normally be treated the same as stream format.

My hack is to use VMS::Stdio::vmsopen to force the access to be in 
stream mode for all files that Archive::Tar opened.

The files created by Perl are stream-lf format, and when the C library 
reads them, it reads them in stream mode.  When the C library reads a 
record-oriented file, it reads it in record mode.  In record mode, it 
reads only the number of bytes that are in the record, even if more are 
present in the file.  If you read fewer bytes than the record contains, 
the rest of that record is discarded, and the next read starts at the 
beginning of the next record.
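
To make the record-mode behavior described above concrete, here is a 
small simulation in Python.  It is not VMS code and does not call the 
C library; the record layout (two 512-byte records) and the function 
names are assumptions chosen only for illustration.

```python
def read_stream_mode(records, bufsize):
    """Treat the file as one byte stream and read it in bufsize chunks."""
    data = b"".join(records)
    return [data[i:i + bufsize] for i in range(0, len(data), bufsize)]


def read_record_mode(records, bufsize):
    """Simulate record-mode reads: each read consumes one whole record
    but returns at most bufsize bytes of it; the rest is dropped."""
    return [rec[:bufsize] for rec in records]


# Hypothetical file: two fixed-length 512-byte records.
records = [b"A" * 512, b"B" * 512]

stream_bytes = b"".join(read_stream_mode(records, 256))
record_bytes = b"".join(read_record_mode(records, 256))

print(len(stream_bytes))  # 1024: every byte is returned
print(len(record_bytes))  # 512: half of each record was silently lost
```

With a buffer smaller than the record, the record-mode reader returns 
only part of the data without reporting any error, which matches the 
short read described in this message.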

Archive::Tar is reading in the file with the <$fh> operator.  So it looks 
like this issue may have implications for how Perl does I/O.

The problem with forcing stream access is that it will not read normal 
VMS text files correctly.  So stream access should only be forced for 
files with fixed-size record formats, or for indexed files.

Of those types of files, only some, such as .BCK and .EXE files, can 
survive and have their attributes recovered.

The other issue is why the <$fh> read did not pull in the entire file. 
This is why the tarball created was corrupt.  Archive::Tar apparently 
assumed that it had read the file to the size that the file claimed to 
be, and did not realize that it was short.

Again, this may be an issue with how <$fh> reads are implemented that 
has not been noticed before.
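
The missing check described above, comparing the number of bytes 
actually read against the size the file claims, can be sketched as 
follows.  This is a Python illustration only, not Archive::Tar's code; 
`read_checked` and `ShortReadError` are hypothetical names, and the 
check is only meaningful where the reported size is accurate.

```python
import io


class ShortReadError(IOError):
    """Raised when fewer bytes arrive than the reported file size."""


def read_checked(fh, expected_size):
    """Read expected_size bytes from fh, or raise if the data runs short."""
    chunks = []
    total = 0
    while total < expected_size:
        chunk = fh.read(expected_size - total)
        if not chunk:          # end of data before the expected size
            break
        chunks.append(chunk)
        total += len(chunk)
    if total != expected_size:
        raise ShortReadError(
            "expected %d bytes, got %d" % (expected_size, total))
    return b"".join(chunks)


# A source that claims 1024 bytes but only delivers 512.
try:
    read_checked(io.BytesIO(b"\0" * 512), 1024)
except ShortReadError as err:
    print("short read detected:", err)
```

Failing loudly at this point would keep a truncated member from being 
written into an archive under a header that declares the full size.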

And there is another issue.  The file size reported on VMS is accurate 
for stream files and fixed format binary files, but not on normal VMS 
text files.  This is because of the way the files are structured.  The 
VMS library does a conversion to make the normal VMS text files look 
like stream format files.

GNU tar also does a short read when creating a tarball with the VMS 
Alpha executable, but it detects this and pads the archive to make up 
the difference, so the archive is not corrupt, even though the file is.
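
The padding approach described above can be sketched roughly like 
this.  It is a Python illustration of the idea, not GNU tar's actual 
code: `pad_member` is a hypothetical helper, and the only tar detail 
relied on is that member data is stored in 512-byte blocks.

```python
BLOCK = 512  # tar stores member data in 512-byte blocks


def pad_member(data, declared_size):
    """Build the data area for a member whose header declares
    declared_size bytes, zero-filling whatever a short read left out."""
    body = data[:declared_size]
    body += b"\0" * (declared_size - len(body))   # make up the short read
    remainder = len(body) % BLOCK
    if remainder:
        body += b"\0" * (BLOCK - remainder)       # round up to a full block
    return body


# Header says 1000 bytes, but only 700 were actually read.
area = pad_member(b"x" * 700, 1000)
print(len(area))  # 1024: two full blocks, so later members stay aligned
```

The archive stays structurally valid because the data area matches the 
declared size, but the zero-filled tail means the extracted file no 
longer matches the original.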

So once we get this issue fixed in Archive::Tar, I will need to put a 
similar fix into GNU tar for VMS.

-John
wb8tyw@qsl.net
Personal Opinion Only




