perl.perl5.porters | Postings from October 2015

[perl #126414] perl rounds inode in PP stat

Tony Cook via RT
October 27, 2015 03:45
On Tue Oct 20 18:46:47 2015, bulk88 wrote:
> While I was thinking about implementing real inode info in Win32's PP
> stat()[1], I discovered that Perl rounds the inode integer.
> associated ticket
> An inode has 2 uses AFAIK.
> -undelete a file
> -compare hard links to see if they are the same file
> An inode is effectively an opaque pointer (integer) into a FS. If the
> OS's st_ino is 64b (Win32 inode is always 64b per FS driver API) on
> 32b IV perl, it has the potential of being rounded if it's > 2^53 and
> therefore is garbage/uninitialized. In a DB on FS scheme, bad things
> could happen if 2 files that aren't links in reality compare "==" in
> perl as if they were the same file.
> An inode can be implemented in a couple ways by any FS.
> -a 0 based offset into an array of something (offsets or structs) on
> the disk
> -an absolute sector (units of 512 bytes) or byte position into the FS
> partition
> -random looking hash number
> -XOR against a secret of any of the 3 above
> [1]
> The last 2 implementations would very frequently set high bits, giving
> values above 2^53. Worst case scenario for a 0 based inode FS is 2^53
> bytes, which is 9007 TB of storage. I don't think anyone would use 32 bit
> perl on an enterprise SAN/cluster, but that doesn't address the
> theoretical hash/xor/checksum based inode FSes.
> Everyone agrees storing 64 bit C pointers in 64 bit doubles is
> forbidden, so why is perl storing inodes in NVs/doubles? Can something
> be done about this? Fatally error if it's over 2^32 or over 2^53? Store
> the 64 bit integer as a SVPV in printable ASCII and let the user in PP
> figure out what to do with it (Math::Int64 it)?

I can see us producing some sort of error if the inode number changes value when stored as an NV, perhaps with an option to disable that error, since stat() isn't only used to fetch a file's inode number.
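As a rough illustration of what "changes value when stored as an NV" could mean in C, here is a minimal sketch of a round-trip check. It assumes the NV is a plain C double (perl can be built otherwise), and the name ino_fits_in_double is made up for this example; it is not an existing perl API.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not part of perl): returns true if `ino`
 * survives a round trip through a double unchanged. */
bool ino_fits_in_double(uint64_t ino)
{
    /* volatile forces the value to be stored as a real 64-bit double,
     * so extended-precision registers cannot hide the rounding. */
    volatile double nv = (double)ino;

    /* Values close to UINT64_MAX round up to 2^64 as a double, which
     * cannot be converted back to uint64_t; treat them as not fitting. */
    if (nv >= 18446744073709551616.0)
        return false;

    return (uint64_t)nv == ino;
}
```

With this check, 2^53 (9007199254740992) passes, while 2^53 + 1 does not, because the latter rounds to 2^53 when converted to a double.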

As to how to behave when the inode number doesn't fit, we can look at existing implementations, from the Solaris stat() man page:

     The stat(), fstat(), and lstat() functions may fail if:

     EOVERFLOW    One of the members is too large to store in the
                  stat structure pointed to by buf.

Similarly on FreeBSD:

     [EOVERFLOW]        The file size in bytes cannot be represented correctly
                        in the structure pointed to by sb.

If we follow the lead of Solaris/FreeBSD, perl's stat() would simply fail with EOVERFLOW (if available) if the inode number doesn't fit.
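A sketch of that behaviour on a POSIX system, assuming the NV is a C double and EOVERFLOW is defined. Both function names (ino_to_nv, stat_ino_as_nv) are invented for illustration and do not correspond to anything in the perl source.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/stat.h>

/* Hypothetical: convert an inode number to a double, failing with
 * EOVERFLOW if the conversion would change the value. */
int ino_to_nv(uint64_t ino, double *out)
{
    /* volatile keeps the comparison honest on extended-precision FPUs */
    volatile double nv = (double)ino;

    /* 2^64 and above cannot be converted back to uint64_t */
    if (nv >= 18446744073709551616.0 || (uint64_t)nv != ino) {
        errno = EOVERFLOW;
        return -1;
    }
    *out = nv;
    return 0;
}

/* Hypothetical wrapper: stat() the path, then hand back the inode
 * number as a double, or fail the whole call if it does not fit. */
int stat_ino_as_nv(const char *path, double *ino_out)
{
    struct stat st;

    if (stat(path, &st) != 0)
        return -1;              /* errno already set by stat() */

    return ino_to_nv((uint64_t)st.st_ino, ino_out);
}
```

The caller sees the same shape of failure as a native stat() overflow: a -1 return with errno set, rather than a silently rounded inode number.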

As to optionally disabling the error, we could use a variable like ${^WIN32_SLOPPY_STAT}, perhaps ${^SLOPPY_STAT_INO}, or something lexically scoped, to keep the sloppy behaviour restricted to a small amount of code.

Sort of related: systems with large files can run into this issue for st_size:

$ ./perl -le 'print +(stat "/home/tony/somefile.txt")[7]'
$ ls -l ~/somefile.txt
-rw-r--r--  1 tony  tony  90081992547409912 Oct 27 03:43 /home/tony/somefile.txt

(that's a sparse file on ZFS)
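The size shown by ls above is larger than 2^53 and is not a multiple of 16, which is the spacing between adjacent doubles in that range, so a double cannot hold it exactly. One lossless alternative, along the lines of the printable-ASCII idea in the quoted message, is to format the 64-bit value as a decimal string. A minimal sketch follows; the helper name u64_to_decimal is hypothetical.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write `value` as decimal ASCII into buf.
 * Returns 0 on success, -1 if buf is too small (or on output error). */
int u64_to_decimal(uint64_t value, char *buf, size_t buflen)
{
    int n = snprintf(buf, buflen, "%" PRIu64, value);

    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A 21-byte buffer is enough for any uint64_t (20 digits plus the terminating NUL). Passing 90081992547409912 yields the string "90081992547409912" with every digit intact.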


