
Re: binary test and position?

From: ToddAndMargo via perl6-users
Date: February 5, 2019 07:47
Subject: Re: binary test and position?
Message ID: 83a7c259-5ee5-ca07-7088-7a9a6696aa97@zoho.com

On 2/2/19 9:29 PM, Brad Gilbert wrote:
> A sub does not need a `return` statement if it is returning the value of its last statement.
> 
> You also broke the return value of the subroutine that I wrote by
> assigning it to a variable.
> 
> What I wrote would return `Nil` if it failed to find a match, yours
> will return an undefined `Int`.
> It should return `Nil` because that is what `index` would return.
> 
> Doing bytewise conversion from Buf to Str is pointless. It will break
> on Unicode data.
> It would also be exactly the same as converting ASCII if it worked.
> (It won't work on binary data)
> 
> If you are dealing with something that is mostly Unicode but also has
> binary data, decode using 'utf8-c8'.
> 
> If you are dealing with something that is mostly binary, decode using 'latin1',
> or just use raw bytes in a buffer.
> 
>      my Buf $buffer = $fh.read(10);
>      my Str $string = $buffer.decode('latin1');
> 
>      # the first three bytes were really a Utf8 encoded character
>      my Str $char = $string.substr(0,3).encode('latin1').decode('utf8');
>      # or
>      my Str $char = $buffer.subbuf(0,3).decode('utf8');
> 
> Also note that `encode` doesn't always return a Buf.
> 
>      my Buf $buf = Buf.new( 'hello'.encode('utf8') );
> 
> ---
> 
> The subroutine I wrote was simplified to work for an Array or List, not a Buf.
> 
> It is also weird that you are using CamelCase for variables,
> and a mixture of CamelCase and snake-case for the subroutine name.
> 
> 
> Improving your variant, and changing it so the second parameter is a Buf.
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List; # only call .List once
>          my Any $Position is default(Nil) = Nil;
>          my Int $Elems = $Matcher.elems;
> 
>          $Position = $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          return $Position;
>      }
> 
> `$Position` has to be `Any` (or `Mu`) so that it can store the value `Nil`.
> `Nil` sets a variable to its default, so we have to change the default
> with `is default(Nil)`.
> (The normal default is the same as the container type)
> (The `= Nil;` is always pointless in the declaration of a `$scalar` variable.)
> 
> One simplification is to just have the return value as the last thing
> in the subroutine without a `return`.
> (It may also be slightly faster, but not by much.)
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List;
>          my Any $Position is default(Nil) = Nil;
>          my Int $Elems = $Matcher.elems;
> 
>          $Position = $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          $Position; # <------------
>      }
> 
> Assignment is an rvalue, so we can remove that last line.
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List;
>          my Any $Position is default(Nil) = Nil;
>          my Int $Elems = $Matcher.elems;
> 
>          $Position = $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          # <-----------
>      }
> 
> `$Position` is now completely pointless.
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List;
>          my Int $Elems = $Matcher.elems;
> 
>          $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          # ^
>      }
> 
> If you want `return` (even though it isn't doing anything)
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf ) {
>          my List $Matcher = $SubBuf.List;
>          my Int $Elems = $Matcher.elems;
> 
>          return $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>          # ^
>      }
> 
> You could also declare the type of the return value
> 
>      sub Buf-Index ( Buf $Buffer, Buf $SubBuf --> Int ) { # <----
>          my List $Matcher = $SubBuf.List;
>          my Int $Elems = $Matcher.elems;
> 
>          return $Buffer.rotor($Elems => 1- $Elems).first(* eqv $Matcher, :k);
>      }
> 
> Note that `Nil` can sneak around the return value type check.
> 
> ---
> 
> As an added bonus, here is a subroutine that returns all of the indices.
> (Note that the only differences are `grep` rather than `first`, and
> the return type)
> 
>      sub Buf-Indices ( Buf $Buffer, Buf $SubBuf --> Seq ) {
>          my List $Matcher = $SubBuf.List;
>          my Int $Elems = $Matcher.elems;
> 
>          return $Buffer.rotor($Elems => 1- $Elems).grep(* eqv $Matcher, :k);
>      }
> 
> On Sat, Feb 2, 2019 at 10:05 PM ToddAndMargo via perl6-users
> <perl6-users@perl.org> wrote:
>>
>> On 2/2/19 6:09 AM, Brad Gilbert wrote:
>>>       sub buf-index ( Buf $buf, +@match ) {
>>>           my $elems = @match.elems;
>>>           $buf.rotor( $elems => 1 - $elems ).first(* eqv @match.List, :k)
>>>       }
>>>
>>>       my $buf = Buf[uint8].new(0x4D, 0x5A, 0x90, 0x00, 0x03);
>>>
>>>       say buf-index( $buf, (0x90, 0x00, 0x03)); # 2
>>
>> What did I do wrong?
>>
>> First I did a byte wise conversion of
>>
>>      Buf $BinaryFile   to   Str $StrFile
>>
>> and
>>
>>      Buf $VersionInfoBuf  to  Str $VersionInfoStr
>>
>>
>>
>> sub Buf-Index ( Buf $Buffer, +@SubBuf ) {
>>      # `index` for buffers
>>      # $Buffer is the buffer to search through
>>      # +@SubBuf is the sub buffer pattern to search for in $Buffer
>>      # returns the first instance of a match, Nil if no match
>>
>>      my Int $Position = Nil;
>>      my $Elems = @SubBuf.elems;
>>
>>      $Position = $Buffer.rotor( $Elems => 1 - $Elems ).first( * eqv @SubBuf.List, :k );
>>      return $Position;
>> }
>>
>>      $i  = index(     $StrFile,    $VersionInfoStr );
>>      $bi = Buf-Index( $BinaryFile, $VersionInfoBuf );
>>      say "i = <$i>   bi = <$bi>";
>>
>>
>>
>>
>> $ FileVer.pl6
>> i = <11371>   bi = <>
>>
>>
>> 11371 is correct.
>>
>>
>>
>> What did I do wrong?
>>
>> Many thanks,
>> -T


Got it working.  Thank you!
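
For reference, here is a minimal sketch of a working variant, based on the `Buf`-parameter version quoted above. The name `buf-index-sketch` is hypothetical and the sample bytes are borrowed from the earlier example; this is not necessarily the code that ended up in FileVer.pl6. It assumes a non-empty search pattern.

```raku
# Hypothetical name; follows the Buf-parameter variant quoted above.
# Assumes $needle is not empty (an empty pattern is not handled here).
sub buf-index-sketch ( Buf $buffer, Buf $needle ) {
    my List $matcher = $needle.List;   # convert once, outside the search
    my Int  $elems   = $matcher.elems;

    # Overlapping windows of $elems bytes; :k asks `first` for the index
    # of the first matching window. No match yields Nil.
    $buffer.rotor( $elems => 1 - $elems ).first( * eqv $matcher, :k );
}

my $buf = Buf[uint8].new( 0x4D, 0x5A, 0x90, 0x00, 0x03 );

say buf-index-sketch( $buf, Buf[uint8].new( 0x90, 0x00, 0x03 ) );   # 2
say buf-index-sketch( $buf, Buf[uint8].new( 0xFF ) ) === Nil;       # True

# Assigning Nil to a typed variable resets it to its default, so the
# "not found" result turns into an undefined Int rather than staying Nil.
my Int $position = Nil;
say $position === Int;                                              # True
```

The last three lines show why storing the result in a `my Int` variable changes what the caller sees on failure: the variable ends up holding the `Int` type object rather than `Nil`.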

It is a tad slow.  Depending on the file's size, it is
20 to 190 times slower than "index".

I have a lot of thinking to do.


