perl.perl6.language | Postings from May 2004

question regarding rules and bytes vs characters

From: Ph. Marek
Date: May 31, 2004 22:56
Subject: question regarding rules and bytes vs characters
Message ID: 200406010756.41486.philipp.marek@bmlv.gv.at

Hello everybody,

I'm about to teach myself Perl 6 (after using Perl 5 for some time).

One of my first questions deals with regexes.


I'd like to parse data of the form
	Len: 15\n
	(15 bytes data)\n
	Len: 5\n
	(5 bytes data)\n
	\n
	OtherTag: some value here\n
and so on, where the data can (and will) be binary.

I'd try for something like
	my $data_tag= rule { 
		Len\: $len:=(\d) \n 
		$data:=([:u0 .]<$len>)\n  # these are bytes
	};

Is that correct?
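
To make the intended byte semantics concrete, here is a minimal sketch of the same parse written in Python rather than as a Perl 6 rule. It is only an illustration of the format described above: the function name parse_records and the sample data are hypothetical, and the stream is assumed to be opened in binary mode, so every length counts bytes and never characters.

```python
import io

def parse_records(stream):
    """Yield each payload from a binary stream of 'Len: N' records."""
    prefix = b"Len: "
    while True:
        header = stream.readline()            # e.g. b"Len: 15\n"
        if not header.startswith(prefix):
            break                             # blank line or another tag ends the records
        length = int(header[len(prefix):].strip().decode("ascii"))
        payload = stream.read(length)         # exactly `length` bytes of raw data
        if len(payload) != length:
            raise ValueError("truncated payload")
        if stream.read(1) != b"\n":
            raise ValueError("missing newline after payload")
        yield payload

# Hypothetical sample: both payloads contain a newline byte, which is why
# the length has to be honoured instead of scanning for the next "\n".
sample = (
    b"Len: 15\n" + bytes(range(15)) + b"\n"
    b"Len: 5\n" + b"\xff\xfe\n\x00\x01" + b"\n"
    b"\n"
    b"OtherTag: some value here\n"
)
records = list(parse_records(io.BytesIO(sample)))
```

The point of the sketch is that the payload is read by count, so binary data containing newlines or bytes that are not valid UTF-8 passes through untouched; that is the behaviour the :u0 modifier is meant to request inside the rule.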

Furthermore, Perl 6 is said to be Unicode-ready.
So I put the :u0 modifier in the data regex; will that DWIM if I try to match 
a Unicode string with that rule?


Is anything known about the internals of pattern matching, in particular 
whether the hypothetical variables will consume (double) space, that is, 
whether a capture holds a second copy of the matched data?
I'm asking because I imagine getting a tag like "Len: 200000000" and then 
having problems with 256MB of RAM. Matching itself shouldn't be a problem 
according to Apocalypse 5 (see the chapter "RFC 093: Regex: Support for 
incremental pattern matching"), but might I have trouble using the matched 
data?
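
The distinction behind this worry, a capture that copies the matched bytes versus one that only refers to them, can be illustrated outside Perl 6. The following Python sketch says nothing about how Perl 6 implements hypothetical variables; the buffer size and the variable names are assumptions chosen for illustration.

```python
# Stand-in for a large payload held in memory once.
data = bytearray(b"x" * 1_000_000)

# A view refers to the original buffer; no payload bytes are duplicated.
view = memoryview(data)[10:510]

# A copy allocates a second object holding the same 500 bytes.
copy = bytes(data[10:510])

# Changing the buffer shows which of the two shares storage with it.
data[10] = ord("y")
view_sees_change = view[0] == ord("y")
copy_sees_change = copy[0] == ord("y")
```

If captures behaved like the copy, a 200000000-byte payload would need roughly twice that much memory while the match result is alive; if they behaved like the view, the extra cost would be small.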


Thank you for all answers!


Regards,

Phil
