perl.perl5.porters | Postings from September 2012

Re: What is UTF8CACHE for?

Karl Williamson
September 26, 2012 11:06
On 08/24/2012 12:03 PM, Dave Mitchell wrote:
> On Fri, Aug 24, 2012 at 11:32:08AM -0600, Karl Williamson wrote:
>> On 08/24/2012 11:11 AM, Dave Mitchell wrote:
>>> Magic can be attached to a utf8ish SV that caches a length and up to two
>>> further mappings between byte and char offsets. Used chiefly by
>>> sv_len_utf8(), sv_pos_b2u() and sv_pos_u2b(), and thus indirectly by pos()
>>> etc.
>>> See also utf8_mg_pos_cache_update() and utf8_mg_len_cache_update().
>>> It's what stops the pos() and length() functions from having to scan
>>> all or part of a string to convert a byte offset into a char offset
>>> each time.
>> Thank you.  Is it all right if I add a shortened form of this to
>> perlvar.pod?
> Well, it's an internal implementation detail, highly subject to change.
> So perhaps just something to the effect that "it *might* be used to cache
> byte to char length conversions on things like length() and pos()",
> without giving any details like the fact that it's implemented using
> magic etc.

Now done by commit 94df5432700afa9b1cda1919857f958a0af99066
  < This variable was added in Perl v5.8.9.
  > This variable was added in Perl v5.8.9.  It is subject to change or
  > removal without notice, but is currently used to avoid recalculating the
  > boundaries of multi-byte UTF-8-encoded characters.
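
For readers unfamiliar with the idea, here is a rough sketch of what caching
a byte/char offset mapping buys you. It is not Perl's implementation: the
utf8_str struct and the utf8_str_init() and char_to_byte() functions are
hypothetical names invented for illustration, it remembers only a single
offset pair, and it assumes well-formed UTF-8. It shows nothing more than
the general technique of resuming a scan from a remembered position instead
of rescanning from the start of the string.

```c
#include <stddef.h>

/* Hypothetical sketch only, NOT Perl's real data structure:
 * a UTF-8 buffer plus one remembered (char offset, byte offset) pair. */
typedef struct {
    const char *buf;         /* UTF-8 bytes, assumed well-formed */
    size_t      len;         /* length of buf in bytes */
    size_t      cached_char; /* char offset of the last lookup */
    size_t      cached_byte; /* byte offset matching cached_char */
} utf8_str;

/* A byte starts a character unless it is a 10xxxxxx continuation byte. */
static int is_char_start(unsigned char b)
{
    return (b & 0xC0) != 0x80;
}

static void utf8_str_init(utf8_str *s, const char *buf, size_t len)
{
    s->buf = buf;
    s->len = len;
    s->cached_char = 0;
    s->cached_byte = 0;
}

/* Convert a char offset to a byte offset.  If the target is at or past
 * the cached position, the scan resumes from the cache; otherwise it
 * restarts from the beginning.  Offsets past the end are clamped to len. */
static size_t char_to_byte(utf8_str *s, size_t char_off)
{
    size_t c = 0, b = 0;

    if (char_off >= s->cached_char) {
        c = s->cached_char;
        b = s->cached_byte;
    }
    while (c < char_off && b < s->len) {
        b++;                                   /* step past the lead byte */
        while (b < s->len && !is_char_start((unsigned char)s->buf[b]))
            b++;                               /* skip continuation bytes */
        c++;
    }
    s->cached_char = c;                        /* remember for next time */
    s->cached_byte = b;
    return b;
}
```

With a loop that walks forward through a string (as repeated pos()-style
lookups tend to do), each call in this sketch scans only the characters
since the previous lookup rather than the whole prefix.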
