On Tue, Jun 24, 2014 at 11:53:02AM +0100, Dave Mitchell wrote:
> On Sat, Jun 21, 2014 at 01:06:42PM -0600, Karl Williamson wrote:
> > On 06/20/2014 07:53 PM, Mark Martinec (via RT) wrote:
> > >Under perl 5.20.0 the following program fails (or warns) on:
> > >
> > > Malformed UTF-8 character (unexpected end of string)
> > > in substitution iterator at ./test.pl line 16.
>
> I can reduce the demo code to the following:
>
> $ p -Twe '$_ = "XXXX\x{1000}aaaaaaaaaaaaaaaaaXX" . $^X; s/X/"xxxxxx"/ge'
> Malformed UTF-8 character (unexpected end of string) in substitution iterator at -e line 1.
> $
>
> I haven't looked into it any further yet.

Now fixed with the following. A good candidate for 5.20.1

commit cda67c9995c6d90b71a0939aaae084e1869b8248
Author:     David Mitchell <davem@iabyn.com>
AuthorDate: Wed Jul 2 17:13:45 2014 +0100
Commit:     David Mitchell <davem@iabyn.com>
CommitDate: Wed Jul 2 17:22:52 2014 +0100

    s///e on tainted utf8 strings got pos() messed up

    RT #122148:

    In 5.20, commit 25fdce4a165 changed the way pos() was stored in magic
    attached to SVs from being a byte offset to a char offset, *except*
    that, for efficiency, strings being used for pattern matching were
    kept as byte offsets (with a flag indicating thus), *except* where the
    SV already had magic attached (such as taint, as in the bug report and
    in this commit's test), in which case it kept it as chars.

    The code that updated pos() after an iteration of s///e was faulty:
    the string buffer it used for converting byte lengths to char lengths
    (via utf8_length()) was the wrong buffer: rather than using the src
    string being matched against, it was using the destination string
    being built up via iterations of s///. Once double-byte utf8 chars
    were involved, all the pos() calculations went wrong, and utf8
    warnings started mysteriously appearing.

-- 
No man treats a motor car as foolishly as he treats another human being.
When the car will not go, he does not attribute its annoying behaviour to sin, he does not say, You are a wicked motorcar, and I shall not give you any more petrol until you go. He attempts to find out what is wrong and set it right.
    -- Bertrand Russell, Has Religion Made Useful Contributions to Civilization?
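
Editor's illustration of the commit message above: a rough sketch, in Python rather than perl's C internals, of why converting a byte offset to a character offset against the wrong buffer gives the wrong pos(). The function utf8_length() here is a hypothetical stand-in with the same name as the core routine, not its real implementation, and src/dst are a shortened, assumed model of the strings in the reduced test case.

```python
def utf8_length(buf: bytes, start: int, end: int) -> int:
    """Count the characters encoded in buf[start:end].

    Every byte that is not a UTF-8 continuation byte (0b10xxxxxx) starts a
    character, so counting those bytes counts characters.
    """
    return sum(1 for b in buf[start:end] if (b & 0xC0) != 0x80)


# Source string being matched against (shortened form of the test case).
# U+1000 occupies three bytes in UTF-8.
src = "XXXX\u1000aaXX".encode("utf-8")

# Assumed state of the destination string after s/X/"xxxxxx"/ge has handled
# five matches: four replacements, the copied "\x{1000}aa", one more replacement.
dst = ("xxxxxx" * 4 + "\u1000aa" + "xxxxxx").encode("utf-8")

# Byte offset in src just past the fifth match.
byte_pos = src.index(b"aaX") + len(b"aaX")

right = utf8_length(src, 0, byte_pos)  # measured against the source buffer
wrong = utf8_length(dst, 0, byte_pos)  # measured against the wrong buffer

print(byte_pos, right, wrong)  # prints: 10 8 10
```

The same byte offset maps to character 8 in the source but to character 10 when counted in the destination, because the first ten bytes of the destination are all single-byte characters. In the real bug the mismatched buffer could also leave the count ending partway through a multi-byte character, which fits the "unexpected end of string" warning in the report.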