On Thu, Mar 11, 2010 at 01:15:49PM -0700, karl williamson wrote: > Jan Dubois wrote: > >On Mon, 08 Mar 2010, Nicholas Clark wrote: > >>However, even 7-14Mb doesn't scare me. On any demand paged OS, if it's not > >>used, it's not paged in. I doubt that we have any platform where we're > >>that > >>tight on virtual memory space for executables. > > > >It is also a little bit of an issue for people who distribute stand-alone > >applications written in Perl and bundled into a single executable with > >tools > >like PAR, PerlApp or Perl2Exe. That was one case I hadn't considered. > >Right now you can fine tune the size of your bundled Unicode support by > >for example removing the 4 big tables for the major Asian scripts and > >still have Unicode support for all the European languages. Having just one > >huge table would likely make this impossible. > > > >Having such a big footprint is also a huge drag on application startup > >time if they have to extract the table to disk every time the application > >is > >run (yes, you can leave the files behind in a cache area, but some people > >prefer not to leave stuff behind when the application exits). > > > > As a compromise measure, couldn't these tables be dynamically loaded on > demand? I would guess that Perl already has a mechanism to do this, but > if not, a malloc and copy from disk would work. For that a read only mmap() would be better. It allows the memory to be shared between processes (when the files are not extracted temporaries) > >I would also be curious how good those 7-14MB could be compressed, as that > >too would make a big difference for the actual filesize of packaged apps. There are some lovely suggestions for tight data representations in the link that Ævar mailed to the list last night: http://swtch.com/~rsc/regexp/regexp3.html I particularly liked the concept of doing Unicode classes as DFAs on the octets of the UTF-8 representations of the code points. That might be a win for us. (Independent of the more radical ideas of adaptively choosing a DFA or an NFA based on the feature complexity of the pattern compiled) (It also references this insane idea: http://research.swtch.com/2008/03/using-uninitialized-memory-for-fun-and.html A technique with a stated caveat of "This would drive valgrind nuts.") Nicholas ClarkThread Previous | Thread Next