develooper Front page | perl.perl6.internals | Postings from September 2005

Re: [RFC] Debug Segment, HLL Debug Segment And Source Segment

Jonathan Worthington
September 27, 2005 04:00
Rumour has it this thread got warnocked... ;-)  My original task from leo 
was to sort out the PASM and PIR debug segment to handle multiple files.  I 
thought I might try and sort out the HLL debug seg while I was on the job. 
From Roger's input and further discussion on IRC, it seems that we need 
something more clever for the HLL debug seg than the PASM/PIR one.  So, I'll 
back off trying to deal with HLL debug for now (provided my supply of time 
goes on, I'll try and come back to that in the not too distant future) and 
implement something much like I spec'd for PASM and PIR, which only needs a 
simple debug segment with file and line number.
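
As a rough illustration of what a "simple debug segment with file and line 
number" covering multiple files might look like, here is a minimal sketch. 
It is not the format from the RFC: the class name DebugSegment, its fields 
and its methods are all hypothetical, and it simply assumes one line number 
per instruction plus a sorted list of filename mappings.

```python
from bisect import bisect_right
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class DebugSegment:
    """Hypothetical in-memory model of a PASM/PIR debug segment."""

    # One line number per bytecode instruction, indexed by instruction number.
    lines: List[int] = field(default_factory=list)
    # Sorted (first_instruction, filename) pairs. Each mapping applies
    # from its first instruction until the next mapping starts.
    file_mappings: List[Tuple[int, str]] = field(default_factory=list)

    def add_instruction(self, filename: str, line: int) -> int:
        """Record the source location of the next instruction."""
        index = len(self.lines)
        # Only store a new mapping when the file actually changes.
        if not self.file_mappings or self.file_mappings[-1][1] != filename:
            self.file_mappings.append((index, filename))
        self.lines.append(line)
        return index

    def locate(self, index: int) -> Tuple[str, int]:
        """Return (filename, line) for the instruction at `index`."""
        if not 0 <= index < len(self.lines):
            raise IndexError("no debug info for instruction %d" % index)
        starts = [start for start, _ in self.file_mappings]
        pos = bisect_right(starts, index) - 1
        return self.file_mappings[pos][1], self.lines[index]


seg = DebugSegment()
seg.add_instruction("main.pir", 1)
seg.add_instruction("main.pir", 2)
seg.add_instruction("lib.pir", 10)   # e.g. code pulled in from another file
seg.add_instruction("main.pir", 3)
print(seg.locate(2))
```

The filename is stored once per run of instructions rather than once per 
instruction, so switching between files costs one extra mapping entry.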

"Roger Browne" wrote:
> Great! Anything that brings parrot closer to being able to report the
> HLL filename and line numbers is a good thing!
Seems there will be a slightly longer wait on this one now, but this is very 
much needed, I agree.

>> ... the idea would seem to be
>> that this segment can contain source code.  I suspect the intention of it
>> was to store the source code of high level languages rather than PASM or
>> PIR.
> I don't think Parrot should care about what languages are in the source
> segments. If someone is writing directly in PASM or PIR, that can go in
> a source segment. If someone is writing in a high-level language, that
> can go in a source segment. If someone is writing data from which HLL
> code is generated by some utility (e.g. yacc, a UML tool, or a GUI
> designer), that data can go in a source segment too.
> Any kind of source code for which there exists some kind of debugging
> tool is a candidate to go into a source segment. This implies that there
> could be more than one source segment per .pbc file, and more than one
> source location for each opcode. It also implies that (eventually)
> parrot will have a way of knowing how to call all the candidate
> debuggers for a particular bytecode location (according to which source
> language the programmer wants to debug in).
> [Incidentally, source segments may also meet the needs of those who wish
> to distribute source with every application, without burdening those who
> just want to run the compiled code.]
Pretty much agree with this.

> ...
>> 2) Allowing for a reference into the source segment in place of a 
>> filename.
> Some development tools are still going to want the filename, even if
> there is a corresponding source segment in the .pbc file. I think it
> should be possible to include both.
I was thinking of putting the filename in the source segment, so you could 
iterate over the source segments and get the filenames of the source files. 
So the filenames would be there.
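
A minimal sketch of that idea, assuming each source segment carries its own 
filename and the debug data refers to a segment by index. The names 
SourceSegment, source_filenames and filename_for are hypothetical and do not 
come from Parrot.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SourceSegment:
    """Hypothetical source segment: the filename travels with the source."""
    filename: str   # name of the original source file
    text: str       # the source code itself


def source_filenames(segments: List[SourceSegment]) -> List[str]:
    """Iterate over the source segments and collect their filenames."""
    return [segment.filename for segment in segments]


def filename_for(segments: List[SourceSegment], segment_index: int) -> str:
    """Resolve a debug-segment reference (a segment index) to a filename."""
    return segments[segment_index].filename


segments = [
    SourceSegment("main.pir", ".sub main\n.end\n"),
    SourceSegment("lib.pir", ".sub helper\n.end\n"),
]
print(source_filenames(segments))
```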

>> This change is incompatible with the current debug segment format.  But
>> that's OK, we're still in development.
> Sure, but if we're going to change it, let's change it to something
> general that won't need to be changed again after version 1.0 is
> released.
This is the argument that makes me think we should hold off on the HLL debug 
seg for a little while, until somebody (maybe myself) can come up with a 
design that meets the needs of HLLs better.

> This is something that Dan Sugalski mooted in his "WCB: Full bytecode
> metadata" blog entry:
> I like the idea that each HLL can store whatever kind of metadata it
> wants. In particular, I'd like to have my Amber compiler put column
> numbers as well as line numbers into the .pbc file, and perhaps even
> information about which optimizations it has applied.
Yeah, though we also have to consider how Parrot will know what metadata to 
show when an error occurs.  I guess we need something per language that gets 
called along with a reference to the appropriate chunk of meta-data for the 
current location and knows how to render an error message for that language. 
Then just have a default way to dump the data when this is not supplied. 
Also need some thought with regard to how we can efficiently store such 
metadata in a packfile.
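
One way to picture the "something per language" hook with a default dump is 
a small registry of renderer callbacks. This is only a sketch under stated 
assumptions: register_renderer, render_location, the metadata keys and the 
example filename are all hypothetical, and nothing here reflects an actual 
Parrot API.

```python
import json
from typing import Callable, Dict

# Hypothetical registry: language name -> function that renders metadata.
_renderers: Dict[str, Callable[[dict], str]] = {}


def register_renderer(language: str, renderer: Callable[[dict], str]) -> None:
    """Let a language supply its own way of rendering a location."""
    _renderers[language] = renderer


def render_location(language: str, metadata: dict) -> str:
    """Render the metadata chunk for the current location."""
    renderer = _renderers.get(language)
    if renderer is None:
        # Default: dump the raw metadata when no renderer was supplied.
        return "metadata: " + json.dumps(metadata, sort_keys=True)
    return renderer(metadata)


def amber_renderer(meta: dict) -> str:
    # Assumes the compiler stored file, line and column for this location.
    return "%s:%d:%d" % (meta["file"], meta["line"], meta["column"])


register_renderer("amber", amber_renderer)
print(render_location("amber", {"file": "prog.amber", "line": 3, "column": 7}))
print(render_location("unknown", {"line": 5}))
```

The language-specific callback owns the meaning of the metadata, so the core 
only needs to find the right chunk and hand it over.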

>> 3) Still being space-efficient on disk
> Source segments should probably be compressed. There's a lot of
> repetition and whitespace in most source languages, so they tend to
> compress really well. Any reference into the source would be an offset
> into the uncompressed source (which would only need to be uncompressed
> during debugging runs).
Hadn't thought of this...may be a good idea, provided we can find a 
compression algorithm that is cheap to implement and free of legal issues. 
I'll admit now to not knowing a great deal about this kinda stuff.
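
To make Roger's suggestion concrete, here is a small sketch in which the 
source segment is stored compressed and references are byte offsets into the 
uncompressed source. zlib stands in purely as an example of a widely 
available algorithm; the thread does not settle on one, and pack_source and 
source_at are hypothetical helper names.

```python
import zlib


def pack_source(text: str) -> bytes:
    """Compress source text for storage in a (hypothetical) source segment."""
    return zlib.compress(text.encode("utf-8"))


def source_at(segment: bytes, offset: int, length: int) -> str:
    """Fetch `length` bytes at `offset` of the *uncompressed* source.

    Decompression only happens here, i.e. when a debugger asks for source.
    """
    raw = zlib.decompress(segment)
    return raw[offset:offset + length].decode("utf-8")


# Repetitive, whitespace-heavy source compresses well.
source = "    say 'hello'\n" * 50
packed = pack_source(source)
print(len(source.encode("utf-8")), len(packed))
print(source_at(packed, 4, 11))
```

A normal run never touches the compressed bytes, so the cost is paid only 
during debugging runs.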

>> The opcode stream will contain one line number per
>> bytecode instruction.
> You are proposing to use a chain of mappings to record the filename; why
> not use the same system for recording all kinds of metadata including
> line numbers? Sure, there's a small performance penalty - only during
> debugging runs - but there's a worthwhile space saving on disk (because
> typical HLLs produce a lot of bytecodes per line of source).
HLLs do, but for PASM/PIR that isn't the case.  Thus another reason to do 
something different for each.
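
The trade-off can be seen with a toy comparison between one line number per 
instruction and a list of (first_instruction, line) ranges. The data below is 
invented for illustration, and to_ranges is a hypothetical helper.

```python
from typing import List, Tuple


def to_ranges(lines: List[int]) -> List[Tuple[int, int]]:
    """Collapse per-instruction line numbers into (first_instruction, line)
    pairs, emitting a new pair only when the line number changes."""
    ranges: List[Tuple[int, int]] = []
    for index, line in enumerate(lines):
        if not ranges or ranges[-1][1] != line:
            ranges.append((index, line))
    return ranges


# HLL-like: many instructions per source line.
hll_lines = [1] * 8 + [2] * 12 + [3] * 5
# PASM/PIR-like: roughly one instruction per source line.
pasm_lines = list(range(1, 26))

for name, lines in (("HLL", hll_lines), ("PASM", pasm_lines)):
    ranges = to_ranges(lines)
    # Per-instruction storage costs one integer each; ranges cost two each.
    print(name, "per-instruction ints:", len(lines),
          "range ints:", 2 * len(ranges))
```

With the HLL-like data the ranges are much smaller than the per-instruction 
table; with the PASM-like data they are twice as large, which is the reason 
for treating the two cases differently.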


