
Re: Email::Address::XS

From: pali
Date: August 2, 2016 21:03
Subject: Re: Email::Address::XS
Message ID: 201608022303.10081@pali

On Tuesday 02 August 2016 01:00:02 Ricardo Signes wrote:
> * pali@cpan.org [2016-07-12T11:43:02]
> 
> > On Monday 04 July 2016 01:52:41 Ricardo Signes wrote:
> > > I'd stick to header_str, I think, but I'm not sure.  At any rate:
> > > yes.
> > 
> > And this is what I do not like... to pass objects to function with
> > name header_str. That name sounds like it takes string, not object
> > (or more objects)...
> 
> Either we can add a new name, so people end up having to give
> "header_str" and "header_obj" or we can say "in general everything
> uses header_str, which follows these simple rules."  I would rather
> do that.

I can imagine that people would be confused about what header_str
means. It has the suffix _str, so I would expect it to need a (Unicode)
string, not an object... The name "header" is better, as it does not
say that it needs a string.

> > > > Still do not know how to handle non-MIME headers correctly in
> > > > Email::MIME module. We can either create blacklist of non-MIME
> > > > headers and extend it every time when somebody report problem
> > > > or create whitelist of MIME headers... Or let caller to decide
> > > > if his header must be MIME-encoded or not.
> > > 
> > > I'm sorry, I don't understand.  Could you elaborate?
> > 
> > If passed pair (header-name, header-value) needs to be MIME encoded
> > or not. Currently there is blacklist in Email::MIME for header
> > names which are never MIME encoded (like Message-Id, Date, ...)
> > when passing as header_str.
> 
> So, I'd assume we'd go forward with:
> 
> * if you know exactly octets you, the user, want in the header field,
> use "header", but this is likely rare

Do you mean $email->header_raw_set()?

I think it is not rare to MIME-encode a header externally and then
pass the resulting 7-bit ASCII string to $email. At least I see this
usage for the From header (in previous versions of Email::MIME,
encoding of the From/To/Cc headers was totally broken).
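
A minimal sketch of that "encode externally" pattern, using only the
core Encode module. The display name and address are made-up examples,
and the final header_raw_set() call is left as a comment so the snippet
runs without Email::MIME installed:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical sender: display name "Jiří" plus an example.org address.
my $phrase  = "Ji\x{159}\x{ED}";
my $address = 'jiri@example.org';

# Encode ONLY the display name. Encoding the whole "name <address>"
# string would also wrap the address in an encoded-word, which is wrong.
my $encoded_phrase = encode('MIME-Header', $phrase);
my $from           = "$encoded_phrase <$address>";

print "$from\n";

# The caller would then store the already-encoded octets unchanged:
#   $email->header_raw_set(From => $from);
```

The point is that $from is already 7-bit clean, so no further encoding
step should touch it.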

> * if you want to provide a string for a field that's pretty much just
> a string, use header_str and if it requires special handling, we do
> our best, which should get better over time

I fully agree.

> * but if things are complicated, use an object that represents the
> structured data

Yes.

> I don't like the idea that this will be broken further by adding the
> object behavior, though.
> 
>   $email->header_str_set($field => $email->header($field));
> 
> ...should not break things.
> 
> > > "header_str" is "text string" which means it will get encoded.
> > 
> > Not exactly, there are exceptions (Message-Id, Date, ...) plus
> > special behaviour for addresses headers.
> 
> Those /mostly/ still get encoded, but we know that the strings are
> meant to be structured, so we try to deconstruct them and encode
> them correctly. I think those fields that get passed through
> unchanged are probably in error at least insofar as they let you put
> non-7-bit-clean data in your headers.  This should probably be
> fatal:
> 
>   header_str => [ Date => "\N{SMILING FACE WITH HORNS}" ]

Here is the problem: should Email::MIME understand the meaning of email
headers?

--> If yes, then only a valid Date header (according to the RFC!)
should be accepted for Date, and so a Unicode string is disallowed.

--> If not, then Email::MIME should not distinguish between the headers
Date and X-MyOwnDate, and so MIME-encoding a string should be allowed
for any header.

But Email::MIME currently does something in between...

Here we see that header_str does not say (or specify) which kind of
string must be passed as the parameter. A Unicode string? An arbitrary
8-bit string? A 7-bit ASCII string? Or only the visible subset of ASCII
characters?
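
To make those four cases concrete, here is a small sketch. The function
name classify_header_value is hypothetical and is not part of
Email::MIME; it only shows how differently the same parameter could be
interpreted:

```perl
use strict;
use warnings;

# Hypothetical helper: report the narrowest of the four classes that a
# header value fits into.
sub classify_header_value {
    my ($value) = @_;
    return 'visible ASCII'  if $value =~ /\A[\x20-\x7E]*\z/;
    return '7-bit ASCII'    if $value =~ /\A[\x00-\x7F]*\z/;
    # Perl cannot tell raw octets from Latin-1 text here; both are just
    # characters with code points up to 0xFF.
    return '8-bit string'   if $value =~ /\A[\x00-\xFF]*\z/;
    return 'Unicode string';
}

print classify_header_value('Tue, 2 Aug 2016 21:03:00 +0200'), "\n";
print classify_header_value("\x{1F608}"), "\n";   # SMILING FACE WITH HORNS
```

A documented API would have to say which of these classes each method
accepts and what happens to the others (encode, pass through, or die).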

I think we should unify the API for this, and ideally describe in the
documentation how to use it correctly.

That "/mostly/", with special exceptions for Message-Id or Date, is
wrong.

> > Addresses and groups are really something different as previous
> > types (strings). And if we threat them as objects, I would rather
> > see e.g. header_obj (or other different name) instead mixing it
> > again with header_str (which already have exceptions :-(). This is
> > my initial reason for header_addr/grps to distinguish it.
> 
> My feeling is that Perl programmers are used to polymorphic
> interfaces, and that multiplying the number of ways to specify
> headers is a needless confusion. What is the benefit to the end user
> of splitting things up?

I see at least 3 benefits:

1) The function name says what it accepts

2) No ambiguity about which type of string is accepted (ASCII subset,
ASCII or Unicode, as described above)

3) Possible performance optimization (fewer objects are created)

And there is another problem that is still not solved. It must also be
possible to read the From/To/Cc headers from the $email object, and the
user (caller) of the Email::MIME module is sometimes interested in
decomposed address objects (e.g. when they want to parse each email
address in the Cc field) and sometimes only in a single string
representation (e.g. to write the header to STDOUT)...

With explicit $email->header_str(), $email->header_addr() and
$email->header_grps() calls, the user gets the type they want. I cannot
imagine how to achieve that without 3 different calls.
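
A toy sketch of what I mean. Toy::Email is a hypothetical stand-in and
not Email::MIME; header_addr and header_grps are only the names proposed
in this thread, and the string formatting does no quoting or MIME
encoding at all:

```perl
use strict;
use warnings;

package Toy::Email;

# Each argument is [ group name or undef, [ [phrase, address], ... ] ].
sub new {
    my ($class, @groups) = @_;
    return bless { groups => \@groups }, $class;
}

# Structured view: the groups exactly as stored.
sub header_grps { return @{ $_[0]{groups} } }

# Flattened view: every [phrase, address] pair, group names dropped.
sub header_addr { return map { @{ $_->[1] } } $_[0]->header_grps }

# String view: one line, with "Name: ...;" around each named group.
sub header_str {
    my ($self) = @_;
    my @parts;
    for my $group ($self->header_grps) {
        my ($name, $addrs) = @$group;
        my $list = join ', ',
            map { sprintf '"%s" <%s>', $_->[0], $_->[1] } @$addrs;
        push @parts, defined $name ? "$name: $list;" : $list;
    }
    return join ', ', @parts;
}

package main;

my $email = Toy::Email->new(
    [ undef,  [ [ 'Alice', 'alice@example.org' ] ] ],
    [ 'Team', [ [ 'Bob',   'bob@example.org'   ],
                [ 'Carol', 'carol@example.org' ] ] ],
);

print $email->header_str, "\n";                      # for STDOUT
print "$_->[1]\n" for $email->header_addr;           # per-address work
```

Each caller picks the return type by picking the method, so nothing has
to guess what the caller wants.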

----

But if you still prefer a single function which accepts both objects
and strings, let's define its name and how it should act on the
different types of strings and header names. And also how a user of
Email::MIME can retrieve the value of an arbitrary header as a Unicode
string...
