develooper Front page | perl.dbi2.dev | Postings from July 2009

Re: Perl 6 DBI API ideas

Thread Next
From:
Darren Duncan
Date:
July 31, 2009 17:16
Subject:
Re: Perl 6 DBI API ideas
Message ID:
4A738930.4070608@darrenduncan.net
Ken Youens-Clark wrote (to dbi-users):
> Tim,
> 
> I spoke to you briefly at the OSCON last week during the Perl Lightning 
> Talks (I gave one on SQL::Translator).  Though I didn't raise my hand 
> when you asked for people to contribute to a DBI module for Perl 6, I've 
> had some ideas that I thought about sharing.  Part of what I'm thinking 
> is based on the "7 Principles of Better API Design" tutorial I heard 
> Damian give earlier in the week.  There he made a point that a great 
> place to start designing a module is to first decide how you want to 
> use/call it, and I definitely agree with that.  So I thought I might 
> send you some ideas about P6 DBI API.

Ken, thanks for bringing the ideas up.

By the way, there is a dedicated list for this, dbd2-dev@perl.org, which I have 
also sent this reply to.  (There is also a dbdidev@perl.org, but AFAIK that is 
about a lower-level Parrot layer.)

However, I think we need to be more progressive than what you have suggested.  I 
will suggest further enhancements next to your suggestions.  And yes, I *have* 
thought about this a long time.

> 1) I respectfully submit to use "new" instead of "connect" to 
> instantiate a connection.
> 
> And since I always include "{ RaiseError => 1 }," I would request that 
> this argument be the default and that someone would have to turn this 
> off explicitly.
> 
> I would also like to see the DSN change since they all start with "dbi:" 
> -- couldn't that be left off, then?  I would also love to see some 
> standardization on the driver names ("mysql" when it's normally written 
> "MySQL," "Pg" when the db is called "PostgreSQL," etc.).  Also, WRT 
> there being "no standard for the text following the driver name," 
> perhaps there should be?
> 
> use DBI;
> 
> my $dbh = DBI.new( driver => 'MySQL', dsn => 
> 'host=cabot;database=markers30', user => 'web_ro' ... );

One thing I find very distasteful about DBI as it is now is that it has 
hard-coded knowledge and names in it about various separately distributed driver 
plugins, whereas really the DBI shouldn't have any such names.  I refer for 
example to $dbd_prefix_registry in DBI.pm.

So for starters, the "driver" argument or portion of a dsn needs to simply 
contain the actual package name of the module, for example:

   driver => 'DBD::SQLite'

... and this will remove any need for DBI to either know how to map, or for it 
to have a cludgy assumption of prefixing the argument with "DBD::".  And then it 
also gives extension writers more flexibility to name their module what they 
want; Perl in general doesn't ascribe any special meaning to namespaces, as 
those are just a community practice, and DBI shouldn't hard-code this.

Secondly, DBI shouldn't need any special knowledge of a registered prefix to 
invoke driver-provided methods.  Rather, if you want a solution which is most 
similar to the current one, then DBI should just designate a single specific 
prefix, such as "driver_", and since things are all done as object-methods these 
days, no single object would have extra methods for multiple drivers.  And if 
you still want to support non-OO syntax, then people can just invoke eg:

   DBD::SQLite::foo( $dbh, ... )

... or what have you, and the driver can individually choose to export.

For that matter, in the interest of being more Perl 6 savvy, DBI itself should 
mostly just be a set of roles, driver modules be classes that compose/do those 
roles, and users would just 'use' the driver modules directly; eg:

   use DBD::SQLite;

   my $dbh = DBD::SQLite.new( dsn => ... );

Or, if you want to avoid sprinkling your DBMS-agnostic user code with the name 
of a driver, something like DBI's connect could exist as a wrapper for the above:

   use DBI;

   my $dbh = DBI.new( driver => 'DBD::SQLite', dsn => ... );

... much as you're used to now.  And the driver argument could even come from a 
config file / not be in code.

But the point is that $dbh is a *DBD::SQLite* object, that *does* DBI, not *is* 
a DBI object that wraps the driver.

An advantage of working this way is that at the end of the day it is each DBI 
driver itself that chooses how it is implemented and what API it has.  By not 
necessarily going through an extra call layer of the DBI itself, we may get 
better performance.  Including DBI roles in driver classes is at least making a 
programmatic declaration that the driver objects conform to the DBI API, and 
also they can reuse generic code that DBI provides for all drivers by default.

For that matter, thanks to Moose, we can do this in Perl 5 too if we wanted to, 
but at least the Perl 6 version should work that way, and I believe it is the 
best practice for Perl 6.

As a tangent, other roles can exist for reusability; for example DBD::File could 
just be a role, and so eg the new DBD::CSV would compose both DBI and DBD::File. 
  Except that presumably DBD::File would have some other name that doesn't say 
that it is a file itself.

See also my DBI-alike Muldis::Rosetta framework (still in development), 
specifically 
http://search.cpan.org/dist/Muldis-Rosetta/lib/Muldis/Rosetta/Interface.pm , for 
a concrete demonstration of how this might work.  Muldis::Rosetta has, as much 
as possible, the exact same API in Perl 5 and Perl 6, with Perl 6 being the 
driver for appropriate design, and Moose giving that for Perl 5.

Now following that example, the new DBI could also be further abstracted. 
Rather than having "connection" objects" at the root, it could have "virtual 
machine" objects, one per instantiated driver, and "process" objects, the latter 
which most directly correspond to the "connection" of old, except 
connect/disconnect is something you do within the context of a "process", but 
that connect/disconnect is called mount/unmount, analogous to a file system, and 
supporting that you can connect to multiple data stores within one process, 
allowing you to do cross-database queries, such as SQLite and some other DBMSs 
natively support (SQLite calls it ATTACH/DETACH, Oracle calls it CONNECT TO ... 
AUTHORIZED BY ..., and so on), but that you use the same syntax at the DBI level 
to connect to the first database as to the second.  For example:

   use DBI;

   my $dbms = DBI::new_machine( driver => 'DBD::SQLite' );
     # note, $dbms is recommended to be a singleton per driver

   my $dbh = $dbms.new_process();  # always auto-commit by default

   $dbh.mount_db( dsn => ... );  # you can do this more than once

   my $sth = $dbh.compile( $sql );  # aka prepare()

   $dbh.start_trans();  # can do this more'n once too, for nested

   $sth.execute( arg1 => $arg1, arg2 => $arg2 );

   $dbh.execute( $sql2, argX => $argX );

   $dbh.commit_trans();

   $dbh.unmount_db( dsn => ... );  # fails if a transac is curr active

   # later, the $sth/$dbh/$dbms objects can be garbage collected

Another thing DBI can do is have a property of "process"/etc objects, or extra 
arguments to "compile"/"prepare"/etc, where one can explicitly name what 
language the code they are writing is in.

DBI is supposed to be query language agnostic, even if SQL is the most commonly 
used group of languages, and if a user can declare this explicitly, it saves the 
driver from having to guess what they were given, which might be ambiguous.

Mainly this is for when a single driver might support multiple query languages, 
more than one might be code strings, others might be Perl structs such as what 
SQL generation modules take; the potential exists to avoid a lot of work if the 
driver potentially is allowed to take input closer to its internal form, as well 
as in standard SQL or in competitor dialects, etc.

Also especially useful if a driver wants to evolve its API/query-language, and 
having users declare their language makes it less ambiguous for the driver to 
use "compatibility mode" or some such when it needs to.

> 2) Damian harped on having to write things on every call, and I have 
> definitely found the middle hashref arg to most DBI calls to be irksome 
> as for me using MySQL/Pg/SQLite almost all my Perl career, I've only 
> ever passed "{}" (except for "{ Columns => {} }" on 
> "selectall_arrayref").  So I'd love to see that arg go away, and I'm 
> assuming that with P6 sub prototypes that you won't have to worry so 
> much about positional args and can get rid of that anyway.  Perhaps all 
> the args should be named, which is particularly easy with the new pair 
> syntax:
> 
> my $data = $dbh.selectcol_arrayref( :$sql, :@args, :%option );

I think the result of a query execution should just be a data-representing 
driver object, which can then be interrogated to get the data out in the Perl 
structures of your choice.  So eg:

   my $result = $dbh.execute( $sql, %args );

   my @data = $result.as_array();

> 3) Another thing I'd like to suggest is a wholesale revamp of the method 
> names with an eye towards throwing out most of them, or, if not that, 
> then adding a new method that uses something like Contextual::Return's 
> ability to figure out the right thing to return:
> 
> my @data = $dbh.get( :$sql ); # returns an array
> my $iter = $dbh.get( :$sql ); # returns an iterator

Besides being a bad idea usability-wise (what if you want an array in $iter, or 
the result as a ref in just the first element of @data), the Perl 5 feature that 
lets a sub know whether it is called in list context or scalar doesn't exist in 
Perl 6, partly because that messes with Perl's multiple dispatch feature.

Better to just interrogate a result object for the format you want as per my 
previous example.

> Likewise, each member of the array or iterator could be polymorphic 
> depending on how it's used.
> 
> for my $obj ( @data ) {
>     print $obj.{'foo'};
>     print $obj.[0];
> }

That's fair enough I suppose.

> 4) In addition to getting rows back as array/hash refs, perhaps add an 
> object interface either that uses (santized) column names as (read-only) 
> accessors or has a general-purpose accessor.  The idea here is to throw 
> an exception when someone requests a non-existent column:
> 
> my $sql = 'select foo, bar from baz';
> my $iter = $dbh.get( :$sql );
> 
> while ( my $rec = $iter.next ) {
>     my $foo = $rec.foo;
>     my $bar = $rec.value('bar');
>     my $q1  = $rec.quux; # exception as there is no "quux" selected
>     my $q2  = $rec.value('quux'); # another exception
> }

That might work.

Personally I think of the whole concept of duplicate or missing column names to 
be a terrible idea, but that's more of a SQL problem than a DBI problem; SQL 
should be more clean in that regard but it isn't.

> I added something like this in my Text::RecordParser module:
> 
>     http://search.cpan.org/dist/Text-RecordParser/lib/Text/RecordParser/Object.pm 
> 
> That's all I can think of for now.
> 
> ky

-- Darren Duncan

Thread Next


nntp.perl.org: Perl Programming lists via nntp and http.
Comments to Ask Bjørn Hansen at ask@perl.org | Group listing | About