Front page | perl.dbi2.dev |
Postings from July 2009
Re: Perl 6 DBI API ideas
Thread Next
From:
Darren Duncan
Date:
July 31, 2009 17:16
Subject:
Re: Perl 6 DBI API ideas
Message ID:
4A738930.4070608@darrenduncan.net
Ken Youens-Clark wrote (to dbi-users):
> Tim,
>
> I spoke to you briefly at the OSCON last week during the Perl Lightning
> Talks (I gave one on SQL::Translator). Though I didn't raise my hand
> when you asked for people to contribute to a DBI module for Perl 6, I've
> had some ideas that I thought about sharing. Part of what I'm thinking
> is based on the "7 Principles of Better API Design" tutorial I heard
> Damian give earlier in the week. There he made a point that a great
> place to start designing a module is to first decide how you want to
> use/call it, and I definitely agree with that. So I thought I might
> send you some ideas about P6 DBI API.
Ken, thanks for bringing the ideas up.
By the way, there is a dedicated list for this, dbd2-dev@perl.org, which I have
also sent this reply to. (There is also a dbdidev@perl.org, but AFAIK that is
about a lower-level Parrot layer.)
However, I think we need to be more progressive than what you have suggested. I
will suggest further enhancements next to your suggestions. And yes, I *have*
thought about this a long time.
> 1) I respectfully submit to use "new" instead of "connect" to
> instantiate a connection.
>
> And since I always include "{ RaiseError => 1 }," I would request that
> this argument be the default and that someone would have to turn this
> off explicitly.
>
> I would also like to see the DSN change since they all start with "dbi:"
> -- couldn't that be left off, then? I would also love to see some
> standardization on the driver names ("mysql" when it's normally written
> "MySQL," "Pg" when the db is called "PostgreSQL," etc.). Also, WRT
> there being "no standard for the text following the driver name,"
> perhaps there should be?
>
> use DBI;
>
> my $dbh = DBI.new( driver => 'MySQL', dsn =>
> 'host=cabot;database=markers30', user => 'web_ro' ... );
One thing I find very distasteful about DBI as it is now is that it has
hard-coded knowledge and names in it about various separately distributed driver
plugins, whereas really the DBI shouldn't have any such names. I refer for
example to $dbd_prefix_registry in DBI.pm.
So for starters, the "driver" argument or portion of a dsn needs to simply
contain the actual package name of the module, for example:
driver => 'DBD::SQLite'
... and this will remove any need for DBI to either know how to map, or for it
to have a cludgy assumption of prefixing the argument with "DBD::". And then it
also gives extension writers more flexibility to name their module what they
want; Perl in general doesn't ascribe any special meaning to namespaces, as
those are just a community practice, and DBI shouldn't hard-code this.
Secondly, DBI shouldn't need any special knowledge of a registered prefix to
invoke driver-provided methods. Rather, if you want a solution which is most
similar to the current one, then DBI should just designate a single specific
prefix, such as "driver_", and since things are all done as object-methods these
days, no single object would have extra methods for multiple drivers. And if
you still want to support non-OO syntax, then people can just invoke eg:
DBD::SQLite::foo( $dbh, ... )
... or what have you, and the driver can individually choose to export.
For that matter, in the interest of being more Perl 6 savvy, DBI itself should
mostly just be a set of roles, driver modules be classes that compose/do those
roles, and users would just 'use' the driver modules directly; eg:
use DBD::SQLite;
my $dbh = DBD::SQLite.new( dsn => ... );
Or, if you want to avoid sprinkling your DBMS-agnostic user code with the name
of a driver, something like DBI's connect could exist as a wrapper for the above:
use DBI;
my $dbh = DBI.new( driver => 'DBD::SQLite', dsn => ... );
... much as you're used to now. And the driver argument could even come from a
config file / not be in code.
But the point is that $dbh is a *DBD::SQLite* object, that *does* DBI, not *is*
a DBI object that wraps the driver.
An advantage of working this way is that at the end of the day it is each DBI
driver itself that chooses how it is implemented and what API it has. By not
necessarily going through an extra call layer of the DBI itself, we may get
better performance. Including DBI roles in driver classes is at least making a
programmatic declaration that the driver objects conform to the DBI API, and
also they can reuse generic code that DBI provides for all drivers by default.
For that matter, thanks to Moose, we can do this in Perl 5 too if we wanted to,
but at least the Perl 6 version should work that way, and I believe it is the
best practice for Perl 6.
As a tangent, other roles can exist for reusability; for example DBD::File could
just be a role, and so eg the new DBD::CSV would compose both DBI and DBD::File.
Except that presumably DBD::File would have some other name that doesn't say
that it is a file itself.
See also my DBI-alike Muldis::Rosetta framework (still in development),
specifically
http://search.cpan.org/dist/Muldis-Rosetta/lib/Muldis/Rosetta/Interface.pm , for
a concrete demonstration of how this might work. Muldis::Rosetta has, as much
as possible, the exact same API in Perl 5 and Perl 6, with Perl 6 being the
driver for appropriate design, and Moose giving that for Perl 5.
Now following that example, the new DBI could also be further abstracted.
Rather than having "connection" objects" at the root, it could have "virtual
machine" objects, one per instantiated driver, and "process" objects, the latter
which most directly correspond to the "connection" of old, except
connect/disconnect is something you do within the context of a "process", but
that connect/disconnect is called mount/unmount, analogous to a file system, and
supporting that you can connect to multiple data stores within one process,
allowing you to do cross-database queries, such as SQLite and some other DBMSs
natively support (SQLite calls it ATTACH/DETACH, Oracle calls it CONNECT TO ...
AUTHORIZED BY ..., and so on), but that you use the same syntax at the DBI level
to connect to the first database as to the second. For example:
use DBI;
my $dbms = DBI::new_machine( driver => 'DBD::SQLite' );
# note, $dbms is recommended to be a singleton per driver
my $dbh = $dbms.new_process(); # always auto-commit by default
$dbh.mount_db( dsn => ... ); # you can do this more than once
my $sth = $dbh.compile( $sql ); # aka prepare()
$dbh.start_trans(); # can do this more'n once too, for nested
$sth.execute( arg1 => $arg1, arg2 => $arg2 );
$dbh.execute( $sql2, argX => $argX );
$dbh.commit_trans();
$dbh.unmount_db( dsn => ... ); # fails if a transac is curr active
# later, the $sth/$dbh/$dbms objects can be garbage collected
Another thing DBI can do is have a property of "process"/etc objects, or extra
arguments to "compile"/"prepare"/etc, where one can explicitly name what
language the code they are writing is in.
DBI is supposed to be query language agnostic, even if SQL is the most commonly
used group of languages, and if a user can declare this explicitly, it saves the
driver from having to guess what they were given, which might be ambiguous.
Mainly this is for when a single driver might support multiple query languages,
more than one might be code strings, others might be Perl structs such as what
SQL generation modules take; the potential exists to avoid a lot of work if the
driver potentially is allowed to take input closer to its internal form, as well
as in standard SQL or in competitor dialects, etc.
Also especially useful if a driver wants to evolve its API/query-language, and
having users declare their language makes it less ambiguous for the driver to
use "compatibility mode" or some such when it needs to.
> 2) Damian harped on having to write things on every call, and I have
> definitely found the middle hashref arg to most DBI calls to be irksome
> as for me using MySQL/Pg/SQLite almost all my Perl career, I've only
> ever passed "{}" (except for "{ Columns => {} }" on
> "selectall_arrayref"). So I'd love to see that arg go away, and I'm
> assuming that with P6 sub prototypes that you won't have to worry so
> much about positional args and can get rid of that anyway. Perhaps all
> the args should be named, which is particularly easy with the new pair
> syntax:
>
> my $data = $dbh.selectcol_arrayref( :$sql, :@args, :%option );
I think the result of a query execution should just be a data-representing
driver object, which can then be interrogated to get the data out in the Perl
structures of your choice. So eg:
my $result = $dbh.execute( $sql, %args );
my @data = $result.as_array();
> 3) Another thing I'd like to suggest is a wholesale revamp of the method
> names with an eye towards throwing out most of them, or, if not that,
> then adding a new method that uses something like Contextual::Return's
> ability to figure out the right thing to return:
>
> my @data = $dbh.get( :$sql ); # returns an array
> my $iter = $dbh.get( :$sql ); # returns an iterator
Besides being a bad idea usability-wise (what if you want an array in $iter, or
the result as a ref in just the first element of @data), the Perl 5 feature that
lets a sub know whether it is called in list context or scalar doesn't exist in
Perl 6, partly because that messes with Perl's multiple dispatch feature.
Better to just interrogate a result object for the format you want as per my
previous example.
> Likewise, each member of the array or iterator could be polymorphic
> depending on how it's used.
>
> for my $obj ( @data ) {
> print $obj.{'foo'};
> print $obj.[0];
> }
That's fair enough I suppose.
> 4) In addition to getting rows back as array/hash refs, perhaps add an
> object interface either that uses (santized) column names as (read-only)
> accessors or has a general-purpose accessor. The idea here is to throw
> an exception when someone requests a non-existent column:
>
> my $sql = 'select foo, bar from baz';
> my $iter = $dbh.get( :$sql );
>
> while ( my $rec = $iter.next ) {
> my $foo = $rec.foo;
> my $bar = $rec.value('bar');
> my $q1 = $rec.quux; # exception as there is no "quux" selected
> my $q2 = $rec.value('quux'); # another exception
> }
That might work.
Personally I think of the whole concept of duplicate or missing column names to
be a terrible idea, but that's more of a SQL problem than a DBI problem; SQL
should be more clean in that regard but it isn't.
> I added something like this in my Text::RecordParser module:
>
> http://search.cpan.org/dist/Text-RecordParser/lib/Text/RecordParser/Object.pm
>
> That's all I can think of for now.
>
> ky
-- Darren Duncan
Thread Next
-
Re: Perl 6 DBI API ideas
by Darren Duncan