[PEAK] Cross-platform stored-procedure support
Phillip J. Eby
pje at telecommunity.com
Thu Dec 11 15:36:53 EST 2003
The Python DBAPI is somewhat underspecified (and poorly supported) for
stored procedure invocation. For example, it lacks support for named
(non-positional) arguments. Different database drivers, notably
ObjectCraft's Sybase driver and DCOracle2, have very different
interpretations of various aspects of the specification. Finally, the
actual API (even if implemented as specified) is quite inconvenient to use,
being stuck on a cursor and forcing both input and output parameters to be
passed in *and* out of the procedure.
I'd like to give PEAK a stored procedure mechanism that is:
* Cross-platform
* Convenient
* Covers all functionality of the platform(s)
One way to accomplish this is with per-driver thunks in the 'appConfig'
map. However, it's extremely tedious to hand-write marshalling code for
every procedure, and on platforms with stored procedures that *do* support
keyword arguments, it's not necessary.
First, let's look at cross-platform issues. Some platforms return result
sets from procedures, others return only output values, and some return
both, so we have to support both. Some support only positional parameters;
others accept keyword or positional parameters. Some allow default values,
others don't. Then there's the whole input/output parameter thing. Yuck.
From my point of view, the ideal interface would be to access something
like 'db.procs.someProcName(foo,bar=baz)' and get back whatever it is I'm
looking for, which might or might not include a cursor to get results
from. I'd like positional and keyword parameters to be possible on input,
and I'd like to receive only relevant results on output.
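In other words, the hoped-for calling experience would look something like
this (the procedure and result names are made up for illustration):

    # positional and keyword inputs in; only the relevant outputs back,
    # including a cursor when there's a result set to iterate over
    widget, wadget, baz, cursor = db.procs.someProcName(foo, bar=baz)
    for row in cursor:
        ...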
This doesn't seem possible to do in the general case. First, we'd have to
be able to know -- for each procedure -- which parameters were input
parameters, and which parameters were output parameters. We'd need to know
what order the parameters were in, and what their default values should
be. And, for the crazy databases that want their output parameters passed
*in*, we'd need to know where to stick those parameters into the input.
Except for that last item, this information is most compactly represented
as a Python function signature, e.g.:
    def foo(bar, baz=42, spam="Default"):
        return "widget", "wadget", baz, storage.ICursor
The idea here is that this is a function that has three inputs (bar, baz,
and spam). Of these, 'baz' is an "in/out" parameter, and there are
'widget' and 'wadget' output parameters. The expected return from this
function is widget, wadget, baz, and a cursor. We can create a
ProcSignature class that extracts this information from the Python function
object, and provides it in a more digestible form. (We'll call the
function with numeric positional arguments, then check the results for
strings, numbers, and ICursor or IRow instances, replacing numbers with the
name of the parameter in that position. And we'll assume a return value of
'None' means no output.)
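Here's a minimal sketch of that extraction in plain Python, with a stand-in
ICursor class replacing storage.ICursor; the ProcSignature name matches the
proposal above, but everything else is purely illustrative:

    import inspect

    class ICursor:
        pass    # stand-in for storage.ICursor, used only as a marker

    class ProcSignature:
        def __init__(self, func):
            spec = inspect.getfullargspec(func)
            self.name = func.__name__
            self.inputs = spec.args
            defaults = spec.defaults or ()
            # Pair trailing parameters with their declared defaults.
            self.defaults = dict(zip(reversed(spec.args),
                                     reversed(defaults)))
            # Call with each parameter's *position*, so inputs echoed
            # back in the return tuple can be recognized by index.
            result = func(*range(len(spec.args)))
            self.outputs = []
            if result is None:              # None means no output
                return
            if not isinstance(result, tuple):
                result = (result,)
            for item in result:
                if isinstance(item, int):
                    # A number is an echoed input: an in/out parameter.
                    self.outputs.append(spec.args[item])
                else:
                    # A string names an output parameter; anything else
                    # (e.g. ICursor) marks an expected result type.
                    self.outputs.append(item)

    def foo(bar, baz=42, spam="Default"):   # the example from above
        return "widget", "wadget", baz, ICursor

    sig = ProcSignature(foo)
    # sig.inputs  -> ['bar', 'baz', 'spam']
    # sig.outputs -> ['widget', 'wadget', 'baz', ICursor]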
This information is "cross-platform" in the sense of representing the
desired result. But, the corresponding information about how to map this
to a particular back-end is missing. Such mappings may be shared for all
procedures on a given back-end, or may be specific to a particular
procedure, so we need a property namespace of procedures to define their
mapping.
A procedure mapper should accept a ProcSignature and a database connection,
returning a callable that implements the signature. Each backend should
provide a default mapper that gives a reasonable interpretation of the
signature.
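As a rough illustration, a naive default mapper might look like this,
continuing the ProcSignature sketch above; it leans on the standard DBAPI
callproc(), whose output handling varies enough across drivers that this is
only one plausible interpretation:

    def defaultMapper(signature, connection):
        def call(*args, **kw):
            params = dict(signature.defaults)            # declared defaults
            params.update(zip(signature.inputs, args))   # positional args
            params.update(kw)                            # keyword args
            cursor = connection.cursor()
            result = cursor.callproc(
                signature.name, [params[n] for n in signature.inputs]
            )
            # A real mapper would pick out only the declared outputs
            # here, and hand back the cursor when the signature says
            # a result set is expected.
            return result
        return call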
If the functions used to create ProcSignatures are grouped in an interface
(perhaps a special interface subclass), then that interface can be used as
a basis for adapting. That is, we could use:
    db = binding.Obtain("some URL", adaptTo=IMyAppProcedures)
in order to obtain a database connection object that had been extended with
the methods described by the procedure signatures in IMyAppProcedures. By
adding a "getMappedProcedure" method to database connection objects, it
should be possible to write an adapter class that would decorate the
database connection with methods corresponding to the ones defined by the
interface.
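To make that concrete, here's a toy version of such an adapter, reusing the
sketches above; the interface class and the loop over its functions are
illustrative, not how PEAK's binding and adaptation machinery actually
works:

    class IMyAppProcedures:
        # Each function here is a ProcSignature source, as described above.
        def foo(bar, baz=42, spam="Default"):
            return "widget", "wadget", baz, ICursor

    class ProcedureAdapter:
        # Decorate a connection with one method per signature in the
        # interface, via the proposed 'getMappedProcedure' method.
        def __init__(self, connection, interface):
            for name, func in vars(interface).items():
                if name.startswith('_') or not callable(func):
                    continue
                sig = ProcSignature(func)
                setattr(self, name, connection.getMappedProcedure(sig))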
For some backends, the mappings will require tedious specification of the
precise signature of each procedure. However, it's also possible that many
mappings will be specifiable by policy. For example, if a company has a
practice of always declaring output parameters after in/out parameters,
which in turn come after input parameters, then they can write a single
policy-based mapping function instead of hand-specifying a mapping for
every procedure.
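For instance, under the "inputs, then in/out, then outputs" convention just
mentioned, a site-wide mapper could derive the whole positional call list
from the signature alone (a sketch, again building on the pieces above):

    def conventionMapper(signature, connection):
        # Under the convention, the call list is: pure inputs, then
        # in/out parameters, then placeholders for pure outputs.
        inOut   = [n for n in signature.inputs if n in signature.outputs]
        pureIn  = [n for n in signature.inputs if n not in signature.outputs]
        pureOut = [o for o in signature.outputs
                   if isinstance(o, str) and o not in signature.inputs]
        def call(*args, **kw):
            params = dict(signature.defaults)
            params.update(zip(signature.inputs, args))
            params.update(kw)
            callList = ([params[n] for n in pureIn + inOut]
                        + [None] * len(pureOut))   # output placeholders
            cursor = connection.cursor()
            return cursor.callproc(signature.name, callList)
        return call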
Also, for platforms without stored procedure support, it will naturally be
possible to implement the procedures in Python code, and hook them in via
the mapper subsystem. To make this more convenient, we could specify that
the connection checks for either a mapper (thing that converts a signature
to a callable) or a mapping (the callable). That way, it would be easier
to specify either one in a given situation.
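One possible shape for that check, using a hypothetical Mapper wrapper to
mark mapper factories, so that a bare callable can be taken as the finished
procedure:

    class Mapper:
        # Marks a factory that turns (signature, connection) into a callable.
        def __init__(self, factory):
            self.factory = factory

    def resolveProcedure(entry, signature, connection):
        if isinstance(entry, Mapper):
            return entry.factory(signature, connection)  # build the callable
        return entry   # a plain callable is the procedure itself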
One thing I'm not sure of is whether the configuration namespaces should be
segregated by the target interface. That is, what if there is more than one
'foo' procedure, used by different interfaces? It seems like a YAGNI, though,
since one could always rename the procedures 'foo1' and 'foo2' in the
interfaces, and then map 'foo1' and 'foo2' back to a 'foo' procedure in the
underlying database(s).
Whew. That's a bit complex, but all in all, it sounds workable. Any thoughts?