[PEAK] A ridiculously simple mapping layer
Phillip J. Eby
pje at telecommunity.com
Mon Jul 26 15:21:13 EDT 2004
"The obscure, we can do right now. The obvious takes a little longer."
It seems I've been overlooking an obvious, dirt-simple, slap-your-forehead,
why-didn't-I-think-of-that way to specify an object mapping layer, the
bridge from abstract model to concrete implementation, because I was too
busy trying to figure out how to create registries of classes to tables or
some such. And all along, it could just be something as simple as this:
# === mymodel.py =============

class Invoice(model.Element):
    # etc...

class Customer(model.Element):
    # blah blah


# === mystorage.py =============

class PhysicalSchema1(storage.Schema):
    Invoice = mapTable('invoices', someField=someConverter, ...)
    Customer = mapTable('customers', otherField=thus_and_such, ...)
    # etc.

class PhysicalSchema2(PhysicalSchema1):
    # Customer is mapped differently in this version of the schema,
    # but everything else is the same
    Customer = mapTable('customers', otherField=sturm_und_drang, ...)

# === myapp.ini =============

[Storage Mappings]
mymodel = mystorage.PhysicalSchema2   # or 1, or whatever...


# === make_invoice.py =============

from peak.api import *
import mymodel

root = config.makeRoot(iniFiles=[("peak","peak.ini"), "myapp.ini"])

# create a workspace for the model
ws = storage.newWorkspace(root, mymodel)

storage.beginTransaction(ws)

# look up a customer
cust = ws.Customer.get(custno='1234')

# create a new invoice
inv = ws.Invoice(customer=cust)

# etc...

storage.commitTransaction(ws)
Duh. D'oh! So simple, yet 100% modular and reusable/extensible. We
define our domain model as a module, or set of modules. We define a
physical schema as a class. Descriptors on the schema map from the "pure"
domain class to an implementation-specific subclass. That is, in the
example above, 'ws.Customer' and 'ws.Invoice' do not return the original
classes from 'mymodel', but rather *subclasses* that know they are part
of that workspace. The subclass (sketched below) is just the original class plus:
1) a mixin (e.g. 'RelationalMixin', 'LDAPMixin', etc.) that defines the
mapping process
2) some metadata to drive the mapping (table name, field names, converters)
3) classmethods such as 'find', 'get', and 'delete'
4) overrides of domain-specific instance or class methods where
necessary for performance or to handle unusual mapping requirements
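To make that concrete, here's a rough sketch in plain Python of how a
workspace might assemble such a subclass. None of these names
('RelationalMixin', 'make_mapped_class', the double-underscore attributes)
are real PEAK APIs; they're purely illustrative:

class RelationalMixin:
    """Hypothetical mixin supplying the relational mapping machinery."""

    @classmethod
    def get(cls, **criteria):
        # A real mapper would issue a SELECT against cls.__table__,
        # applying cls.__converters__ to the fields it reads back.
        raise NotImplementedError

    @classmethod
    def find(cls, **criteria):
        raise NotImplementedError

    @classmethod
    def delete(cls, **criteria):
        raise NotImplementedError

def make_mapped_class(domain_class, table, converters, workspace):
    # (1) the mixin plus the original domain class, (2) metadata to drive
    # the mapping, (3) classmethods supplied by the mixin; (4) per-class
    # overrides could be added to the dict below where needed.
    return type(domain_class.__name__, (RelationalMixin, domain_class), {
        '__table__': table,
        '__converters__': converters,
        '__workspace__': workspace,   # the parent component (see below)
    })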
Further, the new subclass has the workspace object as its parent
component. So, the subclass *itself* is a component, and thus has access
to the workspace, which lets it refer to other implementation-specific
classes. For instance, ws.Invoice can refer to ws.Customer, as long as it
uses a classAttr binding, e.g.
'binding.classAttr(binding.Obtain("Customer"))' to get at it.
Equally important, the implementation-specific classes can bind to the
physical database component(s) they need, via the workspace component's
context.
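Here's an equally hypothetical sketch of that arrangement in plain Python.
A real PEAK workspace would of course be a proper component, and the
lookups would be spelled with 'binding.classAttr(binding.Obtain(...))' as
above rather than direct attribute access; 'Workspace' here is just an
illustration:

class Workspace:
    """Hypothetical container acting as parent component for mapped classes."""

    def __init__(self, db_connection):
        self.db = db_connection   # the physical database component
        self._classes = {}

    def add(self, name, mapped_class):
        mapped_class.__workspace__ = self
        self._classes[name] = mapped_class

    def __getattr__(self, name):
        try:
            return self._classes[name]
        except KeyError:
            raise AttributeError(name)

# From inside a mapped Invoice, the sibling class and the database are
# then both reachable through the parent component:
#     cls.__workspace__.Customer
#     cls.__workspace__.db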
We're not limited to a single module and its enclosed classes, either. If
we allow schemas to contain schemas, then we can map entire packages within
our workspace, using e.g. 'ws.Billing.Invoice' and 'ws.CRM.Customer' to
access them. If this is too awkward, then in principle we can just make
another module that imports the bits we need from the different packages,
possibly renaming them in the process. Due to the way peak.model already
works, classes know their "real" names and modules, so the mapping system
shouldn't be confused by such "repackaging" for programmer convenience.
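In the same speculative notation as mystorage.py above (remember,
'storage.Schema' and 'mapTable' are names coined in this post, not existing
PEAK APIs), nested schemas might look something like:

# === mysite_storage.py (hypothetical) =============

class BillingSchema(storage.Schema):
    Invoice = mapTable('invoices', someField=someConverter, ...)

class CRMSchema(storage.Schema):
    Customer = mapTable('customers', otherField=thus_and_such, ...)

class SiteSchema(storage.Schema):
    Billing = BillingSchema    # exposed as ws.Billing.Invoice
    CRM = CRMSchema            # exposed as ws.CRM.Customer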
It seems it would also be possible within this scheme to "upgrade" a model
to include extensions, without changing the base model. For example, the
mechanism that creates the schema-specific subclass can check a property
namespace to see if the class should be replaced by another, thus allowing
an application to be locally extended. (Thus replacing the "class
replacement" AOP use case.)
Also note that this entire approach works just fine with *multiple
instances* of a given schema. That is, you could have more than one
workspace open with the same schema, but using different databases. Or
different schemas, for that matter, which might be quite useful for
migration. We'll probably also have a few simple schema types that don't
require any metadata to do the mapping. For example, an "in-memory"
mapper, or a "document" mapper that reads/writes pickles or XML. For these
you'd just specify 'mymodel = someGenericMapperClass' in the configuration
and not need to create an explicit mapping definition.
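For instance, using the speculative API from make_invoice.py above, a
migration script might open two workspaces over the same domain model; the
two roots and the 'stateOf' helper are hand-waved here:

old_ws = storage.newWorkspace(old_root, mymodel)   # e.g. PhysicalSchema1, old DB
new_ws = storage.newWorkspace(new_root, mymodel)   # e.g. PhysicalSchema2, new DB

storage.beginTransaction(old_ws)
storage.beginTransaction(new_ws)

for cust in old_ws.Customer.find():
    new_ws.Customer(**stateOf(cust))   # copy each customer's state across

storage.commitTransaction(old_ws)
storage.commitTransaction(new_ws)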
While not all of these mechanisms are interchangeable at the top level of
the code (some require specifying which file to load, when to save it, and
so on, while others need none of that), any code that receives an already-opened
workspace object is not affected by these mapping issues. Only the code
that manages the workspace object's lifecycle has to know anything about
the overall persistence model. (Assuming that we have a query language
that can be applied to the classes exposed by a workspace, and it operates
on storage interfaces supplied by the implementation-specific subclasses.)
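For example, a function like the following (same speculative API as
before) neither knows nor cares whether 'ws' came from a relational
schema, an in-memory mapper, or a pickle/XML document mapper:

def make_invoice_for(ws, custno):
    # All storage details are hidden behind the workspace and its classes.
    cust = ws.Customer.get(custno=custno)
    return ws.Invoice(customer=cust)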
There are many complications lurking in this approach, of course, since for
example I don't know what those "storage interfaces" will consist of, and
won't be entirely sure until I'm trying to develop the query language. I'm
also worried that the complexity of the namespace mapping will start to
blow up once the scope includes module-level functions, constants, and
variables, plus the features and methods of the actual model classes.
On the bright side, though, these are mostly technical difficulties. The
part that's really been bothering me the most in trying to design the "next
generation" model and storage systems has been getting a clean API -- a
simple way to "spell" what we want to do. And now that we have it, my joy
at having a solution is rivalled only by my chagrin at not having thought
of it much, much sooner. Maybe it's just taken me this long to regrow
enough of the brain cells that were destroyed by an overexposure to
corporate groupthink. :)