[PEAK] Persistence styles, MDA, AOP, PyProtocols, and PEAK
Phillip J. Eby
pje at telecommunity.com
Wed Jul 7 02:50:46 EDT 2004
Preface
-------
This is a rather long article, even for me. (And that's saying
something!) However, it addresses a coming "sea change" that will affect
PEAK at many levels, including at least:
* PyProtocols
* peak.model and peak.storage
* peak.metamodels
* the AOP/module inheritance facilities in peak.config.modules
Naturally, I think the changes will be very good ones for PEAK, and they
are unlikely to have effects on existing code that doesn't depend on any of
the above. In particular, I do not expect any destabilizing effects on the
core (except perhaps peak.model) or any primitives. (Some API changes in
PyProtocols 1.0 are almost certain, however, with backwards-compatible APIs
remaining available at least through 1.1.)
However, for the non-core packages, especially metamodels and the AOP
stuff, major upheavals and/or outright replacement or removal are likely.
So... now that I have your attention, let me begin. :)
How We Got Here
---------------
My work at Verio has for a long time been the driving force that defined
requirements for how PEAK would develop. That is, my goals for PEAK
reflected my long-term vision for how software would be developed there,
and thus the software design goals included both social and organizational
targets, as part of an overall development communications methodology
covering matters from early requirements gathering all the way through
deployment and ongoing support/maintenance. You can see this, for example,
in tools like the 'peak.ddt' package, which facilitates requirements
communication and quality control by making a system's conformance to
requirements visible to a group.
Part of this long-term vision included an MDA, or Model-Driven
Architecture, based on UML and other OMG technologies. You can see this
influence as far back as the earliest days of the TransWarp frameworks, and
as far forward as the current peak.metamodels and peak.storage.xmi packages.
However, with my departure from Verio, the company's needs are no longer a
primary source of requirements for future PEAK development. On the
downside, this removes some of the clarity and focus from the requirements.
But on the upside, I now feel more able to take an evolutionary approach to
certain aspects of the system. While at Verio, I felt a (self-imposed)
pressure to build into PEAK only reasonably complete solutions to
the requirements I knew about. This tended to leave some areas of PEAK not
developed at all, while they waited for me to find a "final answer".
Now, I feel more comfortable with prioritizing feature development with
even more YAGNI (You Ain't Gonna Need It) and STASCTAP (Simple Things Are
Simple, Complex Things Are Possible) than ever before. You may notice this
in some of my recent moves to consolidate subsystems and eliminate
dependencies, while simplifying the code base... and there's a lot more
where that came from.
Currently, I consider the bulk of the PEAK framework to be a smashing
success. Nearly all of the core, and even some of the non-core frameworks
are in good shape. While some are far from complete, they have relatively
few unresolved questions or "architecture smells", as noted in February's
STATUS.txt. But there is one gigantic unresolved architecture smell in
PEAK right now, that stinks so bad it gives me headaches. And that's our
story with respect to persistence and MDA.
I think I've probably been railing about this issue for almost a year now,
talking at length about conceptual queries, relational algebra and
calculus, business rules, predicate dispatch, fact orientation and all of
that. Mostly these have been concepts I've grasped at to try and
unify/solidify PEAK's existing persistence and MDA philosophy. And in the
last few weeks, I've come to realize that PEAK's existing philosophy on
these matters is what needs to be revisited. Most importantly, our concept
of the central role of the "domain model" is flawed.
Historically speaking, PEAK's persistence philosophy evolved from Zope's by
way of ZPatterns. The basic concept was that objects were retrieved from
containers, and those containers made the objects seem like ordinary Zope
objects, even though they were retrieved from LDAP or an RDBMS instead of
ZODB. These containers, called "Racks", were then combined with
focal points (called "Specialists") that served as homes for methods
operating on multiple (or zero) objects of a particular interface.
While I'd like to call this concept a brilliant invention by Ty and myself,
it really falls more under the heading of identifying and clarifying a
pattern that we and others were haphazardly following, and turning it into
a library. What we didn't really clarify were the limits of applicability
of this pattern, and that's a big part of what's gotten PEAK's persistence
and MDA philosophies into trouble.
Specifically, the ZPatterns approach was well suited to web-based
applications. Indeed, it's still IMO an ideal way to approach the design
of a web-based application's URL space, and I intend to carry it forward in
peak.web. But it's not only not the best way to structure an application's
internals, it's a *lousy* way. Just look at the PEAK "bulletins" example
code, or any non-trivial example of a PEAK application that uses data managers!
It worked well for Zope 2 applications, of course, because Zope 2
applications are built in the image of their URL space, but this is not so
for "normal" Python apps. You can see in the evolution of TransWarp and
PEAK how there was an implicit assumption of this sort of hierarchical
structuring according to URLs, that we carried with us from ZPatterns.
Another assumption that was carried forward has to do with MDA and the
theory of building an application based on a "domain model", independent of
its storage mechanism. The idea of the domain model is that you put an
application's core behaviors into objects that reflect the application domain.
Unfortunately, this idea scales rather poorly when dealing with large
numbers of objects. Object-oriented languages are oriented towards dealing
with individual objects more so than collections of them. Loops over large
collections are inefficient when compared to bulk operations on an RDBMS,
for example. This means that building practical applications requires
these bulk operations to be factored out, somehow. ZPatterns took the
approach of moving the operations to Specialists, and having domain objects
delegate bulk operations to the specialists.
But this just moves the scalability issue from performance to
development. Now we're writing routines in the domain model that do
nothing but pass the buck to Specialists or DMs, which then have to do
something "real". And every new reporting or analysis requirement for the
application leads to more paired methods of this kind.
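As a concrete (if contrived) illustration of that pairing, here is a minimal
sketch in plain Python; the Customer and CustomerSpecialist names are invented
for the example and are not PEAK or ZPatterns APIs:

class CustomerSpecialist:
    """Hypothetical Specialist: the home of the 'real' bulk operation."""
    def __init__(self, customers):
        self.customers = customers

    def find_past_due(self):
        # In practice this would be a bulk query against an RDBMS,
        # not a Python loop over every domain object.
        return [c for c in self.customers if c.balance > 0]


class Customer:
    """Hypothetical domain object that merely passes the buck."""
    def __init__(self, name, balance, specialist):
        self.name = name
        self.balance = balance
        self.specialist = specialist

    def past_due_customers(self):
        # Every new reporting requirement adds another pair like this one.
        return self.specialist.find_past_due()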
And, it still doesn't address a key scalability issue for virtually any
enterprise-class business application: business rules! Of course, the
domain model itself may have some facility for storing and implementing
rules that are native to the model (such as discount rules), but what about
operational rules like "e-mail the customer a notice when X
happens"? These rules have to be incorporated into the code that does X,
either in a DM or the domain model, and in either case making the system
more brittle. That is, the code is less reusable in other contexts.
Persistence Styles
------------------
These observations have helped me to realize that there are different
"persistence styles", with corresponding scopes of applicability for the
use of domain models. I have dubbed them "fact base", "persistent root",
and "document", where "document" persistence can be thought of as a
specialized form of "persistent root" persistence. The existence of these
styles conflicts with my multi-year assumption that truly "transparent
persistence" in a single framework was possible and practical. That is, my
assumption that one could create a domain model and have it be meaningfully
persisted to arbitrary forms of storage, without the choice of storage ever
being reflected in the domain model.
In fact, this is *only* really practical with a domain model that is
entirely suited to the "document" style (at least with our current
technology -- more on that later). Thus, attempting to build a
one-size-fits-all mechanism will tend to produce a mechanism that's at its
best only for the document style -- as PEAK's is.
But I digress. Let me explain the styles, as I understand them at the
moment. A "fact base" model is one that requires operations over large
numbers of objects, with complex queries and reporting. (By their nature,
most serious "business" applications fall into this category.) In a "fact
base" model, objects are almost always initially retrieved using keys known
to humans or other systems, rather than by navigation from a standard
starting point.
A "persistent root" model is one where there is some distinguished root
object from which the rest of the objects descend, as in ZODB. Large
collections and mass operations are infrequent; direct manipulation of
individual objects is the rule. Zope 2 applications are mostly like this, but
ZCatalog can be seen as an effort to overcome the inherent limitations of
this style once the structure becomes large enough and mass operations
(like searches) become more desirable.
A "document" model is one where the persistent root and all its children
are sufficiently small to allow them to be loaded into memory all at once,
and written out all at once when changes are required. It does not suffer
from the sorts of issues ZCatalog tries to work around, because it does not
need to scale to a size that needs such a thing.
These styles are not 100% mutually exclusive (since a "fact base" might
contain "documents"), nor are their boundaries really fixed. For example,
the "Prevayler" philosophy of persistence effectively says that if you have
enough memory to play with, you can use the "document" model for anything,
as long as you also keep transaction logs. In theory, this would mean that
in another 5 or 10 years, we might be able to use the Prevayler approach
for all applications.
But that's baloney, actually. At best, Prevayler covers the "persistent
root" and "document" models well, where navigation occurs on the basis of
object trees with relatively low fan-out. Without additional structures
performing functions analogous to those of ZCatalog, applications with
"fact base" characteristics (like needing to look up one out of millions of
transactions by an order number) just don't work out-of-the-box under a
"prevalent" architecture: you end up having to design your own data
structures to deal with these issues.
Currently, PEAK's DM system attempts to support all of the styles, and thus
ends up really only being *good* at the "document" model. This is
well-illustrated by the fact that the only implemented DM's in PEAK that
are not just tests or examples are document-based: the XMI DM in
peak.storage.xmi, and the HTMLDocument DM in peak.ddt! The only drawback
to PEAK DM's in the "document" and "persistent root" models is that DM's
require a special key to indicate the root object. Apart from this, they
work quite well.
But for the "fact base" model, DM's aren't really very good with dealing
with multiple objects, and you have to write a *lot* of them, typically one
per type, plus one for every kind of query. Of course, I've mentioned that
a lot of times before, and that I intend to do something about it, often
accompanied by lots of hand-waving about "fact orientation". :)
Mostly, though, I've been thinking in terms of how to expand or revise DM's
to accommodate fact-orientation or SQL mapping. I see now, however, that
this is really not the right place to start, since DM's do just fine for
the models they currently cover, and they are not well-suited to such a
transformation.
So, instead, we need tools that are fact-oriented from the ground up. For
the most part, this means simply abstracting SQL and a physical schema,
replacing them with a logical schema. Queries and commands issued against
the logical schema can then be translated to SQL or other operations
appropriate to the back-end.
In order to make that work, we're going to need several things, including:
* A model for facts
* A query language that can be mapped to either SQL or in-memory Python objects
* A data management API that's query-focused and easily extended
* A mechanism for defining mappings between an abstract fact model, and one
or more concrete storage models
MDA and AOP
-----------
As I said, our theory -- hypothesis, really, or perhaps just an article of
faith -- was that the answer to all of these issues could be found in an
MDA (Model-Driven Architecture) implemented using AOP (Aspect-Oriented
Programming).
More specifically, I knew that we needed to separate storage-related
concerns from application concerns, and that further, we needed a way to
reuse, extend, and refine models without changing source code (so that
versions of an application that were simultaneously targeted at different
markets could share common source code despite differences in domain models
and business rules).
This was the theory driving TransWarp through most of its history, and to
some extent it continued on into PEAK. TransWarp, however, never really
produced much of direct value, and to the best of my knowledge PEAK's AOP
facilities are currently only used by maybe one person/company -- and it's
not me or anybody at Verio. :)
There are several reasons for this. At least one of them is that I never
succeeded in writing a really good AOP tool! But I think an even bigger
reason is that most of the things I wanted to do with AOP turned out to be
easier to do and understand in other ways. For example, PyProtocols and
protocol adaptation made possible lots of things I'd planned to do with
AOP. So did the evolution of peak.binding and peak.config.
There are only two things left that the AOP stuff was intended to do (in
the sense of being important enough to spend the time on developing it in
the first place):
* Allow variations of an application to change domain object classes to
use different collaborator classes
* Allow mixing domain-specific behavior into classes that were
automatically generated from UML or other modelling tools
The first of these is in fact the only use case I know of where anybody
might actually be using PEAK's AOP facility. I don't personally expect to
have either use case any time soon, though. I *do* expect to still
accommodate both scenarios in the future; it's just that the way of
achieving them may have to change.
Whether the way changes or not, though, I do not currently intend to
continue maintaining the AOP code or distributing it with PEAK, as there
are much better ways of doing this now than my quirky bytecode recompiler
mechanism. However, if there's enough interest or need, I might be willing
to spin the code off into a separate, "user-supported" distribution.
So, if you are currently using PEAK's AOP facilities in any way, *please*
post what you're using it for now, so I can make sure I'm not overlooking
any use cases in the transition to a post-AOP PEAK.
Going Beyond AOP (with 40-Year Old Technology!)
-----------------------------------------------
"Greenspun's Tenth Rule of Programming: any sufficiently complicated C or
Fortran program contains an ad hoc informally-specified bug-ridden slow
implementation of half of Common Lisp."
-- Phil Greenspun
Apparently, this rule applies to Java as well. I recently ran across the
book-in-progress, "Practical Common Lisp" (available at
http://www.gigamonkeys.com/book/ ), only to discover a surprising
similarity between this chapter:
http://www.gigamonkeys.com/book/object-reorientation-generic-functions.html
And the AspectJ language. In fact, as I did more research, I began to
discover that the only real plusses AspectJ has over what's available in
Common Lisp are that it 1) works with "oblivious" code that wasn't written
with AOP or extensibility in mind, and 2) uses predicates.
Now, before anybody panics, I am *not* planning to rewrite PEAK in Common
Lisp! But I *do* want what they've got, which includes many things AspectJ
does not. And, it addresses numerous things we need in the
currently-unstable parts of PEAK... the parts of PEAK that have been
unstable precisely because they've been lacking this sort of functionality.
I'm speaking here of "multiple dispatch". More specifically, I'm talking
about a kind of symmetric multi-dispatch that's closer in nature to the
kind found in the Cecil and Dylan programming languages than what's in
CLOS, but the basic idea is the same.
Specifically, one defines a "generic function" that can have multiple
implementations (aka "methods"). The appropriate method to invoke is
selected at runtime using information about *all* of the function's
arguments, not just the first argument, as Python implicitly does for you.
What good does that do? Well, think about storage. Suppose you defined a
generic function like this:
def save_to(ob, db):
    ...
And what if you could define implementations of this function for different
combinations of object type and database type? Maybe something like:
[when("isinstance(ob,Invoice) and isinstance(db,XMLDocument)")]
def save_to(ob,db):
# code to write out invoice as XML
[when("isinstance(db,PickleFile)")]
def save_to(ob,db):
# code to write out arbitrary object as a pickle
Doesn't this look a *lot* easier to you than writing DM classes? It sure
does to me.
You might be wondering, however, how this is different from writing a bunch
of "if:" statements. Well, an "if:" statement has to be written in *one
place*, and has the *same set of branches*, all the time. But generic
functions' methods are different.
First, they can be written all over the place. The two methods above could
live in completely different modules -- and almost certainly would. Which
means that if you don't need, say, the pickle use case, you could just not
import that module, which means that branch of our "virtual if statement"
simply wouldn't exist, thus consuming no excess memory or CPU time. So,
they can be "write anywhere, run any time". Now that's what I call
"modular separation of concerns". :)
In addition to these very basic examples, expanding the approach to support
full "predicate dispatching", and to support CLOS-style "method combining"
and "qualifiers", would allow us to write things like this "adventure game"
example:
[when("not target.isDrinkable()"]
def drink(actor,target):
print "You can't drink that."
[when("target.isDrinkable()")]
def drink(actor,target):
print "glug, glug..."
target.consume()
[after("target.isDrinkable() and target.isPoisonous()"
" and not actor.isWearing(amulet_against_poison)")]
def drink(actor, target):
print "oops, you're dead!"
actor.die()
The idea here is that the "after method" runs *after* the successful
completion of any "primary methods" (defined with 'when()'), as long as its
conditions apply.
In some ways, this is a bit like the old SkinScript of ZPatterns, except
that it's 1) pure Python rather than a new language, and 2) applicable to
any and every kind of function you want, rather than a handful of
pre-defined "events".
And speaking of events, generic functions make *great* extension or
"plug-in" points in general for systems that need them. For example, an
e-commerce framework could call generic functions at various stages of
processing an order, such as "completed order", allowing custom business
rules to execute according to the defined triggering conditions.
Indeed, almost anything that needs to be "rule-driven" can be expressed as
a generic function.
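Here's a sketch of such a plug-in point, again in plain Python with invented
names (the real thing would presumably use the 'when()'/'after()' style shown
earlier): every registered rule whose condition holds gets to run.

# Rule-driven extension point: all matching rules fire, in the spirit of
# "after" methods rather than a single primary method.
_order_completed_rules = []

def order_completed(order):
    for condition, action in _order_completed_rules:
        if condition(order):
            action(order)

# A deployment-specific module could register its own business rule:
def _thank_big_spenders(order):
    print("Mail to %s: thanks for the big order!" % order.customer)

_order_completed_rules.append(
    (lambda order: order.total > 1000, _thank_big_spenders))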
So what's the Catch?
--------------------
You may be wondering what the catch to all this is, or perhaps you've
already made up your mind as to what that catch *must* be. You're maybe
thinking it must be hellishly slow to eval() all those strings. Or maybe
that it's all a pipe dream that will take ages to implement.
But actually, no, neither one is the case, as you might already know if
you've been paying close attention to recent PyProtocols checkins. The
new, experimental 'protocols.dispatch' module on the CVS trunk has already
proven that it's possible to implement highly efficient generic functions
in Python, complete with predicate dispatch. They don't yet have the nice
API I've presented in this article, but they do exist, and their execution
speed in typical cases can actually approach that of the PyProtocols
'adapt()' function!
In addition to the functionality proof-of-concept, I've also got a
proof-of-concept Python expression parser (that so far handles everything
but list comprehensions and lambdas) for what's needed to implement the
fancy 'when/before/after()' API. And there's a proof-of-concept for the
"function decorator syntax" as well.
So, actually, all of the major technical pieces needed to make this happen
(expression parser, decorator syntax, and dispatch algorithm) have been
developed to at least the proof-of-concept stage. The parser will reduce
normal Python expressions to fast-executing objects, so there's no need to
eval() expression strings at runtime. Further, the in-CVS prototype
dispatcher automatically recognizes common subexpressions between rules, so
that e.g. 'target.isDrinkable()' will get called only *once* per call to
the generic function, even if the expression appears in dozens of rules.
Also, the prototype dispatcher automatically checks "more discriminating"
tests first. So, for example, if one of the tests is on the type of an
argument, and there are methods for lots of different types, the type of
that argument will be checked first, to narrow down the applicable methods
faster. Only then will pure boolean tests (like 'target.isDrinkable()') be
checked, in order to further narrow down the options.
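Here's a rough, runnable sketch of those two behaviors (invented names,
nothing like the real 'protocols.dispatch' internals): candidate methods are
narrowed by a hashed type lookup before any boolean guard runs, and each
guard is evaluated at most once per call.

_methods_by_type = {}      # {type: [([guard, ...], implementation), ...]}

def add_method(klass, guards, impl):
    _methods_by_type.setdefault(klass, []).append((guards, impl))

def dispatch(target, *args):
    cache = {}                          # guard -> result, for this call only
    def check(guard):
        if guard not in cache:
            cache[guard] = guard(target, *args)
        return cache[guard]
    # The "most discriminating" test -- the type check -- is a dict lookup.
    for klass in type(target).__mro__:
        for guards, impl in _methods_by_type.get(klass, ()):
            # Guards shared between rules are only evaluated once per call.
            if all(check(g) for g in guards):
                return impl(target, *args)
    raise TypeError("No applicable method")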
Finally, the already highly-optimized expression evaluation and dispatching
code is destined to move to Pyrex, where I hope to make it evaluate
expressions as quickly as the Python interpreter itself does, by way of a
few tricks I've come up with.
The net result is that the production version of this code should be
roughly comparable in speed to a series of hand-written 'if:'
statements! For some tests, like type/protocol tests and range/equality
tests, it's likely to actually be *faster* than 'if:' statements, due to
the use of hash tables and binary searches. (But this will likely be
balanced out by other overhead factors.)
Even if, however, it were to end up being, say, twice as slow as 'if:'
statements, it's still a heck of a bargain in development time,
compared to having to write the 'if:' statements yourself. Consider, for
example, this set of example rules from the prototype dispatcher's unit tests:
classify = GenericFunction(args=['age'])
classify[(Inequality('<',2),)] = lambda age:"infant"
classify[(Inequality('<',13),)] = lambda age:"preteen"
classify[(Inequality('<',5),)] = lambda age:"preschooler"
classify[(Inequality('<',20),)] = lambda age:"teenager"
classify[(Inequality('>=',20),)] = lambda age:"adult"
classify[(Inequality('>=',55),)] = lambda age:"senior"
classify[(Inequality('=',16),)] = lambda age:"sweet sixteen"
As an exercise, try to write the 'if:' statements (or design a lookup
table) to implement this by hand. More to the point, try to do it
*correctly*, and without duplicating any of the comparisons. Then, try to
do it without actually rewriting the rules. That is, try to do it using
*only* the comparisons shown above, without (for example) changing the
'>=' rules into '<' rules. (After all, in a business application, the
closer the code is to the human-readable requirements spec, the better off
you are. If you rewrite the rules to make them dispatchable, you've just
introduced a "requirements traceability" issue.)
Have you tried it yet? Even ignoring the traceability issue, it's quite
tedious to implement, because you have to explicitly work out what rules
are "more specific" than others. If somebody asked you to make the above
classifications by age, it's trivial for you to "execute" the rules in your
head, because our brains just "know" what's more specific than something
else -- they go by what's "closest" or "most specific". The advantage of
predicate dispatching -- above and beyond the advantages of generic
functions in general -- is that it teaches the computer how to figure out
what "closest" means, just from the contents of the rules themselves.
It's not perfect, of course. The current dispatcher only knows about types
and value ranges (and it'll soon know about booleans). But it doesn't know
that one test might affect the applicability of another test. For example,
if an object is of some type other than NoneType, then obviously an 'is
None' or 'is not None' test doesn't need to be executed. However, short of
coding this and dozens of other rules into the system, there's no way for
the dispatch system to know that.
In practice, though, I suspect it will be rare that this causes anything
other than a little inefficiency here and there. And, the long term
solution in any case will be to use a generic function to compare the
rules. In this way, one could do something like:
[when("isinstance(r1,ClassTerm) and r1.klass is not NoneType"
" and isinstance(r2,IsNotTest) and r2.value is None")]
def implies(r1,r2):
return True
to extend the dispatch system with knowledge about the relationship between
various kinds of tests. (Which just gives another example of how
extensible a system based on generic functions can be!)
Back to Reality
---------------
So how does all this relate back to PyProtocols, PEAK, persistence, MDA,
and all that? Well, for PyProtocols the relationship is simply this:
sometimes you don't really need/want an adapter object, especially when the
interface has only one method. It's sometimes surprising to realize how
many useful interfaces are just that: one method. In fact, it's wasteful
to create an adapter for such an interface, as you only throw it away after
you call that one method. So, generic functions are an obvious
replacement. If you only do type tests on one argument, generic function
dispatch is basically the same speed as regular adaptation, as it follows
the exact same lookup algorithm. (And note, by the way, that the prototype
dispatcher can dispatch on protocol tests as easily as it can on type tests.)
Most of PEAK's core (binding/config/naming) APIs aren't going to be
affected, though. If they really needed this kind of complicated
dispatching, odds are good they wouldn't already be as stable as they
are. However, in the less stable areas of the system (like storage), one
of the major reasons *why* they're not stable is because of how hard they
are to write *without* something like predicate dispatch and generic functions.
A while back in this article, I wrote that there were two remaining AOP use
cases in my mind for PEAK:
* Allow variations of an application to change domain object classes to
use different collaborator classes
* Allow mixing domain-specific behavior into classes that were
automatically generated from UML or other modelling tools
Both of these can be accomplished with generic functions. The first, by
having features use a generic function to look up their target type, and
the second, by having the code generation tools insert "abstract" generic
functions into the generated code. That is, stubs without any
implementations. One then simply writes the implementations in a separate
module.
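Here's a sketch of what that second scenario could look like, with everything
invented for illustration (in real life the stub would use the
generic-function API, and the two halves would live in separate modules, as
the comments indicate):

# --- what a code generator might emit (e.g. a "generated_model" module) ---
class Invoice:
    """Class generated from a UML model; behavior deliberately left out."""
    def __init__(self, total):
        self.total = total

_validate_methods = []      # "abstract" generic function: no methods yet

def validate(invoice):
    for guard, impl in _validate_methods:
        if guard(invoice):
            return impl(invoice)
    raise NotImplementedError("No validate() rule registered")

# --- hand-written, domain-specific module (e.g. "invoice_rules") ---
def _non_negative(invoice):
    return invoice.total >= 0

_validate_methods.append((lambda invoice: True, _non_negative))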
Poof! We're done. No more AOP, no module inheritance, no bytecode
hacks. All gone, but the potential for a model-driven architecture and for
modular separation of concerns remains intact.
Okay, so what about the persistence stuff, fact bases, and all that? I
said we needed:
* A model for facts
* A query language that can be mapped to either SQL or in-memory Python objects
* A data management API that's query-focused and easily extended
* A mechanism for defining mappings between an abstract fact model, and one
or more concrete storage models
I haven't narrowed these down 100%, but here's what I think so far. The
query language is probably going to end up being Python, specifically list
or generator comprehensions, e.g.:
[(invoice,invoice.customer) for invoice in Invoices if invoice.status=="pastdue"]
However, these will be specified as *strings*. If the objects being
queried are in memory, the string can just be eval()'d, but if they are in
a database, the query can be converted to SQL, by a mechanism similar to
the one I'm writing now for doing predicate dispatch. Of course, the
decision to eval() or to translate (and translate to what SQL dialect,
etc.) will be made by generic functions.
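As a very rough sketch of that split (invented names; the real decision would
be made by generic functions, and real translation to SQL is the hard part
that is entirely hand-waved here):

# The query is carried around as a string...
query = ('[(invoice, invoice.customer) for invoice in Invoices'
         ' if invoice.status == "pastdue"]')

def run_query(query, namespace, translate=None):
    if translate is not None:
        return translate(query)          # hand the string to a back-end translator
    return eval(query, {}, namespace)    # in-memory case: just evaluate it

# In-memory use:
class Invoice:
    def __init__(self, customer, status):
        self.customer, self.status = customer, status

Invoices = [Invoice("Acme", "pastdue"), Invoice("Cyberdyne", "paid")]
past_due = run_query(query, {"Invoices": Invoices})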
The data management API and mapping mechanisms will probably end up being
mostly generic functions, possibly accessed via methods on an "editing
context" object (as I've mentioned in previous mailing list posts), but the
back-end implementation code that you write will look more like the
'save_to' function examples I gave earlier in this article, rather than
looking anything like today's DM objects. Indeed, this will ultimately be
the death of DM's, and I believe everyone will be happy to see them
go. Overall, the storage API is probably going to end up looking somewhat
like that of Hibernate, a popular Java object-relational mapper. (The main
difference being that instead of their specialized object query language,
we'll just use Python.)
As for the fact model itself, I haven't yet absorbed the full effect of
this new technology, but I feel fairly good about the ability to
potentially express model constraints such that they can be tested
(immediately, or at commit time, or at a later time) using predicate-driven
rules.
Finally, note that the ability to dispatch on both the type of an object
*and* the type of database being persisted to means that we can even
escape the tyranny of "persistence styles", by selecting different
implementations. Using generic functions, the "storage framework" becomes
so loosely coupled that its dominant structuring no longer influences what
persistence style it's "most suited" for. In fact, it doesn't *have* a
dominant structuring, so there's nothing to get in the way.
Of course, all of this is still extremely "hand-wavy", but the use of
predicate dispatch and generic functions should completely eliminate major
areas of concern that I previously had -- especially with respect to how to
register things in the configuration system such that they end up with the
right precedence ordering, given the various types they apply to and what
kind of database they're for. Indeed, just the fact that we won't need
to define any new configuration *syntaxes* (such as .ini file section
types) makes the generic function route pretty appealing.
Indeed, speaking of configuration, there are other subsystems in PEAK that
have been crying out for more sophisticated configuration, and generic
functions fit the bill quite well for them. For example, 'peak.web' would like
to have "views" and other kinds of adaptation defined "in context" of
particular parts of a site, or possibly depending on various conditions
that might apply to the object at a point in time, or what kind of web
request it is (e.g. browser vs. XML-RPC).
'peak.security' also has a system that's *sort of* like predicate
dispatching, in that it evaluates rules to determine whether a user has
permission to do something. But its syntax isn't exactly easy to use, and
there's lots of extra junk involved in the mechanisms. I don't know if
I'll actually go back and "fix" it, but I know if I were writing it today,
I'd want to do it with generic functions rather than building up all the
special-purpose framework code that's in there now for rule evaluation.
(By the way, when I speak here of "configuration", I'm not really talking
about file formats like .ini and ZConfig and all that, so much as I am the
assembling of the components needed to implement a specific
application. Such assembly is done by its developer, or by someone who is
extending the application. Today, it's often done in config files, but in
the future more of it may be done in code, with the configuration files
mainly saying what modules to import. But don't confuse this with
configuration *settings* used by an application -- these will stay in
configuration files where they belong.)
Conclusion
----------
Generic functions can replace AOP in code that uses them. This makes
PEAK's AOP facilities moot for their original intended uses. Generic
functions will also greatly simplify the design and implementation of
future storage and query features, while opening up many new possibilities
for extensibility, business rules development, and similar
capabilities. They are also a natural complement/extension to protocol
adaptation.
(For example, if you use a generic function as an adapter, you can have
flexible dynamic adaptation, with no ambiguity as far as PyProtocols is
concerned. Indeed, it's possible that you could replace 'adapt()' itself
with a generic function, meaning that you could even register adaptations
to protocols that aren't PyProtocols protocols.)
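Here's a tiny sketch of that last idea, again with invented names rather than
anything from PyProtocols: adaptation as a generic function keyed on both the
object's type and an arbitrary "protocol" marker.

import io

_adapter_factories = {}          # {(type, protocol): adapter factory}

def declare_adapter(klass, protocol, factory):
    _adapter_factories[(klass, protocol)] = factory

def adapt_via_gf(ob, protocol):
    # Adaptation as a generic function: dispatch on (type, protocol).
    for klass in type(ob).__mro__:
        factory = _adapter_factories.get((klass, protocol))
        if factory is not None:
            return factory(ob)
    raise TypeError("Can't adapt %r to %r" % (ob, protocol))

# The "protocol" needn't be a PyProtocols protocol -- any marker will do:
READABLE = object()
declare_adapter(str, READABLE, io.StringIO)
# adapt_via_gf("some text", READABLE).read() == "some text"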
Therefore, the plan is to finish implementing generic functions, phase out
the AOP subsystem (and perhaps also separate peak.metamodels into a
different package and distribution), and then begin developing the new
storage and query APIs.
Currently, my intent is to put generic functions in a 'protocols.generic'
subpackage, which you'll use in a way that looks something like:
from protocols.generic import when, before, after

[when("something()")]
def do_it(...):
    ...
However, I'm at least somewhat open to the possibilities of:
* Making a separate top-level package for generics (e.g. 'from generics
import when') instead of a subpackage of 'protocols'
* Making a separate 'PyGenerics' package (that includes PyProtocols)
* Doing something else altogether that I haven't thought of, but which
someone else suggests. :)
So, here's your chance to change history. (Or at least a HISTORY.txt file
or two!) Tell me, what am I missing? What am I doing wrong? Ask me
questions, tell me this is crazy, or whatever you want. Your feedback
(including questions) about this plan is respectfully requested, and will
be greatly appreciated.