SchemaEvolution

This is a rough rendering of a page from the old Prevayler wiki. Please see the new wiki for current documentation.

I'm not sure I understand how your system would handle schema evolution (changes to the business classes). How will the snapshot be read?


Some changes in the code are backwards compatible and some are not.

Sometimes it is just a question of declaring a "private static final long serialVersionUID = [any number]" (see Java serialization spec) and being prepared for uninitialized variables.
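
A minimal sketch of that first case follows. The Account class and its fields are hypothetical, not part of Prevayler. It pins serialVersionUID and guards a field that older snapshots would not contain, since Java serialization leaves fields missing from the stream at their default value (null here) without running constructors or field initializers. Note one assumption: if snapshots were already written by a class that had no explicit serialVersionUID, the declared number has to match the value computed for that old class (the JDK's serialver tool prints it).

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical business class, used only for illustration.
class Account implements Serializable {
    // Pinning the UID keeps snapshots from compatible earlier versions readable.
    private static final long serialVersionUID = 1L;

    private final String holder;
    // Imagine this field was added in a later version of the class:
    // a snapshot written before that carries no value for it.
    private String currency;

    Account(String holder) {
        this.holder = holder;
        this.currency = "USD";
    }

    String holder() {
        return holder;
    }

    // Be prepared for the uninitialized case instead of assuming the constructor ran.
    String currency() {
        return currency != null ? currency : "USD";
    }

    // Serializes and deserializes in memory; checked exceptions are wrapped for brevity.
    static Account roundTrip(Account original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Account) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```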

Sometimes, though, you will have to write specific migration code. Simply load the old classes using one class loader and the new classes using another, since they will probably have the same name.
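
One way to read an old snapshot with the old classes is sketched below. LoaderAwareObjectInputStream is a hypothetical helper, not a Prevayler class. It only relies on the standard ObjectInputStream.resolveClass hook to look classes up in a class loader of your choosing.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

// Hypothetical helper: resolves the classes named in a stream through a given loader.
class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Falls back to the default lookup, which also handles primitive types.
            return super.resolveClass(desc);
        }
    }

    // Reads one object from a snapshot; checked exceptions are wrapped for brevity.
    static Object readWith(byte[] snapshot, ClassLoader loader) {
        try (ObjectInputStream in =
                 new LoaderAwareObjectInputStream(new ByteArrayInputStream(snapshot), loader)) {
            return in.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    // Serializes one object to bytes, standing in for a snapshot file in this sketch.
    static byte[] toBytes(Object object) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(object);
            }
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

In a real migration the loader passed in would presumably be something like a java.net.URLClassLoader pointing at a jar with the old classes and created with a null parent, so that old class names do not resolve to the new classes. Objects loaded that way cannot be cast to the new classes of the same name, so the migration code would reach them through reflection or through interfaces shared by both loaders.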

The advantage is that you are writing in Java (and not SQL or PL) and you can use any business logic already defined in your objects.

Another interesting approach is to have the system running with the new classes querying the old classes, when needed, and migrating the old objects on-the-fly, on demand, until all objects have been migrated.
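
The on-demand idea could look roughly like the sketch below. All names are hypothetical, and for simplicity the old and new classes have different names here instead of living in separate class loaders. Each lookup migrates the requested object if it has not been migrated yet.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical old version of a business class.
class OldCustomer {
    final String fullName;

    OldCustomer(String fullName) {
        this.fullName = fullName;
    }
}

// Hypothetical new version of the same business class.
class NewCustomer {
    final String firstName;
    final String lastName;

    NewCustomer(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }
}

// Serves new objects and converts old ones the first time they are requested.
class MigratingCustomerStore {
    private final Map<String, OldCustomer> legacy = new HashMap<>();
    private final Map<String, NewCustomer> current = new HashMap<>();

    void addLegacy(String id, OldCustomer customer) {
        legacy.put(id, customer);
    }

    NewCustomer find(String id) {
        NewCustomer found = current.get(id);
        if (found == null) {
            OldCustomer old = legacy.remove(id);
            if (old == null) {
                return null; // unknown id in both versions
            }
            found = convert(old);
            current.put(id, found);
        }
        return found;
    }

    // True once every old object has been requested, and therefore migrated.
    boolean migrationComplete() {
        return legacy.isEmpty();
    }

    private static NewCustomer convert(OldCustomer old) {
        String[] parts = old.fullName.trim().split("\\s+", 2);
        return new NewCustomer(parts[0], parts.length > 1 ? parts[1] : "");
    }
}
```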

-- KlausWuestefeld


Actually we are using a code generator to do exactly this in our most
recent web application that utilizes Prevayler. Using an XML export
of the data structure created in Rational Rose we then generate the
following code:
1. All persistent objects (serializable business objects)
    as defined in the schema
2. Store-, Update- and Remove-commands for persistent objects
3. The Prevalent System with TreeMaps for all business objects and
    corresponding store-, update- and remove-methods. This class
    also has findBy Methods to query the TreeMaps.
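
To give an idea of item 3, here is a guess at the shape such a generated class might take. This is not the actual generator output: the Product and GeneratedSystem names are hypothetical, and the Prevayler base class and the command classes from item 2 are left out.

```java
import java.io.Serializable;
import java.util.ArrayList;
import java.util.List;
import java.util.TreeMap;

// Hypothetical persistent business object.
class Product implements Serializable {
    private static final long serialVersionUID = 1L;

    final long id;
    final String name;

    Product(long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Hypothetical generated system class holding one TreeMap per business object type.
class GeneratedSystem implements Serializable {
    private static final long serialVersionUID = 1L;

    private final TreeMap<Long, Product> products = new TreeMap<>();

    void storeProduct(Product product) {
        products.put(product.id, product);
    }

    void updateProduct(Product product) {
        if (!products.containsKey(product.id)) {
            throw new IllegalArgumentException("No product with id " + product.id);
        }
        products.put(product.id, product);
    }

    void removeProduct(long id) {
        products.remove(id);
    }

    Product findProductById(long id) {
        return products.get(id);
    }

    // Linear scan over the map values, returned in key order.
    List<Product> findProductsByName(String name) {
        List<Product> matches = new ArrayList<>();
        for (Product product : products.values()) {
            if (product.name.equals(name)) {
                matches.add(product);
            }
        }
        return matches;
    }
}
```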

Using this code generator makes schema changes "almost" trivial.
"Almost", because we still have to manually write the migration
code in the changed classes to convert old versions of serialized
classes on the fly. But compared to using a relational database
and a persistence layer, where schema migration is a nightmare,
this is almost heaven!


--MarkusFix


It would be really cool if this project integrated with an object migration framework.

Then it would be really useful.

I have used the same idea for an earlier project, but I got stranded on object migration.

The real "gotcha" on that project was that the easiest way to do object migration seemed to be to have the data in some meta-format (like an RDBMS or XML) and manipulate it there. So it seemed that it would have been a better approach to use a normal persistence framework like ObjectBridge in the first place.


Even when working with an SQL database and persistence layer, I prefer to write migration code using my business objects instead of accessing the data directly. It avoids breaking the system's encapsulation (are ALL your consistency rules in the database? ;) and I actually find it easier. I resort to SQL or PL only when I need performance. With all objects in RAM, performance is a different story.

-- KlausWuestefeld


Could you direct me to some project, resource or examples where I can study the issues of object migration in greater detail?

-- LasseLindgard


Sorry. I cannot point you anywhere particularly interesting. Maybe someone else can.

It is a disgrace that people tend to use OO so weakly (client code for database apps) that this OBJECT schema evolution problem hardly ever crops up.

-- KlausWuestefeld


Yes, I find it rather odd that no one deals with Object Schema Evolution.

I have been searching all over the web and I have not been able to find any articles that deal with it in a serious way.

If I had more time on my hands (and a good idea on how to attack the problem) I would start a Sourceforge project aiming at solving the problem.

-- LasseLindgard


I propose the following:

1) People post suggestions here for an "evolution" they would like to see in the Prevayler demo bank application.

2) Someone (I, for example) writes example migration code using two approaches: traditional and on-demand (see earlier message). That can become part of the demo. Any tools/utils produced can also become part of Prevayler.

3) We discuss how the approaches can be improved.

-- KlausWuestefeld


Object evolution is built into Java's Serialization framework. See http://developer.java.sun.com/developer/TechTips/2000/tt0229.html for a good explanation of what to do. http://developer.java.sun.com/developer/TechTips/2000/tt0425.html talks about Externalizable objects, which are more work, but give you more power, and usually, speed.


Something that has worked really well for me is to have a private int variable called classVersion (or whatever) which is originally set to 0 like this:

private int classVersion = 0;

When I modify the class then I ...
1) increment the hard-coded classVersion value
2) add a private readObject method (see http://developer.java.sun.com/developer/TechTips/2000/tt0229.html) that checks the classVersion number and does any needed migrations.

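import java.io.IOException;
import java.io.ObjectInputStream;

// Called by Java serialization when an instance of this class is read.
private void readObject(ObjectInputStream stream)
        throws IOException, ClassNotFoundException {
    stream.defaultReadObject();
    // Objects written by the original class still carry version 0.
    if (this.classVersion == 0) {
        // convert field values here...
        this.classVersion = 1; // so the next snapshot records the migrated version
    }
}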

Once all the objects are converted the conversion code can be removed or left for historical purposes.
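
For illustration, here is a self-contained sketch of the classVersion pattern. The Person class, its fields and the legacyForDemo factory are hypothetical. The factory only exists so the example can produce a stream that looks like one written by the version 0 class, which stored the full name in a single field.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

// Hypothetical business class at version 1 of its schema.
class Person implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final int CURRENT_VERSION = 1;

    private int classVersion = CURRENT_VERSION;
    private String name;      // version 0 kept the full name here
    private String firstName; // added in version 1
    private String lastName;  // added in version 1

    Person(String firstName, String lastName) {
        this.firstName = firstName;
        this.lastName = lastName;
    }

    // Demo only: builds an object in the state a version 0 snapshot would contain.
    static Person legacyForDemo(String fullName) {
        Person person = new Person(null, null);
        person.classVersion = 0;
        person.name = fullName;
        return person;
    }

    private void readObject(ObjectInputStream stream)
            throws IOException, ClassNotFoundException {
        stream.defaultReadObject();
        if (classVersion == 0) {
            // Migrate version 0 data: split the full name into its two new fields.
            String[] parts = name.trim().split("\\s+", 2);
            firstName = parts[0];
            lastName = parts.length > 1 ? parts[1] : "";
            name = null;
            classVersion = CURRENT_VERSION;
        }
    }

    String firstName() { return firstName; }
    String lastName() { return lastName; }
    int classVersion() { return classVersion; }

    // Serializes and deserializes in memory; checked exceptions are wrapped for brevity.
    static Person roundTrip(Person original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in =
                     new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return (Person) in.readObject();
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Stamping the object with the current version after converting it is an addition in this sketch. Without it, the next snapshot would still record version 0 and the conversion would run again on every load.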

JonathanCarlson




> odd that no one deals with Object Schema Evolution

There's actually been tons of research on this (mainly academic), most of it not very useful for the implementer. Years ago when I worked on this, I found that Barbara Staudt Lerner's model was fairly comprehensive and implementable. See http://citeseer.nj.nec.com/staudtlerner96model.html for one of her papers on a system that was created as part of the Arcadia project (http://www.ics.uci.edu/~arcadia/system_pages/tess.html). Some of her other papers (also on CiteSeer) describe the model better. Note that some of the assumptions made in her model result in a system that uses heuristics to judge whether a data element may be the same one with its name and type changed. IMHO, with restrictions on the allowable changes, you could simplify the model a lot.

Uttam Narsu

--

An excellent discussion of the issues involved in maintaining serialized objects can be found in Joshua Bloch's book, "Effective Java". --MichaelPrescott


As a practical approach, you can export the essential (e.g. public) features of
objects into XML. This has been implemented in several automatic XML marshaling frameworks,
including the one in JDK 1.4 (JavaBeans long-term persistence). Since I'd
like to have full control of what gets exported (and how) into XML, I've used Jelly (Jakarta Commons) with huge success on a very complex serialized object graph.
My experience is that you might need to write a few
additional Jelly tags (for example, if you use non-standard collections), but that is all, AND IT IS MUCH EASIER THAN DOING IT IN SQL or any of the
dirty tricks mentioned above!
One thing to bear in mind is that Jelly tags are not recursive, so you have to "flatten"
your object graph by providing "view" methods (e.g. "getEmployeeAsList()" or "getPhonesAsMap()")
to make it easy for Jelly. But as a good design principle, you already have this, don't you? Also, test exportability BEFORE you freeze the code for the next release!

Hristo Stoyanov


Is it just me, or can anyone else see the value of http://skaringa.sourceforge.net/doc.html#ToC18 as a nice solution for Object Schema Evolution?

RodrigoOliveira

I have had a similar problem before, and Jonathan's solution (classVersion) worked best for me. I found two possible directions of migration: forward and backward.
- By moving "forward" I mean that newer versions of my classes must be able to understand/deserialize older ones. This direction must be implemented in order to guarantee the evolution of your object schema.
- "Backward" is the opposite situation (duh): older versions of my classes need to deserialize the newer ones. This scenario is not a "must" for object schema evolution, but is necessary if you have a situation where a mix of new and old classes co-exist. From what I found, there is not really a good way of implementing this case, but a possible approach is to use some meta-format like XML. Usually, you won't have a 1-to-1 mapping between both classes (otherwise you wouldn't be evolving at all), and you end up with some uninitialized members here and there that you need to handle (as mentioned before in the thread).

BTW, I couldn't find much literature about this topic either! Maybe we are the only ones that find it a big deal, and other people think it's a trivial problem... Or maybe not... ;-)

- Rafael Sagula

--

A few thoughts...

First, the site is becoming a mess! :) How about dropping the wiki? Perhaps using some old-fashioned "left-frame-with-links-and-right-frame-with-contents" approach would be simpler and cleaner.

Anyways...

I'm implementing a small web application with Prevayler. I've divided my code into the following layers/packages.

app.BusinessObjects -> Plain and simple BOs with getters and setters.
app.BusinessCommands -> Commands that manipulate a given object and modify the persistence objects.
app.PersistenceObjects -> A few classes that extend AbstractPrevalentSystem and hold the business objects in Collections or plain references.
app.System -> The business façades that contain entry points to the application's functionality, including Load() methods.
app.WebBeans -> The needed web tier.

I'm very used to writing 3-tier code for plain old RDBMS. The usual DataAccess, BusinessLogic and Facade tiers. So this seems a good approach to organize the code.

I have a question about updating objects. What is the best way to update an object's fields? Having a ton of commands, one for each (object, field) combination? Or duplicating my attributes into a couple of versions, like the proposed and current values in ADO.NET? I don't really know what the best approach is!
For example, it's common for me to generate web editors to apply CRUD operations to a BO. If an object has 30 fields and the user updates only one, how should I deal with it? Consider 30 different use cases and code 30 commands?

Now about schema migration... It is really the worst thing in Prevayler. It's a nightmare to deal with this during the development phase of an application. I've been pondering using some XML serialization framework (or even java.beans.XMLEncoder) to put the data into a format that I can "control". Trouble is, it's a mess to deserialize a few AbstractPrevalentSystem classes and reconstruct the references between all the business objects.

Tiago Matias


Use a single command for updating each of your CRUD objects (not a ton of them). Make that command know what fields to update.
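
A rough sketch of such a command follows. The Contact and UpdateContact names are hypothetical, and executeOn stands in for whatever execute method the Prevayler command interface requires (a real command would receive the prevalent system instead of a plain map). The command carries only the fields the user actually changed.

```java
import java.io.Serializable;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical business object with a few editable fields.
class Contact implements Serializable {
    private static final long serialVersionUID = 1L;

    String name;
    String email;
    String phone;

    Contact(String name, String email, String phone) {
        this.name = name;
        this.email = email;
        this.phone = phone;
    }
}

// One command per business object type; it records which fields to update.
class UpdateContact implements Serializable {
    private static final long serialVersionUID = 1L;
    private static final List<String> FIELDS = Arrays.asList("name", "email", "phone");

    private final String contactId;
    private final Map<String, String> changedFields = new HashMap<>();

    UpdateContact(String contactId) {
        this.contactId = contactId;
    }

    // Rejects unknown fields up front so execution never applies a partial update.
    UpdateContact set(String field, String value) {
        if (!FIELDS.contains(field)) {
            throw new IllegalArgumentException("Unknown field: " + field);
        }
        changedFields.put(field, value);
        return this;
    }

    // Hypothetical entry point, see the note above about the real command interface.
    void executeOn(Map<String, Contact> contacts) {
        Contact contact = contacts.get(contactId);
        if (contact == null) {
            throw new IllegalArgumentException("No contact with id " + contactId);
        }
        for (Map.Entry<String, String> change : changedFields.entrySet()) {
            switch (change.getKey()) {
                case "name":  contact.name = change.getValue();  break;
                case "email": contact.email = change.getValue(); break;
                case "phone": contact.phone = change.getValue(); break;
                default: throw new IllegalStateException("Unvalidated field: " + change.getKey());
            }
        }
    }
}
```

With this shape, a web editor for a 30-field object creates one command and calls set only for the fields whose values differ from the ones originally shown.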

Don't do schema migration during development. Just don't. This is not just a Prevayler thing. If there is a universal truth, this is it. @:) Use automated test scripts à la JUnit and create all the necessary objects from scratch for every single test script. We have a system with over 5000 test scripts here that works just that way. --KlausWuestefeld


Have you seen JSX?

http://www.csse.monash.edu.au/~bren/JSX/

It serializes Java to XML, making the serialized files human-readable. It also allows the Java classes to change between persisting and reloading. Can Prevayler do that?

Prevayler does not do that, but it allows us to use any serialization framework instead of Java's native serialization, which is the default. Thanks for the tip. --KlausWuestefeld


JSX has a big problem, though: it's commercial software. Therefore, we cannot include uses of it in Prevayler, but we might create an adapter if there's enough need for it.

-- CarlosVillela


Actually JSX seems to be GPL (and it's on freshmeat)
http://freshmeat.net/projects/jsx/?topic_id=868%2C45%2C50%2C19%2C866

-- Dav


JSX is dually licensed GPL/commercial. It is open source, but not free (as in speech). - TalRotbart


Consider the ThereCanOnlyBeOne approach to schema evolution. --KlausWuestefeld.