First things that come to my mind are referential integrity, schema evolution, etc.
The VM's strong typing gives you referential integrity. DBMSs don't help you much with schema evolution; you have to write the migration scripts anyway. -- KlausWuestefeld
Won't a sufficiently complicated database application written using Prevayler contain an ad-hoc, informally-specified, bug-ridden, slow implementation of half of a DBMS?
No. Your database didn't allow it, so you probably didn't consider this an option, but now, with Prevayler, you are finally free to use object libraries! @;) -- KlausWuestefeld
Are the best practices one finds in today's database theory all best forgotten?
All the ones spawned from the problem of paging memory blocks to disk, yes. -- KlausWuestefeld
I think one of the reasons that Prevayler can be so radically simple is that it doesn't deal with the issue of making data remotely accessible. A database server, apart from providing a reliable storage mechanism for data, necessarily has to expose it to remote clients. Prevayler's striking simplicity is possible because it assumes that the data will be consumed only by the one, local application. It's likely that at some point during the lifetime of a Prevayler-based application, there will be another application that needs to use its data. -- MichaelPrescott
Yes. With a database you have a DATA server. With Prevayler, you have an OBJECT server. The application is free, therefore, to decide how it will let its clients and other systems access those objects. Some options include RMI, CORBA, XML, Sockets, JSP, Servlets, etc.
For web applications, you have an extremely simple and fast architecture: JSP/Servlets accessing the objects directly in the same VM. -- KlausWuestefeld
How is record locking implemented?
You now have objects, not only records. See: WhatAreTheCodingRestrictionsMyClientClassesHaveToObey
Has anyone tried IBM's open-source ICU4J unicode compressor? For string-heavy applications using Latin charsets, it looks like it could save as much as 50% of the space. It claims it works best for small to medium sized strings.
Here's the API: http://www-124.ibm.com/icu4j/doc/com/ibm/icu/text/UnicodeCompressor.html
Any plans of integration with EJB containers for persistence?
That might be good marketing. Technically, though, it would be a step backwards. -- KlausWuestefeld
Your idea is really the coolest one of 2002 so far!
Thanks, we're glad you liked it! @:) --PrevaylerTeam
Wouldn't it make sense to use some persistent collection (BTree?) that uses the filesystem to temporarily swap out objects? This would reduce the memory requirements...
That would be as complex to implement and as slow as an object database. -- KlausWuestefeld
for the PalmPilot
I'm really happy you did this @:) I was just a month away from starting exactly this project, as I'm about to build a massive multi-user world for which I considered this persistence scheme the best fit.
Best Regards, angel'o'sphere aka email@example.com
*sigh* I've been looking at the demos, tests and implementation classes all day long, and I still can't figure it out. It's giving me a headache. Would it be possible to write a simple document on how to design an application to store its data using Prevayler? I really can't figure out what class to use how and where, and how to handle everything :(
Take a look at GettingStarted, especially the articles. There's a nice intro by CarlosVillela.
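For readers trying to picture the moving parts: the prevalence pattern boils down to a serializable system object plus serializable command objects that are written to a log before being applied. The sketch below is not Prevayler's actual API -- AccountSystem, Deposit and MiniPrevayler are hypothetical names invented for illustration -- but it shows the shape an application takes.

```java
import java.io.*;
import java.util.HashMap;
import java.util.Map;

// Prevalent system: all business data lives in ordinary objects in RAM.
class AccountSystem implements Serializable {
    private final Map<String, Long> balances = new HashMap<>();
    void deposit(String id, long amount) { balances.merge(id, amount, Long::sum); }
    long balanceOf(String id) { return balances.getOrDefault(id, 0L); }
}

// A command is a serializable object encapsulating one atomic change.
interface Command extends Serializable {
    void executeOn(AccountSystem system);
}

class Deposit implements Command {
    private final String id;
    private final long amount;
    Deposit(String id, long amount) { this.id = id; this.amount = amount; }
    public void executeOn(AccountSystem system) { system.deposit(id, amount); }
}

// Minimal prevalence engine: persist the command first, then apply it.
class MiniPrevayler {
    private final AccountSystem system;
    private final ObjectOutputStream log;
    MiniPrevayler(AccountSystem system, OutputStream logStream) throws IOException {
        this.system = system;
        this.log = new ObjectOutputStream(logStream);
    }
    synchronized void execute(Command command) throws IOException {
        log.writeObject(command);   // durability first...
        log.flush();
        command.executeOn(system);  // ...then mutate the in-memory state
    }
}
```

On restart, a real engine would deserialize the logged commands and re-execute them against the last snapshot to rebuild the in-memory state.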
The comparison of Prevayler with SQL databases is misleading. The latter support concurrent writers while Prevayler does not. Currently I cannot see how Prevayler addresses concurrent updates from multiple processes (VMs).
As of release 1.03, Prevayler does provide concurrent transaction logging. Multiple processes (VMs) will access the Prevayler VM normally, as if they were clients. -- KlausWuestefeld
OK. Assume this "classical" problem: two users of an ERP system both try to update the address record of a customer. Both are connected by separate processes to the db-storage. With an SQL-RDBMS, *one* solution is to open transactions for both clients. Only a single one will succeed.
So how does Prevayler support shared access in detail? Are there any
No. You actually have to know your OO and your Java. --KlausWuestefeld
How can one make Prevayler compatible with languages like Delphi or Kylix?
Do they have some sort of object serialization? If so, then an ObjectPrevalence layer can be written for them.
Schema evolution is the biggest problem. I am trying to port one of my applications to use Prevayler, and it is difficult to work in an iterative manner, as all data has to be reloaded again and again using crude tooling. I decided to start building the business classes and commands based on my spec and then add the servlet/velocity layer. Using a DB is not cool (though SimpleORM (www.simpleorm.org) helps there), but it allows for schema modification and good reporting.
Well, I will do both versions and see what is the most appropriate for the small product of mine.
If I get it done, I'll post it somewhere in the next weeks.
Cool. I suggest you start using automated testing. That way, you only have to reload your data at release time and not during development. --KlausWuestefeld
Next comment is that the system becomes a very big FacadeClass with too many methods relaying the work to CommandObjects.
I would like to be able to partition those methods in several groups:
- to allow separate developers to work on their part
- to avoid a big class that does everything on the BusinessObjectModel
It takes a while to get used to it (half a day or so) but it can prove useful (for prototyping, I'm sure I can win some time). For production development, that's another story... let's give it a chance though, as the idea is clever (kind of like a SmallTalk image for BusinessModelObjects).
"I would like to be able to partition those methods in several groups"
Just make BigFacade return ^SmallFacade1, ^SmallFacade2, etc, with commands/transactions grouped according to their functionality. You can even have business objects returning the Commands/Transactions that operate on them. --KlausWuestefeld
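A minimal sketch of that partitioning (BigFacade, BillingFacade and InventoryFacade are hypothetical names chosen for illustration, not Prevayler classes): the prevalent system stays a single serializable root, but it hands out small per-team facades instead of exposing every method itself.

```java
import java.io.Serializable;

// One team owns billing...
class BillingFacade implements Serializable {
    private long invoicesIssued;
    void issueInvoice() { invoicesIssued++; }
    long invoicesIssued() { return invoicesIssued; }
}

// ...another owns inventory.
class InventoryFacade implements Serializable {
    private long itemsInStock;
    void receive(long count) { itemsInStock += count; }
    long itemsInStock() { return itemsInStock; }
}

// The root object of the prevalent system only wires the facades together,
// so it never grows into one giant class that does everything.
class BigFacade implements Serializable {
    private final BillingFacade billing = new BillingFacade();
    private final InventoryFacade inventory = new InventoryFacade();
    BillingFacade billing() { return billing; }
    InventoryFacade inventory() { return inventory; }
}
```

Commands/transactions can then be grouped per facade, matching the functional partitioning suggested above.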
What happens if I run out of RAM unexpectedly? Is graceful degradation just swapping out to Virtual Memory until the RAM delivery van comes around again?
Yes. Your system will become pitifully slow, though. See the PrevalentHypothesis
DBMSs use raw partitions to gain better control over physically writing to disk -- can Prevayler use a DB to store the log?
The current implementation cannot use a DB to store the log. I think the JDBC overhead is greater than the Java file-writing overhead anyway, though. --KlausWuestefeld
Also, disk is significantly cheaper than RAM, and is getting cheaper much faster. If I've got a 20GB database, can I really afford 25GB of RAM to use Prevayler? I think my SAs would just laugh.
This looks useful for small to medium data requirement apps, but I can't see it scaling without transparent paging to disk (which the OS may provide enough of).
How much does 25GB of RAM cost? US$ 3000? Can you afford that?
I can see a lot of advantages in using this for small projects. However I am still not sure how to use it in a web application architecture. Since a servlet based architecture (jsp, velocity, etc) is multi-threaded, what are some best practices about building access to the Prevalent objects? The simple approach is to use an exclusive lock around the request processing. Since data access is so fast it might be possible. To increase concurrency one would need more complicated locking around each usage of a "prevaylerbase" controlled object, but this makes things harder to set up. Any comments/experiences would be interesting to hear. - Andrew
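One common middle ground between a global exclusive lock and per-object locking -- an assumption on my part, not an official Prevayler recipe -- is to guard the prevalent object graph with a single reader-writer lock from java.util.concurrent.locks, so concurrent servlet threads can read in parallel while writes stay exclusive. GuardedCatalog below is a hypothetical example class.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.locks.ReadWriteLock;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// One reader-writer lock guards the whole prevalent object graph:
// many concurrent readers, but writers get exclusive access.
class GuardedCatalog {
    private final Map<String, String> titles = new HashMap<>();
    private final ReadWriteLock lock = new ReentrantReadWriteLock();

    String lookup(String isbn) {
        lock.readLock().lock();            // shared: many threads may read
        try { return titles.get(isbn); }
        finally { lock.readLock().unlock(); }
    }

    void add(String isbn, String title) {
        lock.writeLock().lock();           // exclusive: blocks all readers
        try { titles.put(isbn, title); }
        finally { lock.writeLock().unlock(); }
    }
}
```

Since prevalent reads are in-memory and fast, the write lock is held only briefly, which keeps this scheme far simpler than per-object locking while still allowing read concurrency.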
How is Prevayler related to, and what are the advantages of Prevayler over, the RAM I/O Project (http://www.eecs.umich.edu/Rio/) (1996), which suggests a similar solution?
How can you keep bugs from contaminating the data in a system using Prevayler? If the whole system crashes, my data would be safe in a conventional storage system. What happens if my objects are RAM-persistent, as in Prevayler? My application may be running for a while and messing things up before crashing. There's no place where my data would be safe. A bug in my software may cause the corruption of data which, in a conventional persistent system, could be more reliably protected.
That is a myth. "How can you keep [application] bugs from contaminating the Data in a system" using a database? --KlausWuestefeld
Do you guys have any reference implementation?
1.00.xxx is the reference implementation. --KlausWuestefeld
What is the main difference from the existing HSQL database, which is a tiny SQL-compatible database written in pure Java with a JDBC interface? It can also keep tables, indexes, etc. in memory, make snapshots and do transaction logging... Another clone?
Prevayler is MUCH simpler (therefore much more robust), orders of magnitude faster (no JDBC overhead), and works for objects, not only for records. --KlausWuestefeld
Doesn't the system have to stop and restart to apply code changes even if the data and its structure are unchanged?
With Prevayler, new commands (soon transactions or queries), new business logic code, or new client code must be compiled into a new executable. To substitute executables, one must stop the old executable--preferably after a snapshot--and start the new executable, which must load data from a snapshot and log files.
Compare this to SQL, where the database server (Oracle, Sybase, Postgresql, ...) is a stable, long-running process independent of application code changes. With SQL, I can change reporting requirements, transactional data sources, etc. without shutting down the database server. For example, my web shopping cart can continue to accept orders as I change my order fulfillment system. --JoelShprentz
We can just load the new version of the code on a replica and route the clients transparently to it. See: DoesntTheSystemHaveToStopInOrderToProduceAConsistentSnapshot
Also, Java is more naturally run on a virtual machine rather than compiled to an executable. If we don't want to use a replica, therefore, we can use Java's dynamic class loading in association with lazy migration to do a live upgrade of the system. That way, our web shopping carts can continue to accept orders as we change our SHOPPING CART system! @8) --KlausWuestefeld
'That is a myth. "How can you keep [application] bugs from contaminating the Data in a system" using a database?'
This is not a myth. You can actually use stored procedures and constraints to protect the data in a DBMS. Also, I can think of situations in which rollback is really useful. -- Ries
Did your stored procedures and constraints come as part of your DBMS or are they part of "the application"? @;) Also, I didn't say rollback is useless, I said RollbackIsNeedless
I discussed this with coworkers and some just cannot get the point.
Let me quote a line from 'The Matrix': "There is no database"
"You have to understand, most of these people are not ready to be unplugged. And many of them are so inert, so hopelessly dependent on the system that they will fight to protect it." --Morpheus in "The Matrix"
We ran across Prevayler last week and were surprised to find a lot in common with a system we are developing. Chief differences:
+ Commands (we call them updates) are implicit; we trap mutative operations and use the Java reflection API to extract the methods and their arguments for later reapplication. (We rely on AspectJ to help with this magic.)
+ We use XML to serialize both the operation lists and the state snapshots, not Java serialization.
+ We can group multiple mutative operations into a single update. In the context of this update, local state appears to have changed -- accessors return the changed values -- but uncaught exceptions cause the state to be rolled back to the point before the update was initiated. Normal return from the update context commits the changes atomically. (Again, we rely on AspectJ to make this work.) As a by-product of this, we can offer robust
But for the most part the approaches are strikingly similar: no database, all state must be in memory, changes are logged with a repository before committing to in-memory state.
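The "implicit commands" idea can be approximated in plain Java with a dynamic proxy instead of AspectJ. In the sketch below, Counter, RecordingHandler and the name-based mutative-call filter are all illustrative inventions, not part of either system described above; the point is just that reflection lets you trap a call, record the Method and arguments, and replay them later on a fresh instance.

```java
import java.lang.reflect.InvocationHandler;
import java.lang.reflect.Method;
import java.lang.reflect.Proxy;
import java.util.ArrayList;
import java.util.List;

interface Counter {
    void add(int n);   // mutative
    int value();       // read-only
}

class CounterImpl implements Counter {
    private int total;
    public void add(int n) { total += n; }
    public int value() { return total; }
}

// Traps each mutative call on the proxy and records it for later replay.
class RecordingHandler implements InvocationHandler {
    private final Counter target;
    final List<Object[]> log = new ArrayList<>(); // {Method, args} pairs

    RecordingHandler(Counter target) { this.target = target; }

    public Object invoke(Object proxy, Method m, Object[] args) throws Exception {
        if (m.getName().equals("add")) {          // crude mutative-call filter
            log.add(new Object[] { m, args });
        }
        return m.invoke(target, args);            // delegate to the real object
    }

    // Reapply the recorded operations to a fresh instance.
    static void replay(List<Object[]> log, Counter fresh) throws Exception {
        for (Object[] entry : log) {
            ((Method) entry[0]).invoke(fresh, (Object[]) entry[1]);
        }
    }

    static Counter wrap(RecordingHandler h) {
        return (Counter) Proxy.newProxyInstance(
            Counter.class.getClassLoader(), new Class<?>[] { Counter.class }, h);
    }
}
```

AspectJ makes the interception transparent (no interface or proxy needed), but the recorded-operation log it produces plays the same role as Prevayler's explicit command log.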
To answer the RAM cost question: 1GB of Sun RAM for an x000-range machine is about $4,000, so $100,000 for RAM is pretty expensive. Besides, in order to use that much, a BIG box is required, at least an Enterprise 6000, and when this maxes out because of extra data, you upgrade a whole box instead of adding a disk to the array/SAN.
I think there are a lot of really excellent things about Prevayler... It's fast, easy to learn and use, has a small footprint, and is so simple that it doesn't come with (or need) an ant build. These are great things. But a lot of what I'm reading here about RAM vs disk costs and the associated limitations (such as the admittedly "pitifully poor performance" of Prevayler when the OS runs out of RAM) leads me to think that a hybrid solution is probably the most logical for larger projects.
For instance: let's say I have an online shop that sells books. To try to cache the entire book catalog in the same memory space as the app server and application is just eating up resources that most of the time aren't used (most people will likely be interested in one set of books way more than another). This (as well as full-text searching, etc.) is where I think you're still going to need an RDBMS. But why does that mean you can't store the "top 1000" selections in RAM, as well as, say, the customer information (so they can quickly log in)? This is all hypothetical, but I guess what I'm trying to say is that I can see a lot of instances where you would want to load and unload a subset of data into something like Prevayler. Why does it have to be one or the other? Isn't there room for the two to peacefully coexist? I'm convinced it's fast, but not convinced that for a number of applications it wouldn't grow unmanageably large and eventually cause OutOfMemoryErrors or suffer from very slow performance. How about storing a rolling set of commands and archiving the previous ones to the database? Does the Prevayler team see any use for hybrid approaches? Has the Prevayler team done any testing to determine and recommend strategies for developing very large catalogs or other similar applications? Simply saying "your modeling sucks" really doesn't solve this problem.
I think all of this worrying about infinitely large datasets being stored in memory is a bit on the ridiculous side... Of course you wouldn't want to store hundreds of gigabytes (or more) in memory on a single machine. That wouldn't be cost effective, and since in this game it is all about cost vs. benefit it would be almost ludicrous to spend the kind of money to store a very large system in main memory.
That being said, there are many options for large data sets (as Joe said). One is using an RDBMS or ODBMS. Another is a "roll your own" solution maintaining your "business" data in memory and your "non-business" data in a slower-to-access medium (the file-system, tape, shelf 552 box 27, etc). Prevayler serves a purpose in that it frees projects that need a simple persistence mechanism from having to worry about having an external database and communicating with it.
--Brian S. Lloyd-Newberry
Why was this site unavailable for almost 24 hours after it was posted on slashdot? That doesn't sound like a highly responsive data store.
How foolish. This guy incorrectly assumes the Prevayler site runs on Prevayler and draws his conclusions from that. --KlausWuestefeld
Where's my relational algebra? (e.g. http://www.cis.ohio-state.edu/~gurari/course/cis670/cis670Ch4.html) Prevayler does not match the capabilities offered by a relational database. I'd say you're comparing apples to oranges.
Yes, it is comparing apples to oranges. You keep using your RDBMS. My clients' average database size is 1-50mb. Prevayler is absolutely perfect for that.
They also don't mind the fact that development time is cut in half. - Sean
A highly responsive data store also needs a highly responsive pipe to pump data through. The slashdot flooding effect doesn't always mean that the server is unable to handle the load; sometimes it's the internet connection that can't.
This site does not run on Prevayler, but yes, Prevayler actually raises the performance bar on all other components in your architecture. --KlausWuestefeld
I'm really loving Prevayler, although I must make one comment... I wouldn't use it for a project where the DB size was going to be over 4gb. The comment made is "How much does RAM cost? Can you afford <insert amount here> for 32gb of RAM?" The rebuttal to that is "Yes, I can afford the RAM. No, I cannot afford the server to stick it in." A server which can hold 32gb of RAM isn't an intel-based machine, not one you buy at your local computer market anyway :)
Until 64-bit machines become commonplace, and desktops/servers in our SMEs are > 4gb standard, I will only use Prevayler for smaller apps. (Hell, some of my "big" apps only use 50mb of RAM for the DB anyway, so it's not really much of an issue.)
How does Prevayler deal w/ an evolving object model? For instance, say I have a Person class w/ two fields: 'firstName' and 'lastName'. I persist an object of this class using Prevayler. Down the road I update Person to have a 3rd field: 'middleName'. What will happen when I try to pull the older instance out of Prevayler (the one w/o 'middleName') and cast it to my updated version of Person (the one that now has 'middleName')?
My understanding is that Prevayler uses simple object serialization for taking snapshots. When I tested this (w/o Prevayler... just serialized to a text file) I got an "InvalidClassException: Person, local class incompatible: stream classdesc serialVersionUID = -5507413414685982915, local class serialVersionUID = -8332091257996572949"
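That exception is standard Java serialization behavior, independent of Prevayler: when a class declares no serialVersionUID, the JVM computes one from the class's shape, so adding a field or method changes the computed value and breaks stream compatibility. The usual fix, sketched below with an illustrative Person class, is to pin serialVersionUID before the first snapshot; compatible changes such as adding a field then deserialize cleanly, with the new field left at its default.

```java
import java.io.Serializable;

// Pinning serialVersionUID tells the serialization machinery that old and
// new versions of this class are compatible. A field added later
// (middleName) is simply left null when reading an old stream.
class Person implements Serializable {
    private static final long serialVersionUID = 1L; // fixed forever
    String firstName;
    String lastName;
    String middleName; // added later; old snapshots leave it null
}
```

Adding a toString() method would not even change a computed serialVersionUID's inputs in every case, but pinning the value makes the question moot for all method additions and for compatible field additions.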
One issue that necessarily happens in a prevalence system is that the operations have to be completely serialized, which creates a potential bottleneck at the prevalence layer that a persistence system doesn't have.
With Prevayler, only one command can be executing in the entire system at a given time. Many DBMSs can update distinct rows concurrently - and pretty much all of them can at least update different tables concurrently. Prevayler requires each command to be completely written to the transaction log and then completely executed before the next one can start. This could be pipelined a little (once the first is written and begins executing, the write for the next could start), but that's as far as it can go in an automated environment.
On the whole, it probably doesn't matter because of the improved performance you get from getting rid of the object/relational mapping code, but it's something to consider.
Running two systems in parallel, with the "hot" system *not* writing the transaction log could maybe improve this, too.
Just some random thoughts --- Scott Brickner
As of 1.03, transactions are logged in parallel on several files. You forget that databases have EXACTLY the same thing: the redo log. Databases still have to write to the rollback segment and to the actual data-blocks (the DBWR process in Oracle, for example). They can do that in parallel, as you mentioned, but Prevayler doesn't even have to do any of that. The Prevayler equivalent is the snapshot, which can be taken at an off-peak moment or even on a replica. See InstantaneousTransactions
Very enthusiastic about the concept of Prevalence. Tested Prevayler last night and coded a working demonstration up in 30 minutes. One question:
If you are coding nearly any real-world system, your business objects are likely to change fairly frequently. I coded a small business object and populated my system, then brought down the system and added a toString() method to the object. When I tried to re-populate the system, serialization complained, I presume because the object it was trying to de-serialize to was not the same as the object it had serialized from (my business object had changed). Any advice on avoiding this?