Ideas on Enterprise Information Systems Development

This blog is devoted to ideas on Enterprise Information Systems (EIS) development. It focuses on Lean Thinking, Agile Methods, and Free/Open Source Software as means of improving EIS development and evolution, from a more practical than academic point of view. You may find here a lot of "thinking aloud" material, sometimes without scientific treatment... don't worry, this is a blog!
Every post carries at least one of the labels Product or Process, meaning that it relates to execution techniques (programming and testing) or management techniques (planning and monitoring), respectively.

Thursday, August 23, 2012

Enterprise Information Systems Patterns - Part XIII

Why I believe that relational databases are the biggest barrier to making EIS really flexible
We are now making EIS Patterns Relational Database (RDB)-aware through the Django framework. Just as I expected, problems started to appear as soon as we needed to represent Python's dynamic nature in an RDB.

It is a fact that while the technology behind RDBs is really mature and extremely efficient and safe, the relational "way of thinking" was created in the 1970s to solve 1970s problems! In my humble opinion, although mathematically sound, the way the relational model represents whole-part, one-to-many, and many-to-many relationships is weird. Think of representing books in a relational way. You store all books in one room, while all their pages are in another room. Retrieving a given book's pages is a matter of entering the pages room and asking which pages belong to that book. The book by itself is not able to identify its pages... In other words, no matter how elegant or representative your object model is, it will be transformed into a set of unnatural, strange relationships.
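The book analogy can be sketched with Python's built-in sqlite3 module. This is only an illustration: the table names, column names, and the pages_of helper are made up here and are not part of EIS Patterns.

```python
import sqlite3

# Hypothetical schema: each page row points at its book, never the reverse.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE book (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE page (
        id INTEGER PRIMARY KEY,
        book_id INTEGER REFERENCES book(id),
        number INTEGER,
        text TEXT
    );
""")
conn.execute("INSERT INTO book (id, title) VALUES (1, 'EIS Patterns')")
conn.executemany(
    "INSERT INTO page (book_id, number, text) VALUES (?, ?, ?)",
    [(1, 1, "Introduction"), (1, 2, "Concepts")],
)

def pages_of(book_id):
    """Enter the 'pages room' and ask which pages belong to the book."""
    rows = conn.execute(
        "SELECT number, text FROM page WHERE book_id = ? ORDER BY number",
        (book_id,),
    )
    return rows.fetchall()

# Object view, for contrast: the book holds its own pages.
book = {"title": "EIS Patterns", "pages": ["Introduction", "Concepts"]}
```

In the object view, book["pages"] is enough to reach the pages; in the relational view, the book row knows nothing about them and a query against the page table is always required.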

So let's check an interesting situation where the dynamic nature of Python is used in EIS Patterns. The Process class controls the way objects collaborate to perform a given business process, wrapping and logging the execution of objects' methods through the run_activity method. For this discussion, the interesting part of this method is its return clause:
return {'actor': actor, 'arguments': execution_arguments,
        'result': activity_result, 'start': activity_start,
        'end': activity_end}

This clause returns a dictionary containing:
a) The execution actor: the Node whose method is wrapped;
b) The execution arguments: a list containing the parameters for the method;
c) The result: the result produced by the execution;
d) The start and end: the date and time at which the execution started and finished.
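To make the shape of this dictionary concrete, here is a simplified, hypothetical stand-in for run_activity. It is not the framework's actual code: the Node class, the function signature, and the group_by_parity example are assumptions made only for this sketch.

```python
from datetime import datetime

class Node:
    """Stand-in for the framework's Node; only a name is assumed here."""
    def __init__(self, name):
        self.name = name

def run_activity(actor, method, *execution_arguments):
    """Hypothetical wrapper: run the method and return a log of the execution."""
    activity_start = datetime.now()
    activity_result = method(*execution_arguments)
    activity_end = datetime.now()
    return {'actor': actor, 'arguments': list(execution_arguments),
            'result': activity_result, 'start': activity_start,
            'end': activity_end}

def group_by_parity(numbers):
    """Example method whose result is a collection of collections."""
    return {'even': [n for n in numbers if n % 2 == 0],
            'odd': [n for n in numbers if n % 2 != 0]}

log = run_activity(Node('clerk'), group_by_parity, [1, 2, 3, 4])
```

Here log['arguments'] is a list holding another list, and log['result'] is a dictionary of lists. Nothing in the wrapper knows, or needs to know, those types in advance.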

Objects in (a) and (d) are of known types (Node and DateTime), while (b) and (c) are collections of objects of unknown types, including, potentially, other complex collections. So, how to represent them in an RDB? One solution is to use descriptors to detail the dictionary, providing the following record for each returned object or method argument:


movement_oid: Movement object identifier.
return_or_argument: Boolean for marking the record as a return value or an argument.
type: Object's type. Basic types, such as numbers and strings, have their value declared; complex types are referenced.
reference_or_value: Stores basic types' values as strings, and references for complex types.
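A minimal sketch of how such descriptor records could be built. Everything here is an assumption made for illustration: the function names, the object_store list standing in for the tables or blobs that hold complex objects, the 'ref:N' reference format, and the convention that True marks a return value.

```python
BASIC_TYPES = (int, float, str, bool)

# Hypothetical stand-in for wherever complex objects end up being stored.
object_store = []

def describe(movement_oid, value, is_return):
    """Build one descriptor record for an argument or a return value."""
    if isinstance(value, BASIC_TYPES):
        stored = str(value)              # basic types: value kept as a string
    else:
        object_store.append(value)       # complex types: keep only a reference
        stored = 'ref:%d' % (len(object_store) - 1)
    return {'movement_oid': movement_oid,
            'return_or_argument': is_return,   # assumed: True means return value
            'type': type(value).__name__,
            'reference_or_value': stored}

def descriptors_for(movement_oid, arguments, result):
    """One record per argument, plus one record for the result."""
    records = [describe(movement_oid, arg, False) for arg in arguments]
    records.append(describe(movement_oid, result, True))
    return records

records = descriptors_for(7, [3, 'abc', [1, [2, 3]]], {'ok': True})
```

Note that the nested list [1, [2, 3]] collapses into a single opaque reference: the record says it is a list, but nothing about what is inside it, so reading it back needs code written specifically for that type.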

Complex objects are stored either normalized, in tables, or raw, as blobs. Retrieval requires developing a specific algorithm for each type, which can become a serious problem when dealing with a lot of Decorators. Dealing with multidimensional collections would be another serious problem. In other words, mapping this logging scheme to an RDB would make the framework inflexible and too costly to adapt. Descriptors weren't the solution.

The solution was to use JSON, although it took some investigation to find a flexible way of doing it, which I will explain in the next post.
