Ideas on Enterprise Information Systems Development

This blog is devoted to ideas on Enterprise Information Systems (EIS) development. It focuses on Lean Thinking, Agile Methods, and Free/Open Source Software as means of improving EIS development and evolution, from a practical rather than academic point of view. You may find here a lot of "thinking aloud" material, sometimes without scientific treatment... don't worry, this is a blog!
Every post is marked with at least one of the Product or Process labels, meaning that it is related to execution techniques (programming and testing) or management techniques (planning and monitoring), respectively.

Monday, March 28, 2011

Why Open Source for EIS?

Short Answer
It is safer. And cheaper. And auditable...

A bit longer answer
I could stop here because every IT professional has heard this at least once in their life, but companies still rely a lot on Windows on both the client and server sides. Today I read in a big Brazilian newspaper that the USA's Department of Defense estimates that 55,000 new computer viruses appear every day. Guess which operating system these viruses aim for?

If you are not convinced, I will summarize some figures I found here (as of June 2010):
• Windows:
-1 to 1.2 billion MS-Windows virus signatures
-Only 1 million signatures are checked by an antivirus => 0.1% can be scanned
• Others:
-270 Apple Mac (OS-9 or older) viruses
-No known Apple Mac OS-X viruses as of June 2010
-60 Linux viruses (of older distributions)
-No known Linux viruses as of June 2010
• And more:
-55,000 new Microsoft Windows malware per day (as of March 2010)
-Compare to 22,000 daily average in 2008 and 40,000 daily average in 2009.

Why do people still use Windows and MS-Office in companies? The answer lies in legacy: legacy technologies, and a terrible asset found in many enterprises, IT people with legacy minds.

Sorry for being so direct and a bit rude, but I challenge anyone to prove that, unless recent heavy investments justify it, Windows and Office cannot be replaced by Linux and Libre Office in most enterprise areas. Libre Office does 95% of what Office does, and you will probably find few users in your organization who really need the other 5%. Some other software, especially graphical tools, is not Linux-friendly in general; however, even in companies with design bureaus - people who use CAD and such - you will again find many users who don't need Windows. And still, why use MS-Office?

The IT Legacy Minds
I think that, with the figures above, it is very hard to justify "scientifically" the use of Windows in most organizations, even when we talk about previous investments. Think of the money spent on Windows, Office, and antivirus licenses. Add up the cost of repairing infected machines, the loss of information, the resulting productivity losses, and the disclosure of private data, and you will see that the cost of training people to use Linux and Open Office is much, much lower.

A real problem is legacy systems written to run on fat Windows clients. In the case of business systems, this is justifiable if they are more than 10 years old; if not, it is strange that your developers or your software supplier haven't developed them for the Web, given that one of the things that became clear after the Y2K problem was that fat-client information systems are much more expensive to maintain. In other words, unless some clear need for fat clients justifies it, IT people made a wrong decision.

In some specific areas, such as industrial automation, fat clients are really necessary; however, even in these cases, you can develop on Linux. I myself took part in a team that developed, deployed, and supported, from 2005 to 2009, an industrial application made in LabView for Linux. In fact, we were one of the first in the world to put PXI-SCXI hardware running Linux into production, and we did it for an offshore oil production plant - a quite severe environment.

Therefore, I feel free to conclude that the main barrier is really the lack of IT professionals able to deal with Open Source solutions, and this problem is caused by the number of IT professionals who simply don't want to study new things. It seems paradoxical, but, in my opinion, IT people are quite conservative!

There are also personal interests involved: if the organization adopts an IT structure that is safer and brings far fewer security problems, maybe some people will lose their jobs (you can find a fable on this type of employee behavior here).

Conclusion
Finally, I would like to mention a post from the Agile Scot blog: Department of Defense (DoD) is #1 using Open Source for Government. As the title suggests, the DoD uses a lot of Open Source. Thus, even if your organization is bigger than the USA's DoD, do you think your systems suffer more invasion attempts than theirs?

Tuesday, March 22, 2011

Enterprise Information Systems Patterns - Part VII

Understanding how concepts are implemented
In Part V of this series, I used a UML Class Diagram to show the relationships among the framework's concepts. Although this diagram helps in understanding these relationships, being a class diagram it also "says" that all concepts are "classes", which is not true. A better way of describing these relationships is to use an ontology, as shown in Figure 1.
Figure 1: Ontology representing the relationships among framework's concepts

Operations, since they are immaterial resources, need to be realized by some active entity, in this case, by Nodes. Therefore, they are implemented as "functions" (in fact "methods"), instead of instances of a given class. Although in Python functions are also objects, one cannot define them as objects of a given business class. However, it is necessary to typify Operations somehow, so that we can identify which of an object's methods represent business operations. The solution for this is to use Python Decorators, which are not the same thing as the Decorator Pattern used in this framework. A Python decorator is a piece of language syntax that makes it more convenient to alter functions, methods, and classes.

Thus, it is possible to create a @operation(a_category) pythonic decorator which stores in method objects an attribute named category, with value a_category. In that way it is possible to mark methods as being of Operation type, which is important for querying a given class for its business skills. For instance, in a bank information system, it is possible to query a Credit_Analyst class to check which business operations objects of this type can perform. Or, the other way around, a keyword search can identify which methods of which classes can perform a given business operation. Moreover, I can create an ontology and navigate through it to check relationships among resources, nodes, and movements. With this ontology I can even suggest to a business analyst specific configurations for the system, given appropriate search terms.
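To make the idea concrete, here is a minimal sketch of such a decorator and of the two queries described above. It is not the framework's actual code: the names CreditAnalyst, operations_of, and find_operations are assumptions made only for illustration, and only methods defined directly in the queried class are inspected.

```python
def operation(category):
    """Hypothetical @operation decorator: tags a method with a business category."""
    def mark(method):
        method.category = category  # stored on the function object itself
        return method
    return mark


class CreditAnalyst:
    """Illustrative stand-in for the Credit_Analyst class mentioned above."""

    @operation('credit analysis')
    def analyse_loan_request(self, request):
        return 'analysed %s' % request

    @operation('customer relationship')
    def contact_customer(self, customer):
        return 'contacted %s' % customer

    def _internal_machinery(self):
        pass  # not a business operation: it carries no category attribute


def operations_of(cls):
    """Map method name to category for methods defined in cls and marked with @operation."""
    return {
        name: member.category
        for name, member in vars(cls).items()
        if callable(member) and hasattr(member, 'category')
    }


def find_operations(classes, keyword):
    """Reverse query: (class name, method name) pairs whose category contains keyword."""
    return [
        (cls.__name__, name)
        for cls in classes
        for name, category in operations_of(cls).items()
        if keyword in category
    ]


print(operations_of(CreditAnalyst))
print(find_operations([CreditAnalyst], 'credit'))
```

Because the decorator returns the very same function, the marked methods behave exactly as before; the category is just extra metadata that the queries read.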

Processes are sets of transformations and transportations, which in turn are used to encapsulate operations. Encapsulating means forwarding calls to the methods of the appropriate Node subclasses - in fact, to the methods of their Business Decorators - and logging all information related to these calls, such as date and time, parameters, and associated objects. Thus, Process objects implement the logic which coordinates how Nodes work. In other words, Processes implement workflows, which means, in turn, that the framework is process centric, or workflow centric.
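The forwarding-and-logging idea can be sketched as follows. This is only an illustration under simplifying assumptions: the Transformation and Process classes below, the Clerk node, and the structure of the log entries are all hypothetical, and the "workflow" is just a fixed sequence of steps.

```python
from datetime import datetime


class Transformation:
    """Hypothetical movement: forwards one operation call to a node and logs it."""

    def __init__(self, node, operation_name):
        self.node = node
        self.operation_name = operation_name
        self.log = []

    def perform(self, *args, **kwargs):
        method = getattr(self.node, self.operation_name)
        result = method(*args, **kwargs)
        # Log date and time, parameters, and the associated object.
        self.log.append({
            'when': datetime.now(),
            'node': self.node,
            'operation': self.operation_name,
            'args': args,
            'kwargs': kwargs,
        })
        return result


class Process:
    """Hypothetical process: runs its steps in order, a trivial workflow."""

    def __init__(self, steps):
        self.steps = list(steps)

    def run(self, work_item):
        for step in self.steps:
            work_item = step.perform(work_item)
        return work_item


class Clerk:
    """Illustrative node with two business operations."""

    def register(self, document):
        return document + ' [registered]'

    def approve(self, document):
        return document + ' [approved]'


clerk = Clerk()
process = Process([Transformation(clerk, 'register'), Transformation(clerk, 'approve')])
result = process.run('invoice 42')
print(result)
```

The node knows nothing about the process; all coordination and logging live in the movement and process objects, which is what makes the design process centric.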

Decorating, Decorating, and Subclassing
To make things clearer, let's call "decorators" the extension to Python syntax, and "business decorators" the classes created to decorate the framework concepts, as explained in Parts IV and V. That said, let's check how the framework is implemented and extended:
a) Resources:
-Operations: methods, implemented into business decorators and marked with @operation decorator (yes, decorators for decorators).
-Work Items (formerly known as Materials): "classical" objects, for each new type of material, a new subclass of material is created.
-Kits: "classical" objects, for each new type of kit, a new subclass of kit is created. Given that kits can be composed of Operations, which are in fact methods, this composition is implemented through a list of references to these methods.
b) Nodes:
-All three subclasses are extended through business decorators. For instance, a person can be decorated by a Developer (business) decorator to represent a software developer, or by a Credit Analyst (business) decorator to represent a bank's credit analyst. No subclasses are used. They can have methods related to @operations, to @rules of association (check Part VI), as well as methods representing internal machinery, in this last case, typically private.
c) Movements**:
-Transformations and Transportations are used as is, no subclassing, no pythonic decoration, no business decoration. They are used to build processes and store information related to operations calls, by encapsulating @operations during their configuration process.
-Processes are extended by mimicking workflow templates (current work), or by using workflow engines to make them work. A way of implementing processes is using Business Language Driven Development implemented through State Machine using decorators*.

For the reasons presented in Parts IV and V, subclassing is avoided as much as possible - it is openly suggested only in the case of Work Items (Materials), which are considered passive "data bag" classes.

*Post-publishing note #1: we decided to develop our own State Machine code, and make it independent of this framework, check the Fluidity project.

**Post-publishing note #2: Transformations and Transportations are not classes anymore, as explained in Part XI.

Friday, March 11, 2011

Enterprise Information Systems Patterns - Part VI

A DSL for Rules of Association
The use of rules of association may facilitate configuration: by querying them it is possible to check, for a given concept, which responsibilities can be assigned to it. The definition of a Domain Specific Language (DSL) for creating rules of association would be interesting, because it would allow the construction of more readable rules. In fact, a branch of should-dsl can be used. Should-DSL is a DSL for assertions, which are commonly used by coding techniques based on design by contract.

In fact, should-dsl's Predicate Matchers and some of the already available "ordinary" matchers such as respond_to,  which checks if an object has a given attribute or method, and be_instance_of, which verifies if an object is of a given type, can be used for formatting rules. For instance, let's suppose we have a decorator that is supposed to work only with persons:

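from should_dsl import should  # be_instance_of is one of should-dsl's matchers


class RuleOfAssociation(Exception):
    """Raised when a decorated object breaks a rule of association."""


class A_person_decorator:

    def __init__(self, decorated):
        ...
        # checks if decorated is compliant to the rules
        self.check_rules_of_association(decorated)
        # if it is compliant, sets a pointer to it
        self.decorated = decorated
        ...

    def check_rules_of_association(self, decorated):
        try:
            # uses be_instance_of matcher to check if decorated is compliant
            # (Person is the framework's class, defined elsewhere)
            decorated |should| be_instance_of(Person)
        except Exception:
            raise RuleOfAssociation('Person instance expected, instead %s passed' % type(decorated))

    ...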

In the example above, type checking is necessary given the dynamic nature of the Python language; however, the focus is on creating more complex rules related to the business rules.

Another interesting possibility is rule querying:
(i) For a given decorator, what are its rules of association (list its rules)?
(ii) For a given decorated object, which decorators can it use (list which responsibilities a given class can assume)?
Query results could be presented in both human and machine readable ways. In other words, there would be some mechanism for querying rules, both for automatically checking associable objects in an object base and for informing people what is needed to configure a concept to receive a given type of decoration.
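A rough sketch of these two queries is shown below. It uses plain Python predicates instead of should-dsl matchers, and every name in it (Rule, DeveloperDecorator, OperatorDecorator, rules_of, decorators_for, the badge attribute) is an assumption made for illustration rather than part of the framework.

```python
class Rule:
    """Hypothetical rule of association: a readable description plus a check."""

    def __init__(self, description, check):
        self.description = description
        self.check = check


class Person:
    pass


class Machine:
    pass


class DeveloperDecorator:
    rules = [
        Rule('decorated must be a Person', lambda obj: isinstance(obj, Person)),
    ]


class OperatorDecorator:
    rules = [
        Rule('decorated must be a Person', lambda obj: isinstance(obj, Person)),
        Rule('decorated must respond to "badge"', lambda obj: hasattr(obj, 'badge')),
    ]


def rules_of(decorator):
    """Query (i): list the rules of a decorator in human readable form."""
    return [rule.description for rule in decorator.rules]


def decorators_for(decorated, decorators):
    """Query (ii): names of the decorators whose rules the object satisfies."""
    return [
        decorator.__name__
        for decorator in decorators
        if all(rule.check(decorated) for rule in decorator.rules)
    ]


available = [DeveloperDecorator, OperatorDecorator]
person = Person()
print(rules_of(OperatorDecorator))
print(decorators_for(person, available))
```

Keeping the description next to the check is what allows the same rule to be shown to a person and evaluated by a machine.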

A funny situation is that should-dsl matchers are used in unit tests to check the correct functioning of the rules... written in should-dsl.

The next post will describe some more details that decorators need to implement.

Wednesday, March 9, 2011

Enterprise Information Systems Patterns - Part V

Decorators X Subclasses
In the previous post, I redefined the level of abstraction for the framework, and now it is time to discuss a bit about Decorators. In order to facilitate the communication, a simple UML Class Diagram is provided by Figure 1, representing the structure after this redefinition.

Figure 1: UML Class Diagram of the new structure

In Figure 1, each of the (now) three abstract concepts (resource, node, movement) has two "opposite" subclasses (operation X material, person X machine, transformation X transportation) and one aggregator subclass (kit, organization, process). It is important to note that  aggregators are Composite objects.

By (re)checking the Decorator pattern documentation, both on Wikipedia and on the classic book from Gamma and colleagues, we can find that:
- While subclassing adds behavior to all instances of the original class, decorating can provide new behavior, at runtime, for individual objects. At runtime means that decoration is a "pay-as-you-go" approach to adding responsibilities.
- Using decorators allows mix-and-matching of responsibilities.
- Decorator classes are free to add operations for specific functionalities.
- Using decorators facilitates system configuration, however, typically, it is necessary to deal with lots of small objects.

Therefore, by using decorators it is possible, during the realization of a business process, to create an object, associate and/or dissociate different responsibilities to it - in accordance with the process logic - and log all this. In that way, I have two main benefits:
i) The same object, with the same identifier, is used during the whole business process, there is no need for creating different objects of different classes.
ii) Given (i), auditing is facilitated, since it is not necessary to follow different objects; instead, the decoration of the same object is logged. Moreover, it is possible to follow a single object during its whole life-cycle, including through different business processes: after an object is created and validated - meaning that it reflects a real-world business entity - it will keep its identity forever.

Summarizing...
Thus, the benefits of using Decorators are:
i) More dynamic and flexible enterprise systems, through the use of configuration and pay-as-you-go features.
ii) Easier auditing, given the fact that objects keep their class and identification while getting new responsibilities.

An example
To better understand this, I will use a simple example: a teacher & researcher career. Let's suppose that in our institution we have two kinds of teachers, the 20-hour and the 40-hour. While the first is supposed only to teach, the second is also a researcher, and therefore holds more responsibilities. Provided there is a vacancy, a teacher can change from one category to the other. Since both types are "teachers", the ordinary object oriented solution would be to create a basic class named Teacher, which would hold the common features of the teaching career, and two subclasses, named 20-hour Teacher and 40-hour Teacher. With this architecture, I can see no simpler solution than creating objects of the different classes and copying attributes back and forth every time someone changes his/her option of working 20 or 40 hours.

Moreover, imagine that teachers can also be assigned to administrative positions, such as department dean or university president, with a lot of new responsibilities associated, while still teaching. Keeping track of all these assignments and withdrawals of responsibilities would be complex and error-prone. Also, I like to think that being a dean means getting a new responsibility, instead of becoming a different type of employee, especially when we think that this is a temporary assignment.

Now imagine that we have Persons and I decorate them as they are assigned to new responsibilities. In that way one can use the same object to register all career assignments: 20-hour, 40-hour, dean who still teaches, and even represent some administrative employee who teaches in a specific course - without losing his/her duties in the university's management. Teaching, researching, and administering would be decorators that could be associated and dissociated to objects as the business processes require (pay-as-you-go).

Another point is that each decorator must keep a set of rules of association, which is responsible for allowing or prohibiting objects to be assigned to responsibilities. Each "decorated" object is also responsible for keeping a record of its responsibilities (bi-directional relationship). If a given object respects the rules of association of a given decorator, it can be decorated by it, allowing a very flexible way of creating new "types" (mix-and-match).
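The teacher example can be sketched roughly as follows. This is a simplified illustration, not the framework's implementation: the class names, the dissociate method, and the rule that only someone who already teaches can be decorated as a researcher are all assumptions chosen to show the mechanics.

```python
class RuleOfAssociationError(Exception):
    """Raised when an object does not respect a decorator's rules of association."""


class Person:
    def __init__(self, name):
        self.name = name
        self.responsibilities = []  # record kept by the decorated object


class Responsibility:
    """Hypothetical base business decorator."""

    def __init__(self, decorated):
        self.check_rules_of_association(decorated)
        self.decorated = decorated
        decorated.responsibilities.append(self)  # bi-directional relationship

    def check_rules_of_association(self, decorated):
        if not isinstance(decorated, Person):
            raise RuleOfAssociationError('Person instance expected')

    def dissociate(self):
        self.decorated.responsibilities.remove(self)
        self.decorated = None


class Teaching(Responsibility):
    def teach(self, course):
        return '%s teaches %s' % (self.decorated.name, course)


class Researching(Responsibility):
    def check_rules_of_association(self, decorated):
        super().check_rules_of_association(decorated)
        # Assumed rule for this sketch: only someone who already teaches can research.
        if not any(isinstance(r, Teaching) for r in decorated.responsibilities):
            raise RuleOfAssociationError('only teachers can be researchers')


ana = Person('Ana')
teaching = Teaching(ana)        # 20-hour teacher
researching = Researching(ana)  # moves to 40-hour: same object, one more decorator
print([type(r).__name__ for r in ana.responsibilities])
researching.dissociate()        # back to 20-hour, no attribute copying needed
print([type(r).__name__ for r in ana.responsibilities])
```

The Person object is created once and never replaced; only its list of responsibilities changes as the career evolves.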

This reasoning is also valid for other concepts; for instance, a given manufacturing cell (Machine) can run new operations as new tools are attached to it. The same is valid for organizations, as new machines and persons are associated with them.

An interesting point on using decorators is when we think of business entities ruled by laws and regulations, such as contracts, government agencies, or even governmental economic measures: as regulation or environment changes, things can be added and/or deleted from the entities' scope. Rule/Law decorators can be programmed to function during a certain period of time, detaching themselves from the decorated objects when this period is over. As an example of this last case, let's think of a tax refunding mechanism that is valid only during recession periods and for specific types of sales.
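A self-detaching decorator of this kind could look like the sketch below. The Sale and RecessionTaxRefund classes, the 25% rate, and the dates are all made up for the example; a real rule would also have to check the type of sale.

```python
from datetime import date


class Sale:
    """Illustrative business entity that keeps a record of its decorators."""

    def __init__(self, amount):
        self.amount = amount
        self.decorators = []


class RecessionTaxRefund:
    """Hypothetical law decorator, valid only inside a given period."""

    def __init__(self, decorated, rate, valid_from, valid_until):
        self.decorated = decorated
        self.rate = rate
        self.valid_from = valid_from
        self.valid_until = valid_until
        decorated.decorators.append(self)

    def is_valid(self, today):
        return self.valid_from <= today <= self.valid_until

    def refund(self, today):
        """Return the refund, detaching the decorator once its period is over."""
        if today > self.valid_until:
            self.detach()
        if not self.is_valid(today):
            return 0.0
        return self.decorated.amount * self.rate

    def detach(self):
        if self in self.decorated.decorators:
            self.decorated.decorators.remove(self)


sale = Sale(1000.0)
refund = RecessionTaxRefund(sale, 0.25, date(2011, 1, 1), date(2011, 6, 30))
during = refund.refund(date(2011, 3, 28))  # inside the period
after = refund.refund(date(2011, 7, 1))    # period over: decorator detaches itself
print(during, after, sale.decorators)
```

Passing the date in as a parameter, instead of reading the system clock inside the decorator, keeps the rule easy to test and to audit.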

Of course, a problem is that we have to deal with a potentially big set of decorators. However, given the mix-and-match possibilities of decoration, this number is smaller than the number of classes that would be created to map all possible combinations. Using an extreme example, if I have three concepts that can each be assigned to four responsibilities, we would have 12 classes - possibly with the use of multiple inheritance. By using decorators we would have 3 classes and 4 decorators, or 7 entities to manage (covering the same 12 possible combinations, of course).
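The counting in this example can be checked with a few lines of Python. The concept and responsibility names are placeholders, and the count assumes one responsibility per concept at a time, as in the example above.

```python
from itertools import product

concepts = ['Person', 'Machine', 'Organization']
responsibilities = ['R1', 'R2', 'R3', 'R4']  # placeholder names

# Subclassing: one class per (concept, responsibility) pair.
subclasses = ['%s_%s' % pair for pair in product(concepts, responsibilities)]

# Decorating: the concepts plus the decorators themselves.
entities_with_decorators = concepts + responsibilities

print(len(subclasses), len(entities_with_decorators))
```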

That said, the next step is to discuss the rules of association and details for implementing decorators.

Thursday, March 3, 2011

Enterprise Information Systems Patterns - Part IV

Changing the abstraction level
As expected, the high abstraction level that comes from using only 4 concepts to define a whole system has side effects*. The basic one is that, if on one hand abstract concepts are highly reusable, on the other, they need to be extended a lot in order to become useful for representing concrete concepts, even the basic ones, such as Person or Product. I mean, a Node can be the abstract concept behind a Person object as well as a Factory object. However, you will need to provide a lot of extra code to make these concepts work. Given that I don't want to create lots of subclasses, the alternative is to use Decorators. However, in the same way, it is not good to abuse a given technique, even if this technique is a well known design pattern. What I was about to do was create a lot of decorators instead of a lot of subclasses. In other words, I would be going against my main goal, which is to create a framework extensible by configuration, instead of programming.

Therefore, I stopped a bit to rethink the framework's abstraction level. One thing I realized is that it would be very hard to configure overly abstract concepts; thus, by creating one more class level, I could find less generic but easier to reuse concepts. A side effect is that I can get rid of one more of the original concepts - Path. In other words, I narrowed the width and increased the depth of the class hierarchy. The idea now is to use three classes: Resource, representing the production resources; Node, representing the entities that transform and transport these resources; and Movement, which represents transformations and transportations of resources. Each of these abstract concepts has three subclasses, representing two "opposite" concepts and an aggregator of these first two. Thus, the structure will be like this:

1)Resource()
-Material(Resource): product, component, tool, document, raw material...
-Operation(Resource): human operation and machine operation, as well as their derivatives.
-Kit(Resource): a collective of material and/or immaterial resources. Ex.: bundled services and components for manufacturing.
Comments:
a)Alternative terms such as Object (material) and Action (immaterial) can also be considered, however, the term Object can bring a lot of trouble in programming.
b)Another point is information, which is not material, nor is it something like an operation. Document is used as the physical representation of information.
c)Structural and Behavioral would be alternatives; however, they seem too academic to be used.

2)Node()
-Person(Node): employee, supplier's contact person, freelancer...
-Machine(Node): hardware, software, drill machine...
-Organization(Node): a collective of machines and/or persons. Ex.: manufacturing cell, department, company, government.

3)Movement()
-Transformation(Movement): an "internal" movement. Ex: transforming raw material, writing a report
-Transportation(Movement): a movement of resources between two nodes. Ex: moving a component from one workstation to another, moving a document from one department to another
-Process(Movement): a collective of transformations and/or transportations, in other words, a business process.
Comments
a) There may be some confusion between Transformations, Transportations, and Operations, but they represent different things. Movements use operations to transform or to transport resources. For instance, an industrial transformation uses a given machine operation during a given period of time to transform a given quantity of raw material into a component.
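The structure listed above can be written down as a bare skeleton. This is only a sketch of the hierarchy as described in this post: the constructor parameters and the shaft-making example are assumptions added for illustration, and no behavior is implemented.

```python
class Resource:
    def __init__(self, name):
        self.name = name


class Material(Resource):
    """Product, component, tool, document, raw material..."""


class Operation(Resource):
    """Human or machine operation."""


class Kit(Resource):
    """Aggregator: a collective of material and/or immaterial resources."""

    def __init__(self, name, parts=()):
        super().__init__(name)
        self.parts = list(parts)


class Node:
    def __init__(self, name):
        self.name = name


class Person(Node):
    pass


class Machine(Node):
    pass


class Organization(Node):
    """Aggregator: a collective of machines and/or persons."""

    def __init__(self, name, members=()):
        super().__init__(name)
        self.members = list(members)


class Movement:
    pass


class Transformation(Movement):
    """An "internal" movement: a node uses an operation to turn one resource into another."""

    def __init__(self, node, operation, source, result):
        self.node, self.operation = node, operation
        self.source, self.result = source, result


class Transportation(Movement):
    """A movement of a resource between two nodes."""

    def __init__(self, resource, origin, destination):
        self.resource, self.origin, self.destination = resource, origin, destination


class Process(Movement):
    """Aggregator: a collective of transformations and/or transportations."""

    def __init__(self, steps=()):
        self.steps = list(steps)


steel, shaft = Material('steel bar'), Material('shaft')
lathe = Machine('lathe')
warehouse = Organization('warehouse')
cell = Organization('manufacturing cell', [lathe, Person('operator')])
make_shaft = Process([
    Transportation(steel, warehouse, cell),
    Transformation(lathe, Operation('turning'), steel, shaft),
])
```

Note how the Transformation refers to an Operation instead of being one, which is the distinction made in the comment above.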

The next step is to define how to extend the basic concepts to make them work properly in concrete business processes representations, which will be discussed in the next post of this series.

* As I said in my first post of this series, if you need a "real-world" and flexible Python framework, give ERP5 - from which the basic ideas for this framework were taken - a try. Remember that this framework is a didactic one, therefore, some assumptions are simplified.

(The Change Log maps this series of posts to commits in GitHub)