Event Soup and The Story of Amaldo

Continuing with our discussion on Complex Systems and CEP, let’s turn our attention, momentarily, to more scientific, or perhaps philosophical, discussions. Let’s review a bit of chaos theory via the Lorenz effect and talk about the “event soup”, a phrase I shamelessly coined in On the Maturity of CEP.

Edward Lorenz was using a computer model to simulate weather when he rounded .506127 to .506 and the result was a vastly different weather scenario. This motivated Lorenz and others to discuss, mathematically, how small, seemingly insignificant events can have profound effects over time. Most of you have heard of “The Butterfly Effect”, the popular notion that a butterfly flapping its cute little wings can have a dramatic effect on weather patterns far away (in space and time); this is a metaphor for the Lorenz effect. Let’s ground this concept with a similar metaphor, The Story of Amaldo.

The Story of Amaldo.

A man promised his wife that he would stop smoking. Worried about his wife’s nagging, the man sneaks out the back door and walks down an alley behind his urban home. When he stops to light his cigarette his eyes meet the eyes of a small alley cat. The frightened alley cat runs under the building into a coven of alley cats who scatter in many different directions. We follow one particular cat who jumps up on a nearby ledge of a neighbor’s bedroom window. At the same time, a man and a woman are making love; the woman sees the cat in the window, turns, and accidentally knocks over a lamp. Her lover, overly aroused and a bit drunk, gets angry and complains how she always ruins his mood. They argue and there is no love making that night. It so happens that because they did not make love and conceive a child, their son was never born, a son who would have fathered another son, Amaldo, a great scientist who would have discovered a cure for a devastating disease. Many people died because Amaldo was not born into this world.

For lack of a more precise definition, let’s call Amaldo’s story causality, or simply cause-and-effect. One of the factors that makes causality complex is that causality is vast and deeply inter-related. For example, in our event-scenario above, we only followed one cat and a bit of the causality of that one cat’s journey and how it affected Amaldo. It is easy to see that the cause-and-effect of events increases exponentially over time in most circumstances. Unfortunately, I don’t have a formal mathematical model at hand to provide support for this claim.

Yet, without formal proof, I think most readers will agree that the universal set of events grows exponentially with each microsecond or nanosecond. Every decision you make, every choice you pick, every mouse click, every stock transaction, is an event which can (and does) affect many lives. Each decision you make follows the same principles as the Lorenz effect. Each event also follows the same principles. Naturally, some events are more “influential” or “consequential” than others and therefore have a larger effect. My apologies for the lack of formality in this part of the discussion. I hope this truth is self-evident without formal proof.
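The rounding sensitivity itself, at least, is easy to demonstrate numerically. The sketch below is only an illustration and is not Lorenz's weather model: it swaps in the logistic map with r = 4, a standard textbook chaotic system, and starts it once from .506127 and once from the rounded .506. The function name `logistic_orbit` and the step count are my own assumptions for the example.

```python
def logistic_orbit(x0, steps, r=4.0):
    """Iterate x -> r * x * (1 - x) and return every state visited.

    The logistic map is a stand-in chaotic system (an assumption for this
    sketch), not the weather model Lorenz actually ran.
    """
    orbit = [x0]
    for _ in range(steps):
        x = orbit[-1]
        orbit.append(r * x * (1.0 - x))
    return orbit


full = logistic_orbit(0.506127, 60)   # "full precision" starting value
rounded = logistic_orbit(0.506, 60)   # same value rounded to three decimals

# Distance between the two runs at every step.
gaps = [abs(a - b) for a, b in zip(full, rounded)]

for step in (0, 5, 10, 15, 20, 30):
    print(f"step {step:2d}: gap = {gaps[step]:.6f}")
print(f"largest gap over the run: {max(gaps):.6f}")
```

The two runs start about one ten-thousandth apart, yet within a few dozen iterations they bear no resemblance to each other, which is the same qualitative behavior the butterfly metaphor describes.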

Events in computer networks have similar non-linear consequences. The Lorenz effect was based on a simple computer rounding error. Professor Luckham attempted to describe this in his formal CEP model as the “event cloud.” It turns out Professor Luckham’s formal model of his event cloud, based on POSET theory, was in the right direction, albeit too simplistic. For event processing, and in particular complex event processing, we might be better off if we think of the “event soup” versus the “event cloud”.

When we think about events in electronic networks, some events have an obvious profound effect on our world. Some events, similar to the Lorenz effect, are seemingly insignificant and have a great effect as well. No matter how we view this, the sheer volume of events is increasing exponentially over time, today. This trend will continue indefinitely unless something very dramatic happens and our planet freezes over or some other global disaster happens!

Note: An interesting question, not addressed here but obviously important, is this: will the exponentially increasing event soup effect a future global disaster?

There is some interesting research to be done on applying complexity theory to events and event processing.  The very nature of the expanding event soup demands that other researchers look at complexity, causality, Lorenz effects and more, in the field of event processing.

We need more formal models in this area, and I suggest to you in this post that there is a future Nobel Prize winner out there who will build formal [complex event] processing models that will help us get our minds around the ever-growing complexity and space and time causality of the exponentially expanding electronic event soup.


CEP in the 1960s: Air Traffic Control

Professor Luckham wrote about CEP and the future of global Air Traffic Control (ATC) in The Future Event Driven World: Global Air Traffic Management. One of the first commercial applications of complex event processing was in the early 1960s in the field of commercial aviation; for example, see the history of Air Traffic Control.

Although experimental use of computers in ATC had begun as early as 1956, a determined drive to apply this technology began in the 1960s. To modernize the National Airspace System, the FAA developed complex computer systems that would replace the plastic markers for tracking aircraft. Instead, controllers viewed information sent by aircraft transponders to form alphanumeric symbols on a simulated three-dimensional radar screen. By automating some routine tasks, the system allowed controllers to focus on providing separation. These capabilities were introduced into the ATC system during the ten years that began in 1965.

Applying a phrase like “complex event processing” to a subset of software on the market today certainly does not negate all the CEP applications that existed long before the phrase was coined or became popular.  Global ATC has been defined as a future use case for CEP.  Obviously, early ATC history, where processing complex events goes back as far as the early 1960s, is quite significant to our understanding of CEP/EP.

Event processing applications, including complex event processing applications, have been around for over 40 years. There have been four decades of both commercial and military event processing and CEP applications. ATC is only one example of myriad historical commercial applications of complex event processing.


Note: This post was adapted from my post in the Earliest applications of commercial CEP.


Complex Systems and CEP

A complex system is defined as a system composed of related components that as a whole exhibit one or more properties not obvious from the easily observed properties of the individual parts. This is certainly true of the CEP notion of the “event cloud” in network systems. A modern energy or telecommunications network is composed of many systems and, to the casual observer, the properties of the individual events seem disconnected and unrelated. Making sense of the myriad seemingly disconnected components and causal relationships in the “event cloud” is the purpose of complex event processing.

The complexity of a networked system may take one of two forms: disorganized complexity and organized complexity. In a nutshell, disorganized complexity is the situation of a very large number of disparate, but possibly related, components. Organized complexity is the situation of systems exhibiting emergent properties. Examples of complex systems include biological systems such as ant colonies, cells, living things, human beings and nervous systems. In fact, many systems of interest to humans are complex systems, including natural systems (climate, for example), biological systems (ant colonies, social networks) and man-made complex systems (telecommunications networks).

Complex event processing is the machine-machine and machine-human process of trying to make sense out of difficult to observe network-centric situations, both opportunities and threats, inherent in networked systems, as discussed in our eight part series, What is Complex Event Processing?

CEP is, therefore, like a type of network-centric microscope. For example, when we look at a complex system such as a lake with our naked eyes, we are only able to observe a fraction of the system properties. We see the surface of the lake, surface plant life and fish, and perhaps a boat with a fisherman. We see the clouds and the sun and other related complex systems. However, with our naked eyes we cannot understand the complexity of the vast majority of activity in the lake. Nor can we understand, with our naked eyes, the relationships between seemingly disconnected complex systems.

The same is true of man-made network systems.   With our naked eyes we see routers, hubs, switches,  and servers.  We see log files and computer screens.   We see performance graphs and visual results of constructed, but limited, queries.  However, we cannot see the myriad causal relationships in the network with our naked eyes.  What is required is the capability to process the events within the complex system.   That capability is not a single technology.  That capability is what we call “complex event processing”,  or CEP.

Earlier I posed the question, What Defines Complexity in Rule Processing? and received no responses from the many rules experts who are kind enough to frequent this corner of cyberspace.  In addition, I asked the rhetorical question, Should We Simply Rename CEP BRMS? because the capabilities of the self-described CEP vendors tend to mirror the capabilities of business rule management systems (BRMS), relatively speaking.

What is required in the evolution of our critical understanding of telecommunications and data networks as a complex system is far beyond what we are seeing in the self-described CEP commercial marketplace. What customers require are the capabilities to process complex events in and from complex systems. This capability is what we call “CEP”, described in The Genesis of Complex Event Processing: Asymmetric Capabilities, CEP, Event Noise and Asymmetric Event Processing and The Motivation Behind Adaptive Analytics and CEP.


The Genesis of CEP Confusion

Opher Etzion responds to the ongoing confusion with On basic classification of terms. First of all, there has been confusion in the CEP/EP community since the term “CEP” was coined, so the confusion is nothing new. Second, one of the main sources of confusion is the Event Processing Technical Society (EPTS), chaired by Opher. The EPTS definition of “complex event” is at the heart of the confusion, as follows:

Complex event: An event that is an abstraction of other events called its members.

The above definition is way, way, way too broad (did I repeat “way, way, way” enough?). In fact, the definition is so amazingly broad, it is just about meaningless. Using the EPTS definition of “complex event”, simply aggregating two or more events creates a “complex event”. Moreover, using the EPTS definition, simply counting events results in a “complex event” because counting events creates another abstraction, called the sum of the members.
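To see how little it takes to qualify, here is a minimal sketch that reads the EPTS wording literally. The class and function names (`Event`, `ComplexEvent`, `count_events`) are hypothetical, invented for this illustration, and do not come from the EPTS glossary or from any vendor product.

```python
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Event:
    """A plain event: a name plus an optional payload (hypothetical model)."""
    name: str
    payload: Any = None


@dataclass
class ComplexEvent(Event):
    """Literal reading of the definition: an event that is an abstraction
    of other events, called its members."""
    members: List[Event] = field(default_factory=list)


def count_events(events):
    """Counting is already an abstraction of the members, so under the
    literal definition the result is a 'complex event'."""
    return ComplexEvent(name="count", payload=len(events), members=list(events))


tick_a = Event("trade", {"symbol": "ABC", "price": 10.0})
tick_b = Event("trade", {"symbol": "ABC", "price": 10.5})
counted = count_events([tick_a, tick_b])
print(counted.name, counted.payload, len(counted.members))
```

Nothing in this sketch detects a pattern, infers causality, or correlates anything, yet its output satisfies the definition as written.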

The genesis of the confusion should be really obvious to everyone. Because the EPTS has chosen to define “complex event” so broadly as to classify just about everything under the sun as a “complex event”, the root cause of the confusion is not in “elephants and blind men” as Opher likes to say, it is simply an overly broad definition. Defining a “complex event” as just an abstraction of other events is impossibly broad. It is the genesis of the confusion, without a doubt.

The same problem exists with the EPTS definition of the term “complex event processing”, below:

Complex-event processing (CEP): Computing that performs operations on complex events, including reading, creating, transforming or abstracting them.

Basically the folks in the EPTS, including the steering committee vendors, academics and analysts, have created all of the confusion, because they chose (for marketing purposes, I assume) to define “complex event” and “complex event processing” in such a broad and all encompassing way. In a nutshell, the terms “complex event” and, in turn, “complex event processing” have almost no actionable meaning, because they mean just about everything.

The fact of the matter is that the EPTS has confused the market because, under the EPTS definition, the simplest operation mathematically possible on two events defines a “complex event” and “complex event processing.”


Disclaimer: See also - EPTS: Proposed Event Processing Definitions, September 20, 2006


An Update on the Political Situation in Thailand

I am sure everyone has read the big news in Thailand.  Anti-government protesters have shut down the two main airports in Thailand for around 5 days, leaving the country in a state of chaos.   The current political crisis has been ongoing for a number of years and there is no possible end in sight.

The situation is both very complex and very simple.   In the simplest terms, the conflict in Thailand is a struggle against corruption, and in particular, the common practice in Thailand of buying votes.    Many elected politicians are in office because their campaign was able to buy more votes than the other side.  This, unfortunately, is the long standing political situation in Thailand, especially rural Thailand.

Ironically, however, both sides of the conflict cry out that their side represents “democracy” in a country where a US style of democracy is impossible because of the many poor people that live in rural areas.   The social economic conditions in Thailand make vote buying a political reality.    For that reason, Thailand suffers from a never ending social-political power struggle.

Because governments are formed in Thailand based on an underlying serious social problem (vote buying), governments formed in Thailand do not have the same political power as in a country where votes are not purchased. Some would argue that vote buying exists, indirectly, in the US, but at a much higher level of abstraction, for example lobbying, media dominance and big business influence. However, in the final analysis, most would agree that direct vote buying, providing money directly to people in exchange for their vote, especially the rural poor, undermines the core principles of a democratic society.

Many of us studied political science in high school or in our universities.   In our poly-sci classes we learned that, for democracy to work effectively, the electorate must be well educated, engaged and of a satisfactory social-economic condition.   Unfortunately, in Thailand, those elements of democracy do not yet exist nationwide.  For this reason, there is an epic political and class struggle in the Kingdom.

As a foreigner, the internal political affairs are none of my business.  I am only an observer who happens to be a friend of the Thai people and their nation.    Unfortunately, the business climate in Thailand continues to deteriorate for everyone, both Thais and foreigners.     The extreme behavior of shutting down the two major airports in Bangkok has left over 100,000 people trapped in Thailand and the social and economic losses are devastating.

The good news is that, so far, both sides of the conflict have shown considerable restraint. There has been remarkably little violence in such difficult circumstances. There are few countries where the lives and limbs of the people are worth much more than the economics of business. In most countries, pools of blood would have been in the streets if a mob of anti-government protesters closed the major airports. However, in Thailand the blood of the people is worth much more than the modern luxury of air travel and fruits of business success. This is remarkable in the modern “money-driven” society we live in.

In closing, please join me in wishing the very best and most peaceful outcome for the current situation in Thailand.   There is no easy political solution.   We can only hope that cool heads and hearts maintain the peace as much as possible under such difficult circumstances.


Note: For a different perspective, see CNN’s Explainer: Thailand’s political crisis


Quintessential Event Processing: Signature Versus Anomaly Detection

Detection experts understand that the optimal detection design and architecture is generally a combination of both signature and anomaly detection engines. In event processing, signature detection involves the real-time pattern matching analysis of events. A core advantage of signature detection is that basic pattern matching models are easy to understand and develop when you know exactly what pattern you are looking for. Designers then use a pattern, often called a signature, that searches for exact strings within an event object to detect, track, or observe an object of interest. Pattern matching can be executed very quickly, efficiently and inexpensively on almost all computing platforms today.

However, signature detection engines have weaknesses. Generally speaking, signature detection engines can only detect known patterns based on a posteriori models. An a posteriori signature must be created for every pattern under observation. Therefore, unknown or modified patterns and situations will generally go undetected; hence, in practice, signature detection engines can suffer from both false positives and false negatives.
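A minimal sketch of exact-string signature matching, and of how a trivially modified pattern slips past it, might look like the following. The signature names, the patterns, and the sample events are all hypothetical, made up for this illustration rather than taken from any real rule set.

```python
# Hypothetical signature table: name -> exact string to look for in an event.
SIGNATURES = {
    "sql-injection": "' OR 1=1",
    "path-traversal": "../../",
}


def match_signatures(event_text, signatures=SIGNATURES):
    """Return the names of all signatures whose exact string occurs in the event."""
    return [name for name, pattern in signatures.items() if pattern in event_text]


known_attack = "GET /login?user=admin' OR 1=1"
modified_attack = "GET /login?user=admin' OR 2=2"   # same intent, one character changed

print(match_signatures(known_attack))     # the known pattern is caught
print(match_signatures(modified_attack))  # the variant matches nothing
```

The lookup is cheap and easy to reason about, but every new variant needs its own entry in the table, which is exactly the scaling problem described in this post.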

Pattern matching works well when detecting signatures in a known, deterministic model.  However, pattern matching does not work well against changing, self-modifying or adaptive behavior.   In addition, signature detection is made more difficult by advanced techniques that attempt to conceal the “real” pattern by generating easy to detect “decoy” patterns.   An example of this deception is chaff, small metal objects used to deceive signature-based radar detection.   The same type of deception is easily created in event processing networks.  Furthermore, the overall capability of a signature-based engine to scale upwards against adaptive and deceptive behavior is constrained by the fact that a new signature must be created for each variation, and as the rule set grows, the detection engine performance decreases.

In threat detection, signature-based detection often reduces, in practice, to a race condition between the malicious user (the threat) and the signature developers where the advantage goes to the threat because malicious users can develop new threats faster than new detection signatures can be written, tested and deployed to the detection engine.

On the other side of the detection coin, anomaly detection focuses on the concept of baselining normal behavior and detecting variations from the baseline.    The baseline is learned and/or specified by system designers.  Detected situations in an anomaly detection engine are created by any situation that falls outside the predefined boundary of the anomaly detection model.
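As a rough sketch of the baselining idea, the example below learns a mean and standard deviation from observations assumed to be normal and flags anything more than three standard deviations away. The function names, the three-sigma boundary, and the sample numbers are assumptions chosen for illustration; real engines typically use far richer models.

```python
from statistics import mean, stdev


def learn_baseline(samples):
    """Learn a very simple baseline: mean and sample standard deviation."""
    return mean(samples), stdev(samples)


def is_anomalous(value, baseline, threshold=3.0):
    """Flag values that fall outside mean +/- threshold * standard deviation."""
    mu, sigma = baseline
    return abs(value - mu) > threshold * sigma


# Hypothetical training data: logins per minute during normal operation.
normal_logins_per_minute = [48, 52, 50, 47, 53, 51, 49, 50]
baseline = learn_baseline(normal_logins_per_minute)

for observed in (51, 95):
    verdict = "anomalous" if is_anomalous(observed, baseline) else "normal"
    print(f"{observed} logins/minute looks {verdict}")
```

No signature for "95 logins per minute" was ever written; the deviation from the learned boundary is what triggers the detection.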

A key component of baselining in anomaly detection is the capability of the detection engine to detect deviations from situational models at many different layers. This means that anomaly detection engines are initially computationally expensive. However, one trade off is that anomaly detection engines tend to scale better than signature detection engines as the event data-set grows. As a result, designers often see fewer false positives in anomaly detection.

A known disadvantage of anomaly-detection engines, similar to signature detection, is the difficulty of predefining rules.   Even more challenging, the detailed knowledge of normal baseline situations must be constructed and transferred into the engine memory for accurate detection.  However, once a robust baseline has been established and normal behavior or situational pattern defined, anomaly detection engines tend to scale more quickly and easily than signature-based engines because a new signature does not have to be designed, tested and uploaded for every new variant that comes along.

Detection experts know that the optimal detection design is generally a combination of both signature and anomaly detection engines.  This is one reason that I tend to be critical of rule-based signature detection “only” in the CEP/EP market place.   Because anomaly detection engines tend to be adaptive, learning systems, the current trend is for anomaly detection engines based on statistical learning algorithms such as artificial neural networks or dynamic Bayesian networks.

In other words, it is well established in detection systems design that the optimal approach for most non-trivial detection-oriented problems is a combination of both signature and anomaly detection engines. If you are buying a “CEP engine” and the engine is only capable of signature or pattern detection across a sliding time window, you will be using a suboptimal detection architecture. This is a well established systems engineering principle in detection-oriented systems design.


Apologies: RSS Feed Working Again

My apologies for not discovering the error sooner, but for most of November our RSS feed was broken. The problem was an embedded object, which I have deleted. I think the problem has been fixed; however, if you find any additional problems with our RSS feeds, please let me know.


CEP as a Service (CEPaaS) with MapReduce on Amazon EC2 and Amazon S3

Just as I was starting to worry that the complex event processing community had been captured by RDBMS pirates off the coast of Somalia, I rediscovered a new core blackboard architecture component, Hadoop.

Hadoop is a framework for building applications on large commodity clusters while transparently providing applications with both reliability and data motion.  Hadoop implements  Map/Reduce, where an application is divided into many small components of work, each of which may be executed or re-executed on any node in the cluster.
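To make the map and reduce steps concrete, here is a single-process sketch of the programming model in plain Python. It does not use the Hadoop API (Hadoop itself is Java), and the function names and sample log records are assumptions for illustration only; on a real cluster each map and reduce call could run, or be re-run, on any node.

```python
from collections import defaultdict


def map_phase(record):
    """Map: turn one raw event record into (key, value) pairs."""
    event_type = record.split()[0]
    yield event_type, 1


def shuffle(pairs):
    """Shuffle: group all values by key, as the framework would between phases."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse all values for one key into a single result."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle and reduce in one process to show the data flow."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical event log: "<event type> <user>".
event_log = ["login alice", "logout alice", "login bob"]
print(run_job(event_log))
```

Because each map call looks at one record and each reduce call looks at one key, the work splits into small independent pieces, which is what lets the framework spread it over a commodity cluster.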

There are a number of great articles on implementing Hadoop in the Amazon Elastic Computing Cloud (EC2), including this one, Running Hadoop MapReduce on Amazon EC2 and Amazon S3.  Hadoop provided the core component that permits a distributed agent-based architecture to become a manageable, simple-to-use service.   This, in turn, provides a framework, as a service, for solving complex distributed computing problems.

Another good article to read is Taking Massive Distributed Computing to the Common Man - Hadoop on Amazon EC2/S3. There is also a nice article on the Amazon EC2 on the Hadoop Wiki.

It is interesting to note that if you Google around you will find that the same RDBMS folks who have been hyping the term “complex event processing” are some of the most vocal Hadoop critics. Reading further, however, you will see that most of the critical comments by the RDBMS crowd have been answered. It is very interesting to see the same debate in the MapReduce community as in the CEP community; the difference, of course, is that the MapReduce community is much larger than the CEP community.

However, there should be no doubt in anyone’s mind that MapReduce and the Hadoop implementation provide a way to accomplish CEP.  It is very refreshing to see this emerging CEP architecture on the rise.

Stay tuned for much more information related to MapReduce and CEP.


CEP by Apache Mahout via the Google MapReduce Framework

MapReduce is a software framework implemented in C++ with interfaces in Python and Java introduced by Google to support parallel computations over large (multiple petabyte) data sets on clusters of computers.  The Apache  Hadoop project is a free open source Java MapReduce implementation.  Mahout is an Apache project, based on Hadoop, with an objective to build scalable, Apache-licensed machine learning libraries.

The Mahout team is initially focused on building the ten machine learning libraries detailed in Map-Reduce for Machine Learning on Multicore by seven members of Stanford’s computer science department. These libraries, some, if not all, critical for “real” complex event processing, include:

  1. Locally Weighted Linear Regression (LWLR),
  2. Naive Bayes (NB),
  3. Gaussian Discriminative Analysis (GDA),
  4. k-means,
  5. Logistic Regression (LR),
  6. Neural Network (NN),
  7. Principal Components Analysis (PCA),
  8. Independent Component Analysis (ICA),
  9. Expectation Maximization (EM), and
  10. Support Vector Machine (SVM)
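As one example of how an algorithm from this list fits the map/reduce shape, here is a sketch of a single k-means iteration: each mapper computes per-centroid partial sums and counts over its own chunk of data, and a reducer combines the partials into new centroids. This is plain Python, not Mahout's API, and the function names and toy data are assumptions for illustration.

```python
def nearest(point, centroids):
    """Index of the centroid closest to point (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def map_chunk(chunk, centroids):
    """Mapper: partial coordinate sums and counts per centroid for one chunk."""
    partial = {}
    for point in chunk:
        i = nearest(point, centroids)
        total, count = partial.get(i, ([0.0] * len(point), 0))
        partial[i] = ([t + p for t, p in zip(total, point)], count + 1)
    return partial


def reduce_partials(partials, centroids):
    """Reducer: add up the partial sums and divide to get the new centroids."""
    updated = []
    for i, old in enumerate(centroids):
        totals, count = [0.0] * len(old), 0
        for partial in partials:
            if i in partial:
                chunk_total, chunk_count = partial[i]
                totals = [t + x for t, x in zip(totals, chunk_total)]
                count += chunk_count
        # A centroid that attracted no points keeps its old position.
        updated.append(tuple(t / count for t in totals) if count else tuple(old))
    return updated


def kmeans_step(chunks, centroids):
    """One k-means iteration expressed as a map over chunks plus one reduce."""
    return reduce_partials([map_chunk(chunk, centroids) for chunk in chunks], centroids)


# Toy data split into two chunks, as if stored on two nodes.
chunks = [[(0.0, 0.0), (0.0, 2.0)], [(10.0, 10.0), (10.0, 12.0)]]
start = [(1.0, 1.0), (9.0, 9.0)]
print(kmeans_step(chunks, start))
```

Only the small per-centroid sums travel between the map and reduce steps, never the raw points, which is why this style of algorithm parallelizes well.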

Ready to move beyond rules and rule-based systems to process complex events? Interested folks should visit the Apache Mahout Wiki.

Note: See also, Map-Reduce: A Relevance to Analytics?


Comprehensive Misinformation on Evaluating ESP Engines

Folks are worried about the future of CEP.

Vendors have spun so much misinformation around the term “CEP” that this three letter acronym (TLA) has begun to have little meaning other than to reflect a confusing web of solutions overhyped around a few relatively simple stream processing engines, used primarily in financial services. Frankly speaking, in the fields where real-time detection is critical and very difficult, for example network and security management, we can’t find compelling selection criteria for any of these first generation ESP engines. Many, in fact, have begun to express a preference for software solutions in more common and widely supported programming languages (like Perl, Java or C++) versus a proprietary vendor engine and its single-vendor language.

We used to have a bit of faith in the site, Complex Events, because we enjoyed reading topics like how CEP can be used in “future world” event processing scenarios like global air traffic control or complex weather monitoring.    Unfortunately that site still provides no details on how to accomplish the grand vision of “CEP” other than to permit marketeers to use the term “CEP” as they see fit, for better-or-for-worse, mostly for worse.  The fact of the matter is that ESP vendors are using “CEP” in ways that have almost zero to do with the complex set of detection-oriented problems CEP was originally funded (by the US military) to address.   We certainly can’t find any compelling criteria to justify proprietary ESP engines, especially since we are not into algo trading, order routing, or simple rule-based compliance problems.  And, to make matters more worrisome,  I have been reading a number of discussions from algo trading and order routing experts who have the same doubts.

It is no secret that solution architects (like me) are becoming disappointed in where things are headed in the CEP space. When I first read Coral8’s Comprehensive Guide to Evaluating Event Stream Processing Engines over two years ago, I thought little about it, quite frankly, and basically dismissed it as Coral8 marketing material. Opher Etzion wrote a great critique, On Evaluation Criteria for EP Products. I will not repeat Opher’s excellent critique in this post. Responding to event processing misinformation, as I pointed out in an earlier post, is taking way too much time. I will say that it is even more disappointing to have to respond to misinformation that is almost three years old. Why, nearly three years after this marketing propaganda was published, did someone feel inclined to publish it again as front page news?

Admittedly, I am worried about the future of CEP.  The recent thread on the CEP Forum,  What’s the difference between a Blog and a Forum? did not help, frankly speaking.   Maybe I am wrong, but it seems to me we don’t really need to be discussing the differences between forums and blogs in a CEP discussion forum.    We have bigger fish to fry, as they say.  Furthermore, does Professor Luckham really need to publish three year old marketing collateral from Coral8 as front page news on the Complex Events “vendor neutral” site?

It is no secret: folks are worried about the future of complex event processing.
