The Motivation Behind Adaptive Analytics and CEP

This is a continuation of The Genesis of Complex Event Processing: Asymmetric Capabilities and of CEP, Event Noise and Asymmetric Event Processing, where I have been discussing the motivation behind CEP and adaptive analytics in cyberspace.

Around the same time that Professor Luckham and his team were working on CEP applications in network management and security management, I was leading efforts to build network and security management control centers for the United States Air Force. In the beginning, dating back to 1994, my Internet-related work was for Air Combat Command (ACC), working out of ACC headquarters at Langley Air Force Base.

In 1997, I led a technical team that developed countermeasures against an actual distributed Internet-based attack on the Langley AFB SMTP email infrastructure. This attack was documented in a technical paper, E-Mail Bombs and Countermeasures: Cyber Attacks on Availability and Brand Integrity, IEEE Network Magazine, Vol. 12, No. 2, pp. 10-17, March/April 1998. In addition, this attack and the countermeasures I designed were featured in a 1998 Popular Science Magazine article, War.Com, and in other news channels. I also published a number of related papers on this topic.

Our team used a rule-based approach for countermeasures against massive email bomb attacks on the Langley Air Force Base email infrastructure. We called this rule-based system BombShelter, and it was written in Perl. I developed both the original software architecture and the original working prototype for BombShelter (in two days) and then turned the software over to our team, who used the rule-based approach for daily attack countermeasures.

I watched for days, and then weeks, as my team designed rules, and the attackers wrote new attacks that circumvented the rules. Some folks in the Pentagon used to say that I “led the effort to fight the first war in cyberspace”. It might have been the first cyberwar, I am not sure, but it was certainly the first publicly documented cyberwar. There is no doubt about this.

Without getting into all the historical footnotes and significance of this cyberwar that was fought with experts and rule-based systems, I would like to jump to an important conclusion.

Rule-based systems are useful, but have limited functionality and scalability in most complex event processing applications.

Rule-based systems are human resource intensive because they cannot learn and adapt on their own; humans learn and then write new rules. This is how rule-based systems work.
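The maintenance loop described above can be illustrated with a minimal sketch. This is not BombShelter (which was written in Perl); the `Rule` class, the `classify` function, the message fields, and the `example` domains are all hypothetical, chosen only to show why a fixed rule set misses a slightly changed attack until a person writes a new rule.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class Rule:
    """A hand-written rule: a label plus a predicate over a message (hypothetical shape)."""
    name: str
    matches: Callable[[dict], bool]


def classify(message: dict, rules: list) -> Optional[str]:
    """Return the name of the first rule that matches, or None if no rule fires."""
    for rule in rules:
        if rule.matches(message):
            return rule.name
    return None


# Rules written by an analyst after observing the first wave of an attack.
rules = [
    Rule("blocked-sender", lambda m: m["sender"].endswith("@attacker.example")),
    Rule("oversized", lambda m: m["size"] > 1_000_000),
]

# The original attack is caught by the existing rule set.
known = {"sender": "bot@attacker.example", "size": 500}

# The attacker changes the sender domain; no existing rule fires.
variant = {"sender": "bot@other.example", "size": 500}

known_result = classify(known, rules)
variant_result = classify(variant, rules)

# The system cannot adapt by itself: a human has to notice the miss and add a rule.
rules.append(Rule("blocked-sender-2", lambda m: m["sender"].endswith("@other.example")))
variant_result_after_update = classify(variant, rules)
```

Every change on the attacker's side costs another round of human analysis and rule writing, which is the scaling problem this post is concerned with.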

This is the motivation behind why I spent a lot of time searching for new, more efficient and adaptive methods as alternatives to rule-based systems. After extensive research, I published a series of papers on the future of intrusion detection in the Internet. One of them, Intrusion Detection Systems & Multisensor Data Fusion - Creating Cyberspace Situational Awareness [1], helped lead an evolution in Internet security, particularly in the area of network-based intrusion detection systems (IDS).

In my published research work, motivated by limitations with rule-based approaches, I used the same mature functional model that is used to process missile attacks, control global air traffic, and other complex event processing applications in physical space; but I applied these concepts to cyberspace.

Around the same time, Professor Luckham and others were working on similar problems, all related to real-time detection and response to threats in cyberspace.  They were also funded by the US government.

Sidebar: Stream processing of transaction-based systems (databases), another area of interest, was focused on a totally different problem, which was low latency straight-through processing in database-oriented systems. These stream processing systems were, and remain, rule-based systems. The problems we were trying to solve in cyberspace, however, cannot be efficiently and pragmatically solved by rule-based systems alone. Only relatively simple scenarios can be efficiently detected by rule-based stream processing systems.

The vast majority of complex event processing classes of problems require rules plus advanced algorithms that can learn and adapt in real-time. I know this, not from reading papers or taking university classes on rule-based systems, but from working on some very challenging operational problems in real-time. This is why I remain interested in complex event processing and why I continue to elaborate on why rule-based systems have limitations.

Proxy Caches are a Challenging Threat to Internet Security

Proxy caches, combined with poorly written session management code, can easily lead to serious security flaws similar to what we highlighted in A New Security Breach in Google Docs Revealed.

Web developers have no control over proxy caches in the Internet. However, developers do have control of the code they write and their admin teams have configuration control of their web servers. Developers must assume the worst case Internet scenario with aggressive Internet cache management policies that serve cached data for economic and performance reasons.

As a consequence, this fact-of-life on the Internet sometimes results in multiple web clients being sent the same Set-Cookie HTTP headers, for example. Caching proxy servers should obtain a fresh cookie for each new client request. Ideally, proxy caches should not cache session management cookies and distribute cached cookies to multiple clients. However, application developers cannot assume that proxy caches are well behaved, especially for applications where security and privacy are required.
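One server-side precaution is to mark every response that carries a session cookie as uncacheable. The sketch below is only an illustration: the helper names `no_cache_headers` and `build_login_response_headers` are hypothetical, and the headers are returned as plain tuples rather than tied to any particular web framework. `Cache-Control: no-store` and `private` are standard HTTP directives, but a misbehaving proxy may ignore them, so this reduces the risk rather than removing it.

```python
def no_cache_headers():
    """Headers asking shared caches not to store or reuse this response."""
    return [
        # Standard HTTP/1.1 directives: do not store, and treat as client-specific.
        ("Cache-Control", "no-store, private"),
        # Legacy directive, included for older HTTP/1.0 intermediaries.
        ("Pragma", "no-cache"),
        # An invalid date such as "0" is treated as already expired.
        ("Expires", "0"),
    ]


def build_login_response_headers(set_cookie_value):
    """Combine a Set-Cookie header with the no-cache headers (hypothetical helper)."""
    headers = [("Set-Cookie", set_cookie_value)]
    headers.extend(no_cache_headers())
    return headers


# Example: headers for a response that issues a session cookie.
headers = build_login_response_headers("sid=abc123; Path=/")
```

Because the application still cannot verify how an intermediary behaves, these headers belong alongside encrypted transport and careful session handling, not in place of them.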

Web developers cannot know whether their content is consumed directly or via a proxy cache. Developers also cannot assume that the HTTP responses will be delivered to the intended browser. Moreover, developers cannot be sure that the intended browser even receives the intended content.  For example, a session ID issued to a client gets used while it is valid or until abandoned and expired. If it is served and delivered in response to an unencrypted HTTP GET request, there’s no guarantee it will be consumed by the intended web browser.

Ideally, SSL should be used on all web transactions that require confidentiality and privacy, including those involved in our recent Google Docs breach. On the other hand, even SSL is not foolproof. For example, many web developers do not correctly set the “Encrypted Sessions Only” cookie property (the Secure cookie attribute). Cookies issued by these incorrectly configured “secure” servers can then travel in the open, unencrypted, over plain HTTP.
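The “Encrypted Sessions Only” property corresponds to the Secure attribute of a cookie. As a minimal sketch, Python's standard `http.cookies` module can build a Set-Cookie value with that attribute set. The function name `make_session_cookie` and the cookie name `sid` are assumptions made for illustration, not part of any application discussed here.

```python
from http.cookies import SimpleCookie


def make_session_cookie(session_id):
    """Build a Set-Cookie header value for a session cookie (hypothetical helper)."""
    cookie = SimpleCookie()
    cookie["sid"] = session_id
    # Secure: the browser should only return this cookie over HTTPS.
    cookie["sid"]["secure"] = True
    # HttpOnly: keep the cookie out of reach of page scripts.
    cookie["sid"]["httponly"] = True
    cookie["sid"]["path"] = "/"
    # OutputString() gives the header value without the "Set-Cookie:" prefix.
    return cookie["sid"].OutputString()


header_value = make_session_cookie("abc123")
```

Without the Secure attribute, a browser will attach the same cookie to any plain HTTP request for the same host, which is the exposure described above.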

There be dragons …


Note: Reposted from the (ISC)2 blog.

OWASP AppSec Asia 2008: Proxy Caches and Web Application Security

Back to travelling a bit, I have accepted an invitation from Wayne Huang, Chapter Leader, OWASP Taiwan,  to give the following presentation at OWASP AppSec Asia 2008, October 27 - 28, 2008, in Taipei:

Proxy Caches and Web Application Security

Abstract: Proxy caches, combined with poorly written session management code, can easily lead to serious Internet security breaches. Web application developers cannot know whether their content is consumed directly or via a proxy cache. Developers cannot assume that the HTTP responses will be delivered to the intended browser. Moreover, developers cannot be sure that the intended browser even receives the intended content. Consequently, proxy caches are a serious threat to web application security. In the presentation, we will discuss the recent security breach Tim found in Google Docs and review web application security and session management topics related to proxy caching.

Here is the link to the conference schedule (mostly in Chinese).

Modelling The Global Financial Meltdown

Yesterday I received a call from Penny Grosman, Senior Editor, Wall Street & Technology. Penny was interested in my opinion on the question, “Will risk management applications be the next killer app for CEP” on Wall Street. I enjoyed talking with Penny. She caught up with me leaving a tailor’s shop in Chiang Mai, so I hope she did not mind hearing my stories of buying unique Northern Thai cotton fabric and designing my own casual shirts in the economic downturn.

We read many stories on the net where folks claim that the current financial crisis could have been avoided with more or better use of technology. This is expected, as software companies and IT professionals will often try to piggyback their business development strategy on the “crisis of the day” to sell more goods and services. Honestly, in this current situation, the main technology that we needed was simple, accurate financial models.

For example, in the chart above, the US economy was doing quite well with US federal funds rates low. Housing prices in the US were skyrocketing and there was a concern about inflation. There was an understandable concern about the sustainability of that economy.

So, in perhaps one of the most ill-advised Federal Reserve actions of many decades, the folks at the helm of the Fed decided to raise their lending rates around 500 percent over a two-year period.

As we all know, primarily because of the action by the Fed, the world faces perhaps the worst economic disaster in modern times, while the US Executive Branch and the Congress fight over how to spend $700 billion of taxpayer money to inject liquidity into the markets to try to head off a global financial disaster.

It is amazing to me that the US Federal Government and its advisors do not have simple financial models with cause-and-effect analysis such as:

  • Homeowners with adjustable rate mortgages will not be able to make payments; and
  • Housing prices will fall dramatically; then
  • Homeowners will default on loans where the collateral is worth much less than the loan; and
  • Banks will suffer great losses; and
  • Lending will come to a halt; then
  • Banks will collapse; then
  • Wall Street will exit the markets in panic
  • … and more trouble …!!

There is, and continues to be, a lot of discussion and opinion about how risk management needs improvement, and I agree. We will also read folks talk about how technology can be used to help solve this problem, including CEP/EP and related software (see also Capital Market CEP Fantasy Land). However, as much as I would be pleased to see more CEP/EP applications and use cases, I do not believe that event processing technology is really very useful to solve the core problem of the current financial crisis.

The core problem is, seemingly, that our “financial experts” do not even have simple models that will illustrate what will or could happen when you raise the Fed lending rates 500 percent in two years in an economy pregnant with adjustable rate mortgages.

To me, this does not appear to be rocket science.  The negligence by the US Federal Reserve and their advisors is astonishing.

CEP, Event Noise and Asymmetric Event Processing

In The Genesis of Complex Event Processing: Asymmetric Capabilities I introduced the abstract concept of “asymmetric processing capabilities” to describe the foundations of complex event processing.   If you take a few moments to review the first CEP projects from Stanford University, you will see that the application of CEP was toward  solving myriad asymmetric event processing problems in distributed networks.    These applications included challenging problems such as:

In each of the CEP application examples above, the amount of event information available to software developers can be staggering; however, despite all the available information, the capability to sense-and-respond to threats and opportunities is crude, at best.

Folks who work in network and security management, for example, are bombarded with event information. However, this deluge of event information is, for the most part, “noise” that is difficult to understand. In network management one of the most difficult things to accomplish is to find the root cause of an outage or performance problem. This is why researchers at Stanford were funded to focus on research topics such as (above) the Analysis and Debugging of Distributed Systems.

These are the classes of asymmetric event processing problems that define complex event processing, or CEP. Processing events by mediating events, routing events, or running a rule-set against events and making a processing decision are all perfectly valid event processing applications. However, the core reason to have “complex event processing” is to solve event processing problems where there exists a significant asymmetry between the deluge of “event noise” (Professor Luckham called this phenomenon the “event cloud”) and detecting business-relevant, actionable complex events in a climate of uncertainty and noise.

In my next post on this topic I will briefly review the motivation behind my 1999 ACM paper, Intrusion Detection Systems and Multisensor Data Fusion, where we were working on solving complex distributed security challenges based on real-world experiences with the problems of asymmetric processing capabilities. I will discuss why we evolved from an early rule-based expert system model to a more advanced inference model that was not dependent solely on rule-based thinking. I will also explain why other researchers and developers experienced in complex event detection applications have come to the same conclusion.

The Genesis of Complex Event Processing: Asymmetric Capabilities

More often than not, folks working in the field of complex event processing do not truly understand CEP. We often see the same folks try to position and mischaracterize CEP as business process orchestration, business process management, event-driven architecture or even an evolution of service-oriented architecture. Though well-intended, this mischaracterization of CEP is often for sales and marketing purposes. However, sometimes the mischaracterization of CEP is from a lack of understanding of what CEP was designed to accomplish. These mischaracterizations have very little to do with the original intent of complex event processing.

Originally, researchers in CEP were not trying to solve a problem of streaming data or streaming events. Often we read this mischaracterization by folks in the database/streaming domain, as they were focused on the low latency processing of streaming events. A natural extension of this research has been stream processing software (often called “engines”) that process streaming data with continuous queries, for example market data feeds for algo-trading or best market order execution. This mischaracterization is partly responsible for why we see many order processing applications in market data stream processing mislabeled as “complex event processing” applications.

The genesis of complex event processing was not the stream processing need for “feeds and speeds” but the processing capability to solve what can be characterized as the “problem of asymmetric capabilities”. The term “asymmetric” has been used in the military domain. For example we often hear the term “asymmetric warfare.” However, in general the concept of “asymmetrical processing capabilities” is the true genesis for CEP and related processing concepts and domains. It is this genesis that distinguishes CEP from EDA, SOA, SOR, and so many other technology-oriented concepts.

In order to illustrate what I mean by “asymmetrical processing capabilities” we will take the example of the evolution of rocketry. In the early days, scientists learned how to make and launch rockets, I assume with gunpowder and similar chemical compounds. Over many years the application of rocketry advanced much faster than the ability to understand the situations created in the sky. In other words, folks could fill the skies with rockets long before they had the capability to track and identify (or sense and respond to) the rockets in real time.

Therefore, the concept of “asymmetrical processing capabilities” describes the situation where a pair of capabilities, such as “launch a rocket” and “sense-and-respond,” is asymmetric in nature. In other words, the capability to launch multiple rockets creates an asymmetric situation where it is easy to launch rockets, but hard to detect and defend against those launches.

The same concept can be applied to everyday air travel.   If we could only fly airplanes, but did not have the capability to track the planes, understand situations in airspace, and then respond to changing situations, air travel would be quite difficult.   Lucky for us, the global traveller, there is symmetry in the capabilities to build and fly aircraft and the capabilities to detect, track and follow the evolving situations in the sky.

The genesis of CEP was to solve the problem of asymmetry in cyberspace, or if you prefer, distributed data networks. The folks who identified, early on, the problems associated with asymmetry in cyberspace were folks working in the field of network and security management. This is because there has been, and is currently, a great asymmetry between the capabilities to “launch a process or transaction” in cyberspace and the capabilities to detect and track what is going on in the same domain.

In my next post on this topic, we will go into some details of this asymmetry and review the first CEP projects from Stanford University in the context of asymmetric processing capabilities in cyberspace.

The 10 Top Cybersecurity Threats for 2008, AMCHAM & OWASP Thailand

Last year, in collaboration with IT security experts from (ISC)2 and the LinkedIn professional network, I published The Top Ten Cybersecurity Threats for 2008. In a joint meeting with interested AMCHAM Thailand guests from the Open Web Application Security Project (OWASP) Thailand Chapter, we will review the 2008 top 10 cybersecurity threats and facilitate an open discussion on these threats, including how these cybersecurity threats could impact AMCHAM members. The presentation will be at the J. W. Marriott on October 21, 2008 (details to follow).

CEP, Politics, and Decision Making

I have changed my mind about injecting presidential politics into The CEP Blog.  I thought about linking complex events and politics into a discussion on complex events and the decision making process.

However, this approach risks alienating folks who take their politics seriously or have other concerns. For that reason, I am going to go another, less political, direction on The CEP Blog. I will not blog on the US presidential election here.

In my next series of blog posts I will discuss how asymmetric event processing and asymmetric situational awareness were the genesis for complex event processing.

Plan-based Complex Event Detection across Distributed Sources

Here is an interesting 2008 paper, Plan-based Complex Event Detection across Distributed Sources.

Abstract

Complex Event Detection (CED) is emerging as a key capability for many monitoring applications such as intrusion detection, sensor-based activity & phenomena tracking, and network monitoring. Existing CED solutions commonly assume centralized availability and processing of all relevant events, and thus incur significant overhead in distributed settings. In this paper, we present and evaluate communication-efficient techniques that can efficiently perform CED across distributed event sources.

Our techniques are plan-based: we generate multi-step event acquisition and processing plans that leverage temporal relationships among events and event occurrence statistics to minimize event transmission costs, while meeting application-specific latency expectations. We present an optimal but exponential-time dynamic programming algorithm and two polynomial-time heuristic algorithms, as well as their extensions for detecting multiple complex events with common sub-expressions. We characterize the behavior and performance of our solutions via extensive experimentation on synthetic and real-world data sets using our prototype implementation.

IDC: TIBCO Leads Fast-Growing CEP Space

Quote from TIBCO Press Release (see reference below):

TIBCO Software Inc. (Nasdaq: TIBX) continued its market leadership in the fast-growing Complex Event Processing (CEP) space, according to a new report from IDC. TIBCO thus marked another year as the undisputed CEP leader, with a market share of 40.2 percent, twice the share of its closest competitor, while experiencing 52 percent year-over-year growth, according to the IDC study.

“CEP is the fastest-growing segment of the global event-driven middleware market,” according to Maureen Fleming, director of IDC’s BPM and middleware research program. “We expect this growth to continue as enterprises build event-driven applications that, in essence, act as real-time navigation systems for business.”

Reference (PRNewsWire):

TIBCO Leads Fast-Growing CEP Space, Says Leading IT Analyst Firm

Copyright © 2007-2008, The CEP Blog, All Rights Reserved.