What is Complex Event Processing (CEP)?
CEP is a technique to analyze stream of disparate events occurring with high frequency and low latency. In Oil & Gas industry we can imagine it to be sensors data coming from drilling equipment or sensors data from upstream assembly sending information about the temperature, pressure etc. It is very important to analyze the variation patterns to get notified in real time about any change in regular assembly. CEP includes understanding the patterns across the stream of events, sub-events and their sequence. CEP helps identifying meaningful patterns, complex relationships among unrelated events and sending notifications in real and near real time to avoid any damage.
CEP includes various challenges as
There are various solutions available in the market. With Big Data technology advancements, we have multiple options like Apache Spark, Apache Samza, and Apache Beam etc. Out of which in this article we are going to talk about one upcoming framework called Apache Flink. Apache Flink is an open source platform for distributed stream and batch data processing. Flink's core is a streaming dataflow engine that provides data distribution, communication, and fault tolerance for distributed computations over data streams.
In order to build a Complex Event Processing engine, I am proposing following architecture
In above system, we will have sensors from various systems sending data to our engine in real time and the events would be persisted in Kafka queues. Kafka being reliable persistent distributed messaging framework, guarantees storage and delivery of messages. Apache Flink's stream engine would listen to the topics and then let the streams pass through. We can define the certain patterns and try to match the patterns against the stream of events. If the pattern matches, notification can be generated.
Following are some code snippets
Matching patterns with input streams
Use of CEP in Oil & Gas
There are various places where the above mentioned system can be used