Preventing Coordinated Attacks Via Distributed Alert Exchange

Attacks on information systems followed by intrusions may cause large revenue losses. The prevention of both is not always possible by just considering information from isolated sources of the network. A global view of the whole system is necessary t…

Authors: Joaquin Garcia-Alfaro, Michael A. Jaeger, Gero Muehl

Joaquin Garcia-Alfaro 1, Michael A. Jaeger 2, Gero Mühl 2, and Joan Borrell 1

1 Autonomous University of Barcelona, Dept. of Information and Communications Engineering, Edifici Q, 08193 Bellaterra, Spain {jgarcia,jborrell}@deic.uab.es
2 Berlin University of Technology, Institute for Telecommunication Systems, Communication and Operating Systems Group, EN6, Einsteinufer 17, D-10587 Berlin, Germany {michael.jaeger,g_muehl}@acm.org

Abstract. Attacks on information systems followed by intrusions may cause large revenue losses. The prevention of both is not always possible by just considering information from isolated sources of the network. A global view of the whole system is necessary to recognize and react to the different actions of such an attack. The design and deployment of a decentralized system targeted at detecting as well as reacting to information system attacks might benefit from the loose coupling realized by publish/subscribe middleware. In this paper, we present the advantages and convenience in using this communication paradigm for a general decentralized attack prevention framework. Furthermore, we present the design and implementation of our approach based on existing publish/subscribe middleware and evaluate our approach for GNU/Linux systems.

Keywords: Network Security, Attack Prevention System, Publish/Subscribe, Message Oriented Middleware, IDMEF

1 Introduction

When attackers gain access to a corporate network by compromising authorized users, computers, or applications, the network and its resources can become an active part of a globally distributed or coordinated attack. Such an attack might be a coordinated port scan or distributed denial of service attack against third party networks, or even against computers on the same network.
Both distributed and coordinated attacks rely on the combination of actions performed by a malicious adversary to violate the security policy of a target computer system. To prevent these attacks, a global view of the system as a whole is necessary. Hence, different events and specific information must be gathered and combined from various sources to detect patterns of such a distributed attack. This comprises, for example, information about suspicious connections, initiation of processes, and the creation of new files.

We already presented an attack prevention framework that is targeted at detecting as well as reacting to distributed and coordinated attack scenarios [8]. It relies on gathering and correlating information held by multiple sources. In this approach, we apply a decentralized scheme based on message passing to share alerts in a secure communication infrastructure. This way, we can detect and prevent those attacks by performing detection and reaction processes based on the knowledge gained through alert correlation. In this paper, we focus on the communication infrastructure of the attack prevention framework. We discuss the design of its constituting elements and the format of the exchanged messages. We finally evaluate a first implementation of the infrastructure deployed on a GNU/Linux system. The main motivation of our work is to foster the collaboration between the different components of a protection framework composed of security components, in order to achieve a more complete view of the system as a whole. Once this is achieved, one can detect and react to the different actions of a coordinated or distributed attack.

The structure of this paper is the following. We start in Section 2 with a discussion of related work.
Section 3 gives an introduction to the publish/subscribe communication model and elaborates on the convenience of using this model for our problem domain. In Section 4, we briefly overview the Intrusion Detection Message Exchange Format (IDMEF), which is the format we use for the exchange of audit information in our system. We then present the cooperation scheme of the system components in Section 5, including a presentation of the current state of our implementation based on xmlBlaster, an open source publish/subscribe message oriented middleware [21]. Section 6 presents first experimental results on the performance obtained with a first deployment of our implementation. We close this paper with conclusions and a discussion of future work in Section 7.

2 Related Work

Traditional client/server solutions for security monitoring and protection of large-scale networks rely on the deployment of multiple sensors. These sensors locally collect audit data and forward it to a central server, where it is further analyzed. Early intrusion detection systems such as DIDS [22] and STAT [12] use this architecture and process the monitoring data in one central node. DIDS (Distributed Intrusion Detection System), for instance, is one of the first systems referred to in the literature that uses such a monitoring architecture [22]. The main components of DIDS are a central analyzer component called DIDS director, a set of host-based sensors installed on each monitored host within the protected network, and a set of network-based sensors installed on each broadcasting segment of the target system. The communication channels between the central analyzer and the distributed sensors are bidirectional. This way, the sensors can push their reports asynchronously to the central analyzer while the director is still able to actively request more details from the sensors.
The issue of sensor distribution is the focus of NetSTAT [26], an application of STAT (State Transition Analysis Technique) [12] to network-based detection. It is based on NSTAT [13] and comprises several extensions. Based on the attack scenarios and the network fact modeled as a hyper-graph, NetSTAT automatically chooses places to probe network activities and applies an analysis of state transitions. This way, it is able to decide what information needs to be collected within the protected network. Although NetSTAT collects network events in a distributed way, it analyzes them in a centralized fashion, similarly to DIDS.

The main limitation of both DIDS and NetSTAT is that their exchange of audit data can quickly become a bottleneck due to saturation problems associated with their centralized analyzers. Their monitoring schemes are straightforward as they simply push the data to a central node and perform the computation there. Both approaches try to reduce the audit data sent over the network to the central analysis unit by filtering, i.e., removing information of no interest from the audit stream, and applying compression schemes afterwards. Unfortunately, an efficient data reduction scheme capable of forwarding only relevant data for arbitrary threat scenarios seems infeasible. Thus, those systems are not able to avoid unnecessary overhead, which may lead to an overload on the central analyzer in case too many sensors are deployed. Furthermore, having only one single analyzer also induces issues with respect to availability: if the central analyzer crashes or becomes the victim of a denial of service (DoS) attack, the whole system is completely blinded. Some approaches published later try to solve those disadvantages.
GrIDS [24], EMERALD [19], and AAfID [23], for example, propose the use of layered structures, where data is locally pre-processed and filtered, and further analyzed by intermediate components in a hierarchical fashion. The computational and network load is distributed over multiple analyzers and managers as well as over different domains to analyze. The analyzers and managers of each domain perform their detection for just a small part of the whole network. They forward the processed information to the entity that is on the top of the hierarchy, i.e., a master node which finally analyzes all the reported incidents of the system.

GrIDS (Graph-based Intrusion Detection System for large networks) is an evolution of DIDS [22] and aims at large-scale distributed systems. It performs detection of distributed scans and worms by aggregating computer and network information into activity graphs [24]. In contrast to the centralized approach of DIDS, GrIDS allows the construction of activity graphs that only represent hosts and the network activity between them. Each node of the graph represents a single host or a group of nodes, and the edges represent network traffic between nodes. The audit data of GrIDS is collected by means of both host- and network-based sensors, and then forwarded to the graph manager, which further feeds the collected information into the graph. The whole system deploys several graphs and graph managers in a hierarchical fashion in order to increase the scalability of the whole system. Therefore, each manager controls just a subset of the whole graph. Unfortunately, only few details are provided regarding the communication infrastructure for the exchange of information between components, which makes it hard to further analyze this system.
Similar to GrIDS, EMERALD (Event Monitoring Enabling Responses to Anomalous Live Disturbances) extends the work of IDES (Intrusion Detection Expert System) [16] and NIDES (Next-Generation Intrusion Detection Expert System) [1] by implementing a recursive framework in which generic building blocks can be deployed in a hierarchical fashion [19]. It combines host- and network-based sensors as well as anomaly- and misuse-based analyzers. EMERALD focuses on the protection of large-scale enterprise networks that are divided into independent domains, each one of them with its own security policy. The authors claim to rely on a very efficient communication infrastructure for the exchange of information between the system components. Unfortunately, they also provide only few details regarding their implementation. Thus, a general statement regarding the performance of their infrastructure cannot be made.

The AAfID (Architecture for Intrusion Detection using Autonomous Agents) also presents a hierarchical approach to remove the limitations of centralized approaches and, particularly, to provide better resistance to denial of service attacks [23]. It consists of four main components called agents, filters, transceivers, and monitors organized in a tree structure, where child and parent components communicate with each other. The communication subsystem of AAfID exhibits a very simplistic design and does not seem to be resistant to a denial of service attack as intended. Although the set of agents may communicate with each other to agree upon a common suspicion level regarding every host, all relevant data is simply forwarded to monitors via transceivers, and human interaction is required in order to detect distributed intrusions.
Although hierarchical approaches mitigate some weaknesses inherent to centralized schemes, they still do not avoid bottlenecks, scalability problems, and fault tolerance issues due to vulnerabilities at the root level. The first reason for this lies in the massive amount of audit data forwarded to the higher level components which cannot be reduced significantly through pre-filtering within small network domains. The second reason is the centralized root domain component which may crash or become unavailable, rendering the whole system unusable this way. In order to solve these problems with both central and hierarchical data analysis, a decentralized scheme free of dedicated processing nodes is needed.

Some decentralized message passing designs try to remove the limitations and disadvantages of centralized and hierarchical approaches identified above. Their approach of distributing the detection process has some advantages compared to centralized and hierarchical approaches. Mainly, decentralized architectures have no single point of failure and bottlenecks can thus be avoided. Some message passing designs such as CSM [27] and Quicksand [14] try to eliminate the need for dedicated elements by introducing a peer-to-peer architecture. Instead of having a central monitoring station to which all data has to be forwarded, there are independent uniform working entities at each host performing similar basic operations. To detect coordinated and distributed attacks, the different entities collaborate on the detection activities and cooperate to perform a decentralized correlation algorithm.

These designs seem to be a promising technology to implement decentralized architectures for the detection of attacks. However, the presented systems still exhibit very simplistic designs and suffer from several practical limitations.
For instance, in some of them, every node has to have complete knowledge of the system: all nodes have to be connected to each other, which can make the matrix of the connections that are used for providing the alert exchanging service grow explosively and become very costly to control and maintain. Another important disadvantage present in this design is that the different entities always need to know where a received notification has to be forwarded to (similar to a queue manager). This way, when the number of possible destinations grows, the network view can become extremely complex, limiting the scalability of this approach. Other designs are based on flooding, which makes the system easier to maintain at the cost of scalability, as the message complexity quickly grows with the number of nodes in the system.

Most of these limitations can be solved efficiently by using a distributed publish/subscribe middleware. The advantage of publish/subscribe communication for our problem domain over other communication paradigms is that it keeps the producers of messages decoupled from the consumers and that the communication is information-driven. This way, it is possible to avoid problems regarding the scalability and the management inherent to other designs by means of a network of publishers, brokers, and subscribers. A publisher in a publish/subscribe system does not need to have any knowledge about any of the entities that consume the published information since the communication is anonymous. Likewise, the subscribers do not need to know anything about the publishers. New services can simply be added without any impact on or interruption of the service to other users. In [10,9], we presented an infrastructure inspired by the decentralized architectures discussed above, with the focus on removing the discussed limitations.
In the following sections, we present further details on our work.

3 Publish/Subscribe Model

The publish/subscribe communication model implies many-to-many communication and is often implemented asynchronously. It is well suited for distributed systems [6] and often used in situations where a message (often referred to as a notification in the literature) published by a single entity is sent to multiple receivers that expressed their interests previously. Publish/subscribe systems allow for efficient and comfortable information dissemination to receivers that may have individual interests in arbitrary subsets of the messages published. In contrast to multicast communication, clients can describe the events they are interested in more flexibly than by subscribing to a multicast group (e.g., based on the contents of the notification). Clients acting as subscribers can choose to subscribe and later unsubscribe to filters matching a set of messages as time goes by, while all subscribers are independent of each other. Clients that publish notifications are called publishers.

3.1 Publish/Subscribe Systems

A publish/subscribe system that implements the publish/subscribe model consists of clients and a notification service to which the clients are connected. The latter is responsible for forwarding notifications from publishers to all interested subscribers and consists of at least one broker in a centralized implementation. For scalability reasons, it is common to implement the notification service in a distributed fashion with a broker overlay network that consists of multiple brokers that cooperate to provide the notification service. The notification service provides a distributed infrastructure for notification routing which includes the management of subscriptions and the dissemination of notifications in a possibly asynchronous way.
Clients can publish notifications and subscribe to filters that are matched against the notifications forwarded through the broker network. If a broker receives a new notification, it checks if there is a local client that has subscribed to a filter that matches this notification. If so, the message is delivered to this client. Additionally, the broker forwards the message to neighbor brokers according to the applied routing algorithm. We refer to [18] for more details on publish/subscribe systems.

An example of a basic centralized publish/subscribe system is shown in Figure 1(a). Here, five clients are connected to a single broker: three clients that are publishing notifications and two clients that are subscribed to a subset of the notifications published on the broker. Subscribers can choose to subscribe to the notifications available through the broker or cancel existing subscriptions as needed. The broker matches the notifications it received from the publishers to the subscriptions, ensuring this way that every publication is delivered to all interested subscribers.

Fig. 1. Examples of simple publish/subscribe topologies: (a) simple publish/subscribe system; (b) extended pub/sub system.

This very basic publish/subscribe setup can be extended by connecting multiple brokers (cf. Figure 1(b)), enabling them to exchange messages. The extended design allows subscribers on one of the brokers to receive messages that have been published on another broker, further freeing the subscriber from the constraints of connecting to the same broker the publisher is connected to. Most available implementations make this transparent for the programmer by keeping the same interface operations as in the centralized design.
This way, an application can easily be distributed. In Figure 2, for instance, we show a distributed publish/subscribe topology, where a client p publishes a notification n that is matched by a filter F that client s subscribed to. The notification service then takes care of forwarding the notification properly over the links drawn as solid lines.

Fig. 2. Example of a distributed publish/subscribe topology.

Regarding the subscriptions, clients are able to formulate their interests based on the contents of the notifications or a special attribute they carry. Topic-based publish/subscribe systems represent the first variant of the publish/subscribe communication model. Here, publishers publish messages with respect to a topic and subscribers specify their interest in a topic and receive all messages published on this topic. Topic-based subscriptions are, in turn, easier to handle than content-based subscriptions. Since topics can be seen as groups in group communication [20], topic-based subscription may efficiently be built on top of a group communication mechanism such as, for example, IP multicast [5]. Thus, topics are equivalent to channels. An extension of the topic-based approach is subject-based publish/subscribe. Taking this approach, it is possible to arrange topics in a hierarchy (subject tree) such that subscriptions not only match notifications if the topics are the same, but also if the topic of the subscription is an ancestor of the notification topic in the subject tree. In this case, a subject becomes equivalent to a theme. In type-based publish/subscribe, notifications are equivalent to objects, which allows for an easier integration into object-oriented programming languages. Furthermore, it is possible with this approach to support multiple inheritance (depending on the programming language).
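The subject-based matching rule described above (a subscription matches if its topic equals the notification topic or is one of its ancestors in the subject tree) can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that topics are written as "/"-separated paths; the function name and the topic names are hypothetical and do not come from any particular middleware.

```python
from typing import List


def _path(topic: str) -> List[str]:
    """Split a "/"-separated topic into its non-empty path segments."""
    return [segment for segment in topic.split("/") if segment]


def subject_matches(subscription_topic: str, notification_topic: str) -> bool:
    """Return True if the subscription topic equals the notification topic
    or is one of its ancestors in the subject tree."""
    sub_path = _path(subscription_topic)
    note_path = _path(notification_topic)
    # An ancestor's path is a prefix of the descendant's path.
    return note_path[:len(sub_path)] == sub_path


# Hypothetical subject tree: alerts -> network -> portscan
same_topic = subject_matches("alerts/network", "alerts/network")
ancestor = subject_matches("alerts", "alerts/network/portscan")
descendant = subject_matches("alerts/network/portscan", "alerts/network")
```

Under this encoding, a plain topic-based system would correspond to requiring the two paths to be equal, whereas the subject-based variant only requires the subscription path to be a prefix.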
Content-based publish/subscribe systems allow filters to work on the content of notifications. This way, in content-based selection the structure of a subscription is not restricted to a topic or a subject; it can be any function over the content of a notification. A subscription can, thus, be formulated in an extremely fine-grained way based on the content of notifications, using a query language that can be arbitrarily complex. Moreover, there does not need to be a system-wide agreement on the set of topics, as is practically required for topic-based routing. Content-based subscriptions usually depend on the structure of the message. This can be binary data, name/value pairs, semi-structured data, or even programming language classes containing executable code. A subscription is often expressed in a subscription language that specifies a filter expression over messages.

For our work, we use content-based subscription over messages with semi-structured data. We propose the use of XML for the structure of a message as well as the application of XPath as the subscription language to specify filter expressions (cf. Section 5). In the following, we give an overview of the main properties of the format built on top of the XML structure of our messages.

4 Representation of Messages

In order to exchange audit information in a standard manner, two main specifications have been considered in our work. The Common Intrusion Specification Language (CISL), on the one hand, was initially proposed to allow the components of the Common Intrusion Detection Framework (CIDF) to exchange data in semantically well-defined ways [7]. The Intrusion Detection Message Exchange Format (IDMEF), on the other hand, was proposed by the IETF's Intrusion Detection Exchange Format Working Group (IDWG) to accomplish similar purposes [3].
Our approach is based on the IDMEF format for three main reasons. First, this format is the basis for the similarity operator used in the aggregation and fusion phases of our alert correlation approach presented in [8]. Second, there is a significant number of tools and implementations based on the IDMEF format, such as [17], which reduces the effort of integrating it into our work. Third, the exchange of messages between the components of our framework is compliant with the intrusion detection framework proposed by the IDWG. Besides that, IDMEF allows the specification of messages generated by different network security components, such as firewalls and network intrusion detection systems (NIDSs), and it can be extended to incorporate additional data, such as diagnoses and counter-measures, inside the proposed format.

Up to now, IDMEF is an internet draft approved by the IESG (Internet Engineering Steering Group) as an IETF RFC (Request For Comments). It is represented in an object-oriented fashion. The class hierarchy of IDMEF has been represented by using the Extensible Markup Language (XML). The rationale for choosing XML is explained in [3], as well as some examples of using IDMEF to describe IDS alerts and the associated IDMEF Document Type Definition (DTD). Although one may still find the current version of IDMEF defined by using DTDs, the authors also offer a new definition that uses XML Schemas instead of DTDs.

Fig. 3. The IDMEF's message class.

In Figure 3, we show the two main types of messages supported by IDMEF: heartbeats and alerts. Heartbeats, on the one hand, are periodic messages between components, in order to inform each other that they are operational.
Alerts, on the other hand, carry audit information, such as the component that produced it, the classification of the detected activity, the source and target ports related to this activity, and other optional data. In the following, we discuss the main properties of IDMEF's alert class, regarding aspects relevant to our work such as determining the component which created the message, the time in which the message was created, and the kind of activity the message is pointing out.

We start by giving an overview of the analyzer class which identifies the component from which the message originates. Only one component is encoded for each message, i.e., the one from which the message originated. The class is composed, in turn, of three aggregate classes: node, which includes information about the node on which the component resides; process, which holds information about the process in which the component is executing; and analyzer, which carries information about other components which, in turn, forwarded the original information.

The rationale behind the recursive aggregation of component's references within the IDMEF's analyzer class is that when a component receives an IDMEF alert and wants to forward it to another component, it needs to substitute the original component information with its own since, as we pointed out above, just one component is encoded for each message. This way, and in order to preserve the original component information, it may be included in the new component definition as a reference to the previous component. This mechanism will allow component path tracking. The class analyzer has eight attributes: analyzerid, name, manufacturer, model, version, class, ostype, and osversion.
The manufacturer, model, version, and class attributes' contents are vendor-specific, but may be used together to identify different types of components. The ostype and osversion attributes' contents are, respectively, the operating system name and the operating system version in which the component's process is executed. Finally, the analyzerid and name attributes' contents provide, respectively, the unique identifier and the explicit name for the component in the system.

Regarding the timestamps of a message, the IDMEF standard defines the following three different classes to represent time: (1) CreateTime, which is the time when the message is created by a component; (2) DetectTime, which is the time when the event or events that caused the creation of a message were detected; (3) AnalyzerTime, which is the time when the original component forwarded this message. The final object for each instance contains information such as the number of seconds since the epoch, the local GMT offset, and the number of microseconds. Even though all three timestamps can be provided by each component when generating a message, just the one defined by the CreateTime class is considered mandatory by the IDMEF standard.

The classes source and target contain, respectively, information about the possible origin and destination of the events that motivated the generation of the message. An event may have more than one source (e.g., a distributed denial of service attack) or more than one target (e.g., a port sweep). Both the source and target classes are composed of information about the node, the user, the process, and the network service that motivated the message. The target class includes, moreover, a list of affected files.
Referring to their attributes, both source and target classes have the following two common attributes: (1) ident, which is a unique identifier for either the source or target class; (2) interface, which may be used by a component with multiple interfaces to indicate which interface this source or target was seen on. Furthermore, the class source includes the attribute spoofed, which indicates whether the source is, as far as the component can determine, a spoofed address. Similarly, the class target includes the attribute decoy, to indicate whether the target is, as far as the analyzer can determine, a decoy.

The classification class contains the name of the event that motivated the creation of a message, or other information which allows the components to determine what the message is pointing out. It is composed of one aggregate class, the class reference, which contains information about external documentation sites that will provide background information about such an event. Similarly, the assessment class is used to provide the component's assessment of an event, and it is composed of information about the impact, actions that may be taken in response, and a measurement of the confidence the component has in its evaluation of the event.

Finally, the IDMEF's alert class can be augmented with additional information by means of the aggregate classes AdditionalData, CorrelationAlert, ToolAlert, and OverflowAlert. The information aggregated by those classes is often useful in order to associate different messages pointing out similar activities (and reported by different components), as well as to extend the standard IDMEF model with additional features, such as complex data types and relationships. The AdditionalData class, first, includes information that does not fit into the IDMEF's data model.
This may be an atomic piece of data, or a large amount of data. The CorrelationAlert class, second, may include additional information related to the correlation process in which this message is involved. The OverflowAlert and ToolAlert classes, third, include, respectively, information related to buffer overflow attacks, and information related to the use of attack tools or other malevolent programs (e.g., trojan horses, rootkits, and so on).

5 Communication Infrastructure

In this section we give an overview of the operational details of the communication infrastructure presented in [8,10]. As our motivation is not targeted at developing a new publish/subscribe system, we try to reuse as much available code and tools as possible. For our experiments (cf. Section 6) we used xmlBlaster, an open source publish/subscribe message oriented middleware [21]. It connects a set of nodes that build up the infrastructure for exchanging alerts using the interface operations offered by the underlying middleware. Each xmlBlaster message consists of a header to which filtering can be applied, a body, and a system control section. The body of an xmlBlaster message is formulated using the IDMEF format (cf. Section 4). Filters are XPath expressions that are evaluated over the message header to decide if a message has to be delivered to a subscriber. We discuss the essential interface operations offered by xmlBlaster in the following section.

5.1 Interface Operations

Conceptually, the alert communication infrastructure offered through xmlBlaster can be viewed as a black box with an interface (cf. Figure 4). It offers a number of operations, each of which may take a number of parameters. Clients can invoke input operations from the outside, and the system itself invokes output operations to deliver information to clients.
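As a rough illustration of the header-based filtering described at the beginning of this section, the following Python sketch evaluates an XPath filter over an XML message header. It is not the xmlBlaster API: the header layout, the element and attribute names, and the helper `header_matches` are assumptions made for this example only. It relies on Python's standard `xml.etree.ElementTree` module, which supports only a subset of XPath.

```python
import xml.etree.ElementTree as ET

# Hypothetical message header. The element and attribute names are chosen
# for illustration and are taken neither from xmlBlaster nor from IDMEF.
HEADER = """
<header id="alert-0001">
  <alert>
    <classification name="portscan"/>
    <analyzer analyzerid="node-a-sensor-1"/>
  </alert>
</header>
"""


def header_matches(header_xml: str, xpath_filter: str) -> bool:
    """Return True if the XPath filter selects at least one header element."""
    root = ET.fromstring(header_xml.strip())
    # find() returns the first matching element, or None if nothing matches.
    return root.find(xpath_filter) is not None


# A subscriber interested only in port scan alerts would register this filter.
PORTSCAN_FILTER = ".//classification[@name='portscan']"
deliver = header_matches(HEADER, PORTSCAN_FILTER)
```

In this sketch, a message would be delivered to a subscriber exactly when its filter selects at least one element of the header; the IDMEF body would not need to be parsed for the delivery decision.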
To publish alerts, clients invoke the pub(a) operation, giving the alert a as parameter. The published alert can potentially be delivered to all clients connected to the system via an output operation called notify(a). Clients register their interest in specific kinds of alerts by issuing subscriptions via the sub(F) operation, which takes a filter F as parameter. Each client can have multiple active subscriptions, which must be revoked separately by using the unsub(F) operation.

Fig. 4. Black box view of a publish/subscribe system.

All these operations are instantaneous and take parameters from the set of all clients $\mathcal{C}$, the set of all alerts $\mathcal{A}$, and the set of all filters $\mathcal{F}$. Formally, a filter $F \in \mathcal{F}$ is a mapping defined by

$$F : a \longmapsto \{\mathrm{true}, \mathrm{false}\} \quad \forall a \in \mathcal{A} \tag{1}$$

We say that an alert a matches filter $F \in \mathcal{F}$ iff $F(a) = \mathrm{true}$. We require that each alert can only be published once and that every filter is associated with a unique identifier in order to enable the alert communication infrastructure to identify a specific subscription.

5.2 Components and Interactions

As shown in Figure 5, and according to the general framework introduced in [8], each node of the architecture is made up of a set of local analyzers (with their respective detection units or sensors), a set of alert managers (to perform alert processing and manipulation functions), and a set of local reaction units (or effectors). These components, the interactions between them, and the alert communication infrastructure are described in the following.

Fig. 5. Overview of the main components and their interactions.

Analyzers are local elements which are responsible for processing local audit data. They process the information gathered by associated sensors to infer possible alerts.
Their task is to identify occurrences which are relevant for the execution of the different steps of an attack and to pass this information to the correlation manager via the publish/subscribe system. They are interested in local alerts which are detected in a sensor's input stream and published through the publish/subscribe system by invoking the pub(la) operation, giving the local alert la as parameter.

Local alerts are exchanged using IDMEF messages (cf. Section 4). Each notification la has a unique classification and a list of attributes with their respective types to identify the analyzer that originated the alert (AnalyzerID), the time the alert was created (CreateTime), the time the event(s) leading up to the alert was detected in the sensor's input stream (DetectTime), the current time on the analyzer (AnalyzerTime), and the source(s) and target(s) of the event(s) (Source and Target). All possible classifications and their respective attributes must be known by all system components, i.e., sensors, analyzers, and managers, and all analyzers are capable of publishing instances of local alerts of arbitrary types.

Managers are the components in charge of performing aggregation and correlation of local alerts and external events. As pointed out in [8], the use of multiple analyzers and sensors together with heterogeneous detection techniques increases the detection rate, but it also increases the amount of information to process. In order to reduce the number of false negatives and distribute the load that is imposed by the alerts, our architecture provides a set of aggregation and correlation managers, which perform aggregation and correlation of both local alerts (i.e., messages provided by the node's analyzers) and external messages (i.e., the information received from other collaborating nodes).
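To make the local alert attributes listed above more concrete, the following is a minimal sketch of how an analyzer might assemble such a message. It assumes Python's standard xml.etree.ElementTree module instead of the libidmef C library used by the prototype, and the function name build_local_alert, its parameters, and the sample addresses are hypothetical. The element and attribute names follow a simplified reading of the IDMEF data model [3]; the result is not validated against the IDMEF DTD.

```python
import xml.etree.ElementTree as ET


def _add_endpoint(alert, tag, address, **attrs):
    """Append a Source or Target element holding a single node address."""
    endpoint = ET.SubElement(alert, tag, **attrs)
    node = ET.SubElement(endpoint, "Node")
    addr = ET.SubElement(node, "Address")
    ET.SubElement(addr, "address").text = address
    return endpoint


def build_local_alert(analyzer_id, classification, source_addr, target_addr,
                      create_time, detect_time, analyzer_time,
                      spoofed="unknown", decoy="unknown"):
    """Build a simplified, IDMEF-like local alert (illustrative only)."""
    alert = ET.Element("Alert")
    # Analyzer that originated the alert (AnalyzerID in the text).
    ET.SubElement(alert, "Analyzer", analyzerid=analyzer_id)
    # The three timestamps carried by every local alert.
    ET.SubElement(alert, "CreateTime").text = create_time
    ET.SubElement(alert, "DetectTime").text = detect_time
    ET.SubElement(alert, "AnalyzerTime").text = analyzer_time
    # Source and target, with the spoofed/decoy attributes of Section 4.
    _add_endpoint(alert, "Source", source_addr, spoofed=spoofed)
    _add_endpoint(alert, "Target", target_addr, decoy=decoy)
    # Name of the event that motivated the message.
    ET.SubElement(alert, "Classification", text=classification)
    return alert


if __name__ == "__main__":
    la = build_local_alert("analyzer-1", "portscan", "192.0.2.10",
                           "198.51.100.7", "2005-03-01T10:00:05Z",
                           "2005-03-01T10:00:01Z", "2005-03-01T10:00:05Z")
    print(ET.tostring(la, encoding="unicode"))
```

A real deployment would wrap the alert in an IDMEF-Message element with the proper namespace and would add the Assessment and AdditionalData classes where needed.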
In the following, we describe the basic interactions of the two main managers: the aggregation and correlation managers.

Aggregation Manager. The basic functionality of each aggregation manager is to cluster alerts that correspond to the same occurrence of an action [8]. Each aggregation manager registers its interest in a subset $\mathcal{LA}$ of local alerts published by analyzers on the same node by invoking the sub(LA) operation, which takes the filter LA as parameter, with

$$LA(a) = \begin{cases} \mathrm{true}, & a \in \mathcal{LA} \\ \mathrm{false}, & \text{otherwise.} \end{cases} \tag{2}$$

Similarly, the aggregation manager also registers its interest in a set of related external alerts $\mathcal{EA}$ by invoking the sub(EA) operation with filter EA as parameter, and

$$EA(a) = \begin{cases} \mathrm{true}, & a \in \mathcal{EA} \\ \mathrm{false}, & \text{otherwise.} \end{cases} \tag{3}$$

Finally, it registers its interest in local correlated alerts $\mathcal{CA}$ by invoking the sub(CA) operation with

$$CA(a) = \begin{cases} \mathrm{true}, & a \in \mathcal{CA} \\ \mathrm{false}, & \text{otherwise.} \end{cases} \tag{4}$$

Once a manager has subscribed to these three filters, the communication infrastructure notifies it of all matching alerts via the output operations notify(la), notify(ea), and notify(ca), with $la \in \mathcal{LA}$, $ea \in \mathcal{EA}$, and $ca \in \mathcal{CA}$. All notified alerts are processed and, depending on the clustering and synchronization mechanism, the aggregation manager can publish global and external alerts by invoking pub(ga) and pub(ea). Finally, it can revoke active subscriptions separately by using the operations unsub(CA), unsub(EA), and unsub(LA).

Correlation Manager. The main task of this manager is the correlation of alerts described in [8,2]. It operates on the set of global alerts $\mathcal{GA}$ published by the aggregation manager. To register its interest in these alerts, it invokes sub(GA), which takes the filter GA as parameter with

$$GA(a) = \begin{cases} \mathrm{true}, & a \in \mathcal{GA} \\ \mathrm{false}, & \text{otherwise.} \end{cases} \tag{5}$$

The notification service will then notify the correlation manager of all matched alerts with the output operation notify(ga), $ga \in \mathcal{GA}$. Each time a new alert is received, the correlation mechanism finds a set of action models that can be correlated in order to form a scenario leading to an objective. It then includes this information in the CorrelationAlert field of a new IDMEF message and publishes the correlated alert by invoking pub(ca), giving the notification $ca \in \mathcal{CA}$ as parameter. To revoke the subscription, it uses unsub(GA).

The correlation manager is also responsible for reacting to detected security violations. The algorithm used is based on the anti-correlation of actions to select appropriate counter-measures in order to reconfigure, for instance, the security policy [4]. As soon as a scenario is identified, the correlation mechanism may look for possible action models that can be anti-correlated with the individual actions of the supposed scenario, or even with the goal objective. The set of anti-correlated actions represents the set of counter-measures available for the observed scenario. The definition of each anti-correlated action contains a description of the counter-measures which should be invoked (e.g., hardening the security policy). Such counter-measures are included in the Assessment field of a new IDMEF message and published by invoking pub(aa), using the assessment alert aa as parameter.

Finally, a policy manager will register and revoke its interest in these assessment alerts by invoking sub(AA) and unsub(AA). Once notified, the policy manager may perform the post-processing of the received alerts before sending them, for example, to a set of associated policy reconfiguration effectors.
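The interplay of the pub, sub, unsub, and notify operations with the manager chain of this section can be sketched as follows. This is a minimal in-memory illustration under stated assumptions, not the xmlBlaster API: the class AlertBroker, the helper kind_filter, the dictionary-based alerts, and the "kind" field are all hypothetical. Filters are plain predicates in the sense of Equation (1), each registered under a unique identifier, and the clustering and correlation logic is reduced to a stand-in that simply wraps the incoming alert. For brevity the subscriber to the CA filter is a simple recorder rather than a full aggregation manager.

```python
from typing import Callable, Dict, List, Tuple

Alert = Dict[str, object]
Filter = Callable[[Alert], bool]      # F : a -> {true, false}
Notify = Callable[[Alert], None]      # output operation notify(a)


class AlertBroker:
    """Black-box view: input operations pub/sub/unsub, output via notify."""

    def __init__(self) -> None:
        # Each subscription is identified by a unique filter identifier.
        self._subs: Dict[str, Tuple[Filter, Notify]] = {}

    def sub(self, filter_id: str, flt: Filter, notify: Notify) -> None:
        self._subs[filter_id] = (flt, notify)

    def unsub(self, filter_id: str) -> None:
        self._subs.pop(filter_id, None)

    def pub(self, alert: Alert) -> None:
        # Copy the subscriptions so that callbacks may publish in turn.
        for flt, notify in list(self._subs.values()):
            if flt(alert):
                notify(alert)


def kind_filter(kind: str) -> Filter:
    """Filter that is true exactly for alerts of the given kind."""
    return lambda alert: alert.get("kind") == kind


def run_demo() -> List[Alert]:
    broker = AlertBroker()
    seen: List[Alert] = []

    def aggregation_manager(la: Alert) -> None:
        # Stand-in for clustering: wrap the local alert in a global alert.
        broker.pub({"kind": "global", "members": [la["id"]]})

    def correlation_manager(ga: Alert) -> None:
        # Stand-in for correlation: publish a correlated alert.
        broker.pub({"kind": "correlated", "scenario": ga["members"]})

    broker.sub("LA", kind_filter("local"), aggregation_manager)
    broker.sub("GA", kind_filter("global"), correlation_manager)
    broker.sub("CA", kind_filter("correlated"), seen.append)

    broker.pub({"kind": "local", "id": "la-1"})   # flows through the chain
    broker.unsub("GA")                            # correlation manager leaves
    broker.pub({"kind": "local", "id": "la-2"})   # stops at the global alert
    return seen


if __name__ == "__main__":
    print(run_demo())
```

Publishers and subscribers never reference each other directly; they only share the broker and the filter identifiers, which is the loose coupling the design relies on.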
6 Deployment and Evaluation

In order to evaluate the performance of our proposal, we deployed a set of analyzers and managers publishing and receiving IDMEF messages based on the DARPA Intrusion Detection Evaluation Data Sets [15]. This evaluation data set contains more than 300 instances of 38 different automated attacks that were launched against victim hosts in seven weeks of training data and two weeks of test data. The complete set of messages was published as local and external alerts through the notification service of xmlBlaster, and then processed and republished in turn to the set of subscribed managers.

The exchange of alerts proved to be satisfactory, obtaining a throughput higher than 150 messages per second on an Intel Pentium M 1.4 GHz processor with 512 MB RAM, with analyzers and managers on the same machine running Linux 2.6.8, and using Java HotSpot Client VM 1.4.2 for the Java-based broker. Message delivery did not become a bottleneck, as all messages were processed in time and the saturation point was never reached.

The implementation of both publishers and subscribers was based on the libidmef C library [17] in order to build and parse compliant IDMEF messages. In turn, libidmef is built over the libxml library [25]. The libxml library provides two interfaces to parse XML data: a DOM-style tree interface and a SAX-style event-based interface. Up to now, our implementation uses the DOM interface due to its ease of use. Its main drawback is, however, that its memory usage is proportional to the size of the XML data. For this reason, we are currently rewriting our implementation to use the SAX-based interface. This would help us to decrease the amount of memory that is currently necessary to maintain the entire XML tree in memory.
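The difference between the two parsing styles can be illustrated with a short sketch. It assumes Python's standard xml.dom.minidom and xml.sax modules as stand-ins for the DOM and SAX interfaces of libxml, so it does not reflect the prototype's C code; the sample message and the names classifications_dom, classifications_sax, and ClassificationHandler are hypothetical. Both functions extract the classification names from a batch of simplified alerts, but only the DOM variant materializes the whole tree before reading it.

```python
import xml.sax
from xml.dom import minidom

# Hypothetical, heavily simplified batch of IDMEF-like alerts.
SAMPLE = (
    b"<IDMEF-Message>"
    b"<Alert><Classification text='portscan'/></Alert>"
    b"<Alert><Classification text='syn-flood'/></Alert>"
    b"</IDMEF-Message>"
)


def classifications_dom(data):
    """DOM style: build the full tree in memory, then query it."""
    doc = minidom.parseString(data)
    nodes = doc.getElementsByTagName("Classification")
    return [node.getAttribute("text") for node in nodes]


class ClassificationHandler(xml.sax.ContentHandler):
    """SAX style: react to parser events and keep only what is needed."""

    def __init__(self):
        super().__init__()
        self.found = []

    def startElement(self, name, attrs):
        if name == "Classification":
            self.found.append(attrs.get("text"))


def classifications_sax(data):
    """Stream through the document without retaining a tree."""
    handler = ClassificationHandler()
    xml.sax.parseString(data, handler)
    return handler.found


if __name__ == "__main__":
    print(classifications_dom(SAMPLE))
    print(classifications_sax(SAMPLE))
```

With the event-based variant, the state kept by the application is limited to what the handler chooses to store, which is why its memory footprint need not grow with the size of the message.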
Fig. 6. Processing and memory consumption. [Plot: usage (%) versus number of messages; series for the brokers', subscribers', and publisher's CPU and memory usage.]

The communication between analyzers and managers through xmlBlaster brokers was based on the xmlBlaster internal socket protocol and implemented using the C socket library [21] for xmlBlaster, which provides asynchronous callbacks to Java-based brokers. The managers formulated their subscriptions using XPath expressions, filtering the messages they wished to receive from the broker.

In Figure 6, we show the processing time and memory space used by xmlBlaster brokers during the exchange of alerts. The first curve represents the percentage of CPU load used by each broker. The second curve represents the quantity of memory used by each broker. As the first curve shows, the percentage of processing time used by the brokers is quite stable and negligible with respect to the normal performance of a system. The second curve reflects, however, that the cost in memory is quite high. We attribute this consumption to the message management, and we hope that the new version of our prototype, based on a more efficient XML parsing and building scheme, will lower it as discussed above.

7 Conclusions

We presented a message passing design for the exchange of audit information between the security components of a platform for the detection of and reaction to coordinated attacks. The design is based on a publish/subscribe model. Instead of having a central or master monitoring station to which all data has to be forwarded, there are independent uniform working entities at each host performing similar basic operations.
The information gathered by each entity is disseminated to other interested entities through a notification service based on a publish/subscribe broker network which allows messages to be sent via a push or pull data exchange.

The main advantage of this model for the exchange of audit information between components is, on the one hand, that it keeps the producers of messages separated from the consumers and, on the other hand, that the communication is information-driven. This way, it allows us to avoid problems regarding the scalability and the management inherent to other designs, by means of a network of publishers, brokers, and subscribers. A publisher in a publish/subscribe system does not need to have any knowledge about any of the entities that consume the published information. Likewise, the subscribers do not need to know anything about the publishers. Services can be added without any impact on or interruption of the service to other elements.

In Section 4, we discussed the main properties of the Intrusion Detection Message Exchange Format (IDMEF) as the format that is built on top of the XML structure of the messages exchanged between the components of our platform; we presented in Section 5 the operational details (interface operations and interaction) of our communication infrastructure; and we discussed the initial results of a first prototype of our approach in Section 6. We think that these results give us good hope that the use of a publish/subscribe system for the communication infrastructure indeed increases the scalability of the proposed architecture. Motivated by the high memory usage, and as pointed out in Section 6, we are currently moving our implementation to the SAX interface, since it does not maintain the entire XML tree in memory, which means that the memory load should decrease considerably.
As an extension of the work presented in this paper, we may first consider securing the communication partners by utilizing the SSL protocol. This way, each node will receive a private and a public key. The public key of each node will be signed by a certification authority (CA) that is responsible for the protected network. Hence, the public key of the CA has to be distributed to every node as well. The secure SSL channel will allow the communicating peers to communicate privately and to authenticate each other, thus preventing malicious nodes from impersonating legal ones. The implications of this new feature, such as the management of compromised keys or certificate revocation, would be part of this future work. We may also consider as further work a more in-depth study of privacy mechanisms for exchanging alerts in a pseudonymous manner. By doing this, one may provide the destination and origin information of alerts (Source and Target fields of IDMEF messages) without violating the privacy of publishers and subscribers located in different domains. Our study may cover the design of a pseudonymous identification scheme, trying to find a balance between identification and privacy. This also represents further work that remains to be done.

Acknowledgments

The collaboration between J. Garcia-Alfaro, F. Cuppens, and F. Autrel sharpened many of the arguments presented in this paper. The authors graciously acknowledge the financial support received from the following organizations: Spanish Ministry of Science and Education, and the Catalan Government's Agency for Management of University and Research Grants (AGAUR).

References

1. D. Anderson, T. Frivold, and A. Valdes. Next-generation Intrusion Detection Expert System (NIDES): a summary. SRI International, Computer Science Laboratory, 1995.
2. F. Cuppens, F. Autrel, Y. Bouzida, J. Garcia-Alfaro, S.
Gombault, and T. Sans. Anti-correlation as a criterion to select appropriate counter-measures in an intrusion detection framework. Annals of Telecommunications, 61(1-2):192–217, 2006.
3. H. Debar, D. Curry, and B. Feinstein. Intrusion detection message exchange format data model and extensible markup language. Request for Comments 4765, March 2007.
4. H. Debar, Y. Thomas, F. Cuppens, and N. Cuppens-Boulahia. Enabling Automated Threat Response through the Use of a Dynamic Security Policy. Journal in Computer Virology (JCV), 3(3):195–210, August 2007.
5. S. Deering. Host Extensions for IP Multicasting. STD 5, RFC 1112, Stanford University, May 1988.
6. P. Eugster, P. Felber, R. Guerraoui, and A. Kermarrec. The many faces of publish/subscribe. ACM Computing Surveys, 35(2):114–131, 2003.
7. R. Feiertag, C. Kahn, P. Porras, D. Schnackenberg, S. Staniford-Chen, and B. Tung. A Common Intrusion Specification Language. CIDF working group document, 1999.
8. J. Garcia-Alfaro, F. Autrel, J. Borrell, S. Castillo, F. Cuppens, and G. Navarro. Decentralized publish/subscribe system to prevent coordinated attacks via alert correlation. In Sixth International Conference on Information and Communications Security, volume 3269 of LNCS, pages 223–235, Málaga, Spain, October 2004. Springer-Verlag.
9. J. Garcia-Alfaro, J. Borrell, M. A. Jaeger, and G. Mühl. An alert communication infrastructure for a decentralized attack prevention framework. In IEEE International Carnahan Conference on Security Technology, pages 234–237, Las Palmas de G.C., Spain, 2005.
10. J. Garcia-Alfaro, M. A. Jaeger, G. Mühl, and J. Borrell. Decoupling Components of an Attack Prevention System using Publish/Subscribe. In 2005 IFIP International Conference on Intelligence in Communication Systems, pages 87–98, Montréal, Canada, 2005.
11. J. Hochberg, K. Jackson, C. Stallins, J. F. McClary, D. DuBois, and J. Ford.
NADIR: An automated system for detecting network intrusion and misuse. In Computer and Security, volume 12(3), pages 235–248. May 1993.
12. K. Ilgun, R. A. Kemmerer, and P. A. Porras. State transition analysis: A rule-based intrusion detection approach. IEEE Transactions on Software Engineering, 21(3):181–199, 1995.
13. R. A. Kemmerer. NSTAT: A model-based real-time network intrusion detection system. Technical Report TRCS97-18, Reliable Software Group, Department of Computer Science, University of California Santa Barbara, 1997.
14. C. Kruegel and T. Toth. Distributed pattern detection for intrusion detection. In Network and Distributed System Security Symposium Conference Proceedings: 2002, 1775 Wiehle Ave., Suite 102, Reston, Virginia 20190, U.S.A., 2002. Internet Society.
15. R. Lippmann, J. Haines, D. Fried, J. Korba, and K. Das. The 1999 DARPA off-line intrusion detection evaluation. Computer Networks, (34):579–595, 2000.
16. T. Lunt, A. Tamaru, F. Gilham, R. Jagannathan, P. G. Neumann, and C. Jalali. IDES: A progress report. In 6th Annual Computer Security Applications Conference, Tucson, AZ, USA, 1990.
17. A. C. Migus. IDMEF XML library version 0.7.3. http://sourceforge.net/projects/libidmef/, March 2004.
18. G. Mühl. Large-Scale Content-Based Publish-Subscribe Systems. PhD thesis, Technical University of Darmstadt, 2002.
19. P. A. Porras and P. G. Neumann. EMERALD: Event monitoring enabling responses to anomalous live disturbances. In 20th National Information Systems Security Conference, pages 353–365, 1997.
20. D. Powell. Group communication. Communications of the ACM, 39(4):50–53, 1996.
21. M. Ruff. XmlBlaster: open source message oriented middleware. White paper [on-line]. http://xmlblaster.org/, 2000.
22. S. R. Snapp, J. Brentano, G. V. Dias, T. L. Goan, L. T. Heberlein, C. Ho, K. N. Levitt, B. Mukherjee, S. E. Smaha, T. Grance, D. M. Teal, and D. Mansur.
DIDS (distributed intrusion detection system) - motivation, architecture and an early prototype. In Proceedings 14th National Security Conference, pages 167–176, October 1991.
23. E. H. Spafford and D. Zamboni. Intrusion detection using autonomous agents. Computer Networks, 34(4):547–570, 2000.
24. S. Staniford-Chen, S. Cheung, R. Crawford, M. Dilger, J. Frank, J. Hoagland, K. Levitt, C. Wee, R. Yip, and D. Zerkle. GrIDS – a graph-based intrusion detection system for large networks. In 19th National Information Systems Security Conference, 1996.
25. D. Veillard. The XML C library for Gnome (libxml). http://www.xmlsoft.org, 2006.
26. G. Vigna and R. A. Kemmerer. NetSTAT: A network-based intrusion detection system. Journal of Computer Security, 7(1):37–71, 1999.
27. G. B. White, E. A. Fisch, and U. W. Pooch. Cooperating security managers: A peer-based intrusion detection system. IEEE Network, 7:20–23, February 1999.
