Rule Generalisation in Intrusion Detection Systems using Snort

Intrusion Detection Systems (IDSs) provide an important layer of security for computer systems and networks, and are becoming more and more necessary as reliance on Internet services increases and systems with sensitive data are more commonly open to I…

Authors: Uwe Aickelin, Jamie Twycross, Thomas Hesketh-Roberts

Int. J. , Vol. x, No. x, xxxx
Copyright © 200x Inderscience Enterprises Ltd.

Rule Generalisation using Snort

U Aickelin, J Twycross and T Hesketh-Roberts
School of Computer Science and IT
University of Nottingham
NG8 1BB UK
E-mail: uxa@cs.nott.ac.uk, jpt@cs.nott.ac.uk, tmhesket@fish.co.uk

Abstract: Intrusion Detection Systems (IDSs) provide an important layer of security for computer systems and networks. An IDS's responsibility is to detect suspicious or unacceptable system and network activity and to alert a systems administrator to this activity. The majority of IDSs use a set of signatures that define what suspicious traffic is, and SNORT is one popular and actively developed open-source IDS that uses such a set of signatures known as SNORT rules. Our aim is to identify a way in which SNORT could be developed further by generalising rules to identify novel attacks. In particular, we attempted to relax and vary the conditions and parameters of current SNORT rules, using a similar approach to classic rule learning operators such as generalisation and specialisation. We demonstrate the effectiveness of our approach through experiments with standard datasets and show that we are able to detect previously undetected variants of various attacks.

Keywords: anomaly detection, intrusion detection, Snort, Snort rules

Reference to this paper should be made as follows: Uwe Aickelin, Jamie Twycross and Thomas Hesketh-Roberts (xxxx) 'Rule Generalisation in Intrusion Detection Systems using SNORT', International Journal of Electronic Security and Digital Forensics (IJESDF), Vol. x, No. x, pp.xxx–xxx.

Biographical notes: Uwe Aickelin is a Reader and Advanced EPSRC Research Fellow in the School of Computer Science & IT at the University of Nottingham.
His research interests are mathematical modelling, heuristic optimisation and artificial immune systems applied to computer security problems. Jamie Twycross is a Research Associate and is currently working on a large interdisciplinary project investigating the application of immune-inspired approaches to computer security. His research interests include biologically-inspired approaches to computing, computer security and networking, and robotics. Thomas Hesketh-Roberts is a student in Computer Science.

1 Introduction

Computer attacks, e.g. the use of specialised methods to circumvent the security policy of an organisation, are becoming more and more common. IDSs are installed to identify such attacks and to react, usually by generating an alert or blocking suspicious activity. IDSs come in many forms, which we review in the following section. The work presented here is based on a popular network intrusion detection system (NIDS) called SNORT (2006). SNORT detects attacks by comparing live Internet traffic against signatures that define known attacks. SNORT is an open-source GNU (2006) NIDS and an example of a system that uses signatures, in this case known as SNORT rules.

The aim of this paper is to determine the effectiveness of generalisation when applied to the matching of Internet traffic against SNORT's rule signatures. In this paper we introduce a novel rule generalisation operator for creating new rules. In particular, we present two generalisation operators, invert and content, which can be used to either generalise or specialise SNORT rules. Analysing the results found by these new rules generates an improved understanding of attack patterns. Subsequently, better rules can be created beyond the classic learning operators based on blind addition, deletion or negation of rule conditions as first suggested by Mitchell (1997).
In the next section, we discuss the current state of the art in IDS and highlight some potential shortcomings. We then explain SNORT and SNORT rule generalisation in sections 3 and 4. Details of our system are presented in section 5 and results using real-world data are in section 6. Finally, the paper concludes by discussing the effectiveness and appropriate use of our rule generalisation in IDS signature processing.

2 Current State-of-the-Art in IDS

According to Crothers (2003), intrusion detection technology is technology designed to monitor computer activities for the purpose of finding security violations. An IDS is a system that implements such technology. The meaning of a security violation will vary between set-ups. To some, the definition of a security violation may be limited to activities breaching confidentiality and/or resulting in downtime of services. Generally, a security violation would be any deliberate activity that is not wanted by the victim. This typically includes denial of service attacks, port scans, gaining of system administrator access and exploiting system security holes such as the processing of HTML forms on the server.

IDSs come in many different forms, and their method of finding security violations varies. Following Northcutt (2002), one division is often made in terms of IDS placement: Host-based (HIDSs) that detect attacks by analysing system logs or Network-based (NIDSs) that detect attacks by directly analysing network packets in real-time, e.g. Snort. Here we will concentrate on misuse detection NIDSs. Techniques used by such NIDSs still have a lot of room to evolve.
Northcutt (2002), Ning and Xu (2004) and Kim et al (2007) identify a number of problems associated with current misuse NIDSs:
• They cannot fully detect novel attacks;
• Variations of known attacks are not fully detected;
• They generate a large amount of alerts, as well as a large number of false alerts;
• Existing IDSs focus on low-level attacks or anomalies and do not identify logical steps or strategies behind these attacks.

Our work here mainly focuses on the second point. Signature sets are not effective against varied attacks if they are written to identify precisely each currently known issue. Conversely, using signatures with more general matching criteria results in a higher proportion of legitimate network traffic generating false alerts. We address this issue by systematically implementing generalised rules and alerts.

There are a number of alternative methods of identifying new attacks or variations of known attacks that are currently under investigation. The interested reader is referred to Axelsson (2000), who offers a survey of these techniques. Here we briefly summarise two related areas:

Gomez et al (2003) and Esponda et al (2004) use ideas based on the Human Immune System to build an artificial immune system. The artificial immune system is then used to competitively evolve new rule sets. This allows the generation of rules that characterize the non-self space (abnormal) by just taking self (normal) samples as input. The difference to our work is that we use crisp or fixed rules derived by generalising SNORT rules.

The scenario approach of Ning and Xu (2004) addresses the problems of large amounts of alerts and lack of attack strategy consideration by proposing correlation of related alerts. The principle is that certain attacks would have a likely prerequisite, such as scanning for the existence of an open port before attacking it.
In this way, for example, the port scan and the attack may be correlated into a single alert as part of the same attack process. Burgess (2006) also uses statistical methods. In his case a filter based on a time-series prediction detects the significance of deviation. The extent of the deviation determines how the system should respond. However, research into these correlations is still in its early stages, as correlating attacks is often neither obvious nor easy.

3 Snort and Snort Rules

SNORT is one of the most popular NIDSs. SNORT is Open Source, which means that the original program source code is available to anyone at no charge, and this has allowed many people to contribute to and analyse the program's construction. SNORT uses the most common open-source licence, known as the GNU General Public License. Recent research issues addressed with SNORT include alert visualisation by Hoagland and Staniford (2003) and automated port-scan detection by Staniford et al (2002).

Lawton (2002) discusses the advantages and disadvantages of security software being open-source. In the article, Lawton introduces the argument that the availability of open source software code makes it easy for hackers to figure out how to defeat the security. Lawton also weighs up the counter-argument that closed-source security systems are still compromised and that code being open-source allows security holes to be closed as soon as they are identified, as well as enabling code to be customised for individual security needs. On balance, we believe that open-source is an advantage for computer security.

SNORT, like most NIDSs, uses a set of signatures to define what constitutes an attack. SNORT signatures are regularly updated on the SNORT website, usually several times a day, which can be confirmed by periodically checking the timestamps next to the available downloads (SNORT).
SNORT is flexible in how it can be utilised, as Figure 1 begins to demonstrate. A file containing previously logged traffic can be used as input to SNORT, in exactly the same way as live traffic. SNORT also supports a range of outputs, such as saving alerts to files or databases, or creating a network traffic log of all received traffic for later processing in the case of live traffic capture. The flexibility exists for SNORT to support virtually any output method, due to an ability to support both in-house and third party output plug-ins.

Figure 1: Data-Flow Diagram demonstrating the flexibility in utilising SNORT.

Since a clear understanding of SNORT's rules is crucial for our research, a detailed explanation follows. A summary of this information can be found in Table 1 below. SNORT's signature sets, which are used for identifying security violations, are called SNORT rules. Groups of SNORT rules are referred to as a .rules file, each of which can be selectively included into the SNORT configuration file snort.conf. A .rules file is a plain text file in which each line holds a separate rule. The following notes on SNORT's rule format were put together using the SNORT Users Manual; for full detail see SNORT's website.

A rule is formally defined as shown below. Text in <angled brackets> would be replaced with the appropriate compulsory variable, without angled brackets present. Text in [square brackets] is optional and either represents nothing or represents the text itself, in either case without the square brackets.

    <rule action> <protocol> [!]<source ip> [!]<source port> <direction> [!]<destination ip> [!]<destination port> [(<rule options>)]

    example: alert tcp any any -> any 25
    [create an alert for any incoming traffic sent to port 25]

The term rule action describes what response is made in cases when the conditions in the rule match when compared against an Internet packet.
Most commonly, the rule action is alert, which usually means saving alert data to a file or database for later retrieval or for another application to process. Alert generating packets are also logged. Another action is log, useful when it is inappropriate to generate an alert, but the traffic is of some interest. Other actions include pass (allow the packet through) and activate (start other rules or actions).

For the protocol and port statements, please refer to SNORT (2006). For the IP statements, SNORT uses an IP/CIDR (Classless Inter-Domain Routing) block number after the IP address (see Fuller et al (1993)). The packet data must identify itself as coming from or going to the IP address range given. An optional exclamation mark can be placed in front of the IP address to invert the meaning of the rule. Values can be given as ranges of IP/CIDR statements, e.g. 192.168.1.0/24. All possible IP addresses can be represented by using the keyword any.

The packet data must identify itself as coming from (going to) the Internet port (or port range) given (the Source/Destination Port statements). An optional exclamation mark can be placed in front of the port statement to invert the meaning of the rule. A specific port number or a range can be stated by using a colon (:) to separate the lowest and the highest port number, e.g. 1:1024. Alternatively, all possible port numbers can be represented by using the keyword any. The direction statement specifies whether the packet is from the source to the destination or vice versa.

There are also additional rule options, including further conditions for the rule to match, the message to be used in alerts and options for activate rules. The interested reader is referred to the SNORT website for more details. Rule options are separated from each other using semi-colons.
Some options have a parameter value associated with them, in which case a colon separates the option name and option value.

4 Rule generalisation

We propose to generate new rules by generalising SNORT rules. Given an Internet packet that contains a variation of a known attack, there should be some automated way to identify the packet as nearly matching a NIDS attack signature. If a particular statement has a set of conditions against it, an item may match some of the conditions. Whereas Boolean logic would give the value false to the query 'does this item match the conditions', our logic could allow the item to match to a lesser extent rather than not at all.

This principle can be applied when comparing an Internet packet against a set of conditions in a SNORT rule. Our hypothesis is that if all but one of the conditions are met, an alert with a lower priority can be issued against the Internet packet, as the packet may contain a variation of a known attack. In our implementation, generalisation in the case of matching network packets against rules involves allowing a packet to generate an alert if:
• The conditions in the rule do not all match, yet most of them do;
• The only conditions that do not match exactly nearly match.

As an example, assume a certain rule states that an alert should be generated if a packet is a particular length, on a particular port and contains a certain bit pattern. Using our generalisation, a packet matching those criteria, except perhaps on a different port, or with a slightly different bit pattern, would still count as matching, and a (modified) alert would be generated.
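The hypothesis of this section (all conditions but one are met, so a lower-priority alert is raised) can be illustrated with a minimal Python sketch. The `Rule` class, the dictionary-based packet and the function `evaluate` are hypothetical names introduced only for this illustration; they are not part of SNORT or of the programs described in this paper. The sketch only counts failed conditions and does not attempt the "nearly match" test on the failing condition.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

Packet = Dict[str, object]  # hypothetical packet representation, for illustration only


@dataclass
class Rule:
    name: str
    priority: int  # 1 is the most severe, larger numbers are less severe
    conditions: List[Callable[[Packet], bool]]


def evaluate(rule: Rule, packet: Packet) -> Optional[dict]:
    """Return an alert for an exact match or an all-but-one match, else None."""
    failed = sum(1 for condition in rule.conditions if not condition(packet))
    if failed == 0:
        # Every condition holds: the ordinary Boolean match.
        return {"rule": rule.name, "priority": rule.priority, "generalised": False}
    if failed == 1:
        # Exactly one condition fails: raise a modified, lower-priority alert.
        return {"rule": rule.name, "priority": rule.priority + 1, "generalised": True}
    return None


# The example from the text: a particular length, port and bit pattern.
example_rule = Rule(
    name="example",
    priority=1,
    conditions=[
        lambda p: p["length"] == 64,
        lambda p: p["port"] == 69,
        lambda p: b"\x00\x01" in p["payload"],
    ],
)

exact = evaluate(example_rule, {"length": 64, "port": 69, "payload": b"\x00\x01abc"})
variant = evaluate(example_rule, {"length": 64, "port": 70, "payload": b"\x00\x01abc"})
unrelated = evaluate(example_rule, {"length": 10, "port": 70, "payload": b"abc"})
```

Here `variant` differs from the rule only in its port, so it yields an alert flagged as generalised with priority 2, while `unrelated` fails several conditions and yields no alert.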
5 Implementation

Our implementation is made up of three components (Figure 2):
• The first program, called FuzzRule, processes the Snort rules (.rules files) and creates two new sets of rules using two generalisation principles (Invert and Content);
• The second program, AlertMerge, merges alert files generated from the original rules with alert files generated from the generalised rules;
• The third program, the FuzzRule post-processor, summarises the alerts given. By checking this summary, we can identify where large numbers of false positives are being generated. Thus, we can adjust FuzzRule to reduce false alert rates.

5.1 FuzzRule

The FuzzRule program, of which Figure 2 provides a diagrammatic overview, meets the following specifications:

Given a .rules file, the application saves a back-up of the original file before replacing it. For each SNORT rule in the original .rules file, the application includes the original rule in the new .rules file and follows this with each variation of the rule generated using our generalisation. Each generalised variation of an original rule is generated either by inverting or removing the meaning of one of the rule parameters. Thus, when comparing the property of a packet against the generalised rule, the packet should match all cases that are similar to the original rule. As we will see in the next section, there is a difference between removing and inverting, and the correct behaviour can only be achieved by inverting rule options.

Based on an initial set of experiments, we identified the following rule parameters as being good candidates for generalisation.
Any of these present in a rule will be generalised using the stated method (more details later in this section):
• Inversion: Port, IP address, Direction, Protocol, Content, URI Content;
• Special Inversion: Depth, Offset;
• Generalisation of Content: Content, URI Content;
• Both original rules and generalised variations of rules have their alert message tagged so that it can be identified in what way a matching rule has been generalised (if at all);
• A program option is provided giving each generalised variation of a rule a lower priority setting. A priority is a numerical value from one upwards, one being the highest priority and representing the most severe attack and any larger number being less severe. The priority setting is not used by Snort, but serves as an indicator for the operator browsing the alert file.

Figure 2: Data-Flow Diagram for Overall System.

5.2 Generalisation by Rule Inversion

When designing our program, generalisation was at first applied by removing a single rule option to form a generalised rule. For example, given the original rule below, one generalised variation is given afterwards. By removing the depth parameter, more packets will match against the generalised rule than the original rule:

    alert udp any any -> any 69 (msg:"TFTP GET Admin.dll"; content:"|0001|"; offset:0; depth:2; content:"admin.dll"; offset:2; nocase; classtype:successful-admin; reference:url,www.cert.org/advisories/CA-2001-26.html; sid:1289; rev:2;)

    alert udp any any -> any 69 (msg:"TFTP GET Admin.dll"; content:"|0001|"; offset:0; content:"admin.dll"; offset:2; nocase; classtype:successful-admin; reference:url,www.cert.org/advisories/CA-2001-26.html; sid:1289; rev:2;)

Using the above removal generalisation principle means that if a packet matches an original rule, it typically also matches all generalised variations of the same rule.
By design, SNORT produces at most one alert per packet. When we first tried the removal approach, we expected SNORT to match the original rule by default, due to it appearing before the generalised variations in the .rules file. However, during run-time tests, alerts were only generated from generalised rules. Closer investigation revealed that SNORT places rules into an efficient binary tree-style system for quicker processing and traverses the tree by matching lower-cost rules first. Thus, rules with fewer options, like our removal generalised rules, matched before their original counterparts. Changing the priority and/or SNORT id of the alerts cannot change this behaviour in any way.

Therefore, a new principle had to be applied. Instead of removing rule options, we inverted them. The principle of inverting a rule option is defined as matching a packet in only those cases that are similar but where the original rule option would not have matched. The difference between the removal generalisation principle and the inversion generalisation principle is made clear in Table 1 using a rule with four conditions A, B, C and D. Under the removal principle, if not all original rule conditions hold, at most one generalised rule matches. The same is true of the inversion principle. However, if all original rule conditions match, all generalised rules under the removal principle will also match. In contrast, under the inversion principle, if all the original rule conditions match, none of the generalised rules will. The latter is the desired outcome and hence our choice for implementation.

    Rule            Removing    Inverting
    Original        A B C D     A B C D
    generalised 1   A B C -     A B C not D
    generalised 2   A B - D     A B not C D
    generalised 3   A - C D     A not B C D
    generalised 4   - B C D     not A B C D

Table 1: Different Generalisation Principles Demonstrated Using Conditions A-D.
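The behaviour summarised in Table 1 can be checked with a short Python sketch. It models a packet simply as a mapping from each condition name to whether the packet satisfies it; the function names are hypothetical and the model is an illustration of the two principles, not the FuzzRule code itself.

```python
from itertools import product

CONDITIONS = ("A", "B", "C", "D")


def removal_variants(conditions):
    # Each variant omits one condition entirely; the remaining ones must hold.
    return [{c: True for c in conditions if c != dropped} for dropped in conditions]


def inversion_variants(conditions):
    # Each variant requires one condition to fail and all the others to hold.
    return [{c: (c != inverted) for c in conditions} for inverted in conditions]


def count_matches(variants, packet):
    # packet maps each condition name to whether the packet satisfies it.
    return sum(
        all(packet[name] == required for name, required in variant.items())
        for variant in variants
    )


def match_counts(packet):
    """Return (matches under removal, matches under inversion)."""
    return (
        count_matches(removal_variants(CONDITIONS), packet),
        count_matches(inversion_variants(CONDITIONS), packet),
    )


if __name__ == "__main__":
    # Enumerate all 16 combinations of satisfied and unsatisfied conditions.
    for values in product([True, False], repeat=len(CONDITIONS)):
        print(values, match_counts(dict(zip(CONDITIONS, values))))
```

For a packet satisfying all four conditions, the removal variants all match while no inversion variant does; for a packet failing exactly one condition, one variant matches under either principle.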
Using the inversion principle, applying generalisation is straightforward for most rules, e.g. inverting ports, IP addresses, protocols, traffic direction or negating complete content or URI content strings. Unfortunately, for some rule options finding the generalised counterpart is more complicated. As an example, let us look at how we created generalised versions of the depth and offset rule options.

The depth and offset options affect which part of the packet data the content option is matched against. An offset value means that the content string is not compared against the packet data until an 'offset' number of bytes into it. A depth value dictates how many bytes from the start of the offset (or the start of the packet if no offset is given) are searched for the content string. The principles by which the depth and offset options are generalised are as follows:
• In the generalised variations of the rule, the region(s) of the packet header not compared against in the original rule are compared against, meaning that a match should be found in some cases when it would not have been with the original rule.
• To compare packet data before the region currently being compared, all bytes prior to the offset should be compared, plus the length (minus 1) of the content string in bytes into the offset. This maximises the chance of matching what would not have previously been found because the content string could lie partially within and partially outside the region.
• To compare packet data after the region originally being compared, all bytes after depth characters from the start of the original search region should be compared, plus the length (minus 1) of the content string in bytes into the end of the original search region. The same principle applies as in the offset case.
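The two region rules above can be made concrete with a small Python sketch. It is one possible reading of the description, not the FuzzRule implementation: the function name `generalised_regions` and the (offset, depth) tuple representation are assumptions made for illustration, and a depth of `None` is taken to mean "search to the end of the packet data".

```python
from typing import List, Optional, Tuple

# (offset, depth); a depth of None means "search to the end of the packet data".
Region = Tuple[int, Optional[int]]


def generalised_regions(
    content_len: int, offset: int = 0, depth: Optional[int] = None
) -> List[Region]:
    """Search regions covering the bytes the original offset/depth excluded."""
    overlap = content_len - 1  # bytes a match may reach into the original region
    regions: List[Region] = []
    if offset > 0:
        # Everything before the original offset, plus (length - 1) bytes into it,
        # so a match must start at least one byte before the original region.
        regions.append((0, offset + overlap))
    if depth is not None:
        # Start (length - 1) bytes before the end of the original region, so a
        # match must end at least one byte after the original region.
        regions.append((max(0, offset + depth - overlap), None))
    return regions


# Original search: a 4-byte content string, offset 2, depth 10 (bytes 2..11).
example = generalised_regions(content_len=4, offset=2, depth=10)
```

With these example values the sketch yields a region before the original one, (offset 0, depth 5), and a region after it, (offset 9, no depth). In both cases any match straddles or lies outside the original search region, so it could not also have matched the original rule.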
Finally, we need to discuss an effective method of content generalisation, since this is often the key to matching a rule against traffic patterns. The content option specifies a string to search for in packets. Applying generalisation to the content option means individual characters in the content are replaced with a question mark (?) to represent any character during a match. Additionally, the content option value can be shortened slightly, which could allow a match if start or terminating characters in the attack sequence differed. This type of generalisation is applied to all rules with a content (actual content) and uricontent (e.g. web addresses) option. In all cases a number of generalised rules are made by substituting one character in turn with a ?.

5.3 AlertMerge

The AlertMerge program is shown diagrammatically in Figure 3. The program accepts two alert files, both generated by SNORT against the same traffic. One of the alert files is generated using the original rules and the other using the generalised rules. These files are referred to as the original alert and generalised alert files respectively from now on. Three output files are generated, each one with the same file name as the alert file, but with a particular extension appended to the end of the name.

One file with a .merged extension, containing some alerts from the generalised alert file and all alerts from the original alert file. As discussed previously, SNORT may alert against a less vital generalised rule instead of an original rule if a packet matches both. Thus, the alert file from generalised rules alone may imply that the traffic is less severe than it really is. The merging process ensures that in a merged alert file only one alert per data packet is recorded.
If two alerts, one from each of the two given alert files, are generated from the same packet, then only the alert from the original alert file is saved to the .merged file. The alerts are kept in chronological order.

One file with a .fuzz extension, which contains all alerts generated by the generalised rules that were accepted into the .merged file.

Finally, one file with a .rejected_fuzz extension, which contains all alerts from the generalised alert file that did not make it into the .merged file. This file is useful for identifying which generalised rules are being matched with precedence over original rules.

Figure 3: Data-Flow Diagram Overview of AlertMerge.

5.4 FuzzRule Post-Processor

The FuzzRule post-processor program summarises the alert file generated by Snort. The file is summarised regardless of whether it was taken directly from Snort, or whether the alert file was one of the four possible outputs from AlertMerge. An alert summary is then given as the total number of occurrences of each alert and the total number of occurrences of each generalisation method implemented, including the number of occurrences of original alerts.

6 Experiments and Analysis of Results

This section reports upon program performance, as well as analysing how effective various attempts at applying generalisation have been. To ensure fair comparisons, all analysis will be performed under the same testing environment, which is a 2000 MHz AMD-powered PC with 1024Mb RAM running the Linux Mandrake 9.0 operating system and Snort 2.6.0.

When testing SNORT rules and alert files generated from them, tcpdump traffic data is used from Lincoln Laboratory IDS test data sets available from the Massachusetts Institute of Technology (MIT data 1999).
Although we are aware of some of the limitations of these data sets, mainly due to their age, they were chosen due to the deliberate mix of typical legitimate traffic (including consideration for different ports and client platforms) with attacks, both of a known and novel nature and of various levels of severity. Most importantly, these data sets are still the most realistic publicly available with a full list of actual attacks.

Tcpdump is a utility (generally available on most UNIX systems) that can save raw packet data to a tcpdump binary file. NIDSs such as SNORT can optionally process archived traffic data from tcpdump files rather than live traffic data. In this case, SNORT deals with the data in exactly the same way as if it were live (with the exception that tcpdump data is processed at the rate that the CPU allows, whereas live data must be processed in real-time and is vulnerable to packets not being analysed by the NIDS if it cannot keep up with processing).

The tcpdump file used for analysis in each case was the outside.tcpdump file from Thursday, week 4 of the 1999 data sets. The 1999 data sets were chosen over 1998, since they reflected a more up-to-date range of attacks, and week 4 was chosen since this data contained a range of attacks among normal traffic for the very purpose of testing. The attacks contained in the test data are also listed by Lincoln Laboratory. We used this list to confirm that our systems found all previously known attacks in the data file.

When implementing generalised Inversion, the execution time was 1 second to process and convert the original 1,325 rules into a total of 6,975 rules. The generalised Content execution time was 2 seconds to process and convert the same 1,325 original rules into a total of 18,265 rules.
These execution times would easily be acceptable for most potential uses, such as each time the SNORT rules were downloaded for signature updates. The increase in the number of rules affected the time spent processing network traffic data as follows:
• Using the original rules, Snort took approx 100 seconds to process 1,635,267 packets;
• Using the generalised (inverted) rules, Snort took approx 400 seconds to process the same packets;
• Using the generalised content rules, Snort took approx 1,000 seconds to process the packets.

The change in SNORT's processing time is an increase of around four to ten times and roughly in line with the increase in the number of rules. We believe that such a processing time increase is not a problem and still well within real-time processing requirements. On our moderate system, a whole day's worth of data is processed in less than two hours.

6.1 Content rule generalisation

In a second experiment, we used generalised content on the content and uricontent options. The generalised content principle produced far fewer generalised alerts. Out of 1,635,267 packets, only 50,081 packets or 3% generated an alert. Nearly 38,000 (more than 75%) of these alerts were generated against just one rule (WEB-MISC ICQ Webfront HTTP DOS). Thus, this rule can probably be ignored as creating false alarms.

A detailed further analysis of these results is more complex and beyond the scope of this paper. However, briefly one can note that generalising the content option (Types: cor,rx+) is responsible for a larger proportion of (probably false) alerts compared with the uricontent option (Types: urr,rx+). Ignoring the rule mentioned above, of the top four generalised alerts, three are generated from rules generalised by the content option, compared to just one by the uricontent option.
As for inversion, in our opinion the most interesting cases are those appearing the least often, e.g. fewer than 10 occurrences. Out of these, the most interesting are those appearing only four or six times. This makes it unlikely that these are false alerts. For instance, these unusual alerts matched when allowing for the destination port or destination IP address to be different from the original alert. This could be an indication that the particular SNORT rules that generated the alerts were too stringent in their criteria. To reduce the number of false positives, we ignore alerts occurring more than 25 times, as they are very likely to be false or trivial alerts or already covered by the original SNORT rules.

    Rule ID   Frequency   Class                    Found by original rules
    250       3           False Alert              No
    255       3           True Alert               No
    323       1           Additional Information   Yes
    530       1           Additional Information   Yes
    1201      10          Additional Information   Yes
    1377      9           FTP Beta Software Used   No
    1378      1           Additional Information   Yes

Table 2: List of Attacks Generated by Content Rule Generalisation

Let us look at some of the above in more detail to get a feeling for the usefulness of the generalisation.

Generalised rule 250 gives a false alert: the content generalisation gives false alerts as it places a wildcard against the only content character there is (see original rule below). Hence, we pick up any traffic to said port, which in almost all cases is harmless. A simple solution to this problem is to alter the rule generalisation algorithm to not allow content rule generalisation if there is only one content character.

Signature

    alert tcp $HOME_NET 15104 -> $EXTERNAL_NET any (msg:"DDOS mstream handler to client"; flow:from_server,established; content:">"; reference:cve,2000-0138; classtype:attempted-dos; sid:250; rev:4;)

Rule 255 finds new True Positives! These three packets were not picked up with the original rules.
The generalised rule below alerts on the following three packets and associates them with the DNS zone transfer TCP attack. They are identified as attacks in the Lincoln Lab 'solutions' MIT data.

03/31wed-18:00:32.637334 194.7.248.153:2076 -> 172.16.112.20:53
04/02fri-15:53:24.050418 194.7.248.153:1238 -> 172.16.112.20:53
04/02fri-18:49:30.235173 195.73.151.50:7332 -> 172.16.112.20:53

Original Rule 255:

alert tcp $EXTERNAL_NET any -> $HOME_NET 53 (msg:"DNS zone transfer TCP"; flow:to_server,established; content:"|00 00 FC|"; offset:15; reference:arachnids,212; reference:cve,1999-0532; classtype:attempted-recon; sid:255; rev:11;)

Content generalised Rule 255:

alert tcp $EXTERNAL_NET any -> $HOME_NET 53 (msg:"DNS zone transfer TCP"; flow:to_server,established; content:"|00 00 |?||"; offset:15; reference:arachnids,212; reference:cve,1999-0532; classtype:attempted-recon; sid:255; rev:11;)

The DNS zone transfer attack exploits a buffer overflow in BIND version 4.9 releases prior to BIND 4.9.7 and BIND 8 releases prior to 8.1.2. An improperly or maliciously formatted inverse query on a TCP stream destined for the named service can crash the named server or allow an attacker to gain root privileges.

Generalisations of rules 323, 530, 1201 and 1378: These four generalised rules correctly pick up attacks. These attacks had already been spotted with the original SNORT rules. However, the additional packets found with the generalised rules provide the system administrator with better insight into the attacks by highlighting additional events that might be of interest. We will use rule 1201 (ntinfoscan) as an example of how this occurred.
The original rule reads:

alert tcp $HTTP_SERVERS $HTTP_PORTS -> $EXTERNAL_NET any (msg:"ATTACK-RESPONSES 403 Forbidden"; flow:from_server,established; content:"HTTP/1.1 403"; depth:12; classtype:attempted-recon; sid:1201; rev:7;)

This was turned into the following generalised rule:

alert tcp $HTTP_SERVERS $HTTP_PORTS -> $EXTERNAL_NET any (msg:"ATTACK-RESPONSES 403 Forbidden"; flow:from_server,established; content:"HTTP/1.|?| 403"; depth:12; classtype:attempted-recon; sid:1201; rev:7;)

Using the original rule alone, Snort did not pick up the additional packets/events because it was looking for HTTP/1.1, whereas the attacker used HTTP/1.0. Here is a part of some of the offending events:

GET /scripts/ HTTP/1.0
HTTP/1.0 403 Access Forbidden (Read Access Denied - This Virtual Directory does not allow objects to be read.)
Content-Type: text/html

HTTP/1.0 403 Access Forbidden (Read Access Denied - This Virtual Directory does not allow objects to be read.)
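The single-character generalisation illustrated by rule 1201 can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the authors' generalisation program: the function names `generalise_content` and `matches` are hypothetical, the `|?|` wildcard is emulated with a Python regular expression rather than Snort's own matcher, only plain ASCII content strings are handled (no `|..|` hex blocks), and the `min_length` guard implements the fix suggested above for rule 250.

```python
import re
from typing import List, Optional

WILDCARD = "|?|"  # wildcard notation used in the generalised rules above


def generalise_content(content: str, min_length: int = 2) -> List[str]:
    """Return one variant per position, with that character wildcarded.

    Content strings shorter than min_length are not generalised, so a
    one-character content (as in rule 250) yields no variants.
    """
    if len(content) < min_length:
        return []
    return [content[:i] + WILDCARD + content[i + 1:] for i in range(len(content))]


def to_regex(pattern: str) -> "re.Pattern[str]":
    """Translate a pattern containing |?| wildcards into a compiled regex."""
    literal_parts = pattern.split(WILDCARD)
    return re.compile(".".join(re.escape(p) for p in literal_parts), re.DOTALL)


def matches(pattern: str, payload: str, depth: Optional[int] = None) -> bool:
    """Check the pattern against the payload, limited to the first `depth` characters."""
    window = payload if depth is None else payload[:depth]
    return to_regex(pattern).search(window) is not None


if __name__ == "__main__":
    payload = "HTTP/1.0 403 Access Forbidden"
    original = "HTTP/1.1 403"
    print(matches(original, payload, depth=12))  # original content misses HTTP/1.0
    hits = [v for v in generalise_content(original) if matches(v, payload, depth=12)]
    print(hits)  # variants that match the HTTP/1.0 response
```

Under these assumptions, only the variant that wildcards the minor version digit matches the HTTP/1.0 response within the first 12 characters, which mirrors the behaviour described for generalised rule 1201.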

Finally, rule 1377: The original rule 1377 looks for two pieces of content, '~' and '['. The generalised rule looks for only '[' and finds it because '[' is used to describe the beta version number, as someone is using a beta version of wu-ftp. This could be of interest to the system administrator, as the use of beta software might be against policy because it potentially introduces additional security risks.

Original Rule 1377:

alert tcp $EXTERNAL_NET any -> $HOME_NET 21 (msg:"FTP wu-ftp bad file completion attempt ["; flow:to_server,established; content:"~"; content:"["; distance:1; reference:bugtraq,3581; reference:bugtraq,3707; reference:cve,2001-0550; reference:cve,2001-0886; classtype:misc-attack; sid:1377; rev:14;)

Generalised Rule 1377:

alert tcp $EXTERNAL_NET any -> $HOME_NET 21 (msg:"FTP wu-ftp bad file completion attempt ["; flow:to_server,established; content:"|?|"; content:"["; distance:1; reference:bugtraq,3581; reference:bugtraq,3707; reference:cve,2001-0550; reference:cve,2001-0886; classtype:misc-attack; sid:1377; rev:14;)

Part of the actual event:

220 hobbes FTP server (Version wu-2.4.2-academ[BETA-15](1) Sat Nov 1 03:08:32 EST 1997) ready.
USER anonymous
331 Guest login ok, send your complete e-mail address as password.

6.2 Content Generalisation or Catching Variants of the BugBear Virus

To see the potential benefit of content rule generalisation, let us consider a specific real-life example using the following generalised content rules against the BugBear Trojan virus. First, the SNORT rule shown below was created to identify a set of byte code within the virus. However, the BugBear virus (correctly known as the W32.Bugbear.B@mm worm) creates variations of itself as it spreads. We had both a '.scr' and a '.pif' variation to test against, but only the .scr variant was identified by the original rule.
The .pif variation is as dangerous as the original and spreads just as quickly.

alert tcp any any -> any any (msg:Possible BugBear B Attack; content:|3b 63 e7|; dsize:>21;)

Applying the generalised content FuzzRule program to the SNORT rule created three variations. Using these generalised variations, a match was then made against the .pif virus variation that had previously escaped detection. The generalised rules are shown below:

alert tcp any any -> any any (msg:Possible BugBear B Attack FuzzRuleId cor(\'||?| 63 e7|\'); content:||?| 63 e7|; regex; dsize:>21;)
alert tcp any any -> any any (msg:Possible BugBear B Attack FuzzRuleId cor(\'|3b |?| e7|\'); content:|3b |?| e7|; regex; dsize:>21;)
alert tcp any any -> any any (msg:Possible BugBear B Attack FuzzRuleId cor(\'|3b 63 |?||\'); content:|3b 63 |?||; regex; dsize:>21;)

7 Summary and Conclusions

In this paper we showed how, using simple generalisation, alert rules can be modified to show up new variants of old attacks. Using this method, we were able to identify rules that had too stringent criteria and also found new variants of a known Trojan. Currently, only the surface has been scratched regarding generalised NIDS rule matching and it is difficult to draw any definitive conclusions. However, some of the more unusual matches against generalised rules have shed light on how generalisation may aid SNORT, or indeed any NIDS, in finding undefined attacks. The techniques researched, developed and analysed have brought up a large number of false alerts. Once these alerts are eliminated, some potentially interesting alerts shine through. Further investigation is required to determine fully how effective generalisation can be. For instance, it is important to work out how to distinguish false positive alerts more automatically from genuine new alerts generated by generalised rules.
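As a starting point for that kind of automatic separation, the frequency heuristic used in Section 6.1 can be sketched as below. This is an illustrative sketch only: the function name `triage` and the three labels are hypothetical, the thresholds (ignore alerts occurring more than 25 times, treat fewer than 10 occurrences as the most interesting) are taken from the text, and the sample counts are the frequencies from Table 2 plus the roughly 38,000 alerts attributed to a single rule.

```python
from typing import Dict


def triage(alert_counts: Dict[str, int],
           ignore_above: int = 25,
           rare_below: int = 10) -> Dict[str, str]:
    """Label each generalised rule by how often it fired.

    Counts above ignore_above are treated as likely false or trivial alerts;
    counts below rare_below are flagged for inspection first.
    """
    labels = {}
    for rule, count in alert_counts.items():
        if count > ignore_above:
            labels[rule] = "ignore"
        elif count < rare_below:
            labels[rule] = "inspect first"
        else:
            labels[rule] = "inspect"
    return labels


# Frequencies from Table 2, plus the single high-volume rule ("nearly 38,000").
counts = {
    "250": 3, "255": 3, "323": 1, "530": 1, "1201": 10, "1377": 9, "1378": 1,
    "WEB-MISC ICQ Webfront HTTP DOS": 38_000,
}

if __name__ == "__main__":
    for rule, label in triage(counts).items():
        print(f"{rule}: {label}")
```

Frequency alone cannot separate rule 250 (a false alert) from rule 255 (a true alert), since both fired three times, which is why a more discriminating classification step is still needed.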
From the results and analysis in this paper, it seems that in particular applying generalisation to the content and uricontent SNORT rule parameters should be investigated further. One hypothesis as to why applying generalisation to the uricontent option string appears more useful is that URI (e.g. web page address) strings could easily vary across attacks. An attack involving a URI string may have the same effect if a slightly different directory name is used, especially where standard directory names may vary across web server installations.

References

S Axelsson (2000) 'Intrusion Detection Systems: A Survey and Taxonomy', Chalmers University Tech Report, 99-15.
M Burgess (2006) 'Probabilistic anomaly detection in distributed computer networks', Science of Computer Programming, vol 60, pp 1-26.
T Crothers (2003) 'Implementing Intrusion Detection Systems', Wiley.
F Esponda, S Forrest and P Helman (2004) 'A formal framework for positive and negative detection schemes', IEEE Transactions on Systems, Man, and Cybernetics-Part B: Cybernetics, 34(1), pp 357-373.
V Fuller, T Li, J Yu and K Varadhan (1993) 'Classless Inter-Domain Routing (CIDR): an Address Assignment and Aggregation Strategy', RFC 1519.
Free Software Foundation Inc (2006) 'GNU', http://www.gnu.org/licenses/licenses.html
F Gomez, F Gonzalez and D Dasgupta (2003) 'An immuno-fuzzy approach to anomaly detection', Proc. of the IEEE International Conference on Fuzzy Systems.
J Hoagland and S Staniford (2003) 'Viewing IDS alerts: Lessons from SnortSnarf', http://www.silicondefense.com/research/whitepapers/index.php
J Kim, P Bentley, U Aickelin, J Greensmith, G Tedesco and J Twycross (2007) 'Immune System Approaches to Intrusion Detection - A Review', Natural Computing, Springer, forthcoming.
G Lawton (2002) 'Open Source Security: Opportunity or Oxymoron?', Institute of Electrical and Electronics Engineers Inc, http://www.computer.org/computer/co2002/r3018abs.htm
Lincoln Lab 'MIT data' (1999), http://www.ll.mit.edu/IST/ideval/docs/1999/
T Mitchell (1997) 'Machine Learning', McGraw Hill.
P Ning and D Xu, 'Hypothesizing and Reasoning about Attacks Missed by Intrusion Detection Systems', ACM Transactions on Information and System Security (TISSEC), Vol. 7, No. 4, pp 591-627.
S Northcutt, 'Network Intrusion Detection', New Riders Publishers.
Sourcefire Inc, M Roesch and C Green (2006) 'SNORT Users Manual - SNORT Release: 2.6.0', http://www.snort.org
S Staniford, J Hoagland and J McAlerney (2002) 'Practical Automated Detection of Stealthy Portscans', Journal of Computer Security, vol 10, no 1.
