Analyzing covert social network foundation behind terrorism disaster
This paper addresses a method to analyze the covert social network foundation hidden behind the terrorism disaster. It is to solve a node discovery problem, which means to discover a node, which functions relevantly in a social network, but escaped f…
Authors: Yoshiharu Maeno, Yukio Ohsawa
Int. J. Services Sciences, Vol. x, No. x, xxxx
Copyright © 200x Inderscience Enterprises Ltd.

Yoshiharu Maeno
Graduate School of Systems Management, Tsukuba University, Otsuka 3-29-1, Bunkyo-ku, Tokyo 112-0012, Japan
E-mail: maeno.yoshiharu@nifty.com

Yukio Ohsawa
School of Engineering, University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-8563, Japan
E-mail: ohsawa@q.t.u-tokyo.ac.jp

Abstract: This paper addresses a method to analyze the covert social network foundation hidden behind the terrorism disaster. It is to solve a node discovery problem, which means to discover a node, which functions relevantly in a social network, but escaped from monitoring on the presence and mutual relationship of nodes. The method aims at integrating the expert investigator's prior understanding, insight on the terrorists' social network nature derived from the complex graph theory, and computational data processing. The social network responsible for the 9/11 attack in 2001 is used to execute simulation experiment to evaluate the performance of the method.

Keywords: communication, complex network, graph theory, node discovery, social network, terrorism

Reference to this paper should be made as follows: Maeno, Y. 'Analyzing a social network foundation behind the terrorism disaster', Int. J. Services Sciences, Vol. X, No. Y, pp.000–000.

Biographical notes: Yoshiharu Maeno received the B.S. and M.S. degrees in physics from the University of Tokyo, Tokyo, Japan. He is currently working toward the degree at the Tsukuba University, Tokyo. He is with NEC Corporation. His research interests lie in non-linear phenomena, complex networks, social interactions, human cognition, and innovation. He is a member of the IEEE (Systems Man & Cybernetics, Computational Intelligence, Computer, and Technology Management Societies), APS, and INSNA.
He received the Young Researchers' Award from the IEICE in 1999.

Yukio Ohsawa received the Ph.D. degree in communication and information engineering from the University of Tokyo, Tokyo, Japan. He was with the Graduate School of Business Sciences, Tsukuba University, Tokyo. In 2005, he joined the School of Engineering, University of Tokyo, where he is currently an Associate Professor. He initiated the research area of chance discovery as well as a series of international meetings (conference sessions and workshops) on chance discovery, e.g., the fall symposium of the American Association of Artificial Intelligence (2001). He co-edited books on chance discovery published by Springer-Verlag and Advanced Knowledge International, and also special issues of journals such as New Generation Computing. Since 2003, his activity as Director of the Chance Discovery Consortium Japan has linked researchers in cognitive science, information sciences, and business sciences, and business people to chance discovery. It also led to the introduction of these techniques to researchers in Japan, the U.S., the U.K., China, Taiwan, R.O.C., etc.

1 Introduction

Terrorism is a man-made disaster. It causes great economic, social and environmental impacts. It differs from the emergencies arising from natural disasters (earthquakes, hurricanes etc.) in that active non-routine responses are always necessary in addition to disaster recovery management. The short-term target of the responses includes interpretation of the hidden intention of the terrorism and arrest of the terrorists responsible for the disaster. The long-term target is identification and weakening of the covert foundation which raises, encourages, and helps terrorists. For example, a conspirator, named Mustafa A.
Al-Hisawi, had attempted to help terrorists enter the United States (according to Wikipedia), and provided Mohamed Atta and the hijackers responsible for the 9/11 attack in 2001 with financial support worth more than $300,000 (according to New York Times). Future terrorism disasters can be mitigated and eliminated by dismantling such a covert social network foundation existing behind the terrorism.

This paper addresses a method to analyze the covert social network foundation existing behind the terrorism disaster. Mathematically, the objective of the analysis is to solve a node discovery problem. The problem is to discover a node which functions relevantly in a complex social network, but has escaped from monitoring on the presence and mutual relationship of nodes, either intentionally or accidentally. Practically, the problem is difficult to solve for two reasons. First, the terrorism disaster is infrequent and non-routine, and consequently does not take a fixed form. Second, the intelligence and surveillance (communication logs and meeting records are examples) on the covert social network is limited, or still worse, missing completely. We cannot, therefore, rely on conventional machine learning and probabilistic inference techniques under such a condition. Instead, our method aims at integrating the expert investigator's prior understanding, insight on the terrorists' social network nature derived from the complex graph theory, and computational data processing.

The approach to solve the node discovery problem is developed in section 2. The social network of the hijackers and conspirators in the 9/11 attack is reviewed in section 3. In section 4, the network in section 3 is used to execute a simulation experiment to discover a covert conspirator by the approach presented in section 2. Related works are summarized in section 5.
Concluding remarks are presented in section 6.

2 Approach

2.1 Problem definition

Before presenting our approach, we define the node discovery problem and describe assumptions. The node discovery problem in a complex network is new in two senses. First, the problem has not attracted much attention from researchers. In contrast, a link discovery problem is studied intensively in bio-informatics to predict an unknown chemical reaction between 2 molecules. Second, the nature of the covert social network foundation behind the terrorism is not understood well, despite the fact that many organizations and human relationships are described by scale-free networks or small worlds.

This problem is illustrated in Figure 1. The inset (a) represents the observed records on the organization under investigation. Geographically distributed persons are likely to use the Internet to join the organizational decision-making process, to determine the attack plan, and to give instructions to the terrorists. In the example, the records are sets of participants of email-based on-line group discussions. Four persons ($p_0$, $p_1$, $p_2$, $p_3$) joined the first discussion (subject 0). We can gather a number of records automatically if we simply assume that an individual discussion is indicated by the same email subject. The records are in the form of a market basket shown by eq.(1).

$$
b_i = \{p_j\} \quad (0 \le i \le |b|-1). \tag{1}
$$

The order of records and the order of persons in a record are not significant here. The problem may be extended to a time-sensitive or causality-sensitive situation where the orders provide us with a significant clue to solve the problem. Such a situation is left for future study.

Cluster structures can be extracted from the records. The cluster is a group of persons, between whom communication is active.
In the example, two clusters $c_0$ ($p_0$, $p_1$, $p_2$, $p_3$), and $c_1$ ($p_4$, $p_5$, $p_6$, $p_7$) can be extracted. They are visualized on the social network diagram in the blue box. The diagram is an undirected graph. The black nodes denote persons, and black links between the nodes denote the presence of active communication. The links are drawn according to the degree of activeness between the two nodes at the ends of the link. The links are not directed because the communication is bi-directional. The cluster is not necessarily a clique (a complete graph where links exist between every possible pair of the nodes).

The inset (b) represents the latent structure behind the observed records. In this example, the latent structure is a covert participant (or participants) who used telephone to tell persons in the separate clusters to encourage the organization-wide communication, and to adjust the direction of decision. The person escaped from the email surveillance in (a). The fifth record in (a) is not consistent from the viewpoint of the overall cluster structure of the organization. This is a clue. The unobserved person $p_x$ may be hidden in the empty space between the gateway persons ($p_0$, $p_4$) in the clusters. The person is indicated by a red node and red links connecting the clusters in the social network diagram. The red node is a hypothetical candidate of the latent structure.

Our aim is to reveal clues to infer (b) from (a). Note that the identity of the red nodes cannot be derived from the observed records in (a) automatically, but is inferred with the aid of the expert investigator's knowledge. Our primary interest here lies in drawing a social network diagram to invent a hypothesis on the latent structure which is ready for testing. The interactive process for this purpose is presented in the following.

Figure 1 Inset (a) represents the observed records on the participants ($p_i$) of email-based on-line group discussions. A record is a list of persons who join an individual discussion indicated by the same email subject. Two clusters $c_0$ ($p_0$, $p_1$, $p_2$, $p_3$), and $c_1$ ($p_4$, $p_5$, $p_6$, $p_7$) can be extracted from the five records. Black nodes denote persons. Black links between the nodes denote the presence of communication. Inset (b) represents the latent structure (a covert participant) behind the observed records. The fifth record is a clue to infer that an unobserved person $p_x$ may be hidden in the empty space between the gateway persons ($p_0$, $p_4$) in the clusters. The unobserved person may use telephone to foster communication between the gateway persons. Our aim is to reveal (b) from (a).

2.2 Interactive process

We propose an interactive process starting from the intelligence, surveillance, and the prior knowledge of expert investigators toward the hypothesis on the latent structure. Figure 2 shows the process. The algorithm, used in the computational data processing shown in the dashed grey box, visualizes the observed records on communication in the form of eq.(1) into a social network diagram. It consists of a clustering procedure and a ranking procedure. The clustering procedure evaluates the activeness of communication between the persons, and uses the prior knowledge such as the number of groups or the known group leaders. The ranking procedure calculates likeliness of the suspicious inter-cluster relationships, which originate in the unobserved person hidden in the empty spots between the clusters, and indicates the position of the person as a red node. The expert investigators explore the difference between the visualized social network diagram and the prior understanding. The difference is expected to be a trigger to notice something new.
The expert can update the prior understanding, iterate the above procedures, and finally invent a hypothesis on the latent structure (Maeno, 2007).

The details of the algorithm are presented in the following. The essence of the algorithm is the ranking function to calculate likeliness of the suspicious inter-cluster relationships.

Figure 2 Interactive process from the intelligence, surveillance and prior knowledge of the expert investigators toward the hypothesis on the latent structure. The computational data processing in the dashed grey box visualizes the observed records on communication in the form of eq.(1). It consists of clustering using the prior knowledge, and ranking of suspicious inter-cluster relationships which originate in the unobserved person. The expert explores the difference between the visualized social network diagram and the prior understanding, which is the basis to invent a hypothesis.

2.3 Computational data processing

Our algorithm focuses on inter-cluster relationships in a social network (Ohsawa, 2005). Examples of the inter-cluster relationships include sharing of information on the guard system among the hijacker groups via a conspirator, or efficient multicast of a directive to the groups from a conspirator. The input of the algorithm is the observed records in eq.(1). The output is the ranking of the individual records (indicating suspicious inter-cluster relationships or unobserved persons playing a catalyst role among the clusters), and the persons in the clusters playing a gateway role to the unobserved person. The output is further processed to draw a social network diagram.

As a preparation, we define a simple Boolean function $B(s)$ by eq.(2). It returns 1 if the statement $s$ is true, and 0 otherwise.

$$
B(s) = \begin{cases} 1 & \text{if } s \text{ is true} \\ 0 & \text{otherwise} \end{cases}. \tag{2}
$$

First, all persons appearing in the observed records $b_i$ in eq.(1) are grouped into clusters $c_j$. The number of clusters $|c|$ depends on the prior knowledge. Mutually close persons form a cluster. The measure of closeness between a pair of persons is evaluated by Jaccard's coefficient. It is defined by eq.(3). The function $F(p_i)$ is the occurrence frequency of a person $p_i$ in the records. The closeness means activeness of the communication if the record is a set of the persons appearing together in the emails, conversations, or meetings. Jaccard's coefficient is used widely in link discovery, web mining, or text processing.

$$
J(p_i, p_j) = \frac{F(p_i \cap p_j)}{F(p_i \cup p_j)} = \frac{\sum_{0 \le k \le |b|-1} B((p_i \in b_k) \wedge (p_j \in b_k))}{\sum_{0 \le k \le |b|-1} B((p_i \in b_k) \vee (p_j \in b_k))}. \tag{3}
$$

Here, we employ the k-medoids clustering algorithm (Hastie, 2001). It is an EM (expectation-maximization) algorithm similar to the k-means algorithm for numerical data. A medoid $p_{med}(c_j)$ is the person located most centrally within a cluster $c_j$. It corresponds to the center of gravity in the k-means algorithm. The medoid persons are selected at random initially. The other $|p|-|c|$ persons are classified into the clusters whose medoid is the closest. A new medoid is selected within an individual cluster so that the sum of Jaccard's coefficients between the medoid and the persons in the cluster ($M(c_j)$ defined by eq.(4)) is maximal. This is repeated until the medoids converge.

$$
M(c_j) = \sum_{(p_i \in c_j) \wedge (p_i \ne p_{med}(c_j))} J(p_{med}(c_j), p_i). \tag{4}
$$

Other simple algorithms such as hierarchical clustering, or advanced algorithms for unsupervised learning, such as self-organizing mapping, can also be employed.

Then, we evaluate the likeliness of the records as a candidate to include unobserved persons with a ranking function $I(b_i)$.
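The clustering step described above can be illustrated with a minimal Python sketch that computes Jaccard's coefficient of eq.(3) and runs a k-medoids loop maximizing $M(c_j)$ of eq.(4). It is a sketch under stated assumptions, not the authors' implementation: the function names, the toy records, the tie-breaking rule, and the option to pass initial medoids (for example known group leaders) instead of drawing them at random are hypothetical choices made for the example.

```python
import random


def jaccard(records, a, b):
    """Jaccard's coefficient of eq.(3) for persons a and b."""
    both = sum(1 for r in records if a in r and b in r)
    either = sum(1 for r in records if a in r or b in r)
    return both / either if either else 0.0


def k_medoids(records, n_clusters, init_medoids=None, seed=0, max_iter=100):
    """Group persons into clusters; returns a list of sets.

    If init_medoids is given (an assumption of this sketch), it is used
    instead of n_clusters randomly drawn medoids.
    """
    persons = sorted(set().union(*records))
    sim = {(a, b): jaccard(records, a, b) for a in persons for b in persons}
    if init_medoids is None:
        medoids = random.Random(seed).sample(persons, n_clusters)
    else:
        medoids = list(init_medoids)
    clusters = {}
    for _ in range(max_iter):
        # Assignment: each person joins the closest medoid (ties: first medoid).
        clusters = {m: [m] for m in medoids}
        for p in persons:
            if p not in clusters:
                closest = max(medoids, key=lambda m: sim[(m, p)])
                clusters[closest].append(p)
        # Update: the new medoid maximizes M(c_j) of eq.(4) within its cluster.
        new_medoids = [
            max(members, key=lambda c: sum(sim[(c, q)] for q in members if q != c))
            for members in clusters.values()
        ]
        if set(new_medoids) == set(medoids):
            break
        medoids = new_medoids
    return [set(members) for members in clusters.values()]


# Hypothetical toy records: two groups plus one record linking p0 and p4.
records = [
    {"p0", "p1", "p2", "p3"},
    {"p0", "p1", "p2", "p3"},
    {"p4", "p5", "p6", "p7"},
    {"p4", "p5", "p6", "p7"},
    {"p0", "p4"},
]
clusters = k_medoids(records, 2, init_medoids=["p1", "p5"])
print(sorted(sorted(c) for c in clusters))
```

Starting from the medoids p1 and p5, the toy records separate into the groups p0 to p3 and p4 to p7. A random start can settle in a poorer local optimum, so several restarts or prior knowledge of likely medoids may be needed in practice.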
The ranking function calculates the degree of strength at which the record attracts persons belonging to multiple clusters, which originates in an unobserved person hidden in the record. The unobserved person is assumed to be a catalyst to foster the inter-cluster relationship. We present a few ranking functions. The simplest ranking function $I_{av}(b_i)$ is defined by eq.(5). It is the degree of contribution of a person $p_k$ (belonging to the cluster $c_j$) to the record $b_i$, averaged over the clusters. The records having a larger value are ranked as more likely. The algorithm retrieves the records in the order of likeliness. The number of retrieved records $m_{ret}$ can be set arbitrarily (from 1 to $|b|$).

$$
I_{av}(b_i) = \frac{1}{|c|} \sum_{0 \le j \le |c|-1} \max_{p_k \in c_j} \frac{B(p_k \in b_i)}{\sum_{0 \le l \le |b|-1} B(p_k \in b_l)}. \tag{5}
$$

Eq.(5) can be converted to a simpler form in eq.(6), because the denominator in eq.(5) is the occurrence frequency $F(p_k)$. The term for a cluster $c_j$ is 0 if no person in $c_j$ appears in $b_i$.

$$
I_{av}(b_i) = \frac{1}{|c|} \sum_{0 \le j \le |c|-1} \frac{1}{\min_{(p_k \in c_j) \wedge (p_k \in b_i)} F(p_k)}. \tag{6}
$$

A gateway person $p_{gtw}(b_i, c_j)$ in the cluster $c_j$ for the record $b_i$ is calculated by eq.(7). It is the person who maximizes the term to be averaged in eq.(5).

$$
p_{gtw}(b_i, c_j) = \arg\max_{p_k \in c_j} \frac{B(p_k \in b_i)}{\sum_{0 \le l \le |b|-1} B(p_k \in b_l)}. \tag{7}
$$

Standard deviation is an alternative to calculate the likeliness. $I_{sd}(b_i)$ defined by eq.(8) is employed instead of eq.(5) or eq.(6). The records having a smaller value are ranked as more likely.

$$
I_{sd}(b_i)^2 = \frac{1}{|c|} \sum_{0 \le j \le |c|-1} \left( \max_{p_k \in c_j} \frac{B(p_k \in b_i)}{\sum_{0 \le l \le |b|-1} B(p_k \in b_l)} - I_{av}(b_i) \right)^2. \tag{8}
$$

The average of the two largest values, $I_{tp}(b_i)$, is an alternative to the average over all clusters in eq.(5) or eq.(6). This is defined by eq.(9). The records having a larger value are ranked as more likely.
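The ranking functions described above can be sketched in Python as follows. This is an illustrative sketch, not the authors' code: the function names and the toy records are hypothetical, the clusters are taken as given, and returning no gateway person for a cluster that is absent from a record is an assumption of the example. The top-two variant follows the prose description (the average of the two largest per-cluster terms).

```python
from math import sqrt


def frequencies(records):
    """F(p): number of records in which each person appears."""
    freq = {}
    for record in records:
        for person in record:
            freq[person] = freq.get(person, 0) + 1
    return freq


def cluster_terms(records, clusters, i):
    """Per-cluster terms of eq.(5) for record b_i and gateway persons of eq.(7)."""
    freq = frequencies(records)
    terms, gateways = [], []
    for cluster in clusters:
        present = sorted(p for p in cluster if p in records[i])
        if present:
            # B(p in b_i) / F(p) is largest for the least frequent member present.
            gateway = min(present, key=lambda p: freq[p])
            terms.append(1.0 / freq[gateway])
            gateways.append(gateway)
        else:
            # The term is 0 by eq.(5); reporting no gateway is an assumption.
            terms.append(0.0)
            gateways.append(None)
    return terms, gateways


def i_av(terms):
    """Average over the clusters, eq.(5) and eq.(6): larger is more likely."""
    return sum(terms) / len(terms)


def i_sd(terms):
    """Standard deviation over the clusters, eq.(8): smaller is more likely."""
    mean = i_av(terms)
    return sqrt(sum((x - mean) ** 2 for x in terms) / len(terms))


def i_tp(terms):
    """Average of the two largest per-cluster terms: larger is more likely."""
    top_two = sorted(terms, reverse=True)[:2]
    return sum(top_two) / 2


# Hypothetical toy records: two groups plus one record linking p0 and p4.
records = [
    {"p0", "p1", "p2", "p3"},
    {"p0", "p1", "p2", "p3"},
    {"p4", "p5", "p6", "p7"},
    {"p4", "p5", "p6", "p7"},
    {"p0", "p4"},
]
clusters = [{"p0", "p1", "p2", "p3"}, {"p4", "p5", "p6", "p7"}]
ranking = sorted(
    range(len(records)),
    key=lambda i: i_av(cluster_terms(records, clusters, i)[0]),
    reverse=True,
)
print(ranking[0], cluster_terms(records, clusters, ranking[0])[1])
```

In this toy case the record linking the two groups is ranked first under the average, and p0 and p4 are returned as its gateway persons.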
$$
I_{tp}(b_i) = \frac{1}{2} \left[ T\left( \max_{p_k \in c_j} \frac{B(p_k \in b_i)}{\sum_{0 \le l \le |b|-1} B(p_k \in b_l)}, 1 \right) + T\left( \max_{p_k \in c_j} \frac{B(p_k \in b_i)}{\sum_{0 \le l \le |b|-1} B(p_k \in b_l)}, 2 \right) \right]. \tag{9}
$$

The function $T(x_j, k)$ in eq.(9) picks up the $k$-th element from $x_j$ sorted in descending order. More formally, it is defined recursively by eq.(10).

$$
T(x_j, k) = \max_{(x_j \notin T(x_j, l)) \wedge (l < k)} x_j \quad (k = 0, 1, \ldots). \tag{10}
$$

Finally, the retrieved records and gateway persons are visualised into a social network diagram. The unobserved person in the record $b_i$ is labelled as $DE_i$, and drawn as a red node. The red node and the gateway persons $p_{gtw}(b_i, c_j)$ are connected with red links. A social network diagram like the inset (b) in figure 1 is drawn in this way.

3 Social network

We briefly review the social network responsible for the 9/11 attack in 2001 (Krebs, 2002). The study provides us with an insight on the covert social network foundation behind the terrorism disaster. The social network is also used in the simulation in section 4.

(Krebs, 2002) and (Morselli, 2007) studied the social network consisting of the 19 hijackers boarding the 4 crashed airplanes (AA11, AA77, AA175, and UA93) and the revealed 18 conspirators. The network is shown in figures 3 and 4. Figure 3 shows the hijackers. Figure 4 includes the conspirators.

Figure 3 Social network diagram representing the observed 19 hijackers responsible for the 9/11 attack (Krebs, 2002). The flight number of the hijacked airplanes such as AA11 is shown after "@" after the hijacker names.

Figure 4 Social network diagram representing the observed 19 hijackers responsible for the 9/11 attack in figure 3 with the revealed 18 covert conspirators (Krebs, 2002).

The overall network topology is studied. The nodal degree averaged over all nodes is $\mu(d) = 4.6$. The Gini coefficient of the nodal degree is 0.33.
The clustering coefficient averaged over all nodes is $\mu(c) = 0.6$. It is 3.2 times larger than that in the Barabasi-Albert model (Barabasi, 1999), a scale-free network, having the same Gini coefficient. The large clustering coefficient indicates that clusters exist as a core structure, but the network takes a less compact form. As qualitatively suggested by (Klerks, 2002), the terrorists possess a cluster-and-bridge structure, rather than a center-and-periphery structure. It is in agreement with the observation that the Al Qaeda network is a flexible tie-up of isolated cliques (Popp, 2006). Note that a bridge is an essential component to make clusters rendezvous to form a social network. The absence of hubs overcomes the drawbacks of a scale-free network, where the hubs result in vulnerability to attacks (Albert, 2000) and easy exposure by the efficient search over the network (Adamic, 2001).

4 Simulation

4.1 Test data

We present a quantitative performance evaluation of the proposed method. The test data, as an input to our method, is communication records simulated on the 9/11 social network in section 3, and configured to include a covert conspirator as a latent structure for simulation purposes. The records are generated in the 2 steps below. In the second step, a latent structure is configured by deleting a conspirator from the records (Maeno, 2006). Note that the latent structure does not change the communication pattern in the social network, but changes observable communication.

The first step is to collect the simulated communication into records. Communication is assumed to be information dissemination over links from an initiator. It is like a conversation taking place under the subject the initiator is concerned with. Communication transmits on a link at a probability of $t$.
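The two generation steps can be sketched in Python as follows, using the 2-hop limit and the uniform choice of initiator stated in this subsection. It is a hypothetical illustration rather than the authors' simulator: the function names, the toy path network, and the detail that every link out of a newly reached person gets one independent transmission trial are assumptions of the example.

```python
import random


def simulate_record(adjacency, t, rng, initiator=None, max_hops=2):
    """One record: the persons reached from an initiator within max_hops hops."""
    if initiator is None:
        initiator = rng.choice(sorted(adjacency))  # uniform choice of initiator
    reached = {initiator}
    frontier = [initiator]
    for _ in range(max_hops):
        next_frontier = []
        for person in frontier:
            for neighbour in sorted(adjacency[person]):
                # Each link out of a reached person transmits with probability t.
                if neighbour not in reached and rng.random() < t:
                    reached.add(neighbour)
                    next_frontier.append(neighbour)
        frontier = next_frontier
    return reached


def hide_person(records, person):
    """Second step: delete the covert person from every record."""
    return [record - {person} for record in records]


# Hypothetical path network a - b - c - d, not the 9/11 network.
adjacency = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
rng = random.Random(0)
records = [simulate_record(adjacency, 0.8, rng) for _ in range(5)]
observed = hide_person(records, "b")
print(len(observed))
```

Passing a seeded `random.Random` instance keeps a run reproducible, which helps when comparing ranking functions on the same simulated records.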
The probability $t$ represents communication strength. Communication reaches $t \times \mu(d)$ persons by a hop on the average. The maximal transmission distance is limited to 2 hops because 3-hop long communication covers most persons due to the small network size. An initiator is selected uniformly. Persons whom communication reaches are grouped into a record. Hijackers and conspirators are not distinguished here. The average number of persons included in a basket is $|b_i|$=6.5, 10.1, 13.7, and 17.1 at $t$=0.4, 0.6, 0.8, and 1.0. The number of baskets used in the evaluation is $|b|$=370. The following are example records initiated by Abdul A. Al-Omari, Mustafa A. Al-Hisawi, Waleed Alshehri, and Fayez Ahmed.

• $b_0$={Abdul A. Al-Omari, Marwan Al-Shehhi, Mohamed Atta, Waleed Alshehri}.
• $b_1$={Mustafa A. Al-Hisawi, Marwan Al-Shehhi, Mohamed Atta, Fayez Ahmed, Waleed Alshehri}.
• $b_2$={Waleed Alshehri, Abdul A. Al-Omari, Mustafa A. Al-Hisawi, Wail Alshehri, Satam Suqami}.
• $b_3$={Fayez Ahmed, Mohand Alshehri, Hamza Alghamdi}.

The second step is to configure a covert conspirator as a latent structure. A latent structure is configured in the records by deleting the conspirator (the target to be inferred in the simulation) from the data. As a result, the deleted conspirator and the related links become invisible. The records, behind which the covert conspirator is hidden, are the input to the algorithm. The following are example records where Mustafa A. Al-Hisawi is configured to be a covert conspirator. The algorithm is expected to retrieve $b_1'$ and $b_2'$, which are different from $b_1$ and $b_2$. Such clues are used to start investigation on Waleed Alshehri who is included in both baskets.

• $b_0'$={Abdul A. Al-Omari, Marwan Al-Shehhi, Mohamed Atta, Waleed Alshehri}=$b_0$.
• $b_1'$={Marwan Al-Shehhi, Mohamed Atta, Fayez Ahmed, Waleed Alshehri}.
• $b_2'$={Waleed Alshehri, Abdul A. Al-Omari, Wail Alshehri, Satam Suqami}.
• $b_3'$={Fayez Ahmed, Mohand Alshehri, Hamza Alghamdi}=$b_3$.

4.2 Performance evaluation

In information retrieval, precision and recall are used as evaluation criteria. Precision $p$ is the fraction of relevant data among all data returned by the search. The relevant data here is the records from which the covert conspirator has been deleted in the second step. Recall $r$ is the fraction of all relevant data that is returned by the search. They are defined by eq.(11) and eq.(12). The sums up to $m_{ret}-1$ run over the retrieved records.

$$
p = \frac{\sum_{0 \le i \le m_{ret}-1} B(b_i \ne b_i')}{m_{ret}}. \tag{11}
$$

$$
r = \frac{\sum_{0 \le i \le m_{ret}-1} B(b_i \ne b_i')}{\sum_{0 \le i \le |b|-1} B(b_i \ne b_i')}. \tag{12}
$$

Besides, the F value is useful as a harmonic mean of precision and recall. It is defined by eq.(13).

$$
F = \frac{1}{\frac{1}{2}\left(\frac{1}{p} + \frac{1}{r}\right)} = \frac{2pr}{p + r}. \tag{13}
$$

The F value gain $g_F$ is defined by eq.(14). It is the ratio of the F value of the algorithm to the F value $F_{rd}$ of the random retrieval.

$$
g_F = \frac{F}{F_{rd}}. \tag{14}
$$

Performance of the algorithm is evaluated with the test data under several conditions. Figure 5 shows precision and recall to retrieve the records where a covert conspirator, Mustafa A. Al-Hisawi, has been hidden. Mustafa A. Al-Hisawi was a big financial sponsor to the hijackers, as mentioned in section 1. The number of clusters is $|c|$=4. The probability of communication transmission is $t$=0.8. The horizontal axis is the ratio of the number of retrieved basket data to the number of the whole basket data ($m_{ret}/|b|$). The records retrieved in the top 10% of the ranking are all correct, that is, precision is 100% when the top 10% of the baskets are retrieved. The ranking function $I_{sd}(b_i)$ seems to show a little better performance than $I_{av}(b_i)$. $I_{sd}(b_i)$ is employed in the following study. The algorithm works fine.
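The evaluation criteria in eq.(11) to eq.(14) can be sketched in Python as follows. The function names and the example ranking are hypothetical. The text does not state how the F value of the random retrieval is obtained, so the sketch assumes that it is computed from the expected precision (the fraction of relevant records) and the expected recall ($m_{ret}/|b|$) of a uniformly random retrieval; this baseline is an assumption of the example.

```python
def relevant_records(records, observed):
    """Indices i with b_i != b_i', i.e. records changed by hiding the conspirator."""
    return {i for i, (b, b_obs) in enumerate(zip(records, observed)) if b != b_obs}


def precision_recall(ranking, relevant, m_ret):
    """Eq.(11) and eq.(12) for the top m_ret records of a ranking."""
    hits = sum(1 for i in ranking[:m_ret] if i in relevant)
    return hits / m_ret, hits / len(relevant)


def f_value(p, r):
    """Harmonic mean of precision and recall, eq.(13)."""
    return 2 * p * r / (p + r) if p + r > 0 else 0.0


def f_gain(ranking, relevant, m_ret):
    """Eq.(14), with an assumed random-retrieval baseline."""
    p, r = precision_recall(ranking, relevant, m_ret)
    n = len(ranking)
    # Assumption: random retrieval has expected precision |relevant|/|b|
    # and expected recall m_ret/|b|.
    f_rd = f_value(len(relevant) / n, m_ret / n)
    return f_value(p, r) / f_rd


# Hypothetical example: 10 records, 4 of them relevant, top 2 retrieved.
ranking = [4, 1, 0, 7, 2, 3, 5, 6, 8, 9]
relevant = {1, 4, 7, 8}
print(precision_recall(ranking, relevant, 2), f_gain(ranking, relevant, 2))
```

Under this baseline, the gain tends to 1 as $m_{ret}$ approaches $|b|$, because any ranking then returns the same set of records.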
Precision is 0.45 when all baskets are retrieved. The problem here includes many correct answers. It is not so difficult because the network is small. (Maeno, 2006) studies the performance for a network consisting of 400 nodes.

Figure 5 Precision $p$ and recall $r$ to retrieve the records where a covert conspirator, Mustafa A. Al-Hisawi, has been hidden: (a) $p$ using $I_{av}(b_i)$, (b) $r$ using $I_{av}(b_i)$, (c) $p$ using $I_{sd}(b_i)$, (d) $r$ using $I_{sd}(b_i)$, (e) $p$ using $I_{tp}(b_i)$, and (f) $r$ using $I_{tp}(b_i)$. The number of clusters is $|c|$=4. The probability of communication transmission is $t$=0.8. The horizontal axis is the ratio of the number of retrieved basket data to the number of the whole basket data ($m_{ret}/|b|$).

Figure 6 shows precision and recall at $|c|$=2, 4, 8, and $t$=0.8. The value of $|c|$ depends on the prior knowledge of the social network structure. The case where $|c|$=4 is a reasonable choice, based on the knowledge that 4 airplanes were hijacked. It actually shows the best performance. With the wrong prior knowledge, $|c|$=2, the performance degrades. Performance degradation at $|c|$=8 is small because the practical number of groups including conspirators may be close to, but a little larger than 4.

Figure 6 Precision $p$ and recall $r$ to retrieve the records where a covert conspirator, Mustafa A. Al-Hisawi, has been hidden: (a) $p$ at $|c|$=2, (b) $r$ at $|c|$=2, (c) $p$ at $|c|$=4, (d) $r$ at $|c|$=4, (e) $p$ at $|c|$=8, and (f) $r$ at $|c|$=8. The simulation condition is that $t$=0.8, and $I_{sd}(b_i)$ is used.

Figure 7 shows F value gain at $|c|$=4, and $t$=1.0, 0.8, 0.6, 0.4. At $t$=1.0 and 0.8, the performance is stable (the curve is smooth). At $t$=1.0, the gain is small because the increasing input information and longer reach of communication make the problem easy. At $t$=0.6, the performance begins to be unstable (the curve begins to fluctuate).
At $t$=0.4, the algorithm fails to work because the input information is too poor to extract inter-cluster relationships.

Figure 7 F value gain to retrieve the records where a covert conspirator, Mustafa A. Al-Hisawi, has been hidden: (a) $t$=1.0, (b) $t$=0.8, (c) $t$=0.6, and (d) $t$=0.4. The simulation condition is that $|c|$=4, and $I_{sd}(b_i)$ is used.

Figure 8 shows F value gain for a variety of covert conspirators. The algorithm works for Lotfi Raissi and Rayed M. Abdullah. Lotfi Raissi was under suspicion of training the pilots who hijacked the AA77 and flew it into the Pentagon. Rayed M. Abdullah trained with Hani Hanjour who hijacked the AA77. Their position in the social network is similar to that of Mustafa A. Al-Hisawi. For Ramzi B. Al-Shibh and Said Bahaji, the algorithm also works, although a little degradation is observed. For Osama Awadallah and Raed Hijazi, the performance becomes less stable and worse. Osama Awadallah met Nawaf Al-Hazmi, who hijacked the AA77, many times. Raed Hijazi was said to have a connection to Osama bin Laden, and to have prepared the explosives for the Millennium plot in Jordan in 2000. The degradation may arise because Osama Awadallah and Raed Hijazi are at the border of the network. Their absence in the records does not affect the overall clustering structure, and is not easy to discover. The algorithm is limited for such covert conspirators.

Figure 8 F value gain to retrieve the records where a covert conspirator has been hidden. The covert conspirator is (a) Mustafa A. Al-Hisawi, (b) Lotfi Raissi, (c) Rayed M. Abdullah, (d) Ramzi B. Al-Shibh, (e) Said Bahaji, (f) Osama Awadallah, and (g) Raed Hijazi. The simulation condition is that $|c|$=4, $t$=0.8, and $I_{sd}(b_i)$ is used.

Figure 9 shows F value gain to retrieve the records where a covert conspirator, Raed Hijazi, has been hidden.
$I_{av}(b_i)$ and $I_{tp}(b_i)$ are employed again as in Figure 5. $I_{tp}(b_i)$ shows better performance although it is still a little unstable and may not be sufficient for practical use. The performance may be improved by focusing on the relationship between 2 clusters, rather than between all clusters.

Figure 9 F value gain to retrieve the records where a covert conspirator, Raed Hijazi, has been hidden: using (a) $I_{av}(b_i)$, (b) $I_{sd}(b_i)$, and (c) $I_{tp}(b_i)$. The simulation condition is the same as in figure 5 ($|c|$=4 and $t$=0.8).

4.3 Social network visualization

A social network diagram is drawn from the observed records according to the process in figure 2. The unobserved person in a suspicious record is drawn as a red node. The red node and the gateway persons $p_{gtw}(b_i, c_j)$ are connected with red links.

Figure 10 shows the social network diagram. The condition is the same as in figure 5, where a covert conspirator, Mustafa A. Al-Hisawi, has been hidden and is the target to discover. The 4 terrorist groups are inter-connected with 10 of the highly ranked red nodes, $DE_i$, corresponding to Mustafa A. Al-Hisawi hidden in the suspicious records. The bottom left cluster, including Nawaf Alhazmi, Mohamed Atta, and Hani Hanjour, is isolated and not connected to the red nodes. Terrorists who appear more frequently are less emphasized because of the denominator of eq.(5) or eq.(6). This is not a problem, but good news. We are inclined to overlook uncommon and unexpected clues by paying too much attention to something frequent and conspicuous. On the other hand, focusing on something infrequent is a double-edged sword. We may confuse the clues observed infrequently with random noise. Majed Moqed, Mohamed Abdi, and Ahmed K. I. S. Al-Ani are probably noise. They are distant from Mustafa A. Al-Hisawi in figure 4.
It is, however, remarkable that Waleed Alshehri and Mohand Alshehri are retrieved as neighbor persons of the red nodes indicating the existence of Mustafa A. Al-Hisawi. They are close to him. Waleed Alshehri, one of the muscle hijackers, helped Mohamed Atta hijack the AA11 and fly it into the North Tower of the World Trade Center. Mohand Alshehri hijacked the AA175 and flew it into the South Tower of the World Trade Center. Waleed Alshehri is connected with 6 links. He is the keystone person for the investigators to gather information on relatives, friends, and associates to approach Mustafa A. Al-Hisawi.

Figure 10 Four clusters and ten of the highly ranked red nodes corresponding to Mustafa A. Al-Hisawi hidden in the suspicious records. Waleed Alshehri and Mohand Alshehri are retrieved as neighbor persons of the red nodes.

5 Related works

Existing terrorist or criminal social networks are studied empirically. (Batallas, 2006) applied centrality (Freeman, 1979) and brokerage (Cusumano, 2000) to analyze an aircraft engine development project, and suggested the relevance of an information leader team, which could be either a bottleneck or an innovation diffuser. (Keila, 2006) applied factor analysis to study email exchange in Enron, which ended in bankruptcy due to the institutionalized accounting fraud. (Klerks, 2002) points out that criminal organizations tend to be strings of inter-linked small groups that lack a central leader, but coordinate their activities along logistic trails and through bonds of friends, and that hypotheses can be built by paying attention to remarkable white spots and hard-to-fill positions in a network.
(Krebs, 2002) investigates the 9/11 terrorist network, and reveals the relevance of conspirators who reduce the distance between hijackers and enhance communication efficiently. (Morselli, 2007) investigates Krebs's network from the viewpoint of the efficiency and security trade-off, and suggests that a more security-oriented structure arises from a longer time-to-task of the terrorists' objectives, and that conspirators improve communication efficiency while preserving the hijackers' small visibility and exposure.

Complex network, graph theory, and learning help us get an insight on the dynamics of a social network, in addition to summarizing and visualizing a network (Shen, 2006), and analyzing a cognitive network (Krackhardt, 1987). Scale-free networks (Barabasi, 1999) and small worlds (Watts, 1998) present us much insight on the structure and evolution of a social network: scientists' collaboration, actors in movies, etc. A power law in the nodal degree distribution governs the scale-free network. (Fenner, 2007) proposes an exponential cutoff mechanism to modify the power law. Error attack tolerance (Albert, 2000) and search efficiency (Adamic, 2001) are of particular interest for practical applications. Link discovery is applied to predict collaboration between scientists from the published co-authorship (Liben-Nowell, 2004). (Adamic, 2003) proposes a technique to infer friends and neighbors from the information available on the web. (Singh, 2004) applied a hidden Markov model and a Bayesian network to predict the behavior of terrorists.

Learning of a Bayesian network is extended to study the probabilistic nature of latent variables. (Silva, 2006) studied learning of the structure of a linear latent variable graph. (Friedman, 1998) studied learning of the structure of a dynamic probabilistic network.
The principled analytic approach often suffers from a complexity problem. The complexity includes bi-directional and cyclic influence among the many observed and latent nodes (beyond a triad: 1 latent node influencing 2 observed nodes).

6 Concluding remark

In this paper, we demonstrate the proposed method to analyze the covert social network foundation hidden behind the terrorism disaster. The method integrates the expert investigator's prior understanding, insight on the terrorists' social network nature derived from the complex graph theory, and computational data processing. It is effective in discovering a node which functions relevantly in a social network, but escaped from monitoring on the presence and mutual relationship of nodes. Precision, recall, and F value characteristics of the algorithm are evaluated in the simulation experiment using the social network responsible for the 9/11 attack in 2001.

There are still remaining issues. How high is the quality of the hypothesis invented from the social network diagram indicating unobserved persons? We need to test the quality of the hypothesis invented by subject investigators in more realistic cases. How wide is the applicability of the algorithm in terms of social network topology, communication pattern, and their dynamical change? We need to investigate the performance of the algorithm under a wider variety of environments, and to optimize the ranking function. We believe that the proposed method, along with the future study of the remaining issues, will contribute to understanding the latent threats in social phenomena and human activities, as well as to analyzing the covert social network foundation hidden behind the terrorism disaster.

References

Adamic, L. A., and Adar, E. (2003) ‘Friends and neighbors on the web’, Social Networks, Vol. 25, pp.211-228.
Adamic, L. A., Lukose, R. M., Puniyani, A. R., and Huberman, B. (2001) ‘Search in power-law networks’, Physical Review E, Vol. 64, 046135.
Albert, R., Jeong, H., and Barabasi, A. L. (2000) ‘Error attack tolerance of complex networks’, Nature, Vol. 406, pp.378-381.
Barabasi, A. L., Albert, R., and Jeong, H. (1999) ‘Mean-field theory for scale-free random networks’, Physica A, Vol. 272, pp.173-187.
Batallas, D. A., and Yassine, A. A. (2006) ‘Information leaders in product development organizational networks: Social network analysis of the design structure matrix’, IEEE Transactions on Engineering Management, Vol. 53, pp.570-582.
Cusumano, M. (2000) ‘How Microsoft makes large teams work like small teams’, Sloan Management Review, Vol. 39, pp.9-20.
Fenner, T., Levene, M., and Loizou, G. (2007) ‘A model for collaboration networks giving rise to a power law distribution with an exponential cutoff’, Social Networks, Vol. 27, pp.79-90.
Friedman, N., Murphy, K., and Russell, S. (1998) ‘Learning the structure of dynamic probabilistic networks’, Proceedings of the Annual Conference on Uncertainty in Artificial Intelligence, Madison, pp.139-146.
Freeman, L. C. (1979) ‘Centrality in networks: I. Conceptual clarification’, Social Networks, Vol. 1, pp.215-239.
Hastie, T., Tibshirani, R., and Friedman, J. (2001) The elements of statistical learning: Data mining, inference, and prediction (Springer series in statistics), Springer-Verlag.
Keila, P. S., and Skillicorn, D. B. (2006) ‘Structure in the Enron email dataset’, Journal of Computational & Mathematical Organization Theory, Vol. 11, pp.183-199.
Klerks, P. (2002) ‘The network paradigm applied to criminal organizations’, Connections, Vol. 24, pp.53-65.
Krackhardt, D. (1987) ‘Cognitive social structures’, Social Networks, Vol. 9, pp.109-134.
Krebs, V. E. (2002) ‘Mapping networks of terrorist cells’, Connections, Vol. 24, pp.43-52.
Liben-Nowell, D., and Kleinberg, J. (2004) ‘The link prediction problem for social networks’, Proceedings of the ACM International Conference on Information & Knowledge Management, New York.
Maeno, Y., and Ohsawa, Y. (2007) ‘Human-computer interactive annealing for discovering invisible dark events’, IEEE Transactions on Industrial Electronics, Vol. 54, pp.1184-1192.
Maeno, Y., and Ohsawa, Y. (2006) ‘Stable deterministic crystallization for discovering hidden hubs’, Proceedings of the IEEE International Conference on Systems, Man & Cybernetics, Taipei, pp.1393-1398.
Morselli, C., Giguere, C., and Petit, K. (2007) ‘The efficiency/security trade-off in criminal networks’, Social Networks, Vol. 29, pp.143-153.
Ohsawa, Y. (2005) ‘Data crystallization: chance discovery extended for dealing with unobservable events’, New Mathematics and Natural Computation, Vol. 1, pp.373-392.
Popp, R. L., and Yen, J. eds. (2006) Emergent information technologies and enabling policies for counter-terrorism, IEEE Press.
Shen, Z., Ma, K., and Eliassi-Rad, T. (2006) ‘Visual analysis of large heterogeneous social networks by semantic and structural abstraction’, IEEE Transactions on Visualization and Computer Graphics, Vol. 12, pp.1427-1439.
Silva, R., Scheines, R., Glymour, C., and Spirtes, P. (2006) ‘Learning the structure of linear latent variable models’, Journal of Machine Learning Research, Vol. 7, pp.191-246.
Singh, S., Allanach, J., Haiying, T., Pattipati, K., and Willett, P. (2004) ‘Stochastic modeling of a terrorist event via the ASAM system’, Proceedings of the IEEE International Conference on Systems, Man & Cybernetics, Hague, pp.5673-5678.
Watts, D. J., and Strogatz, S. H. (1998) ‘Collective dynamics of small-world networks’, Nature, Vol. 393, pp.440-442.