A Clonal Selection Algorithm with Levenshtein Distance based Image Similarity in Multidimensional Subjective Tourist Information and Discovery of Cryptic Spots by Interactive GHSOM

Mobile Phone based Participatory Sensing (MPPS) system involves a community of users sending personal information and participating in autonomous sensing through their mobile phones. Sensed data can also be obtained from external sensing devices that…

Authors: Takumi Ichimura, Shin Kamada

A Clonal Selection Algorithm with Levenshtein Distance based Image   Similarity in Multidimensional Subjective Tourist Information and Discovery   of Cryptic Spots by Interactive GHSOM
A Clonal Selection Algo rithm w ith Le v enshtein Distance based Image Similarity in Multidimensional Subjecti v e T ourist Information and Discov ery of Cryptic Spots b y Interact i v e GHSOM T akumi Ichimura Faculty of Management and Information Systems, Prefectural Univ ersity of Hiroshima 1-1-7 1, Ujina-Higashi, Minami-ku , Hiroshima, 734-85 59, Japan Email: ichimur a@pu-hirosh ima.ac.jp Shin Kamada Graduate School of Comprehen si ve Scientific Research, Prefectural University of Hir oshima 1-1-7 1, Ujina-Higashi, Minami-ku , Hiroshima, 734-85 59, Japan Email: sh inkamada4 6@gmail.com Abstract —Mobile Phone based Participatory Sensing (M P PS) system in volv es a communi ty of users send ing personal infor - mation and participating in autonomous sensing through their mobile phones. Sensed data can also be obtained from external sensing devices th at can communicate wirelessly to the phone. Our deve loped tourist subjective data c ollection system with Android smartphone can determine the fil tering rules to provide the important information of sightseeing spot. Th e rules are automatically generated by In teractiv e G rowing Hierarchical SOM. H owev er , the filtering rules related to photograph were not generated, because the extraction of th e specified characteristics from images cannot be realized. W e p ropose the effective meth od of the Levenshtein distance to deduce the spatial proximity of image viewpoin ts and thus d etermine the specified pattern in which images should be processed. T o ve rify the proposed method, some experiments to classify the subjective d ata with images are executed by Interactiv e GHSOM and Clonal Selection Algorithm with Immunological Memory Cells in this paper . Index T erms —Lev enshtein Distance, Cl onal Selection Algo- rithm, Image An alysis, Immunological Memory Cell s, Gro w- ing Hierarchical SOM , Interactiv e GHSOM, Smartphone based Participatory Sensing S ystem, T ourist Informatics, Knowledge Discov ery I . I N T RO D U C T I O N The current in formatio n technolog y can co llect various data sets because the recent tremendo us technical advances in pr o cessing power , storage capac ity and network con nected cloud computin g. Th e sample record in such data set in cludes not on ly num erical values but also langua ge, ev a lu ation, and binary d ata such as p ictures. T h e techn ical metho d to d iscover knowledge in such databases is kn own to b e a field of data mining and dev eloped in various r esearch fields. Mobile Phon e ba sed Participatory Sen sin g (MPPS) system in volves a com munity of users sending person al infor m ation c  2013 IEE E. Personal use of thi s mate rial is permitted. Permission from IEEE must be obtaine d for all other uses, in any current or future media, includi ng reprinting/ republishi ng this material for adv ertising or promot ional purposes, creating new collect i ve works, for resale or redistributi on to s ervers or lists, or reuse of any cop yrighted component of thi s work in other works. and participating in autonom o us sensing through th eir m obile phone s [1]. Sensed data can be obtained from sensing devices present on mob iles such as au dio, video, and motion sensors, the latter av ailable in hig h -end mobile phon es. Sensed data can also be obtain ed from external sensing de vices that can commun icate wirelessly to the phone. Participation of mo - bile pho ne users in sensor ial d ata collection both f rom th e individual and fr om th e surr oundin g environmen t presents a wide range of opp ortunities for truly p ervasi ve ap p lications. The tourist subjective data collectio n system with Andr oid smartphon e has been developed [2]. T he application can collec t subjective data such as pictures with GPS, geogr a p hic lo c ation name, the ev aluation, and com ments in r e al sightseeing sp ots where a tourist v isits and m ore than 500 subjecti ve data are stored in the d atabase. A ttr activ e knowledge discovery for sight seeing spots is requ ired to promo te the sightseeing industries. W e have alread y proposed the classification metho d f r om the collected subjective data b y the interactive GHSOM [3], [4] and th e knowledge is extracted fr o m the classification r e sults of the interac tive GHSOM by C4. 5 [ 5]. Howe ver , the imag e data was n ot included in the classification tasks, because it is too la rge amount o f inf ormation to realize the extractio n of specified characteristics from images. There is cu rrently an abundance of vision alg o rithms which are cap able of deter mining th e relative p o sitions of th e view- points fr om which the images hav e been acq uired. Howe ver , very few o f th ese algorith ms can cop e with u norder ed im- age sets fo r which no a prio r proxim ity o rdering in forma- tion is available. I m age localization can be ad d ressed in the framework of the funda m ental structu re and motion (SaM) estimation pr oblem an d ben efits fro m the wid e field of view offered by Smartphon e camera . This is b ecause a wid e field of view facilitates cap turing large p ortions of the en viron ment with few images and witho ut resorting to the use of movable gaze contro l mechan isms such as pan-tilt units. Fu rthermor e, en vironm ent fea tures rem ain visible in large subsets of images and c r itical surfaces are less likely to cover the whole visual field. The defin ition of relativ e positions and orien tations of the viewpoints corresp onding to a set of unorder ed central images is an imp ortant proced ure to be statistical analysis in image retriev al. The idea in the pro posed app roach employs the L ev- enshtein distance[6] to deduce the spatial proxim ity of imag e viewpoints and thus deter mine th e spec ified p attern in which images should be processed. Horizo ntal m atching method for localizing uno rdered p anoramic images has b een propo sed[7]. In the meth od, all images have been acquired from a constant height above a planar groun d an d ope r ates seq u entially b y the Lev enshtein distance. Our propo sed method can process not only in horizontal matching but also in vertical matchin g. In this paper, the photo g raphs ar e di vided into some categories accordin g to the similar ity by the clonal selection algorithm with immu n ological memory cells bef ore the classification by GHSOM. The ar ea of artificial imm une system (AIS) h as been an ev er-increasing inte r ested in not only theo retical works but applications in pattern recognitio n , network security , and op- timization [8], [9]. AI S uses idea s g leaned fro m im munolo gy in order to develop adaptive systems capable of performin g a wid e rang e of tasks in v arious researc h areas. Gao in- dicated th e co mplementar y r oles of somatic h ypermu tation (HM) and r eceptor editing (RE) and presented a novel clonal selection algorithm called RECSA model b y incorporatin g the Receptor E diting m ethod [1 0]. The immu nologica l memo ry which lead s to a p erception that an individual is immun e to a particular ag ent is rea lized by the clustering of the generated antibodies[1 1 ]. The rem ainder of this pap er is organ ized as fo llows. In Section II, the clo nal selectio n th eory with memo ry cells will be explained briefly . The idea ab out the antibody structure of images by Lev enshtein Distance and exp erimental results are discussed in Section III. Section IV describes the algorith m of interactive GHSOM an d its interface tool. Section V explain s the tour ist subjective data and the exper im ental results. In Section VI, we g iv e som e d iscu ssions to co nclude this p aper . I I . C L O N A L S E L E C T I O N A L G O R I T H M W I T H I M M U N O L O G I C A L M E M O RY Clonal Selection Algo rithm with Immun ological M e m- ory(CSAIM) m odel h as been pr oposed to introd uce an idea of immuno logical mem ory into th e RECSA model. Th is section describes the struc ture o f antib ody in RECSA model to th e medical diag nosis briefly . The fur ther details about the CSAIM algorithm was descr ibed in [1 1]. A. Antib ody for Classificatio n Pr o blem This subsection describes the antibo d y for classification problem ab out the structur e, the method of somatic hyp ermu- tation and receptor editing, an d a ffinity . 1) Struc tur e of Antibo dy for Classificatio n Pr oblem: Fig.1 shows th e structure of antibody in the classification problem [11 ]. w k , θ is the weigh t of antibody an d th reshold, θ W 1 W k R 1 R 2 R 3 R n_sub Fig. 1. The antibody structure w k q w 2 w 1 w k - 1 w k w 2 w 1 w k - 1 q Fig. 2. RE for w 2 , w k − 1 respectively . R 1 , · · · , R n sub indicate the sub-region in th e problem , b e cause some classification prob lem ca n be divided into n sub sub tasks. Th at is, a region is expert for the sp e cified task in classification. 2) Somatic Hypermutation and Receptor Editing : HM up - dates the rando m ly selected w i and θ for a para tope P = ( w 1 , ..., w k , θ ) as follows. w i = w i + ∆ w , θ = θ + ∆ θ, where ∆ w , ∆ θ are − γ w < ∆ w < γ w , − 1 < ∆ θ < γ θ , respectively . γ w and γ theta are a small number . RE makes a crossover of 2 set o f w i for a par atope as shown in Fig.2. 3) Affinity: Th e system calculates the degree of affinities between antibod y and antigen by using Eq.(1) and Eq.(2). f ( x p ) =  1 if | P k i =1 w i x p i − θ | ≥ E sim 0 otherw ise (1) g ( x p ) =  1 if f ( x p ) = x p T ar get 0 otherw ise (2) Eq.(3) calculates the d egre e of affinity . Initialize Population Compute Affinity Select n Best Antibodies Elite Pool 1 Clone P 1 Antibodies Antibodies Reselect Hyper mutation Receptor Editing Elite Pool n Clone P n Antibodies Antibodies Reselect Hyper mutation Receptor Editing Cell Update End ............... training Memory Cells 1 2 3 4 5 6 7 8 Create c random antibodies Select Memory Cells Fig. 3. A flow of CSAIM model Af f inity = tr num X p =1 g ( x p ) , (3) where x p means the p th sam ple in tr num training ca ses and x p T ar get is 1 if an example is a T arget, otherwise 0. B. RECSA Model[10] The shap e-space mode l aims at qua n titati vely describing the interaction s among Ag s an d Ab s ( Ag - Ab ) [12]. The set of features that ch aracterize a molecule is called its gener alized shape. The A g - Ab cod ification de te r mines their spatial repre- sentation and a distance measure is used to calculate the d egree of interaction between these molecules. The Gao’ s mode l [10] can be describ ed as fo llows. 1) Create an initial pool of m antibodies as candidate solutions ( Ab 1 , Ab 2 , · · · , Ab m ) . 2) Compute the af finity of all antibodies: ( D ( A b 1 ) , D ( A b 2 ) , · · · , D ( Ab m )) . D () means the functio n to compu te the affinity . 3) Select n best in dividuals based o n their affinities from the m origin al antibo dies. These antibod ies will be refe r red to as the elites. 4) Sort the n selected elites in n sepa r ate and distinct pools in ascending or der . They will be ref erred to as the elite po ols. 5) Clone the elites in the pool with a rate proportion al to its fitness. The amo unt of clon e gen erated for these antibodies is giv en by E q.(4). P i = rou nd ( n − i n × Q ) , (4) where i is the ord in al nu mber of the elite poo ls, Q is a multiply ing factor for determ ining the scope of the clon e and round () is the o perator that rounds tow ards the closest integer . Then , we can obtain P P i antibodies as (( Ab 1 , 1 , Ab 1 , 2 , · · · , Ab 1 , p 1 ) , · · · , ( Ab n , 1 , Ab n , 2 , · · · , Ab n , p n )) . 6) Subject the clones in ea c h po ol through either hyper mutation or recepto r editing pro cess. The mutatio n rates, P hm for hyperm utation and P r e for recepto r editing g iven by Eq.(5) and Eq.(6), are inversely propo rtional to the fitness of the parent a n tibody , P hm = a/D () (5) P r e = ( D () − a ) /D () , (6) where D () is the affinity of the curr ent par ent antibody and a is an approp riate numero us value. 7) Determine the fittest in dividual B i in each elite p ool from amongst its mutated clones. The B i is satisfied with the following equatio n. D ( B i ) = max ( D ( Ab i , 1 ) , · · · , D ( Ab i , p i )) , i = 1 , 2 , · · · , n (7) 8) Update the p arent an tibodies in each elite poo l with the fittest individual o f the clones and the prob ability P ( Ab i , → B i ) is accord ing to the r o les: if D ( Ab i ) < D ( B i ) then P = 1 , if D ( Ab 1 ) ≥ D ( B 1 ) th en P = 0 , if D ( Ab i ) ≥ D ( B i ) , i 6 = 1 then P = exp ( D ( B i ) − D ( Ab i ) α ) . Immunological Training IM IM+1 3 3 2 2 new 1 1 Fig. 4. A clustering method of m emory cells 9) Replace the worst c ( = β × n , β is the parameter .) elite pools with new rand om antibodies o nce every t genera tions to intro duce d i versity and prevent the searc h from bein g trapped in local optima. 10) Determine if the max imum numb e r of generation G max to ev o lve is reached . If it is satisfied with this con d ition, it terminates and r eturns the best an tibody . Othe r wise, go to Step 4). C. Immu nological Memory Cell Clustering Memory Cells are r equired to classify the an- tibodies respondin g the specified samples. T h is paper rea l- izes the clu stering by allocatin g the generated antibo d ies by RECSA model into som e categories. The in itial numb er of categories is predefined and a new category is created accord - ing to tr aining situation. Fig. 4 shows the clustering metho d of memory cells. Similar antibodies crowd arou nd an app ropriate point in each category , an d then only central antibo dy of the crowd can become a memory cell. Howev er , we may meet that memory cells can n ot recognize so m e of samp le s in the data set. In such a case, some new gen erated antibodies by RECSA model tries to respond to the mis-classification of th e samples, if the similar antibodies m ake a crowd. T o find the crowd of similar cases, the system check s whether th e Euclid ean distance b etween n ormalized trainin g sample and its co r respond ing antib ody is smaller than the predeterm ined param e te r µ θ . The similarity is measured by the following. Let ~ d = ( d 1 , · · · , d i , · · · , d k ) be the elements of input signa l and ~ h = ( h 1 , · · · , h i , · · · , h k ) be the element of a ntibody . In order to calculate th e distance between th e sample a n d the antibod y , the ran ge of sample is chan ged to that o f antibod y as follows. d ′ i = d i × h j d j ( d i 6 = 0 ∧ h i 6 = 0) , where d j is the min value of element in the input sample. Then, if th e Euclidean distance betwee n ~ d ′ and ~ h is smaller than µ θ , th e antibod y can respond th e samp le. In th is p aper, µ θ is the summation of 12 in put elements. I I I . S I M I L A R I T Y O F I M A G E S B Y L E V E N S H T E I N D I S T A N C E A. Levenshtein Dista n ce The Le venshtein Distance (LD) is a string metric for measuring the difference b etween two seq uences, the source ( s ) and target ( t ). The LD between s and t co rrespond s to the minimum nu mber o f edits needed to tran sform one string into the other . Th e a llow able op eration to edit are defined as letter deletio n, inser tio n, and substitution of a sing le char a cter . The LD can be c o mputed in O ( | s || t | ) time b y a d ynamic progr amming technique, known a s the Levenshtein algorithm . That is, LD betwee n s and t , l ev s,t ( | s || t | ) , is giv en as f o llows. l ev s,t ( i, j ) =            max( i, j ) , if min( i, j ) = 0 min    l ev s,t ( i − 1 , j ) + 1 , l ev s,t ( i, j − 1) + 1 , l ev s,t ( i − 1 , j − 1) + [ s i 6 = t j ]) , , otherw ise (8) As a byprodu ct, this algorithm re turns the pairs of letters that hav e been matched while comp u ting the LD. I f a letter at po sition i in s matches the letter at position j in t , then letters in s at po sitions k > i can o nly m atch letters in t that are at po sitions l > j . Th e LD has been employed in various d omains in need of p a ttern matching , such a s spell checking , pattern r ecognition , speech re cognition , informatio n theory , cryptolog y , bio inform atics, and so on. W ith respect to the field of compute r vision , the use of the LD has be e n rather limited and has c oncerned th e comp a rison of g r aph structures under edit operatio n s[13 ]. B. Data Seq u ence in Photo Images The matchin g method using Levenshtein distance described in [7] is the recovering the po sition and orien tation parameters correspo n ding to the viewpoints of a set of pano ramic images. Howe ver, th e m e th od is effecti ve only for the horizo ntal diminishing scale, that is, only fo r c o n verting th e panoramic image into the o riginal image. Ther e is no advantage to the scale for th e usual photo image and the similarity b etween photo images. In this p aper, th e pro posed meth od uses a ne wly-devised sequence of pixel in the usual photog r aph image as shown in Fig.5. The gr id in Fig.5 means the pixel in the foc used imag e, which is the main person or the ma in scene in a picture. The symbol A  , B  , an d C  in Fig.5 is 3 kind s of concentr ic rings. The circular segment of cu rves constituting tubes, B  and C  are located in 1 / 3 and 2 / 3 size of circle, respectively . T h e diameter of circ le A  is 1 / 3 . The sequence of pixel in e a ch region is arr a nged in a spiral p a ttern spreadin g o utward from a central source. Moreover , the circle is divided into 4 su b -regions on the roulette wheel and o ne of them is selected to m easure the similarity of sequen ces between the p hoto imag e an d the representative image suc h as a landmark in the famous spot. The fiabellate shape is r andomly chosen to measure similarity in many times. ψ in Fig.6 is an arbitrar y gap from a rotation axis. The similarity is classified by CSAIM in the section II- C. A B C a c = ( a / 2 - b ) / 4 b = a / 3 b c Fig. 5. A sequence of pixel in a photo image 1 R R 3 2 R R 4 ϕ Fig. 6. 4 sub-regi ons on roule tte wheel I V . I N T E R AC T I V E G H S O M [ 3 ] , [ 4 ] A. Gr owing Hierar chical SOM A b asic alg orithm of GHSOM was described in [1 5]. The algorithm has been cho sen f or its capability to develop a hierarchica l structur e of clu stering and for the intuitive outpu ts which help the interpre ta tio n of the clu sters. These capabilities allow different classification resu lts fro m rou gh sketch to very detailed g r ain o f knowledge. Th is techniq ue is a dev elopmen t of SOM, a popu lar unsupervised neural n e twork model fo r the analysis of high dimension al inp ut d ata [ 1 4]. Fig.7 shows the overview of h ierarchy structu re in GHSOM. B. Contr ol of gr owing hierar chies The p rocess of u nit insertio n and layer stratification in GH- SOM works a c cording to the value of 2 kinds of parame te r s. Because the threshold of their parameters gives a criterio n to hierarchies in GHSOM, GHSOM ca nnot change its structure to data samples adaptively w h ile training m a ps. Therefore, only a few samples are occu rred in a terminal map o f hierar chies. In such a case, GHSOM has a comp lex tr ee structu r e an d many nodes(ma ps). As for these classification results, the acquired knowledge from the structur e is lesser in scope or effect in the d ata mining. When we grasp th e ro ugh an swer f rom the specification in the data set, the optim al set of parameters must be gi ven to a trad itional GHSOM. It is very difficult to find the optimal values throug h empir ic a l studies. M 11 M 21 M 22 M 23 M 24 M 25 M 26 M 32 M 31 Layer 0 Layer 1 Layer 2 Layer 3 Fig. 7. A Hierarchy Structure in GHSOM W e propo sed th e recon struction method of h ierarchy of GHSOM e ven if the deeper GHSOM is perfo rmed[3], [4]. A stopping criterion for stratificatio n is defined. Mo reover , if the quantization error is large and the cond itio n of hie r archies is not satisfied, the requiremen ts f or redistribution o f error a r e defined. Case1) If Eq. (9) and the classification cap ability fo r hierar- chies are satisfied, stop the p rocess o f hier a rchies and insert new units in the map again. n k ≤ αn I , (9) where, n k , n I mean the number of inpu t samples for the winner un it k and o f the all inpu t samples I , respectively . The α is a con stant. Case2) If the qua ntization error is not larger and the ad dition of layer is not executed, we may m e et th e situatio n that the quantization error of a unit is larger than the quantization err or in an overall map. If Eq.(1 0) is satisfied, then a new unit is inserted. q e k ≥ β τ 1 X q e y , y ∈ S k , (10) where S k is the set o f winne r u nits k . β τ 1 is a constant for the quantization error . C. An interface of interactive GHSOM W e d ev eloped the And roid smar tp hone based interface of in teractive GHSOM to acquire the knowledge in tuitiv ely . This to ol was de veloped by Jav a lan guage. Fig. 8 shows the clustering results of Ir is data set [16] by GHSOM. The n otation [ R ][01][1 0] : 1 1 as shown in Fig. 8 represents the locatio n of unit in th e con n ection from the top level [ R ] . [ R ] mean s a root node. The nu merical value(’11’) shows the numb er of samples divided in to the leaf map a f ter the seq uence o f classification [ R ][01][1 0] . The numerica l values in the brackets m ean the position of units in the correspond e d map. The first letter (e.g. ‘0’) and th e second letter (e.g. ‘1’ ) are the po sition in the column and the r ow in the map(e . g. ‘01’) , respectively . The similar colo r of units rep resents an intuitive understand- ing of similar patter n of sam ples. If the num ber of un its in a map ar e increa sed, only a few samples c o uld be classified into a new gen erated unit. On c e th e u n it co nnected to the m ap is Fig. 8. Simulation Result for Iris data set selected, the metho d re-calculates to find a n op tim al set of weights in the loc al tree structur e search an d then a better structure is depicted. When the corre sp onding unit is touche d, the system calcu- lates th e 4 a llocated samples. The system continues to classify again till the user de te r mines the GHSOM structure. W e ca ll the metho d the interac ti ve GHSOM in th e interactive p rocess. The calculation result by the interactive GHSOM as shown in Fig. 8 is obtained and th e effecti veness of th e interactive GHSOM is shown as results of empirical studies. The classification by GHSOM shows the tr ee structur e of clusters an d the con nection amo ng them. The detailed knowledge cannot be represente d in the form of If-The n ru les. Despite low r esolution in kn owledge rep resentation, we grasp the rou gh sketch of kn owledge structure be cause GHSOM shows the sam p les divided in to each unit on the map as shown in Fig . 8. Moreover , there is only a few samples in each unit. Therefo re, k nowledge discovery is executed b y graspin g the structure and a g rain of knowledge. V . E X P E R I M E N TA L R E S U L T Participation of mobile p hone users in sensorial data col- lection both from the individual and from the surro unding en vironm ent pr esents a wide range o f oppo rtunities for truly pervasi ve application s [ 1]. Ou r developed Android smartpho ne application [2] can collec t th e tourist subjective d ata in the research field o f MPPS. The collected subjective data consist o f jpeg files with GPS, geo graphic location name, the ev aluation of { 0 , 1 , 2 , 3 , 4 } and co mments written in natura l language at sightseeing spots to which a user really v isits. The ‘comment’ must be con verted th e numb er of words extrac te d f rom html files in the T ourist websites to a numer ical value. The term frequen cy in the sub jectiv e c o mments is calculated b y TF- IDF (term freq u ency inv erse d ocument frequ ency) meth od [17]. More than 500 sub jecti ve d ata are stored in the data b ase throug h MPPS. (a) Wi thout photos (b) Wi th L eve nshtein distance based similarity Fig. 9. Classificati on Result around Miyajima Fig.9 shows th e overview of classification result o f sub- jectiv e data by GHSOM. Fig.9(a) shows clusters by u sing only GPS, comments, and evaluation. T he data with out pho tos was classified by GHSOM , becau se the image data has a large amount of inform ation an d it is difficult for GHSOM to c la ssify th em as it is. On the con trary , Fig.9 ( b) is the classification re su lt with the image wh ich was pro cessed by using th e Lev enshtein distance b ased similarity of image s. In the paper, we prep are the imag es for 6 representative spots. Especially , the famou s symbol ‘T orii’, a gateway at the entrance to a shrin, was drawn. The clu sters in Fig.9 (b) are divided into 3 groups; ‘high similarity imag e with high TF-IDF and high evaluation’, ‘lo w similarity image with high TF-I DF and high evaluation’, and ‘low similar ity im a ge with low TF- IDF and low ev aluation’. The data with ‘high similarity imag e’ represents the famous sightseeing sp o t. The data with ‘high T F- IDF and high ev aluation’ and w ith out ‘high similar ity image’ means th at th e spot is including the cryp tic tour ist inform ation. Therefo re, the subjective data has scarcity v alue and will be a sightseeing spot potentially . Mo reover , th e classification of image da ta cou ld distinguish th e u n known sp ot in the field of famous landmar k. I n other words, the cr yptic spots will be discovered in near landmar k . V I . C O N C L U S I V E D I S C U S S I O N This pap er presents an efficient an d ro bust metho d f or ex- tracting the specified th e images and arran ge the m in ascen ding order o f the similarity , which represen ts the d egre e in cluding the image o f landmark . Matching a limited amou nt of imag e data has been shown to suffice for registering the images. CSAIM me thod c a n classify th e specified p attern extracted from the images and then the spices in the pattern are divided into som e gro ups. The to urist subjective d ata with the specified image which ar e collected through M PPS is classified by Interactive GHSOM. As a result, the discovery of cr y ptic sightseeing spots is executed. The similarity to the unfamiliar landmark in sightseeing spo ts does n ot measured . In ord e r to improve such a prob lem, more subjectiv e data is r equired to classify them and th e social action amon g participants will be in vestigated in future. A C K N OW L E D G M E N T This work was supported by JSPS KAKENHI Grant Num - ber 25330 366. R E F E R E N C E S [1] N.D.L ane, E.Miluzzo, L.Hong, D. P eebles, T .Choudhury , A.T .Campbell, A surve y of mobile phone sensing , IEEE Communicat ions Magaz ine, V ol.48, No.9, pp.140-150, 2010. [2] ITProducts, Hir oshima T ourist Map , htt ps://mark et.androi d.com/details? id=jp.it products.Kank ouMap, [online], 2011. [3] T . Ichimura, S.Kamada, and K.Kato, Knowledge Discovery of T ourist Subject ive Data in Smartphone Based P articipa tory Sensing System by Inter active Gr owing Hierar chic al SOM and C4.5 , Intl. J. Knowledge and W eb Intell igence , V ol .3, No.2, pp.110-129, 2012. [4] T . Ichimura, S.Kamada, A Gener ation Method of Filt ering R ules of T wit- ter V ia Smartphone Based P articipatory Sensing System for T ourist by Inter active GHSOM and C4.5 , 2012 IEE E Internati onal Conference on Systems, Man, and Cybernet ics (IEEE SMC 2012), pp.110-115, 2012. [5] J.R.Quinlan, Impr oved use of continuous attribute s in c4.5 , Journal of Artificia l Intel ligenc e Research, No.4, pp.77-90, 1996. [6] V .I. Lev enshtei n, Binary codes capable of corr ecti ng delet ions, insertions and re ver sals , Soviet Physics - Dokla dy , V ol.10, No.8, pp.707710, 1996. [7] D.Michel, A. A.Argyros, M.I.A . Lourakis, Horizon matching for localiz- ing unorde red panorami c imag es , J ournal Computer V ision and Image Understand ing, V ol.114, No.2, pp.274-285, 2010. [8] L.N. de Castro and J. Timmis, “ Artificial immune systems: A new computat ional Inte lligenc e Approach, ” Spring er-V erlag, 1996. [9] D. Dasgupta, “ Artificial immune systems and their applicat ions, ” Springer -V erlag, 1999. [10] S.Gao, H,Dai, G.Y ang, and Z.T ang, A novel clonal selecti on algorithm and its appli cation to travel ling salesman probl em , IEICE Tran s. Funda- mentals, V ol.E90-A, pp.2318-2325, 2007. [11] T .Ichimura and S.Kamada, Clustering and Retrieva l Method of Immuno- logi cal Memory Cell in Clonal Select ion Algorithm , Proc. of The 6th Inter- nationa l conferen ce on Soft Computing and Intelligen t Systems and T he 13th Internationa l Symposium on Adva nced Inte lligent Systems(SCIS- ISIS 2012), pp.1351-1356, 2012. [12] A.S. Perelson, Immune network theory , Immunologica l Revie w , V ol.110, pp.5-36, 1993. [13] J. Ng and S. Gong, Learning Intrinsic V ideo Cont ent Using Levensht ein Distance in Graph P artition ing , Proc. of E CCV02, volume 4, pages 670684, Springer -V erlag, 2002. [14] T .Kohonen, T . Self-Org anizing Maps , Springer Series in Informati on Science s, V ol.30, Springer , 1995. [15] A.Rauber , D.Merkl, M.Dittenbach, The grow ing hierar ch ical self- org anizing map: ex plorat ory anal ysis of high-dimen sional dat a , IE EE Tra nsactions on Neural Networks, vol.13 , pp.1331-1341, 2002. [16] R.A.Fisher , Iris Dataset , UCI Machine Learning Reposi tory , http:/ /archi ve.ics.uci.edu/ml/datasets/Iris , [onlin e] 1936. [17] H.C.Wu, R.W .P .Luk, K.F .W ong, K. L .Kwok, Interpreti ng TF-IDF term weight s as making re lev ance deci sions , ACM Tra nsaction s on Information Systems, V ol.26, No.3, pp.137, 2008.

Original Paper

Loading high-quality paper...

Comments & Academic Discussion

Loading comments...

Leave a Comment