A Recommender System based on Idiotypic Artificial Immune Networks
The immune system is a complex biological system with a highly distributed, adaptive and self-organising nature. This paper presents an Artificial Immune System (AIS) that exploits some of these characteristics and is applied to the task of film reco…
Authors: Steve Cayzer, Uwe Aickelin
Journal of Mathematical Modelling and Algorithms, 4(2), pp 181-198, 2005.

Steve Cayzer¹ and Uwe Aickelin²#
¹ Hewlett-Packard Laboratories, Filton Road, Bristol BS12 6QZ, UK, steve.cayzer@hp.com
² School of Computer Science, University of Nottingham, NG8 1BB, UK, uxa@cs.nott.ac.uk
# corresponding author

Abstract: The immune system is a complex biological system with a highly distributed, adaptive and self-organising nature. This paper presents an Artificial Immune System (AIS) that exploits some of these characteristics and is applied to the task of film recommendation by Collaborative Filtering (CF). Natural evolution, and in particular the immune system, have not been designed for classical optimisation. However, for this problem, we are not interested in finding a single optimum. Rather we intend to identify a sub-set of good matches on which recommendations can be based. It is our hypothesis that an AIS built on two central aspects of the biological immune system will be an ideal candidate to achieve this: antigen-antibody interaction for matching and idiotypic antibody-antibody interaction for diversity. Computational results are presented in support of this conjecture and compared to those found by other CF techniques.

1 INTRODUCTION

Over the last few years, a novel computational intelligence technique, inspired by biology, has emerged: AIS. This section introduces AIS and shows how it can be used for solving computational problems. In essence, the immune system is used here as inspiration to create an unsupervised machine-learning algorithm. The immune system metaphor will be explored, involving a brief overview of the basic immunological theories that are relevant to our work. We also introduce the basic concepts of CF.
1.1 OVERVIEW OF THE IMMUNE SYSTEM

A detailed overview of the immune system can be found in many textbooks, for instance [19]. Briefly, the purpose of the immune system is to protect the body against infection and includes a set of mechanisms collectively termed humoral immunity. This refers to a population of circulating white blood cells called B-lymphocytes, and the antibodies they create.

The features that are particularly relevant to our research are matching, diversity and distributed control. Matching refers to the binding between antibodies and antigens. Diversity refers to the fact that, in order to achieve optimal antigen space coverage, antibody diversity must be encouraged [15]. Distributed control means that there is no central controller; rather, the immune system is governed by local interactions between cells and antibodies.

The idiotypic effect builds on the premise that antibodies can match other antibodies as well as antigens. It was first proposed by Jerne [17] and formalised into a model by Farmer et al [11]. The theory is currently debated by immunologists, with no clear consensus yet on its effects in the humoral immune system [14].

The idiotypic network hypothesis builds on the recognition that antibodies can match other antibodies as well as antigens. Hence, an antibody may be matched by other antibodies, which in turn may be matched by yet other antibodies. This activation can continue to spread through the population and potentially has much explanatory power. It could, for example, help explain how the memory of past infections is maintained. Furthermore, it could result in the suppression of similar antibodies, thus encouraging diversity in the antibody pool. The idiotypic network has been formalised by a number of theoretical immunologists [21].
There are many more features of the immune system, including adaptation, immunological memory and protection against auto-immune attack. Since these are not directly relevant to this work, they will not be reviewed here.

1.2 OVERVIEW OF CF

In this paper, we are using an AIS as a CF technique, extending earlier work ([5], [6], [7]). CF is the term for a broad range of algorithms that use similarity measures to obtain recommendations. The best-known example is probably the “people who bought this also bought” feature of the internet company Amazon [2]. However, any problem domain where users are required to rate items is amenable to CF techniques. Commercial applications are usually called recommender systems [22]. A canonical example is movie recommendation.

In traditional CF, the items to be recommended are treated as ‘black boxes’. That is, your recommendations are based purely on the votes of your neighbours, and not on the content of the item. The preferences of a user, usually a set of votes on an item, comprise a user profile, and these profiles are compared in order to build a neighbourhood. The key decisions to be made are:

• Data encoding: Perhaps the most obvious representation for a user profile is a string of numbers, where the length is the number of items, and the position is the item identifier. Each number represents the 'vote' for an item. Votes are sometimes binary (e.g. Did you visit this web page?), but can also be integers in a range (say [0, 5]).

• Similarity Measure: The most common method to compare two users is a correlation-based measure like Pearson or Spearman, which gives two neighbours a matching score between -1 and 1. Vector based methods (e.g. the cosine of the angle between vectors) and probabilistic methods are alternative approaches.
The canonical example is the k-Nearest-Neighbour algorithm, which uses a matching method to select k reviewers with high similarity measures. The votes from these reviewers, suitably weighted, are used to make predictions and recommendations.

Many improvements on this method are possible [13]. For example, the user profiles are usually extremely sparse because many items are not rated. This means that similarity measurements are both inefficient (the so-called ‘curse of dimensionality’) and difficult to calculate due to the small overlap. Default votes are sometimes used for items a user has not explicitly voted on, and these can increase the overlap size [4]. Dimensionality reduction methods, such as Singular Value Decomposition, both improve efficiency and increase overlap [3]. Other pre-processing methods are often used, e.g. clustering [1]. Content-based information can be used to enhance the pure CF approach [13], [9]. Finally, the weighting of each neighbour can be adjusted by training, and there are many learning algorithms available for this [10]. All these improvements could in principle be applied to our AIS, but in the interests of a clear and uncluttered comparison we have kept the CF algorithm as simple as possible.

The evaluation of a CF algorithm usually centres on its accuracy. There is a difference between prediction (given a movie, predict a given user’s rating of that movie) and recommendation (given a user, suggest movies that are likely to attract a high rating). Prediction is easier to assess quantitatively, but recommendation is a more natural fit to the movie domain. We present results evaluating both these behaviours.
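The k-Nearest-Neighbour idea described above can be illustrated with a minimal sketch. It is not the implementation used in this paper: the function name `knn_predict`, the dictionary-based profiles and the choice to use only positively correlated reviewers are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Profile = Dict[int, float]  # film id -> vote (hypothetical encoding)


def knn_predict(
    similarities: Dict[str, float],
    profiles: Dict[str, Profile],
    film_id: int,
    k: int,
) -> Optional[float]:
    """Predict a vote for film_id from the k most similar reviewers who rated it."""
    # Only reviewers who voted on the film can contribute. Restricting to
    # positive similarity is an assumption of this sketch.
    candidates: List[Tuple[float, str]] = [
        (sim, name)
        for name, sim in similarities.items()
        if film_id in profiles[name] and sim > 0
    ]
    # Keep the k reviewers with the highest similarity.
    neighbours = sorted(candidates, reverse=True)[:k]
    total_weight = sum(sim for sim, _ in neighbours)
    if total_weight == 0:
        return None  # no usable neighbour for this film
    # Similarity-weighted average of the neighbours' votes.
    weighted = sum(sim * profiles[name][film_id] for sim, name in neighbours)
    return weighted / total_weight
```

A recommendation list could be derived from the same function by predicting every unseen film and sorting the results, which is one way to see why prediction is the easier of the two behaviours to assess.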
1.3 USING AN AIS FOR CF

To us, the attraction of the immune system is this: if an adaptive pool of antibodies can produce 'intelligent' behaviour, can we harness the power of this computation to tackle the problem of preference matching and recommendation? Thus, in the first instance we intend to build a model where known user preferences are our pool of antibodies and the new preferences to be matched are the antigen in question.

Our conjecture is that if the concentrations of those antibodies that provide a better match are allowed to increase over time, we should end up with a subset of good matches. However, we are not interested in optimising, i.e. in finding the one best match. Instead, for successful recommendation we require a set of antibodies that are a close match but which are at the same time distinct from each other. This is where we propose to harness the idiotypic effects of binding antibodies to similar antibodies to encourage diversity.

The next section presents more details of our problem and explains the AIS model we intend to use. We then describe the experimental set-up, present and review results and discuss some possibilities for future work.

2 ALGORITHMS

2.1 APPLICATION OF THE AIS TO THE EACHMOVIE TASKS

The EachMovie database [8] is a public database, which records explicit votes of users for movies. It holds 2,811,983 votes taken from 72,916 users on 1,628 films. The task is to use this data to make predictions and recommendations. In the former case, we provide an estimated vote for a previously unseen movie. In the latter case, we present a ranked list of movies that the user might like.

The basic approach of CF is to use information from a neighbourhood to make useful predictions and recommendations. The central task we set ourselves is to identify a suitable neighbourhood.
The SWAMI (Shared Wisdom through the Amalgamation of Many Interpretations) framework [12] is a publicly accessible software package for CF experiments. Its central algorithm is as follows:

    Select a set of test users randomly from the database
    FOR each test user t
        Reserve a vote of this user, i.e. hide from predictor
        From remaining votes create a new training user t’
        Select neighbourhood of k reviewers based on t’
        Use neighbourhood to predict vote
        Compare this with actual vote and collect statistics
    NEXT t

The steps shown in bold in the original (neighbourhood selection and vote prediction) indicate where SWAMI allows an implementation-dependent choice of algorithm. We use an AIS to perform selection and prediction as below.

2.2 ALGORITHM CHOICES

We use the SWAMI data encoding:

$$
\text{User} = \{\{id_1, score_1\}, \{id_2, score_2\}, \ldots, \{id_n, score_n\}\}
$$

Where $id$ corresponds to the unique identifier of the movie being rated and $score$ to this user’s score for that movie. This captures the essential features of the data available.

EachMovie vote data links a person with a movie and assigns a score (taken from the set {0, 0.2, 0.4, 0.6, 0.8, 1.0} where 0 is the worst). User demographic information (e.g. age and gender) is provided but this is not used in our encoding. Content information about movies (e.g. category) is similarly not used.

2.3 SIMILARITY MEASURE

The Pearson measure is used to compare two users $u$ and $v$:

$$
r = \frac{\sum_{i=1}^{n} (u_i - \bar{u})(v_i - \bar{v})}{\sqrt{\sum_{i=1}^{n} (u_i - \bar{u})^2 \sum_{i=1}^{n} (v_i - \bar{v})^2}} \tag{1}
$$

Where $u$ and $v$ are users, $n$ is the number of overlapping votes (i.e. movies for which both $u$ and $v$ have voted), $u_i$ is the vote of user $u$ for movie $i$ and $\bar{u}$ is the average vote of user $u$ over all films (not just the overlapping votes).
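The Pearson measure of equation (1) can be sketched in a few lines. This is a minimal illustration, not the SWAMI or AIS source code: the `Profile` type and the function name `pearson` are hypothetical, and returning 0.0 when the measure is undefined is an assumption of the sketch.

```python
from math import sqrt
from typing import Dict

Profile = Dict[int, float]  # film id -> score in {0, 0.2, ..., 1.0}


def pearson(u: Profile, v: Profile) -> float:
    """Pearson correlation over co-rated films, using each user's mean over all votes."""
    overlap = u.keys() & v.keys()
    if not overlap:
        return 0.0  # assumption: no overlap is reported as zero correlation
    # Means are taken over all films a user rated, not just the overlap.
    u_mean = sum(u.values()) / len(u)
    v_mean = sum(v.values()) / len(v)
    numerator = sum((u[i] - u_mean) * (v[i] - v_mean) for i in overlap)
    u_var = sum((u[i] - u_mean) ** 2 for i in overlap)
    v_var = sum((v[i] - v_mean) ** 2 for i in overlap)
    denominator = sqrt(u_var * v_var)
    if denominator == 0:
        return 0.0  # assumption: zero variance is reported as zero correlation
    return numerator / denominator
```

For example, the profiles `{1: 1.0, 2: 0.0, 3: 0.5}` and `{1: 0.8, 2: 0.2, 4: 0.5}` overlap on films 1 and 2 and agree on their ordering, so the function returns a correlation of 1 (up to floating-point rounding).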
The measure is amended as follows:

$$
r = \begin{cases}
\text{NoOverlapDefault} & \text{if } n = 0 \\
\text{ZeroVarianceDefault} & \text{if } \sum_{i=1}^{n} (u_i - \bar{u})^2 \sum_{i=1}^{n} (v_i - \bar{v})^2 = 0 \\
\dfrac{n}{P}\, r & \text{if } n < P
\end{cases}
\qquad \text{where } P = \text{overlap penalty} \tag{2}
$$

The two default values are required because it is impossible to calculate a Pearson measure in such cases. Both were set to 0. Some experimentation showed that an overlap penalty $P$ was beneficial (this lowers the absolute correlation for users with only a small overlap) but that the exact value was not critical. We chose a value of 100 because this is the maximum overlap expected.

2.4 NEIGHBOURHOOD SELECTION

For a Simple Pearson (SP) predictor, neighbourhood selection means choosing the best k (absolute) correlation scores, where k is the neighbourhood size. Not every potential neighbour will have rated the film to be predicted. Reviewers who did not vote on the film are not added to the neighbourhood.

We have chosen the SP as a benchmark for our AIS recommender because it is the de facto standard for recommender algorithms and also the usual starting point for more complex neighbourhood selection schemes. Furthermore, the AIS recommender is both sufficiently different and of similar complexity to warrant a fair comparison.

For the AIS predictor, a more involved procedure is required:

    Initialise AIS
    Encode user for whom to make predictions as antigen Ag
    WHILE (AIS not stabilised) & (Reviewers available) DO
        Add next user as an antibody Ab
        Calculate matching scores between Ab and Ag
        Calculate matching scores between Ab and other antibodies
        WHILE (AIS at full size) & (AIS not stable) DO
            Iterate AIS
        OD
    OD

Our AIS behaves as follows: At each step (iteration) an antibody’s concentration is increased by an amount dependent on its matching to the antigen and decreased by an amount which depends on its matching to other antibodies.
In the absence of either, an antibody’s concentration will slowly decrease over time. Antibodies with a sufficiently low concentration are removed from the system, whereas antibodies with a high concentration may saturate. An AIS iteration is governed by the following equation, due to Farmer et al [11]:

$$
\frac{dx_i}{dt}
= c\left[\text{(antibodies recognised)} - \text{(I am recognised)} + \text{(antigens recognised)}\right] - \text{(death rate)}
= c\left[\sum_{j=1}^{N} m_{ji} x_i x_j - k_1 \sum_{j=1}^{N} m_{ij} x_i x_j + \sum_{j=1}^{n} m_{ji} x_i y_j\right] - k_2 x_i
$$

Where:
$N$ is the number of antibodies and $n$ is the number of antigens
$x_i$ (or $x_j$) is the concentration of antibody $i$ (or $j$)
$y_j$ is the concentration of antigen $j$
$c$ is a rate constant
$k_1$ is a suppressive effect and $k_2$ is the death rate
$m_{ji}$ is the matching function between antibody $i$ and antibody (or antigen) $j$

As can be seen from the above equation, the nature of an idiotypic interaction can be either positive or negative. Moreover, if the matching function is symmetric, then the balance between “I am recognised” and “Antibodies recognised” (parameters $c$ and $k_1$ in the equation) wholly determines whether the idiotypic effect is positive or negative, and we can simplify the equation. We can simplify the equation still further if we only allow one antigen in the AIS. The simplified equation looks like this:

$$
\frac{dx_i}{dt} = k_1 m_i x_i y - \frac{k_2}{n} \sum_{j=1}^{n} m_{ij} x_i x_j - k_3 x_i \tag{3}
$$

Where:
$k_1$ is stimulation, $k_2$ suppression and $k_3$ death rate
$m_i$ is the correlation between antibody $i$ and the (sole) antigen
$x_i$ (or $x_j$) is the concentration of antibody $i$ (or $j$)
$y$ is the concentration of the (sole) antigen
$m_{ij}$ is the correlation between antibodies $i$ and $j$
$n$ is the number of antibodies

In the new equation, the first term is simplified as we only have one antigen, and the suppression term is normalised to allow a ‘like for like’ comparison between the different rate constants.
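One iteration of the simplified equation can be sketched as a single explicit Euler step. This is an illustrative reading of the equation, not the authors' Java code. The step size `dt`, the saturation level `x_max`, the exclusion of an antibody's match with itself, and normalising by the current number of antibodies are all assumptions of the sketch, and the removal of low-concentration antibodies is left out.

```python
from typing import List


def iterate_ais(
    x: List[float],                 # antibody concentrations x_i
    m_antigen: List[float],         # m_i: match between antibody i and the antigen
    m_antibody: List[List[float]],  # m_ij: match between antibodies i and j
    y: float,                       # concentration of the sole antigen
    k1: float,                      # stimulation rate
    k2: float,                      # suppression rate
    k3: float,                      # death rate
    dt: float = 1.0,                # assumed Euler step size
    x_max: float = 100.0,           # assumed saturation level
) -> List[float]:
    """Return the concentrations after one Euler step of the simplified equation."""
    n = len(x)
    updated = []
    for i in range(n):
        stimulation = k1 * m_antigen[i] * x[i] * y
        # Suppression by the other antibodies, normalised by the pool size.
        # Skipping j == i is an assumption of this sketch.
        suppression = (k2 / n) * sum(
            m_antibody[i][j] * x[i] * x[j] for j in range(n) if j != i
        )
        death = k3 * x[i]
        new_x = x[i] + dt * (stimulation - suppression - death)
        # Clamp to the allowed concentration range (saturation at x_max).
        updated.append(min(max(new_x, 0.0), x_max))
    return updated
```

With two antibodies at concentration 10, antigen matches 0.8 and 0.2, a mutual match of 0.5, `y = 1`, `k1 = 0.3`, `k2 = 0.2` and `k3 = 0.1`, one step moves the concentrations to roughly 6.4 and 4.6: the antibody that matches the antigen better loses less, while the shared suppression term penalises both equally.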
$k_1$ and $k_2$ were varied as described in section 3. $k_3$ was fixed at 0.1, while the concentration range was set at 0-100 (initially 10). We fixed $n$ at 100. The matching function is the absolute value of the Pearson correlation measure. This allows us to have both positively and negatively correlated users in our neighbourhood, which increases the pool of neighbours available to us.

The AIS is considered stable after iterating for ten iterations without changing in size. Stabilisation thus means that a sufficient number of ‘good’ neighbours have been identified and therefore a prediction can be made. ‘Poor’ neighbours would be expected to drop out of the AIS after a few iterations.

Once the AIS has stabilised using the above algorithm, we use the antibody concentration to weigh the neighbours. However, early experiments showed that the most recently added antibodies were at a disadvantage compared to earlier antibodies. This is because they have had no time to mature (i.e. increase in concentration). Likewise, the earliest antibodies had saturated. To overcome this, we reset the concentrations and allow a limited run of the AIS to differentiate the concentrations:

    Reset AIS (set all antibodies to initial concentrations)
    WHILE (No antibody at maximum concentration) DO
        Iterate AIS
    OD

2.5 PREDICTION

We predict a rating $p_i$ by using a weighted average over $N$, the neighbourhood of $u$, which was taken as the entire AIS.

$$
p_i = \bar{u} + \frac{\sum_{v \in N} w_{uv} (v_i - \bar{v})}{\sum_{v \in N} w_{uv}}, \qquad w_{uv} = r_{uv}\, x_v \quad \text{(NB relative, not absolute)} \tag{4}
$$

Where $w_{uv}$ is the weight between users $u$ and $v$, $r_{uv}$ is the correlation score between $u$ and $v$, and $x_v$ is the concentration of the antibody corresponding to user $v$.
2.6 EVALUATION

• Prediction Accuracy: We take the mean absolute error, where $n_p$ is the number of predictions:

$$
MAE = \frac{\sum \left| \text{actual} - \text{predicted} \right|}{n_p} \tag{5}
$$

• Mean number of recommendations: This is the total number of unique films rated by the neighbours.

• Mean overlap size: This is the number of recommendations that the user has also seen.

• Mean accuracy of recommendations: Each overlapped film has an actual vote (from the antigen) and a predicted vote (from the neighbours). The overlapped films were ranked on both actual and predicted vote, breaking ties by movie ID. The two ranked lists were compared using Kendall’s Tau τ. This measure reflects the level of concordance in the lists by counting the number of discordant pairs. To do this we order the films by vote and apply the following formulae:

$$
\tau = 1 - \frac{4 N_D}{n(n-1)}, \qquad
N_D = \sum_{i=1}^{n} \sum_{j=i+1}^{n} D(r_i, r_j), \qquad
D(r_i, r_j) = \begin{cases} 1 & \text{if } r_i > r_j \\ 0 & \text{otherwise} \end{cases} \tag{6}
$$

Where $n$ is the overlap size and $r_i$ is the rank of film $i$ as recommended by the neighbourhood. Note that $i$ here refers to the antigen rank of the film, not the film ID. $N_D$ is the number of discordant pairs, or, equivalently, the expected cost of a bubble sort to reconcile the two lists. $D$ is set to one if the rankings are discordant.

• Mean number of reviewers: This is the number of reviewers looked at before the AIS stabilised.

• Mean number of neighbours: This is the final number of neighbours in the stabilised AIS.

3 EXPERIMENTS

Experiments were carried out on a Pentium 700 with 256MB RAM, running Windows 2000. The AIS was coded in Java™ JDK 1.3. Each run involved looking at up to 15,000 reviewers (a random sample of up to 20% of the EachMovie data set) to provide predictions and recommendations for 100 users. Averaged statistics are then taken for each run. Runtimes ranged from 5 to 60 minutes, largely dependent on the number of reviewers.
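The two accuracy measures of section 2.6 can be sketched as below. The function names are hypothetical, and the input convention for `kendall_tau` (a list of neighbourhood ranks already ordered by the antigen's own ranking) is an assumption chosen to mirror the definition of $r_i$.

```python
from typing import Sequence


def mean_absolute_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute difference between actual and predicted votes."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("need two non-empty sequences of equal length")
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def kendall_tau(neighbour_ranks: Sequence[int]) -> float:
    """Kendall's Tau from discordant pairs.

    neighbour_ranks[i] is the rank the neighbourhood gives to the film that
    the test user (antigen) ranked in position i.
    """
    n = len(neighbour_ranks)
    if n < 2:
        raise ValueError("need at least two overlapping films")
    # A pair is discordant when the neighbourhood reverses the antigen's order.
    discordant = sum(
        1
        for i in range(n)
        for j in range(i + 1, n)
        if neighbour_ranks[i] > neighbour_ranks[j]
    )
    return 1.0 - 4.0 * discordant / (n * (n - 1))
```

Identical rankings give τ = 1 and fully reversed rankings give τ = -1. Since there are n(n-1)/2 pairs in total, τ equals one minus twice the fraction of discordant pairs, so the fraction of correctly ordered pairs is (1 + τ)/2.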
3.1 EXPERIMENTS ON SIMPLE AIS

Initial experiments concentrated on a simple AIS, with no idiotypic effects. The goal was to find a good stimulation rate, but also to ensure that the ‘baseline’ system operates similarly to a SP predictor. Therefore, we set the suppression rate to zero, and varied only the stimulation rate, i.e. the weighting given to antigen binding. Other parameters had been fixed by preliminary experiments to values that worked well with both AIS and SP.

[Figure 1, two panels: "Effect of stimulation on neighbourhood size" (neighbourhood size against stimulation rate) and "Effect of stimulation on number of users looked at" (number of users looked at against stimulation rate).]

Figure 1: Effect of stimulation rate on neighbourhood and reviewers.

The graphs show averaged results over five runs at each stimulation rate. The bars show standard deviations. In order to have a fair comparison, the SP parameters (neighbourhood and number of reviewers looked at) match the AIS values for each rate. In figure 2, we show the prediction error, number of recommendations, number of overlaps and recommendation accuracy for each algorithm. Note that low prediction error values are better, whereas for the other measures we are looking for high values.

[Figure 2, four panels, each plotting AIS (av) and SP (av) against stimulation rate: "Effect of stimulation on prediction error" (mean absolute error), "Effect of stimulation on recommendation accuracy" (Kendall's Tau), "Effect of stimulation on number of recommendations" and "Effect of stimulation on number of overlaps".]

Figure 2: Effect of stimulation rate on prediction and recommendation.

It can be seen that the simple AIS gives broadly similar prediction performance to the SP. The Mean Absolute Error (MAE) measurements from different runs are not normally distributed, so a non-parametric statistic is appropriate. We performed a Wilcoxon analysis, which showed no significant difference between prediction errors of SP and AIS (at a 95% confidence level). In addition, the choice of an appropriate stimulation rate did make a significant difference (comparing a rate of 0.2 with 0.02 at the 95% level).

For recommendation, the AIS performs better than the SP at stimulation rates above 0.1. Again, we performed a Wilcoxon analysis at the 95% level to assess significance. We excluded cases where a recommendation score was unavailable (due to an insufficient number of overlaps). The number of recommendations and overlaps show similar trends, though the AIS gives a more constant value. Again, some stimulation was beneficial.

In later experiments, the stimulation rate was fixed at one of the better values (0.2, 0.3 or 0.5), in order to give us a good base to work on. These values give us generally good performance, while keeping a good neighbourhood size and still evaluating a reasonable number of reviewers.
3.2 EXPERIMENTS ON THE IDIOTYPIC AIS

Having fixed all the simple parameters, we tested the effect of suppression for stimulation rates of 0.2, 0.3 and 0.5. Not surprisingly, we found that suppression changed the number of reviewers looked at and the number of neighbours (figure 3):

[Figure 3, two panels, each with one curve per stimulation rate (0.2, 0.3, 0.5): "Effect of suppression on neighbourhood size" (neighbourhood size against suppression rate) and "Effect of suppression on number of reviewers looked at" (number of reviewers against suppression rate).]

Figure 3: Effect of suppression rate on neighbourhood size and reviewers.

We then tested the effect of suppression on the AIS performance. Here we fixed the baseline rate at stimulation only (no suppression), and took measurements relative to this baseline (Figure 4). Again, it should be noted that the first graph shows prediction error (hence, a good result is low).

[Figure 4, four panels, each plotted against suppression rate, relative to baseline, with one curve per stimulation rate (0.2, 0.3, 0.5): "Effect of suppression on prediction error" (mean absolute error), "Effect of suppression on recommendation accuracy" (Kendall), "Effect of suppression on number of overlaps" and "Effect of suppression on number of recommendations".]

Figure 4: Effect of suppression rate on prediction and recommendation.

Again, the graphs show averaged results over five runs at each suppression rate. The bars show standard deviations (similar size bars for rates 0.2 and 0.5 have been omitted in the interests of clarity). At low levels of suppression, prediction accuracy is not significantly affected. However, recommendation accuracy is improved significantly (95% Wilcoxon). For instance, for 0.3 stimulation, suppression rates from 0.05 to 0.2 gave a significantly improved performance. In actual terms, the Kendall measure rises from 0.5 to nearly 0.6. Since the fraction of correctly ranked pairs is (1 + τ)/2, this means that the chance of any randomly sampled pair being correctly ranked has risen from 75% to 80%. Too much suppression had a detrimental effect on all measures.

4 IDIOTYPIC ANALYSIS

4.1 INTRODUCTION

It has previously been shown that a recommender based on immune system idiotypic principles can outperform one based on correlation alone. However, so far we have not explored the mechanisms of that beneficial effect. Such an exploration would seem worthwhile, particularly if this results in identifying the underlying causes of the improvements of the ‘characteristics’ of a community (either by changing its membership, or by evaluating the relative merit of each member). Such an effect will be generally useful in a range of applications, of which recommender systems provide just one example. In addition, a deeper understanding of the idiotypic effect may prove useful to the designers of other AIS applications.

4.2 ANALYSIS OF EFFECTS

To compare the two predictors regarding their neighbourhood composition, a test user is taken from a database, and then predictions and recommendations are made for that user.
Both predictors work by finding a neighbourhood and using that neighbourhood to produce predictions and recommendations. Although both the AIS and SP recommender algorithms are based on Pearson correlations, they act differently for a number of reasons:

• The choice of neighbours is different. In the SP, the 100 highest correlated users (or all users that show any correlation, if there are fewer than 100) are chosen to form a neighbourhood. In the AIS, this general rule is followed, except that stimulation adds a threshold and the idiotypic effect adds diversity.

• Even given the same neighbours, the weighting is different. In the SP, the neighbour weight is the correlation between that neighbour and the test user. In the AIS, this correlation is multiplied by that antibody’s (neighbour’s) concentration, which in turn is determined by running the AIS algorithm over the neighbourhood.

To deal with the first point, the stimulation rate provides some fixed threshold for the correlation of any antibody with the antigen. Even in the absence of any idiotypic interactions, an antibody’s correlation (weighted by the stimulation rate) must outweigh the death rate; otherwise, it will not survive in the AIS. So, at low stimulation rates it may prove difficult to fill the AIS completely. Conversely, at very high stimulation rates it may not be necessary to examine all the supplied users in order to fill an AIS. This effect can be seen in Figure 1. Such a thresholding effect has been shown to be beneficial by Gokhale [13] in maintaining the quality of a neighbourhood by filtering out poorly correlated users (the SP will consider all reviewers who have at least one vote in common with the test user). Thus, the idiotypic effect should be viewed in the context of providing further refinement to a neighbourhood that is already known to be in some sense ‘good’.
Since the effect (in our model) is always negative, its impact may be to improve diversity by removing ‘suboptimal’ users from the AIS.

Conversely, it might be that the idiotypic effect is effective because, given a neighbourhood, it changes the weight of each neighbour (or concentration of each antibody) in that neighbourhood. This is the second point highlighted above.

In order to test out these hypotheses, we took a sample result, based on 100 predictions, for detailed analysis. The three settings for each algorithm were as detailed in section 2, except that default votes were not used. Thus, if a neighbour has not seen a film then that neighbour is ignored when making a prediction for that film. The AIS parameters were set to ‘good’ values (as observed previously). Thus stimulation rate was set to 0.3 and suppression rate to 0.2. As reported previously, the prediction performance (mean absolute error) was not significantly different between the two algorithms, but recommendation (Kendall’s Tau) was significantly better for the AIS recommender (as before, a Wilcoxon matched pairs signed rank test was used to assess significance).

[Figure 5, stacked bar chart: "Comparison of neighbourhoods for AIS and SP predictors"; neighbourhood size by predictor type (AIS, SP), split into unique and common neighbours.]

Figure 5: Comparison of AIS and SP neighbourhoods. The total size of each bar represents the total size of the neighbourhoods produced by each predictor (averaged over 100 predictions; bar shows standard deviation). The lower part of each bar shows the average number of common neighbours (i.e. appearing in both neighbourhoods). The remainder of the bar is composed of unique neighbours, that is, neighbours who appeared in one neighbourhood but not the other.

The first thing to observe is that the neighbourhoods produced by each algorithm are different.
As implied above, SP tended to produce large neighbourhoods (average 95.4 as opposed to 73.8 using the AIS) and Figure 5 shows that the composition of these neighbourhoods is different. In particular, it does not seem that the AIS neighbourhoods are merely subsets of the SP neighbourhoods. In fact, the vast majority of neighbours are ‘unique’, that is, chosen by one algorithm but not the other.

Is it the neighbourhoods that make the difference to prediction and recommendation performance? Figure 6 shows AIS and SP performance on both neighbourhoods. For this experiment, we recorded the neighbourhoods found by both the AIS and SP algorithms. We then reran the predictions, with everything the same except that this time we forced the AIS and SP algorithms to use our ‘fixed’ neighbourhoods. We can see that for prediction, changing the neighbourhood (or indeed algorithm) did not seem to make any significant difference (Table 1 has the details of the statistical tests). However, for recommendation, although the means are very similar (Figure 6), the AIS neighbourhood usually produced better recommendations than the SP neighbourhood (Table 1b). In fact, the neighbourhood effect seems to dominate, since given the AIS neighbourhood, the SP algorithm appears to do significantly better than the AIS algorithm for recommendation. There is one exception to this, where the AIS algorithm does not do significantly better for either neighbourhood. In addition, the AIS algorithm does better on the SP neighbourhood than the SP algorithm, indicating that the neighbour weightings, as well as the neighbours themselves, also contribute to the recommendation quality.
Results for predictions:

| 1st Predictor | 1st NH | 2nd Predictor | 2nd NH | Median 1 | Median 2 | Number of comparisons | 1st better | 2nd better | Significance (upper bound) |
|---|---|---|---|---|---|---|---|---|---|
| SP | SP | AIS | SP | 0.682 | 0.697 | 97 | 2212 | 2541 | 0.5551 |
| SP | SP | SP | AIS | 0.682 | 0.658 | 97 | 2163 | 2590 | 0.4434 |
| SP | SP | AIS | AIS | 0.682 | 0.652 | 97 | 2176 | 2577 | 0.4717 |
| AIS | SP | SP | AIS | 0.697 | 0.658 | 97 | 2256 | 2497 | 0.6659 |
| AIS | SP | AIS | AIS | 0.697 | 0.652 | 97 | 2258 | 2495 | 0.6711 |
| SP | AIS | AIS | AIS | 0.658 | 0.652 | 84 | 1706 | 1864 | 0.7263 |

Results for recommendations:

| 1st Predictor | 1st NH | 2nd Predictor | 2nd NH | Median 1 | Median 2 | Number of comparisons | 1st better | 2nd better | Significance (upper bound) |
|---|---|---|---|---|---|---|---|---|---|
| SP | SP | AIS | SP | 0.525 | 0.557 | 83 | 801 | 2685 | 1.917e-05 |
| SP | SP | SP | AIS | 0.525 | 0.549 | 83 | 707.5 | 2778.5 | 2.617e-06 |
| SP | SP | AIS | AIS | 0.525 | 0.542 | 85 | 930 | 2725 | 8.483e-05 |
| AIS | SP | SP | AIS | 0.557 | 0.549 | 82 | 1218.5 | 2184.5 | 0.02571 |
| AIS | SP | AIS | AIS | 0.557 | 0.542 | 80 | 1426 | 1814 | 0.3534 |
| SP | AIS | AIS | AIS | 0.549 | 0.542 | 78 | 2149 | 932 | 0.002459 |

Table 1: Analysis of differences between neighbourhoods (NH) and algorithms for both prediction and recommendation. In each case, the Wilcoxon significance test was applied to the results obtained from each pair of regimes. Regimes that are significantly better are shown in bold (there were no significant differences found for prediction).

[Figure 6: two bar charts. Left: "Effect of neighbourhood on prediction performance"; y-axis: mean absolute error. Right: "Effect of neighbourhood on recommendation performance"; y-axis: recommendation performance (Kendall's Tau). Bars in each: SP predictor/SP neighbourhood, AIS predictor/SP neighbourhood, SP predictor/AIS neighbourhood, AIS predictor/AIS neighbourhood.]

Figure 6: Effect of neighbourhood composition for AIS and SP algorithms.
The left graph shows prediction performance (measured as mean absolute error averaged over 100 predictions) for each algorithm and each neighbourhood. The right graph shows recommendation performance deviation (measured as Kendall’s Tau averaged over 100 predictions) for each algorithm and each neighbourhood. Bars show standard deviation.

We ran these experiments using default votes (neighbours who had not voted on a film were assumed to give the film a slightly negative rating) and obtained similar results. It is worth pointing out at this stage that these results should not be taken to be exhaustive, merely indicative. Indeed, we would not want to draw any firm conclusions based on only 100 predictions. This point will be returned to later. Nevertheless, the results obtained so far seemed to indicate that it was worth investigating the contribution of neighbourhood composition to recommendation performance.

We looked at a variety of neighbourhood parameters (we might term these community characteristics) across SP and AIS neighbourhoods. Four characteristics are of particular interest, and each will be discussed in turn. Firstly, it might seem reasonable to assume that performance improves with the number of neighbours in a neighbourhood. However, clearly there is a cost in collecting neighbours (of appropriate quality) together, and thus it will be useful if we can provide good quality recommendations from smaller neighbourhoods. Another characteristic is the overlap size, which governs the number of recommendations we can assess (an overlap is a test user vote that is also contained in the union of all neighbours’ votes). Thirdly, we looked at correlation between each neighbour and the test user. A high correlation shows that neighbours are clustered ‘tightly’ around the test user, which we might imagine would provide for better recommendations.
Fourthly, the idiotypic effect is expected to reduce the inter-neighbour correlations. An obvious intuition might be that such a reduction causes an increase in recommendation quality.

Table 2 shows the difference in these community characteristics across SP and AIS neighbourhoods. It can be seen that the AIS does produce neighbourhoods that are measurably different in character to the SP neighbourhoods. In summary, the AIS neighbourhoods are smaller, have less overlap, are generally less correlated with the test user and have lower inter-neighbour correlations. In order to test out which (if any) of these characteristics is crucial, we plotted recommendation performance against each for the AIS algorithm. The results seem to show that none of these characteristics on their own influences the performance in a clear way. Figure 7 shows scatter plots generated for each characteristic against recommendation quality. Trend lines (based on a power law) have been added to emphasise any underlying data trends.

| 1st Predictor | 2nd Predictor | Characteristic tested | Mean 1 | Mean 2 | Unequal neighbourhoods | 1st has higher value | 2nd has higher value | Significance (upper bound) |
|---|---|---|---|---|---|---|---|---|
| SP | AIS | Neighbours | 95.40 | 73.75 | 97 | 4602 | 151 | 1.196e-15 |
| SP | AIS | Overlap | 47.46 | 46.39 | 26 | 334.5 | 16.5 | 5.686e-05 |
| SP | AIS | Correlation | 0.12 | 0.10 | 79 | 2566 | 594 | 1.465e-06 |
| SP | AIS | Neighbour correlation | 0.15 | 0.04 | 83 | 3477 | 9 | 3.572e-15 |

Table 2: Analysis of difference in neighbourhood characteristics between SP and AIS algorithms. Four characteristics are shown. In each case, the Wilcoxon significance test was applied to the neighbourhoods obtained from the algorithms. In all four cases, the value for the SP was significantly higher; this is indicated by bold type.

The first plot suggests that neighbourhood size is not essential in order to obtain high quality recommendations.
The second plot, however, does suggest that small overlap sizes might be beneficial for producing good recommendations (regression analysis has not been performed so at this stage this is merely a suggestion). This in some sense is intuitive, as it might be easier to produce higher quality recommendations if there are fewer of them. However, a balance needs to be struck here; once the overlap size gets too low, the neighbourhood may no longer prove useful to the user. The third plot shows that, perhaps surprisingly, high correlation between neighbours and the test user may not be essential for high quality recommendations. Finally, the fourth plot would seem to indicate that reduced inter-neighbour correlation is not important in recommendation accuracy, or at least if it is responsible, it is part of a wider effect.

[Figure 7: four scatter plots with trend lines; y-axis in each: recommendation accuracy (Kendall's Tau). Panels: "Effect of neighbourhood size on recommendation accuracy" (x-axis: neighbourhood size); "Effect of overlap size on recommendation accuracy" (x-axis: overlap size); "Effect of correlation with test user on recommendation accuracy" (x-axis: adjusted correlation with test user, median); "Effect of inter-neighbour correlation on recommendation accuracy" (x-axis: adjusted inter-neighbour correlation, median).]

Figure 7: Effect of various neighbourhood measures on AIS recommendation performance. In each graph, the measure is shown on the x-axis.
The recommendation performance (where available) for each of 100 AIS predictions is plotted against this neighbourhood measure. Trend lines are added to indicate the underlying data trend (if any).

5 DISCUSSION AND CONCLUSIONS

It is not particularly surprising that the simple AIS performs similarly to the SP predictor. This is because they are, at their core, based around the same algorithm. The stimulation rate (in absence of any idiotypic effect) is effectively setting a threshold for correlation. This has both strengths and weaknesses. It has been shown that a threshold is useful in discarding the potentially misleading predictions of poorly correlated reviewers [13]. On the other hand, a rigid threshold means that one has to ‘prejudge’ the appropriate level to avoid both premature convergence and empty communities. Indeed, detailed examination of the individual runs showed that the AIS had a tendency to fill its neighbourhood either early or not at all. The setting of a threshold also means that sufficiently good antibodies are taken on a first come, first served basis. It is interesting to observe that such a strategy nevertheless seems (in these experiments) to provide a more constant level of overlaps, and better recommendation quality.

The richness of our AIS model comes when we allow interactions between antibodies. Early, qualitative experimentation with the idiotypic network showed antibody concentration rising and falling dynamically as the population varied. By contrast, in the simple AIS, the concentration of an antibody will monotonically increase to saturation, or decrease to elimination, unaffected by the other antibodies. However, there is a delicate balance to be struck between stimulation and suppression. An imbalance may lead to a loss in population size or diversity.
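The difference between the simple AIS and the idiotypic network described above can be illustrated with a toy concentration update. The sketch below is not the paper's update rule (which is defined in an earlier section and not reproduced here). It assumes a simplified form in which each antibody gains concentration in proportion to its match with the test user, loses concentration in proportion to its similarity to other antibodies, and decays at a fixed rate. The function names, the `death_rate` value, the starting concentration, the saturation cap and the example matrices are all hypothetical; only the stimulation rate of 0.3 and suppression rate of 0.2 are taken from the text.

```python
def step(conc, match, similarity, stim_rate=0.3, supp_rate=0.2,
         death_rate=0.1, max_conc=1.0):
    """One update of all antibody concentrations (assumed toy model)."""
    n = len(conc)
    updated = []
    for i in range(n):
        stimulation = stim_rate * match[i] * conc[i]
        # Idiotypic term: similar antibodies suppress each other.
        suppression = (supp_rate / n) * sum(
            similarity[i][j] * conc[i] * conc[j] for j in range(n) if j != i
        )
        death = death_rate * conc[i]
        value = conc[i] + stimulation - suppression - death
        updated.append(min(max(value, 0.0), max_conc))  # clamp to [0, max_conc]
    return updated


def simulate(match, similarity, steps=30, start=0.1, **rates):
    """Return the concentration history, one list of values per time step."""
    conc = [start] * len(match)
    history = [conc]
    for _ in range(steps):
        conc = step(conc, match, similarity, **rates)
        history.append(conc)
    return history


# Invented example: two similar, well-matched antibodies and one poor match.
match = [0.9, 0.8, 0.1]
similarity = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.1], [0.1, 0.1, 1.0]]
simple = simulate(match, similarity, supp_rate=0.0)      # no idiotypic effect
idiotypic = simulate(match, similarity, supp_rate=0.2)   # with suppression
```

With `supp_rate=0.0` each concentration is multiplied by a constant factor at every step, so it rises steadily to the cap or decays towards zero independently of the other antibodies. With suppression switched on, the growth of each antibody also depends on the current concentrations of the antibodies it resembles, so in this toy model its concentration never exceeds the value it would reach without suppression.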
The graphs show that a small amount of suppression may indeed be beneficial to AIS performance, in particular recommendation. It is interesting to note that the increase in recommendation quality occurs with a relatively constant overlap size. At too high levels of suppression, it is harder to fill the neighbourhood, with consequent lack of diversity and hence recommendation accuracy.

We believe that these initial results show two things. Firstly, population effects can be beneficial for CF algorithms, particularly for recommendation; secondly, that CF is a promising new application area for AISs. In fact, we can widen the context, since the process of neighbourhood selection described in this paper can easily be generalized to the task of ad-hoc community formation.

As mentioned previously, it is not claimed that these results are conclusive. Indeed, much more data is required before any firm conclusions can be drawn. In this respect, this paper is very much a work in progress. Nevertheless, the results to date certainly are indicative, and challenge certain assumptions. It is hoped that the presentation of these results will stimulate discussion and interest in the nature of the idiotypic effect.

It does not seem likely that the idiotypic effect can be captured by one particular measurement as it is likely to be a combination of factors. For example, we have shown that both the neighbourhood choice and the weighting of neighbours within that neighbourhood can influence the recommendation performance. There are further community characteristics that could be explored. Some (for example, number of recommendations, overlaps per neighbour, absolute correlation scores) have been examined and shown to be inconclusive. Some (for example, number of neighbours voting on each film) remain potential future subjects for investigation. Other tests (e.g.
setting each neighbour’s concentration to a random number for immune system predictions, to see whether accurate concentrations are really necessary) might shed further light on the relative importance of each measure.

There are wider implications for such work. The database used for this study [8] is based on real people’s profiles. Thus, any headway made into improving neighbourhoods by the idiotypic effect can have real benefit for other recommenders – and indeed any community based application. Current research is under way to extend this work to predict websites of interest based on users’ bookmarks [20].

References

[1] Aggarwal C and Yu P, On Text Mining Techniques for Personalization, Lecture Notes in Artificial Intelligence, vol. 1711, pp. 12-18, 1999.
[2] Amazon.com Recommendations (http://www.amazon.com/).
[3] Billsus D and Pazzani MJ, "Learning Collaborative Information Filters," Proceedings of the Fifteenth International Conference on Machine Learning, pp. 46-54, 1998.
[4] Breese JS, Heckerman D and Kadie C, Empirical Analysis of predictive algorithms for CF, Proceedings of the 14th Conference on Uncertainty in Reasoning, pp. 43-52, 1998.
[5] Cayzer S and Aickelin U, "A Recommender System based on the Immune Network", in Proceedings CEC2002, pp 807-813, Honolulu, USA.
[6] Cayzer S and Aickelin U, "An AIS Based Recommender", Research Report External HP Lab, HP Labs Bristol, UK.
[7] Cayzer S and Aickelin U, "On the Effects of Idiotypic Interactions for Recommendation Communities in AISs", in Proceedings of the 1st International Conference on AISs (ICARIS-2002), pp 154-160, Canterbury, UK.
[8] Compaq Systems Research Centre. Eachmovie CF data set, http://www.research.compaq.com/SRC/eachmovie/.
[9] Delgado J, Ishii N and Tomoki U. Content-based Collaborative Information Filtering: Actively Learning to Classify and Recommend Documents.
In: Cooperative Information Agents II. Learning, Mobility and Electronic Commerce for Information Discovery on the Internet, ed. M. Klusch, G. W. E. Springer-Verlag, 1998.
[10] Delgado J and Ishii, Multi-agent Learning in Recommender Systems For Information Filtering on the Internet, Journal of Co-operative Information Systems, vol. 10, pp. 81-100, 2001.
[11] Farmer JD, Packard NH and Perelson AS, The immune system, adaptation, and machine learning, Physica, vol. 22, pp. 187-204, 1986.
[12] Fisher D, Hildrum K, Hong J, Newman M and Vuduc R, SWAMI: a framework for CF algorithm development and evaluation, 1999. http://guir.berkeley.edu/projects/swami/.
[13] Gokhale A, Improvements to CF Algorithms, 1999. Worcester Polytechnic Institute. http://www.cs.wpi.edu/~claypool/ms/cf-improve/.
[14] Goldsby R, Kindt T, Osborne B, Kuby Immunology, Fourth Edition, W H Freeman, 2000.
[15] Hightower RR, Forrest S and Perelson AS. "The evolution of emergent organization in immune system gene libraries," Proceedings of the 6th International Conference on Genetic Algorithms, pp. 344-350, 1995.
[16] Hunt J, King C and Cooke D, Immunizing against fraud, IEEE Colloquium on Knowledge Discovery and Data Mining, vol. 4, pp. 1-4, 1996.
[17] Jerne NK, Towards a network theory of the immune system, Annals of Immunology, vol. 125, no. C, pp. 373-389, 1973.
[18] Kirkwood E and Lewis C. Understanding Medical Immunology, John Wiley & Sons, Chichester, 1989.
[19] Kuby J (2002), Immunology, Fifth Edition by Richard A. Goldsby, Thomas J. Kindt, Barbara A. Osborne, W H Freeman.
[20] Morrison T and Aickelin U: An Artificial Immune System as a Recommender System for Web Sites, in Proceedings of the 1st International Conference on ARtificial Immune Systems (ICARIS-2002), pp 161-169, Canterbury, UK, http://www.bookmark.ac
[21] Perelson AS and Weisbuch G, Immunology for physicists, Reviews of Modern Physics, vol.
69, pp. 1219-1267, 1997.
[22] Resnick P and Varian HR, Recommender systems, Communications of the ACM, vol. 40, pp. 56-58, 1997.