Logic Learning in Hopfield Networks



Saratha Sathasivam (Corresponding author)
School of Mathematical Sciences, University of Science Malaysia, Penang, Malaysia
E-mail: saratha@cs.usm.my

Wan Ahmad Tajuddin Wan Abdullah
Department of Physics, Universiti Malaya, 50603 Kuala Lumpur, Malaysia
E-mail: wat@um.edu.my

The research is partly financed by an FRGS grant from the Ministry of Higher Education, Malaysia.

Abstract

Synaptic weights for neurons in logic programming can be calculated either by using Hebbian learning or by Wan Abdullah's method. In other words, Hebbian learning for governing events corresponding to some respective program clauses is equivalent to learning using Wan Abdullah's method for the same program clauses. In this paper we evaluate the equivalence between these two types of learning experimentally, through computer simulations.

Keywords: logic programming, Hebbian learning, Wan Abdullah's method, program clauses

1. Introduction

Recurrent neural networks are essentially dynamical systems that feed back signals to themselves. Popularized by John Hopfield, these models possess a rich class of dynamics characterized by the existence of several stable states, each with its own basin of attraction. The (Little-)Hopfield neural network [Little (1974), Hopfield (1982)] minimizes a Lyapunov function, also known as the energy function because of its obvious similarities with a physical spin network. It is therefore useful as a content-addressable memory or as an analog computer for solving combinatorial optimization problems, since it always evolves in the direction of lower network energy. This implies that if a combinatorial optimization problem can be formulated as minimizing the network energy, the network can be used to find optimal (or suboptimal) solutions by letting it evolve freely.
Wan Abdullah (1991, 1992) and Pinkas (1991) independently defined bi-directional mappings between propositional logic formulas and energy functions of symmetric neural networks. Both methods can be used to determine whether the solutions obtained are models for a corresponding logic program. Subsequently, Wan Abdullah (1991, 1993) showed how Hebbian learning in an environment with some underlying logical rules governing events is equivalent to hardwiring the network with these rules. In this paper, we carry out computer simulations to support this experimentally.

This paper is organized as follows. In Section 2, we outline how logic programming is done on a Hopfield network, and in Section 3 Hebbian learning of logical clauses is described. In Section 4, we describe the proposed approach for comparing connection strengths obtained by Wan Abdullah's method with those obtained by Hebbian learning. Section 5 discusses the results obtained from the computer simulations. Finally, concluding remarks occupy the last section.

2. Logic Programming on a Hopfield Network

In order to keep this paper self-contained, we briefly review the Hopfield model (extensive treatments can be found elsewhere [Geszti (1990), Haykin (1994)]) and how logic programming can be carried out on such an architecture.

The Hopfield model is a standard model for associative memory. The Hopfield dynamics is asynchronous, with each neuron updating its state deterministically. The system consists of $N$ formal neurons, each of which can be described by an Ising variable $S_i(t)$, $i = 1, 2, \ldots, N$. Neurons are thus bipolar, $S_i \in \{-1, 1\}$, obeying the dynamics $S_i \to \mathrm{sgn}(h_i)$, where the field

$$h_i = \sum_j J_{ij}^{(2)} S_j + J_i^{(1)},$$

with $i$ and $j$ running over all $N$ neurons, $J_{ij}^{(2)}$ is the synaptic or connection strength from neuron $j$ to neuron $i$, and $-J_i^{(1)}$ is the threshold of neuron $i$.
Restricting the connections to be symmetric and zero-diagonal, $J_{ij}^{(2)} = J_{ji}^{(2)}$, $J_{ii}^{(2)} = 0$, allows one to write a Lyapunov or energy function,

$$E = -\frac{1}{2} \sum_i \sum_j J_{ij}^{(2)} S_i S_j - \sum_i J_i^{(1)} S_i \qquad (1)$$

which decreases monotonically with the dynamics. The two-connection model can be generalized to include higher-order connections. This modifies the "field" into

$$h_i = \cdots + \sum_j \sum_k J_{ijk}^{(3)} S_j S_k + \sum_j J_{ij}^{(2)} S_j + J_i^{(1)} \qquad (2)$$

where "$\cdots$" denotes still higher orders, and an energy function can be written as follows:

$$E = \cdots - \frac{1}{3} \sum_i \sum_j \sum_k J_{ijk}^{(3)} S_i S_j S_k - \frac{1}{2} \sum_i \sum_j J_{ij}^{(2)} S_i S_j - \sum_i J_i^{(1)} S_i \qquad (3)$$

provided that $J_{ijk}^{(3)} = J_{[ijk]}^{(3)}$ for $i$, $j$, $k$ distinct, with $[\ldots]$ denoting permutations in cyclic order, and $J_{ijk}^{(3)} = 0$ for any $i$, $j$, $k$ equal, and that similar symmetry requirements are satisfied for higher-order connections. The updating rule maintains

$$S_i(t+1) = \mathrm{sgn}[h_i(t)] \qquad (4)$$

In logic programming, a set of Horn clauses, which are logic clauses of the form

$$A \leftarrow B_1, B_2, \ldots, B_n$$

where the arrow may be read "if" and the commas "and", is given, and the aim is to find the set(s) of interpretations (i.e., truth values for the atoms in the clauses) which satisfy the clauses (i.e., which make all the clauses true). In other words, we want to find 'models' corresponding to the given logic program. In principle, logic programming can be seen as a problem in combinatorial optimization, which may therefore be carried out on a Hopfield neural network. This is done by using the neurons to store the truth values of the atoms and writing a cost function which is minimized when all the clauses are satisfied. As an example, consider the following logic program:

A ← B, C.
D ← B.
C ←.

whose three clauses translate respectively as $A \lor \lnot B \lor \lnot C$, $D \lor \lnot B$ and $C$.
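The asynchronous relaxation reviewed above, equations (1) and (4), can be sketched in a few lines of Python. This is a minimal illustration with small random second-order weights of our own choosing, not the networks used in the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
N = 8

# Illustrative random symmetric, zero-diagonal second-order connections J2 and
# biases J1; the field of eq. (2) truncates here to h_i = sum_j J2[i,j] S_j + J1[i].
J2 = rng.normal(size=(N, N))
J2 = (J2 + J2.T) / 2.0
np.fill_diagonal(J2, 0.0)
J1 = rng.normal(size=N)

def energy(S):
    # Eq. (1): E = -1/2 sum_ij J2_ij S_i S_j - sum_i J1_i S_i
    return -0.5 * S @ J2 @ S - J1 @ S

def field(S, i):
    return J2[i] @ S + J1[i]

S = rng.choice([-1, 1], size=N)
E_prev = energy(S)
changed = True
while changed:  # asynchronous deterministic updates, eq. (4)
    changed = False
    for i in range(N):
        s_new = 1 if field(S, i) >= 0 else -1
        if s_new != S[i]:
            S[i] = s_new
            E = energy(S)
            assert E <= E_prev + 1e-12  # Lyapunov property: E never increases
            E_prev = E
            changed = True
# On exit, S is a stable state: no single-neuron update changes it.
```

Because the connections are symmetric with zero diagonal, every flip lowers the energy, so the loop always settles into a stable state, a local minimum of $E$.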
The underlying task of the program is to look for interpretations of the atoms, in this case A, B, C and D, which make up a model for the given logic program. This can be seen as a combinatorial optimization problem where the "inconsistency"

$$E_P = \frac{1}{2}(1 - S_A)\,\frac{1}{2}(1 + S_B)\,\frac{1}{2}(1 + S_C) + \frac{1}{2}(1 - S_D)\,\frac{1}{2}(1 + S_B) + \frac{1}{2}(1 - S_C) \qquad (5)$$

where $S_A$, etc. represent the truth values (true as 1) of A, etc., is chosen as the cost function to be minimized, as was done by Wan Abdullah. We can observe that the minimum value of $E_P$ is 0; otherwise its value is proportional to the number of unsatisfied clauses. The cost function (5), when programmed onto a third-order neural network, yields the synaptic strengths given in Table 1. We refer to this method of doing logic programming in neural networks as Wan Abdullah's method.

3. Hebbian Learning of Logical Clauses

The Hebbian learning rule for a two-neuron synaptic connection can be written as

$$\Delta J_{ij}^{(2)} = \lambda_2 S_i S_j \qquad (6)$$

where $\lambda_2$ is a learning rate. For connections of other orders $n$, between $n$ neurons $\{S_i, S_j, \ldots, S_m\}$, we can generalize this to

$$\Delta J_{ij \ldots m}^{(n)} = \lambda_n S_i S_j \cdots S_m \qquad (7)$$

This gives the changes in synaptic strengths depending on the activities of the neurons. In an environment where selective events occur, Hebbian learning will reflect the occurrences of the events. So, if the frequency of the events is dictated by some underlying logical rule, the logic should become entrenched in the synaptic weights. Wan Abdullah (1991, 1993) has shown that Hebbian learning as above corresponds to hardwiring the neural network with the synaptic strengths obtained using Wan Abdullah's method, provided that

$$\lambda_n = \frac{1}{(n-1)!} \qquad (8)$$

We do not provide a detailed analysis of Hebbian learning of logical clauses in this paper, but instead refer the interested reader to Wan Abdullah's papers.
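A generic implementation of the order-$n$ Hebbian rule (7) with the learning rates (8) might look as follows. This is a sketch under our own naming (`hebbian_update` is a hypothetical helper), and the events fed to it would come from whatever environment the network observes:

```python
from itertools import combinations
from math import factorial
from fractions import Fraction

def hebbian_update(weights, state, orders=(1, 2, 3)):
    """Apply eq. (7), Delta J = lambda_n * S_i S_j ... S_m, to every subset of
    neurons of each requested order, with lambda_n = 1 / (n-1)! as in eq. (8).
    `state` maps neuron names to +1/-1; `weights` maps frozensets to strengths."""
    for n in orders:
        lam = Fraction(1, factorial(n - 1))
        for subset in combinations(sorted(state), n):
            prod = 1
            for neuron in subset:
                prod *= state[neuron]
            key = frozenset(subset)
            weights[key] = weights.get(key, Fraction(0)) + lam * prod
    return weights

# One learning step on a single event (activity pattern):
w = hebbian_update({}, {"A": 1, "B": 1, "C": -1})
print(w[frozenset({"A", "B", "C"})])  # lambda_3 * (1)(1)(-1) = -1/2
```

Repeated presentation of events then accumulates these increments, which is how the frequency of rule-governed events becomes entrenched in the weights.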
4. Comparing Connection Strengths Obtained by Hebbian Learning with Those by Wan Abdullah's Method

In the previous sections, we have elaborated how synaptic weights for neurons can be equivalently calculated either by using Hebbian learning or by Wan Abdullah's method. Theoretically, the information (synaptic strengths) produced by both methods is similar. However, due to interference effects and redundancies, the synaptic strengths could differ [Sathasivam (2006)], although the set of solutions in both cases should remain the same. Because of this, we cannot directly compare the obtained synaptic strengths. Instead, we carry out computer simulations of artificially generated logic programs and compare the final states of the resulting neural networks.

To obtain the logic-programmed Hopfield network based on Wan Abdullah's method, the following algorithm is carried out:

i) Given a logic program, translate all the clauses in the logic program into basic Boolean algebraic form.

ii) Assign a neuron to each ground atom.

iii) Initialize all connection strengths to zero.

iv) Derive a cost function associated with the negation of the conjunction of all the clauses, such that $\frac{1}{2}(1 + S_X)$ represents the logical value of atom $X$, where $S_X$ is the neuron corresponding to logical atom $X$. The value of $S_X$ is defined in such a way that it carries the value 1 if $X$ is true and -1 if $X$ is false. Negation ($X$ does not occur) is represented by $\frac{1}{2}(1 - S_X)$; a conjunction logical connective is represented by multiplication, whereas a disjunction connective is represented by addition.

v) Obtain the values of the connection strengths by comparing the cost function with the energy.

vi) Let the neural network programmed with these connection strengths evolve until minimum energy is reached. Check whether the solution obtained is a global solution (i.e., whether the interpretation obtained is a model for the given logic program).
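Steps (i)-(v) can be mechanized for the example program of Section 2. The sketch below (our own code and helper names) expands the cost function (5) into monomials over $S_A, \ldots, S_D$ and reads the connection strengths off against the energy (3), in which an order-$n$ connection, once its permutation symmetry is accounted for, appears with prefactor $-(n-1)!$; the result reproduces the Table 1 totals:

```python
from fractions import Fraction
from math import factorial

# The example program: clauses as (head, body atoms).
clauses = [("A", ["B", "C"]), ("D", ["B"]), ("C", [])]
half = Fraction(1, 2)

def clause_poly(head, body):
    """Polynomial (monomial frozenset -> coefficient) of the 'violated' indicator
    1/2 (1 - S_head) * prod_b 1/2 (1 + S_b), using S^2 = 1 for bipolar neurons."""
    poly = {frozenset(): Fraction(1)}
    for atom, sign in [(head, -1)] + [(b, +1) for b in body]:
        new = {}
        for mono, c in poly.items():
            new[mono] = new.get(mono, Fraction(0)) + c * half  # the "1" part
            m2 = mono ^ {atom}                                 # times S_atom
            new[m2] = new.get(m2, Fraction(0)) + c * half * sign
        poly = new
    return poly

# Step (iv): the cost function is the sum of the per-clause violation terms.
cost = {}
for head, body in clauses:
    for mono, c in clause_poly(head, body).items():
        cost[mono] = cost.get(mono, Fraction(0)) + c

# Step (v): in the energy (3), an order-n connection contributes
# -(1/n) * n! * J = -(n-1)! * J to its monomial, so J = -coeff / (n-1)!.
J = {mono: -c / factorial(len(mono) - 1) for mono, c in cost.items() if mono}

print(J[frozenset("ABC")], J[frozenset("AB")], J[frozenset("B")])
# Expect the Table 1 totals: 1/16, 1/8, -3/8
```

Working with `frozenset` monomials makes the substitution $S^2 = 1$ automatic, and exact `Fraction` arithmetic keeps the strengths as the rational numbers listed in Table 1.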
We ran the relaxation for 1000 trials and 100 combinations of neurons so as to reduce statistical error. The selected tolerance value is 0.001. These values were obtained by trial and error: we tried several tolerance values and selected the one that gave better performance than the others.

To compare the information contained in the synaptic strengths, we compare the stable states (states in which no neuron changes its value anymore) obtained by Wan Abdullah's method with the stable states obtained by Hebbian learning. We calculate the percentage of solutions reaching the global solutions by comparing the energies of the stable states obtained using Hebbian learning and Wan Abdullah's method. If the corresponding energies for both kinds of learning are the same, we conclude that the stable states for both are the same. This indicates that the models (sets of interpretations) obtained by both kinds of learning are similar. In all this, we assume that the global solutions for both networks are the same, since both methods consider the same knowledge base (clauses).

5. Results and Discussion

Figures 1-6 illustrate the graphs for the global minima ratio (ratio = (number of global solutions)/(number of solutions = number of runs)) and the Hamming distances from the computer simulations that we carried out. From the graphs obtained, we observe that the ratio of global solutions is consistently 1 for all the cases, even though we increased the network complexity by increasing the number of neurons (NN) and the number of literals per clause (NC1, NC2, NC3). Because we obtained similar results for all the trials, to avoid overlapping graphs we present only the results obtained for number of neurons (NN) = 40. Moreover, error bars for some of the cases could not be plotted because the size of the point is bigger than the error bar.
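The comparison criterion above relies on the global minima of the programmed energy coinciding with the models of the logic program. For the small example of Section 2 this can be checked exhaustively; the sketch below is our own, with the Table 1 totals hard-coded:

```python
from itertools import product
from fractions import Fraction
from math import factorial

ATOMS = "ABCD"
# Nonzero Table 1 totals (Wan Abdullah's method), keyed by the set of atoms.
J = {
    frozenset("ABC"): Fraction(1, 16),
    frozenset("AB"): Fraction(1, 8), frozenset("AC"): Fraction(1, 8),
    frozenset("BC"): Fraction(-1, 8), frozenset("BD"): Fraction(1, 4),
    frozenset("A"): Fraction(1, 8), frozenset("B"): Fraction(-3, 8),
    frozenset("C"): Fraction(3, 8), frozenset("D"): Fraction(1, 4),
}

def energy(S):
    # Eq. (3): an order-n connection, summed over all n! index orderings with
    # prefactor -1/n, contributes -(n-1)! * J * (product of its spins).
    E = Fraction(0)
    for conn, w in J.items():
        prod = 1
        for a in conn:
            prod *= S[a]
        E -= factorial(len(conn) - 1) * w * prod
    return E

def is_model(S):
    a, b, c, d = (S[x] == 1 for x in ATOMS)
    return (a or not b or not c) and (d or not b) and c

states = [dict(zip(ATOMS, bits)) for bits in product([-1, 1], repeat=4)]
E_min = min(energy(S) for S in states)
minima = [S for S in states if energy(S) == E_min]
models = [S for S in states if is_model(S)]
assert minima == models  # the global minima are exactly the models
```

When the energies of the stable states reached under Hebbian-learned weights match these minima, the two networks share the same set of models, which is exactly the comparison performed in the simulations.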
This indicates that the statistical error for the corresponding point is too small for the error bar to be visible. Most of the neurons which are not involved in the generated clauses will be in the global states. The randomly generated program clauses relaxed to the final states, which also appear to be stable states, in fewer than five runs. Furthermore, the network never got stuck in any suboptimal solution. This indicates that good solutions (global states) can be found in linear time or less, with low complexity.

Since all the solutions we obtained are global solutions, the distance between the stable states and the attractors is zero. Supporting this, we obtained zero values for the Hamming distance. This indicates that the stable states for both kinds of learning are the same, and therefore there is no difference in the energy values. So the models for both kinds of learning are shown to be similar. Although the ways of calculating the synaptic weights differ, since the calculations revolve around the same knowledge base (clauses), the sets of interpretations are similar. This implies that Hebbian learning can extract the underlying logical rules in a given set of events and provide good solutions just as Wan Abdullah's method does. The computer simulation results support this hypothesis.

6. Conclusion

In this paper, we have evaluated experimentally the logical equivalence between two types of learning (Wan Abdullah's method and Hebbian learning) for the same respective clauses (the same underlying logical rules) using computer simulation. The results support Wan Abdullah's earlier proposed theory.

References

Geszti, T. (1990). Physical Models of Neural Networks. Singapore: World Scientific.

Haykin, S. (1994). Neural Networks: A Comprehensive Foundation. New York: Macmillan.

Hopfield, J. J. (1982). Neural networks and physical systems with emergent collective computational abilities. Proc. Natl. Acad. Sci. USA, 79, 2554-2558.

Little, W. A. (1974).
The existence of persistent states in the brain. Math. Biosci., 19, 101-120.

Pinkas, G. (1991). Energy minimization and the satisfiability of propositional calculus. Neural Computation, 3, 282-291.

Sathasivam, S. (2006). Logic Mining in Neural Networks. PhD Thesis. University of Malaya, Malaysia.

Wan Abdullah, W. A. T. (1991). Neural network logic. In O. Benhar et al. (Eds.), Neural Networks: From Biology to High Energy Physics. Pisa: ETS Editrice, pp. 135-142.

Wan Abdullah, W. A. T. (1992). Logic programming on a neural network. Int. J. Intelligent Sys., 7, 513-519.

Wan Abdullah, W. A. T. (1993). The logic of neural networks. Phys. Lett., 176A, 202-206.

Table 1: Synaptic strengths for A ← B, C.  D ← B.  C ←. obtained using Wan Abdullah's method

Synaptic strength    A ← B, C.    D ← B.    C ←.    Total
J^(3)_[ABC]          1/16         0         0       1/16
J^(3)_[ABD]          0            0         0       0
J^(3)_[ACD]          0            0         0       0
J^(3)_[BCD]          0            0         0       0
J^(2)_[AB]           1/8          0         0       1/8
J^(2)_[AC]           1/8          0         0       1/8
J^(2)_[AD]           0            0         0       0
J^(2)_[BC]           -1/8         0         0       -1/8
J^(2)_[BD]           0            1/4       0       1/4
J^(2)_[CD]           0            0         0       0
J^(1)_[A]            1/8          0         0       1/8
J^(1)_[B]            -1/8         -1/4      0       -3/8
J^(1)_[C]            -1/8         0         1/2     3/8
J^(1)_[D]            0            1/4       0       1/4

[Figure 1: Global Minima Ratio for NC1, plotted against the NC1/NN ratio; NN = 40]
[Figure 2: Global Minima Ratio for NC2, plotted against the NC2/NN ratio; NN = 40]
[Figure 3: Global Minima Ratio for NC3, plotted against the NC3/NN ratio; NN = 40]
[Figure 4: Hamming Distance for NC1, plotted against the NC1/NN ratio; NN = 40]
[Figure 5: Hamming Distance for NC2, plotted against the NC2/NN ratio; NN = 40]
[Figure 6: Hamming Distance for NC3, plotted against the NC3/NN ratio; NN = 40]
