The Gn,m Phase Transition is Not Hard for the Hamiltonian Cycle Problem

Using an improved backtrack algorithm with sophisticated pruning techniques, we revise previous observations correlating a high frequency of hard to solve Hamiltonian Cycle instances with the Gn,m phase transition between Hamiltonicity and non-Hamilt…

Authors: J. Culberson, B. Vandegriend

Journal of Artificial Intelligence Research 9 (1998) 219-245. Submitted 3/98; published 11/98.

Basil Vandegriend, basil@cs.ualberta.ca
Joseph Culberson, joe@cs.ualberta.ca
Department of Computing Science, University of Alberta, Edmonton, Alberta, Canada, T6G 2H1

Abstract

Using an improved backtrack algorithm with sophisticated pruning techniques, we revise previous observations correlating a high frequency of hard to solve Hamiltonian cycle instances with the $G_{n,m}$ phase transition between Hamiltonicity and non-Hamiltonicity. Instead all tested graphs of 100 to 1500 vertices are easily solved. When we artificially restrict the degree sequence with a bounded maximum degree, although there is some increase in difficulty, the frequency of hard graphs is still low. When we consider more regular graphs based on a generalization of knight's tours, we observe frequent instances of really hard graphs, but on these the average degree is bounded by a constant. We design a set of graphs with a feature our algorithm is unable to detect and so are very hard for our algorithm, but in these we can vary the average degree from $O(1)$ to $O(n)$. We have so far found no class of graphs correlated with the $G_{n,m}$ phase transition which asymptotically produces a high frequency of hard instances.

1. Introduction

Given a graph $G = (V, E)$, $|V| = n$, $|E| = m$, the Hamiltonian cycle problem is to find a cycle $C = (v_1, v_2, \ldots, v_n)$ such that $v_i \neq v_j$ for $i \neq j$, $(v_i, v_{i+1}) \in E$ and $(v_n, v_1) \in E$. As for any NP-C problem, we expect solving it to require exponential time in the worst case on arbitrary graphs (assuming $P \neq NP$). However, in recent years researchers examining various NP-C problems such as SAT and graph coloring have discovered that the majority of graphs are easy for their algorithms to solve.
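The definition of a Hamiltonian cycle given above translates directly into a verification routine. The following is a minimal sketch, not the authors' code: the function name `is_hamiltonian_cycle`, the 0-based vertex labels, and the edge-list representation are all assumptions made for illustration.

```python
def is_hamiltonian_cycle(n, edges, cycle):
    """Check the definition: `cycle` visits each of the vertices 0..n-1
    exactly once, and consecutive vertices (wrapping around from the last
    to the first) are joined by an edge of the undirected graph."""
    if n < 3 or len(cycle) != n or set(cycle) != set(range(n)):
        return False
    edge_set = {frozenset(e) for e in edges}
    return all(
        frozenset((cycle[i], cycle[(i + 1) % n])) in edge_set
        for i in range(n)
    )


# A 4-cycle: the vertex order 0, 1, 2, 3 is a Hamiltonian cycle,
# while 0, 2, 1, 3 is not because (0, 2) is not an edge.
square = [(0, 1), (1, 2), (2, 3), (3, 0)]
print(is_hamiltonian_cycle(4, square, [0, 1, 2, 3]))
print(is_hamiltonian_cycle(4, square, [0, 2, 1, 3]))
```

Verification takes time linear in $n + m$, which is why checking a proposed cycle is easy even when finding one may not be.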
Only graphs with specific characteristics or graphs which lie within a narrow band (according to some parameter) seem to be hard to solve for these problems.

It is known (Pósa, 1976; Komlós & Szemerédi, 1983) that under a random graph model ($G_{n,m}$) as the edge density increases there is a sharp threshold (the phase transition) such that below that edge density the probability of a Hamiltonian cycle is 0, while above it the probability is 1. Previous research (Section 2.1) suggested that there is a high correlation of difficult problems with instances generated with edge density near the phase transition. Using an improved Hamiltonian cycle backtrack algorithm (Section 3) that employs various pruning operators and an iterated restart technique, we observe no hard instances at the transition for large $n$.

Section 4 describes our results on $G_{n,m}$ and related random graphs. In an attempt to find a higher frequency of hard graphs, in Section 5 we examine a low degree random graph class we call Degreebound graphs. However, these graphs are also usually easy for our backtrack algorithm, although we do find a few hard graphs. Analysis of these graphs indicates a test for non-Hamiltonian instances discussed in Section 5.3. In Section 6 we examine a graph class based on a generalization of the knight's tour problem. These graphs are significantly harder for our algorithm in general. In Section 7 we present a constructed graph class which produces exponential behavior for our backtrack algorithm.

Our experimental results provide evidence that the average degree of a graph is not a sufficient indicator for hard graphs for the Hamiltonian cycle problem. With our backtrack algorithm, the phase transition regions of the $G_{n,m}$ and Degreebound graph models are generally asymptotically easy.

© 1998 AI Access Foundation and Morgan Kaufmann Publishers. All rights reserved.

2. A Discussion of Hardness and Previous Work

The concept of hardness of instances and hard regions within graph classes, considered from an empirical basis, is not easy to define. In order to clarify what we mean, in this section we present our notions of hardness, relating this to previous work.

2.1 What is Hardness?

A problem of size $n$ is a set $\Pi_n$ of instances. For the Hamiltonian cycle problem, $\Pi_n$ is the set of undirected graphs on $n$ vertices.

Any discussion of the hardness of a particular instance of a problem is always with respect to an algorithm (or set of algorithms). In general, different algorithms will perform differently on the instance. Furthermore, for each particular instance of Hamiltonian cycle there is an associated algorithm that either correctly answers NO or outputs a cycle in $O(n)$ time. To meaningfully talk about the hardness of an instance, we must assume a fixed algorithm (or a finite class of algorithms) that is appropriate for a large (infinite) class of instances, and then consider how the algorithm performs on the instance. Hardness of an instance is always a measure of performance relative to an algorithm.

We are left with the question of how much work an algorithm must do before we consider the instance hard for it. Note that for a single instance the distinction between polynomial and exponential time is moot. Ideally, we would like to require the algorithm to take an exponential (i.e. $a^n$ for some $a > 1$) number of steps as size $n$ increases. Note that empirical corroboration of such is practically impossible for sets of large instances. In practice, we must be content with evidence such as failure to complete within a reasonable time for larger instances.

We would also like an instance to exhibit some robustness before we consider it hard for a given algorithm.
Ideally, for graph problems we would at a minimum require the instance to remain hard with high probability under a random relabeling of the vertices. Relabeling the vertices produces an isomorphic copy of the graph, preserving structural properties such as degree, connectivity, Hamiltonicity, cut sets, etc. The design of algorithms is typically based on identifying and using such properties, and as far as possible efficiency should be independent of the arbitrary assignment of labels.

Let us refer to a (probabilistic) problem class as a pair $(\Pi_n, P_n)$, where $P_n(x)$ is the probability of the instance $x$ given that we are selecting from $\Pi_n$. Problem classes are sometimes called ensembles in the Artificial Intelligence literature (Hogg, 1998). The usual classes for graph problems are $G_{n,p}$, where to generate an $n$ vertex graph, each pair of vertices is included as an edge with probability $p$, and $G_{n,m}$, where $m$ distinct edges are selected at random and placed in the graph. These two models are related (Palmer, 1985). For this paper we use the $G_{n,m}$ model.

We do not consider mean or average run times in our definitions. The primary reason is that for exponentially small sets of exponentially hard instances, it is impractical to determine the average with any reasonable assurance. For example, if $1/2^n$ of the instances require $\Theta(n^2 2^n)$ time and the remainder are solved in $O(n^2)$ time then the average time is quadratic, while if the frequency increases to $1/2^{0.9n}$ the average time is exponential. Even for $n = 100$ it would be utterly impractical to distinguish between these two frequencies with empirical studies. Furthermore, and for similar reasons, if we want to promote a class as a benchmark class for testing and comparing algorithms, low frequencies of hard instances are not generally sufficient.
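The averaging example above can be worked through explicitly. The following derivation is a sketch under a simplifying assumption that is not in the original: all hidden constants are set to 1, so a hard instance costs exactly $n^2 2^n$ steps and an easy one exactly $n^2$. The symbols $f(n)$ (fraction of hard instances) and $\bar{T}(n)$ (average time) are introduced here for illustration only.

```latex
\[
  \bar{T}(n) \;=\; f(n)\, n^2 2^n \;+\; \bigl(1 - f(n)\bigr)\, n^2 .
\]
% Case 1: f(n) = 2^{-n}. The hard instances contribute exactly n^2.
\[
  \bar{T}(n) \;=\; n^2 + \bigl(1 - 2^{-n}\bigr) n^2 \;<\; 2 n^2
  \qquad \text{(quadratic)} .
\]
% Case 2: f(n) = 2^{-0.9 n}. The hard instances alone dominate.
\[
  \bar{T}(n) \;>\; 2^{-0.9 n}\, n^2 2^n \;=\; n^2\, 2^{0.1 n}
  \qquad \text{(exponential)} .
\]
% At n = 100 the two frequencies are
\[
  2^{-100} \approx 7.9 \times 10^{-31},
  \qquad
  2^{-90} \approx 8.1 \times 10^{-28} .
\]
```

Both frequencies at $n = 100$ are many orders of magnitude below anything a sample of a few million graphs could reveal, which is the reason given above for not basing the definitions on average run time.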
We will say that a problem class is maximally hard (with respect to an algorithm or set of algorithms) if the instances generated according to the distribution are hard with probability going to one as $n$ goes to infinity. As an example of maximally hard classes, empirical evidence suggests that a variety of hidden coloring graph generators based on the $G_{n,p}$ model are maximally hard for a large variety of graph coloring algorithms (Culberson and Luo, 1993). These hard classes are all closely related to a coloring phase transition in random graphs.

In general, a phase transition is defined by some parameterized probability distribution on the set of instances. As the parameter is varied past a certain threshold value, the asymptotic probability of the existence of a solution switches sharply from zero to one.

Phase transitions are commonly considered to be identified with hard subsets of a particular problem (Cheeseman, Kanefsky, & Taylor, 1991). Many NP-C problems can be characterized by a "constraint" parameter which measures how constrained an instance is. Evaluation of a problem using this constraint parameter typically divides instances into two classes: those that are solvable, and those that are unsolvable, with a sharp transition occurring between them. When the problem is highly constrained, it is easily determined that no solution exists. As constraints are removed, a solution is easily found.

Different researchers (Cheeseman et al., 1991; Frank & Martel, 1995; Frank, Gent, & Walsh, 1998) have examined phase transitions on random graphs for the Hamiltonian cycle problem. The obvious constraint parameter is the average degree (or average connectivity) of the graph. As the degree increases, the graph becomes less constrained: it becomes easier both for a Hamiltonian cycle to exist and for an algorithm to find one.
These researchers have examined how Hamiltonicity changes with respect to the average degree. Frank et al. (1998) and Frank and Martel (1995) experimentally verified that when using the $G_{n,m}$ model the phase transition for Hamiltonicity is very close to the phase transition for biconnectivity, which occurs when the average degree is approximately $\ln n$ (or $m = n \ln n / 2$; note that the average degree equals $2m/n$). Cheeseman et al. (1991) experimentally confirmed theoretical predictions by Komlós and Szemerédi (1983) that the phase transition (for the Hamiltonian cycle problem) occurs when the average degree is $\ln n + \ln \ln n$. The papers also provided empirical evidence that the time required by their backtrack algorithms increased in the region of the phase transition and noted that the existence of very hard instances appeared to be associated with this transition.

As mentioned above, the $k$-colorable $G_{n,p}$ class appears maximally hard for all known algorithms with respect to a phase transition defined by $n$, $p$ and $k$, where $k \approx n / \log_b n$ and $b = 1/(1 - p)$. The Hamiltonian cycle $G_{n,m}$ class on the other hand does not appear maximally hard for any value of $m$. In fact, for large $n$ our algorithm almost never takes more than $O(n)$ backtrack nodes and $O(nm)$ running time.

We will use a much weaker requirement and say an instance is quadratically hard if it requires at least $n^2$ search nodes by the backtrack algorithm described in Section 3. Note that $\Omega(n^2)$ search nodes would take our algorithm $\Omega(n^3)$ time. For practical reasons, we will also use a weaker definition for robustness, and say that an instance is robustly quadratically hard if our algorithm uses at least $n^2$ search nodes when the iterated restart feature is used with a multiplying factor of 2 (see Section 3 for program details).
We say a class is minimally hard if there is some constant $\epsilon > 0$ such that the probability of a hard instance is at least $\epsilon$ as $n \to \infty$.

In Section 4 we examine $G_{n,m}$ random graphs using our backtrack algorithm on graphs of up to 1500 vertices. The empirical evidence we collect suggests that in contrast to the graph coloring situation, the Hamiltonian cycle $G_{n,m}$ class is not minimally quadratically hard, even for $m$ at or near the phase transition, and even if we drop our minimal robustness requirement. Note that we do not dispute the claim that hard instances are more likely at the phase transition than at other values of $m$, but rather claim that even at the transition the probability of generating a hard instance rapidly goes to zero with increasing $n$.

2.2 Random Graph Theory and the Phase Transition

These results are not unexpected when one reviews the theoretical work on this graph class. Since asymptotically the graph becomes Hamiltonian when an edge is added to the last degree 1 vertex (Bollobás, 1984), any algorithm that checks for a minimum degree $\geq 2$ will detect almost all non-Hamiltonian graphs. When the graph is Hamiltonian, various researchers (Angluin & Valiant, 1979; Bollobás, Fenner, & Frieze, 1987) have proven the existence of randomized heuristic algorithms which can almost always find a Hamiltonian cycle in low-order polynomial time. In particular, it is shown (Bollobás et al., 1987) that there is a polynomial time algorithm HAM such that

$$\lim_{n \to \infty} \Pr(\text{HAM finds a Hamilton cycle}) = \begin{cases} 0 & \text{if } c_n \to -\infty \\ e^{-e^{-2c}} & \text{if } c_n \to c \\ 1 & \text{if } c_n \to \infty \end{cases}$$

where $m = \frac{n}{2}(\ln n + \ln \ln n + c_n)$. Furthermore, as the authors point out, this is the best possible result in the sense that this is also the asymptotic probability that a $G_{n,m}$ graph is Hamiltonian, and is the probability that it has a minimum degree of 2.
In other words, the probability of finding a cycle is the same as the probability of one existing. Given that it is trivial to check the minimum vertex degree of a graph, this does not leave much room for the existence of hard instances (for HAM and similar algorithms).

Another relevant theoretical result is that there is a polynomial time algorithm which, with probability going to one, finds some Hamiltonian cycle when a graph has a hidden Hamiltonian cycle together with extra randomly added edges (Broder, Frieze, & Shamir, 1994). For the algorithm to work, the average degree of a vertex needs only be a constant. They claim the result can be easily extended to the case that the average degree is a growing function of $n$. This is another indication that Hamiltonian graphs near the phase transition will be easy to solve by some algorithm.

For a non-Hamiltonian graph to be hard for an algorithm it must contain a feature preventing the formation of a Hamiltonian cycle which the algorithm cannot easily detect. Suppose a backtrack algorithm does not check for vertices of degree one. The algorithm may then require exponential backtrack before determining the non-Hamiltonicity of the graph, since the only way it can detect this is by trying all possible paths and failing. However, degree one vertices are easily detectable, and so are not good indicators of hard instances. They also disappear at the phase transition. Similarly, an algorithm might not check for articulation points, and as a result waste exponential time on what should be easy instances. As $n \to \infty$, the probability of an articulation point existing (in $G_{n,m}$) goes to zero as fast as the probability of the existence of a vertex of degree less than two.
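The two easily detectable obstructions discussed above, a vertex of degree less than two and an articulation point, can be tested with a few lines of code. The sketch below is illustrative only: the names `graph_from_edges`, `is_connected` and `passes_easy_checks` are hypothetical, and the articulation test simply deletes each vertex in turn and reruns a breadth-first search, which costs $O(n(n + m))$ rather than the linear time a depth-first low-point algorithm would achieve.

```python
from collections import deque


def graph_from_edges(edges):
    """Build an adjacency map {vertex: set of neighbours} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def is_connected(adj, skip=None):
    """Breadth-first connectivity test on the graph with vertex `skip` removed."""
    vertices = [v for v in adj if v != skip]
    if not vertices:
        return True
    seen = {vertices[0]}
    queue = deque([vertices[0]])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w != skip and w not in seen:
                seen.add(w)
                queue.append(w)
    return len(seen) == len(vertices)


def passes_easy_checks(adj):
    """Return False if the graph is certainly non-Hamiltonian because it has
    fewer than 3 vertices, a vertex of degree < 2, is disconnected, or has an
    articulation point. A True result is inconclusive."""
    if len(adj) < 3:
        return False
    if any(len(nbrs) < 2 for nbrs in adj.values()):
        return False
    if not is_connected(adj):
        return False
    # A vertex whose removal disconnects the graph is an articulation point.
    return all(is_connected(adj, skip=v) for v in adj)


# Two triangles sharing vertex 2: minimum degree 2, but vertex 2 is a cut vertex.
bowtie = graph_from_edges([(0, 1), (1, 2), (2, 0), (2, 3), (3, 4), (4, 2)])
print(passes_easy_checks(bowtie))
```

The bowtie example has minimum degree two yet fails the articulation test, which is the kind of instance that can cost a backtrack search without this check a great deal of wasted work.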
Other features can lead to non-Hamiltonicity of course, such as $k$-cuts that leave $k + 1$ or more components (Bondy & Murty, 1976), and these could require time proportional to $n^k$ to detect. Under the assumption that NP $\neq$ co-NP there must also exist a set of non-Hamiltonian instances which have no polynomial proof of their status. However, it seems that at the phase transition the larger the feature the less likely it is to occur. In fact, the theoretical results summarized above indicate this must happen. Although we know hard graphs exist, and we may expect these localized types of hard graphs to be more frequent near the phase transition than elsewhere when using $G_{n,m}$ to generate instances, we also expect the probability of such instances to go to zero as $n$ increases.

3. An Overview of our Backtrack Algorithm

Our backtrack algorithm comes from Vandegriend (1998), and is based upon prior work on backtrack Hamiltonian cycle algorithms (Kocay, 1992; Martello, 1983; Shufelt & Berliner, 1994). It has three significant features which we will discuss. First, it employs a variety of pruning techniques during the search that delete edges that cannot be in any Hamiltonian cycle. This pruning is usually based upon local degree information. Second, before the start of the search the algorithm performs initial pruning and identifies easily detectable non-Hamiltonian graphs. The third feature is the use of an iterated restart technique. Additionally, the program provides the opportunity to order the selection of the next vertex during path extension using either a low degree first ordering, a high degree first ordering, or a random ordering. We normally use the low degree first ordering.

At each level of the search, after adding a new vertex to the current path, search pruning is used. The pruning identifies edges that cannot be in any Hamiltonian cycle and removes them from the graph.
(Note that if the algorithm backtracks, it adds the edges deleted at the current level of the search back to the graph.) The first graph configuration that the pruning looks for is a vertex $x$ with 2 neighbours $a$, $b$ of degree 2. Since the edges incident on $a$ and $b$ must be used in any Hamiltonian cycle, the other edges incident on $x$ can be deleted. The second graph configuration that the pruning looks for is a path $P = (v_1, \ldots, v_k)$ of forced edges (so $v_2, \ldots, v_{k-1}$ are of degree 2). If $k < n$ then the edge $(v_1, v_k)$ cannot be in any Hamiltonian cycle and can be deleted. If as a result of pruning, the degree of any vertex drops below 2, then no Hamiltonian cycle is possible and the algorithm must backtrack. The use of these operators may yield new vertices of degree 2 and therefore the pruning is iterated until no further changes occur.

A pruning iteration takes $O(n)$ time to scan the vertices to check for vertices with two degree 2 neighbors, and $O(n)$ time to extend all forced degree two paths. Since the iterations terminate unless a new vertex of degree two is created, at most $n$ iterations can occur. At most $O(m)$ edges can be deleted. On backing up from a descendant, the edges are replaced ($O(m)$) and the next branch is taken. Thus, an easy upper bound on the pruning time for a node searching from a vertex of degree $d$ is $O(d(n^2 + m))$, but this is overly pessimistic. Note that along any branch from the root of the search tree to a leaf, at most $n$ vertices can be converted to degree 2. Also note that along each branch each edge can be deleted at most once. If the degree is high we seldom take more than a few branches before success.
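The two pruning configurations described above can be sketched as an iterated fixed-point loop. This is an illustrative reconstruction and not the authors' implementation: the names `graph_from_edges` and `prune` are hypothetical, the graph is a plain adjacency map, and edges are not recorded for restoration on backtrack. Two early exits are our own additions: a vertex with three or more degree 2 neighbours, and a forced cycle that closes on fewer than $n$ vertices, both of which rule out a Hamiltonian cycle.

```python
def graph_from_edges(edges):
    """Build an adjacency map {vertex: set of neighbours} from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def prune(adj):
    """Apply both pruning rules until nothing changes (adj is modified in place).
    Returns False if the graph is shown to be non-Hamiltonian, else True."""
    n = len(adj)

    def remove_edge(u, v):
        adj[u].discard(v)
        adj[v].discard(u)

    changed = True
    while changed:
        changed = False
        if any(len(nbrs) < 2 for nbrs in adj.values()):
            return False

        # Rule 1: x has two neighbours of degree 2, so the edges to them are
        # forced and every other edge incident on x can be deleted.
        for x in adj:
            forced = {v for v in adj[x] if len(adj[v]) == 2}
            if len(forced) > 2:
                return False  # our addition: x would need three cycle edges
            if len(forced) == 2:
                for v in adj[x] - forced:
                    remove_edge(x, v)
                    changed = True

        # Rule 2: follow each maximal run of degree 2 vertices to its two end
        # vertices a and b; the edge (a, b) would close a cycle that is too short.
        seen = set()
        for s in adj:
            if len(adj[s]) != 2 or s in seen:
                continue
            seen.add(s)
            interior, ends, closed = 1, [], False
            for first in tuple(adj[s]):
                prev, cur = s, first
                while cur != s and len(adj[cur]) == 2:
                    seen.add(cur)
                    interior += 1
                    prev, cur = cur, next(w for w in adj[cur] if w != prev)
                if cur == s:  # the run is a closed cycle of degree 2 vertices
                    closed = True
                    break
                ends.append(cur)
            if closed:
                if interior < n:
                    return False  # our addition: forced cycle misses vertices
                continue
            a, b = ends
            if a == b:
                if interior + 1 < n:
                    return False  # our addition: forced short cycle through a
            elif interior + 2 < n and b in adj[a]:
                remove_edge(a, b)
                changed = True
    return True


# Vertex 1 has degree 2, so (0, 1, 2) is a forced path and edge (0, 2) is pruned.
g = graph_from_edges([(0, 1), (1, 2), (0, 2), (0, 3), (0, 4), (2, 3), (2, 4), (3, 4)])
print(prune(g), sorted(g[0]))
```

Each pass rescans every vertex, so this sketch does not attempt the single-pass bookkeeping that the text credits for the $O(n + m)$ per-node cost observed in practice.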
The implementation is such that when several vertices have two neighbors of degree two at the beginning of an iteration, all redundant edges are removed in a single pass taking time proportional to $n$ plus the number of edges removed and checked. In practice, on $G_{n,m}$ graphs it typically takes $O(n + m)$ time per search node on very easy Hamiltonian instances as evidenced by CPU measurements, with harder instances taking at most twice as long per search node.

Before the start of the recursive search, our algorithm prunes the graph as described above. Then the algorithm checks to see if the graph has minimum degree $\geq 2$, is connected, and has no cut-points. If any of these conditions are not true, then the graph is non-Hamiltonian and the algorithm is finished.

Some non-Hamiltonian instances may be very easy or very hard to detect, depending on which vertex the algorithm chooses as a starting point. In these cases local features exist that could be detected if the algorithm starts near them, but otherwise the algorithm may backtrack many times into the same feature without recognizing that only the feature matters. The seemingly hard instance on $G_{n\delta}$ for $n = 100$ discussed in Section 4.2 is such a case. This is one type of "thrashing," and is a common problem in backtracking algorithms. For example, Hogg and Williams (1994) noticed a sparse set of very hard 3-coloring problems that were not at the phase transition. Baker (1995) showed that these instances were most often hard as a result of thrashing, and that they could be made easy by backjumping or dependency-directed backtracking.

To improve our algorithm's average performance we use an iterated restart technique. The idea is to have a maximum limit $M$ on the number of nodes searched.
When the maximum is reached, the search is terminated and a new one started with the maximum increased by a multiple $k$ (so $M_{i+1} = k M_i$). Initially, $M = kn$. In our experiments, we used $k = 2$. By incrementing the search interval in this way, the algorithm will eventually obtain a search size large enough to do an exhaustive search and thus guarantee eventual completion. The total search will never be more than double the largest size allocated.

Although random restarts are sometimes effective on non-Hamiltonian graphs, they are more frequently effective on Hamiltonian instances. During search, as edges are added to the set of Hamiltonian edges, the net effect is to prune edges from the graph. For a Hamiltonian graph to be hard, the algorithm must select some set of edges which causes the reduced graph to become non-Hamiltonian, and this non-Hamiltonian subgraph must itself be hard to solve. With iterated restart, for the instance to remain hard the algorithm must make such mistakes with high probability. As a result, we expect fewer hard Hamiltonian instances.

Random restarts are an integral part of randomized algorithms (Motwani & Raghavan, 1995) and are used frequently in local search and other techniques to escape from local optima (Johnson, Aragon, McGeoch, & Schevon, 1991; Langley, 1992; Selman, Levesque, & Mitchell, 1992; Gomes, Selman, & Kautz, 1998). Further discussion of the impact of restarts can be found in the analysis of the experiments on $G_{n,m}$ graphs in Section 4.

The algorithm also provides for the possibility of checking for components and cut vertices during recursive search after the pruning is completed at each search node. The overhead of this extra work is $O(n + m)$ per search node and rarely seems to pay off.
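The iterated restart schedule described earlier in this section ($M = kn$ initially, then $M_{i+1} = k M_i$) can be sketched as a small driver loop. This is a hedged illustration, not the authors' code: `iterated_restart` and the `search` callable are hypothetical names, and `search(limit)` is assumed to run one backtrack search from a fresh random start, returning `None` if it expands `limit` nodes without finishing and some non-`None` answer otherwise.

```python
def iterated_restart(search, n, k=2):
    """Run search(limit) with node limits k*n, k^2*n, ... until it finishes.

    Returns (answer, final_limit, total_budget), where total_budget is the sum
    of all limits tried and therefore an upper bound on the nodes expanded.
    The loop only terminates if `search` eventually succeeds, which holds once
    the limit is large enough for an exhaustive search.
    """
    limit = k * n
    total = 0
    while True:
        total += limit
        answer = search(limit)
        if answer is not None:
            return answer, limit, total
        limit *= k


# Stand-in search that needs a limit of at least 1000 nodes to finish.
def fake_search(limit):
    return "solved" if limit >= 1000 else None


answer, final_limit, total = iterated_restart(fake_search, n=100)
print(answer, final_limit, total)
```

With $k = 2$ the limits form the geometric series $2n + 4n + \cdots + M$, whose sum is $2M - 2n$; this is the sense in which the total search is never more than double the largest size allocated.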
Except where noted, these component and cut-vertex checks during search were not used in this study.

The experimental results reported in the remaining sections were run on a variety of machines, the fastest of which is a 300 MHz Pentium II. All CPU times reported are either from this machine, or adjusted to it using observed speed ratios on similar tests. Our algorithm terminated execution after 30 minutes (see footnote 2). Experimental results are frequently reported as the ratio of the number of search nodes over the number of vertices. This node ratio is used because we feel it provides a better basis for comparing results across different graph sizes, since many of our results are $O(n)$. Note that the number of search nodes is calculated as the number of recursive calls performed.

We used several different methods of verifying the correctness of our algorithm and our experimental results. The algorithm was independently implemented twice, and performs automatic verification of all Hamiltonian cycles found. We performed multiple sets of experiments on generalized knight's circuit graphs and compared the results (graph Hamiltonian or not) to our theoretical predictions. Initial sets of experiments on $G_{n,m}$ graphs and Degreebound graphs were executed using two different pseudo-random number generators, and were repeated multiple times. Our source code is available as an appendix.

4. $G_{n,m}$ Random Graphs

We consider random graphs of 16 to 1500 vertices with $m = dn/2$. From previous work (Cheeseman et al., 1991; Komlós & Szemerédi, 1983) we expect the phase transition to occur when $d \approx \ln n + \ln \ln n$. Thus we specify the constraint parameter (or degree parameter) $k = d / (\ln n + \ln \ln n)$.

4.1 $G_{n,m}$ Using Restart

For the first experiment, we generate $G_{n,m}$ graphs with number of vertices $n = 16, \ldots, 96$ in steps of 4, $n = 100, \ldots, 500$ in steps of 100, $n = 1000$ and $n = 1500$.
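The instance generation described in this section can be sketched as follows: convert a degree parameter $k$ into an edge count via $d = k(\ln n + \ln \ln n)$ and $m = dn/2$, then draw $m$ distinct edges at random. The function names `edges_for_degree_parameter` and `random_gnm` are hypothetical, and rounding $m$ to the nearest integer is an assumption, since the text does not say how fractional edge counts were handled.

```python
import math
import random


def edges_for_degree_parameter(n, k):
    """Edge count m = d*n/2 with average degree d = k * (ln n + ln ln n).
    Rounding to the nearest integer is an assumption. Requires n >= 3 so
    that ln ln n is positive."""
    d = k * (math.log(n) + math.log(math.log(n)))
    return round(d * n / 2)


def random_gnm(n, m, rng=random):
    """Sample a G(n, m) graph: m distinct edges chosen uniformly at random
    from the n*(n-1)/2 possible pairs, by rejecting repeated pairs."""
    if m > n * (n - 1) // 2:
        raise ValueError("m exceeds the number of possible edges")
    edges = set()
    while len(edges) < m:
        u, v = rng.sample(range(n), 2)  # two distinct vertices
        edges.add((min(u, v), max(u, v)))
    return sorted(edges)


# At the predicted transition (k = 1.0) a 100-vertex graph gets about 307 edges.
m = edges_for_degree_parameter(100, 1.0)
graph = random_gnm(100, m, random.Random(0))
print(m, len(graph))
```

Rejection sampling is adequate here because the graphs near the transition are sparse ($m$ is of order $n \ln n$), so repeated pairs are rare.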
For each size $n$, the degree parameter $k$ ranges from 0.5 to 2.0 (step size 0.01 from $k = 1.00$ to $1.20$, step size 0.10 for other ranges of $k$). We generate 5000 graphs for each data point and execute our backtrack algorithm once on each graph. This is a grand total of 4.76 million graphs, of which 1.19 million are of 100 or more vertices. We use the pruning described in Section 3, check for components and articulation points after the initial pruning, and use iterated restart with a multiplicative factor of 2. We do not check for components or articulation points during the recursive search.

Footnote 2: Since the time limit of 30 minutes is at least two orders of magnitude greater than the typical running time, the limit is rarely used. On slower machines this limit was increased. The Knight's tour graphs reported in Section 6 were run on a slower machine with a 30 minute time limit, although some instances were run much longer.

[Figure 1: % of Hamiltonian graphs as a function of graph size and degree parameter for $G_{n,m}$ graphs. Horizontal axis: degree parameter $k$ (0.6 to 2.0). Vertical axis: % Hamiltonian (0 to 100). One curve for each of n = 100, 200, 300, 400, 500, 1000, 1500.]

We expect the phase transition for biconnectivity to be very similar to the phase transition for Hamiltonicity (Cheeseman et al., 1991) and we expect the phase transition for minimum degree greater than 1 to be almost identical to the phase transition for Hamiltonicity (Bollobás, 1984; Komlós & Szemerédi, 1983). Our experimental results matched these expectations very closely. For the larger graphs of 100 to 1500 vertices, the percentage of Hamiltonian graphs is plotted against the degree parameter in Figure 1. We found that the 50% point at which half the graphs are Hamiltonian occurs when the degree parameter $k \approx 1.08$ to $1.10$.
More interestingly, all the curves pass close to a fixed point near $k = 1$, and it seems they are approaching a vertical line at this point. That is, they appear to be converging on $k \approx 1$ as a phase transition, precisely as theory predicts.

All graphs were solved, that is, were either determined to be non-Hamiltonian, or a Hamiltonian cycle was found. We are primarily interested in asymptotic behavior, since theories concerning the relation of the phase transition to hard regions are necessarily asymptotic in nature. For graphs of 100 vertices or more, the longest running time was under 11 seconds, on a graph of 1500 vertices using 10,500 (or $7.0n$) search nodes to find a Hamiltonian cycle. All of the 549,873 non-Hamiltonian graphs in this range were detected during the initial pruning of the graph, and thus no search nodes were expanded. Of the 640,127 Hamiltonian $G_{n,m}$ graphs, the vast majority (629,806 or 98.3%) used only $n$ search nodes, which means that the algorithm did not need to backtrack at all (see footnote 3). No quadratically hard graphs were found in this range. Table 1 lists the maximum number of search nodes expressed as a factor of $n$ to illustrate the linearity of the search tree.

Table 1: Maximum Search Nodes on $G_{n,m}$ for Large $n$

    n       100    200    300    400    500    1000   1500
    Nodes   7.5n   7.0n   3.3n   7.0n   3.4n   3.3n   7.0n

These results appear to differ from those of Frank et al. (1998), who found graphs which took orders of magnitude more search nodes to solve. (Their hardest graph took over 1 million nodes.) We believe this is due to two factors. Firstly, the algorithm used to generate the results in their paper did not do an initial check for biconnectivity nor did it use all of the pruning techniques used in our algorithm.
Secondly and more importantly, on the small random graphs they used ($\leq 30$ vertices) the probability of obtaining certain hard configurations (such as biconnected and non-Hamiltonian or non-biconnected and minimum degree $\geq 2$) is much higher than when $n$ is larger, as we discussed in Section 2.2.

The experiments on small $G_{n,m}$ graphs (between 16 and 96 vertices) confirm this conjecture. In this case we do find a small number of quadratically hard graphs, and a few very hard graphs. For the purposes of this paper, we consider a very hard graph on fewer than 100 vertices to be any that takes at least 100,000 search nodes to solve. The very hard graphs from this set of runs are given in Table 2. Note that the very hardest took less than two minutes to solve, making our designation of "very hard" questionable. Also, note that the smallest graph in this set has 36 vertices, somewhat larger than the 30 vertex examples found by Frank et al. (1998). This is likely because we do articulation point checking initially and better pruning. Finally, all of these very hard graphs are non-Hamiltonian, and all occur in classes that produce less than 50% Hamiltonian graphs. The hardest Hamiltonian graph in contrast required only 19,318 search nodes, on a graph of 68 vertices with degree parameter 0.9.

Table 2: The Hardest Small Graphs

    Vertices   Degree Parameter   Seconds   Search Nodes   Ratio
    36         1.11                 94.7      1179579      32766.1
    40         1.00                 36.5       638946      15973.6
    40         1.07                 18.7       327603       8190.1
    44         1.00                 12.3       156694       3561.2
    44         1.04                 20.0       293664       6674.2
    48         1.02                 91.2      1280135      26669.5
    48         1.09                107.0      1243647      25909.3

In Figure 2 we plot the number of graphs that are quadratically hard for these small graphs. For $n$ from 68 to 92, all non-Hamiltonian graphs were detected during initial pruning. One non-Hamiltonian graph at $n = 96$ required search ($254.1n$ nodes). Notice that the number of quadratically hard Hamiltonian graphs is far less than the number of quadratically hard non-Hamiltonian graphs, and peaks for larger $n$. This is in accordance with the discussion of random restarts in Section 3.

[Figure 2: The Number of Quadratically Hard Graphs for Small $n$. Horizontal axis: number of vertices (20 to 90). Vertical axis: number of hard instances (0 to 60). Two series: non-Ham and Ham.]

Footnote 3: With 5% error in this measurement, this means that the algorithm might have backtracked over a maximum of $0.05n$ search nodes.

We ran additional tests for $n$ from 32 to 54 in steps of 2, with the degree parameter ranging from 0.96 to 1.16 with step size 0.01, generating 5000 graphs at each point. In this case, we invoked articulation point checking at each search node. Again all graphs were solved without timing out, and some very hard graphs were found, all of them non-Hamiltonian. One 50 vertex graph required 9,844,402 search nodes, and required close to 20 minutes to solve. It is unclear whether the extra checking helped; the smallest graph requiring at least 100,000 nodes had 32 vertices, while the smallest requiring over a million had 40 vertices. Overall, the results were very similar to the first set of experiments on small graphs.

4.2 $G_{n\delta}$ Using Restart

Clearly, the more edges we add to a graph, the more likely it is to be Hamiltonian. It also seems that once a graph is Hamiltonian, adding more edges makes it less likely to be hard. In an attempt to find hard graphs for larger $n$, we modified the $G_{n,m}$ generator so that instead of adding a fixed number of edges, it instead adds edges until every vertex has degree at least two, and then stops.
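The modified generator just described can be sketched in a few lines: keep adding random new edges until no vertex has degree below two. This is an illustration under assumptions, not the authors' generator: the name `random_min_degree_two` is hypothetical, and drawing each new edge uniformly from the unused vertex pairs (by rejecting repeats) is our reading of "adding edges" in the $G_{n,m}$ style.

```python
import random


def random_min_degree_two(n, rng=random):
    """Add uniformly random new edges one at a time and stop as soon as
    every vertex has degree at least two. Returns the sorted edge list."""
    if n < 3:
        raise ValueError("minimum degree two needs at least 3 vertices")
    degree = [0] * n
    edges = set()
    low = n  # number of vertices whose degree is still below two
    while low > 0:
        u, v = rng.sample(range(n), 2)  # two distinct vertices
        edge = (min(u, v), max(u, v))
        if edge in edges:
            continue  # already present, draw again
        edges.add(edge)
        for w in edge:
            degree[w] += 1
            if degree[w] == 2:
                low -= 1
    return sorted(edges)


edges = random_min_degree_two(100, random.Random(1))
print(len(edges))
```

Because the process stops at the first moment the minimum degree reaches two, the last edge added always touches a vertex that had degree one just before.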
In a sense this produces graphs exactly on the $G_{n,m}$ phase transition, since a minimum degree of two is the condition that asymptotically distinguishes Hamiltonian from non-Hamiltonian graphs with high probability. We refer to this distribution as the $G_{n\delta}$ model.

Initially we ran 1000 graphs with this generator for n from 100 to 500, but no hard instances were found. We increased the search to 10,000 graphs at each n, and included a search at n = 1000. Out of all these graphs, we found one very hard graph on 100 vertices. Even after a second attempt using more than 26 million search nodes, it was still unsolved. Doing post-mortem analysis, we checked for cut sets of size 2 and 3 that would leave 3 or 4 (or more) components and found none. We also checked the pruned graph using the odd degree test mentioned in Section 5.3, but this too failed to show it is non-Hamiltonian. Finally, we set up our fast machine with unlimited time and no restarts. Three search nodes and less than 0.1 seconds later it was proven non-Hamiltonian. Detailed analysis (see the appendix) shows that the graph has a small feature that is easily detected when one of a few starting points is selected. Because we use an exponentially growing sequence of searches, we only use a few restarts. In a test of 100 random starts with a 3 second time limit, 7 trials succeeded, using from 2 to 5 search nodes each to prove the graph non-Hamiltonian.

We also ran 10,000 $G_{n\delta}$ graphs at each even value of n from 16 to 98. The smallest instances requiring at least 100,000 search nodes were at n = 50. Only 5 graphs requiring more than a million nodes were found for n < 100: two at n = 62, one at n = 70 and two at n = 98. Two of these (one at 62, one at 98) initially timed out, but were solved in second attempts in about 1/2 hour. Neither was susceptible to an attack by 100 restarts as on the 100 vertex graph.
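As a rough illustration of the modified generator of Section 4.2 (edges are added until every vertex has degree at least two), the following Python sketch may help. It is not the generator used in the experiments: the function name `min_degree_two_edges`, the rejection of repeated edges, and the use of Python's `random.Random` are all our assumptions for the purpose of the example.

```python
import random


def min_degree_two_edges(n, rng):
    """Add distinct random edges until every vertex has degree >= 2.

    Hypothetical sketch; returns the edges in the order they were added.
    Requires n >= 3 so that minimum degree two is reachable.
    """
    degree = [0] * n
    seen = set()
    edges = []
    low = n  # number of vertices whose degree is still below two
    while low > 0:
        v, w = rng.sample(range(n), 2)   # two distinct vertices
        e = (min(v, w), max(v, w))
        if e in seen:                    # reject an edge that already exists
            continue
        seen.add(e)
        edges.append(e)
        for u in e:
            degree[u] += 1
            if degree[u] == 2:           # u has just reached degree two
                low -= 1
    return edges


if __name__ == "__main__":
    sample = min_degree_two_edges(100, random.Random(0))
    print(len(sample), "edges were needed for n = 100")
```

By construction the last edge added is the one that raises the final low-degree vertex to degree two, which is why such graphs sit on the transition discussed above.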
Table 3 shows the number of non-Hamiltonian graphs for each $n \ge 100$. All of these except the one mentioned above were detected during initial pruning. The remaining graphs were all easily shown to be Hamiltonian, with a maximum search ratio of 7.0. Clearly the probability of non-Hamiltonian graphs drawn from $G_{n\delta}$ is decreasing with n. It seems likely that the probability of hard instances is also going to zero.

n         100   200   300   400   500   1000
Non-Ham   154    56    29    20    15      3

Table 3: Number of Non-Hamiltonian Graphs from $G_{n\delta}$

n      k = 1.00   k = 1.50   k = 2.00
500    0.20       0.20       0.21
1000   0.43       0.50       0.60
1500   0.68       0.80       0.87

Table 4: CPU Seconds per 1000 Search Nodes for $G_{n,m}$ Graphs

4.3 $G_{n,m}$ Without Using Restart

We wanted to know how important the restart feature is asymptotically. We ran 1000 $G_{n,m}$ graphs for n from 100 to 1500, for each of the parameter settings in the first experiment, but this time using the backtrack algorithm without the iterated restart feature. As before, all non-Hamiltonian instances were detected during initial pruning. One quadratically hard Hamiltonian graph was found at n = 300, with degree parameter 1.20, which required 163,888 search nodes (that is, $1.82n^2$) and took 28.5 seconds. A few other graphs were nearly quadratic; for example, at n = 1500 there were 4 graphs that required $0.15n^2$, $0.19n^2$, $0.36n^2$ and $0.47n^2$ search nodes. It seems that asymptotically, even in the absence of iterated restarts, the $G_{n,m}$ class does not provide hard instances with high probability.

4.4 $G_{n,m}$ Summary

Based on a set of timing runs, we present in Table 4 an indication of how running time per search node increases with the number of vertices n and degree parameter k. Because the times are usually so short, we cannot get reliable numbers for n < 500.
The times shown are for the evaluation of 1000 search nodes, and are averaged (total CPU divided by total nodes searched) over graphs that were solved in less than $1.1n$ search nodes. For instances that require significantly more search nodes, the time per 1000 nodes seems to increase somewhat, but there are so few examples for large n that we are unable to provide exact estimates. For n = 1500 (see footnote 4), the average time per 1000 nodes for instances requiring more than $2n$ search nodes is 0.89 seconds at k = 1.00, 1.04 at k = 1.50 and 1.31 at k = 2.00. Note that this includes at least one instance that took $7n$ search nodes. This table indicates that the growth is approximately linear in n + m.

4. n = 1500 is the only value of n for which we have at least one instance requiring $\ge 2n$ search nodes at each of the three values of k. The times for 1000 and 1500 come from separate runs on 1000 graphs per sample point.

The experimental evidence clearly indicates that $G_{n,m}$ random graphs are asymptotically extremely easy everywhere, despite the existence of a phase transition. Our results temper the findings of the various researchers (Cheeseman et al., 1991; Frank et al., 1998; Frank & Martel, 1995) studying phase transitions and the Hamiltonian cycle problem. Cheeseman et al.'s explanation of their observed increase in difficulty near the phase transition was that "on the border [between the regions of low and high connectivity] there are many almost Hamiltonian cycles that are quite different from each other ... and these numerous local minima make it hard to find a Hamiltonian cycle (if there is one). Any search procedure based on local information will have the same difficulty." (Cheeseman et al., 1991).
Unfortunately, while their observations were accurate, their observed hardness was due to their algorithms and the limited size of the graphs tested, not to intrinsic properties of the Hamiltonian cycle problem with respect to the phase transition on $G_{n,m}$ graphs. We have shown that an efficient backtrack algorithm finds the phase transition region of $G_{n,m}$ graphs easy in general.

5. Degreebound Graphs

Intuitively, the reason that it is so hard to generate a hard instance from $G_{n,m}$ is that by the time we add enough edges to make the minimum degree two, the rest of the graph is so dense that finding a Hamiltonian cycle is easy. Alternatively, we see that to create a non-Hamiltonian property or feature, we must have regions of low degree, while at the same time meeting the minimal requirements that make the instance hard to solve. This problem can be characterized as one of high variance of vertex degrees. The only region where we get even a few hard graphs from $G_{n,m}$ is when n is small enough that the average degree is also low.

To avoid the consequences of this degree variation, in this section we use a different random graph model $G_n(d_2 = p_2, d_3 = p_3, \ldots)$ for which n is the number of vertices and $d_i = p_i$ means that $p_i$ is the percentage of vertices of degree i. As an example, $G_{100}(d_2 = 50\%, d_3 = 50\%)$ represents the set of graphs of 100 vertices in which 50 are of degree 2 and 50 are of degree 3. We refer to a graph generated under this model as a Degreebound graph. In this paper we only consider graphs whose vertices are of degree 2 or 3.

It is quite difficult to generate all graphs with a given degree sequence with equal probability (Wormald, 1984). Instead, we adopt two variations which generate graphs by selecting available edges. In each case each vertex is assigned a free valence equal to the desired final degree.
In version 1 pairs of vertices are selected in random order, and added as edges if the two vertices have at least one free valence each. This continues until either all free valences are filled (a successful generation) or all vertex pairs are exhausted (a failure). If failure occurs, the process is repeated from scratch. Initial tests indicate about 1/3 of the attempts fail in general. For efficiency reasons, in the implementation an array of vertices holds each vertex once. Pairs of vertices v, w are selected at random from the array and if $v \ne w$, and (v, w) is not already an edge, then (v, w) is added as an edge, and the free valence of each of v and w is reduced by one. When the free valence of a vertex is zero, the vertex is deleted from the array. This step is repeated until only a small number (twice the maximum degree) of vertices remains, and then all possible pairs of the remaining vertices are generated and tested in random order.

In version 2 an array initially holds each vertex v deg[v] times. Pairs of vertices are randomly selected, and if they are not equal and the edge does not exist, then the edge is added, and the copies of the two vertices are deleted from the array. This is repeated until the array is empty, or 100 successive attempts have failed to add an edge. The latter case is taken as failure, and the process is repeated from scratch. This method seldom fails.

Neither of these two methods guarantees a uniform distribution over the graphs of the given degree sequence. For example, given the degree sequence on five vertices {1, 1, 2, 2, 2}, there are seven possible (labeled) graphs. One consists of two components, an edge and a triangle. The other six are all paths with four edges; thus all six are isomorphic to one another. Of the 10! permutations of the pairs of vertices, 564,480 generate the graph on two components, while for each four-edge path there are 322,560 distinct permutations. The remaining permutations (31.2%) do not yield a legal graph. Thus, the first graph is 1.75 times as likely as any of the other six. Of course, a four-edge path (counting all isomorphic graphs) is 3.428 times as likely as the two-component graph. On the other hand, a version 2 test program (not our generator, which prohibits degree one vertices) consistently generated the first graph about 8%–10% more often than any of the others, based on several million random trials.

5.1 Experimental Results on Degreebound Graphs

We tested graphs of 100 to 500 vertices (step size 100), 1000 and 1500 vertices, with the mean degree varying from 2.6 to 3.0 (step size of 0.01 from 2.75 to 2.95, step size of 0.05 elsewhere). We generated 1000 graphs for each data point, executed our algorithm once on each graph, and collected the results. This test was repeated for each of the two versions.

Figure 3 shows the percentage of graphs which are Hamiltonian as the mean degree and graph size vary (see footnote 5). There is a clear transition from a mean degree of 2.6 (near 0% chance of a Hamiltonian cycle) to a mean degree of 3 (for which Robinson and Wormald, 1994 predict an almost 100% chance of a Hamiltonian cycle on uniformly distributed graphs). For a phase transition, we would expect the slope to grow steeper as the graph size increases. Figure 3 shows this increase in steepness.

Note that the double points on the curve for n = 100 are due to unavoidable discretization. Since the total degree of a graph must be even, when the generators detect that the total degree specified is odd, one of the minimum degree vertices is selected and its degree incremented.
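The parity adjustment just described, together with the version 2 procedure, can be sketched in Python as follows. This is only an illustration under our own assumptions: the names `degree_sequence` and `version2_graph` are hypothetical, the number of degree 3 vertices is obtained by rounding, and the sketch keeps retrying forever if the degree sequence is not realizable.

```python
import random


def degree_sequence(n, frac_deg3):
    """Degrees for n vertices, a given fraction of which have degree 3.

    If the total degree is odd, one minimum degree vertex is incremented,
    as described in the text. Rounding to the nearest count is an assumption.
    """
    n3 = round(n * frac_deg3)
    degs = [3] * n3 + [2] * (n - n3)
    if sum(degs) % 2 == 1:
        degs[degs.index(min(degs))] += 1
    return degs


def version2_graph(degs, rng, max_failures=100):
    """Version 2 sketch: pair up vertex copies, restarting after a failure."""
    while True:
        # The array holds each vertex v exactly degs[v] times.
        stubs = [v for v, d in enumerate(degs) for _ in range(d)]
        edges = set()
        failures = 0
        while stubs and failures < max_failures:
            i = rng.randrange(len(stubs))
            j = rng.randrange(len(stubs))
            v, w = stubs[i], stubs[j]
            e = (min(v, w), max(v, w))
            if v == w or e in edges:
                failures += 1            # count successive failed attempts
                continue
            edges.add(e)
            for k in sorted((i, j), reverse=True):
                stubs.pop(k)             # delete the two copies just used
            failures = 0
        if not stubs:
            return edges                 # every free valence was filled
```

Calling `degree_sequence(100, 0.81)` and `degree_sequence(100, 0.82)` is a quick way to see the discretization effect on the number of degree 3 vertices.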
Thus, for example, whether the fraction of degree 3 vertices specified is 0.81 or 0.82, the number of degree three vertices is 82. Discretization effects also occur for n = 300, 500 and 1500, but with lessened impact.

In Table 5 we summarize the observed hard instances from these graphs. We note that several instances exceeded our time bounds, and although these are certainly at least quadratically hard, they are not included in the quadratically hard instances. The frequency of hard instances appears to be decreasing with n on these graphs. In particular there are no quadratically hard non-Hamiltonian instances on 1000 or more vertices, except those that are too hard to solve with our program.

Interestingly, there turns out to be an O(n + m) time test which shows that most of the unresolved instances are non-Hamiltonian. This test is described briefly in Section 5.3. We implemented the test as a separate program and tested each of the unresolved graphs, with the results indicated in the last column of Table 5. The remaining five graphs remain unresolved. If this test were included in the initial pruning of our program, then the instances enumerated in the last column of Table 5 would all be solved (proven non-Hamiltonian) without search.

5. For these graphs, the mean degree is 2.0 plus the fraction of degree 3 vertices.

[Figure 3: % of Hamiltonian graphs for Degreebound Graphs. Two panels (Version 1 and Version 2), each plotting % Hamiltonian against % vertices of degree 3 for n = 100, 200, 300, 400, 500, 1000 and 1500.]
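The linear-time test mentioned above is described in Section 5.3: delete the forced edges, find the components of what remains, and report non-Hamiltonicity if some component contains an odd number of forced-edge end points. The Python sketch below illustrates that idea under stated assumptions. It treats only edges incident on degree two vertices as forced, performs no pruning beforehand, and the function name `odd_forced_degree` is ours rather than the authors'.

```python
from collections import defaultdict


def odd_forced_degree(n, edges):
    """Return True if the parity test proves the graph non-Hamiltonian.

    A result of False is inconclusive. Vertices are numbered 0..n-1 and
    edges is an iterable of (v, w) pairs of an undirected simple graph.
    """
    adj = defaultdict(set)
    for v, w in edges:
        adj[v].add(w)
        adj[w].add(v)

    # Forced edges: every edge incident on a vertex of degree two.
    forced = set()
    for v in range(n):
        if len(adj[v]) == 2:
            for w in adj[v]:
                forced.add(frozenset((v, w)))

    # Label the components of the graph with the forced edges deleted.
    comp = [-1] * n
    count = 0
    for start in range(n):
        if comp[start] != -1:
            continue
        comp[start] = count
        stack = [start]
        while stack:
            v = stack.pop()
            for w in adj[v]:
                if comp[w] == -1 and frozenset((v, w)) not in forced:
                    comp[w] = count
                    stack.append(w)
        count += 1

    # Forced degree of a component: end points of forced edges inside it.
    forced_degree = [0] * count
    for e in forced:
        for v in e:
            forced_degree[comp[v]] += 1
    return any(d % 2 == 1 for d in forced_degree)
```

For instance, a degree 3 vertex whose three neighbours all have degree 2 forms a single-vertex component with forced degree three, so the sketch reports that graph as non-Hamiltonian.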
Version 1
Number of    Quadratically Hard    Timed Out
Vertices     No HC    HC           Total    No HC
100              5     0               0        0
200             18     0               3        3
300              8     0              11       10
400              1     0              14       14
500              0     0              14       14
1000             0     0               7        7
1500             0     1               6        6

Version 2
Number of    Quadratically Hard    Timed Out
Vertices     No HC    HC           Total    No HC
100              5     0               0        0
200              9     0               6        5
300             10     0              13       13
400              3     0              11       11
500              1     1              10        9
1000             0     1               6        4
1500             0     0               6        6

Table 5: Number of Hard Graphs for Degreebound Graphs

Thus, although these classes may provide a small rate of hard instances for our current program, it is not clear they are even minimally hard. Furthermore, it appears there exist simple improvements to our program that would eliminate most of these hard instances.

In Figure 4 we illustrate the distribution of the graphs that timed out. The other quadratically hard graphs had similar distributions. About all that can be concluded is that the hard instances seem to be distributed over a mean degree range from 2.78 to 2.94.

The backtrack program is a little faster on Degreebound graphs than on $G_{n,m}$ graphs, as we would expect given fewer total edges. For 1500 vertices, the times per 1000 search nodes ranged from 0.27 seconds for the easiest (no backtrack) instances to 0.56 seconds for the harder ones.

5.2 Analysis of Degreebound Graphs

An analysis of the Degreebound graph class led us to conjecture that the prime factor determining the Hamiltonicity of a graph was whether or not the graph had a degree 3 vertex with 3 neighbours of degree 2. We label this a 3D2 configuration (or a 3D2 event). A graph with a 3D2 configuration is non-Hamiltonian, because both edges of each degree 2 neighbour must lie on any Hamiltonian cycle, which would place three cycle edges at the degree 3 vertex. The following informal analysis provides evidence for our conjecture.

Let $E(n, \beta)$ represent the expected number of 3D2 configurations in a graph with n vertices. Let $D_2 = \beta n$ be the number of degree 2 vertices and $D_3 = (1 - \beta) n$ the number of degree 3 vertices.
Note that the mean degree is

$$d = \frac{2 D_2 + 3 D_3}{n} = \frac{2 \beta n + 3 n (1 - \beta)}{n} = 3 - \beta.$$

Assuming equal probability of all combinations,

$$E(n, \beta) = D_3 \cdot \frac{\binom{D_2}{3}}{\binom{n-1}{3}} = n (1 - \beta) \cdot \frac{\binom{\beta n}{3}}{\binom{n-1}{3}} = \frac{n (1 - \beta)(\beta n)(\beta n - 1)(\beta n - 2)}{(n - 1)(n - 2)(n - 3)}.$$

We restrict ourselves to the asymptotic case ($n \to \infty$), which gives us

$$E(n, \beta) \approx \frac{n (1 - \beta)(\beta n)^3}{n^3} \approx n (1 - \beta) \beta^3.$$

When $E(n, \beta) \to 0$, the probability of having configuration 3D2 approaches 0. We want to find $\beta$ for which $n (1 - \beta) \beta^3 \to 0$ as $n \to \infty$. This occurs when $\beta = o(n^{-1/3})$. Since a Hamiltonian cycle cannot exist if E(3D2) > 0, this tells us that the phase transition asymptotically occurs when the mean degree equals 3. Asymptotically, Degreebound graphs with d < 3 are expected to be non-Hamiltonian while Degreebound graphs with d > 3 are expected to be Hamiltonian (ignoring other conditions). This agrees with results of Robinson and Wormald (1994), who proved that almost all 3-regular graphs are Hamiltonian.

If we let $\beta = n^{-1/3}$ this gives us $E(n, \beta) \approx 1$. Substituting this equation in our expression for mean degree gives us $d = 3 - n^{-1/3}$.

[Figure 4: Distribution of Timed Out Instances for Degreebound graphs. Two panels (Version 1 and Version 2); axes: number of vertices, % degree 3, number failed.]

# of        Mean Degree for 50% HC Point
Vertices    Experimental    Theoretical
100         2.78            2.78
200         2.81            2.82
300         2.83            2.85
400         2.84            2.86
500         2.85            2.87
1000        2.88            2.90
1500        2.90            2.91

Table 6: Experimental and approximate theoretical values for the location of the 50% Hamiltonian point for Degreebound graphs of various sizes.
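The arithmetic of this informal analysis can be checked with a short Python sketch. The symbol `beta` for the fraction of degree 2 vertices and the three function names are our own choices for illustration; the formulas are the ones given in the text, with the count of degree 2 vertices rounded to an integer.

```python
from math import comb


def expected_3d2(n, beta):
    """Expected number of 3D2 configurations: D3 * C(D2, 3) / C(n - 1, 3)."""
    d2 = round(beta * n)   # number of degree 2 vertices (rounded, an assumption)
    d3 = n - d2            # number of degree 3 vertices
    return d3 * comb(d2, 3) / comb(n - 1, 3)


def asymptotic_3d2(n, beta):
    """Large-n approximation n * (1 - beta) * beta**3."""
    return n * (1 - beta) * beta ** 3


def threshold_mean_degree(n):
    """Mean degree 3 - n**(-1/3) at which the approximation is about 1."""
    return 3 - n ** (-1 / 3)


if __name__ == "__main__":
    for n in (100, 200, 300, 400, 500, 1000, 1500):
        beta = n ** (-1 / 3)
        print(n, round(threshold_mean_degree(n), 3),
              round(expected_3d2(n, beta), 3),
              round(asymptotic_3d2(n, beta), 3))
```

Each value of `threshold_mean_degree(n)` lies within 0.01 of the corresponding theoretical entry in Table 6.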
Table 6 lists mean degrees for different values of n using this formula along with our experimentally determined values for the point where 50% of the graphs are Hamiltonian. They are remarkably similar. This suggests that the 3D2 configuration is the major determinant of whether a Degreebound graph will be Hamiltonian or not. Minor effects (which we have ignored) come from propagation of deleted edges while pruning and other less probable cases such as those mentioned in Section 5.3. Since the 3D2 configuration is detected by our algorithm before the search is started, this also implies that the phase transition will be easy for our algorithm, since most non-Hamiltonian graphs are instantly detected. This matches our experimental observations.

5.3 A Non-Hamiltonicity Test for Sparse Graphs

While preparing the final version of this paper, we observed that in the 3D2 configuration we could replace the vertex of degree three with a component of several vertices. In general, if there are three vertices of degree two that form a minimal cut then the graph is non-Hamiltonian. In fact, we can replace the three vertices by a minimal cut of any odd number c of degree 2 vertices, and the claim of non-Hamiltonicity remains true. Checking all possible subsets of size c would be very expensive, but fortunately there is an even more general condition that includes all of these as special cases and can be tested in linear (i.e. O(n + m)) time.

Let F be a set of edges that are forced to be in any Hamiltonian cycle if one exists. For example, edges incident on a vertex of degree two are forced. Let $G' = G - F$ be the graph formed by deleting the forced edges from G.
Let $C_1, \ldots, C_h$ be the components of $G'$, and define the forced degree of component $C_i$ to be the number of end points of forced edges (from F) in $C_i$. If any component has an odd forced degree, then G is non-Hamiltonian.

The proof of correctness of this test is simple. Observe that if there is a Hamiltonian cycle in G then while traversing the cycle each time we enter a component, there must be a corresponding exit. Since the forced edges act as a cut set (that separates the components), they are the only edges available to act as entries and exits to a component. All forced edges must be used. Therefore, if there is a Hamiltonian cycle there must be an even number of forced edges connecting any component to other components, each contributing one to the forced degree of the component. Each forced edge internal to (with both end points in) a component contributes two to the forced degree, so if there is a Hamiltonian cycle the total forced degree of each component must be even.

To obtain the results in the last column of Table 5, we first did the initial pruning, and then applied the test to the pruned graphs, using only the forced edges incident on degree two vertices.

6. Generalized Knight's Circuit Graphs

In this section we examine a graph class based upon the generalized knight's circuit problem in which the size of the knight's move is allowed to vary along with the size of the (rectangular) board. An instance of the generalized knight's circuit problem is a graph defined by the 4-tuple $(A, B)$-$n \times m$ where A, B is the size of the knight's move and n, m is the size of the board. The vertices of the graph correspond to the cells, and thus $|V| = nm$. Two vertices are connected by an edge if and only if it is possible to move from one vertex to the other by moving A steps along one axis and B along the other.
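The construction of a generalized knight's circuit graph can be sketched in a few lines of Python. The function name `knight_graph`, the row-major vertex numbering, and the parameter names `rows` and `cols` (used for the board size to avoid clashing with n and m as vertex and edge counts) are our assumptions for illustration.

```python
def knight_graph(a, b, rows, cols):
    """Edge set of the (a, b) knight's move graph on a rows x cols board.

    Cell (r, c) is numbered r * cols + c. Two cells are adjacent when one
    is reached from the other by a steps along one axis and b along the other.
    """
    moves = {(da * sa, db * sb)
             for da, db in ((a, b), (b, a))
             for sa in (1, -1)
             for sb in (1, -1)}
    edges = set()
    for r in range(rows):
        for c in range(cols):
            for dr, dc in moves:
                r2, c2 = r + dr, c + dc
                if 0 <= r2 < rows and 0 <= c2 < cols:
                    u, v = r * cols + c, r2 * cols + c2
                    if u != v:                      # skip the degenerate (0, 0) move
                        edges.add((min(u, v), max(u, v)))
    return edges


if __name__ == "__main__":
    # The ordinary chess knight corresponds to (a, b) = (1, 2) on an 8 x 8 board.
    print(len(knight_graph(1, 2, 8, 8)), "edges")
```

Storing each edge as an ordered pair in a set keeps the graph simple even when a = b or one of the step sizes is zero.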
(See Vandegriend, 1998 for more information about this problem.)

For this graph class there is no easy way to define phase transitions since there is no clear parameter which separates the Hamiltonian graphs from the non-Hamiltonian graphs (although Vandegriend, 1998 shows that there are ways of identifying groups of non-Hamiltonian graphs). Thus to find hard graphs, we look for graphs which take a significant amount of time to solve relative to their size. We perform 1 trial per graph (problem instance) and report the ratio of search nodes to number of vertices.

We examined a total of 300 generalized knight's circuit graphs over ranges of A, B, n, m (specific A, B, n triplets with m allowed to vary, for $A + B \le 9$, $n \le 13$, $m \le 60$). They ranged in size from 80 to 390 vertices. Of the 300 instances examined, 121 graphs (40%) were found to be Hamiltonian and 141 graphs (47%) were found to be non-Hamiltonian. For the remaining 38 graphs (13%) our backtrack algorithm failed (reached the 30 minute time limit), which implies these graphs are very hard for our backtrack algorithm.

A majority (91%) of the non-Hamiltonian graphs were solved without any search. However, a significant number of the remaining graphs took many search nodes to solve. 9 graphs (6.4%) took more than $10n$ nodes and 7 graphs (5.0%) took more than $100n$ nodes, where n here denotes the number of vertices. The hardest graph took $\approx 11276n$ search nodes (n = 324).

search nodes   # of trials   % of trials
<= 2n                    1           0.8
5n                      43          35.5
10n                     37          30.6
20n                     11           9.1
50n                      8           6.6
100n                     8           6.6
200n                     2           1.7
500n                     5           4.1
1000n                    2           1.7
2000n                    1           0.8
5000n                    1           0.8
10000n                   1           0.8
20000n                   0           0.0
50000n                   1           0.8

Table 7: Histogram of the search node ratio of our backtrack algorithm on 121 Hamiltonian generalized knight's circuit instances.
So while the majority of the non-Hamiltonian graphs were easy, a significant percentage of these generalized knight's circuit graphs were quite hard for our algorithm.

A larger variance in hardness was observed with the Hamiltonian graphs. Table 7 shows the distribution with respect to the number of search nodes required. Unlike $G_{n,m}$ and Degreebound graphs, these graphs could not be solved in only n search nodes. Almost all the graphs required at least $2n$ search nodes. 33% of the graphs required at least $10n$ nodes, 11% required at least $100n$ nodes and the hardest graph required $\approx 34208n$ nodes (n = 198).

7. A Hard Constructed Graph Class

When designing an algorithm, it is worthwhile to determine under what conditions and how frequently it might fail to perform, and just how badly it might do. The measure can be in terms of how bad an approximation is, or how long an exact algorithm may take in the worst case. There is a long tradition of designing instance sets that foil specific combinatorial algorithms (Johnson, 1974; Mitchem, 1976; Olariu & Randall, 1989; Spinrad & Vijayan, 1985). Other special classes are intended to be more general, and are frequently based on certain features or constructs together with some randomization to hide the features (Culberson & Luo, 1996; Brockington & Culberson, 1996; Kask & Dechter, 1995; Bayardo Jr. & Schrag, 1996). The $G_{n,m}$ class is frequently used to study graph algorithms over all possible graphs.

In this section we consider a special construction for a Hamiltonian graph which is extremely hard (exponential increase in difficulty with size) for our backtrack algorithm. It consists mostly of special constructs tied together with some randomly chosen edges.
It bears some resemblance to graphs such as the Meredith graph (Bondy & Murty, 1976) used to disprove certain theoretical conjectures. This graph remains difficult when we vary the neighbour selection heuristic or pruning techniques used by our backtrack algorithm. We refer to the graph we construct as the Interconnected-Cutset (ICCS) graph. Our class is intended merely to show that exponentially hard classes clearly exist for our algorithm, and for many other backtrack algorithms using similar approaches. We do not claim our graphs are intrinsically hard, as there is a polynomial time algorithm that will solve this particular class.

The basic concept we use in constructing these graphs is the non-Hamiltonian edge, which we define as an edge which cannot be in any possible Hamiltonian cycle. Note that since the graphs are Hamiltonian, each vertex must be incident on at least two edges which are not non-Hamiltonian. Our goal is to force the algorithm to choose a non-Hamiltonian edge at some point. The key observation is that once such an edge is chosen, the algorithm must backtrack to fix that choice. With multiples of these bad choices, after backtracking to fix the most recent bad choice, the algorithm must eventually backtrack to an earlier point to fix a less recent bad choice, which means the more recent choice must be redone, with the algorithm making the bad choice again. The amount of work performed by the algorithm is at least exponential in the number of bad choices. See Vandegriend (1998) for more details.

The ICCS graph is composed of k identical subgraphs $ICCS_S$ arranged in a circle. To force the desired cycle we have a degree 2 vertex between each subgraph. Since each subgraph has a Hamiltonian path between the connecting vertices, the ICCS graph is Hamiltonian.
Due to the construction of the ICCS subgraph, extra non-Hamiltonian edges can be added between different subgraphs. These edges help prevent components from forming during the search, which greatly reduces the effectiveness of the component checking search pruning. See Figure 5. Heavy lines are forced edges that must be in any Hamiltonian cycle.

Figure 6 contains a sample ICCS subgraph. Non-Hamiltonian edges are denoted by dashed lines, and forced edges are denoted by heavy lines. To see that the dashed lines cannot be part of any Hamiltonian cycle, observe that any path through the $ICCS_S$ must enter and exit on an $S_C$ vertex, and between any two $S_C$ vertices in sequence the path can visit at most one $S_I$ vertex. Thus, each such path uses at least one more vertex from $S_C$ than from $S_I$. Since initially $|S_C| = |S_I| + 1$, any Hamiltonian cycle can enter and exit the $ICCS_S$ only once, and must alternate between $S_C$ and $S_I$ vertices. Since the $S_T$ vertices only have one edge leading to an $S_I$ vertex, these edges are forced. This also allows us to interconnect subgraphs without adding new Hamiltonian cycles by connecting vertices of $S_C$ of two different subgraphs (since these additional edges are all non-Hamiltonian edges). By interconnecting the subgraphs in this fashion, we strongly reduce the effectiveness of checking for components or cut-points during the search. In the current implementation, for each vertex in each $S_C$ we randomly choose a vertex in another $S_C$ and add the edge.

[Figure 5: A sample ICCS graph, showing four $ICCS_S$ subgraphs.]

[Figure 6: A sample ICCS subgraph $ICCS_S$, with the vertex sets $S_T$, $S_C$, $S_I$ and $S_D$ labelled, the connecting edges to adjacent subgraphs, and the edges to vertices in other subgraphs.]
Thus, the average number of such edges per vertex is a little less than two, since some edges may be repeated.

One additional design element was added to handle various degree selection heuristics that our algorithm could use. At each stage in the search, the neighbours of the current endpoint of the partial path are arranged in a list to determine the order in which they will be chosen by our backtrack algorithm. There are 3 main heuristics: sorting the list to visit lower degree neighbours first, sorting to visit higher degree neighbours first, and visiting in random order. (Our backtrack algorithm normally uses the lower degree first heuristic.)

The $S_D$ vertex in the ICCS subgraph is used to fool the low degree first heuristic. The $S_D$ vertex is only incident to the two $S_T$ vertices and to two vertices in $S_I$, which makes it degree 4. When the algorithm enters a subgraph from the degree 2 connecting vertex, it reaches one of the $S_T$ vertices. From the $S_T$ vertex, the choices are the $S_D$ vertex (degree 4) and the one $S_I$ vertex (degree $|S_C| - 2$, because it is not connected to the $S_D$ vertex and the other $S_T$ vertex). If $|S_C| > 6$ then the $S_D$ vertex will have a lower degree and thus will be chosen first.

The high degree first heuristic avoids following the edge from the $S_T$ vertex to the $S_D$ vertex, and instead goes to the $S_I$ vertex. From there it chooses one of the $S_C$ vertices (not including $S_D$ or the other $S_T$ vertex, which are not adjacent). From this point, its choice is one of the $S_I$ vertices (maximum degree $= |S_C| - 2$) or one of the $S_C$ vertices in a different subgraph (degree $\ge |S_C|$ if that subgraph has not yet been visited). Since the $S_C$ vertex normally will have a higher degree, the algorithm will follow the non-Hamiltonian edge to that vertex.
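The three neighbour ordering heuristics discussed above can be illustrated with a small Python sketch. It is not the authors' implementation: the function name `order_neighbours`, the adjacency-dictionary representation, and the use of static degrees (rather than degrees in a pruned graph) are all assumptions made for the example.

```python
import random


def order_neighbours(adj, v, visited, heuristic, rng=random):
    """Order the unvisited neighbours of v for a backtrack search.

    adj maps each vertex to a list of its neighbours; heuristic is one of
    "low" (lower degree first), "high" (higher degree first) or "random".
    """
    candidates = [w for w in adj[v] if w not in visited]
    if heuristic == "low":
        candidates.sort(key=lambda w: len(adj[w]))
    elif heuristic == "high":
        candidates.sort(key=lambda w: -len(adj[w]))
    elif heuristic == "random":
        rng.shuffle(candidates)
    else:
        raise ValueError("unknown heuristic: " + heuristic)
    return candidates
```

In a construction like the ICCS subgraph, the point is that whichever of the two degree-based orderings is used, the first candidate returned at some vertex is the endpoint of a non-Hamiltonian edge.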
If the next neighbour is chosen at random, then from an $S_T$ vertex, the algorithm has a 50% chance of making the wrong choice. Similarly, at each $S_C$ vertex the algorithm has a small chance of following a non-Hamiltonian edge. As the number of subgraphs is increased, the probability of the algorithm making all the right choices rapidly approaches 0.

Another reason why the ICCS subgraph is expected to be hard for a backtrack algorithm is that there are many possible paths between the two $S_T$ vertices. If a non-Hamiltonian edge has previously been chosen, then the backtrack algorithm will try all the different combinations of paths (and fail to form a Hamiltonian cycle) before it backtracks to the bad choice.

We performed experiments on various ICCS graphs. We varied the number of subgraphs from 1 to 4, and varied the independent set size ($|S_I|$) from 6 to 8. We used our backtrack algorithm as specified in Section 3 with the addition of checking for components and cut-points during the search. We executed our algorithm 5 times per graph. Our results are listed in Table 8 for the low degree first heuristic. Our experiments using the other degree selection heuristics exhibited similar results.

We have also performed similar experiments using a randomized heuristic algorithm (Frieze, 1988; Pósa, 1976). Due to the significant difference in operation between this algorithm and backtrack algorithms, it easily solved these small ICCS graphs. However its performance rapidly decreased as the graphs were increased in size.
The average degree of ICCS graphs with more than one subgraph lies within the following range:

$$|S_I| - 2.5 + \frac{9.5}{|S_I| + 1} \le d \le |S_I| - 2 + \frac{8}{|S_I| + 1}$$

n    # S   |S_I|   Min             Median       Max
14   1     6       14              14           210
28   2     6       606             616          3,777
42   3     6       10,467          47,328       112,795
56   4     6       6,538,842       32,578,160   36,300,827
16   1     7       16              48           112
32   2     7       13,056          21,797       70,949
48   3     7       1,350,084       5,247,287    8,027,520
18   1     8       18              54           270
36   2     8       283,164         430,620      750,211
54   3     8       > 1.2 x 10^8

Table 8: Search nodes required by our backtrack algorithm on ICCS graphs.

From this formula we see that as the size of each independent set is increased, the mean degree increases linearly. However, as the number of subgraphs is increased, the mean degree remains constant. The ICCS graphs remain hard over a very wide range of mean degrees (from O(1) to O(n)). Therefore the average degree in this case is not a relevant parameter for determining hardness.

8. Conclusions and Future Work

Our backtrack Hamiltonian cycle algorithm found $G_{n,m}$ graphs easy to solve, along with a majority of Degreebound graphs. We have also performed similar experiments (Vandegriend, 1998) using a randomized heuristic algorithm (Frieze, 1988; Pósa, 1976) which had a high success rate on $G_{n,m}$ graphs, less so on Degreebound graphs. More interestingly, the existence of a phase transition for both problems did not clearly correspond to a high frequency of difficult instances. We suspect that other properties play a more important role than does the average degree. This is supported by our results on generalized knight's circuit graphs, which are all highly regular (with many symmetries), and for which the majority have average degrees between 4 and 8, compared to a mean degree $\le 3$ on Degreebound graphs.
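The mean degree range and the graph sizes can be tabulated with a small Python sketch. The two bounds are the ones in the formula above; the vertex count `k * (2 * s_i + 2)` is not stated in the text and is only inferred by us from the n, # S and |S_I| columns of the table of ICCS results, so it should be read as an assumption. The function names are hypothetical.

```python
def iccs_vertex_count(k, s_i):
    """Vertices in an ICCS graph with k subgraphs and |S_I| = s_i.

    Inferred from the table of results, not given explicitly in the text.
    """
    return k * (2 * s_i + 2)


def iccs_mean_degree_bounds(s_i):
    """Lower and upper bounds on the mean degree for more than one subgraph."""
    lower = s_i - 2.5 + 9.5 / (s_i + 1)
    upper = s_i - 2 + 8 / (s_i + 1)
    return lower, upper


if __name__ == "__main__":
    for s_i in (6, 7, 8):
        low, high = iccs_mean_degree_bounds(s_i)
        print(s_i, iccs_vertex_count(2, s_i), round(low, 3), round(high, 3))
```

The bounds depend only on the independent set size, which is consistent with the observation that adding subgraphs leaves the mean degree unchanged.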
These results should not be surprising, since it has been shown that asymptotically for randomly generated graphs, when the edge is added that makes the last vertex degree 2, then with high probability the graph is Hamiltonian (Bollobás, 1984). In addition, efficient algorithms have been shown to solve these instances in polynomial time with high probability (Bollobás et al., 1987). Since vertices of degree less than 2 are a trivially detectable counter-indicator, it is hardly surprising that asymptotically determining Hamiltonicity of graphs in $G_{n,m}$ is easy.

We also observe that the performance of our backtrack algorithm can vary widely for a single graph due to the selection of the initial vertex. Multiple restarts of our backtrack algorithm after a time limit was reached often resulted in superior performance. We suggest that a little randomization of the algorithm be used while empirically identifying intrinsically hard random instances of any problem.

Acknowledgements

This research was supported by Natural Sciences and Engineering Research Council Grant No. OGP8053.

References

Angluin, D., & Valiant, L. G. (1979). Fast probabilistic algorithms for Hamiltonian circuits and matchings. J. Comput. System Sci., 18(2), 155–193.

Baker, A. (1995). Intelligent Backtracking on Constraint Satisfaction Problems. Ph.D. thesis, University of Oregon.

Bayardo Jr., R. J., & Schrag, R. (1996). Using CSP look-back techniques to solve exceptionally hard SAT instances. In Proc. of the Second Int'l Conf. on Principles and Practice of Constraint Programming, Vol. 1118 of Lecture Notes in Computer Science, pp. 46–60.

Bollobás, B., Fenner, T. I., & Frieze, A. M. (1987). An algorithm for finding Hamilton paths and cycles in random graphs. Combinatorica, 7(4), 327–341.

Bollobás, B. (1984). The evolution of sparse graphs. In Bollobás, B. (Ed.), Graph Theory and Combinatorics, pp. 35–57. Academic Press, Toronto.

Bondy, J. A., & Murty, U. S. R. (1976). Graph Theory with Applications. Elsevier, Amsterdam.

Brockington, M., & Culberson, J. C. (1996). Camouflaging independent sets in quasi-random graphs. In Johnson, & Trick (Johnson & Trick, 1996), pp. 75–88.

Broder, A. Z., Frieze, A. M., & Shamir, E. (1994). Finding hidden Hamiltonian cycles. Random Structures and Algorithms, 5(3), 395–410.

Cheeseman, P., Kanefsky, B., & Taylor, W. M. (1991). Where the really hard problems are. In Mylopoulos, J., & Reiter, R. (Eds.), IJCAI-91: Proceedings of the Twelfth International Conference on Artificial Intelligence, pp. 331–337, San Mateo, CA. Morgan Kaufmann.

Culberson, J. C., & Luo, F. (1996). Exploring the k-colorable landscape with iterated greedy. In Johnson, & Trick (Johnson & Trick, 1996), pp. 245–284.

Frank, J., Gent, I. P., & Walsh, T. (1998). Asymptotic and finite size parameters for phase transitions: Hamiltonian circuit as a case study. Information Processing Letters, In press.

Frank, J., & Martel, C. (1995). Phase transitions in the properties of random graphs. In CP'95 Workshop: Studying and Solving Really Hard Problems, pp. 62–69.

Frieze, A. M. (1988). Finding Hamilton cycles in sparse random graphs. Journal of Combinatorial Theory, Series B, 44, 230–250.

Gomes, C. P., Selman, B., & Kautz, H. (1998). Boosting combinatorial search through randomization. In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98), pp. 431–437. AAAI Press/The MIT Press.

Hogg, T. (1998). Which search problems are random? In Proceedings of the Fifteenth National Conference on Artificial Intelligence (AAAI-98), pp. 438–443. AAAI Press/The MIT Press.

Hogg, T., & Williams, C. P. (1994). The hardest constraint problems: A double phase transition. Artificial Intelligence, 69, 359–377.

Johnson, D. S. (1974). Approximation algorithms for combinatorial problems. Journal of Computer and System Sciences, 9, 256–278.

Johnson, D. S., Aragon, C. R., McGeoch, L. A., & Schevon, C. (1991). Optimization by simulated annealing: An experimental evaluation; part II, graph coloring and number partitioning. Operations Research, 39(3), 378–406.

Johnson, D. S., & Trick, M. A. (Eds.). (1996). Cliques, Coloring, and Satisfiability: Second DIMACS Implementation Challenge (1993), Vol. 26. American Mathematical Society.

Kask, K., & Dechter, R. (1995). GSAT and local consistency. In Mellish, C. S. (Ed.), IJCAI-95: Proceedings of the Fourteenth International Joint Conference on Artificial Intelligence, pp. 616–622, San Mateo, CA. Morgan Kaufmann.

Kocay, W. (1992). An extension of the multi-path algorithm for finding Hamilton cycles. Discrete Mathematics, 101, 171–188.

Komlós, M., & Szemerédi, E. (1983). Limit distribution for the existence of a Hamilton cycle in a random graph. Discrete Mathematics, 43, 55–63.

Langley, P. (1992). Systematic and nonsystematic search strategies. In Artificial Intelligent Planning Systems: Proceedings of the First International Conference, pp. 145–152.

Martello, S. (1983). Algorithm 595: An enumerative algorithm for finding Hamiltonian circuits in a directed graph. ACM Transactions on Mathematical Software, 9(1), 131–138.

Mitchem, J. (1976). On various algorithms for estimating the chromatic number of a graph. The Computer Journal, 19, 182–183.

Motwani, R., & Raghavan, P. (1995). Randomized Algorithms. Cambridge University Press, New York.

Olariu, S., & Randall, J. (1989). Welsh-Powell opposition graphs. Information Processing Letters, 31(1), 43–46.

Palmer, E. M. (1985). Graphical Evolution: an introduction to the theory of random graphs. John Wiley & Sons, Toronto.

Pósa, L. (1976). Hamiltonian circuits in random graphs. Discrete Mathematics, 14, 359–364.

Robinson, R. W., & Wormald, N. C. (1994). Almost all regular graphs are Hamiltonian. Random Structures and Algorithms, 5(2), 363–374.

Selman, B., Levesque, H., & Mitchell, D. (1992). A new method for solving hard satisfiability problems. In Proceedings of the Tenth National Conference on Artificial Intelligence (AAAI-92), San Jose, CA, pp. 440–446.

Shufelt, J. A., & Berliner, H. J. (1994). Generating Hamiltonian circuits without backtracking from errors. Theoretical Computer Science, 132, 347–375.

Spinrad, J. P., & Vijayan, G. (1985). Worst case analysis of a graph coloring algorithm. Discrete Applied Mathematics, 12(1), 89–92.

Vandegriend, B. (1998). Finding Hamiltonian cycles: Algorithms, graphs and performance. Master's thesis, Department of Computing Science, University of Alberta. Online at "http://www.cs.ualberta.ca/~basil/".

Wormald, N. C. (1984). Generating random regular graphs. Journal of Algorithms, 5, 247–280.
