HyFlex: A Benchmark Framework for Cross-domain Heuristic Search


Authors: Edmund Burke, Tim Curtois, Matthew Hyde, Gabriela Ochoa, José A. Vázquez-Rodríguez

Abstract: Automating the design of heuristic search methods is an active research field within computer science, artificial intelligence and operational research. In order to make these methods more generally applicable, it is important to eliminate or reduce the role of the human expert in the process of designing an effective methodology to solve a given computational search problem. Researchers developing such methodologies are often constrained in the number of problem domains on which they can test their adaptive, self-configuring algorithms, which can be explained by the inherent difficulty of implementing the corresponding domain-specific software components. This paper presents HyFlex, a software framework for the development of cross-domain search methodologies. The framework features a common software interface for dealing with different combinatorial optimisation problems, and provides the algorithm components that are problem specific. In this way, the algorithm designer does not require detailed knowledge of the problem domains, and can therefore concentrate on designing adaptive general-purpose heuristic search algorithms. Four hard combinatorial problems are fully implemented (maximum satisfiability, one-dimensional bin packing, permutation flow shop and personnel scheduling), each containing a varied set of instance data (including real-world industrial applications) and an extensive set of problem-specific heuristics and search operators. The framework forms the basis of the first International Cross-domain Heuristic Search Challenge (CHeSC), and it is currently in use by the international research community.
In summary, HyFlex represents a valuable new benchmark of heuristic search generality, with which adaptive cross-domain algorithms can be easily developed and reliably compared.

Keywords: hyper-heuristics · combinatorial optimisation · search methodologies · self-adaptation · adaptation

Automated Scheduling, Optimisation and Planning (ASAP) research group, School of Computer Science, University of Nottingham, Jubilee Campus, Wollaton Road, Nottingham NG8 1BB, UK. E-mail: {ekb,tec,mvh,gxo}@cs.nott.ac.uk

1 Introduction

There is a renewed and growing research interest in techniques for automating the design of heuristic search methods. The goal is to remove or reduce the need for a human expert in the process of designing an effective algorithm to solve a search problem, and consequently to raise the level of generality at which search methodologies can operate. Evolutionary algorithms and metaheuristics have been successfully applied to solve a variety of complex real-world optimisation problems. Their design, however, has become increasingly complex. In order to make these methodologies widely applicable, it is important to provide self-managed systems that can configure themselves 'on the fly', adapting to changing problem (or search space) conditions on the basis of general high-level guidelines provided by their users.

Researchers pursuing these goals within combinatorial optimisation are often limited by the number of problem domains available to them for testing their adaptive methodologies. This can be explained by the difficulty and effort required to implement state-of-the-art software components, such as the problem model, solution representation, objective function evaluation and search operators, for many different combinatorial optimisation problems.
Although several benchmark problems in combinatorial optimisation are available (Taillard, 1993; Argelich et al, 2009; ESICUP, 2011; Beasley, 2010; TSPLIB, 2008), to name just a few, they mainly contain the data of a set of instances and their best known solutions. They generally do not incorporate the software necessary to encode the solutions and calculate the objective function, let alone existing search operators for the given problem. It is the researcher who needs to provide these in order to later test their high-level adaptive search method. To overcome such limitations, we propose HyFlex, a modular and flexible Java class library for designing and testing iterative heuristic search algorithms. It provides a number of problem domain modules, each of which encapsulates the problem-specific algorithm components: solution representation, fitness evaluation, instance data, and a repository of associated problem-specific heuristics. Importantly, only the high-level control strategy needs to be implemented by the user, as HyFlex provides an easy-to-use interface with which the problem domains can be accessed. Indeed, HyFlex can be considered an extension of the notion of a benchmark for combinatorial optimisation. Instead of providing only a data set for a given problem domain, HyFlex also provides the problem-specific software surrounding it. Thus, HyFlex acts as a benchmark for cross-domain optimisation and more general search methodologies. A number of techniques and research themes within operational research, computer science and artificial intelligence would benefit from the proposed framework.
Among them: hyper-heuristics (Burke et al, 2003b,a, 2010c; Ross, 2005), adaptive memetic algorithms (Krasnogor and Smith, 2001; Jakob, 2006; Ong et al, 2006; Smith, 2007; Neri et al, 2007), adaptive operator selection (Fialho et al, 2008, 2010; Maturana and Saubion, 2008; Maturana et al, 2010), reactive search (Battiti, 1996; Battiti et al, 2009), variable neighbourhood search (Mladenovic and Hansen, 1997) and its adaptive variants (Bräysy, 2002; Pisinger and Ropke, 2007); and, more generally, the development of adaptive parameter control strategies in evolutionary algorithms (Eiben et al, 2007; Lobo et al, 2007). HyFlex can be seen, then, as a unifying benchmark with which the performance of different adaptive techniques can be reliably assessed and compared.

Indeed, HyFlex is currently used to support an international research competition: the First Cross-Domain Heuristic Search Challenge (CHeSC, 2011). The challenge is analogous to the athletics decathlon event, where the goal is not to excel in one event at the expense of others, but to achieve a good general performance on each. The competition will also provide a set of state-of-the-art initial results on the HyFlex benchmark. Competitors will submit one Java class file representing their hyper-heuristic or high-level search strategy. This class file will then be run in HyFlex through the common interface. This ensures that the competition is fair, because all of the competitors must use the same problem representation and search operators. Moreover, due to the common interface, the competition will consider not only hidden instances, but also hidden domains. An interesting feature of CHeSC is the leaderboard, a table which ranks participants according to their best score in a rehearsal competition conducted every week. This rehearsal competition is based on a set of results submitted by the participants who chose to do so.
It has brought substantial dynamism and interest to the challenge. CHeSC currently has 43 registered teams from 23 different countries.

This article is structured as follows. Section 2 describes the antecedents and architecture of the HyFlex framework. It also includes examples of how to implement and run hyper-heuristics within the framework. Section 3 presents the four problem domains which are currently implemented: maximum satisfiability (MAX-SAT), one-dimensional bin packing, permutation flow shop, and personnel scheduling. For each domain, details are given on the instance data, solution initialisation method, objective function evaluation, and the set of problem-specific heuristics. Section 4 illustrates the implementation of three high-level search strategies using HyFlex: an iterative hyper-heuristic, a multiple-neighbourhood iterated local search algorithm, and a multi-meme memetic algorithm. They are not intended to be state-of-the-art adaptive approaches in their categories. Instead, they were selected to illustrate the wide range of algorithm designs that can be implemented within HyFlex. Section 5 presents a comparative study of the three algorithms. The goal is not to determine the best-performing algorithm, but to illustrate their differences in behaviour across the problem domains. Finally, section 6 summarises our contribution and suggests directions for future research.

2 The HyFlex Framework

2.1 Overview of HyFlex

HyFlex (Hyper-heuristics Flexible framework) is a software framework designed to enable the development, testing and comparison of iterative general-purpose heuristic search algorithms (such as hyper-heuristics). To achieve these goals it uses modularity and the concept of decomposing a heuristic search algorithm into two main parts (see Figure 1):

1. A general-purpose part: the algorithm or hyper-heuristic.
2.
The problem-specific part: provided by the HyFlex framework.

In the hyper-heuristics literature, this idea is also referred to as the domain barrier between the problem-specific heuristics and the hyper-heuristic (Burke et al, 2003a; Cowling et al, 2000). HyFlex extends the conceptual domain-barrier framework by maintaining a population (instead of a single incumbent solution) in the problem domain layer. Moreover, a richer variety of problem-specific heuristics and search operators is provided. Another relevant antecedent to HyFlex is PISA (Bleuler et al, 2003), a text-based software interface for multi-objective evolutionary algorithms.

Fig. 1 Modularity of heuristic search algorithms. Separation between the problem-specific and the general-purpose parts, both of which are reusable and interchangeable through the HyFlex interface.

PISA provides a division between the application-specific and the algorithm-specific parts of a multi-objective evolutionary algorithm. In HyFlex, the interface is not text-based; instead, it is given by an abstract Java class. This allows a tighter coupling between the modules and overcomes some of the speed limitations encountered in PISA. While PISA is designed for implementing evolutionary algorithms, HyFlex can be used to implement both population-based and single-point metaheuristics and hyper-heuristics. Moreover, it provides a rich variety of fully implemented combinatorial optimisation problems, including real-world instance data.

The framework is written in Java, which is familiar to and commonly used by many researchers. It also benefits from object orientation, platform independence and automatic memory management. At the highest level, the framework consists of just two abstract classes: ProblemDomain and HyperHeuristic. The structure of these classes is shown in the class diagram of Figure 2.
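This two-part decomposition can be sketched with a small self-contained toy example. Everything below, including all class and method names, is invented for illustration and deliberately simplified; it is not the HyFlex API itself, but it shows how a general-purpose search loop can be written against a domain interface without knowing anything about the underlying problem.

```java
import java.util.Random;

// Problem-specific side of the barrier (simplified, hypothetical interface).
abstract class DomainSketch {
    abstract void initialiseSolution(int index);
    abstract double applyHeuristic(int h, int src, int dest); // returns new fitness
    abstract void copySolution(int src, int dest);
    abstract double getFunctionValue(int index);
    abstract int getNumberOfHeuristics();
}

// A toy domain: minimise the number of 1-bits in a 32-bit string.
class OneMinDomain extends DomainSketch {
    private final Random rng = new Random(42);
    private final boolean[][] memory = new boolean[2][32];

    void initialiseSolution(int i) {
        for (int b = 0; b < 32; b++) memory[i][b] = rng.nextBoolean();
    }
    int getNumberOfHeuristics() { return 1; }          // one mutational heuristic
    double applyHeuristic(int h, int src, int dest) {  // flip one random bit
        memory[dest] = memory[src].clone();
        int bit = rng.nextInt(32);
        memory[dest][bit] = !memory[dest][bit];
        return getFunctionValue(dest);
    }
    void copySolution(int src, int dest) { memory[dest] = memory[src].clone(); }
    double getFunctionValue(int i) {
        int ones = 0;
        for (boolean b : memory[i]) if (b) ones++;
        return ones;
    }
}

// General-purpose side: a random-descent loop that never inspects bit strings.
public class DecompositionSketch {
    static double run(DomainSketch problem, int iterations, long seed) {
        Random rng = new Random(seed);
        problem.initialiseSolution(0);
        for (int t = 0; t < iterations; t++) {
            int h = rng.nextInt(problem.getNumberOfHeuristics());
            double candidate = problem.applyHeuristic(h, 0, 1);
            if (candidate <= problem.getFunctionValue(0)) {
                problem.copySolution(1, 0);            // accept non-worsening moves
            }
        }
        return problem.getFunctionValue(0);
    }

    public static void main(String[] args) {
        System.out.println(run(new OneMinDomain(), 5000, 7));
    }
}
```

The point of the sketch is that the loop in DecompositionSketch would work unchanged with any other DomainSketch implementation, which is exactly the reusability the framework aims for.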
In the diagram, the signatures adjacent to circles are public methods and fields, and the signatures adjacent to diamonds are protected. Abstract methods are denoted by italics; the implementations of these methods are necessarily different for each problem domain class.

2.1.1 The ProblemDomain Class

As shown in Figure 2, an implementation of the ProblemDomain class provides the following elements, each of which is easily accessed and managed with one or more methods.

1. A user-configurable memory (a population) of solutions, which can be managed by the hyper-heuristic through methods such as setMemorySize and copySolution.
2. A routine to randomly initialise solutions, initialiseSolution(i), where i is the index of the solution in the memory.
3. A set of problem-specific heuristics, which are used to modify solutions. These are called by the user's hyper-heuristic with the applyHeuristic(i, j, k) method, where i is the index of the heuristic to call, j is the index of the solution in memory to modify, and k is the index in memory where the resulting solution should be put. Each problem-specific heuristic in each problem domain is classified into one of four groups, shown below. The heuristics belonging to a specific group can be accessed by calling getHeuristicsOfType(type).
– Mutational or perturbation heuristics: perform a small change on the solution, by swapping, changing, adding or deleting solution components.
– Ruin-recreate (destruction-construction) heuristics: partly destroy the solution and rebuild or recreate it afterwards.
These heuristics can be considered as large neighbourhood structures. They are, however, different from the mutational heuristics in that they can incorporate problem-specific construction heuristics to rebuild the solutions.
– Hill-climbing or local search heuristics: iteratively make small changes to the solution, accepting only non-deteriorating solutions, until a local optimum is found or a stopping condition is met. These heuristics differ from mutational heuristics in that they incorporate an iterative improvement process, and they guarantee that a non-deteriorating solution will be produced.
– Crossover heuristics: take two solutions, combine them, and return a new solution.

4. A varied set of instances that can be easily loaded using the method loadInstance(a), where a is the index of the instance to be loaded.
5. A fitness function, which can be called with the getFunctionValue(i) method, where i is the index of the required solution in the memory. HyFlex problem domains are always implemented as minimisation problems, so a lower fitness is always superior. The fitness of the best solution found so far in the run can be obtained with the getBestSolutionValue() method.
6. Two parameters: α and β (0 ≤ α, β ≤ 1), which represent the 'intensity of mutation' and the 'depth of search', respectively, and control the behaviour of some search operators.

Fig. 2 Class diagram for the HyFlex framework.

2.1.2 The HyperHeuristic Class

The HyperHeuristic class is designed to allow algorithms which implement this class to be compared and benchmarked across one or more of the available problem domains (for example, in a competition). Users create cross-domain heuristic algorithms by creating implementations of this abstract class. Each class must contain a toString() method, to give the methodology a name.
It must also contain a solve() method, in which the functionality of the particular methodology is written. The solve() method would normally contain a loop, which continues while the time limit (defined by the user) has not been exceeded. Within the loop, the code should provide a mechanism for selecting between the available problem-specific heuristics, and choose the solutions in memory to which the heuristics are applied. The class could work with a memory size of 1 for a single-point search, or a large memory could be maintained for a population-based approach. The memory can be easily defined and maintained by calling methods of the ProblemDomain class, where the memory is stored. A hyper-heuristic class automatically records the length of time for which it has been running, and this can be monitored through methods such as hasTimeExpired() and getElapsedTime(). The solve() method is the only method which must be implemented; all other common functionality, such as the timing function and the recording of the best solution, is provided by the HyFlex software.

2.2 Running a Hyper-Heuristic

Algorithm 1 shows the ease with which a hyper-heuristic can be run on a problem domain. An object is created for the problem domain (in this example MAX-SAT) and for the hyper-heuristic, each with a random seed. Then a problem instance is loaded from the selection available in the problem domain object. In this example we choose the instance with index 0. The problem domain is now set up for the hyper-heuristic. We set the time for which the hyper-heuristic will run, in milliseconds. Then the hyper-heuristic object is given a reference to the problem domain object. Now that the setup is complete, the run() method of the hyper-heuristic is called, to start the search process.
The hyper-heuristic will run for 60 seconds in this example, and the best solution found during that time is retrievable with the getBestSolutionValue() method, as shown in Algorithm 1.

Algorithm 1 Java code for running a hyper-heuristic on a problem domain

    ProblemDomain problem = new SAT(1234);
    HyperHeuristic HHObject = new ExampleHyperHeuristic1(5678);
    problem.loadInstance(0);
    HHObject.setTimeLimit(60000);
    HHObject.loadProblemDomain(problem);
    HHObject.run();
    System.out.println(HHObject.getBestSolutionValue());

2.3 An Example Hyper-Heuristic

This section provides an example hyper-heuristic, to illustrate the ease with which a hyper-heuristic can be created. This is done by extending the HyperHeuristic abstract class and implementing only one method. All of the common functionality, such as the timing function and the recording of the best solution, is provided by the HyFlex software. This example demonstrates exactly how to use certain elements of HyFlex functionality, including the solution memory. After the run() method of the hyper-heuristic is called (see section 2.2), the hyper-heuristic abstract class performs some housekeeping tasks, such as initialising the timer, and then calls the solve() method of the chosen hyper-heuristic. In our example this is an object of the class ExampleHyperHeuristic1. Algorithm 2 shows the code for the solve() method in ExampleHyperHeuristic1. It shows that very few lines of code are necessary in order to implement a hyper-heuristic method with the HyFlex framework. Algorithm 2 is written in pseudocode, but each line corresponds to no more than one line of actual Java code. The solve() method is the only substantial method which needs to be implemented. Indeed, the only other necessary method is toString(), which requires one line to give the hyper-heuristic a name.
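The time-limited loop that solve() normally contains can be mimicked in plain Java, as in the self-contained stand-in below. This is illustrative only: in the framework itself the timer and hasTimeExpired() are provided for you, so a hyper-heuristic never manages the clock directly, and the class and method names here are invented.

```java
public class TimeLoopSketch {
    // Stand-in for the framework's timer: true once limitMs has elapsed.
    static long start;
    static long limitMs;

    static boolean hasTimeExpired() {
        return System.currentTimeMillis() - start >= limitMs;
    }

    static long runFor(long millis) {
        start = System.currentTimeMillis();
        limitMs = millis;
        long iterations = 0;
        while (!hasTimeExpired()) {
            iterations++;   // one heuristic application would go here
        }
        return iterations;
    }

    public static void main(String[] args) {
        System.out.println(runFor(100) > 0);
    }
}
```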
From Algorithm 2, we can see that the solve() method takes the problem domain object as an argument, and first checks the number of search operators available within it. We also initialise a variable to store the current objective function value. It is also necessary to initialise at least one solution in the memory. The default memory size is 2, and we initialise the solution at index 0, which means we build an initial solution with the method specified in the problem domain (generally a fast randomised constructive heuristic). The solution at index 1 remains uninitialised, and therefore has a value of null. An implemented hyper-heuristic must always contain a while loop which checks whether the time limit has expired. The code within the loop specifies the main functionality of the hyper-heuristic. In this example, we choose a random operator, and then apply it to the solution at index 0. The modified solution is put into the memory at index 1 (previously uninitialised). Note that a random number generator rng is provided by the HyperHeuristic abstract class. It is created when the hyper-heuristic object's constructor is called, which is why that constructor requires a random seed. If the new solution is superior to the old solution, it is accepted, and the new solution overwrites the old one in memory. The copySolution method of the problem domain class is employed to manage this. If the new solution is not superior, then it is accepted with probability 0.5.
Algorithm 2 Pseudocode for the solve() method of ExampleHyperHeuristic1. This is called when the run() method of the hyper-heuristic is called (see Algorithm 1)

    Require: A ProblemDomain object, problem
    int numberOfHeuristics = problem.getNumberOfHeuristics()
    double currentObjValue = Double.POSITIVE_INFINITY
    problem.initialiseSolution(0)
    while hasTimeExpired() = FALSE do
        int h = rng.nextInt(numberOfHeuristics)
        double newObjValue = problem.applyHeuristic(h, 0, 1)
        double delta = currentObjValue - newObjValue
        if delta > 0 then
            problem.copySolution(1, 0)
            currentObjValue = newObjValue
        else
            if rng.nextBoolean() = TRUE then
                problem.copySolution(1, 0)
                currentObjValue = newObjValue
            end if
        end if
    end while

2.4 Summary of HyFlex Description

In this section, we have given an overview of the HyFlex framework, and demonstrated that it is very easy to create and run a hyper-heuristic using the framework. The contribution of HyFlex is that the hyper-heuristic developer no longer needs expertise in any of the problem domains. The developer is therefore free to focus their research efforts on developing hyper-heuristic methodologies which can be shown to be generally successful across a range of problem domains.

3 HyFlex Problem Domains

Currently, four problem domain modules are implemented (they can be downloaded from CHeSC (2011)): maximum satisfiability (MAX-SAT), one-dimensional bin packing, permutation flow shop, and personnel scheduling. Each domain includes 10 training instances from different sources, and a number of problem-specific heuristics of the types discussed in section 2.1.

3.1 Maximum Satisfiability (MAX-SAT)

3.1.1 Description

Problem formulation: 'SAT' refers to the boolean satisfiability problem. This problem involves determining whether there is an assignment of the boolean variables of a formula which results in the whole formula evaluating to true. If there is such an assignment then the formula is said to be satisfiable, and if not then it is unsatisfiable.
An example formula is given in equation (2), which is satisfied when x_1 = false, x_2 = false, x_3 = true and x_4 = false.

    (x_1 ∨ ¬x_2 ∨ ¬x_3) ∧ (¬x_1 ∨ x_3 ∨ x_4) ∧ (x_2 ∨ ¬x_3 ∨ ¬x_4)        (2)

HyFlex implements one of SAT's related optimisation problems, the maximum satisfiability problem (MAX-SAT), in which the objective is to find the maximum number of clauses of a given boolean formula that can be satisfied by some assignment. The problem can also be formulated as a minimisation problem, where the objective is to minimise the number of unsatisfied clauses.

Solution initialisation: The solutions are initialised by randomly assigning a true or false value to each variable.

Objective function: The fitness function returns the number of 'broken' clauses, which are those that evaluate to false.

Instance data: The ten training instances and their sources are summarised in Table 2.

3.1.2 Search Operators

This domain contains a total of 9 search operators, summarised by Fukunaga (2008). Before describing them, four relevant definitions are given below. Let T be the state of the formula before a variable is flipped, and let T′ be the state of the formula after the variable is flipped.
Table 2 MAX-SAT instances

     name                                                          source                 variables  clauses
  1  contest02-Mat26.sat05-457.reshuffled-07                       CRIL (2007)            744        2464
  2  hidden-k3-s0-r5-n700-01-S2069048075.sat05-488.reshuffled-07   CRIL (2007)            700        3500
  3  hidden-k3-s0-r5-n700-02-S350203913.sat05-486.reshuffled-07    CRIL (2007)            700        3500
  4  parity-games/instance-n3-i3-pp                                CRIL (2009)            525        2276
  5  parity-games/instance-n3-i3-pp-ci-ce                          CRIL (2009)            525        2336
  6  parity-games/instance-n3-i4-pp-ci-ce                          CRIL (2009)            696        3122
  7  highgirth/3SAT/HG-3SAT-V250-C1000-1                           Argelich et al (2009)  250        1000
  8  highgirth/3SAT/HG-3SAT-V250-C1000-2                           Argelich et al (2009)  250        1000
  9  highgirth/3SAT/HG-3SAT-V300-C1200-2                           Argelich et al (2009)  300        1200
 10  MAXCUT/SPINGLASS/t7pm3-9999                                   Argelich et al (2009)  343        2058

Net gain of a variable is defined as the number of broken clauses in T minus the number of broken clauses in T′.
Positive gain of a variable is the number of broken clauses in T that are satisfied in T′.
Negative gain of a variable is the number of satisfied clauses in T that are broken in T′.
Age of a variable is the number of variable flips since it was last flipped.

Mutational heuristics

h1: GSAT: Flip the variable with the highest net gain, and break ties randomly (Selman et al, 1992).
h2: HSAT: Identical functionality to GSAT, but ties are broken by selecting the variable with the highest age (Gent and Walsh, 1993).
h3: WalkSAT: Select a random broken clause BC. If any variables in BC have a negative gain of zero, randomly select one of these to flip. If no such variable exists, flip a random variable in BC with probability 0.5; otherwise flip the variable with minimal negative gain (Selman et al, 1994).
h4: Novelty: Select a random broken clause BC. Flip the variable v with the highest net gain, unless v has the minimal age in BC. If this is the case, then flip it with probability 0.3; otherwise flip the variable with the second highest net gain (McAllester et al, 1997).
Ruin-recreate heuristics

h5: A proportion of the variables is randomly reinitialised.

Local search heuristics

h6: A first-improvement local search. In each iteration, flip a variable selected completely at random.
h7: A first-improvement local search. In each iteration, flip a randomly selected variable from a randomly selected broken clause.

Crossover heuristics

h8: Standard one-point crossover on the boolean strings of variables.
h9: Standard two-point crossover on the boolean strings of variables.

3.2 One-Dimensional Bin Packing

3.2.1 Description

Problem formulation: The classical one-dimensional bin packing problem consists of a set of pieces, which must be packed into as few bins as possible. Each piece j has a weight w_j, and each bin has capacity c. The objective is to minimise the number of bins used, where each piece is assigned to one bin only, and the weight of the pieces in each bin does not exceed c. To avoid large plateaus in the search space around the best solutions, we employ an alternative fitness function to the number of bins. A mathematical formulation of the bin packing problem is shown in equation (3), taken from (Martello and Toth, 1990).

    Minimise      \sum_{i=1}^{n} y_i
    subject to    \sum_{j=1}^{n} w_j x_{ij} \le c y_i,    i \in N = \{1, \dots, n\},
                  \sum_{i=1}^{n} x_{ij} = 1,              j \in N,
                  y_i \in \{0, 1\},                       i \in N,
                  x_{ij} \in \{0, 1\},                    i \in N, j \in N,        (3)

where y_i is a binary variable indicating whether bin i contains pieces, x_{ij} indicates whether piece j is packed into bin i, and n is the number of available bins (which is also the number of pieces, as we know we can pack n pieces into n bins).

Solution initialisation: Solutions are initialised by first randomising the order of the pieces, and then applying the 'first-fit' heuristic (Johnson et al, 1974). This is a constructive heuristic, which packs the pieces one at a time, each into the first bin into which it will fit.
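The first-fit construction just described can be sketched in a few lines of self-contained Java. The code is illustrative rather than taken from the framework; the piece weights and the capacity are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;

public class FirstFitSketch {
    // Pack each piece, in the given order, into the first bin with enough space.
    static List<List<Integer>> firstFit(int[] pieces, int capacity) {
        List<List<Integer>> bins = new ArrayList<>();
        List<Integer> load = new ArrayList<>();     // current fullness of each bin
        for (int piece : pieces) {
            int target = -1;
            for (int i = 0; i < bins.size(); i++) {
                if (load.get(i) + piece <= capacity) { target = i; break; }
            }
            if (target == -1) {                     // no open bin fits: open a new one
                bins.add(new ArrayList<>());
                load.add(0);
                target = bins.size() - 1;
            }
            bins.get(target).add(piece);
            load.set(target, load.get(target) + piece);
        }
        return bins;
    }

    public static void main(String[] args) {
        // Pieces in an (already randomised) order, with capacity 10:
        List<List<Integer>> bins = firstFit(new int[]{5, 6, 4, 3, 2}, 10);
        System.out.println(bins);   // → 3 bins: [[5, 4], [6, 3], [2]]
    }
}
```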
Objective function: A solution is given a fitness calculated from equation 4, where n = number of bins, fullness_i = sum of the sizes of all the pieces in bin i, and C = bin capacity. The function puts a premium on bins that are filled completely, or nearly so. It returns a value between zero and one, where lower is better, and a set of completely full bins would return a value of zero.

    Fitness = 1 − ( Σ_{i=1}^{n} (fullness_i / C)^2 ) / n          (4)

Instance data: The ten training instances and their sources are summarised in Table 3.

Table 3 Bin packing instances

      name                       source          capacity   no. pieces
   1  falkenauer/u1000-00        ESICUP (2011)    150        1000
   2  falkenauer/u1000-01        ESICUP (2011)    150        1000
   3  schoenfieldhard/BPP14      ESICUP (2011)   1000         160
   4  schoenfieldhard/BPP832     ESICUP (2011)   1000         160
   5  10-30/instance1            Hyde (2011)      150        2000
   6  10-30/instance2            Hyde (2011)      150        2000
   7  triples1002/instance1      Hyde (2011)     1000        1002
   8  triples2004/instance1      Hyde (2011)     1000        2004
   9  test/testdual4/binpack0    ESICUP (2011)    100        5000
  10  test/testdual7/binpack0    ESICUP (2011)    100        5000

3.2.2 Search Operators

This domain contains a total of 8 search operators, some of which are taken from (Bai et al, 2007).

Mutational heuristics

h1: Select two different pieces at random, and swap them if there is space. If one of the pieces does not fit into the new bin, then put it into an empty bin.
h2: Select a bin at random from those with more pieces than the average, and split it into two bins, each containing half of the pieces from the original bin.
h3: Remove all of the pieces from the lowest filled bin, and repack them into the other bins if possible, with the best-fit heuristic.

Ruin-recreate heuristics

h4: Remove all the pieces from the x highest filled bins, where x is an integer determined by the 'intensity of mutation' parameter. Repack the pieces using the best-fit heuristic.
h5: Remove all the pieces from the x lowest filled bins, where x is an integer determined by the 'intensity of mutation' parameter. Repack the pieces using the best-fit heuristic.

Local search heuristics

These heuristics implement first-improvement local search operators. In each iteration, a neighbour is generated, and it is accepted immediately if it has superior or equal fitness. If the neighbour is worse, then the change is not accepted.

h6: A first-improvement local search. In each iteration, select two different pieces at random, and swap them if there is space, and if doing so produces an improvement in fitness.
h7: A first-improvement local search. Take the largest piece from the lowest filled bin, and exchange it with a smaller piece from a randomly selected bin. If there is no such piece that produces a valid packing after the swap, then exchange the first piece with two pieces that have a smaller total size. If there are no such pieces, then the heuristic does nothing.

Crossover heuristics

h8: Exon shuffling crossover (Rohlfshagen and Bullinaria, 2007). The bins from both parents are ordered by wasted space, least first. Then all of the mutually exclusive bins are added to the offspring. In the second phase, the remaining bins from the parents are added to the offspring by removing any duplicate pieces.

3.3 Permutation Flow Shop

3.3.1 Description

Problem formulation: The permutation flow shop problem consists of finding the order in which n jobs are to be processed on m consecutive machines. The jobs are processed in the order machine 1, machine 2, ..., machine m. Machines can only process one job at a time, and jobs can be processed by only one machine at a time. No job can jump over any other job, meaning that the order in which jobs are processed on machine 1 is maintained throughout the system. Moreover, no machine is allowed to remain idle when a job is ready for processing.
All jobs and machines are available at time 0. Each job i requires a processing time on machine j denoted by p_{ij}. Given a permutation π = π(1), ..., π(n), where π(q) is the index of the job assigned to the q-th place, a unique schedule is obtained by calculating the starting and completion time of each job on each machine. The starting time start_{π(q),j} of the q-th job on machine j is calculated as:

    start_{π(q),j} = max{ C_{π(q),j−1}, C_{π(q−1),j} },  with C_{π(0),j} = 0 and C_{π(q),0} = 0,

and its completion time is calculated as:

    C_{π(q),j} = start_{π(q),j} + p_{π(q),j}.

Given a schedule, let C_i be the time when job i finishes its processing on machine m. The objective is to find the processing order of the n jobs such that the resultant schedule minimises the completion time of the last job to exit the shop, i.e. minimises max_i C_i.

Solution initialisation: Solutions are created with a randomised version of the widely used NEH algorithm (Nawaz et al, 1983), which works as follows. First, a random permutation of the jobs is generated. Second, a schedule is constructed from scratch by assigning the first job in the permutation to an empty schedule; the second job is then tried in places 1 and 2 and fixed where the partial schedule has the smallest makespan; the third job is tried in places 1, 2 and 3 and fixed to the place where the partial schedule has the smallest makespan, and so on.

Objective function: The fitness function returns max_i C_i, the completion time of the last job in the schedule.

Instance data: The ten training instances and their sources are summarised in Table 4.

3.3.2 Search Operators

A total of 15 search operators are implemented for this problem domain.

Mutational heuristics

h1: Reinserts a randomly selected job into a randomly selected position in the permutation, shifting the rest of the jobs as required.
h2: Swaps two randomly selected jobs in the permutation.
h3: Randomly shuffles the entire permutation.

Table 4 Permutation flow shop instances

      instance name   source            no. jobs   no. machines
   1  100x20/1        Taillard (2010)    100        20
   2  100x20/2        Taillard (2010)    100        20
   3  100x20/3        Taillard (2010)    100        20
   4  100x20/4        Taillard (2010)    100        20
   5  100x20/5        Taillard (2010)    100        20
   6  200x10/2        Taillard (2010)    200        10
   7  200x10/3        Taillard (2010)    200        10
   8  500x20/1        Taillard (2010)    500        20
   9  500x20/2        Taillard (2010)    500        20
  10  500x20/4        Taillard (2010)    500        20

h4: Creates a new solution using NEH (described above), using the current permutation to rank the jobs.
h5: Shuffles k randomly selected elements of the permutation, where k = 2 + ⌊α·(n−2)⌋, and α is the mutation intensity parameter.

Ruin-recreate heuristics

h6: Removes l = ⌊α·(n−1)⌋ randomly selected jobs and reinserts them in an NEH fashion. This heuristic resembles the main component of the iterated greedy heuristic proposed by Ruiz and Stützle (2007b) for the permutation flow shop, and later by Ruiz and Stützle (2007a) for the permutation flow shop with sequence dependent setup times.
h7: Removes l randomly selected jobs, where l is as above, and reinserts them in an NEH fashion, but this time, at every iteration of the NEH procedure, the best q = ⌊β·(l−1)⌋ + 1 sequences generated so far are considered for the reinsertion.

Local search heuristics

h8: This is a steepest descent local search. At every iteration, each job is removed from its current position and tried in all remaining positions. The job is fixed to the position that leads to the best schedule. This is repeated until no improvement is observed.
h9: This is a first improvement local search. At every iteration, each job is removed from its current position and tried in the remaining positions.
This time, if an improving move is found, it is immediately accepted, and the search continues with the next job. This is repeated until no improvement is observed.
h10: This is a random single local search pass. Here, r = ⌊β·(n−1)⌋ + 1 randomly selected jobs are tested (one at a time) in all positions and fixed to the best possible place. This is only done once.
h11: This is a first improvement random single local search pass. This is as h9, but jobs are assigned to the first place that improves the current schedule, i.e. jobs are not necessarily tested in all positions. This is only done once.

Crossover heuristics

The following crossover heuristics take two permutations as input and return a single new permutation as offspring. These operators have been designed for permutation representation problems, including scheduling problems.

h12: Order crossover (OX): proposed by Davis (1985) for order-based permutation problems. It builds an offspring permutation by choosing a subsequence of a solution from one parent and preserving the relative order of elements from the other parent. The OX operator exploits the property that the relative order of the elements (as opposed to their specific positions) is important.
h13: Partially mapped crossover (PMX): first proposed by Goldberg and Lingle (1985) as a recombination operator for the travelling salesman problem (TSP). It builds an offspring by choosing a subsequence of a tour from one parent and preserving the order and position of as many elements (cities in the case of the TSP) as possible. The subsequence is selected by randomly choosing two cut points, which serve as boundaries for the swapping operations.
h14: Precedence preservative crossover (PPX): independently developed for vehicle routing problems by Blanton and Wainwright (1993), and for scheduling problems by Bierwirth et al (1996).
PPX transmits the precedence relations of operations given in the two parental permutations to one offspring at the same rate, while no new precedence relations are introduced.
h15: This operator selects a single crossover point and produces a new permutation by copying all of the elements from one parent, up to the crossover point. The remaining elements are then copied from the other parent, in the order in which they appear.

3.4 Personnel Scheduling

3.4.1 Description

Problem formulation: Most of the personnel scheduling instances could justifiably be labelled as a new and different problem, rather than just a different instance. This is because most instances contain unique constraints and objectives, not just different instance parameters (such as the number of employees, shift types, planning period length, constraint priorities etc.). The reason for this variety is that each instance is taken from a different organisation or workplace, and each workplace has its own set of rules and requirements. However, there is clearly a similar structure between instances, and there are some constraints that are nearly always present; for example, cover constraints, holiday requests, and maximum and minimum workloads. The result of this variety, though, is that it is arguably impossible to provide a standard mathematical model for 'The Personnel Scheduling Problem', and we will not attempt to do so here. However, for more information on the constraints and objectives present in the instances used here (and an integer programming formulation of one of them), we refer the reader to Curtois (2010).

Solution initialisation: The solution is initialised using local search heuristic h5, which adds shifts to each employee's schedule in a greedy, first improvement manner.

Instance data: The instances used are listed in Table 5.

3.4.2 Search Operators

A total of 12 search operators are implemented for this problem domain.
Mutational heuristics

h1: This heuristic randomly un-assigns a number of shifts. The number of shifts un-assigned is proportional to the intensity of mutation parameter.

Table 5 Personnel scheduling instances

      name                     source                     staff   shift types   days
   1  BCV-3.46.1               Curtois (2009)              46      3             26
   2  BCV-A.12.2               Curtois (2009)              12      5             31
   3  ORTEC02                  Curtois (2009)              16      4             31
   4  Ikegami-3Shift-DATA1     Ikegami and Niwa (2003)     25      3             30
   5  Ikegami-3Shift-DATA1.1   Ikegami and Niwa (2003)     25      3             30
   6  Ikegami-3Shift-DATA1.2   Ikegami and Niwa (2003)     25      3             30
   7  CHILD-A2                 Curtois (2009)              41      5             42
   8  ERRVH-A                  Curtois (2009)              51      8             42
   9  ERRVH-B                  Curtois (2009)              51      8             42
  10  MER-A                    Curtois (2009)              54     12             42

Ruin-recreate heuristics

The ruin and recreate heuristics implemented are based on the one presented by Burke et al (2008). The heuristic works by un-assigning all the shifts in one or more randomly selected employees' schedules before heuristically rebuilding them. They are rebuilt by firstly satisfying objectives related to requests to work certain days or shifts, and then by satisfying objectives related to weekends (for example, min/max weekends on/off, min/max consecutive working or non-working weekends, both days of the weekend on or off, etc.). Other shifts are then added to the employee's schedule in a greedy fashion (first improvement), attempting to satisfy the rest of the objectives.

h2: Burke et al (2008) observed that it was best to un-assign and rebuild only 2-6 work patterns at a time (for instances of all sizes). For this reason, the first ruin and recreate heuristic un-assigns x schedules, where x is calculated from the intensity of mutation parameter as follows: x = round(intensityOfMutation * 4) + 2.
h3: This heuristic provides a larger change to the solution by setting x using: x = round(intensityOfMutation * number of employees in the roster).
h4: This heuristic creates a small perturbation in the solution by using x = 1.
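The three mappings from the intensity-of-mutation parameter to x above can be summarised in code. This is an illustrative Python sketch; the function name and signature are hypothetical (HyFlex itself is a Java framework).

```python
def schedules_to_rebuild(heuristic, intensity_of_mutation, num_employees):
    """Number of employee schedules x that the ruin-recreate heuristics
    un-assign and rebuild, given the intensity-of-mutation parameter
    (a value in [0, 1])."""
    if heuristic == "h2":   # small ruin: always between 2 and 6 schedules
        return round(intensity_of_mutation * 4) + 2
    if heuristic == "h3":   # large ruin: up to the whole roster
        return round(intensity_of_mutation * num_employees)
    if heuristic == "h4":   # minimal perturbation: a single schedule
        return 1
    raise ValueError(heuristic)
```

For example, with the intensity of mutation at its maximum of 1.0, h2 rebuilds 6 schedules, while h3 rebuilds the entire roster.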
Local search heuristics

h5: This is a first improvement local search which adds shifts to employees' schedules.
h6: This is a first improvement local search which swaps shifts between two different employees. An example of the type of swap this local search may make is shown in Figure 3. The figure shows a section of a roster covering the first ten days of the schedules of four employees: 'A', 'B', 'C' and 'D'. The coloured squares labelled 'D', 'E' and 'N' denote three different shift types (Day, Early and Night).
h7: This is a first improvement local search which swaps shifts within a single employee's schedule. An example of the type of swap this local search may make is shown in Figure 4.
h8: This is based on the ejection chain method described by Burke et al (2007). The maximum search time for it is set as the depth of search parameter multiplied by 5 seconds.
h9: This is another version of the ejection chain method, which incorporates a greedy heuristic method for generating entire schedules for single employees. The maximum search time for it is set as the depth of search parameter multiplied by 5 seconds.

Fig. 3 An example of the types of swap made by h6
Fig. 4 An example of the types of swap made by h7

Crossover heuristics

h10: This heuristic was presented by Burke et al (2001). It operates by identifying the best x assignments in each parent and making these assignments in the offspring. The best assignments are identified by measuring the change in objective function when each shift is temporarily unassigned in the roster; the best assignments are those that cause the largest increase in the objective function value when they are unassigned. The parameter x ranges from 4 to 20, and is calculated from the intensity of mutation parameter as follows: x = 4 + round((1 - intensityOfMutation) * 16).
h11: This heuristic was published in (Burke et al, 2010b).
It creates a new roster using all the assignments made in the parents. It makes those that are common to both parents first, and then alternately selects an assignment from each parent and makes it in the offspring, unless the cover objective is already satisfied.
h12: This heuristic creates the new roster by making only those assignments which are common to both parents.

4 Algorithms

This section presents three example algorithms created within the HyFlex software framework. We present these algorithms in order to show the range of algorithms that can be easily implemented in HyFlex. The results of these three algorithms are presented in section 5, to show the diversity of their results across the different problem instances and problem domains. Recall from section 2.1.1 that HyFlex problem domains are always implemented as minimisation problems, so a lower fitness is superior.

4.1 Iterated Local Search

Iterated local search is a relatively straightforward algorithm. As often happens with many simple but sometimes very effective ideas, the same principle has been rediscovered multiple times, leading to different names (Baxter, 1981; Martin et al, 1992). The term iterated local search was proposed by Lourenço et al (2002). The implementation reported here, first proposed in (Burke et al, 2010a), contains a perturbation stage during which a neighbourhood move is selected uniformly at random (from the available pool) and applied to the incumbent solution. This perturbation phase is then followed by an improvement phase, in which all local search heuristics are tested and the one producing the best improvement is used. If the resulting new solution is better than the original solution, then it replaces the original solution; otherwise the new solution is simply discarded. This last stage corresponds to a greedy (improvements only) acceptance criterion.
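The loop just described can be sketched as follows. This is an illustrative Python sketch under assumed interfaces (plain callables for the heuristics and the objective function); it is not the HyFlex implementation.

```python
import random
import time

def iterated_local_search(initial, mutations, local_searches, f, seconds):
    """Iterated local search: perturb the incumbent with a randomly chosen
    move, then take the best outcome over all local-search heuristics,
    accepting the result only if it improves on the incumbent."""
    best = initial
    deadline = time.monotonic() + seconds
    while time.monotonic() < deadline:
        # perturbation phase: one move chosen uniformly at random
        perturbed = random.choice(mutations)(best)
        # improvement phase: try every local search, keep the best result
        candidate = min((ls(perturbed) for ls in local_searches), key=f)
        if f(candidate) < f(best):  # greedy acceptance: improvements only
            best = candidate
    return best
```

A toy usage, minimising |x| over the integers with one perturbation and one halving "local search": iterated_local_search(100, [lambda x: x + random.randint(-2, 2)], [lambda x: x // 2 if x > 0 else -(-x // 2)], abs, 0.2).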
The pseudo-code of this iterated local search algorithm is shown below (Algorithm 3).

Algorithm 3 Iterated Local Search
  s0 = GenerateInitialSolution
  s* = LocalSearch(s0)
  repeat
    s' = Perturbation(s*)
    s*' = LocalSearch(s')
    if f(s*') < f(s*) then
      s* = s*'
    end if
  until time limit is reached

4.2 Tabu Search Hyper-heuristic with Adaptive Acceptance (TS-AA)

The functionality of this hyper-heuristic can be split into two parts: the heuristic selection mechanism and the move acceptance criterion. The pseudocode for TS-AA can be seen in Algorithm 4.

Heuristic selection mechanism for TS-AA: This hyper-heuristic implements the heuristic selection mechanism proposed in (Burke et al, 2003b). The algorithm maintains a value for each of the problem-specific heuristics, excluding the crossover heuristics, which are not used at all by this hyper-heuristic. A heuristic's value represents how well it has performed recently, and all heuristics have a value of zero at the beginning of the search. The mechanism also incorporates a dynamic tabu list of problem-specific heuristics that are temporarily excluded from the available heuristics.

At each iteration, the heuristic with the highest value is selected (breaking ties randomly) from those not in the tabu list. Therefore, the heuristics which have performed well recently will be chosen more often. If the heuristic finds a better solution, then its value is increased. If it finds a worse solution, its value is decreased.

Acceptance criterion for TS-AA: The acceptance criterion accepts all improving solutions. Other solutions are accepted with a probability β, which changes depending on whether the search appears to be progressing or stuck in a local optimum. The β value begins at zero; thus, initially, it is an accept-only-improving strategy.
However, if the solution does not improve for 0.1 seconds, then β is increased by 5%, making it more likely that a worse solution is accepted. It is increased to 10% if there is no further improvement in the next 0.1 seconds. Conversely, if the search is progressing well, with no decrease in fitness in the last 0.1 seconds, then β is reduced by 5%, making it less likely for a worse solution to be accepted. These modifications are intended to help the search navigate out of local optima, and to focus the search when it is progressing well.

4.3 Memetic Algorithm

This algorithm illustrates a population-based approach implemented with HyFlex. It represents a steady-state evolutionary algorithm that incorporates multiple memes (a memetic algorithm). The pseudocode is given in Algorithm 5. First, a population of 10 solutions is generated, each one initialised with the initialiseSolution() method provided by each problem domain. Two solutions are selected with a binary tournament method, and then a crossover heuristic (selected uniformly at random from the available set) is applied to produce one offspring. With probability 0.1, the offspring is perturbed with a mutation heuristic (selected uniformly at random from the available set). Then the solution is further modified with either a local search heuristic or a ruin-recreate heuristic, each chosen with probability 0.5 (and again selected uniformly at random from the available set). If the new solution is equal to or better than the worst of the parents, then the offspring replaces it.

5 Experiments and Results

This section compares the three algorithms described in section 4, implemented with HyFlex. Exactly the same algorithms are used for each domain and instance; no domain-specific (or instance-specific) tuning process is applied. The goal is not to determine which is the best performing algorithm, but instead to illustrate the behaviour of different algorithmic designs in HyFlex.
The 10 training instances for each domain, as described in section 2 (Tables 2-5), were considered. For each instance and algorithm, 5 runs were conducted, each lasting 10 CPU minutes. This experimental setup resembles that designed for the CHeSC competition. The experiments were conducted on a PC (running Windows XP) with a 2.33GHz Intel(R) Core(TM)2 Duo CPU and 2GB of RAM. The following subsections present our results from three different perspectives: ordinal data analysis (5.1), the distribution of best objective function values on one selected instance per domain (5.2), and performance behaviour over time on one example instance of the bin packing domain (5.3).

Algorithm 4 The Tabu Search Hyper-heuristic with Adaptive Acceptance
  Create an initial solution s
  Initialise the value of each heuristic to 0
  α = 1
  β = 0
  t = tabu tenure = number of heuristics − 1
  repeat
    Create a copy of the current solution: s' ← s
    H = heuristic with highest value (from those not in the tabu list)
    apply H to s'
    if func(s') < func(s) then          {the new solution is superior}
      increaseValue(H, α)
    else if func(s') > func(s) then     {the new solution is worse}
      empty the tabu list
      decreaseValue(H, α)
      add H to the tabu list
    else if func(s') = func(s) then
      add H to the tabu list
      release heuristics in the tabu list for longer than t iterations
    end if
    if func(s') < func(s) then
      s ← s'
    else
      if random[1, 100] < β then
        s ← s'
      end if
    end if
    if 0.1s since last improvement then          {make it more likely to accept worse solutions}
      β ← β + 5
    end if
    if 0.1s since last decrease in fitness then  {make it less likely to accept worse solutions}
      β ← β − 5
    end if
  until time limit is reached

5.1 Borda count

Ordinal data analysis methods can be applied to compare alternative search algorithms or metaheuristics (Talbi, 2009).
This approach is appropriate because our empirical study considers different domains and instances with varied magnitudes and ranges of the objective values. Let us assume that m instances (considering all the domains) and n competing algorithms in total are considered. For each experiment (instance), an ordinal value o_k is given, representing the rank of the algorithm compared to the others (1 ≤ o_k ≤ n). Ordinal methods aggregate and summarise the m linear orders o_k into a single linear order O. We use here a straightforward ordinal aggregation method known as the Borda count voting method (after the French mathematician Jean-Charles de Borda, who first proposed it in 1770). An algorithm having rank o_k on a given instance is simply given o_k points, and the total score of an algorithm is the sum of its ranks o_k across the m instances. The methods are, therefore, compared according to their total score, with the smallest score representing the best performing algorithm. In our comparative study, the number of instances, m, is 40 (10 for each domain). Therefore, for a given domain the best possible score is 10, while the best possible total score (considering all the domains) is 40.

Algorithm 5 Memetic algorithm
  population = Create an initial population of 10 solutions
  repeat
    s1 = binaryTournament(population)
    s2 = binaryTournament(population)
    h = randomly selected crossover heuristic
    s' = applyHeuristic(h, s1, s2)
    Apply a randomly selected mutation heuristic to s'   {with probability 0.1}
    if rand < 0.5 then
      h = randomly selected local search heuristic
    else
      h = randomly selected ruin-recreate heuristic
    end if
    applyHeuristic(h, s')
    if f(s1) worse than f(s2) then
      s1 ← s'
    else
      s2 ← s'
    end if
  until time limit is reached

Table 6 Borda count results for all domains

  Domain                   TS-AA   ILS   MA
  MAX-SAT                   12      27    21
  1D Bin Packing            24      17    19
  Permutation Flow Shop     30      17    13
  Personnel Scheduling      13      16    30
  Total                     79      77    83
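The Borda aggregation described above can be sketched as follows. This is an illustrative Python sketch; note that ties are broken arbitrarily here, whereas Table 10 assigns equal ranks to tied algorithms.

```python
def borda_scores(objective_values):
    """Aggregate per-instance results into Borda scores.

    `objective_values[i][a]` is the (median) objective value of algorithm
    `a` on instance `i`; lower is better.  Each algorithm receives its
    rank (1 = best) on every instance, and its score is the sum of those
    ranks, so the smallest total identifies the best-performing algorithm.
    """
    n_alg = len(objective_values[0])
    scores = [0] * n_alg
    for row in objective_values:
        # algorithms ordered from best (lowest objective) to worst
        ranking = sorted(range(n_alg), key=lambda a: row[a])
        for rank, a in enumerate(ranking, start=1):
            scores[a] += rank
    return scores
```

For example, borda_scores([[3.0, 1.0, 2.0], [5.0, 6.0, 4.0]]) returns [5, 4, 3]: the third algorithm, ranked second and first, has the smallest total and is the overall winner.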
The ranks were calculated using, as a metric, the median of the best objective function values obtained across the 5 runs per instance. Table 6 shows the total Borda scores for the three competing algorithms, including the total scores per domain. Notice that although TS-AA produces the best scores in two domains (MAX-SAT and personnel scheduling), the ILS algorithm obtains the best overall score, albeit by a minimal difference. Tables 7-10 show the Borda count (ranks) for each instance on the four domains, where 1 represents the best rank. These tables are useful to assess how homogeneous the results are over the ten instances of each domain. For example, for permutation flow shop and personnel scheduling (Tables 9-10) a single algorithm consistently ranks third, whereas this is not the case for MAX-SAT and bin packing (Tables 7-8).

Table 7 Borda count results for MAX-SAT

  MAX-SAT      TS-AA   ILS   MA
  Instance1     2       3     1
  Instance2     2       3     1
  Instance3     1       3     2
  Instance4     1       2     3
  Instance5     1       2     3
  Instance6     1       2     3
  Instance7     1       3     2
  Instance8     1       3     2
  Instance9     1       3     2
  Instance10    1       3     2
  Total        12      27    21

Table 8 Borda count results for 1D Bin Packing

  Bin Packing   TS-AA   ILS   MA
  Instance1      3       1     2
  Instance2      3       1     2
  Instance3      3       2     1
  Instance4      2       3     1
  Instance5      2       1     3
  Instance6      3       1     2
  Instance7      3       1     2
  Instance8      3       1     2
  Instance9      1       3     2
  Instance10     1       3     2
  Total         24      17    19

Table 9 Borda count results for permutation flow shop

  Flow Shop     TS-AA   ILS   MA
  Instance1      3       2     1
  Instance2      3       1     2
  Instance3      3       2     1
  Instance4      3       2     1
  Instance5      3       2     1
  Instance6      3       2     1
  Instance7      3       2     1
  Instance8      3       1     2
  Instance9      3       1     2
  Instance10     3       2     1
  Total         30      17    13

Table 10 Borda count results for personnel scheduling
  Personnel Sched.   TS-AA   ILS   MA
  Instance1           1       2     3
  Instance2           2       1     3
  Instance3           1       1     3
  Instance4           2       1     3
  Instance5           1       2     3
  Instance6           2       1     3
  Instance7           1       2     3
  Instance8           1       2     3
  Instance9           1       2     3
  Instance10          1       2     3
  Total              13      16    30

5.2 Distribution of the best objective function values

In addition to the Borda aggregation method presented above, the boxplots shown in figures 5-8 illustrate the magnitude and distribution of the best objective values (at the end of the run) for a selected instance of each domain. Each figure represents the result of 10 runs of each algorithm. We arbitrarily selected instance number 1 from each domain, but similar distributions of results can be observed on the other instances. From figures 5-8, it can be observed that the performance of the three algorithms differs significantly over the four problem instances. For example, on the MAX-SAT instance (figure 5) the memetic algorithm (MA) performs the best, while it performs the worst on the personnel scheduling instance (figure 8). The tabu search hyper-heuristic (TS-AA) clearly performs the worst on the bin packing and flow shop instances (figures 6-7), but performs the best on the personnel scheduling instance. The scale of figure 8 makes it difficult to see the difference between TS-AA and ILS; this is because the personnel scheduling domain applies penalties to solutions that violate the constraints, and the memetic algorithm produced poor solutions on this instance. In summary, these boxplots show that it is challenging to design an algorithm which operates well over all the problem domains. When an algorithm improves on one domain, its solution quality may reduce on another domain. This can also be true within a single domain, when an algorithm improves on a particular problem instance while its performance reduces on other instances of that domain.
The challenge is to design online learning mechanisms that can adapt on the fly, and thus select the most adequate heuristic at each decision step, using the feedback gathered from the search process.

Fig. 5 Distribution of objective function values for MAX-SAT instance 1: contest02-Mat26.sat05-457.reshuffled-07
Fig. 6 Distribution of objective function values for bin packing instance 1: falkenauer/u1000-00
Fig. 7 Distribution of objective function values for permutation flow shop instance 1: 100x20/1
Fig. 8 Distribution of objective function values for personnel scheduling instance 1: BCV-3.46.1

5.3 Progress of algorithms during a run

Figure 9 shows the progress of the three algorithms during one 10 minute run on instance 1 of bin packing. A lower fitness value represents a better solution. This information is easily available from within HyFlex by calling the getFitnessTrace() method, and it is automatically recorded during the run.

Fig. 9 1D bin packing trace on instance 1, showing the progress of the three example algorithms over 10 minutes

The traces show that the performance of the algorithms can differ greatly depending on how long they are left to run. Iterated local search (ILS) and the memetic algorithm (MA) both finish the run at approximately the same fitness; however, the memetic algorithm finds better quality solutions more quickly. The tabu search hyper-heuristic (TS-AA) begins the run by finding better solutions than ILS, but TS-AA stagnates, and by the end of the run ILS has found a better solution. This ability to easily obtain useful information for analysis is another way in which HyFlex can save a significant amount of time for researchers.

6 Conclusions

This paper has presented and described the HyFlex software framework for the development of cross-domain heuristic search methodologies.
HyFlex provides multiple problem domains, each containing a set of problem instances and search operators to apply. Therefore, it represents a novel extension of the notion of a benchmark for combinatorial optimisation, with which cross-domain algorithms can be easily developed and reliably compared. Researchers from different communities and themes within computer science, artificial intelligence and operational research can potentially benefit from HyFlex, as it provides a common benchmark on which to test the performance and behaviour of single-point and population-based self-configuring search heuristics. When using HyFlex, researchers can concentrate their efforts on designing their adaptive methodologies, rather than implementing the required set of problem domains.

This paper describes the architecture of HyFlex, including examples of how to create and run hyper-heuristics within the framework. The four problem domains are presented and discussed, and three example hyper-heuristics are analysed, together with their results. The results show that the hyper-heuristics all perform differently on the four problem domains; no one algorithm is superior to the other two on all four. Although these are not state-of-the-art adaptive algorithms, the results suggest that there is still considerable scope for future research in designing adaptive and self-configuring algorithms that can learn from the search process and select the most suitable search operators.

There is currently ample evidence that HyFlex is useful to the research community, given the number of researchers who are currently employing it for their research and teaching. The HyFlex framework was made publicly available in August 2010. By May 2011, the software had been downloaded over 460 times, and the associated web pages describing it had been visited over 11,844 times.
The community has also responded well to a call for participation in the International Cross-domain Heuristic Search Challenge (CHeSC), which would not be possible without the HyFlex software. In May 2011, the competition had 43 registered participants and teams from 23 different countries. HyFlex can be extended to include new domains, additional instances and operators in existing domains, and multi-objective and dynamic problems. The current software interface can also be extended to incorporate additional feedback information from the domains, to guide the adaptive search controllers. It is our vision that the HyFlex framework will continue to facilitate and increase international interest in developing adaptive heuristic search methodologies that can find wider application in practice.

References

Argelich J, Li CM, Manya F, Planes J (2009) MaxSAT evaluation 2009 benchmark data sets. Website, http://www.maxsat.udl.cat/
Bai R, Blazewicz J, Burke EK, Kendall G, McCollum B (2007) A simulated annealing hyper-heuristic methodology for flexible decision support. Tech. rep., University of Nottingham
Battiti R (1996) Reactive search: Toward self-tuning heuristics. In: Rayward-Smith VJ, Osman IH, Reeves CR, Smith GD (eds) Modern Heuristic Search Methods, John Wiley & Sons Ltd., Chichester, pp 61–83
Battiti R, Brunato M, Mascia F (2009) Reactive Search and Intelligent Optimization, Operations Research/Computer Science Interfaces Series, vol 45. Springer
Baxter J (1981) Local optima avoidance in depot location. Journal of the Operational Research Society 32:815–819
Beasley JE (2010) OR-Library: collection of test data sets for a variety of operations research (OR) problems. Website, http://people.brunel.ac.uk/~mastjjb/jeb/info.html
Bierwirth C, Mattfeld DC, Kopfer H (1996) On permutation representations for scheduling problems.
In: Voigt H, Ebeling W, Rechenberg I, Schwefel H (eds) LNCS 1141: Proceedings of the 4th Parallel Problem Solving from Nature Conference (PPSN'96), Berlin, Germany, pp 310–318
Blanton JL Jr, Wainwright RL (1993) Multiple vehicle routing with time and capacity constraints using genetic algorithms. In: Proceedings of the 5th International Conference on Genetic Algorithms, Morgan Kaufmann Publishers Inc., San Francisco, CA, USA, pp 452–459, URL http://portal.acm.org/citation.cfm?id=645513.657758
Bleuler S, Laumanns M, Thiele L, Zitzler E (2003) PISA - a platform and programming language independent interface for search algorithms. In: Conference on Evolutionary Multi-Criterion Optimization (EMO 2003), Springer, Berlin, LNCS, vol 2632, pp 494–508
Braysy O (2002) A reactive variable neighborhood search for the vehicle-routing problem with time windows. INFORMS Journal on Computing 15(4):347–368
Burke EK, Cowling P, De Causmaecker P, Vanden Berghe G (2001) A memetic approach to the nurse rostering problem. Applied Intelligence 15(3):199–214
Burke EK, Hart E, Kendall G, Newall J, Ross P, Schulenburg S (2003a) Hyper-heuristics: An emerging direction in modern search technology. In: Glover F, Kochenberger G (eds) Handbook of Metaheuristics, Kluwer, pp 457–474
Burke EK, Kendall G, Soubeiga E (2003b) A tabu-search hyper-heuristic for timetabling and rostering. Journal of Heuristics 9(6):451–470
Burke EK, Curtois T, Qu R, Vanden Berghe G (2007) A time predefined variable depth search for nurse rostering. Tech. rep., School of Computer Science, University of Nottingham
Burke EK, Curtois T, Post G, Qu R, Veltman B (2008) A hybrid heuristic ordering and variable neighbourhood search for the nurse rostering problem. European Journal of Operational Research 188(2):330–341
Burke EK, Curtois T, Hyde M, Kendall G, Ochoa G, Petrovic S, Vazquez-Rodriguez JA, Gendreau M (2010a) Iterated local search vs.
hyper-heuristics: Towards general-purpose search algorithms. In: IEEE Congress on Evolutionary Computation (CEC 2010), Barcelona, Spain, pp 3073–3080
Burke EK, Curtois T, Qu R, Vanden Berghe G (2010b) A scatter search methodology for the nurse rostering problem. Journal of the Operational Research Society 61:1667–1679
Burke EK, Hyde M, Kendall G, Ochoa G, Ozcan E, Woodward J (2010c) A classification of hyper-heuristic approaches. In: Handbook of Metaheuristics, International Series in Operations Research & Management Science, vol 146, Springer, chap 15, pp 449–468. DOI 10.1007/978-1-4419-1665-5
CHeSC (2011) The First Cross-domain Heuristic Search Challenge, CHeSC 2011. Website, http://www.asap.cs.nott.ac.uk/chesc2011/
Cowling P, Kendall G, Soubeiga E (2000) A hyperheuristic approach to scheduling a sales summit. In: Burke EK, Erben W (eds) Proceedings of the 3rd International Conference on the Practice and Theory of Automated Timetabling (PATAT 2000), Konstanz, Germany, pp 176–190
CRIL (2007) SAT competition 2007 benchmark data sets. Centre de Recherche en Informatique de Lens, Website, http://www.cril.univ-artois.fr/SAT07/
CRIL (2009) SAT competition 2009 benchmark data sets. Centre de Recherche en Informatique de Lens, Website, http://www.cril.univ-artois.fr/SAT09/
Curtois T (2009) Staff rostering benchmark data sets. Website, http://www.cs.nott.ac.uk/~tec/NRP/
Curtois T (2010) A HyFlex module for the personnel scheduling problem. Tech. rep., School of Computer Science, University of Nottingham
Davis L (1985) Job shop scheduling with genetic algorithms.
In: Grefenstette JJ (ed) Proceedings of the 1st International Conference on Genetic Algorithms and their Applications, Hillsdale, NJ, USA, pp 136–140
Eiben AE, Michalewicz Z, Schoenauer M, Smith JE (2007) Parameter control in evolutionary algorithms. In: Parameter Setting in Evolutionary Algorithms, Springer, pp 19–46
ESICUP (2011) European Special Interest Group on Cutting and Packing benchmark data sets. Website, http://paginas.fe.up.pt/~esicup/
Fialho A, Costa LD, Schoenauer M, Sebag M (2008) Extreme value based adaptive operator selection. In: Parallel Problem Solving from Nature PPSN X, Lecture Notes in Computer Science, vol 5199, Springer Berlin / Heidelberg, pp 175–184
Fialho A, Da Costa L, Schoenauer M, Sebag M (2010) Analyzing bandit-based adaptive operator selection mechanisms. Annals of Mathematics and Artificial Intelligence, Special Issue on Learning and Intelligent Optimization. DOI 10.1007/s10472-010-9213-y
Fukunaga AS (2008) Automated discovery of local search heuristics for satisfiability testing. Evolutionary Computation (MIT Press) 16(1):31–61
Gent I, Walsh T (1993) Towards an understanding of hill-climbing procedures for SAT. In: Proceedings of the 11th National Conference on Artificial Intelligence (AAAI'93), Washington D.C., USA, pp 28–33
Goldberg DE, Lingle R (1985) Alleles, loci, and the traveling salesman problem. In: Grefenstette JJ (ed) Proceedings of the 1st International Conference on Genetic Algorithms and their Applications, Hillsdale, NJ, USA, pp 154–159
Hyde MR (2011) One dimensional packing benchmark data sets. Website, http://www.cs.nott.ac.uk/~mvh/packingresources.shtml
Ikegami A, Niwa A (2003) A subproblem-centric model and approach to the nurse scheduling problem. Mathematical Programming 97(3):517–541
Jakob W (2006) Towards an adaptive multimeme algorithm for parameter optimisation suiting the engineers' needs.
In: Parallel Problem Solving from Nature - PPSN IX, 9th International Conference, Reykjavik, Iceland, September 9-13, 2006, Proceedings, Springer, Lecture Notes in Computer Science, vol 4193, pp 132–141
Johnson D, Demers A, Ullman J, Garey M, Graham R (1974) Worst-case performance bounds for simple one-dimensional packing algorithms. SIAM Journal on Computing 3(4):299–325
Krasnogor N, Smith JE (2001) Emergence of profitable search strategies based on a simple inheritance mechanism. In: Proceedings of the 2001 Genetic and Evolutionary Computation Conference, Morgan Kaufmann
Lobo FG, Lima CF, Michalewicz Z (eds) (2007) Parameter Setting in Evolutionary Algorithms, Studies in Computational Intelligence, vol 54. Springer
Lourenco HR, Martin O, Stutzle T (2002) Iterated local search. Kluwer Academic Publishers, Norwell, MA, pp 321–353
Martello S, Toth P (1990) Knapsack Problems: Algorithms and Computer Implementations. John Wiley and Sons, Chichester
Martin O, Otto SW, Felten EW (1992) Large-step Markov chains for the TSP incorporating local search heuristics. Operations Research Letters 11(4):219–224
Maturana J, Saubion F (2008) A compass to guide genetic algorithms. In: Proceedings of the 10th International Conference on Parallel Problem Solving from Nature, Springer-Verlag, Berlin, Heidelberg, pp 256–265
Maturana J, Lardeux F, Saubion F (2010) Autonomous operator management for evolutionary algorithms. Journal of Heuristics 16(6):881–909
McAllester D, Selman B, Kautz H (1997) Evidence for invariants in local search. In: Proceedings of the 14th National Conference on Artificial Intelligence (AAAI), Providence, Rhode Island, USA, pp 459–465
Mladenovic N, Hansen P (1997) Variable neighborhood search. Computers and Operations Research 24(11):1097–1100
Nawaz M, Enscore EE Jr, Ham I (1983) A heuristic algorithm for the m-machine, n-job flow-shop sequencing problem.
OMEGA - International Journal of Management Science 11(1):91–95
Neri F, Toivanen J, Cascella GL, Ong YS (2007) An adaptive multimeme algorithm for designing HIV multidrug therapies. IEEE/ACM Transactions on Computational Biology and Bioinformatics 4(2):264–278
Ong YS, Lim MH, Zhu N, Wong KW (2006) Classification of adaptive memetic algorithms: a comparative study. IEEE Transactions on Systems, Man, and Cybernetics, Part B 36(1):141–152
Pisinger D, Ropke S (2007) A general heuristic for vehicle routing problems. Computers and Operations Research 34:2403–2435
Rohlfshagen P, Bullinaria J (2007) A genetic algorithm with exon shuffling crossover for hard bin packing problems. In: Proceedings of the 9th Annual Conference on Genetic and Evolutionary Computation (GECCO'07), London, U.K., pp 1365–1371
Ross P (2005) Hyper-heuristics. In: Burke EK, Kendall G (eds) Search Methodologies: Introductory Tutorials in Optimization and Decision Support Techniques, Springer, chap 17, pp 529–556
Ruiz R, Stützle T (2007a) An iterated greedy heuristic for the sequence dependent setup times flowshop problem with makespan and weighted tardiness objectives. European Journal of Operational Research 187(10):1143–1159
Ruiz R, Stützle T (2007b) A simple and effective iterated greedy algorithm for the permutation flowshop scheduling problem. European Journal of Operational Research 177:2033–2049
Selman B, Levesque H, Mitchell D (1992) A new method for solving hard satisfiability problems. In: Proceedings of the 10th National Conference on Artificial Intelligence (AAAI'92), San Jose, CA, USA, pp 440–446
Selman B, Kautz H, Cohen B (1994) Noise strategies for improving local search. In: Proceedings of the 11th National Conference on Artificial Intelligence (AAAI'94), Seattle, WA, USA, pp 337–343
Smith JE (2007) Co-evolving memetic algorithms: A review and progress report.
IEEE Transactions on Systems, Man, and Cybernetics, Part B 37(1):6–17
Taillard E (1993) Benchmarks for basic scheduling problems. European Journal of Operational Research 64(2):278–285
Taillard E (2010) Flow shop benchmark data sets. Website, http://mistic.heig-vd.ch/taillard/
Talbi EG (2009) Metaheuristics: From Design to Implementation. Wiley
TSPLIB (2008) TSPLIB: a library of sample instances for the TSP (and related problems). Website, http://www.iwr.uni-heidelberg.de/iwr/comopt/soft/TSPLIB95/TSPLIB.html
