ELIMINATION from Design to Analysis


Elimination is a word puzzle game for browsers and mobile devices, where all levels are generated by a constrained evolutionary algorithm with no human intervention. This paper describes the design of the game and its level generation methods, and presents an analysis of playtraces from almost a thousand users who have played the game since its release. The analysis corroborates that the level generator creates a sawtooth-shaped difficulty curve, as intended. The analysis also offers insights into player behavior in this game.


💡 Research Summary

The paper “ELIMINATION from Design to Analysis” presents a comprehensive case study on the design, procedural generation, and post-release analysis of “Elimination,” a word puzzle game for browsers and mobile devices.

The core gameplay of Elimination involves being presented with a string of letters (a “challenge”) that does not itself form a word. The player must eliminate (click to remove) some letters so that the remaining letters form a valid English word. Each level consists of 10 such consecutive challenges with a decreasing time limit, encouraging speed but not brute force. A key design philosophy was to avoid blocking player progression; the next level always unlocks automatically regardless of success, making the experience more casual.
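The elimination rule described above can be illustrated with a small sketch. It assumes that the remaining letters keep their original order (no rearranging), so a word is a valid answer exactly when it is a subsequence of the challenge. The function names and the toy dictionary are hypothetical and are not taken from the paper.

```python
def can_form(challenge: str, word: str) -> bool:
    """Return True if `word` is left over after deleting some letters of `challenge`.

    Assumption: letters keep their order, so this is a subsequence test.
    """
    remaining = iter(challenge)
    # `letter in remaining` advances the iterator until it finds the letter,
    # which enforces left-to-right order.
    return all(letter in remaining for letter in word)


def solutions(challenge: str, dictionary: set[str]) -> list[str]:
    """List every dictionary word reachable by eliminating letters."""
    return sorted(w for w in dictionary if can_form(challenge, w))


# Toy example: "pclant" is not a word, but several words hide inside it.
toy_dictionary = {"plant", "cat", "ant", "tan"}
print(solutions("pclant", toy_dictionary))
```

Here "tan" is rejected because its letters appear in the wrong order in the challenge.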

A central contribution of the paper is the detailed description of the level generation system. All 30 levels in the game were generated automatically without human curation using a constrained evolutionary algorithm called FI-2Pop (Feasible-Infeasible Two-Population). The generator works by “mixing” two or three source words into a single challenge string. The evolutionary process is guided by a constraint function that penalizes strings exceeding a target length and a fitness function that aims to create puzzles of a specific discoverability profile. The fitness function calculates all possible English words that can be formed from the challenge string, splits them into “long” and “short” words based on a maxSeq parameter, and then rewards challenges where long words are not immediately visible (not appearing as contiguous subsequences) while short words are more apparent. Five key parameters (corpusFreq, targetLength, maxSeq, number of 2X letters, and source word lengths) were manually tuned by the designers to create an intended “sawtooth” difficulty curve, where difficulty increases over five levels and then drops sharply on the sixth to give players a breather and promote a flow state.
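As a rough illustration of the constraint and fitness ideas described above, the following sketch scores a challenge string. It is not the paper's formula, which the summary does not give. It assumes that a word counts as "long" when it has more than maxSeq letters, and it measures visibility as the longest run of a word's consecutive letters that appears contiguously in the challenge, divided by the word's length. All function names are hypothetical.

```python
def is_subsequence(challenge: str, word: str) -> bool:
    """True if `word` can be formed by deleting letters from `challenge`."""
    remaining = iter(challenge)
    return all(letter in remaining for letter in word)


def longest_visible_run(challenge: str, word: str) -> int:
    """Length of the longest slice of `word` that appears contiguously in `challenge`."""
    best = 0
    for start in range(len(word)):
        # Only look for runs longer than the best one found so far.
        for end in range(start + best + 1, len(word) + 1):
            if word[start:end] in challenge:
                best = end - start
            else:
                break
    return best


def length_violation(challenge: str, target_length: int) -> int:
    """Constraint penalty: how far the string exceeds the target length (0 = feasible)."""
    return max(0, len(challenge) - target_length)


def discoverability_fitness(challenge: str, dictionary: set[str], max_seq: int) -> float:
    """Illustrative fitness: visible short words are rewarded, visible long words penalized."""
    words = [w for w in dictionary if is_subsequence(challenge, w)]
    long_words = [w for w in words if len(w) > max_seq]    # assumed split rule
    short_words = [w for w in words if len(w) <= max_seq]

    def mean_visibility(group: list[str]) -> float:
        if not group:
            return 0.0
        return sum(longest_visible_run(challenge, w) / len(w) for w in group) / len(group)

    return mean_visibility(short_words) - mean_visibility(long_words)


print(discoverability_fitness("pclant", {"plant", "cat", "ant", "tan"}, max_seq=3))
```

In an FI-2Pop setup, strings with a nonzero `length_violation` would sit in the infeasible population and be selected for lower violation, while feasible strings would be selected by fitness.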

Following the public release, playtraces from 975 users over 98 days were collected and analyzed. The analysis served two main purposes: validating the design and understanding player behavior. To validate the difficulty curve, the average player score per level (inverted to represent difficulty) was plotted. The resulting curve showed a rough sawtooth pattern, corroborating the designers’ intent. A linear regression model predicting level score from the five generation parameters achieved an R² of 19.43%, a modest relationship in which the parameters account for roughly a fifth of the variance in level score. Source word frequency and target challenge length were the most significant predictors.
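The difficulty-curve check described above can be sketched as follows. The scores are invented for illustration and are not data from the paper. The inversion used here (highest mean score minus each level's mean score) is an assumption, since the summary only says the score was inverted. Function names are hypothetical.

```python
from statistics import mean


def difficulty_curve(scores_by_level: dict[int, list[float]]) -> dict[int, float]:
    """Map each level to an inverted mean score, so larger values mean harder levels."""
    means = {level: mean(scores) for level, scores in scores_by_level.items()}
    top = max(means.values())
    return {level: top - m for level, m in sorted(means.items())}


def difficulty_drops(curve: dict[int, float]) -> list[int]:
    """Levels whose difficulty is lower than that of the preceding level."""
    levels = sorted(curve)
    return [nxt for prev, nxt in zip(levels, levels[1:]) if curve[nxt] < curve[prev]]


# Made-up scores: difficulty rises over levels 1-5, then drops at level 6.
toy_scores = {
    1: [90, 80], 2: [80, 70], 3: [70, 60], 4: [60, 50],
    5: [50, 40], 6: [85, 75], 7: [75, 65],
}
curve = difficulty_curve(toy_scores)
print(curve)
print(difficulty_drops(curve))
```

With a sawtooth design, the drops would be expected to fall on every sixth level; drops anywhere else indicate places where the generated levels deviate from the intended curve.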

A deeper analysis of individual player choices within challenges yielded further insights. A regression model predicting how often a specific possible word was selected by players revealed that the length of the word (wordLength) and the number of its consecutive letters visible in the challenge (maxSequence) were the most influential factors. Interestingly, players selected longer words more frequently. The authors posit that this is due not only to higher score rewards but also to the game’s mechanics: while eliminating letters to reach a short word, a player often completes a longer word first by accident. The analysis also uncovered an emergent property unknown to the designers: in some challenge strings, certain short words are “impossible” to select, because every order of eliminations that leads to them forms another valid word first.
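The "impossible word" effect can be reproduced with a small search. This sketch assumes that letters are removed one at a time and that a challenge ends as soon as the remaining letters form any dictionary word. The summary implies this rule but does not spell it out. The function names, the challenge string, and the toy dictionary are hypothetical.

```python
def is_subsequence(text: str, word: str) -> bool:
    """True if `word` can be formed by deleting letters from `text`."""
    remaining = iter(text)
    return all(letter in remaining for letter in word)


def reachable(challenge: str, target: str, dictionary: set[str]) -> bool:
    """True if some elimination order reaches `target` before completing another word."""
    seen = set()
    stack = [challenge]
    while stack:
        current = stack.pop()
        if current == target:
            return True
        if current in seen or len(current) <= len(target):
            continue
        seen.add(current)
        if current in dictionary:
            continue  # another word was completed first, so this path ends here
        for i in range(len(current)):
            shorter = current[:i] + current[i + 1:]
            if is_subsequence(shorter, target):  # skip states that lost a needed letter
                stack.append(shorter)
    return False


toy_dictionary = {"in", "ink", "bin"}
print(reachable("bink", "ink", toy_dictionary))  # one elimination away
print(reachable("bink", "in", toy_dictionary))   # blocked by "ink" and "bin"
```

In the toy example, removing either spare letter from "bink" completes "ink" or "bin" before "in" can be reached, so "in" is a valid subsequence that no player could ever select under the assumed rule.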

In conclusion, the paper documents the full lifecycle of a PCG-driven game from design to data-informed analysis. It demonstrates that manual parameter tuning guided by designer intuition can achieve broad design goals like a sawtooth difficulty curve but offers limited precise control, suggesting room for improvement via automated tuning methods. Furthermore, it highlights the importance of collecting detailed playtrace data, which can reveal unexpected player behaviors and hidden properties of the generated content and game rules, offering richer insights than aggregated statistics alone.

