Learning Sequences


Authors: David Eppstein

David Eppstein
Computer Science Department
Donald Bren School of Information & Computer Sciences
University of California, Irvine
eppstein@uci.edu

Abstract. We describe the algorithms used by the ALEKS computer learning system for manipulating combinatorial descriptions of human learners' states of knowledge, generating all states that are possible according to a description of a learning space in terms of a partial order, and using Bayesian statistics to determine the most likely state of a student. As we describe, a representation of a knowledge space using learning sequences (basic words of an antimatroid) allows more general learning spaces to be implemented with similar algorithmic complexity. We show how to define a learning space from a set of learning sequences, find a set of learning sequences that concisely represents a given learning space, generate all states of a learning space represented in this way, and integrate this state generation procedure into a knowledge assessment algorithm. We also describe some related theoretical results concerning projections of learning spaces, decomposition and dimension of learning spaces, and algebraic representation of learning spaces.

1 Introduction

ALEKS (short for Assessment and Learning in Knowledge Spaces) is a computer system, and a company built around that system, which helps students learn knowledge-based academic subjects such as mathematics by assessing their knowledge and providing lessons in concepts that the assessment judges them as ready to learn. Rather than being based on numerical test scores and letter grades, ALEKS is based on a combinatorial description of the concepts known to the student, in the form of a learning space (Doignon and Falmagne, 1999).
In this formulation, feasible states of knowledge of a student are represented as sets of facts or concepts; the task of the assessment routine is to determine which facts the student knows. Not all sets of facts are assumed to form feasible knowledge states; therefore, if the assessment routine can find a sequence of questions, the answers to each of which roughly halve the number of remaining states consistent with the answers, then the student's knowledge can be assessed from a number of questions approximately equal to the logarithm (base two) of the number of feasible knowledge states, a number that may be significantly smaller than the number of facts in the learning space. That is, informally, the learning space model allows the system to make inferences about the student's knowledge of concepts that have not been directly tested, from the information it has about the concepts that have been tested; these inferences can significantly reduce the number of questions needed to accurately assess the student's knowledge, and thereby greatly reduce the tedium of interacting with the system. In addition to speeding students' interactions with the system in this way, the combinatorial knowledge model used by ALEKS allows it to determine sets of concepts which it judges the student ready to learn, and present a selection of lessons based on those concepts to the student, rather than forcing all students to proceed through the curriculum in a rigid linear ordering of lessons.
The actual assessment and inference routine in the ALEKS system is based on a Bayesian formulation in which the system computes likelihoods of each feasible knowledge state from the students' answers, aggregates these values to infer likelihoods that the student can answer each not-yet-asked question, and uses these likelihoods to select the most informative question to ask next. Once the student's knowledge is assessed, the system generates a fringe of concepts that it judges the student ready to learn, by calculating the feasible knowledge states that differ from the assessed state by the addition of a single concept, and presents the students with a selection of online lessons based on that fringe. As of its 2006 implementation, ALEKS's assessment procedure must run in Java, interactively on the user's PC, so efficient computation at interactive speeds is essential to its operation. As we describe in more detail in the next chapter, the ALEKS system uses a data representation for its learning spaces based on a partial order structure of prerequisites for each concept. A clever state generation procedure allows for the efficient generation of states in learning spaces of, typically, 50 to 100 facts. For larger learning spaces, the combinatorial explosion in the total number of states makes it infeasible to generate all states; instead, the system repeatedly samples a subset of the concepts in such a way that each unsampled concept is "near" a sampled one, generates a learning space from the restriction of the prerequisite partial order to the sampled concepts, assesses the student's knowledge of the sample, and uses that sampled assessment to refine the portion of the learning space within which further assessment is judged necessary. Although quite successful, this partial order based definition of a learning space suffers from some flaws.
Primary among these flaws is a lack of adaptability: the structure of prerequisite orderings between concepts must be developed by human knowledge engineers, and is difficult to change in any automated way. From the pattern of responses to ALEKS's assessments, it may be possible to infer that some sets of facts are highly unlikely to be found as knowledge states of students; eliminating such states from the system would help reduce the number of questions required for assessment. More significantly, some concepts may not be readily learnable by the students when the system judges that they are; these concepts should be removed from the selection of online lessons presented to the students to avoid the frustration of unlearnable lessons. Similarly, the pattern of student assessment answers may lead us to conclude that some additional states are present among the students but not available to the system's representation; adding these states to the system would allow for more accurate assessment procedures. The ability to modify the learning space represented in the system is of special interest in the context of internationalization of ALEKS, as we would expect the different educational systems and cultures in different countries to lead to students with different typical knowledge states. It would be of great interest to the ALEKS designers to develop automated adaptation systems that can take advantage of ALEKS's large base of user data and reduce the human engineering effort needed to adapt the system to new concepts and new cultures.
An automated adaptation procedure, based on the generalization of the concept of a fringe from states in a learning space to learning spaces in a system of such spaces, was developed by Thiéry (2001); however, this adaptation procedure was not used by ALEKS because the partial order based learning space representation is insufficiently flexible to allow the generation of new learning spaces by the insertion and removal of states. A secondary flaw relates to the mathematical definition of learning spaces. The spaces representable by partial orders in the implementation of ALEKS do not comprise all possible learning spaces in the theory developed by Doignon and Falmagne (1999), but rather form a significantly restricted class of spaces known as quasi-ordinal spaces, in which the family of feasible sets can be shown to be closed under set unions and set intersections. Closure under set unions is justified pedagogically, both at a local scale of learnability (learning one concept does not preclude the possibility of learning a different one) and more globally (if student x knows one set of concepts, and student y knows another, then it is reasonable to assume that there may be a third student who combines the knowledge of these two students). However, closure under set intersections seems much harder to justify pedagogically: it implies that any concept has a single set of prerequisites, all of which must be learned prior to the concept. On the contrary, in practice it may be that some concepts may be learned via multiple pathways that cannot be condensed into a single set of prerequisites.
The practical effect of this flaw is that the partial order based representation for learning spaces used by ALEKS does not allow certain learning spaces to be defined; instead one must define a larger space formed by the family of all intersections of sets in the desired learning space, and the larger number of sets in this intersection closure of the learning space may lead to inefficiencies in the assessment procedure. In addition, and more seriously, the inability to accurately describe the prerequisite structure for certain concepts may lead to situations where the system incorrectly assesses a student as being ready to learn a concept. In this chapter we outline algorithms and prototype implementations for a more flexible representation, one that would allow the implementation of Thiéry's automatic adaptation procedure and allow more general definitions of learning spaces than the existing partial order based representation allows, while preserving the scalability and efficient implementability of that representation. We believe that these goals may be achieved by using a representation based on learning sequences: orderings of the learning space's concepts into linear sequences in which those concepts could be learned. A learning space may have an enormous number of possible learning sequences, but, as we show, it is possible to correctly and accurately represent any learning space using only a subset of these sequences, and in many cases the number of sequences needed to define the space can be very small. For instance, for the quasi-ordinal spaces currently in use by ALEKS, a representation based on learning sequences can be constructed in which the number of learning sequences equals the maximum number of concepts in the fringe of a feasible state.
We show how to generate efficiently all states of a learning space defined from a set of learning sequences, allowing for similar and similarly efficient assessment procedures to the ones currently used by ALEKS. Additionally, we show how to find efficiently a representation of this type for any learning space, using an optimal number of example sequences, and how to adapt any space defined in this way by adding or removing sets from its family of feasible states. We detail this learning sequence based representation, and the efficient algorithms based on it, after describing in more detail ALEKS's existing partial order based representation. In addition we investigate more generally the theory of and algorithms for learning spaces and related combinatorial structures. In particular we examine the mathematical structure of projections of learning spaces, the extent to which it is possible to decompose learning spaces efficiently into unions of simpler learning spaces, definitions of learning spaces via the algebraic properties of their union operation, and relations between different definitions of dimension for a learning space. These theoretical investigations are detailed in later sections of this chapter.

2 Learning Spaces from Partial Orders

We outline in this section the representation of learning spaces already in use by the 2006 implementation of ALEKS. As we describe, this representation leads to efficient assessment algorithms, but is only capable of representing a limited subset of the possible learning spaces, the so-called quasi-ordinal spaces. A partial order is a relation < among a set of objects, satisfying irreflexivity (x ≮ x) and transitivity (x < y and y < z implies x < z).
Although defined as a mathematical object, a partial order may be represented concisely for computational purposes by its Hasse diagram, a directed graph containing an edge x → y whenever x < y and there does not exist z with x < z < y. That is, we connect a pair of items in the partial order by an edge whenever the pair belongs to the covering relation of the partial order. The original partial order may be recovered easily from the Hasse diagram representing it: x < y if and only if there exists a directed path from x to y in the Hasse diagram.

Fig. 1. Left: a partial order, shown as a Hasse diagram in which each edge is directed from the concept at its lower endpoint to the concept at its upper endpoint. Right: the learning space derived from the partial order on the left.

To derive a learning space from a partial order on a set of concepts, we interpret the edges of the Hasse diagram as describing prerequisite relations between concepts. That is, if x and y are concepts, represented as vertices in a Hasse diagram containing the edge x → y, then we take it as given that y may not be learned unless x has also already been learned. For instance, in elementary arithmetic, one cannot perform multi-digit addition without having already learned how to do single-digit addition, so a learning space involving these two concepts should be represented by a Hasse diagram containing a path from the vertex representing single-digit addition to the vertex representing multi-digit addition. With this interpretation, a state of knowledge in the learning space may be formed as a lower set: a set S of the concepts in a given partial order, satisfying the requirement that, for any edge x → y of the Hasse diagram, either x ∈ S or y ∉ S.
Figure 1 shows an example of a Hasse diagram on eight concepts, and the 19 states in the learning space derived from this Hasse diagram. We call a learning space derived from a Hasse diagram in this way a quasi-ordinal space. A quasi-ordinal space must satisfy the following three properties:

Accessibility. For every nonempty state S in the learning space, there is a concept x ∈ S such that S \ {x} is also a state in the learning space. In learning terms, any state of knowledge may be reached by learning one concept at a time. "Accessibility" is the usual name for this property in the combinatorics literature; in learning theory papers, it has also been referred to as "downgradability" (Doble et al., 2001).

Union Closure. If S and T are states of knowledge in the learning space, then S ∪ T is also a state in the learning space. In learning terms, the knowledge of two individuals may be pooled to form a state of knowledge that is also feasible.

Intersection Closure. If S and T are states of knowledge in the learning space, then S ∩ T is also a state in the learning space. We are unaware of a natural learning based interpretation of this property.

A family of states satisfying only accessibility and union closure forms a mathematical structure known as an antimatroid (Korte et al., 1991), and it is this more general class of structures that we hope to capture with our learning sequence representation of a learning space. The partial order based structure defined by ALEKS allows only a special subclass of antimatroids satisfying also the intersection closure property; mathematically such a structure forms a lattice of sets, or, equivalently, a distributive lattice. Conversely, it follows from results of Birkhoff (1937) that any distributive lattice can be represented via a partial order in this way.
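The lower-set condition above can be checked directly. The following sketch, using a small hypothetical four-concept diagram (not the eight-concept example of Figure 1), enumerates a quasi-ordinal space by brute force; a set is a feasible state exactly when, for every edge x → y, x ∈ S or y ∉ S.

```python
from itertools import combinations

# Hypothetical Hasse diagram: each edge (x, y) means x is a prerequisite of y.
edges = [("A", "B"), ("A", "C"), ("C", "E")]
concepts = {"A", "B", "C", "E"}

def is_state(s):
    """A set is a feasible state iff it is a lower set of the partial order:
    for every edge x -> y, y in S implies x in S."""
    return all(x in s or y not in s for x, y in edges)

def all_states():
    """Enumerate every lower set by brute force (fine for tiny examples;
    the reverse-search procedure of Section 2.2 scales much better)."""
    states = []
    for r in range(len(concepts) + 1):
        for combo in combinations(sorted(concepts), r):
            if is_state(set(combo)):
                states.append(frozenset(combo))
    return states
```

For this diagram the enumeration yields seven states, from the empty set up to the whole domain.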
See Doignon and Falmagne (1999) for related representation theorems for learning spaces. In what follows we review briefly the algorithms used by ALEKS to perform assessments using this quasi-ordinal learning space representation.

2.1 The Fringe

The fringe of any state S in a learning space is defined to be the set of concepts that, when added to or removed from S, lead to another state in the learning space. Fringes are important to ALEKS because they describe the concepts that the assessed student is most ready to learn, or has the most shaky learning of. We may distinguish the outer fringe of concepts a student is ready to learn, those concepts x such that S ∪ {x} is also a state in the learning space, from the inner fringe of concepts that a student may have most recently learned, those concepts x such that S \ {x} is also a state. The fringe is the union of the outer and inner fringes. In a learning space (or more generally, in any medium) each state may be uniquely identified by the pair of its outer and inner fringes. In a quasi-ordinal space, the inner fringe of S consists of the maximal concepts in S, and the outer fringe consists of the minimal concepts not in S. The fringe of S may be calculated easily in a single pass through the edges x → y of the Hasse diagram: if x → y and x ∉ S, then y cannot be in the fringe, and if x → y and y ∈ S, then x cannot be in the fringe. We initialize a set F to consist of the whole domain, and remove x or y from F whenever we discover a covering relation that prevents it from belonging to the fringe; the remaining set at the end of this scan is the fringe.

2.2 State Generation

To determine the likelihood that a student knows each concept in a learning space, from a given set of results on asked questions, ALEKS uses an algorithm based on listing all states in the learning space.
The algorithm used by ALEKS can be explained most easily in terms of reverse search (Avis and Fukuda, 1996), a general technique for developing generation algorithms for many types of combinatorial structures. Suppose we have chosen a topological ordering (also known as a linear extension) of the concepts in a partial order; that is, a sequence of the concepts such that, if x < y in the order, then x must appear prior to y in the sequence. For instance, one may sort the concepts by the length of the longest path leading to each concept in the Hasse diagram, with ties broken arbitrarily among concepts having paths of the same length; the resulting sorted sequence is a topological ordering. Then, given any state S in the knowledge space, one may find another state S \ {x} where x is chosen to be the concept belonging to S that has the latest position in the topological ordering. We call S \ {x} the predecessor of S. In this way, we disambiguate the accessibility property of learning spaces and make a concrete choice of which concept to remove to form the predecessor of any state. If we repeat this removal process, starting from any state, we will eventually reach the empty set, so the graph formed by connecting each state to its predecessor is a tree, rooted at the empty set. Reverse search, applied to this predecessor relationship, amounts to performing a depth first traversal of this tree.

Fig. 2. The Hasse diagram of a worst case example for the partial order state generation algorithm. Each of 2n/3 items (bottom) is connected by an edge to each of n/3 items (top). The corresponding learning space has 2^(2n/3) + 2^(n/3) − 1 states, 2^(2n/3) of which cause the state generation algorithm to perform n/3 prerequisite checks per state, so the total time is Ω(n) per state.
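The predecessor rule just described is simple to state in code. This minimal sketch assumes a fixed, hypothetical topological order `topo`; any linear extension of the partial order would do.

```python
# Hypothetical topological order of the concepts; any linear extension works.
topo = ["A", "B", "C", "D"]
rank = {c: i for i, c in enumerate(topo)}

def predecessor(state):
    """Remove from the state the concept with the latest topological position.
    Repeated application reaches the empty set, so the predecessor links
    form a tree of states rooted at the empty set."""
    if not state:
        return None  # the empty set is the root of the predecessor tree
    latest = max(state, key=rank.__getitem__)
    return state - {latest}
```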
To be more specific, we perform a recursive traversal of the tree defined above, maintaining as we do for each state S a set children(S) of the concepts that may be added to S to form another state that is a child of S in the predecessor tree. Then, when the recursion steps from S to S ∪ {x}, we calculate children(S ∪ {x}) from children(S) by removing from children(S) any y occurring prior to x in the topological order, and adding to children(S) any concept y reachable from x by a Hasse diagram edge x → y such that all other prerequisites of y already belong to S. Once we have calculated children(S ∪ {x}), we may output S ∪ {x} as one of our states and continue recursively to each state S ∪ {x, y} for each y in children(S ∪ {x}), and so on. Very little is needed in the way of data structures beyond the Hasse diagram itself to implement this recursive traversal efficiently. Primarily, we need a way of quickly determining whether all prerequisites of some concept y belong to the set S we are traversing. This may be done efficiently by maintaining, for each y, a count of the prerequisites of y that do not yet belong to S, decrementing that count for each successor of x whenever we step from S to S ∪ {x}, and incrementing the count again when our recursive traversal returns from S ∪ {x} to S. In this way we may test prerequisites in constant time, and use time proportional to the number of Hasse diagram edges out of x to update counts whenever we step from state to state.
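A compact sketch of this reverse-search traversal follows, for a hypothetical four-concept diamond-shaped order (not the spaces of the figures). Instead of the incremental prerequisite counters described above, it simply rechecks prerequisites on each step, which is enough to show the structure: each state is generated exactly once, as the child of the state obtained by deleting its topologically last concept.

```python
# Hypothetical Hasse diagram: prereqs maps a concept to its direct prerequisites.
prereqs = {"A": [], "B": ["A"], "C": ["A"], "D": ["B", "C"]}
topo = ["A", "B", "C", "D"]          # a linear extension of the order
rank = {c: i for i, c in enumerate(topo)}

def generate_states():
    """Depth-first reverse search over the predecessor tree of states."""
    states = []

    def visit(state, min_rank):
        states.append(frozenset(state))
        # Only concepts later in topological order than everything already in
        # the state may be added, so each state is visited exactly once.
        for x in topo[min_rank:]:
            if x not in state and all(p in state for p in prereqs[x]):
                state.add(x)
                visit(state, rank[x] + 1)
                state.remove(x)

    visit(set(), 0)
    return states
```

Replacing the `all(...)` prerequisite check with the maintained counts (or the bitmap masking described next) recovers the per-step cost analysis in the text.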
Alternatively, the method used within the 2006 implementation of ALEKS is to store a bitmap representation of S, and mask it against a bitmap representing the predecessors of y whenever such tests are needed; theoretically this method requires time proportional to the number of items in the learning space per test, but in practice it is fast because typical modern computer architectures allow for the testing of prerequisite relations for 32 concepts simultaneously in a single machine instruction. The total time spent copying lists of children in this procedure can be charged using amortized time analysis against the time to generate each child state. Therefore, the bottleneck for the time analysis of the procedure is the part of the time spent testing successors of x and testing whether to add those successors to children(S ∪ {x}). As described above, each such test can be implemented in linear time, so the total time for the algorithm is proportional to the sum, over all states in the learning space, of the number of Hasse diagram edges that are outgoing from the last concept in each state. In typical examples the Hasse diagrams are relatively sparse and the time per state may approach a constant, but even in the worst case (Figure 2) the time is no more than O(n) per state in a learning space with n concepts. It is this level of state generation efficiency that we hope to approach or meet with our more general learning space representation.

2.3 Assessment Procedure

While a student is being assessed, he or she will have answered some of the assessment questions correctly and some incorrectly. We desire to infer from these results likelihoods that the student understands each of the concepts in the knowledge space, even those concepts not explicitly tested in the assessment.
The inference method used by ALEKS (Falmagne and Doignon, 1988; Falmagne et al., 1990) applies more generally to any system of sets, and can be interpreted using Bayesian probability methods. We begin with a prior probability distribution on the feasible states of the learning space. In the simplest case, we can assume an uninformative uniform prior in which each state is equally likely, but the method allows us to incorporate prior probabilities based on any easily computable function of the state, such as the number of concepts in the set represented by the state. These prior probabilities may incorporate knowledge about the student's age or grades, or results from assessments in previous sessions that the student may have had with ALEKS; for instance, if a previous session assessed the student as being in a certain state, we could use an a priori probability distribution based on distance from that state in our next assessment. However, in the 2006 implementation of ALEKS only uniform prior probabilities are used. We also assume a conditional probability that the student, in a given state, will answer a question correctly or incorrectly: if the question tests a concept within the set represented by the state, we assume a high probability of a correct answer, while if a question tests a concept not within the set represented by the state, we assume a low probability of a correct answer. Answers opposite to what the given state predicts can be ascribed to careless mistakes or lucky guesses, and we assume that incidences of such answers are independent of each other.
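The Bayesian update just described can be sketched in a few lines. The states, the careless-mistake rate, and the lucky-guess rate below are all hypothetical illustrations, not ALEKS's actual parameters; the uniform prior is absorbed into the normalizing constant.

```python
P_CARELESS = 0.1   # assumed probability of a wrong answer on a known concept
P_GUESS = 0.01     # assumed probability of a correct answer on an unknown concept

# A hypothetical five-state learning space over concepts A, B, C.
states = [frozenset(), frozenset("A"), frozenset("AB"),
          frozenset("AC"), frozenset("ABC")]

def term(state, concept, correct):
    """Likelihood contribution of one answered question for one state."""
    if concept in state:
        return 1 - P_CARELESS if correct else P_CARELESS
    return P_GUESS if correct else 1 - P_GUESS

def concept_probabilities(answers):
    """Posterior probability that each concept is known, given a list of
    (concept, answered_correctly) pairs and a uniform prior over states."""
    likelihood = {}
    for s in states:
        l = 1.0
        for concept, correct in answers:
            l *= term(s, concept, correct)
        likelihood[s] = l
    total = sum(likelihood.values())  # the constant of proportionality
    concepts = set().union(*states)
    return {c: sum(l for s, l in likelihood.items() if c in s) / total
            for c in concepts}
```

The next concept to test would then be the one whose probability is closest to 50%.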
ALEKS's test questions are designed so that lucky guesses are very rare; therefore, the necessity for accurate assessment in the presence of careless mistakes is of much greater significance to the design of the assessment procedure, but this also means that it is necessary to ascribe different rates to these two types of events. With these assumptions, we may use Bayes' rule to calculate posterior probabilities of each state. To do so, we calculate for each state a likelihood, the product of its prior probability with a set of terms, one term per already-asked assessment question. The value of each term depends on whether the student answered the question correctly and whether the concept tested by the question belongs to the given state. Once we have calculated these likelihoods for all states, they may be converted to probabilities by dividing them all by a common constant of proportionality, the sum of the likelihoods of all states. From these posterior probabilities of states we wish to calculate posterior probabilities of individual concepts. To do so, we sum the probabilities of all states containing a given concept. ALEKS's assessment procedure calculates the probabilities of all concepts in this way, and then chooses to test the student on the most informative concept: the one with probability closest to 50% of being known. Eventually, all concepts will have probabilities bounded well away from 50%, at which point the evaluation procedure terminates. Although described above as a separate sum for each concept, ALEKS implements this probability calculation via a single pass through the states of the learning space.
When the state generation procedure steps from a state S to a child state S ∪ {x}, it calculates the likelihood (product of terms for each answered question) for the new state by multiplying the old likelihood by a single term for the questions based on concept x. It totals the likelihood of all states descending from S ∪ {x} in the recursive search, and adds this total likelihood to that of concept x. Then, when returning from S ∪ {x} to S, it adds the total likelihood calculated for S ∪ {x} into that for S. In this way, the likelihood for each concept is calculated in constant time per state, and the constant of proportionality needed to turn these likelihoods into probabilities is calculated as the total likelihood at the root of the recursion. If x < y in the partial order defining the system's learning space, x will be assessed as having higher probability than y of being known; therefore, the set of concepts returned by this assessment algorithm is guaranteed to be a feasible knowledge state for the given quasi-ordinal space.

2.4 Hierarchical Sampling Scheme

Although the assessment procedure described above works well for partial orders of 50 to 100 concepts, it becomes too slow for larger learning spaces due to the need for the algorithm to list all states in the space and the combinatorial explosion in the number of states generated for those spaces. Therefore, the ALEKS system resorts to a sampling scheme that allows its assessment procedure to run at interactive speeds for much larger learning spaces. This sampling scheme is based on three concepts, all depending on the details of the definition of the learning space in terms of partial orders: distance between concepts, definition of smaller learning spaces from sampled concepts, and bounding concept likelihoods from their prerequisites and postrequisites.
To generate smaller samples of the set of concepts used to define a learning space, ALEKS uses a notion of distance between two concepts in a partial order. To define the distance between x and y, define Δx,y to be the set of items whose comparison to x is different from its comparison to y. That is,

Δx,y = {x, y} ∪ {z | z < x ∧ z ≮ y} ∪ {z | z < y ∧ z ≮ x} ∪ {z | x < z ∧ y ≮ z} ∪ {z | y < z ∧ x ≮ z}.

Then the distance d(x, y) between x and y is defined to be |Δx,y| − 1. This distance satisfies the mathematical axioms defining a metric space: d(x, x) = 0, and d(x, y) = d(y, x). For any x, y, and z, Δx,z ⊂ Δx,y ∪ Δy,z, and the union is not disjoint as y belongs to both sides, so d(x, z) ≤ d(x, y) + d(y, z). ALEKS then chooses a suitable distance threshold δ, and a sample S of the concepts of the learning space such that every concept is within distance δ of a member of S. Although there is no mathematical proof of such a fact, the intent of this sampling technique is that assessment on a nearby sample concept is likely to be informative about the assessment of each unsampled concept. Once a sample of concepts has been chosen, ALEKS must form a learning space describing the knowledge states possible for that sample, so that it can apply its assessment procedure to the sample. For quasi-ordinal learning spaces, this process of forming a learning space on the sample is very simple: one need merely restrict the given partial order defining the space to the sampled concepts, and build a learning space from the restricted order (Figure 3). Finally, the assessment of likelihoods on the sampled concepts is used to bound the likelihoods of the remaining unsampled concepts, to determine which ones the student is likely to know or not to know.
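The comparison-based distance defined above reduces to one test per item z: does z's comparison to x differ from its comparison to y (in either direction)? A minimal sketch, over a hypothetical four-element diamond order rather than the order of Figure 1:

```python
# Hypothetical strict order: (a, b) in order means a < b.
order = {("A", "B"), ("A", "C"), ("A", "D"), ("B", "D"), ("C", "D")}

def less(a, b):
    return (a, b) in order

def distance(x, y, domain):
    """d(x, y) = |Delta(x, y)| - 1, where Delta(x, y) contains x, y, and
    every z whose comparisons to x and to y differ."""
    delta = {x, y}
    for z in domain:
        if less(z, x) != less(z, y) or less(x, z) != less(y, z):
            delta.add(z)
    return len(delta) - 1
```

Since Δx,x = {x}, the definition gives d(x, x) = 0, and symmetry is immediate from the symmetric roles of x and y.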
If x < y, y belongs to the sample, and the student knows y with probability p, then the student is taken to know the easier concept x with probability at least p. Similarly, if x < y, x belongs to the sample, and the student knows x with probability p, then the student is taken to know the harder concept y with probability at most p. However, there is something of a mismatch between these likelihood bounds and the distance-based sampling procedure: it is possible for the nearby samples to an unsampled concept x to all be incomparable to x, in which case we cannot find any useful bounds for the likelihood of x. This sampling process, sample learning space construction, and likelihood bound are used together repeatedly to refine the portion of the learning space that is relevant for the student.

Fig. 3. The Hasse diagram of the partial order of Figure 1, restricted to the sampled set of concepts {A, D, E, G, H}, and the smaller learning space generated from the restricted partial order.

Initially, all states are considered relevant, and a sample with a high distance threshold is chosen. After several steps of refinement, a larger number of concepts have likelihoods that can be bounded away to one side or another of 50%, and a sample is chosen with a smaller distance threshold among only those remaining informative concepts. Eventually, this refinement process converges with all concepts having likelihoods bounded away from 50%, which we may use to construct a most likely knowledge state for the student. For learning spaces not defined from partial orders, we may have to replace these constructions with alternative techniques.
However, it will still be necessary to have some way of sampling the concepts of the learning space, building a smaller learning space from the sample, and using assessments on the sample to bound likelihoods of unsampled concepts, because this sampling procedure is crucial to limiting the number of states generated in the assessment procedure and thereby keeping the assessment calculation's times fast enough for human interaction.

3 Learning Spaces from Learning Sequences

We now describe an alternative method for defining and describing learning spaces, that we believe may form the basis for an efficient and more flexible implementation of learning space based knowledge assessment algorithms than the one currently used by ALEKS. While there has been past work on algorithmic characterizations of learning spaces using the terminology of antimatroids (Boyd and Faigle, 1990; Kempner and Levit, 2003), that work focuses on showing that certain algorithms work correctly if and only if the structure they are applied to is an antimatroid. Here instead our focus is on implementation details for allowing antimatroid-based algorithms to run efficiently.

3.1 Learning Sequences

Given any learning space, there may be many orderings through which a student, starting from no knowledge, could learn all the concepts in the space. We call such an ordering a learning sequence; in the combinatorics literature these are also known as basic words.

Fig. 4. A learning space that is not defined from a partial order.

Formally, a learning sequence σ can be defined as a one-to-one map from the integers {0, 1, 2, …, n − 1} to the n concepts forming the domain of a learning space, with the property that each prefix P_i(σ) = σ({0, 1, 2, …, i − 1}) is a
valid knowledge state in the learning space. The sequence of prefixes P_0(σ), P_1(σ), …, P_{n−1}(σ) forms a shortest path in the learning space from the empty set to the whole domain, and for any such path the sequence of items by which one state in the path differs from the next forms a learning sequence.

Unlike partial order based learning spaces, the space of Figure 4 is not closed under intersections: {A, B} and {B, C} are both states in the space, but their intersection {B} is not. However, the space still satisfies the union closure and accessibility requirements of learning spaces.

For instance, in the learning space depicted in Figure 1, the leftmost path from the bottom state (the empty set) to the top state (the whole domain) passes through the sequence of states ∅, {A}, {A, C}, {A, C, E}, {A, B, C, E}, {A, B, C, D, E}, {A, B, C, D, E, F}, {A, B, C, D, E, F, G}, and {A, B, C, D, E, F, G, H}. The learning sequence corresponding to this path is A, C, E, B, D, F, G, H. Similarly, the learning sequence corresponding to the rightmost path in the figure is B, A, D, F, H, C, E, G. Altogether, the learning space in Figure 1 can be shown to have 41 distinct learning sequences.

For a quasi-ordinal learning space, such as the one in Figure 1, a learning sequence is essentially just a topological ordering of the partial order on concepts defining the quasi-ordinal space. However, we have defined learning sequences in a way that can be applied to any learning space. For instance, the learning space in Figure 4, which does not come from a partial order, has four learning sequences: (1) A, B, C, (2) A, C, B, (3) C, A, B, and (4) C, B, A.

3.2 States from Sequences

If we are given a set Σ of learning sequences, we can immediately infer that all prefixes of Σ are states in the associated learning space.
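As a sketch of this construction (together with the union closure of Section 3.2 and the mex indexing of Section 3.3, both described below), the space L_Σ can be generated by closing the prefixes of the sequences under union. This is a naive illustration for toy sizes, not an efficient implementation:

```python
from itertools import combinations

def prefixes(seq):
    """All prefixes of a sequence, as frozensets of concepts."""
    return [frozenset(seq[:i]) for i in range(len(seq) + 1)]

def learning_space(sigma):
    """L_Σ: the family of unions of prefixes of the sequences in Σ,
    computed by naive closure under pairwise unions."""
    states = {p for s in sigma for p in prefixes(s)}
    changed = True
    while changed:
        changed = False
        for a, b in combinations(list(states), 2):
            if a | b not in states:
                states.add(a | b)
                changed = True
    return states

def mex(state, sigma):
    """The index vector of Section 3.3: mex_i(S) is the length of the
    longest prefix of σ_i contained in S."""
    return tuple(
        next((j for j, c in enumerate(s) if c not in state), len(s))
        for s in sigma
    )

sigma = ["ABC", "CBA"]               # the two sequences from Figure 4
space = learning_space(sigma)        # the seven states of Figure 4
```

Here the two sequences A, B, C and C, B, A generate all seven states of Figure 4, with {A, C} arising as the union {A} ∪ {C}, matching the discussion in Section 3.2.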
But we can also infer the existence of other states using the union-closure property of learning spaces: any set formed from a union of prefixes of Σ must be a state in the associated learning space. For instance, suppose we have the two sequences A, B, C and C, B, A from the learning space in Figure 4. The prefixes of these sequences are the six sets ∅, {A}, {A, B}, {A, B, C}, {C}, and {B, C}. However, by forming unions of prefixes we may also form the seventh set {A} ∪ {C} = {A, C}. All seven states of the learning space can be recovered in this way from unions of the prefixes of these two sequences.

In general, for any set of sequences Σ over a domain of concepts, define the learning space L_Σ as the family of unions of prefixes of Σ. If Σ consists of learning sequences from a learning space L, then L_Σ ⊂ L. We will discuss, in a later section of this chapter, methods of selecting a small set Σ such that L_Σ = L. For now, we take Σ as given and describe the learning space L_Σ it generates.

Fig. 5. Learning space L_Σ generated from three sequences ABCDEF, BDFCAE, and CBEFAD. Each state is shown with its index mex(S).

3.3 Indexing States

It is possible to name states in L_Σ by vectors in Z^{|Σ|}, in such a way that each state has a unique name and the concepts in each state can be reconstructed easily from the state's name. Such a naming scheme will be useful in several of our later algorithms. Given a state S in a learning space, and a set Σ = {σ_0, σ_1, …
, σ_{k−1}} of k learning sequences within the space, define mex_i(S) to be the minimum index in σ_i of a concept excluded from S; that is, mex_i(S) = min{j | σ_i(j) ∉ S}. If S is the whole domain, we define for completeness mex_i(S) = n. Equivalently, therefore, mex_i(S) is the size of the largest prefix of σ_i that is a subset of S. Define mex(S) to be the vector

mex(S) = (mex_0(S), mex_1(S), …, mex_{k−1}(S)).

That is, mex can be viewed as a function mapping states in the learning space to vectors in Z^k. Conversely, we can define a function up(v), mapping vectors in Z^k to states in the learning space, by

up(v) = ⋃_{0 ≤ i < k} P_{v_i}(σ_i).

… > 0, the values of q_i would form an infinite ascending chain of proper divisors of p, violating the assumption of finiteness. Conversely, if p has a single predecessor q, it must be irreducible, as any product of its proper divisors would also divide q and therefore be unequal to p. □

By analogy, in any semilattice, we define s to be singular iff it has a single successor t, such that s | t and, for any x, if s | x then s = x or t | x. We identify any object x in the semilattice with the set N(x) of singular objects that x does not divide.

Lemma 4. For any x and y, N(xy) = N(x) ∪ N(y).

Proof. If s ∈ N(xy), then xy ∤ s. But in any semilattice, for any s, if x | s and y | s then xy | s, so we can conclude that x ∤ s or y ∤ s, and therefore s ∈ N(x) or s ∈ N(y). Conversely, if s ∈ N(x), so x ∤ s, then xy ∤ s and s ∈ N(xy). □

Theorem 6. Any finite semilattice is isomorphic to the semilattice formed by the union operation on the sets N(x).

Proof. From the previous lemma, it remains only to show that for any x ≠ y, N(x) ≠ N(y). Suppose to the contrary that S = {x | N(x) = T} has |S| > 1 for some set T of singular objects.
By the previous lemma, S is closed under unions, so it contains a unique maximal object, and we may choose from S some two objects x ≠ y with x | y. Let U = {z | x | z and y ∤ z}, and let u be a maximal object in U; U is nonempty as it contains x, and a maximal u exists by finiteness. Then uy must be the unique successor to u, for if u had a proper multiple w not divisible by uy, then w would be a multiple of x and a nonmultiple of y that is larger than u, contradicting the maximality of u. Therefore, u is singular, and N(y) contains u but N(x) does not, contradicting the assumption that N(x) = N(y). □

In some sense the sets N(x) form a minimal representation by sets for any finite semilattice. It remains to find conditions on the semilattice forcing these sets to be accessible. There are many possible conditions that do so; the simplest we have found is the following.

Fig. 12. Graphical explanation of Lemma 6.

We define an equalizing pair x, y to be a pair of objects x | y such that there exist a and b with xa ≠ xb and xa ≠ ya = yb ≠ xb. We say that a finite semilattice has separated equalizers if, for each equalizing pair x, y, there exists z with x | z | y and x ≠ z ≠ y. This definition is motivated by the two semilattices shown in Figure 11. In the left semilattice, the pair ∅, ab is equalizing for ac and bc, while in the right semilattice, the pair ∅, ab is equalizing for c and bc; however, in each case there is no z between ∅ and ab. Thus, from the lemma below, the property of having separated equalizers distinguishes these two semilattices from antimatroids.

Lemma 5. Every antimatroid has separated equalizers.

Proof. Let x, y be an equalizing pair for a, b in an antimatroid. We can assume without loss of generality that xa ∤ xb.
Then, in order for xa ≠ xb but ya = yb, y \ x must be a superset of xa \ xb. It must be a proper superset of this set, else we would have ya = xa. Therefore, |y \ x| > 1 and we can find z between x and y using the chain property for antimatroids. □

Lemma 6. In a semilattice with separated equalizers, suppose that there exist objects a, b, c, and d, with a | b, a | c | d, and bc = bd. Then there exists another object x, with a | x | b, such that either xc = d, or xc and d are incomparable.

Proof. An equivalent description of the premise of the lemma is that a, b form an equalizing pair for some c and d with c | d. We use induction on the length of the longest chain of objects between a and b. Because a, b form an equalizing pair for c, d, there must exist x between a and b. If xc is a proper divisor of d, then x, b form an equalizing pair for xc and d and the result follows by induction. If xc = d, we have proven the lemma. If d | xc, then a, x form an equalizing pair for c and d and the result follows by induction. In the remaining case, xc and d are incomparable. □

A graphical explanation of Lemma 6 is shown in Figure 12.

Lemma 7. In a semilattice with separated equalizers, if p is irreducible with predecessor q, s is singular with successor t, q | s, and p ∤ s, then t = ps.

Proof. By singularity of s, ps = pt, so if t ≠ ps then q, p form an equalizing pair for s, t. But if p and q are separated by z, q | z | p, then q could not be the predecessor of p. □

Lemma 8. In a semilattice with separated equalizers, let p be irreducible with predecessor q, s be singular with successor t, q | s, and p ∤ s. Then there can be no x with q | x, p ∤ x, and s incomparable to x.

Proof. Suppose to the contrary that x exists. Then by singularity of s, t | xs.
If t = xs, then q, p would be an equalizing pair for s and x, and the existence of q | x | p would violate the assumption that q is p's predecessor. Otherwise, t is a proper divisor of xs, so q, x is an equalizing pair for s and t. By Lemma 6, there exists x′ with q | x′ | x where either sx′ = t or x′s and t are incomparable; x′ must be incomparable to s. By repeating this process we can find an infinite descending sequence of x, x′, x′′, etc., violating the assumption of finiteness of the semilattice, so x cannot exist. □

Lemma 9. In a semilattice with separated equalizers, let p be irreducible with predecessor q. Then |N(p) \ N(q)| = 1.

Proof. |N(p) \ N(q)| ≥ 1 by Theorem 6. By Lemma 8, any two members of N(p) \ N(q) must be comparable, but by Lemma 7, any two members must be incomparable. Therefore |N(p) \ N(q)| ≤ 1. □

Theorem 7. A finite semilattice can be represented as an antimatroid if and only if it has separated equalizers.

Proof. For any x in such a semilattice, there is a y | x with |N(x) \ N(y)| = 1: represent x as a minimal product of irreducibles, and form y by replacing one of the irreducibles in this product by its predecessor. Therefore the sets N(x) are accessible as well as union-closed, and this set family forms an antimatroid. Conversely, we have seen that any antimatroid forms a semilattice that has separated equalizers. □

7.4 Different Definitions of Dimension

Our algorithms for learning spaces have been centered around the concept of convex dimension, the minimum number dim_C(L) of learning sequences needed to define the learning space L. However, there are several other natural concepts of dimension for learning spaces. We may define the base dimension dim_B(L) of a learning space L to be the cardinality of its base.
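The definitions leading to Theorem 7 can be checked mechanically on small examples. In this sketch we represent a finite semilattice concretely as a union-closed family of sets, so that the product xy is the union x ∪ y and x | y means x ⊆ y; the family modelled on the left semilattice of Figure 11 is reconstructed here as an assumption, since that figure lies outside this excerpt.

```python
from itertools import combinations

def divides(x, y):
    """x | y in a union-semilattice of sets: x ∪ y == y."""
    return x | y == y

def singular(family):
    """Objects s with a single successor t such that every proper
    multiple of s is a multiple of t."""
    out = []
    for s in family:
        multiples = [m for m in family if divides(s, m) and m != s]
        if any(all(divides(t, m) for m in multiples) for t in multiples):
            out.append(s)
    return out

def N(x, sing):
    """N(x): the singular objects that x does not divide."""
    return frozenset(s for s in sing if not divides(x, s))

def separated_equalizers(family):
    """Check the separated-equalizers condition of Theorem 7 by brute force."""
    fam = list(family)
    for x in fam:
        for y in fam:
            if x == y or not divides(x, y):
                continue
            equalizing = any(
                (x | a) != (x | b) and (y | a) == (y | b)
                and (x | a) != (y | a) and (x | b) != (y | b)
                for a in fam for b in fam
            )
            separated = any(
                divides(x, z) and divides(z, y) and z != x and z != y
                for z in fam
            )
            if equalizing and not separated:
                return False
    return True

# The powerset of {1, 2, 3} is an antimatroid; the second family is
# union-closed but not accessible ({a, b} has no one-smaller state).
powerset = [frozenset(c) for r in range(4) for c in combinations([1, 2, 3], r)]
fig11_left = [frozenset(s) for s in ["", "ab", "ac", "bc", "abc"]]
```

On these examples the code confirms Lemma 4 and the injectivity claim of Theorem 6 for the powerset, reports that the powerset has separated equalizers (as every antimatroid must, by Lemma 5), and that the reconstructed Figure 11 family does not: the pair ∅, {a, b} is equalizing but has no object strictly between.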
The lattice dimension dim_Z(L) (Eppstein, 2005b) is the minimum dimension d of an integer lattice Z^d into which the states of L may be embedded, in such a way that the L_1 distance between the embeddings of two states equals the cardinality of their symmetric difference; like the convex dimension it can be calculated efficiently by an algorithm based on maximum matching in an associated bipartite graph. And, the order dimension dim_≤(L) is the minimum dimension d of a Euclidean space R^d into which the states of L may be embedded, in such a way that, for two states S and T, S ⊆ T if and only if the coordinates of S are all less than or equal to the corresponding coordinates of T.

In some sense the order dimension is very closely related to the convex dimension, as both are defined as the minimum number of sequences of items needed to define the learning space, but in the case of convex dimension the sequences are of elements of ∪L while in the case of order dimension the sequences (formed by each coordinate of the embedding) are of states of L. We may also view the cardinality n = |∪L| as a dimension: it is the isometric dimension of L, that is, the least dimension of a hypercube into which L can be isometrically embedded.

For any learning space L, these different quantities satisfy the following inequalities and relations:

– dim_C(L) ≤ dim_B(L). This follows as we may represent L using a separate learning sequence for each base set.
– n ≤ dim_B(L). Each element of ∪L must be the removable element of at least one base set.
– dim_B(L) ≤ dim_C(L) · n, as each learning sequence in a representation of L by learning sequences can contribute at most n base sets.

Fig. 13.
A powerset on four elements.

– dim_C(L) ≤ C(n, ⌊n/2⌋) = O(2^n/√n), by our characterization of dim_C(L) as the size of the largest antichain in the base, and Sperner's Theorem bounding the size of an antichain in any family of sets.
– dim_≤(L) ≤ dim_C(L) (Korte et al., 1991). This can be seen via the embedding into R^{dim_C(L)} in which we map S to mex(S), for this embedding satisfies the requirements of the definition of order dimension.
– dim_≤(L) = 2 if and only if dim_C(L) = 2, from our work on drawing learning spaces (Eppstein, 2006).
– dim_≤(L) ≤ dim_Z(L): in any lattice embedding of L, all states must be mapped to a single orthant of the lattice in order to satisfy the union closure property of L, so the lattice embedding must again satisfy the requirements of the definition of order dimension.
– dim_Z(L) ≤ n, as the characteristic function embeds L, or more generally any family of sets on the elements of L, into the subset {0, 1}^n in such a way that L_1 distance equals symmetric difference cardinality.
– dim_C(L) = O(n^{dim_Z(L) − 1}). This follows from the fact that a lattice embedding of L must lie within a product of intervals [0, n − 1] and the fact that no two members of an antichain within this product can share all but one coordinate.

We now describe examples of learning spaces that are extremal for some of these inequalities.

A chain. A learning space defined from a single learning sequence has dim_C(L) = dim_≤(L) = dim_Z(L) = 1 but dim_B(L) = dim_C(L) · n = n.

A powerset. The family of all subsets of an n-element set (Figure 13) has dim_C(L) = dim_≤(L) = dim_Z(L) = dim_B(L) = n.

A learning sequence and its reverse.
The learning space defined by two learning sequences, one the reverse of the other (Figure 14), has dim_C(L) = dim_≤(L) = 2 but dim_Z(L) = n and dim_B(L) = 2n − 2.

Fig. 14. The learning space defined by a learning sequence and its reverse. Figure from Eppstein (2006).

A learning space with a large base. Let D be a set of n elements, and x be a designated element from D. Define L to consist of the sets that either do not contain x, or contain at least ⌊(n − 1)/2⌋ elements. Then L is accessible, as for any set S ∈ L, if x ∈ S then S \ {x} ∈ L while if x ∉ S then all subsets of S are in L. L is also closed under unions, so it forms a learning space. The base of L consists of {x} together with all subsets of exactly ⌊(|D| − 1)/2⌋ elements of D \ {x}; the subset of the base formed by omitting {x} is an antichain. Therefore dim_≤(L) ≤ dim_Z(L) = n, but dim_B(L) = 1 + C(n − 1, ⌊(n − 1)/2⌋) and dim_C(L) = C(n − 1, ⌊(n − 1)/2⌋), matching to within a constant factor the O(2^n/√n) upper bound on dim_C(L).

A three-dimensional zigzag. Let N be a given number, and let Z consist of the three-dimensional integer lattice points with coordinates 0 ≤ x, y < N, 0 ≤ z ≤ 1, and such that, if z = 1, then x + y + z ≥ N (Figure 15). The semilattice of coordinatewise maximization on Z can be represented as a learning space L with n = |∪L| = 2N − 1. For this space, dim_≤(L) = dim_Z(L) = 3 but dim_C(L) = N, as there is an antichain in the base consisting of all minimal points with z = 1.
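Two of the dimensions above can be computed by brute force on toy examples, using the base (states that are not unions of other states) and the paper's characterization of dim_C(L) as the size of the largest antichain in the base. This sketch is exponential and intended only for small spaces:

```python
from itertools import combinations

def base(family):
    """States that are not unions of the strictly smaller states."""
    out = []
    for s in family:
        if not s:
            continue                       # the empty state is the empty union
        smaller = [t for t in family if t < s]
        covered = frozenset().union(*smaller) if smaller else frozenset()
        if covered != s:
            out.append(s)
    return out

def largest_antichain(sets):
    """Size of the largest subfamily with no containments (brute force)."""
    for r in range(len(sets), 0, -1):
        for combo in combinations(sets, r):
            if all(not (a < b or b < a) for a, b in combinations(combo, 2)):
                return r
    return 0

# Two of the extremal examples above: a chain on four concepts, and the
# powerset of a three-element set.
chain = [frozenset("ABCD"[:i]) for i in range(5)]
powerset = [frozenset(c) for r in range(4) for c in combinations("abc", r)]
```

On the chain this gives dim_B = n = 4 with largest base antichain 1, and on the powerset dim_B = n = 3 with largest base antichain 3, matching the chain and powerset examples listed above.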
Similar examples in higher dimensions show that, in general, dim_C(L) can be lower bounded by Ω(n^{dim_Z(L) − 2}), nearly matching our O(n^{dim_Z(L) − 1}) upper bound.

It would be of interest to determine the algorithmic complexity of calculating the order dimension of a learning space. For arbitrary partial orders, calculating the order dimension is NP-complete (Yannakakis, 1982), but it is unclear whether the reduction proving this can be made to apply to learning spaces.

8 Future Work

Although we have made significant progress in learning space implementation, we believe there is plenty of scope for additional investigation, particularly on the following topics:

Visualization of learning spaces. Quasi-ordinal spaces may be visualized by drawing their Hasse diagrams as graphs, but this technique does not work so well for more general learning spaces.

Fig. 15. A learning space with large convex dimension but low lattice dimension.

In earlier work (Eppstein, 2005a, 2006) we found algorithms for drawing the state-transition diagrams of learning spaces and more general media; however, these work best when the space being drawn has a well-behaved embedding into a low-dimensional Euclidean space. Can we generalize these drawing approaches to learning spaces with higher convex dimension? Is there a possibility of a hybrid approach that draws portions of a learning space as a Hasse diagram on concepts and resorts to the more complex state space only when necessary?

Counting states. Is there an efficient method for counting or estimating the number of states in a learning space, without taking the time to generate all states?
This would have implications for our ability to set sample sizes appropriately in ALEKS's knowledge assessment algorithms, as well as for calculating more accurate a priori probability distributions on projected states when using projections of learning spaces to speed up the assessment algorithm. From past experience with similar combinatorial counting problems (e.g., Jerrum et al., 2001) we expect that the complexity of a randomized approximation scheme for the counting problem is likely equivalent to the complexity of sampling states from the learning space uniformly at random, which seems to be of interest independently.

Inference of error rates. ALEKS currently assumes fixed rates of careless mistakes and lucky guesses, which it uses in its calculation of likelihoods that a student knows each concept in a learning space. But one could also envision a more sophisticated Bayesian assessment procedure that treats the chance of careless errors as a variable with an a priori probability distribution, and attempts to infer a maximum likelihood value of this variable based on the student's answers. Such a procedure would likely be based on an EM-algorithm approach in which one alternates applications of the likelihood calculation described here with an algorithm for estimating the careless error probability given the calculated likelihoods, and would allow the system to better fit its model to each student's performance. However, the details of such an approach still need to be worked out.

Reconciliation of expert opinions.
Along with its applications to concise representation of media, the join may be useful for a problem arising when constructing learning spaces from the answers of experts (Dowling, 1993): two different experts may give quite different answers when asked what they believe about the prerequisite structure of a given set of concepts, leading to quite different learning spaces on those concepts. The construction procedure of Dowling (1993) involves asking experts a series of questions about whether feasible knowledge states can exist with certain combinations of concepts, but the answers to these questions have only been found reliable when the combinations involve at most two concepts at a time; the learning spaces generated by limiting the questioning to such combinations are necessarily quasi-ordinal. To reliably generate more complex learning spaces, it seems necessary to combine the results from questioning multiple experts. The join provides a mathematical mechanism for reconciling those answers and finding a common learning space containing as states any set of concepts believed to form a state by any of the experts, but the learning spaces constructed in this way are likely to be much larger than necessary. More research is needed on methods for combining information from multiple experts to generate learning spaces of size comparable to the space that would be constructed by questioning a single expert, while simultaneously taking advantage of the multiplicity of experts to generate spaces that more accurately model the students' knowledge.

Even faster state space generation.
Our algorithm for generating the states in an n-concept quasi-ordinal space takes time O(n) per generated state, without assumption, and may often be faster, while a similarly fast time bound of O(k) per generated state for learning spaces generated by k learning sequences can only be shown with an additional assumption of constant time bitvector operations for maintaining and updating the mex values of the generated states. Additionally, we have briefly described an algorithm for listing all states in the fiber L(K, U) of a projection of a learning space, given beliefs that a student knows the concepts in K and does not know the concepts in U, that does not match these efficiencies. Can these state generation algorithms be improved to the point where more efficient worst case guarantees on their performance can be proven?

Faster upper fringe construction. The algorithm we implemented for our construction of the family of sets that can be added to a learning space to form new larger learning spaces involves generating all states of the learning space. However, there may be many fewer fringe sets than there are states. In some sense, the base of a learning space consists of the minimal sets in the space, while the outer fringe consists of the maximal sets not in the space. Thus, it is plausible that one could adapt hypergraph transversal algorithms (Fredman and Khachiyan, 1996), which can be used to convert minimal sets in a family to maximal sets not in a family for certain other types of set families, to the purpose of finding the outer fringe of a learning space in time pseudopolynomial in the number of sets in the outer fringe. Such a result would also have implications for the computational complexity of inferring a learning space from questions asked of an expert (Dowling, 1993).
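The fringe construction discussed above, in the sense of sets whose addition yields a larger learning space, can be sketched by brute force over all subsets of the domain. This is exactly the kind of exhaustive computation the proposed transversal-based methods would avoid; it is an illustration only, workable for toy domains:

```python
from itertools import combinations

def is_learning_space(family):
    """Union-closed, accessible, and containing the empty state."""
    fam = set(family)
    if frozenset() not in fam:
        return False
    union_closed = all(a | b in fam for a in fam for b in fam)
    accessible = all(any(s - {e} in fam for e in s) for s in fam if s)
    return union_closed and accessible

def outer_fringe(family, domain):
    """Sets whose addition to the space yields a larger learning space
    (brute force over all subsets of the domain)."""
    fam = set(family)
    candidates = (frozenset(c) for r in range(len(domain) + 1)
                  for c in combinations(domain, r))
    return [s for s in candidates
            if s not in fam and is_learning_space(fam | {s})]

# The seven states of the Figure 4 learning space.
fig4 = [frozenset(s) for s in ["", "A", "AB", "ABC", "C", "BC", "AC"]]
```

For the Figure 4 space the only addable set is {B}, which is also the unique maximal set not in the space, consistent with the fringe characterization above.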
However, we have not worked out the details of such an efficient algorithm for listing upper fringe states.

Structure of the family of learning spaces. We know from the work of Thiéry (2001) that, when one learning space forms a subfamily or superfamily of the other, we can find a shortest path from one to the other by adding and removing states, such that each set family in this shortest path is also a learning space. That is, the family of learning spaces has a chain property similar to that of individual learning spaces. This fact motivates the calculation of the fringes of a learning space, as the sets in the fringe represent potential neighbors in such paths. We also know that the family of learning spaces on a given domain forms a semilattice under the join operation, which is however not the same as simple union of set families. And we know that the family of learning spaces is not in general well-graded, so it does not form a medium under operations that add and remove sets. What other structure does the family of learning spaces have, and how can that structure help us quickly adapt a learning space to changing information about the possible knowledge states of students?

Question selection strategy. We have only indirectly addressed the issue of which question to ask the student next, in the assessment procedure, after likelihoods of each concept have been calculated. As currently implemented, ALEKS selects the question with likelihood closest to 50% of being known, but that may not be the optimal selection strategy.
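The selection rules under discussion can be sketched as follows. This is an illustration of the heuristic, not the ALEKS implementation; the careless-error and lucky-guess rates are hypothetical values chosen for the example.

```python
CARELESS, GUESS = 0.10, 0.15   # hypothetical fixed rates, for illustration

def p_correct(p_knows):
    """Probability of a correct answer, folding in careless-error and
    lucky-guess rates, given the likelihood the concept is known."""
    return p_knows * (1 - CARELESS) + (1 - p_knows) * GUESS

def next_question(likelihoods):
    """Pick the concept whose probability of a correct answer is closest
    to 50%, the variant strategy described in the text."""
    return min(likelihoods, key=lambda c: abs(p_correct(likelihoods[c]) - 0.5))
```

For example, with likelihoods {"A": 0.9, "B": 0.5, "C": 0.2} this rule selects B; with different assumed error rates the two strategies (closest to 50% known versus closest to 50% answered correctly) can diverge, which is the point made below.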
It seems likely that a somewhat better strategy would be to select the question with likelihood closest to 50% of being answered correctly; due to the different rates of careless errors and lucky guesses this strategy differs from the currently implemented one, but it also depends on having an accurate estimate of the student's careless error rate, which the current assessment procedure does not supply. Also, if there are multiple questions with similar likelihoods, it may be best not to choose the one with likelihood closest to 50%, but instead to perform some lookahead in the sequence of questions, and ask a question such that whichever answer is given will again lead to a situation where some question has likelihood close to 50%. The effect of an improved question selection strategy could be to reduce the number of questions needed to assess each student's knowledge, over and above the reduction afforded by more accurately defining the learning space on which the assessment is based. In addition, the outcome of a question may not actually be binary: "don't know" may be treated differently than an incorrect answer, and the nature of the errors in an incorrect answer may yield some insight about the student's knowledge. It seems likely that these brief initial observations could be significantly expanded with more thought.

9 Conclusions

We have shown that a computer representation of learning spaces by learning sequences can approach the efficiency of the existing quasi-ordinal space representation for ALEKS's knowledge assessment algorithms, while allowing a broader class of learning spaces that may more easily be adapted by adding and removing states. We believe that the algorithms described here are sufficiently detailed and efficient to be suitable for implementation within ALEKS.
We have also performed theoretical investigations concerning learning spaces. We have described how to recognize spaces formed from a learning space when we assume certain facts about the state of a student's knowledge, and we have investigated the algorithmic complexity of recognizing learning spaces that can be decomposed into joins of a small number of simpler spaces. We have investigated alternative mathematical representations of learning spaces, and we have compared the convex dimension fundamental to our computer representation to several other important numerical measures of a learning space's size. Finally, we have identified multiple areas where more research may lead to additional practical algorithms or theoretical insights concerning learning spaces.

Bibliography

D. Avis and K. Fukuda. Reverse search for enumeration. Discrete Applied Mathematics, 65:21–46, 1996.
G. Birkhoff. Rings of sets. Duke Mathematical Journal, 3:443–454, 1937.
E.A. Boyd and U. Faigle. An algorithmic characterization of antimatroids. Discrete Applied Mathematics, 28:197–205, 1990.
R.P. Dilworth. Lattices with unique irreducible decompositions. Annals of Mathematics, 41:771–777, 1940.
C.W. Doble, J.-P. Doignon, J.-Cl. Falmagne, and P.C. Fishburn. Almost connected orders. Order, 18(4):295–311, 2001.
J.-P. Doignon and J.-Cl. Falmagne. Knowledge Spaces. Springer-Verlag, Berlin, Heidelberg, and New York, 1999.
C.E. Dowling. Applying the basis of a knowledge space for controlling the questioning of an expert. Journal of Mathematical Psychology, 37:21–48, 1993.
D. Eppstein. Algorithms for drawing media. In Graph Drawing: 12th International Symposium, GD 2004, New York, NY, USA, September 29–October 2, 2004, volume 3383 of Lecture Notes in Computer Science, pages 173–183, Berlin, Heidelberg, and New York, 2005a. Springer-Verlag.
D. Eppstein.
The lattice dimension of a graph. European Journal of Combinatorics, 26(6):585–592, 2005b.
D. Eppstein. Upright-quad drawing of st-planar learning spaces. In Graph Drawing: 14th International Symposium, GD 2006, Karlsruhe, Germany, September 18–20, 2006, Lecture Notes in Computer Science, Berlin, Heidelberg, and New York, 2006. Springer-Verlag.
D. Eppstein, J.-Cl. Falmagne, and S. Ovchinnikov. Media Theory. Springer-Verlag, Berlin, Heidelberg, and New York, 2007.
J.-Cl. Falmagne and J.-P. Doignon. A class of stochastic procedures for the assessment of knowledge. British Journal of Mathematical and Statistical Psychology, 41:1–23, 1988.
J.-Cl. Falmagne and S. Ovchinnikov. Media theory. Discrete Applied Mathematics, 121:83–101, 2002.
J.-Cl. Falmagne, M. Koppen, M. Villano, J.-P. Doignon, and L. Johannessen. Introduction to knowledge spaces: how to build, test and search them. Psychological Review, 97:204–224, 1990.
M.L. Fredman and L. Khachiyan. On the complexity of dualization of monotone disjunctive normal forms. Journal of Algorithms, 21(3):618–628, 1996.
J.E. Hopcroft and R.M. Karp. An O(n^{5/2}) algorithm for maximum matchings in bipartite graphs. SIAM J. on Computing, 2(4):225–231, 1973.
M. Jerrum, A. Sinclair, and E. Vigoda. A polynomial-time approximation algorithm for the permanent of a matrix with non-negative entries. In Proc. 33rd ACM Symp. on Theory of Computing, pages 712–721, 2001.
Y. Kempner and V.E. Levit. Correspondence between two antimatroid algorithmic characterizations. Electronic preprint math.CO/0307013, arXiv.org, 2003.
B. Korte, L. Lovász, and R. Schrader. Greedoids. Number 4 in Algorithms and Combinatorics. Springer-Verlag, 1991.
N. Thiéry. Dynamically Adapting Knowledge Spaces. PhD thesis, Univ. of California, Irvine, School of Social Sciences, 2001.
M. Yannakakis.
The complexity of the partial order dimension problem. SIAM J. Alg. Disc. Meth., 3(3):351–358, September 1982.