Is Randomness "Native" to Computer Science?

We survey Kolmogorov's approach to the notion of randomness through Kolmogorov complexity theory. Kolmogorov's original motivation was to give a quantitative definition of information. In this theory, an object is random in the se…

Authors: Marie Ferbus-Zanda (LIAFA), Serge Grigorieff (LIAFA)

Is Randomness "native" to Computer Science?

Marie Ferbus-Zanda* (ferbus@logique.jussieu.fr)
Serge Grigorieff* (seg@liafa.jussieu.fr)

August 2003

Original paper published in Current Trends in Theoretical Computer Science, G. Paun, G. Rozenberg, A. Salomaa (eds.), World Scientific, Vol. 2, p. 141–180, 2004. Earlier version in Yuri Gurevich's "Logic In Computer Science Column", Bulletin of EATCS, vol. 74, p. 78–118, June 2001.

*LIAFA, CNRS & Université Paris 7, 2, pl. Jussieu, 75251 Paris Cedex 05, France

Contents

1 From probability theory to Kolmogorov complexity
  1.1 Randomness and Probability theory
  1.2 Intuition of finite random strings and Berry's paradox
  1.3 Kolmogorov complexity relative to a function
  1.4 Why binary programs?
  1.5 What about other possible outputs?
2 Optimal Kolmogorov complexity
  2.1 The Invariance Theorem
  2.2 Coding pairs of strings
  2.3 Non determinism
3 How complex is Kolmogorov complexity?
  3.1 Approximation from above
  3.2 Dovetailing
  3.3 Undecidability
  3.4 No non trivial computable lower bound
  3.5 Kolmogorov complexity and representation of objects
4 Algorithmic Information Theory
  4.1 Zip/Unzip
  4.2 Some relations in Algorithmic Information Theory
  4.3 Kolmogorov complexity of pairs
  4.4 Symmetry of information
5 Kolmogorov complexity and Logic
  5.1 What to do with paradoxes
  5.2 Chaitin Incompleteness results
  5.3 Logical complexity of K
6 Random finite strings and their applications
  6.1 Random versus how much random
  6.2 Applications of random finite strings in computer science
7 Prefix complexity
  7.1 Self delimiting programs
  7.2 Chaitin-Levin prefix complexity
  7.3 Comparing K and H
  7.4 How big is H?
  7.5 Convergence of series and the Coding Theorem
8 Random infinite sequences
  8.1 Top-down approach to randomness of infinite sequences
  8.2 Frequency tests and von Mises random sequences
  8.3 Martin-Löf random sequences
  8.4 Bottom-up approach to randomness of infinite sequences: Martin-Löf's Large oscillations theorem
  8.5 Bottom-up approach with prefix complexity
  8.6 Top-down/Bottom-up approaches: a sum up
  8.7 Randomness with other probability distributions
  8.8 Chaitin's real Ω
  8.9 Non computable invariance
9 More randomness
  9.1 Beyond c.e.: oracles and infinite computations
  9.2 Far beyond: Solovay random reals in set theory

1 From probability theory to Kolmogorov complexity

1.1 Randomness and Probability theory

Quisani¹: I just found a surprising assertion on Leonid Levin's home page:

"While fundamental in many areas of Science, randomness is really 'native' to Computer Science."

Common sense would rather consider randomness as intrinsically relevant to Probability theory!

Authors: Levin also adds: "The computational nature of randomness was clarified by Kolmogorov." The point is that, from its very origin to its modern axiomatization around 1933 [21] by Andrei Nikolaievitch Kolmogorov (1903–1987), Probability theory carries a paradoxical result: if we toss an unbiased coin 100 times then 100 heads are just as probable as any other outcome! As Peter Gács pleasingly remarks ([17], p. 3), this convinces us only that the axioms of Probability theory, as developed in [21], do not solve all mysteries that they are sometimes supposed to.

In fact, since Laplace, much work has been devoted to getting a mathematical theory of random objects, notably by Richard von Mises (1883–1953) (cf. §8.2). But none was satisfactory up to the 60's, when such a theory emerged on the basis of computability.

As it sometimes occurs, the theory was discovered by several authors independently.² In the USA, Ray J. Solomonoff (b. 1926), 1964 [42] (a paper submitted in 1962), and Gregory J. Chaitin (b. 1947), 1966 [5], 1969 [6] (both papers submitted in 1965). In Russia, Kolmogorov, 1965 [23], with premises announced in 1963 [22].
Q: Same phenomenon as for hyperbolic geometry with Gauss, Lobatchevski and Bolyai. I recently read a citation from Bolyai's father: "When the time is ripe for certain things, these things appear in different places in the manner of violets coming to light in early spring."

¹ Quisani is a student with quite eclectic scientific curiosity, who works under Yuri Gurevich's supervision.
² For a detailed analysis of who did what, and when, see [30] p. 89–92.

A: Mathematics and poetry... Well, pioneered by Kolmogorov, Martin-Löf, Levin, Gács, Schnorr (in Europe) and Chaitin, Solovay (in America), the theory developed very fruitfully and is now named Kolmogorov complexity or Algorithmic Information Theory.

Q: So, Kolmogorov founded Probability Theory twice! In the 30's and then in the 60's.

A: Hum... In the 30's Kolmogorov axiomatized Probability Theory on the basis of measure theory, i.e. integration theory on abstract spaces. In the 60's, Kolmogorov (and also Solomonoff and Chaitin independently) founded a mathematical theory of randomness. That it could be a new basis for Probability Theory is not clear.

Q: What? Randomness would not be the natural basis for Probability Theory?

A: Random numbers are useful in different kinds of applications: simulations of natural phenomena, sampling for testing "typical cases", getting good sources of data for algorithms, ... (cf. Donald Knuth, [20], chapter 3). However, the notion of random object as a mathematical notion is presently ignored in lectures about Probability Theory. Be it for the foundations or for the development of Probability Theory, such a notion is neither introduced nor used. That's the way it is... There is a notion of random variable, but it has really nothing to do with random objects. Formally, they are just functions over some probability space.
The name "random variable" is a mere vocable to convey the underlying non-formalized intuition of randomness.

Q: That's right. I attended several courses on Probability Theory. Never heard anything precise about random objects. And, now that you tell me, I realize that there was something strange for me with random variables. So, finally, our concrete experience of chance and randomness, on which we build so much intuition, is simply removed from the formalization of Probability Theory. Hum... Somehow, it's as if the theory of computability and programming were omitting the notion of program, real programs. By the way, isn't it the case? In recursion theory, programs are reduced to mere integers: Gödel numbers!

A: Sure, recursion theory illuminates but does not exhaust the subject of programming. As concerns a new foundation of Probability Theory, it's already quite remarkable that Kolmogorov has looked at his own work on the subject with such a distance. So much as to come to a new theory: the mathematization of randomness. However, it seems (to us) that Kolmogorov has been ambiguous on the question of a new foundation. Indeed, in his first paper on the subject (1965, [23], p. 7), Kolmogorov briefly evoked that possibility:

"... to consider the use of the [Algorithmic Information Theory] constructions in providing a new basis for Probability Theory."

However, later (1983, [25], p. 35–36), he separated both topics:

"there is no need whatsoever to change the established construction of the mathematical probability theory on the basis of the general theory of measure. I am not inclined to attribute the significance of necessary foundations of probability theory to the investigations [about Kolmogorov complexity] that I am now going to survey. But they are most interesting in themselves."
though stressing the role of his new theory of random objects for mathematics as a whole ([25], p. 39):

"The concepts of information theory as applied to infinite sequences give rise to very interesting investigations, which, without being indispensable as a basis of probability theory, can acquire a certain value in the investigation of the algorithmic side of mathematics as a whole."

Q: All this is really exciting. Please, tell me about this approach to randomness.

1.2 Intuition of finite random strings and Berry's paradox

A: OK. We shall first consider finite strings. If you don't mind, we can start with an approach which actually fails but conveys the basic intuitive idea of randomness. Well, just for a while, let's say that a finite string u is random if there is no shorter way to describe u but to give the successive symbols which constitute u. Saying it otherwise, the shortest description of u is u itself, i.e. the very writing of the string u.

Q: Something to do with intensionality and extensionality?

A: You are completely right. Our tentative definition declares a finite string to be random just in case it does not carry any intensionality, so that there is no description of u but the extensional one, which is u itself.

Q: But the notion of description is somewhat vague. Is it possible to be more precise about "description" and intensionality?

A: Diverse partial formalizations are possible, for instance within any particular logical first-order structure. But they are quite far from exhausting the intuitive notion of definability. In fact, the untamed intuitive notion leads to paradoxes, much as the intuition of truth.

Q: I presume you mean the liar paradox as concerns truth.
As for definability, it should be Berry's paradox about "the smallest integer not definable in less than eleven words" — an integer which is indeed defined by this very sentence, containing only ten words.

A: Yes, these ones precisely. By the way, this last paradox was first mentioned by Bertrand Russell, 1908 ([38], p. 222 or 150), who credited G.G. Berry, an Oxford librarian, for the suggestion.

1.3 Kolmogorov complexity relative to a function

Q: And how can one get around such problems?

A: What Solomonoff, Kolmogorov and Chaitin did is a very ingenious move: instead of looking for a general notion of definability, they restricted it to computability. Of course, computability is a priori as much a vague and intuitive notion as is definability. But, as you know, since the thirties, there is a mathematization of the notion of computability.

Q: Thanks to Kurt, Alan and Alonzo.³

A: Hum... Well, with such a move, general definitions of a string u are replaced by programs which compute u.

Q: Problem: we have to admit Church's thesis.

A: OK. In fact, even if Church's thesis were to break down, the theory of computable functions would still remain as elegant a theory as you learned from Yuri and other people. It would just be a formalization of a proper part of computability, as is the theory of primitive recursive functions or elementary functions. As concerns Kolmogorov's theory, it would still hold and surely get an extension to such a new context.

³ Kurt Gödel (1906–1978), Alan Mathison Turing (1912–1954), Alonzo Church (1903–1995).

Q: But where do the programs come from? Are you considering Turing machines or some programming language?

A: Any partial computable function A : {0,1}* → {0,1}* is considered as a programming language.
The domain of A is seen as a family of programs; the value A(p) — if there is any — is the output of program p. As a whole, A can be seen both as a language to write programs and as the associated operational semantics. Now, Kolmogorov complexity relative to A is the function K_A : {0,1}* → N which maps a string x to the length of the shortest programs which output x:

Definition 1. K_A(x) = min{ |p| : A(p) = x }

(Convention: min(∅) = +∞, so that K_A(x) = +∞ if x is outside the range of A.)

Q: This definition reminds me of a discussion I had with Yuri some years ago ([19] p. 76–78). Yuri explained to me things about Levin complexity. I remember it involved time.

A: Yes. Levin complexity is a very clever variant of K which adds to the length of the program the log of the computation time needed to get the output. It's a much finer notion. We shall not consider it for our discussion about randomness. You'll find some developments in [30] §7.5.

Q: There are programs and outputs. Where are the inputs?

A: We can do without inputs. It's true that functions with no argument are not considered in mathematics, but in computer science, they are. In fact, since von Neumann, we all know that there can be as much trade-off as desired between input and program. This is indeed the basic idea behind universal machines and computers. Nevertheless, Kolmogorov [23] points out a natural role for inputs when considering conditional Kolmogorov complexity, in a sense very much alike that of conditional probabilities. To that purpose, consider a partial computable function B : {0,1}* × {0,1}* → {0,1}*. A pair (p, y) in the domain of B is interpreted as a program p together with an input y, and B(p, y) is the output of program p on input y.
Kolmogorov [23] defines the conditional complexity relative to B as the function K_B( | ) : {0,1}* × {0,1}* → N which maps strings x, y to the length of the shortest programs which output x on input y:

Definition 2. K_B(x | y) = min{ |p| : B(p, y) = x }

1.4 Why binary programs?

Q: Programs should be binary strings?

A: This is merely a reasonable restriction. Binary strings surely have some flavor of machine-level programming, but this has nothing to do with the present choice. In fact, binary strings just allow for a fairness condition. The reason is that Kolmogorov complexity deals with lengths of programs, and squaring or cubing the alphabet divides all lengths by 2 or 3, as we see when going from binary to octal or hexadecimal. So binary representation of programs is merely a way to get an absolute measure of length. If we were to consider programs p written in some finite alphabet Σ, we would have to replace the length |p| by the product |p| log(card(Σ)), where card(Σ) is the number of symbols in Σ. This is an important point when comparing Kolmogorov complexities associated to diverse programming languages, cf. §2.1.

1.5 What about other possible outputs?

Q: Outputs should also be binary strings?

A: Of course not. In Kolmogorov's approach, outputs are the finite objects for which a notion of randomness is looked for. Binary strings constitute a simple instance. One can as well consider integers, or rationals, or elements of any structure D with a natural notion of computability. The modification is straightforward: now A is a partial computable function A : {0,1}* → D and K_A : D → N is defined in the same way: K_A(x), for x ∈ D, is the minimum length of a program p such that A(p) = x.
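Definition 1 can be made concrete with a toy example. Everything below is invented for illustration — a two-instruction "programming language" A and a brute-force search — and is not part of the theory itself:

```python
from itertools import product

# A toy partial computable A : {0,1}* -> {0,1}*, read as a tiny
# (invented) programming language:
#   '0' + w  ->  output w verbatim
#   '1' + w  ->  output w doubled (w repeated twice)
def A(p):
    if p.startswith("0"):
        return p[1:]
    if p.startswith("1"):
        return p[1:] * 2
    return None  # the empty program is undefined

def K_A(x, max_len=20):
    """Brute-force K_A(x) = min{|p| : A(p) = x}; +inf if no program
    up to max_len outputs x (convention: min over the empty set is +inf)."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if A("".join(bits)) == x:
                return n
    return float("inf")

print(K_A("0101"))  # 3: the doubling program "101" beats verbatim "00101"
print(K_A("0111"))  # 5: no doubling applies, only the verbatim "00111"
```

The same brute force handles conditional complexity (Definition 2): search for the shortest p with B(p, y) = x.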
2 Optimal Kolmogorov complexity

2.1 The Invariance Theorem

Q: Well, for each partial computable function A : {0,1}* → {0,1}* (or A : {0,1}* → D, as you just explained) there is an associated Kolmogorov complexity. So, what is the Kolmogorov complexity of a given string? It depends on the chosen A.

A: Now comes the fundamental result of the theory, the so-called invariance theorem. We shall state it only for Kolmogorov complexity, but it also holds for conditional Kolmogorov complexity. Recall the enumeration theorem: partial computable functions can be enumerated in a partial computable way. This means that there exists a partial computable function E : N × {0,1}* → {0,1}* such that, for every partial computable function A : {0,1}* → {0,1}*, there is some e ∈ N for which we have ∀p A(p) = E(e, p) (equality means that A(p) and E(e, p) are simultaneously defined or not and, if defined, of course they must be equal).

Q: Wait, wait, I remember the diagonal argument which proves that there is no enumeration of the functions N → N. It goes through computability. Given E : N × N → N, the function A : N → N such that A(n) = E(n, n) + 1 is different from each one of the functions n ↦ E(e, n). And if E is computable then A is computable. So, how can there be an enumeration of computable functions?

A: There is no computable enumeration of computable functions: the diagonal argument you recalled proves that this is impossible. No way. But we are not considering computable functions; we are considering partial computable functions. This makes a big difference: the diagonal argument breaks down. In fact, the equality E(n, n) = E(n, n) + 1 is not incoherent: it just ensures that E(n, n) is not defined!

Q: Very strange property, indeed.

A: No, no.
It's quite intuitive nowadays, in our world with computers. Given a program in some fixed programming language, say LISP, an interpreter executes it. Thus, with one more argument — the simulated program — the LISP interpreter enumerates all functions which can be computed by a LISP program. Now, any partial computable function admits a LISP program. Thus, the LISP interpreter gives you a computable enumeration of the partial computable functions.

Q: OK.

A: Let's go back to the invariance theorem. We transform E into a one-argument partial computable function U : {0,1}* → {0,1}* as follows. Set

  U(0^e 1 p) = E(e, p)
  U(q) = undefined if q contains no occurrence of 1

(where 0^e is the string 00...0 of length e). Then, if A : {0,1}* → {0,1}* is partial computable and e₀ is such that ∀p A(p) = E(e₀, p), we have

  K_U(x) = min{ |q| : U(q) = x }                       (definition of K_U)
         = min{ |0^e 1 p| : U(0^e 1 p) = x }
         ≤ min{ |0^{e₀} 1 p| : U(0^{e₀} 1 p) = x }     (restriction to e = e₀)
         = min{ |0^{e₀} 1 p| : A(p) = x }              (e₀ is a code for A)
         = e₀ + 1 + min{ |p| : A(p) = x }
         = e₀ + 1 + K_A(x)                             (definition of K_A)

Let's introduce useful notations. For f, g : {0,1}* → N, let's write f ≤ g + O(1) (resp. f = g + O(1)) to mean that there exists a constant c such that ∀x f(x) ≤ g(x) + c (resp. ∀x |f(x) − g(x)| ≤ c), i.e. f is smaller than (resp. equal to) g up to an additive constant c. What we have just shown can be expressed as the following theorem, independently obtained by Kolmogorov (1965 [23] p. 5), Chaitin (1966 [5] §9–11) and Solomonoff (1964 [42] p. 12, who gives the proof as an informal argument).

Theorem 3 (Invariance theorem). K_U ≤ K_A + O(1) for any partial computable A : {0,1}* → {0,1}*. In other words, up to an additive constant, K_U is the smallest one among the K_A's.
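The construction of U can be played out concretely. In the sketch below (all names are ours), the enumeration E covers only three invented toy languages rather than all partial computable functions, which is enough to watch the bound K_U(x) ≤ K_A(x) + e + 1 hold:

```python
from itertools import product

# Three toy "programming languages", standing in for an enumeration E
# of partial computable functions (a finite, invented stand-in).
LANGUAGES = [
    lambda p: p,        # e = 0: verbatim
    lambda p: p * 2,    # e = 1: doubling
    lambda p: p[::-1],  # e = 2: reversal
]

def E(e, p):
    return LANGUAGES[e](p) if e < len(LANGUAGES) else None

def U(q):
    """U(0^e 1 p) = E(e, p); undefined (None) if q contains no '1'."""
    if "1" not in q:
        return None
    e = q.index("1")  # the number of leading zeros
    return E(e, q[e + 1:])

def K(f, x, max_len=15):
    """Brute-force min{|p| : f(p) = x}."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if f("".join(bits)) == x:
                return n
    return float("inf")

x = "0101"
doubling = LANGUAGES[1]   # A = E(1, .), i.e. e = 1
print(K(doubling, x))     # K_A(x) = 2, via program "01"
print(K(U, x))            # K_U(x) = 4 = K_A(x) + e + 1
```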
Thus, up to an additive constant, there is a smallest K_A. Of course, if K_U and K_V are both smallest up to an additive constant, then K_U = K_V + O(1). Whence the following definition.

Definition 4 (Kolmogorov complexity). Kolmogorov complexity K : {0,1}* → N is any fixed such smallest (up to an additive constant) function K_U.

Let's sum up. The invariance theorem means that, up to an additive constant, there is an intrinsic notion of Kolmogorov complexity, and we can speak of the Kolmogorov complexity of a binary string.

Q: Which is an integer defined up to a constant... Somewhat funny.

A: You witty! Statements that only make sense in the limit occur everywhere in mathematical contexts.

Q: Do not mind, I was joking.

A: In fact, Kolmogorov argued as follows about the constant ([23] p. 6):

"Of course, one can avoid the indeterminacies associated with the [above] constants, by considering particular [... functions U], but it is doubtful that this can be done without explicit arbitrariness. One must, however, suppose that the different 'reasonable' [above universal functions] will lead to 'complexity estimates' that will converge on hundreds of bits instead of tens of thousands. Hence, such quantities as the 'complexity' of the text of 'War and Peace' can be assumed to be defined with what amounts to uniqueness."

Q: Using the interpretation you mentioned a minute ago with programming languages, concerning the enumeration theorem, the constant in the invariance theorem can be viewed as the length of a LISP program which interprets A.

A: You are right.

2.2 Coding pairs of strings

A: Have you noted the trick to encode an integer e and a string p into a string 0^e 1 p?

Q: Yes, and the constant is the length of the extra part 0^e 1. But you have encoded e in unary.
Why not use binary representation and thus lower the constant?

A: There is a problem: it is not trivial to encode two binary strings u, v as a binary string w. We need a trick. But first, let's be clear: "encode" here means to apply a computable injective function {0,1}* × {0,1}* → {0,1}*. Observe that concatenation does not work: if w = uv then we don't know which prefix of w is u. A new symbol 2 inserted as a marker allows for an encoding: from w = u2v we can indeed recover u, v. However, w is no longer a binary string.

A simple solution uses this last idea together with a padding function applied to u, which allows 1 to become an end-marker. Let pad(u) be obtained by inserting a new zero in front of every symbol in u. For instance, pad(01011) = 0001000101. Now, a simple encoding w of strings u, v is the concatenation w = pad(u)1v. In fact, the very definition of pad ensures that the end of the prefix pad(u) in w is marked by the first occurrence of 1 at an odd position (obvious convention: the first symbol has position 1). Thus, from w we get pad(u) — hence u — and v in a very simple way: a finite automaton can do the job! Observe that

  |pad(u)1v| = 2|u| + |v| + 1    (1)

Q: Is the constant 2 the best one can do?

A: No, one can iterate the trick. Instead of padding u, one can pad the string bin(|u|), the binary representation of the length of u. Look at the string w = pad(bin(|u|))1uv. The first occurrence of 1 at an odd position tells you which prefix of w is pad(bin(|u|)) and which suffix is uv. From the prefix pad(bin(|u|)), you get bin(|u|), hence |u|. From |u| and the suffix uv, you get u and v. Nice trick, isn't it? And since |bin(|u|)| = 1 + ⌊log(|u|)⌋, we get

  |pad(bin(|u|))1uv| = |u| + |v| + 2⌊log(|u|)⌋ + 3    (2)

Q: Exciting! One could altogether pad the length of the length.
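Before iterating further, both encodings described above can be checked mechanically. A minimal Python sketch (the names `pad`, `encode_v1`, `decode_v1`, `encode_v2`, `decode_v2` are ours):

```python
def pad(u):
    """Insert a 0 in front of every symbol: pad('01011') = '0001000101'."""
    return "".join("0" + c for c in u)

def split_at_marker(w):
    """Split w at the first '1' in an odd (1-based) position, i.e. the
    first '1' at an even 0-based index: this ends the padded prefix."""
    i = next(i for i in range(0, len(w), 2) if w[i] == "1")
    return w[:i], w[i + 1:]

def encode_v1(u, v):
    """w = pad(u) 1 v, of length 2|u| + |v| + 1  -- equation (1)."""
    return pad(u) + "1" + v

def decode_v1(w):
    padded, v = split_at_marker(w)
    return padded[1::2], v            # drop the inserted zeros

def encode_v2(u, v):
    """w = pad(bin(|u|)) 1 u v, of length |u| + |v| + 2*floor(log|u|) + 3
    -- equation (2)."""
    return pad(format(len(u), "b")) + "1" + u + v

def decode_v2(w):
    padded, rest = split_at_marker(w)
    n = int(padded[1::2], 2)          # recover |u| from bin(|u|)
    return rest[:n], rest[n:]

u, v = "01011", "11"
assert decode_v1(encode_v1(u, v)) == (u, v)
assert decode_v2(encode_v2(u, v)) == (u, v)
print(len(encode_v1(u, v)))  # 13 = 2*5 + 2 + 1
print(len(encode_v2(u, v)))  # 14 = 5 + 2 + 2*2 + 3
```

Note how the second encoding is already shorter than the first as soon as |u| noticeably exceeds 2 log |u|.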
A: Sure. w = pad(bin(‖u‖)) 1 bin(|u|) u v is indeed an encoding of u, v, where ‖u‖ denotes |bin(|u|)|, the length of the binary representation of |u|. The first occurrence of 1 at an odd position tells you which prefix of w is pad(bin(‖u‖)) and which suffix is bin(|u|) u v. From the prefix pad(bin(‖u‖)), you get bin(‖u‖), hence ‖u‖. Now, from ‖u‖ and the suffix bin(|u|) u v, you get bin(|u|) — hence |u| — and uv. From |u| and uv, you get u and v. Also, a simple computation (the padded prefix contributes 2(1 + ⌊log(1 + ⌊log(|u|)⌋)⌋), the marker contributes 1, and bin(|u|) contributes 1 + ⌊log(|u|)⌋) leads to

  |pad(bin(‖u‖)) 1 bin(|u|) u v| = |u| + |v| + ⌊log(|u|)⌋ + 2⌊log(1 + ⌊log(|u|)⌋)⌋ + 4    (3)

Q: Of course, we can iterate this process.

A: Right. But let's leave such refinements.

2.3 Non determinism

Q: Our problematic is about randomness. Chance, randomness, arbitrariness, unreasoned choice, non determinism... Why not add randomness to programs by making them non deterministic, with several possible outputs?

A: Caution: if a single program can output every string then Kolmogorov complexity collapses. In order to get a non-trivial theory, you need to restrict non determinism. There are a lot of reasonable ways to do so. It happens that all of them lead to something which is essentially the usual Kolmogorov complexity, up to some change of scale ([41], [18]). Same with the prefix Kolmogorov complexity, which we shall discuss later.

3 How complex is Kolmogorov complexity?

Q: Well, let me tell you some points I see about K_A. The domain of K_A appears to be the range of A, so that K_A is total in case A is onto. Since there are finitely many programs p with length ≤ n, there can be only finitely many x's such that K_A(x) ≤ n, so that lim_{|x|→+∞} K_A(x) = +∞. Also, in the definition of K_A(x), there are two points: 1) find some program which outputs x; 2) make sure that all programs with shorter length either do not halt or have an output different from x. Point 2 does not match with definitions of partial computable functions!
3.1 Approximation from above

A: Right. In general, K_A is not partial computable.

Q: So, no way to compute K_A.

A: Definitely not, in general. However, K_A can be approximated from above:

Proposition 5. K_A is the limit of a computable decreasing sequence of functions. Moreover, we can take such a sequence of functions with finite domains.

To see this, fix an algorithm computing A and denote by A_t the partial function obtained by applying up to t steps of that algorithm for the sole programs with length ≤ t. It is clear that (t, p) ↦ A_t(p) has a computable graph. Also,

  K_{A_t}(x) = min{ |p| : p ∈ {0,1}^{≤t} and A_t(p) = x }

so that (t, x) ↦ K_{A_t}(x) has a computable graph too. To conclude, just observe that (t, x) ↦ K_{A_t}(x) is decreasing in t (with the obvious convention that undefined = +∞) and that K_A(x) = lim_{t→∞} K_{A_t}(x). The same is true for conditional Kolmogorov complexity K_B( | ).

Q: If K_A is not computable, there should be no computable modulus of convergence for this approximation sequence. So what can it be good for?

A: In general, if a function f : {0,1}* → N can be approximated from above by a computable sequence of functions (f_t)_{t∈N} then

  X^f_n = { x : f(x) ≤ n }

is computably enumerable (in fact, both properties are equivalent). This is a very useful property of f. Indeed, such arguments are used in the proofs of some hard theorems in the subject of Kolmogorov complexity.

Q: Could you give me the flavor of what it can be useful for?

A: Suppose you know that X^f_n is finite (which is indeed the case for f = K_A) and has exactly m elements; then you can explicitly get X^f_n.

Q: Explicitly get a finite set? What do you mean?
A: What we mean is that there is a computable function which associates to any m, n a code (in whatever modelization of computability) for a partial computable function whose range is X^f_n in case m is equal to the number of elements of X^f_n. This is not trivial. We do this thanks to the f_t's. Indeed, compute the f_t(x)'s for all t's and all x's until you get m different strings x_1, ..., x_m such that f_{t_1}(x_1), ..., f_{t_m}(x_m) are defined and ≤ n for some t_1, ..., t_m. That you will get such x_1, ..., x_m is ensured by the fact that X^f_n has at least m elements and that f(x) = min{ f_t(x) : t } for all x. Since f ≤ f_t, surely these x_i's are in X^f_n. Moreover, they indeed constitute the whole of X^f_n, since X^f_n has exactly m elements.

3.2 Dovetailing

Q: You run infinitely many computations, some of which never halt. How do you manage them?

A: This is called dovetailing. You organize these computations (which are infinitely many, some lasting forever) as follows:
— Do up to 1 computation step of f_t(x) for 0 ≤ t ≤ 1 and 0 ≤ |x| ≤ 1,
— Do up to 2 computation steps of f_t(x) for 0 ≤ t ≤ 2 and 0 ≤ |x| ≤ 2,
— Do up to 3 computation steps of f_t(x) for 0 ≤ t ≤ 3 and 0 ≤ |x| ≤ 3,
...

Q: Somehow looks like Cantor's enumeration of N² as the sequence (0,0) (0,1) (1,0) (0,2) (1,1) (2,0) (0,3) (1,2) (2,1) (3,0) ...

A: This is really the same idea. Here, it would rather be an enumeration à la Cantor of N³. When dealing with a multi-indexed family of computations (φ_t(x))_{t,x}, you can imagine computation steps as tuples of integers (i, t, x), where i denotes the rank of some computation step of f_t(x) (here, these tuples are triples). Dovetailing is just a way of enumerating all points of a discrete multidimensional space N^k via some zigzagging à la Cantor.
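Proposition 5 and the dovetailing scheme can be sketched together. Below, an invented toy language makes "computation steps" explicit, so that the approximations K_{A_t} are visibly decreasing toward K_A, and the loop over all (program, step-bound) pairs is a finite analogue of dovetailing (all names are ours):

```python
from itertools import product

def A_with_steps(p):
    """A toy A where running program p 'costs' steps:
    '0'+w outputs w after |p| steps; '1'+w outputs w*2 after 2|p| steps."""
    if p.startswith("0"):
        return p[1:], len(p)
    if p.startswith("1"):
        return p[1:] * 2, 2 * len(p)
    return None

def A_t(t, p):
    """A cut off at t steps, for programs of length <= t only;
    None plays the role of 'undefined'."""
    if len(p) > t:
        return None
    r = A_with_steps(p)
    return r[0] if r is not None and r[1] <= t else None

def K_A_t(t, x):
    """K_{A_t}(x): computable, decreasing in t, with limit K_A(x)."""
    best = float("inf")
    # Visiting every program of length <= t under the step bound t is
    # the finite analogue of dovetailing: every (program, step-bound)
    # pair is eventually examined.
    for n in range(t + 1):
        for bits in product("01", repeat=n):
            if A_t(t, "".join(bits)) == x:
                best = min(best, n)
    return best

print([K_A_t(t, "0101") for t in range(9)])
# [inf, inf, inf, inf, inf, 5, 3, 3, 3] -- decreasing, with limit K_A = 3
```

At t = 5 only the slow verbatim program "00101" has halted; at t = 6 the shorter doubling program "101" finishes and the approximation drops to its limit.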
Q: Well, Cantor wandering along a broken line which fills the discrete plane here becomes a way to sequentialize parallel computations.

3.3 Undecidability

Q: Let's go back to K_A. If A is onto then K_A is total. So if A is also computable then K_A should be computable too: to get K_A(x), just compute all A(p)'s for increasing |p|'s until some has value x — this will happen since A is onto. You said K_A is in general undecidable. Is this undecidability related to the fact that A is partial computable and not computable?

A: No. It's possible that K_A be quite trivial with A as complex as you want. Let f : {0,1}* → N be any partial computable function. Set A(0x) = x and A(1^{1+f(x)} 0 x) = x. Then A is as complex as f, though K_A is trivial, since K_A(x) = |x| + 1, as is easy to check.

Q: Is it hard to prove that some K_A is indeed not computable?

A: Not that much. If U : {0,1}* → {0,1}* is optimal then we can show that K_U is not computable. Thus K (which is K_U for some fixed optimal U) is not computable. And this is where Berry's paradox comes back. Consider the length-lexicographic order on binary strings: u <_hier v if and only if |u| < |v|, or |u| = |v| and u is lexicographically before v. Now, look, we come to the core of the argument. The key idea is to introduce the function T : N → {0,1}* defined as follows:

  T(i) = the <_hier smallest x such that K_U(x) > i

As you see, this function is nothing but an implementation of the very statement in Berry's paradox, modified according to Kolmogorov's move from definability to computability. Clearly, we have

  K_U(T(i)) > i    (4)

Suppose, by way of contradiction, that K_U is computable.
Then so is T, and so is the function V : {0,1}* → {0,1}* such that V(p) = T(Val_2(1p)), where Val_2(1p) is the integer with binary representation 1p. Now, if i > 0 has binary representation 1p then T(i) = V(p), so that

K_V(T(i)) ≤ |p| = ⌊log(i)⌋    (5)

The Invariance Theorem ensures that, for some c, we have

K_U ≤ K_V + c    (6)

From inequalities (4), (5), (6) we get

i < K_U(T(i)) ≤ K_V(T(i)) + c ≤ log(i) + c

which is a contradiction for i large enough since lim_{i→+∞} log(i)/i = 0. Thus, our assumption that K_U is computable is false.

3.4 No nontrivial computable lower bound

Q: Quite a nice argument.

A: One can get much more out of it:

Theorem 6. 1) No restriction of K_U to an infinite computable set is computable.
2) Worse: if X ⊆ {0,1}* is computable and f : X → N is a computable function such that f(x) ≤ K_U(x) for all x ∈ X, then f is bounded!

To prove this, just change the above definition of T : N → {0,1}* as follows:

T(i) = the <_hier smallest x ∈ X such that f(x) > i

Clearly, by definition, we have f(T(i)) > i. Since T(i) ∈ X and f(x) ≤ K_U(x) for x ∈ X, this implies inequality (4) above. Also, f being computable, so are T and V, and inequality (5) still holds. As above, we arrive at a contradiction.

Let's reformulate this result in terms of the greatest monotone (with respect to ≤_hier) lower bound of K_U, which is

m(x) = min_{y ≥_hier x} K_U(y)

This function m is monotone and tends to +∞, but it does so incredibly slowly: on any computable set it cannot grow as fast as any unbounded computable function.

3.5 Kolmogorov complexity and representation of objects

Q: You have considered integers and their base 2 representations. The complexity of algorithms often depends heavily on the way objects are represented.
Here, you have not been very precise about the representation of integers.

A: There is a simple fact.

Proposition 7. Let f : {0,1}* → {0,1}* be partial computable.
1) K(f(x)) ≤ K(x) + O(1) for every x in the domain of f.
2) If f is moreover injective then K(f(x)) = K(x) + O(1) for x ∈ domain(f).

Indeed, denote by U some fixed universal function such that K = K_U. To get a program which outputs f(x), we just encode a program π computing f together with a program p outputting x. Formally, let A : {0,1}* → {0,1}* be such that A(pad(π)1z) is the output of π on input U(z) for all z ∈ {0,1}*. Clearly, A(pad(π)1p) = f(x), so that K_A(f(x)) ≤ 2|π| + |p| + 1. Taking p such that K(x) = |p|, we get K_A(f(x)) ≤ K(x) + 2|π| + 1. The Invariance Theorem ensures that K(f(x)) ≤ K_A(f(x)) + O(1), whence point 1 of the Proposition. In case f is injective, it has a partial computable inverse g whose domain is the range of f. Applying point 1 to f and g, we get point 2.

Q: So all representations of integers lead to the same Kolmogorov complexity, up to a constant.

A: Yes, as long as one can computably go from one representation to the other.

4 Algorithmic Information Theory

4.1 Zip/Unzip

Q: A moment ago, you said the subject was also named Algorithmic Information Theory. Why?

A: Well, you can look at K(x) as a measure of the information contents that x conveys. The notion can also be vividly described using our everyday use of compression/decompression software (cf. Alexander Shen's lecture [40]). First, notice the following simple fact:

Proposition 8. K(x) ≤ |x| + O(1)

Indeed, let A(x) = x. Then K_A(x) = |x|, and the above inequality is a mere application of the Invariance Theorem.
Looking at the string x as a file, any program p such that U(p) = x can be seen as a compressed file for x (especially in case the right member in Proposition 8 is indeed < |x|...). So U appears as a decompression algorithm which maps the compressed file p onto the original file x. In this way, K(x) measures the length of the shortest compressed files for x. What does compression do? It eliminates redundancies and makes regularities explicit so as to shorten the file. Thus, maximum compression reduces the file to the core of its information contents, which is therefore measured by K(x).

4.2 Some relations in Algorithmic Information Theory

Q: OK. And what does Algorithmic Information Theory look like? Conditional complexity should give some nice relations, as is the case with conditional probability.

A: Yes, there are relations with some probability-theory flavor. However, there are often logarithmic extra terms which come from the encoding of pairs of strings. For instance, an easy relation:

K(x) ≤ K(x|y) + K(y) + 2 log(min(K(x|y), K(y))) + O(1)    (7)

The idea behind this relation is as follows. Suppose you have a program p (with no parameter) which outputs y and a program q (with one parameter) which on input y outputs x; then you can mix them to get a (no parameter) program which outputs x. Formally, suppose that p, q are optimal, i.e. K(y) = |p| and K(x|y) = |q|. Let A_1, A_2 : {0,1}* → {0,1}* be such that

A_1(pad(|z|)1zw) = A_2(pad(|w|)1zw) = V(w, U(z))

where V denotes some fixed universal function such that K(·|·) = K_V(·|·).
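The zip/unzip reading suggests a concrete (if crude) experiment, not from the paper: the length of a real compressed file is a computable upper-bound proxy for K(x), up to the constants hidden in the Invariance Theorem. Since K itself is not computable (§3.3), an actual compressor can only over-estimate it. Here zlib stands in for the decompression algorithm U.

```python
import random
import zlib

def compressed_len(data: bytes) -> int:
    """Length of a zlib-compressed version of `data`: a crude,
    computable stand-in for an upper bound on K(data)."""
    return len(zlib.compress(data, 9))

# A highly regular string: its redundancy should be squeezed out.
regular = b"ab" * 5000

# A fixed pseudo-random string: little exploitable regularity,
# so compression should gain essentially nothing.
rng = random.Random(0)
incompressible_ish = bytes(rng.randrange(256) for _ in range(10000))
```

The regular file shrinks to a tiny fraction of its length, while the pseudo-random one does not compress at all — exactly the contrast between compressible and incompressible strings the text describes.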
It is clear that A_1(pad(|p|)1pq) = A_2(pad(|q|)1pq) = x, so that

K_{A_1}(x) ≤ |p| + |q| + 2 log(|p|) + O(1)
K_{A_2}(x) ≤ |p| + |q| + 2 log(|q|) + O(1)

whence, p and q being optimal programs,

K_{A_1}(x) ≤ K(y) + K(x|y) + 2 log(K(y)) + O(1)
K_{A_2}(x) ≤ K(y) + K(x|y) + 2 log(K(x|y)) + O(1)

Applying the Invariance Theorem, we get (7).

4.3 Kolmogorov complexity of pairs

Q: What about pairs of strings, in the vein of the probability of a pair of events?

A: First, we have to define the Kolmogorov complexity of pairs of strings. The key fact is as follows:

Proposition 9. If f, g : {0,1}* × {0,1}* → {0,1}* are encodings of pairs of strings (i.e. computable injections), then K(f(x,y)) = K(g(x,y)) + O(1).

As we always argue up to an additive constant, this leads to:

Definition 10. The Kolmogorov complexity of pairs is K(x,y) = K(f(x,y)) where f is any fixed encoding.

To prove Proposition 9, observe that f ∘ g⁻¹ is a partial computable injection such that f = (f ∘ g⁻¹) ∘ g. Then apply Proposition 7 with argument g(x,y) and function f ∘ g⁻¹.

4.4 Symmetry of information

A: Relation (7) can easily be improved to

K(x,y) ≤ K(x|y) + K(y) + 2 log(min(K(x|y), K(y))) + O(1)    (8)

The same proof works. Just observe that from either of the programs pad(|p|)1pq and pad(|q|)1pq one can recover both p and q, hence y and then x. Now, (8) can be considerably improved:

Theorem 11. |K(x,y) − K(x|y) − K(y)| = O(log(K(x,y)))

This is a hard result, independently obtained by Kolmogorov and Levin around 1967 ([24], [50] p. 117). We had better skip the proof (you can find it in [40] p. 6–7 or [30] Thm 2.8.2 p. 182–183).

Q: I don't really see the meaning of that theorem.

A: Let's restate it in another form.

Definition 12.
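The self-delimiting mixing pad(|p|)1pq used in the proofs of (7) and (8) can be made concrete. The paper's exact `pad` convention is defined earlier in the text and may differ; the sketch below uses one standard choice (write |p| in binary with every bit doubled, closed by "01"), which has the same 2 log|p| + O(1) overhead that produces the logarithmic term in relation (7).

```python
def encode_pair(p: str, q: str) -> str:
    """Mix two binary programs p and q into one self-delimiting string,
    in the spirit of pad(|p|)1pq: the length of p is written in binary
    with each bit doubled and closed by '01', so a reader knows where
    p ends and q begins.  Overhead: 2*log|p| + O(1) bits."""
    header = "".join(b + b for b in bin(len(p))[2:]) + "01"
    return header + p + q

def decode_pair(code: str) -> tuple:
    """Inverse of encode_pair: recover (p, q) from the mixed program."""
    i, length_bits = 0, ""
    while code[i] == code[i + 1]:       # doubled bits carry the length of p
        length_bits += code[i]
        i += 2
    i += 2                              # skip the '01' terminator
    n = int(length_bits, 2)
    return code[i:i + n], code[i + n:]
```

Because both p and q are recoverable from the mixed string, a machine can run p to get y and then q on y to get x — which is exactly how A_1 and A_2 work in the proof.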
I(x : y) = K(y) − K(y|x) is called the algorithmic information about y contained in x.

This notion is quite intuitive: you take the difference between the whole information contents of y and that when x is known for free. Contrary to what was expected in analogy with Shannon's classical information theory, this is not a symmetric function. However, up to a logarithmic term, it is symmetric:

Corollary 13. |I(x : y) − I(y : x)| = O(log(K(x,y)))

For a proof, just apply Theorem 11 to K(x,y) and K(y,x) and observe that K(x,y) = K(y,x) + O(1) (use Proposition 9).

5 Kolmogorov complexity and Logic

5.1 What to do with paradoxes

Q: Somehow, Solomonoff, Kolmogorov and Chaitin have built up a theory from a paradox.

A: Right. In fact, there seem to be two mathematical ways to deal with paradoxes. The most natural one is to get rid of them by building secured and delimited mathematical frameworks which leave them all out (at least, we hope so...). Historically, this was the way followed in all sciences. A second way, which came up in the 20th century, somehow integrates paradoxes into scientific theories via some clever and sound (!) use of the ideas they convey. Kolmogorov complexity is such a remarkable integration of Berry's paradox into mathematics.

Q: As Gödel did with the liar paradox, which underlies his incompleteness theorems. Can we compare these paradoxes?

A: Hard question. The liar paradox is about truth while Berry's is about definability. Viewed in computational terms, truth and definability somehow correspond to denotational and operational semantics. This leads one to expect connections between incompleteness theorems à la Gödel and Kolmogorov's investigations.

5.2 Chaitin incompleteness results

Q: So, incompleteness theorems can be obtained from Kolmogorov's theory?
A: Yes. Gregory Chaitin, 1971 [7], pointed out and popularized a simple but clever and spectacular application of Kolmogorov complexity (this original paper by Chaitin did not consider K but the number of states of Turing machines, which is much the same). Let T be a computable theory containing Peano arithmetic such that all axioms of T are true statements.

Theorem 14. There exists a constant c such that if T proves K(x) ≥ n then n ≤ c.

The proof is by way of contradiction and is a redo of the proof of the undecidability of K given in §3.3. Suppose that T can prove statements K(x) ≥ n for arbitrarily large n's. Consider a computable enumeration of all theorems of T and let f : N → {0,1}* be such that f(n) is the first string x such that K(x) ≥ n appears as a theorem of T. Our hypothesis ensures that f is total, hence a computable function. By its very definition,

K(f(n)) ≥ n    (9)

Also, applying Propositions 7 and 8, we get

K(f(n)) ≤ K(n) + O(1) ≤ log(n) + O(1)    (10)

whence n ≤ log(n) + O(1), which is a contradiction if n is large enough.

Q: Quite nice. But this does not give any explicit statement. How to compute the constant c? How to get any explicit x's such that K(x) > c?

A: Right. Hum... you could also see this as a particularly strong form of incompleteness: you have a very simple infinite family of statements, of which only finitely many can be proved, but you don't know which ones.

5.3 Logical complexity of K

A: By the way, there is a point we should mention concerning the logical complexity of Kolmogorov complexity. Since K is total and not computable, its graph cannot be computably enumerable (c.e.). However, the graph of any K_A (hence that of K) is always of the form R ∩ S where R is c.e. and S is co-c.e. (i.e. the complement of a c.e. relation). We can see this as follows.
Fix an algorithm P for A and denote by A_t the partial function obtained by applying at most t computation steps of this algorithm. Then

K_A(x) ≤ n ⇔ ∃t (∃p ∈ {0,1}^(≤n)  A_t(p) = x)

The relation within parentheses is computable in t, n, x, so K_A(x) ≤ n is c.e. in n, x. Replacing n by n − 1 and passing to negations, we see that K_A(x) ≥ n is co-c.e. Since K_A(x) = n ⇔ (K_A(x) ≤ n) ∧ (K_A(x) ≥ n), we conclude that K_A(x) = n is the intersection of a c.e. and a co-c.e. relation. In terms of Post's hierarchy, the graph of K_A is Σ^0_1 ∧ Π^0_1, hence Δ^0_2. The same holds for the conditional K_B(·|·).

Q: Would you remind me about Post's hierarchy?

A: Emil Post introduced families of relations R(x_1, ..., x_m) on strings and/or integers. Let's look at the first two levels: Σ^0_1 and Π^0_1 are the respective families of c.e. and co-c.e. relations, Σ^0_2 is the family of projections of Π^0_1 relations, and Π^0_2 consists of the complements of Σ^0_2 relations. The notations Σ^0_i and Π^0_i come from the following logical characterizations:

R(x⃗) is Σ^0_1 if R(x⃗) ⇔ ∃t_1 ... ∃t_k T(t⃗, x⃗) with T computable.
R(x⃗) is Σ^0_2 if R(x⃗) ⇔ ∃t⃗ ∀u⃗ T(t⃗, u⃗, x⃗) with T computable.

Π^0_1 and Π^0_2 are defined similarly with quantifications ∀ and ∀∃. Each of these families is closed under union and intersection, but not under complementation, since complementation exchanges Σ^0_i and Π^0_i. A last notation: Δ^0_i denotes Σ^0_i ∩ Π^0_i. In particular, Δ^0_1 means c.e. and co-c.e., hence computable. As for inclusion, Δ^0_2 strictly contains the boolean closure of Σ^0_1; in particular, it contains Σ^0_1 ∪ Π^0_1. This is why the term hierarchy is used. Also, we see that K_A is quite low as a Δ^0_2 relation, since Σ^0_1 ∧ Π^0_1 is the very first level of the boolean closure of Σ^0_1.
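The step-bounded functions A_t also give the "approximation from above" of §3.1: after t steps of search, the best candidate for K_A(x) can only decrease as t grows. A toy sketch, with an invented language A (an assumption of this sketch, not the paper's) in which long programs halt quickly and short ones halt slowly, so the upper bound visibly improves with t:

```python
from itertools import product

def A_t(p: str, t: int):
    """Toy A run for at most t steps (a stand-in for 'apply at most t
    computation steps of a fixed algorithm for A'):
      - programs '110' + x output x after 1 step (fast but long),
      - programs '0' + x output x after 50 steps (short but slow),
      - all other programs never halt.
    Returns the output, or None if not yet halted within t steps."""
    if p.startswith("110"):
        return p[3:] if t >= 1 else None
    if p.startswith("0"):
        return p[1:] if t >= 50 else None
    return None

def K_A_upper(x: str, t: int, max_len=12):
    """Current approximation from above of K_A(x) after t steps: length
    of the shortest program (up to max_len) seen to output x within t
    steps, or None if none has halted yet.  Non-increasing in t."""
    for n in range(max_len + 1):
        for bits in product("01", repeat=n):
            if A_t("".join(bits), t) == x:
                return n
    return None
```

This also makes the Σ^0_1 character of "K_A(x) ≤ n" visible: it is witnessed by a pair (t, p) that a search will eventually find whenever the statement is true.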
6 Random finite strings and their applications

6.1 Random versus how much random

Q: Let's go back to the question: "what is a random string?"

A: This is the interesting question, but it will not be the one we shall answer. We shall modestly consider the question: "To what extent is x random?" We know that K(x) ≤ |x| + O(1). It is tempting to declare a string x random if K(x) ≥ |x| − O(1). But what does this really mean? The O(1) hides a constant. Let's make it explicit.

Definition 15. A string x is called c-incompressible (where c ≥ 0 is any constant) if K(x) ≥ |x| − c. Other strings are called c-compressible. 0-incompressible strings are also called incompressible.

Q: Are there many c-incompressible strings?

A: Kolmogorov noticed that they are quite numerous.

Theorem 16. For each n, the proportion of c-incompressible strings among strings of length n is > 1 − 2^(−c).

For instance, if c = 4 then, for any length n, more than 90% of strings are 4-incompressible. With c = 7 and c = 10 we go to more than 99% and 99.9%. The proof is a simple counting argument. There are 1 + 2 + 2² + ... + 2^(n−c−1) = 2^(n−c) − 1 programs of length < n − c. Every string of length n which is c-compressible is necessarily the output of such a program (though some of these programs may not halt or may output a string of length ≠ n). Thus, there are at most 2^(n−c) − 1 c-compressible strings of length n, hence at least 2^n − (2^(n−c) − 1) = 2^n − 2^(n−c) + 1 c-incompressible strings of length n. Whence the proportion stated in the theorem.

Q: Are c-incompressible strings really random?

A: Yes. Martin-Löf, 1965 [34], formalized the notion of statistical test and proved that incompressible strings pass all these tests (cf. §8.3).

6.2 Applications of random finite strings in computer science

Q: And what is the use of incompressible strings in computer science?
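The counting in Theorem 16 is exact combinatorics, so it can be checked mechanically. A short sketch reproducing the bound of the text:

```python
def incompressible_lower_bound(n: int, c: int):
    """Counting argument of Theorem 16: at most 2^(n-c) - 1 programs
    have length < n - c, so at most that many strings of length n are
    c-compressible.  Returns (guaranteed number of c-incompressible
    strings of length n, guaranteed proportion among all 2^n strings)."""
    programs = sum(2 ** k for k in range(n - c))   # 1 + 2 + ... + 2^(n-c-1)
    assert programs == 2 ** (n - c) - 1            # closed form used in the text
    incompressible = 2 ** n - programs
    return incompressible, incompressible / 2 ** n

count, proportion = incompressible_lower_bound(20, 4)
```

For n = 20 and c = 4, 7, 10 the computed proportions land just above 1 − 2^(−c), matching the 90%, 99% and 99.9% figures quoted in the text.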
A: Roughly speaking, incompressible strings are strings without any form of local or global regularity. Considering such objects may help whenever one has to show that something is complex, for instance a lower bound for worst-case or average-case time/space complexity. The accompanying key tool is Proposition 7. And, indeed, incompressible strings have been successfully used in such contexts. An impressive compilation of such applications can be found in Ming Li and Paul Vitanyi's book ([30], chapter 6), running through nearly 100 pages!

Q: Could you give an example?

A: Sure. The very first such application is quite representative. It is due to Wolfgang Paul, 1979 [37], and gives a quadratic lower bound on the computation time of any one-tape Turing machine M which recognizes palindromes. Up to a linear waste of time, one can suppose that M always halts on its first cell. Let n be even and let xx^R = x_1 x_2 ... x_{n−1} x_n x_n x_{n−1} ... x_2 x_1 be a palindrome written on the input tape of the Turing machine M. For each i < n, let CS_i be the crossing sequence associated to cell i, i.e. the list of successive states of M when its head visits cell i.

Key fact: the string x_1 x_2 ... x_i is uniquely determined by CS_i. I.e., x_1 x_2 ... x_i is the sole string y such that — relative to an M-computation on some palindrome with prefix y — the crossing sequence on cell |y| is CS_i.

This can be seen as follows. Suppose y ≠ x_1 x_2 ... x_i leads to the same crossing sequence CS_i on cell |y| for an M-computation on some palindrome y z z^R y^R. Run M on input y x_{i+1} ... x_n x^R. Consider the behaviour of M while the head is on the left part y.
This behaviour is exactly the same as for the run on input y z z^R y^R, because the sole useful information available to M while scanning y comes from the crossing sequence at cell |y|. In particular, M — which halts on cell 1 — accepts this input y x_{i+1} ... x_n x^R. But this is not a palindrome! Contradiction.

Observe that the way x_1 x_2 ... x_i is uniquely determined by CS_i is quite complex. But we don't care about that: it will just charge the O(1) constant in (11). Using Proposition 7 with the binary string associated to CS_i, which is c times longer (where c = ⌈log |Q|⌉, |Q| being the number of states), we see that

K(x_1 x_2 ... x_i) ≤ c |CS_i| + O(1)    (11)

If i ≥ n/2 then x_1 x_2 ... x_{n/2} is uniquely determined by the pair (x_1 x_2 ... x_i, n/2), hence also by the pair (CS_i, n/2). Since the binary representation of n/2 uses ≤ log(n) bits, this pair can be encoded with 2c|CS_i| + log(n) + 1 bits. Thus,

K(x_1 x_2 ... x_{n/2}) ≤ 2c |CS_i| + log(n) + O(1)    (12)

Now let's sum inequalities (12) for i = n/2, ..., n. Observe that the sum of the lengths of the crossing sequences CS_{n/2}, ..., CS_n is at most the number T of computation steps. Therefore, this summation leads to

(n/2) K(x_1 x_2 ... x_{n/2}) ≤ 2cT + (n/2) log(n) + O(n/2)    (13)

Now, take as x a string such that x_1 x_2 ... x_{n/2} is incompressible, i.e. K(x_1 x_2 ... x_{n/2}) ≥ n/2. Inequality (13) leads to

(n/2)² ≤ 2cT + (n/2) log(n) + O(n/2)    (14)

whence T = Ω(n²). Since the input xx^R has length 2n, this proves the quadratic lower bound. QED

7 Prefix complexity

7.1 Self-delimiting programs

Q: I heard about prefix complexity. What is it?

A: Prefix complexity is a very interesting variant of Kolmogorov complexity which was introduced around 1973 by Levin [28] and, independently, by Chaitin [8].
The basic idea is taken from those programming languages which have an explicit delimiter to mark the end of a program. For instance, PASCAL uses "end.". Thus, no program can be a proper prefix of another program.

Q: This is not true of PROLOG programs: you can always add a new clause.

A: To execute a PROLOG program, you have to write down a query, and the end of a query is marked by a full stop. So it's also true for PROLOG.

Q: OK. However, it's not true for C programs nor LISP programs.

A: Hum... You are right.

7.2 Chaitin-Levin prefix complexity

A: Let's say that a set X of strings is prefix-free if no string in X is a proper prefix of another string in X. A programming language A : {0,1}* → {0,1}* is prefix if its domain is a prefix-free set.

Q: So the programming language PASCAL that you seem to be fond of is prefix.

A: Sure, PASCAL is prefix.

Q: But what's new with this special condition?

A: Kolmogorov's Invariance Theorem from §2.1 goes through with prefix programming languages, leading to the prefix variant H of K.

Theorem 17 (Invariance theorem). There exists a prefix partial computable function U_prefix : {0,1}* → {0,1}* such that K_{U_prefix} ≤ K_A + O(1) for any prefix partial computable function A : {0,1}* → {0,1}*. In other words, up to an additive constant, K_{U_prefix} is the smallest one among the K_A's.

Definition 18 (Prefix Kolmogorov complexity). Prefix Kolmogorov complexity H : {0,1}* → N is any such fixed function K_{U_prefix}.

7.3 Comparing K and H

Q: How does H compare to K?

A: A simple relation is as follows:

Proposition 19. K(x) − O(1) ≤ H(x) ≤ K(x) + 2 log(K(x)) + O(1). Idem with K(·|·) and H(·|·).

The first inequality is a mere application of the Invariance Theorem for K (since U_prefix is a programming language).
To get the second one, we consider a programming language U such that K = K_U and construct a prefix programming language U′ as follows: the domain of U′ is the set of strings of the form pad(|p|)1p, and U′(pad(|p|)1p) = U(p). By its very construction, the domain of U′ is prefix-free. Also, K_{U′}(x) = K_U(x) + 2 log(K_U(x)) + 1. An application of the Invariance Theorem for H gives the second inequality of the Proposition. This inequality can be improved: a better encoding leads to

H(x) ≤ K(x) + log(K(x)) + 2 log log(K(x)) + O(1)

Sharper relations have been proved by Solovay, 1975 (unpublished [43], cf. also [30] p. 211):

Proposition 20.
H(x) = K(x) + K(K(x)) + O(K(K(K(x))))
K(x) = H(x) − H(H(x)) − O(H(H(H(x))))

7.4 How big is H?

Q: How big is H?

A: K and H behave in similar ways. Nevertheless, there are some differences — essentially a logarithmic term.

Proposition 21. H(x) ≤ |x| + 2 log(|x|) + O(1)

To prove it, apply the Invariance Theorem for H to the prefix function A(pad(|x|)1x) = x. Of course, it can be improved to

H(x) ≤ |x| + log(|x|) + 2 log log(|x|) + O(1)

Q: How big can H(x) − |x| be?

A: Well, to get a nontrivial question, we have to fix the length of the x's. The answer is not a simple function of |x| as one might expect; it involves H itself:

max_{|x|=n} (H(x) − |x|) = H(n) + O(1)

Q: How big can H(x) − K(x) be?

A: It can be quite large:

K(x) ≤ |x| − log(|x|) ≤ |x| ≤ H(x)

happens for arbitrarily large x's ([30], Lemma 3.5.1, p. 208).

7.5 Convergence of series and the Coding Theorem

Q: What's so special about this prefix condition?

A: The possibility of using Kraft's inequality. This inequality tells you that if Z is a prefix-free set of strings then Σ_{p∈Z} 2^(−|p|) ≤ 1.
Kraft’s inequalit y is n ot hard to prov e. Denote I u the set of infi nite s trings whic h admit u as prefix. Ob s erv e that 26 1) 2 −| p | is th e probabilit y of I u . 2) If u, v are prefix incomparable then I u and I v are d isj oin t. 3) S in ce Z is prefix, the I u ’s, u ∈ Z are pairwise d isjoin t and their u nion has probabilit y Σ p ∈ Z 2 −| p | < 1 The K A ( x )’s are lengthes of d istinct programs in a pr efix set (namely , the domain of A ). So, Kr aft’s inequalit y im p lies Σ x ∈{ 0 , 1 } ∗ 2 − K A ( x ) < 1. In fact, H satisfies the follo wing very imp ortant p rop erty , prov ed by Levin [29] (w h ic h can b e seen as another version of the Inv ariance Theorem for H ): Theorem 22 (Co din g Theorem) . Up to a multiplic ative factor, 2 − H is maximum among f unctions F : { 0 , 1 } ∗ → R such that Σ x ∈{ 0 , 1 } ∗ F ( x ) < + ∞ and which ar e appr oximable fr om b elow (in a sense dual to that in § 3.1, i.e. the set of p airs ( x, q ) such that q is r ational and q < F ( x ) is c.e.). 8 Random infinite sequences 8.1 T op-do wn approac h to randomness of infinite sequences Q : S o, we now come to r andom infi nite sequences. A : I t happ ens that th ere are t wo equiv alen t wa ys to get a mathematical notion of random sequences. W e sh all first consider the most n atural one, whic h is a sort of “top-do wn approac h”. Probabilit y la ws tell you that with probabilit y one such and su c h thin gs happ en , i.e. that some particular set of sequences has pr ob ability one. A natural approac h leads to consider as random those sequences wh ich satisfy all su ch la ws, i.e. b elong to the asso ciated sets (wh ic h ha v e probabilit y one). An easy wa y to r ealize this wo u ld b e to declare a sequence to b e rand om just in case it b elongs to all sets (of sequences) h a ving pr obabilit y one or, equiv alen tly , to no set having p robabilit y zero. 
Said otherwise, the family of random sequences would be the intersection of all sets having probability one, i.e. the complement of the union of all sets having probability zero. Unfortunately, this family is empty! In fact, let r be any sequence: the singleton set {r} has probability zero and contains r. In order to save the idea, we have to consider a not-too-big family of sets with probability one.

Q: A countable family.

A: Right. The intersection of a countable family of sets with probability one has probability one. So the set of random sequences will have probability one, which is a much expected property.

8.2 Frequency tests and von Mises random sequences

A: This top-down approach was pioneered by Richard von Mises in 1919 ([48], [49]), who insisted on frequency statistical tests. He declared an infinite binary sequence a_0 a_1 a_2 ... to be random (he used the term Kollektiv) if the frequency of 1's is "everywhere" fairly distributed, in the following sense:
i) Let S_n be the number of 1's among the first n terms of the sequence. Then lim_{n→∞} S_n/n = 1/2.
ii) The same is true for every subsequence a_{n_0+1} a_{n_1+1} a_{n_2+1} ... where n_0, n_1, n_2, ... are the successive integers n such that φ(a_0 a_1 ... a_n) = 1, where φ is an "admissible" place-selection rule.

What an "admissible" place-selection rule is was never definitively settled by von Mises. Alonzo Church, 1940, proposed that admissibility be exactly computability. It is not difficult to prove that, for any place-selection rule, the family of infinite binary sequences satisfying the above condition has probability one. Taking the intersection over all computable place-selection rules, we see that the family of von Mises-Church random sequences has probability one. However, the von Mises-Church notion of random sequence is too large.
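A computable place-selection rule in Church's sense is just a program φ that decides, from the bits already seen, whether to select the next bit. An illustrative sketch (not from the paper) showing how a rule exposes the alternating sequence 0101... — whose overall frequency of 1's is a perfect 1/2 — as non-random:

```python
def select_subsequence(seq, phi):
    """Von Mises/Church place selection: bit a_{n+1} is kept exactly
    when phi(a_0 ... a_n) = 1, i.e. the decision to select the next
    term depends only on the terms already seen."""
    return [seq[n + 1] for n in range(len(seq) - 1) if phi(seq[:n + 1])]

def frequency(bits):
    """Proportion S_n / n of 1's in the list of bits."""
    return sum(bits) / len(bits)

# The alternating sequence satisfies condition i) ...
alternating = [0, 1] * 5000

# ... but the computable rule "select the bit that follows a 0"
# extracts a subsequence that is all 1's, violating condition ii).
after_zero = select_subsequence(alternating, lambda prefix: prefix[-1] == 0)
```

This is exactly the role of condition ii): a single computable rule suffices to disqualify any sequence whose regularity a program can exploit.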
There are probability laws which do not reduce to tests with place-selection rules and are not satisfied by all von Mises-Church random sequences. As shown by Jean Ville, 1939 [47], this is the case for the law of the iterated logarithm. This very important law (due to A. I. Khintchin, 1924) expresses that, with probability one,

lim sup_{n→+∞} S*_n / √(2 log log(n)) = 1   and   lim inf_{n→+∞} S*_n / √(2 log log(n)) = −1

where S*_n = (S_n − n/2) / √(n/4) (cf. William Feller's book [14], p. 186, 204–205).

Q: Wow! What do these equations mean?

A: They are quite meaningful. The quantities n/2 and √(n/4) are the expectation and standard deviation of S_n. So S*_n is obtained from S_n by normalization: S_n and S*_n are linearly related as random variables, and S*_n's expectation and standard deviation are 0 and 1.

Let's interpret the lim sup equation, the other one being similar (in fact, it can be obtained from the first one by symmetry). Remember that lim sup_{n→+∞} f_n is obtained as follows. Consider the sequence v_n = sup_{m≥n} f_m. The bigger n is, the smaller the set {m : m ≥ n}. So the sequence v_n decreases, and lim sup_{n→+∞} f_n is its limit. The law of the iterated logarithm tells you that, with probability one, the set {n : S*_n > λ√(2 log log(n))} is finite in case λ > 1 and infinite in case λ < 1.

Q: OK.

A: More precisely, there are von Mises-Church random sequences which satisfy S_n/n ≥ 1/2 for all n, a property which is easily seen to contradict the law of the iterated logarithm.

Q: So von Mises' approach is definitely over.

A: No. Kolmogorov, 1963 [22], and Loveland, 1966 [31], independently considered an extension of the notion of place-selection rule.

Q: Kolmogorov once more...

A: Indeed. Kolmogorov allows place-selection rules giving subsequences proceeding in some new order, i.e. mixed subsequences.
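The normalization S*_n and the critical envelope √(2 log log n) can be computed directly. A sketch (my own example sequence, not the paper's) illustrating the remark above: a sequence with S_n/n ≥ 1/2 at every n has S*_n ≥ 0 throughout, so its lim inf cannot be −1 and the law of the iterated logarithm fails for it.

```python
import math

def normalized_partial_sums(bits):
    """S*_n = (S_n - n/2) / sqrt(n/4): counts of 1's centered on their
    expectation n/2 and scaled by their standard deviation sqrt(n/4),
    so each S*_n has mean 0 and standard deviation 1."""
    out, s = [], 0
    for n, b in enumerate(bits, start=1):
        s += b
        out.append((s - n / 2) / math.sqrt(n / 4))
    return out

def lil_bound(n):
    """The critical envelope sqrt(2 log log n) (defined for n >= 3)."""
    return math.sqrt(2 * math.log(math.log(n)))

# A sequence with S_n/n >= 1/2 for all n: S*_n stays >= 0 forever,
# so liminf S*_n / lil_bound(n) is >= 0, never -1 -- it violates the LIL.
biased = [1, 0] * 50000
stars = normalized_partial_sums(biased)
```

For a genuinely random sequence, S*_n would instead oscillate and brush against both +√(2 log log n) and −√(2 log log n) infinitely often.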
The associated notion of randomness is called Kolmogorov stochastic randomness (cf. [26], 1987). Since there are more conditions to satisfy, stochastically random sequences form a subclass of von Mises-Church random sequences. They constitute, in fact, a proper subclass ([31]). However, it is not known whether they satisfy all classical probability laws.

8.3 Martin-Löf random sequences

Q: So, how do we come to a successful theory of random sequences?

A: Martin-Löf found such a theory.

Q: That was not Kolmogorov? The same Martin-Löf you mentioned concerning random finite strings?

A: Yes, the same Martin-Löf, in the very same 1965 paper [34]. Kolmogorov looked for such a notion, but it was Martin-Löf, a Swedish mathematician, who came to the pertinent idea. At that time, he was a pupil of Kolmogorov and studied in Moscow. Martin-Löf made no use of Kolmogorov random finite strings to get the right notion of infinite random sequence. What he did was to forget about the frequency character of computable statistical tests (in the von Mises-Church notion of randomness) and look for what could be the essence of general statistical tests and probability laws. Which he did both for finite strings and for infinite sequences.

Q: Though intuitive, this concept is rather vague!

A: Indeed. And Martin-Löf's analysis of what a probability law can be is quite interesting. To prove a probability law amounts to proving that a certain set X of sequences has probability one. To do this, one has to prove that the exception set — which is the complement Y = {0,1}^N \ X — has probability zero. Now, in order to prove that Y ⊆ {0,1}^N has probability zero, basic measure theory tells us that one has to include Y in open sets with arbitrarily small probability. I.e.
for each n ∈ N one must find an open set U_n ⊇ Y which has probability ≤ 1/2^n. If things were on the real line R, we would say that U_n is a countable union of intervals with rational endpoints. Here, in {0,1}^N, U_n is a countable union of sets of the form I_u = u{0,1}^N, where u is a finite binary string and I_u is the set of infinite sequences which extend u. Well, in order to prove that Y has probability zero, for each n ∈ N one must find a family (u_{n,m})_{m∈N} such that Y ⊆ ⋃_m I_{u_{n,m}} and Proba(⋃_m I_{u_{n,m}}) ≤ 1/2^n.

And now Martin-Löf makes a crucial observation: the mathematical probability laws which we can consider necessarily have some effective character, and this effectiveness should be reflected in the proof as follows: the doubly indexed sequence (u_{n,m})_{n,m∈N} is computable. Thus, the set ⋃_m I_{u_{n,m}} is a computably enumerable open set, and ⋂_n ⋃_m I_{u_{n,m}} is a countable intersection of a computably enumerable family of open sets.

Q: Has this observation been checked for proofs of the usual probability laws?

A: Sure. Be it the law of large numbers, that of the iterated logarithm... In fact, it's quite convincing.

Q: This open set ⋃_m I_{u_{n,m}} could not be computable?

A: No. A computable set in {0,1}^N is always a finite union of I_u's.

Q: Why?

A: What does it mean that Z ⊆ {0,1}^N is computable? That there is some Turing machine such that, if you write an infinite sequence α on the input tape, then after finitely many steps the machine tells you whether α is in Z or not. When it does answer, the machine has read but a finite prefix u of α, so it gives the same answer if α is replaced by any β ∈ I_u. In fact, an application of König's lemma (which we shall not detail) shows that we can bound the length of such a prefix u. Whence the fact that Z is a finite union of I_u's.

Q: OK.
So, we shall take as random sequences those sequences which are outside any set which is a countable intersection of a computably enumerable family of open sets and has probability zero.
A: This would be too much. Remember, Proba(∪_m I_{u_{n,m}}) ≤ 2^-n. Thus, the way the probability of ∪_m I_{u_{n,m}} tends to 0 is computably controlled. So, here is Martin-Löf's definition:

Definition 23. A set of infinite binary sequences is constructively of probability zero if it is included in ∩_n ∪_m I_{u_{n,m}}, where (m,n) ↦ u_{n,m} is a partial computable function N² → {0,1}* such that Proba(∪_m I_{u_{n,m}}) ≤ 2^-n for all n.

And now comes a very surprising theorem (Martin-Löf, [34], 1966):

Theorem 24. There is a largest set of sequences (for the inclusion ordering) which is constructively of probability zero.

Q: Largest? Up to what?
A: Up to nothing. A really largest set: it is constructively of probability zero and contains any other set constructively of probability zero.
Q: How is it possible?
A: Via a diagonalization argument. The construction has some technicalities, but we can sketch the ideas. From the well-known existence of universal c.e. sets, we get a computable enumeration ((O_{i,j})_j)_i of all sequences of c.e. open sets. A slight transformation allows us to satisfy the inequality Proba(O_{i,j}) ≤ 2^-j. Now, set U_j = ∪_e O_{e,e+j+1} (here lies the diagonalization!). Clearly, Proba(U_j) ≤ Σ_e 2^-(e+j+1) = 2^-j, so that ∩_j U_j is constructively of probability zero. Also, O_{i,i+j+1} ⊆ U_j for all i and j, whence ∩_j U_j contains every set which is included in all the O_{i,j}'s.
Q: So, Martin-Löf random sequences are exactly those lying outside this largest set.
A: Yes.
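Definition 23 can be instantiated on the simplest possible example: the singleton {000...} is constructively of probability zero via the test u_{n,0} = 0^n (all other u_{n,m} left undefined). A sketch, with names of our own choosing:

```python
from fractions import Fraction

def u(n, m):
    """Partial computable test (m, n) -> u_{n,m} witnessing that the
    singleton {000...} is constructively of probability zero."""
    return "0" * n if m == 0 else None   # None stands for 'undefined'

def level_measure(n, max_m=10):
    """Proba of U_n, the union over m of the cylinders I_{u_{n,m}}."""
    covers = [u(n, m) for m in range(max_m)]
    covers = [c for c in covers if c is not None]
    # here each level reduces to the single cylinder I_{0^n}
    return sum(Fraction(1, 2 ** len(c)) for c in covers)

# Proba(U_n) = 2^-n, exactly the bound Definition 23 requires.
assert all(level_measure(n) == Fraction(1, 2 ** n) for n in range(1, 8))
```

The all-zero sequence lies in every I_{0^n}, hence in ∩_n U_n, so it is excluded from the random sequences, as one would hope.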
And all theorems in probability theory can be strengthened by replacing "with probability one" by "for all Martin-Löf random sequences".

8.4 Bottom-up approach to randomness of infinite sequences: Martin-Löf's large oscillations theorem

Q: So, now, what is the bottom-up approach?
A: This approach looks at the asymptotic algorithmic complexity of the prefixes of the infinite binary sequence a_0 a_1 a_2 ..., namely the K(a_0...a_n)'s. The next theorem is the first significant result relevant to this approach. Point 2 is due to Albert Meyer and Donald Loveland, 1969 ([32] p. 525). Points 3 and 4 are due to Gregory Chaitin, 1976 [9]. (Cf. also [30] 2.3.4 p. 124.)

Theorem 25. The following conditions are equivalent:
1) a_0 a_1 a_2 ... is computable
2) K(a_0...a_n | n) = O(1)
3) |K(a_0...a_n) − K(n)| ≤ O(1)
4) |K(a_0...a_n) − log(n)| ≤ O(1)

Q: Nice results. Let me tell you what I see. We know that K(x) ≤ |x| + O(1). Well, if we have the equality K(a_0...a_n) = n − O(1), i.e. if maximum complexity occurs for all prefixes, then the sequence a_0 a_1 a_2 ... should be random! Is it indeed the case?
A: That's a very tempting idea. And Kolmogorov had also looked for such a characterization. Unfortunately, as Martin-Löf proved around 1965 (cf. [35]), there is no such sequence! It is a particular case of a more general result (just set f(n) = constant).

Theorem 26 (Large oscillations, [35]). Let f : N → N be a computable function such that Σ_{n∈N} 2^{-f(n)} = +∞. Then, for every binary sequence a_0 a_1 a_2 ..., there are infinitely many n's such that K(a_0...a_n | n) < n − f(n).

Q: So, the bottom-up approach completely fails as concerns a characterization of random sequences. Hum... But it does succeed as concerns computable sequences, which were already fairly well characterized.
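The contrast in Theorem 25 between computable and typical sequences can be illustrated with a computable, if crude, stand-in for K: the length of a zlib-compressed prefix. This is only an upper bound on Kolmogorov complexity (relative to a fixed decompressor), but it already separates a periodic sequence from a pseudo-random one. A sketch:

```python
import random
import zlib

def compressed_len(bits: str) -> int:
    """Length in bytes of the zlib-compressed string: a computable
    UPPER bound on Kolmogorov complexity, up to an additive constant."""
    return len(zlib.compress(bits.encode()))

# Prefix of a computable (periodic) sequence: complexity stays tiny.
periodic = "01" * 4000                      # 8000 characters
# Prefix of a "typical" pseudo-random sequence: complexity grows
# linearly, about 1 bit per digit (zlib cannot beat the entropy).
random.seed(0)
noisy = "".join(random.choice("01") for _ in range(8000))

assert compressed_len(periodic) < 100
assert compressed_len(noisy) > 900
```

Of course no computable lower bound on K exists (§3.4), so compression can only ever certify low complexity, never high complexity.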
Funny!
A: It's however possible to sandwich the set of Martin-Löf random sequences between two sets of probability one defined in terms of the K complexity of prefixes.

Theorem 27 ([35]). Let f : N → N be computable such that the series Σ 2^{-f(n)} is computably convergent. Set
X = {a_0 a_1 ... : K(a_0...a_n | n) ≥ n − O(1) for infinitely many n's}
Y_f = {a_0 a_1 ... : K(a_0...a_n | n) ≥ n − f(n) for all but finitely many n's}
Denote ML the set of Martin-Löf random sequences. Then X and Y_f have probability one and X ⊂ ML ⊂ Y_f.

NB: Proper inclusions have been proved by Peter Schnorr, 1971 [39] (see also [30] 2.5.15 p. 154).

Let's illustrate this theorem with an easy and spectacular corollary, which uses the fact that 2^{-2 log(n)} = 1/n² and that the series Σ 1/n² is computably convergent: if K(a_0...a_n | n) ≥ n − c for infinitely many n's, then K(a_0...a_n | n) ≥ n − 2 log(n) for all but finitely many n's.

8.5 Bottom-up approach with prefix complexity

Q: What about considering prefix Kolmogorov complexity?
A: Kolmogorov's original idea does work with prefix Kolmogorov complexity. This has been proved by Claus Peter Schnorr (1974, unpublished, cf. [8] Remark p. 106, and [10] p. 135-137 for a proof). Robert M. Solovay, 1974 (unpublished [43]), strengthened Schnorr's result (cf. [10] p. 137-139).

Theorem 28. The following conditions are equivalent:
1) a_0 a_1 a_2 ... is Martin-Löf random
2) H(a_0...a_n) ≥ n − O(1) for all n
3) lim_{n→+∞} (H(a_0...a_n) − n) = +∞
4) For any c.e. sequence (A_i)_i of open subsets of {0,1}^N, if Σ_i Proba(A_i) < +∞ then a_0 a_1 a_2 ... belongs to finitely many A_i's.

These equivalences stress the robustness of the notion of Martin-Löf random sequence.
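The hypothesis of Theorem 27, computable convergence of Σ 2^{-f(n)}, means that a modulus of convergence can be computed. For the corollary's series Σ 1/n² the modulus is explicit: the tail after N is below 1/N, since 1/n² < 1/((n−1)n) = 1/(n−1) − 1/n telescopes. A numeric sketch of this bound:

```python
def tail(N: int, M: int) -> float:
    """Partial tail sum of 1/n^2 from N+1 up to M."""
    return sum(1.0 / (n * n) for n in range(N + 1, M + 1))

# The full tail after N is below 1/N (telescoping bound), so every
# partial tail is too: 1/N is a computable modulus of convergence.
for N in (1, 10, 100):
    assert tail(N, 10**5) < 1.0 / N
```

It is this computable control of the tails, not mere convergence, that Theorem 27 needs.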
8.6 Top-down/bottom-up approaches: a summary

Q: I get somewhat confused with these two approaches. Could you sum up?
A: The top-down and bottom-up approaches both work and lead to the very same class of random sequences. Kolmogorov looked at the bottom-up approach from the very beginning in 1964. But nothing was possible with the original Kolmogorov complexity; Levin-Chaitin's variant H was needed.
Q: Ten years later...
A: As for the top-down approach, it was pioneered by von Mises since 1919 and made successful by Martin-Löf in 1965. Martin-Löf had to give up von Mises' frequency tests. However, Kolmogorov was much interested in these frequency tests ([22]), and he refined them in a very clever way with the purpose of recovering Martin-Löf randomness, which led him to the notion of Kolmogorov stochastic randomness. Unfortunately, up to now, we only know that
Martin-Löf random ⇒ stochastic random ⇒ von Mises-Church random.
The second implication is known to be strict, but not the first one. Were it an equivalence, this would give a quite vivid characterization of random sequences via much more concrete tests.

8.7 Randomness with other probability distributions

Q: All this is relative to the uniform probability distribution. Can it be extended to arbitrary probability distributions?
A: Not arbitrary probability distributions, but computable Borel ones: those distributions P such that the sequence of reals (P(I_u))_{u∈{0,1}*} (where I_u is the set of infinite sequences which extend u) is computable, i.e. there is a computable function f : {0,1}* × N → Q such that |P(I_u) − f(u,n)| ≤ 2^-n. Martin-Löf's definition of random sequences extends trivially. As for characterizations with variants of Kolmogorov complexity, one has to replace the length of a finite string u by the quantity −log(P(I_u)).
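A minimal instance of a computable Borel distribution: independent bits with P(1) = 1/3. Here P(I_u) is exactly rational, so the approximating function f(u, n) of the definition can simply return the exact value (names are ours, for illustration):

```python
from fractions import Fraction

P_ONE = Fraction(1, 3)   # probability of digit 1 (an arbitrary choice)

def f(u: str, n: int) -> Fraction:
    """Rational approximation of P(I_u) within 2^-n. For this Bernoulli
    measure it is exact, so the error is 0 <= 2^-n for every n."""
    p = Fraction(1)
    for bit in u:
        p *= P_ONE if bit == "1" else 1 - P_ONE
    return p

assert f("", 10) == 1                         # I_empty is the whole space
assert f("10", 10) == Fraction(1, 3) * Fraction(2, 3)
# The two cylinders extending u partition I_u:
assert f("10", 10) + f("11", 10) == f("1", 10)
```

For a genuinely irrational P(I_u) the function f would have to compute a dyadic approximation, which is exactly what the 2^-n clause in the definition allows.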
8.8 Chaitin's real Ω

Q: I read a lot of things about Chaitin's real Ω.
A: Gregory Chaitin, 1987 [11], exhibited a spectacular random real and made it very popular. Consider a universal prefix partial recursive function U and let Ω be the Lebesgue measure of the set
{α ∈ {0,1}^N | ∃n U(α↾n) is defined}
Q: Seems to be an avatar of the halting problem.
A: Indeed. It is the probability that, on an infinite input, the machine which computes U halts in finite time (hence after reading a finite prefix of its input):
Ω = Σ {2^{-|p|} | U halts on input p}     (15)

Theorem 29. The binary expansion of Ω is Martin-Löf random.

Q: How does one prove that Ω is random?
A: U has prefix-free domain, hence Ω = Σ {2^{-|p|} | p ∈ domain(U)} < 1. Any halting program with length n contributes exactly 2^{-n} to Ω. Thus, if you know the first k digits of Ω then you know the number of halting programs with length ≤ k. From this number, by dovetailing, you can get the list of the halting programs with length ≤ k (cf. §3.1, 3.2). Having these programs, you can get the first string u which is not the output of such a program. Clearly, H(u) > k. Now, u is computably obtained from the first k digits of Ω, so that by Proposition 7 we have H(u) ≤ H(ω_0...ω_k) + O(1). Whence H(ω_0...ω_k) ≥ k − O(1), which is condition 2 of Theorem 28 (Schnorr's condition). This proves that the binary expansion of Ω is a Martin-Löf random sequence.
Q: Ω seems to depend on the universal machine.
A: Sure. We can speak of the class of Chaitin Ω reals: those reals which express the halting probability of some universal prefix programming language. Cristian Calude & Peter Hertling & Bakhadyr Khoussainov & Yongge Wang, 1998 [4] (cf.
also Antonín Kučera & Theodore Slaman, 2000 [27]) proved a very beautiful result: r is a Chaitin Ω real if and only if (the binary expansion of) r is Martin-Löf random and r is computably enumerable from below (i.e. the set of rational numbers < r is c.e.).
Q: I read that this real has incredible properties.
A: This real has a very simple and appealing definition. Moreover, as we just noticed, there is a simple way to get all halting programs of size ≤ n from its first n digits. This leads to many consequences, due to the following fact: any Σ⁰₁ statement of the form ∃x̄ Φ(x̄) (where Φ is a computable relation) is equivalent to a statement insuring that a certain program halts, and this program is about the same size as the statement. Now, deciding the truth of Σ⁰₁ statements is the same as deciding that of Π⁰₁ statements. And significant Π⁰₁ statements abound! Like Fermat's last theorem (which is now Wiles' theorem) or consistency statements. This is why Chaitin says Ω is the "Wisdom real". Other properties of Ω are common to all reals which have Martin-Löf random binary expansions. For instance, transcendence, and the fact that any theory can give us but finitely many digits. Hum... About that last point, using Kleene's recursion theorem, Robert Solovay, 1999 [45], proved that there are particular Chaitin Ω reals about which a given theory cannot predict any single bit!

8.9 Non computable invariance

Q: In some sense, Martin-Löf randomness is a part of recursion theory. Do random sequences form a Turing degree or a family of Turing degrees?
A: Oh, no! Randomness is definitely not computably invariant. It's in fact a very fragile notion: quite insignificant modifications destroy randomness. This makes objects like Ω so special. Let's illustrate this point on an example. Suppose you transform a random sequence a_0 a_1 a_2 a_3 ...
into a_0 0 a_1 0 a_2 0 a_3 0 ... The sequence you obtain has the same Turing degree as the original one, but it is no longer random, since its digits with odd ranks are all 0. A random sequence has to be random everywhere. Hum... for Martin-Löf random reals, I should rather say "every-c.e.-where".
Q: Everywhat?
A: "Every-c.e.-where". I mean that if f is a computable function from N into N (in other words, a computable enumeration of a c.e. set), then the sequence of digits with ranks f(0), f(1), f(2), ... of a Martin-Löf random sequence has to be Martin-Löf random. In fact, you recognize here an extraction process à la von Mises, for which a random sequence should give another random sequence.
Q: OK. What about many-one degrees?
A: Same. Let's represent a binary infinite sequence α by the set X_α of positions of digits 1 in α. Then
n ∈ X_{a_0 a_1 a_2 ...} ⇔ 2n ∈ X_{a_0 0 a_1 0 a_2 0 ...}
Also, let φ(2n) = n and φ(2n+1) = k, where k is some fixed rank such that a_k = 0. Then
n ∈ X_{a_0 0 a_1 0 a_2 0 ...} ⇔ φ(n) ∈ X_{a_0 a_1 a_2 ...}
These two equivalences prove that X_{a_0 a_1 a_2 ...} and X_{a_0 0 a_1 0 a_2 0 ...} are many-one equivalent.

9 More randomness

There are more things in heaven and earth, Horatio,
Than are dreamt of in your philosophy.
Hamlet, William Shakespeare

9.1 Beyond c.e.: oracles and infinite computations

Q: Are there other random reals than Chaitin Ω reals?
A: Sure. Just replace in Martin-Löf's definition the computable enumerability condition by a more complex one. For instance, you can consider Σ⁰₂ sets, which amounts to computable enumerability with oracle ∅′ (the set which encodes the halting problem for Turing machines).
Q: Wait, wait. Just a minute ago, you said that for all classical probability laws, c.e. open sets, i.e.
Σ⁰₁ sets, are the ones which come in when proving that the exception set of the law has probability zero. So, what could be the use of such generalizations?
A: Clearly, the more random sequences you have which satisfy classical probability laws, the more you strengthen these theorems, as we said earlier. In this sense, it is better to stick to Martin-Löf's definition. But you may also want to consider random sequences as worst-case objects to use in some context. Depending on that context, you can be led to ask for much more complex randomness conditions. Also, you can have some very natural objects much alike Chaitin's real Ω which are more complex.
Q: Be kind, give an example!
A: In a recent paper, 2001 [2], Verónica Becher & Chaitin & Sergio Daicz consider the probability that a universal prefix programming language produces a finite output, though possibly running indefinitely. They prove that this probability is an Ω′ Chaitin real, i.e. its binary expansion is ∅′-random. Becher & Chaitin, 2002 [1], consider the probability for the output to represent a cofinite set of integers, relative to some coding of sets of integers by sequences. They prove it to be an Ω′′ Chaitin real. Such reals are as appealing and remarkable as Chaitin's real Ω, and they are also logically more complex.

9.2 Far beyond: Solovay random reals in set theory

Q: I heard about Solovay random reals in set theory. Have they anything to do with Martin-Löf random reals?
A: Hum... These notions come from very different contexts. But well, there is a relation: proper inclusion. Every Solovay random real is Martin-Löf random, the converse being far from true. In fact, these are two extreme notions of randomness: Martin-Löf randomness is the weakest condition, whereas Solovay randomness is really the strongest one.
So strong indeed that for Solovay random reals you need to work in set theory, not merely in recursion theory, and even more: you have to consider two models of set theory, say M₁ and an inner submodel M₂ with the same ordinals...
Q: You mean transfinite ordinals?
A: Yes: 0, 1, 2, 3, ..., ω, ω+1, ω+2, ..., ω+ω (which is ω·2), and so on: ω·3, ..., ω·ω (which is ω²), ..., ω³, ..., ω^ω, ...
In a model of set theory, you have reals and may consider Borel sets, i.e. sets obtained from rational intervals via iterated countable unions and countable intersections. Thus, you have reals in M₁ and reals in M₂, and every M₂ real is also in M₁. You also have Borel sets defined in M₂. And to each such Borel set X₂ corresponds a Borel set X₁ in M₁ with the same definition (well, some work is necessary to get a precise meaning, but it's somewhat intuitive). One can show that X₂ ⊆ X₁ and that X₁, X₂ have the very same measure, which is necessarily a real in M₂. Such a Borel set X₁ in M₁ will be called an M₂-coded Borel set. Now, a real r in M₁ is Solovay random over M₂ if it lies in no measure zero M₂-coded Borel set of M₁. Such a real r cannot lie in the inner model M₂, because {r} is a measure zero Borel set, and if r were in M₂ then {r} would be M₂-coded and r should be outside it, a contradiction. In case M₁ is big enough relative to M₂, it can contain reals which are Solovay random over M₂. It's a rather tough subject, but you see:
— Martin-Löf random reals are reals outside all c.e. G_δ sets (i.e. intersections of a c.e. sequence of open sets) constructively of measure zero. In other words, outside a very smooth countable family of Borel sets. Such Borel sets are, in fact, coded in any inner submodel of set theory.
— Solovay random reals over a submodel of set theory are reals outside every measure zero Borel set coded in that submodel. Thus Solovay random reals cannot be in the inner submodel. They may or may not exist, depending on how big M₁ is relative to M₂.
Q: What a strange theory. What about the motivations?
A: Solovay introduced random reals in set theory at the pioneering time of independence results in set theory, using the method of forcing invented by Paul J. Cohen. That was in the 60's. He used them to get a model of set theory in which every set of reals is Lebesgue measurable [44].
Q: Wow! It's getting late.
A: Hope you are not exhausted.
Q: I really enjoyed talking with you on such a topic.

Note. The best references on the subject are
• Li & Vitányi's book [30]
• Downey & Hirschfeldt's book [13]
Caution: in these two books, C, K denote what is here (and in many papers) denoted K, H. Among other very useful references: [3], [12], [17], [40] and [46]. Gregory Chaitin's papers are available on his home page.

References

[1] V. Becher and G. Chaitin. Another example of higher order randomness. Fund. Inform., 51(4):325-338, 2002.
[2] V. Becher, G. Chaitin, and S. Daicz. A highly random number. In C.S. Calude, M.J. Dineen, and S. Sburlan, editors, Proceedings of the Third Discrete Mathematics and Theoretical Computer Science Conference (DMTCS'01), pages 55-68. Springer-Verlag, 2001.
[3] C. Calude. Information and randomness. Springer, 1994.
[4] C.S. Calude, P.H. Hertling, B. Khoussainov, and Y. Wang. Recursively enumerable reals and Chaitin Ω numbers. In STACS 98 (Paris, 1998), number 1373 in Lecture Notes in Computer Science, pages 596-606. Springer-Verlag, 1998.
[5] G. Chaitin. On the length of programs for computing finite binary sequences. J. Assoc. Comput. Mach., 13:547-569, 1966.
[6] G. Chaitin.
On the length of programs for computing finite binary sequences: statistical considerations. J. Assoc. Comput. Mach., 16:145-159, 1969.
[7] G. Chaitin. Computational complexity and Gödel's incompleteness theorem. ACM SIGACT News, 9:11-12, 1971. Available on Chaitin's home page.
[8] G. Chaitin. A theory of program size formally identical to information theory. Journal of the ACM, 22:329-340, 1975. Available on Chaitin's home page.
[9] G. Chaitin. Information theoretic characterizations of infinite strings. Theoret. Comput. Sci., 2:45-48, 1976. Available on Chaitin's home page.
[10] G. Chaitin. Algorithmic Information Theory. Cambridge University Press, 1987.
[11] G. Chaitin. Incompleteness theorems for random reals. Advances in Applied Math., pages 119-146, 1987. Available on Chaitin's home page.
[12] J.P. Delahaye. Information, complexité, hasard. Hermès, 1999 (2nd edition).
[13] R. Downey and D. Hirschfeldt. Algorithmic randomness and complexity. Springer, 2006. To appear.
[14] W. Feller. Introduction to probability theory and its applications, volume 1. John Wiley, 1968 (3rd edition).
[15] M. Ferbus and S. Grigorieff. Kolmogorov complexities K_min, K_max on computable partially ordered sets. Theoret. Comput. Sci., 352:159-180, 2006.
[16] M. Ferbus and S. Grigorieff. Kolmogorov complexity and set theoretical representations of integers. Math. Logic Quarterly, 52(4):381-409, 2006.
[17] P. Gács. Lecture notes on descriptional complexity and randomness. Boston University, pages 1-67, 1993. http://cs-pub.bu.edu/faculty/gacs/Home.html
[18] S. Grigorieff and J.Y. Marion. Kolmogorov complexity and non-determinism. Theoret. Comput. Sci., 271:151-180, 2002.
[19] Y. Gurevich. The Logic in Computer Science Column: On Kolmogorov machines and related issues. Bull. EATCS, 35:71-82, 1988. http://research.microsoft.com/~gurevich/, paper 78.
[20] D. Knuth. The Art of Computer Programming. Volume 2: Seminumerical Algorithms. Addison-Wesley, 1981 (2nd edition).
[21] A.N. Kolmogorov. Grundbegriffe der Wahrscheinlichkeitsrechnung. Springer-Verlag, 1933. English translation: 'Foundations of the Theory of Probability', Chelsea, 1956.
[22] A.N. Kolmogorov. On tables of random numbers. Sankhyā, The Indian Journal of Statistics, ser. A, 25:369-376, 1963.
[23] A.N. Kolmogorov. Three approaches to the quantitative definition of information. Problems Inform. Transmission, 1(1):1-7, 1965.
[24] A.N. Kolmogorov. Some theorems about algorithmic entropy and algorithmic information. Uspekhi Mat. Nauk, 23(2):201, 1968. (In Russian.)
[25] A.N. Kolmogorov. Combinatorial foundation of information theory and the calculus of probability. Russian Math. Surveys, 38(4):29-40, 1983.
[26] A.N. Kolmogorov and V. Uspensky. Algorithms and randomness. SIAM J. Theory Probab. Appl., 32:389-412, 1987.
[27] A. Kučera and T.A. Slaman. Randomness and recursive enumerability. SIAM J. on Computing, 2001. To appear.
[28] L. Levin. On the notion of random sequence. Soviet Math. Dokl., 14(5):1413-1416, 1973.
[29] L. Levin. Randomness conservation inequalities; information and independence in mathematical theories. Information and Control, 61:15-37, 1984.
[30] M. Li and P. Vitányi. An introduction to Kolmogorov complexity and its applications. Springer, 1997 (2nd edition).
[31] D. Loveland. A new interpretation of von Mises' concept of random sequence. Z. Math. Logik und Grundlagen Math., 12:279-294, 1966.
[32] D. Loveland. A variant of the Kolmogorov concept of complexity. Information and Control, 15:510-526, 1969.
[33] M. Machtey and P. Young. An introduction to the general theory of algorithms. North-Holland, New York, 1978.
[34] P. Martin-Löf. The definition of random sequences.
Information and Control, 9:602-619, 1966.
[35] P. Martin-Löf. Complexity of oscillations in infinite binary sequences. Z. Wahrscheinlichkeitstheorie verw. Geb., 19:225-230, 1971.
[36] J. Miller and L. Yu. On initial segment complexity and degrees of randomness. Trans. Amer. Math. Soc. To appear.
[37] W. Paul. Kolmogorov's complexity and lower bounds. In L. Budach, editor, Proc. 2nd Int. Conf. Fundamentals of Computation Theory, pages 325-334. Akademie Verlag, 1979.
[38] B. Russell. Mathematical logic as based on the theory of types. Amer. J. Math., 30:222-262, 1908. Reprinted in 'From Frege to Gödel: A source book in mathematical logic, 1879-1931', J. van Heijenoort, ed., p. 150-182, 1967.
[39] P. Schnorr. A unified approach to the definition of random sequences. Math. Systems Theory, 5:246-258, 1971.
[40] A. Shen. Kolmogorov complexity and its applications. Lecture Notes, Uppsala University, Sweden, pages 1-23, 2000. http://www.csd.uu.se/~vorobyov/Courses/KC/2000/all.ps
[41] A. Shen and V. Uspensky. Relations between varieties of Kolmogorov complexities. Mathematical Systems Theory, 29:271-292, 1996.
[42] R. Solomonoff. A formal theory of inductive inference, part 1. Information and Control, 7:1-22, 1965.
[43] R.M. Solovay. Draft of paper (or series of papers) on Chaitin's work, done for the most part during the period Sept.-Dec. 1974. Unpublished manuscript, IBM Thomas J. Watson Research Center, Yorktown Heights, NY.
[44] R.M. Solovay. A model of set theory in which every set of reals is Lebesgue measurable. Annals of Mathematics, 92:1-56, 1970.
[45] R.M. Solovay. A version of Ω for which ZFC cannot predict a single bit. Centre for Discrete Math. and Comp. Sc., Auckland, New Zealand, 104:1-11, 1999. http://www.cs.auckland.ac.nz/staff-cgi-bin/mjd/secondcgi.pl
[46] V.A. Uspensky, A.L. Semenov, and A.Kh. Shen.
Can an individual sequence of zeros and ones be random? Russian Math. Surveys, 41(1):121-189, 1990.
[47] J. Ville. Étude critique de la notion de collectif. Gauthier-Villars, 1939.
[48] R. von Mises. Grundlagen der Wahrscheinlichkeitsrechnung. Mathemat. Zeitsch., 5:52-99, 1919.
[49] R. von Mises. Probability, Statistics and Truth. Macmillan, 1939. Reprinted: Dover, 1981.
[50] A. Zvonkin and L. Levin. The complexity of finite objects and the development of the concepts of information and randomness by means of the theory of algorithms. Russian Math. Surveys, 6:83-124, 1970.
