A Conversation with Peter Huber
Peter J. Huber was born on March 25, 1934, in Wohlen, a small town in the Swiss countryside. He obtained a diploma in mathematics in 1958 and a Ph.D. in mathematics in 1961, both from ETH Zurich. His thesis was in pure mathematics, but he then decide…
Authors: Andreas Buja, Hans R. Künsch
Statistical Science 2008, Vol. 23, No. 1, 120–135. DOI: 10.1214/07-STS251. © Institute of Mathematical Statistics, 2008.
Abstract. Peter J. Huber was born on March 25, 1934, in Wohlen, a small town in the Swiss countryside. He obtained a diploma in mathematics in 1958 and a Ph.D. in mathematics in 1961, both from ETH Zurich. His thesis was in pure mathematics, but he then decided to go into statistics. He spent 1961–1963 as a postdoc at the statistics department in Berkeley where he wrote his first and most famous paper on robust statistics, “Robust Estimation of a Location Parameter.” After a position as a visiting professor at Cornell University, he became a full professor at ETH Zurich. He worked at ETH until 1978, interspersed by visiting positions at Cornell, Yale, Princeton and Harvard. After leaving ETH, he held professor positions at Harvard University 1978–1988, at MIT 1988–1992, and finally at the University of Bayreuth from 1992 until his retirement in 1999. He now lives in Klosters, a village in the Grisons in the Swiss Alps.
Peter Huber has published four books and over 70 papers on statistics and data analysis. In addition, he has written more than a dozen papers and two books on Babylonian mathematics, astronomy and history. In 1972, he delivered the Wald lectures. He is a fellow of the IMS, of the American Association for the Advancement of Science, and of the American Academy of Arts and Sciences. In 1988 he received a Humboldt Award and in 1994 an honorary doctorate from the University of Neuchâtel.
In addition to his fundamental results in robust statistics, Peter Huber made important contributions to computational statistics, strategies in data analysis, and applications of statistics in fields such as crystallography, EEGs, and human growth curves.
This conversation took place at Professor Huber’s home in Klosters, Switzerland, on November 10, 2005.
Andreas Buja is Liem Sioe Liong/First Pacific Company Professor of Statistics, Statistics Department, The Wharton School, University of Pennsylvania, Philadelphia, Pennsylvania, USA (e-mail: buja.at.wharton@gmail.com). Hans R. Künsch is Professor and Chair, Department of Mathematics, ETH, CH-8092 Zürich, Switzerland (e-mail: kuensch@stat.math.ethz.ch).
This is an electronic reprint of the original article published by the Institute of Mathematical Statistics in Statistical Science, 2008, Vol. 23, No. 1, 120–135. This reprint differs from the original in pagination and typographic detail.
Fig. 1. Peter with his parents, 1940.
STUDY YEARS AND THE MOVE INTO STATISTICS
HK: How did you find your way into the field of statistics?
Fig. 2. In the military, 1954. Peter is second from right.
PH: More or less by accident. I started my career in pure mathematics, more specifically, in category theory. I noted a tendency in my research to move from the concrete to the abstract, but if I was already in category theory, and if I pushed further toward the abstract, I thought I would end up in empty space. This didn’t seem like the right way to start a career. I wanted to start somewhere else, in a more concrete place in mathematics. I entertained a move to functional analysis, but around that time it so happened that ETH¹ was looking for a senior statistician. In its search ETH had contacted a couple of seniors, among them Erich Lehmann, but they had declined.
So ETH decided to nurture local talent for statistics, but the question was who. In late 1960 I was approached by two professors at ETH, Walter Saxer and Eduard Stiefel. Saxer may have contacted me first, but it was Stiefel who had made the suggestion to try me. I thought hard about the proposal and ultimately concluded that mathematical statistics is not that far from functional analysis. The proposal started to make sense to me, and I wanted to give it a try. So I started to look into statistics.
¹ The “Swiss Federal Institute of Technology,” abbreviated from German “Eidgenössische Technische Hochschule.” At the time it was located in Zurich only. In 1969, a sister school was founded in Lausanne, abbreviated EPFL from French “École Polytechnique Fédérale de Lausanne.”
At this point, the story merges with another, older story. At the time I was collaborating with B. L. van der Waerden, the famous algebraist at the University of Zurich next door to ETH, on writing a book on ancient astronomy. I had read van der Waerden’s book on statistics, but I had never talked to him about the subject. Neither had I taken any courses in probability or statistics. I had sampled two courses by Saxer and by Linder, but I didn’t last long because they were too low-level and, to be honest, not very captivating. Then I read van der Waerden’s statistics text again, this time more carefully, and I read some of the books he recommended in his foreword, namely, Wald’s “Statistical decision functions” and Doob’s “Stochastic processes.” With this foundation it became clear that I should go to Berkeley to learn statistics. Beno Eckmann, my Ph.D. advisor and the leading algebraist and topologist at ETH at the time, tried to dissuade me; he wanted me to stay in topology.
I couldn’t see much of a risk, though, because if I didn’t like statistics I could just change floors and spend my time in Berkeley’s math department. So this is the circuitous story of how I got into statistics.
AB: Maybe you could tell us what ETH and the University of Zurich were like at the time, and what studying in these places was like. You spent a semester at the University and then switched to ETH.
PH: It was not easy to decide in which place to study, University of Zurich or ETH. The difference was that at the University you were essentially free to choose what to do and how to do it. You could study without a single exam up to the Ph.D. defense.
HK: That’s surprising.
PH: A negative point was that you were never sure whether mandatory courses were offered. You couldn’t be sure when you would finish your studies, without feedback from exams and with no disincentive for procrastination. ETH as a technical university was much more structured. You had a fixed course of study, and you could finish with a diploma (equivalent to a master’s degree) after four years. ETH was on a yearly schedule and would start in fall, whereas the University was more loosely organized, and you could start also in spring. After high school and before the obligatory Swiss military service of 17 weeks, I had to make use of my time between graduation (“Maturität”) in March and the beginning of boot camp in July. So in spring 1954 I enrolled at the University and sampled different courses. In particular, I attended van der Waerden’s course on algebra. I found out that he was interested in the history of mathematics, and I was interested in Assyriology, Babylonian mathematics and the like.
AB: That interest goes back to high school ...
PH: Yes, that’s another story we may get to ...
In addition to learning algebra from van der Waerden, I soon was in close contact with him working on the history of mathematics and astronomy. I liked his style, which was very direct. Looking back, I think, he had a decisive influence on me because he started me on writing papers and publishing them, and that helped.
HK: How many students were there in mathematics at that time?
PH: I don’t remember the precise numbers, but at the University, in a graduate course such as algebra there were eight or ten students, and at ETH, when I started, the cohort of mathematics and physics together consisted of about 40, of which three quarters may have been physicists and one quarter mathematicians. The first two years mathematicians and physicists were essentially together all the time. The difference was that the mathematicians took astronomy in the first term while the physicists took chemistry, and if you were not sure, you took both, which I did. I wasn’t sure which way I would go.
HK: Going back to the books you read when you got into statistics, you didn’t mention Cramér. This would have been another plausible book that would have been around at that time.
PH: Yes, it was around. I may have looked into it, but I don’t think I really read it.
I liked van der Waerden’s style, so I looked merely into the references he recommended, and among those he recommended were ... I think he did have Cramér too ... but let me have a look [grabs van der Waerden’s book from the shelf and cites]: “It makes no sense to redevelop theories that are comprehensively treated by Kolmogorov, Carathéodory and Cramér.” Subsequently he recommends Wald’s “Sequential Analysis,” his “Statistical decision functions,” and Doob’s “Stochastic processes.” But I really did this only when I pondered the question of whether to go into statistics or not. And so I looked into statistics to find out (a) whether I would find it sufficiently attractive in the long run, and (b) whether I could get a feel for the subject.
Fig. 3. Montréal 1968. Séminaire de Mathématiques Supérieures. The six speakers: Sam Karlin, Constance van Eeden, Mark Kac, PJH, Lucien Le Cam, Jacques Neveu.
AB: You mentioned that Beno Eckmann was not at all in favor of you going into statistics.
PH: No, he was not. It was Stiefel and Saxer who tried to persuade me.
AB: And yet Beno Eckmann somehow recommended that you go to Battelle to see some other type of problems ...
PH: Battelle, oh, that is a different story. Eckmann had managed to get a contract from the U.S. Army Research Office, but he couldn’t be the principal investigator as professor of ETH; so he had to have a pro forma principal investigator, and that was a student of his, Heinrich Kleisli. But Kleisli got an offer from a university in Canada, and Eckmann urgently needed a new principal investigator of that project, so he put me in charge.
AB: What kind of experience was Battelle?
PH: The project was really in a theoretical physics group, a very interesting place, with interesting people, but my work was on homological algebra.
Battelle was an umbrella for many different projects. The U.S. Army Research project was channeled through Battelle. This was a short stint, though, less than a year, from October 1960 to the following July.
AB: So then Eckmann’s design with you would have been what?
PH: Eckmann thought that I should stay in topology, and then I would probably have gone to Berkeley also, but a floor lower in Campbell Hall. Eckmann thought I should first publish some more papers in topology before I branched out or switched fields.
AB: Few statisticians these days study category theory or topology at this level, but can you give us a rough idea what sorts of things you worked on?
PH: Originally Eckmann had suggested a problem to me that was fairly concrete on the interface between homotopy theory and homological algebra, and I worked very hard on it for about a year. Then I decided that the problem was ill posed, that it didn’t have a solution; I never convinced Eckmann, but I convinced myself. Eckmann at the time was actually interested in another problem, a curious analogy between topology and algebra. There were certain similarities between the two fields that Eckmann described with the term “duality.” I realized if one rephrased the matter in terms of categories, things became identical. It only depended on how one interpreted the objects, the morphisms as continuous maps on the one hand and as algebraic maps on the other. This fitted in with Eckmann’s views, and it became my thesis.
HK: In addition to van der Waerden and Eckmann, you mentioned some other mathematicians who were influential in your life. Could you comment on Stiefel?
PH: Stiefel originally made himself a name as a topologist. Stiefel manifolds are named after him. He had been a Ph.D. student of Heinz Hopf. (Incidentally, Hopf was the second reader of my thesis.)
When Stiefel was a young faculty member at ETH in the 1940s, somebody had to teach applied mathematics, and the lot fell to him. It seems he liked it. Besides, he became the co-inventor of the conjugate gradient method. He was also the driving force behind computing at ETH in the early days. I will say more about him when we talk about computers.
BERKELEY, CORNELL AND THE 1964 PAPER
HK: So you came to Berkeley, after reading van der Waerden’s, Wald’s and Doob’s books. How did you decide on a topic to work on?
PH: I have no recollections of what I was planning to do. Peter Nüesch, who was then a graduate student at Berkeley, picked us (my wife Effi, myself and our little son Thomi) up from San Francisco Airport (it was in 1961), and some years ago he claimed that, while we were driving to Berkeley, I had told him that I wanted to work on a theory of robustness. I can’t confirm this, but it could be true. Van der Waerden was interested in nonparametrics, and he was worried about the reliability of distributional assumptions, and so was I. Van der Waerden’s book and my study of Wald’s decision theory combined to convince me that, if it were possible to build a theory of robustness, it would be through decision theory. These thoughts did not gel till about a year later when I found that certain M-estimates have an asymptotic minimax property, which became the nucleus of my 1964 paper.
HK: Were you influenced by Tukey’s 1960 paper “Sampling from contaminated distributions”?
PH: Certainly. I don’t know when I read it first; it must have been pretty early at Berkeley. I think I didn’t meet Tukey until after I had finished writing the 1964 paper. The paper was submitted sometime in spring or summer 1963. That summer we moved to Cornell.
I think that’s when I met him, sometime in 1963/64, on the East Coast.
HK: Some people say that the idea of robustness, that you pay a price up front as insurance against things going really bad, is a very Swiss kind of mentality.
PH: It was Frank Anscombe’s idea, though. It might have been in his 1960 paper on the rejection of outliers, or it could have been already in one of his earlier papers; as far as I remember the insurance idea is Anscombe’s. So there isn’t much Swiss about it really ...
HK: With whom did you collaborate in Berkeley? Who was influential in writing your 1964 paper?
PH: Erich Lehmann. I believe Lehmann was editor of the Annals but I am not sure anymore; he may have been editor earlier.² Anyway, Lehmann was the natural person to get advice on how to publish a paper in the Annals.
HK: Were there any difficulties, such as referees missing your key points?
PH: Not in my recollection. Lehmann gave some very good advice, telling me that I should submit a paper in a preliminary version that was too long. This has two advantages: First, referees would understand what it was about and, second, they would recommend shortening it, so I could start revising it right away after submitting it.
AB: Do you remember anything from the referees’ comments?
PH: No, I don’t.
AB: So it must have been smooth sailing.
PH: I guess so. I have no idea who the referees were either.
AB: More about Berkeley: what kind of interactions did you have with Lucien Le Cam?
PH: I was sitting in Le Cam’s course on decision theory where he did comparison of experiments. You [to AB] got into something like this later, didn’t you?
AB: Exactly. He introduced ε-sufficiency also in a 1964 paper, and you suggested that there should exist a link to robustness, which was indeed the case.
PH: Le Cam’s course was not easy, so I reworked the material.
Subsequently I taught at Cornell for a year and gave a graduate course on decision theory. It so happened that Larry Brown was one of the students, and he obviously stood out.
² Actually, Lehmann had been editor 1953–1955; from 1961–1964, the editor was Joseph L. Hodges, also at Berkeley.
AB: One last question about Berkeley: was Jerzy Neyman there?
PH: Neyman was still active; I think he officially retired while I was there, but he continued to be around all the time.
AB: How did you experience him?
PH: I had little contact with him, but his hold on the place was immediately apparent and very amusing. Neyman still was the department. At the beginning of my stay he was on leave and the department was sleepy, but when he returned the atmosphere came alive from one week to the next. The place burst with activity and one could see, for example, graduate students busily addressing envelopes for fund raisers for Martin Luther King. It was indeed amusing.
AB: So he brought energy to the place ...
PH: He was the driving force behind the department. I have never had a similar experience again, an eruption of a place from sleepiness to bursting activity.
Other people were less overtly visible because they had odd working hours, such as Joe Hodges who worked during the night, and Lehmann you best caught after class. Le Cam was very nice, but you had to approach him. The only professor who was regularly in the coffee room was Michel Loève, and with him you could discuss anything. David Blackwell came sometimes. The coffee room also housed a reprint collection from which I learned much about Tukey.
HK: Did you have contact with graduate students and other postdocs at Berkeley?
PH: Let me jog my memory.
I shared the office with Don Burkholder who was on leave, and among the graduate students were Peter Bickel and Richard Bucy of the Kalman–Bucy filter. I believe the Kalman–Bucy filter had been invented the year before. While I was there David Freedman, coming from Princeton, was made assistant professor.
AB: How did you get to Cornell after Berkeley?
PH: I could have stayed at Berkeley because I had first a one-year Swiss National Science Foundation fellowship and then a two-year fellowship from the Miller foundation. I wanted to experience one other place as well, and I wanted to get some teaching experience, partly out of curiosity and partly for career reasons. Jack Kiefer offered me a visiting position at Cornell.
Fig. 4. Princeton, Fall 1970. With sons Thomas and Niklaus, and dog Tappi.
AB: How did you know Kiefer?
PH: I had met him in Berkeley.
AB: And after Cornell you got a call from ETH?
PH: Yes, this was in spring 1964. I was getting nervous over my J-visa that was to expire after three years. It couldn’t be extended, and the rules mandated two years in the country of origin. So the call from ETH came in handy.
ROBUSTNESS AFTER THE 1964 PAPER
HK: Can you describe how the research continued? You had this 1964 paper coming out, and then I think the next step was the robust testing approach?
PH: Yes, the asymptotic aspect of my robustness theory was unsatisfactory because it makes a major difference whether one has, say, 1% contamination and a sample size of 10 or of 1000. So I was very happy when I managed to get exact finite sample results. I think among those papers I like the one in the “Zeitschrift” best.
AB: ... the one on “Robust Confidence Limits.” Earlier you had also published a robust version of the Neyman–Pearson lemma.
PH: Yes, and that paper led on the one hand to the paper in the “Zeitschrift” and on the other hand to the paper with Volker Strassen.
HK: How did you meet Strassen?
PH: I may have met him at the 1965 Berkeley Symposium, and I tried to get him to come to Zurich. ETH, however, was problematic because he would have to teach calculus for scientists, which he refused to do. ETH was not willing or not able to give him another appointment. I then lobbied van der Waerden to get Strassen to the University.
AB: ... and that worked?
PH: It worked, but both Strassen and I made it very clear that he would probably not stay in probability, even though they would hire him as a probabilist. Fortunately, the University didn’t quite believe him, but it came true anyway. Part of Strassen’s charter would have been to do statistical consulting, but he refused that, too, so he negotiated that he could bring along a young person to do the consulting, namely, Frank Hampel. Both Strassen and Hampel must have arrived in 1968. We all were in contact with each other, had common seminars on computational complexity and on robustness, and it made life in Zurich a lot more interesting.
AB: When you talked to Strassen, did you discover that there was a possibility of doing something with Choquet capacities to generalize your robust neighborhood tests?
PH: I had read Choquet’s paper in connection with Markov processes and potential theory, but this was really Strassen’s idea.
AB: So you told him about what you did and ...
PH: If I remember correctly, he had read my paper on robust tests. He drew a connection to his thesis (where he had used capacities to formalize inaccurate knowledge of probability measures on finite sets), and he told me that this was essentially a capacity argument.
We had many discussions (I don’t remember when these discussions started, some time before he came to Zurich), but since we did not manage to extend the theory beyond finite sets, we let the work sit for a while, and in the end, in 1970, Strassen said we should make a final effort to generalize it and write it up. I remember that we talked about this on a long walk in the woods above Zurich, a few weeks before I left for Princeton. I made the major push when I was at Princeton for the robustness study. I sent a draft to Strassen, but by that time he had fully moved into computational complexity and algebraic geometry, and he was no longer interested. The draft was sitting, and in the end we both had forgotten the details. I am still a bit unhappy because some errors made it into the publication.
AB: Well, that became grist for others. At any rate, this is another side of robustness that is much less known than the 1964 paper with its asymptotic theory: robust tests where both H₀ and H₁ are contamination neighborhoods or, more generally, Choquet capacities, and based on these tests you invented “Robust Confidence Limits” (1968). This looks like a prettier and more satisfying theory.
PH: Indeed. Its importance is the following: it shows that essentially the same procedures that are asymptotically optimal for symmetrically contaminated distributions are also optimal in a finite sample sense and for arbitrary asymmetric contaminations.
AB: The symmetry assumption of the asymptotic theory was indeed a longstanding criticism.
PH: Yes, and I think it is still not widely known that the finite sample optimality results about robust confidence limits essentially finished up that question. Maybe I should have pressed this side of robustness a little more.
I was never sure whether the theory of my 1964 paper was the real thing, but it did give the idea of robustness an element of respectability, which is all that I could have hoped for. I feel more strongly, though, that the paper on robust confidence limits is the real thing.
AB: But in addition to theory you also came up with complete algorithms.
PH: Yes, but people always thought that I was a bloody theoretician. Yet, there is merit to theory in that optimality results give us some sign posts that here, in this direction, one cannot go further. Optimality tells us under what conditions there are limits to how well a method can perform, and if a method is close to but not perfectly optimal, this might be as well.
Back to robustness theory: There was criticism of the robust minimax results also because of the form of the least favorable distributions that were pieced together and hence not analytic. I found this an unworthy objection because, if the least favorable distributions are realistic, they make sense, never mind their analytic form. In fact, later I was pleased to see that least favorable distributions were often closer to observed data distributions than the normal distribution.
Which brings me to real data... I had always been interested in data analysis, and I got interested even more during the robustness year at Princeton in 1970/71. That, I think, was the turning point: I essentially got out of robustness and moved into data analysis. I still had to write a book on robustness ...
Fig. 5. Schloss Reinhardsbrunn, German Democratic Republic, 1975. Peter discusses leverage points.
AB: ... and you still did other work on robust covariances and robust regression, the Wald lectures, and so on.
PH: Sure, I was still in it, but I really had my mind set on data analysis.
Sometime in the 1970s I realized that I was intellectually going in circles. That is, I would have an idea, and somehow I would fall into the same track, like a broken record: every problem turned into some kind of minimax problem. As the saying goes, if you have a hammer, every problem looks like a nail.
THE DEVELOPMENTS OF COMPUTING
AB: Can you tell us more about how you first got involved with computers?
PH: I got into computing I think in 1956 when ETH had acquired the new ERMETH computer built from vacuum tubes, and Rutishauser gave a course on programming it. Computing at ETH started with Stiefel. At the end of the war, he had heard about the “Zuse computer” that was stored in a shed somewhere in southern Germany, and he managed to get hold of it and move it to Zurich. Stiefel then formed a small subgroup for computing research within the applied mathematics group.
AB: Tell us more about the Zuse computer.
PH: It was a very amusing piece of hardware. It was partly made from the tin of war-time quality tin cans and therefore had many problems. It was not mechanical but based on relays. When Zuse’s machine was demonstrated to us I was intrigued but not impressed. It took about 10 minutes to solve linear equations with three or four unknowns, and I felt I could do that faster by hand. This must have been toward the end of my first term at ETH, in early 1955.
A few years before,³ Stiefel had sent two of his younger people to the United States to learn about computers: an electrical engineer, Ambros Speiser, and a mathematician, Heinz Rutishauser. When they came back they designed a computer made from tubes and diodes. It became operational in 1956. It was called “ERMETH,” and it was one of the first floating point machines. It was not binary but decimal with double precision floating point numbers.
It had a drum memory as its only memory, with 10,000 words, in all about 50 KB of memory.
AB: ... which must have been big at the time.
PH: That was big, indeed. The computer was pretty slow, though. An addition took 4 msec or so. There was no assembler, so one had to program in actual machine code. I played with ERMETH for fun, but then my wife began to use it in a bigger way.
AB: Your wife, Effi: you met when ...?
PH: We met in high school and we married shortly after our ETH diplomas. She was in crystallography, and crystallographers have been big computer users from the start.
AB: So you were both early computer users.
PH: Yes, we were indeed both early computer users, yes. But Effi was really the big user. At times she had jobs that lasted 24 hours. This meant that she had to attend to the machine and restart it every few hours because it would break down frequently. One had to program very carefully so one could restart without going back to square one. I got into computing and data analysis through Effi.
AB: What type of data analysis would she do?
PH: Three-dimensional Fourier syntheses and least squares problems, nonlinear weighted least squares. In her thesis she dealt with maybe 37 unknowns and 1000 to 2000 observations. For her final results she considered using ERMETH but saw that it was too risky: the machine was too slow, too unreliable, and restarting was difficult. So she ended up doing her least squares problems on a computer at CERN in Geneva.
HK: Could you also describe for us the developments in computer software and programming languages, how you experienced them?
³ 1948/49.
PH: Programming on the ERMETH was fun.
There were three levels of languages on top of each other: First, one wrote a program in a flow chart language, then one translated that flow chart by hand into a kind of assembly code with symbolic addresses, and then one translated that assembly code into machine code.
AB: ... manual, not automatic?
PH: By hand. Subsequently, the problem was making changes in the program because usually the changes were not local and the addresses were absolute. Based on his early experiences with computing and numerical analysis, it was natural that Rutishauser became one of the driving forces behind the development of ALGOL 60, which at the time was still called ALGOL 57. One should know that Rutishauser had written the first paper ever on compilers in 1951, his Habilitationsschrift, published in 1952. The point of his paper was to propose a language that made it possible to describe numerical algorithms unambiguously. He was actually not interested in compiling per se, but machine compilation would show that the description was complete and consistent.
In 1960 ALGOL 60 came into being. As Rutishauser told the story, there was a large committee that had agreed on the language, but in the end Naur completely rewrote the final document, which became then known as the ALGOL 60 report. Various participants of the ALGOL conference, attended by maybe a dozen people, agreed to write compilers or have compilers written. In Zurich it was Hans Rudolf Schwarz who wrote an ALGOL compiler. I tried it out but ended up unimpressed again: a little program for a Jacobi eigenvalue problem that I had written took almost an hour to compile.
The program is of some interest because it was part of a data analysis problem of Effi’s.
The prob- lem wa s to map in tensities measur ed from differen t photographs to the same scale, as when one h as in- exact crystallo graphic d ata. This kind o f problem is no w ada ys called a Pr o crustes problem. T o m e this w as an opp ortunit y to try out th e ALGOL 60 com- piler. I w as probably the first ALGOL 60 user ou t- side th e app lied m athematics group of Stiefel. The next step in the dev elopment of languages that I exp erienced w as F ortran. In late 1960 o r early 1961 Effi and I attended a F ortran programming course at CERN in Genev a, programming on an IBM 704. The co mpu ting en vironmen t w as still prim- itiv e. F or example, if on e w anted to w rite something to tap e, one had to get the tap e running b efore one A CONV ERSA TION WITH PETER HUBER 9 b egan to w rite. So one h ad to estimate ho w m uc h earlier one had to start it. But more imp orta ntly , F ortran compilers we re faster. The next step wa s seve ral years later, it m ust ha v e b een 1967, wh en Niklaus Wirth came to Zuric h. It so happ ened that our neighbors were on an extended absence to the United States; Wirth ended up rent- ing th eir h ouse and we b ec ame neigh b ors. A t the time Wirth w as dev eloping the Pa scal language. I lo ok ed at Pasca l but wa s disapp o inte d b ecause in strict P ascal arra y b ou n ds we re fi xed, w h ic h is why I didn’t b other to lea rn P ascal. Neither could I con- vince Wirth that he should provi de greater fl exibil- it y . A t th at time we had a v ery efficien t ALGOL com- piler on th e CDC 1604 computer. Unfortunately , ALGOL 60 then got kille d b y its successor, AL- GOL 68, whic h wa s a big fi asco b ecause it w as so complicate d that it nev er to ok off. It had what Donoho used to call the “second-ge neration syndrome”: start with a goo d pro duct, then fol lo w it u p with a n o ver- designed second-generation pro duct, only to see it killed. This happ ens often, and it did happ en to AL- GOL 68. 
Next was the appearance of the C language, but I never took to it because it was too error-prone. So I stuck to Fortran, which was sufficient for my type of computing. Later I had occasion to program in C, but, as a matter of fact, I have more often used f2c, the Fortran-to-C cross-compiler, which seems to work better than most Fortran compilers.
AB: So you have seen computing from the beginning . . .
PH: Speaking of beginnings, in the 1970s we began to experiment with interactive graphical data analysis. This changed my views of computing rather dramatically. I soon realized that we needed a data handler, that is, a language in which we could not only write programs, but whose command lines were capable of immediate execution. BASIC and APL were such languages. I liked BASIC for writing little things, especially string manipulations, but we needed an array-oriented language. As for APL, it was a “write-only” language: I could write it, but I could not read it, and I may be a kind of expert in such matters. I never understood why Anscombe had written his programming examples in APL. When I asked him he confessed that he couldn’t read his own programs either after half a year!
I guess that I should expand on our experiences with data handling. After I had moved to Harvard in 1978 and started a graphics project there, Donoho proposed to revive the ISP language as a data handler for the project. He had co-developed ISP at Princeton. Donoho left in 1983, and then Effi and I extended it, and to this day I am using ISP for my own purposes. It is still easiest for me to use it both interactively and to do ad hoc programming. I never got used to S. Neither to Matlab, although I am familiar only with early versions. At any rate, I am still using ISP, which is great for me to improvise solutions to nonstandard problems. To AB: How do you deal with nonstandard computations?
AB: I grew up with S and later grew into R, so I know all kinds of tricks. Maybe it depends on how one grows up. Maybe it is a subjective thing, determined by life history.
PH: Your mother tongue is what you stay with.
AB: Yes, but one has to have one mother tongue that one knows in and out, and so students I think have to have one.
PH: Another question is, what should you learn as a student? In my experience, one area that is generally underemphasized is reformatting of data. Often one has to know tricks to do reformatting, and usually one has to program these things in a low-level language, in particular when facing binary formats that require not only manipulating bytes but bits. We ended up putting a bit handling facility into ISP. Most people probably write C code to do bit handling. We thought a long time about how to do it, and now I know I can do essentially everything with our bit handling facilities.
AB: You made it a high-level problem . . .
PH: Bit manipulations are particularly useful when apparent data corruptions turn out to be something else. Once we met a case where a programmer had put the main information into 7-bit ASCII but then squeezed additional information into the eighth bit. We experienced something of this kind also in the children’s growth data at the University of Zurich, based on a 20-year longitudinal study on which Werner Stuetzle did his thesis. The background was as follows: Certain events in bone development were supposed to occur between ages 5 and 9, and so a single decimal digit position had been reserved for age. Then a few children had this development at age 10 or 11. The punch-card operators made a very intelligent decision: they encoded the information allowing letters in addition to digits. This was not documented and we didn’t know when we started to analyze the data. The letter codes were thrown out by a data-checking mechanism as punching errors. Then the person in charge of transferring the punch-card data to tape (it may have been Theo Gasser) actually went back to the original handwritten data sheets, which still existed, and realized what was going on. And we managed to read the cards properly.
Another, published, example is in a famous JASA paper by Coale and Stephan,[4] a very worthwhile piece. The authors looked at 1950 census data and realized that there were quite a few 14-, 15-, 16-year-old widowers. Coale and Stephan, in something that reads like a detective story, describe their discovery that a few thousand punch-cards must have been punched with one column shift, so a 32-year-old head of household was turned into a 13-year-old widower, and a 42-year-old into a 14-year-old widower, and so on. Apparently, 13-year-old widowers were automatically thrown out as errors because it was legally impossible to marry at age 13, but one could marry at age 14 in some states.
AB: This actually brings up the more general problem that analysts of large datasets can easily be misled by artifacts and data corruptions that are difficult to see. Datasets can be strange for systematic reasons such as you just mentioned. So is there any wisdom other than “Well, be careful”?
PH: I think the specific lesson you can draw from our children’s growth data and from Coale and Stephan’s widows and Indians is that if there are clusters of bad data, one has to dig in because very often bad data have meaning.
DATA ANALYSIS AND DATA VISUALIZATION
HK: Can you tell us something about your views and experiences with data analysis?
PH: The problem with data analysis is of course that it is a performing art.
It is not something you easily write a paper on; rather, it is something you do. And so it is difficult to publish.
[4] Coale, A. J. and Stephan, F. F. (1962). The case of the Indians and the teenage widows. J. Amer. Statist. Assoc. 57 338–347.
HK: If you analyze data in a subject area, don’t you publish it there, not in mainstream statistics?
AB: Which you did, too. You worked on EEGs with Gasser, pretty much after the Princeton study; actually Gasser was with you in Princeton.
PH: Yes, but the EEG work started earlier, with the Cooley–Tukey algorithm, which must have been in 1965. The University of Zurich had an EEG project in collaboration with ETH mathematicians. They, however, were uncomfortable because there was too much statistics involved. They suggested that I take over. The project required stationary time series analysis. At the time I had just learned from a talk at an ISI meeting in Belgrade that some such thing as a fast Fourier transformation had been invented, and that it worked best with powers of two. This was enough of a hint to allow me to program it up myself. The regular Fourier transform with its n-square computational complexity was much too slow to compute on a CDC 1604 computer [CDC = “Control Data Corporation”]; it required hours and hours.
This reminds me of an amusing anecdote: The computer operators had a microphone hooked up to the top bit in the accumulator so one could hear what was going on in the computer. Unfortunately, the fast Fourier transform had the property that it would make a howling “oooh oooh oooh” sound, which was usually an indication that the currently running program was stuck in an infinite loop. As a result the operators often terminated the program prematurely, which didn’t speed things up either.
AB: Look at that! An early example of sonification . . .
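[Editorial aside: for readers who want to see why the hint about powers of two was enough, here is a minimal Python sketch of a radix-2 fast Fourier transform next to the direct n-square transform. It is an illustration only and not a reconstruction of the program described in the interview, which is not reproduced in the text; the function names naive_dft and fft_radix2 are invented for this example.]

```python
import cmath


def naive_dft(x):
    """Direct discrete Fourier transform: n output terms, each a sum of n products."""
    n = len(x)
    return [
        sum(x[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fft_radix2(x):
    """Recursive radix-2 transform; the input length must be a power of two."""
    n = len(x)
    if n < 1 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return [complex(x[0])]
    # Split into even- and odd-indexed halves and transform each half.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Combine the two half-size transforms with one complex rotation.
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


if __name__ == "__main__":
    signal = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 3.0, -2.0]
    fast = fft_radix2(signal)
    slow = naive_dft(signal)
    # Largest disagreement between the two methods (rounding error only).
    print(max(abs(a - b) for a, b in zip(fast, slow)))
```

[The halving step is what brings the work down from order n squared to order n log n, and it can be repeated all the way down to single points only when n is a power of two.]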
HK: Later you also looked at higher-order spectra of the EEGs. Was this successful?
PH: The director of the project, Guido Dumermuth, was very much interested in higher-order spectra, so I looked into it. We didn’t get very far, though. We calculated some, but they were not easy to interpret and rather sensitive to artifacts.
AB: That sounds like another robustness problem.
PH: Possibly; one had to be very, very careful about tapering and the like to avoid artifacts.
AB: So then in the seventies you turned yourself loose on computing.
PH: Effi had a large role in that. When we were at Princeton in 1970/71 she worked in Langridge’s molecular biology lab. Langridge had received new equipment that was very fancy for the time: a DEC-10 [DEC = “Digital Equipment Corporation”] and an Evans & Sutherland vector graphics display, which had just arrived when we came: the naked display hardware, without any software. Effi got a job in this lab as a postdoc. Langridge told her something along the lines, “Here is this great new equipment, can you do something with it? And by the way, we have crystallographic data on transfer-RNA, and also a geometric model of transfer-RNA.” Effi was curious enough to test whether the model would fit with the crystallographic data, and that meant among other things fitting a molecule into its observed crystallographic dimensions. In essence we wrote a program for molecular packing. In retrospect I’d say it was a fledgling expert system.
Fig. 6. Harvard 1986, on the occasion of Fred Mosteller’s 70th birthday. Shaw-Hwa Lo, PJH, Art Dempster, Herman Chernoff.
AB: Meaning you used heuristic rules?
PH: One had to input some initial coordinates. Then the program would try to improve the packing of the molecules, following some simple rules which we bettered over time.
When it became stuck, it got back to you, and one had to figure what might have gone wrong. For example, some protrusion of one molecule might have gotten stuck between protrusions from other molecules, in which case one had to view the situation in 3-d, pull them apart and restart packing. I found the problem very interesting.
HK: So then you thought this equipment would also be useful for statistics?
PH: Indeed, and for data analysis in general. I agree with Tukey in that I take data analysis to be the larger thing, comprising statistics. I realized that with high-powered graphics equipment one could do things one couldn’t do otherwise. Until then I had looked at graphics as just a toy.
AB: Did you learn about the PRIM-9 movie at the time?
PH: The PRIM-9 movie, oh, it was made later. We were beyond PRIM-9 because of Langridge’s equipment. Tukey sometimes visited and watched when we were there; he probably got the idea for PRIM-9 from there. Later he would visit Jerry Friedman at SLAC (the Stanford Linear Accelerator Center) where he had Mary-Ann Fisherkeller program what looked to us like inadequate equipment. The things one could do with Langridge’s equipment were much more advanced.
AB: Tukey could have stayed right there in Princeton?
PH: Actually, he couldn’t; Langridge wouldn’t let anybody outside of his group use the equipment.
AB: Oh, so Effi was lucky to be part of the group.
PH: Yes, she had an appointment there. Also, the few people who knew how to operate it were using it almost full time. Effi went there mostly during the night.
AB: Tukey wasn’t the operator who would try to get his own equipment, so he went to SLAC instead?
PH: The equipment was also expensive. You may not remember the prices we paid for the interactive equipment we purchased for ETH.
AB: . . . hundreds of thousands . . .
PH: I seem to remember the budget was about 3 million Swiss francs . . .
AB: . . . for a DEC-10, a PDP-11 and an Evans & Sutherland display, which is the equipment that you later purchased at ETH, after much difficulty.
PH: Yes, this brings up the point that the equipment at Princeton was much easier to operate because there was no PDP-11 mini-computer between the DEC-10 and the display. For obscure reasons (I think it was price) Evans and Sutherland decided to hook their display to a PDP-11, which slowed down the graphics and complicated the programming.
AB: Ironically, soon after the arrival of the equipment at ETH, you left ETH for Harvard. Werner Stuetzle programmed PRIM-ETH on the equipment in Zurich, and you started work on PRIM-H.
PH: I should explain the irony. Our proposal had prevented ETH from missing out on an important development in computing. The irony was that instead of appreciating it, the administration was venomously furious that we had dared to interfere with entrenched powers and to compete for funds with the batch computing establishment. Just then, Harvard made me an offer, and I was glad for it. At Harvard I got access to practically the same equipment, in the chemistry department, after relatively smooth negotiations. The graphics software was mostly done by Mathis Thoma and the data handler software by David Donoho.
HK: At that time you also got interested in projection pursuit.
PH: By then, I had realized that it was difficult to search visually for interesting views of more than three-dimensional data. And again, there is this problem that data analysis is a performing art, and in order to write papers one has to get into theory. One possibility that came along was projection pursuit.
AB: You worked on different projection pursuit criteria?
PH: Yes, with Donoho. It was very interesting, and he had some brilliant ideas in that area. One problem was to describe the invariance properties projection pursuit indices should have and then figure out which indices had them.
AB: Donoho didn’t publish on projection pursuit himself.
PH: I don’t think that he did. He would move on . . .
HK: You said that data analysis is a performing art. Can you describe to us some projects or data that you analyzed that were particularly interesting and what areas they came from?
PH: I like analysis of out-of-the-ordinary data. An example, currently of concern to me, is astronomical data, where one faces peculiarities such as the length of day and the irregular rotation of the earth. Such data require different time scales, one being the uniform, dynamical time scale that underlies the gravitational theory of the solar system, and the other being the civil time scale that relies on the rotation of the earth. Because the rotation of the earth is irregular, the length of day changes systematically over time as well as randomly. If one tries to extrapolate time for the calculation of ancient eclipses, one has to have an idea how big the extrapolation error might be. Extrapolation always involves a model, so one has to build a model for the rotation of the earth, check it for adequacy, and estimate its parameters. I rather like such intricacies. This was, on the one hand, an analysis of the data and, on the other hand, an effort in model building. It turned out that a Brownian motion model fitted the data pretty well. I wrote a little paper which I buried in a Festschrift, and lately I have been working on an update which I would like to publish sometime.[5]
As for data analysis in general, by 1990 I felt that I knew enough about the opportunities and technicalities of interactive data analysis and graphics, and I drifted into what might be called the philosophy of data analysis. The two papers of mine that you published as editor of the Journal of Computational and Graphical Statistics (JCGS), I like them quite a bit,[6] as well as a third one on strategy in data analysis.[7] I once wanted to remake all three of them into a kind of a prolegomenon to a book on data analysis.
AB: You also wrote some comments on the past, present and future of statistics.[8] Would you say anything differently from what you said then?
PH: I don’t think so.
[5] Now published as: Modeling the length of day and extrapolating the rotation of the Earth (2006). J. Geodesy 80 283–303.
[6] Massive data sets workshop: Four years after (1999). J. Comput. Graph. Statist. 8 635–652. Languages for statistics and data analysis (2000). J. Comput. Graph. Statist. 9 600–620.
[7] Strategy issues in data analysis (1997). In Proc. of the Conference on Statistical Science Honoring the Bicentennial of Stefano Franscini’s Birth (C. Malaguerra, S. Morgenthaler and E. Ronchetti, eds.) 221–238. Birkhäuser, Basel.
[8] Speculations on the Path of Statistics (1997). In: The Practice of Data Analysis, Essays in Honor of John W. Tukey (D. R. Brillinger, L. T. Fernholz and S. Morgenthaler, eds.) 175–191. Princeton Univ. Press.
Fig. 7. Tegernsee, 1990. Peter receives Humboldt Award from Professor Reimar Lüst, president of Humboldt Foundation.
HK: You mentioned this paper by Coale and Stephan (1962) that you find interesting. What other papers, if you look back, did you find particularly interesting?
PH: I should mention Tukey’s “Future of data analysis”; the first few pages and the last few pages in my opinion are a must for every statistician. The part in between has lost interest with time, but the first and last few pages are here to stay. The most remarkable aspect is that this paper was published by the Annals.
AB: Earlier we talked about the question of how difficult it is to see problems in large data sets. Is there anything else to say about massive datasets?
PH: Yes, there are more aspects. A problem is that the statistician of old times who analyzed data by hand would notice if something was amiss, whereas the modern statistician sees masses of data filtered through computer manipulations that may conceal data problems. Up to a few megabytes, computer graphics can reveal if something is amiss, but going beyond, it gets difficult to notice if something is even grossly wrong. This is, by the way, one of the problems with data mining. All too often when mining data one hits on trivial “nuggets.” I recall the case of a data analysis problem that was part of the Ph.D. exam at Harvard. It was a discriminant problem where one had to discriminate between carriers and noncarriers of a certain genetic disease. A student found that the variable that discriminated best was “age.” What he had found was that carriers and controls were not properly matched in age. Now, if one blindly runs a black-box algorithm on this problem, such as a genetic algorithm, the important, but trivial and possibly misleading, role of age will never be detected.
AB: This is a classical problem with a confounding factor. This is not something for which one can find an automatic solution.
HK: I imagine it must have been difficult for the student to prepare for such an exam.
PH: Yes, but it is very interesting to see what different people do facing the same data analysis problem.
AB: One of the difficulties for many students in statistics is vagueness, and this is what seems to have attracted you. Data analysis is partly an art, and that is very unsatisfactory for students who want to know the rules by which they can obtain good grades.
PH: Actually, I experienced the same in mathematics when I taught a calculus course at Cornell in 1963. Some students complained that in high school, mathematics had been very clear: there were clear problems with clear solutions. And now things were so vague and unsystematic, one could do things so many different ways.
AB: . . . and that was a complaint.
PH: As you said, students have problems with the art aspect.
AB: Are we digging in the wrong pool of students?
PH: Not really; as a student you have to get used to the art aspect, and we have to teach it. I just read a little paper by Steven Weinberg, the physicist, on the problems of teaching basic physics to nonphysics students. It seems to be very, very hard. I think they have a similar problem.
Fig. 8. Beijing, 1997. Lecture tour in China. With the host, Professor Guoying Li, Academia Sinica.
BABYLONIAN ASTRONOMY AND ASSYRIOLOGY
HK: Maybe we can switch topics. We have seen from your list of publications that you have worked also very much on Babylonian astronomy and Assyriology. Maybe you could describe how you got interested in these areas and what you achieved there.
PH: Yes, I got into it in a funny way. In the “Gymnasium” (college-bound high school), when I was about 16, I read everything I could find on physics and astronomy and the like.
I had gotten hold of a calculus book from which I learned calculus, then I got on to more advanced topics, and by the end of the year I had read something like Hermann Weyl’s “Space-Time-Matter.” Then I suddenly had it up to here. I knew I would be going into mathematics or physics later, but I just couldn’t continue right now. I had to do something completely different. Somehow I ended up learning cuneiform . . .
AB: But you could have found some ancient Egyptian manuscripts, or you could have found literature on Brazilian tribes: instead you picked cuneiform . . .
PH: Oh, the reason for not choosing ancient Egyptian was that it didn’t have vowels. I couldn’t possibly learn a language if the vowels were not written and not known, so I switched to cuneiform, although in fact I did first try ancient Egyptian.
AB: I see you have here an Egyptian grammar.
PH: Yes, I tried later again to see whether I could, and I did, but at that time I couldn’t, so it was cuneiform. I learned quite a bit of it when I was in the Gymnasium.
AB: It says here, in your CV, that you found the pertinent literature in an estate that was given to the library.
PH: The “Kantonsbibliothek” (state library) of Aargau had quite an extensive selection of books on cuneiform to start with. Toward the end of Gymnasium, I discovered that Neugebauer had published all the mathematical cuneiform texts available at the time, so naturally I got myself into that, too.
AB: Can you give us some background? There are different types of cuneiform, Sumerian, Assyrian, Babylonian, . . . . Is there sufficient commonality to learn them all?
PH: They have basically the same script, but their use is somewhat different, and they are different languages.
AB: How did you deal with the different languages? Did you pick a particular one?
PH: Akkadian I learned quite well, and then I learned some Hittite, and now I am dabbling in Sumerian. When I started at the University I found out that van der Waerden had written a book on ancient mathematics including Babylonian mathematics.
AB: . . . without knowing the original texts?
PH: He had worked with Neugebauer’s editions. Of course I approached van der Waerden, and I ended up writing a couple of papers on Babylonian mathematics and astronomy.
AB: Most undergraduates wouldn’t even have a sense of what would be an interesting problem. How did you find interesting problems?
PH: John Tukey used to tell the fairy tale of The Three Princes of Serendip, who through sagacity would discover the most incredible things by the wayside. I guess that’s the way to do it. Of course I heard that story from him only later. But Neugebauer’s “Astronomical Cuneiform Texts” contained a couple of pieces that he hadn’t really interpreted. So I had a careful look. I simply tried to go beyond what other people had done in this case.
AB: What type of mathematics did you find in cuneiform?
PH: Ironically, I ran into a problem treated in Babylonian math at a Gymnasium class reunion that I attended two weeks ago. A classmate, who does gardening somewhere in France, faced the problem of how to divide a lot in the shape of a trapezoid into three equal areas. She did it by eye-balling and checking, but she felt there should be a mathematical method. Amusingly, this exact problem is solved in a Babylonian mathematical text. It leads to a quadratic equation.
AB: Did the Babylonians actually solve the quadratic equations?
PH: Most of the texts have problems without solutions. I guess they belonged to the curriculum of students. In this particular case the solution was described step by step. Sometimes one encounters oddities, though.
For example, in one case there was a solution with an error in that a multiplication with a power of 60 was missed; but then this error was “corrected” by another error in the traditional sexagesimal (base-60) number system.
HK: So you continued publishing in this area all your life?
PH: Roughly every 10 years I got an attack of cuneiform. The first serious one, I think, was as a Ph.D. student when I was helping van der Waerden write his book. I contributed some chapters on Babylonian astronomy. Later, when I was at Berkeley, I sat in courses on Akkadian, and similarly when I visited Yale. A bigger effort took place when I was asked to write a review of a book by J. D. Weir on the Venus tablets, that is, the old Babylonian Venus observations and their use for dating the Hammurabi dynasty. This involved doing some calculations as well as some extensive programming to carry out these calculations. Then, one day in 1973 when I was sitting in the library, van der Waerden came along and said he had been asked to participate in a panel discussion on Velikovsky and he was not keen on doing it and whether I would be interested . . . I think he had argued with Velikovsky before.
AB: You need to open a parenthesis here . . .
HK: . . . most readers will not know who he is.
PH: Professionally, Velikovsky was a psychiatrist. Because of an interest in Moses and the story of the Exodus, he began working on Egyptian history. But his evasive behavior, and his refusal to listen to arguments, in my opinion meant that he was a charlatan when it came to history. He had a theory that in historical times there were big catastrophes: Venus jumped forth from Jupiter as a kind of comet, erring around in the solar system for a while until it settled in its present orbit. All this was supposed to have occurred sometime in the second millennium BC.
Suddenly, these Venus observations from the second millennium took on special relevance. So I said to myself if I had to write this review and do some calculations, I might as well put the results to use on the Velikovsky issue. This would also speed things up by imposing a time limit. The panel discussion was to take place the following year in February or March.[9] Being on that panel was a temporary culmination of this foray into ancient cuneiform writing. Because of the big shots on the panel, Velikovsky and Sagan, my name also made it for once into the New York Times, as a “Professor of ancient history”!
AB: What was exactly the result?
PH: Based on the Venus tablets I argued that Venus was in its present-day orbit back in the first part of the second millennium, and there was evidence for Venus also from much earlier times.
AB: . . . and what was Velikovsky’s basis for this theory?
PH: His theory was a sham; he had lots of footnotes and citations, making it look very learned, but if one checked the footnotes one found that they were either obsolete or didn’t quite say what he claimed.
AB: So why would he make this splash? Why was he not just thrown into obscurity from the outset?
PH: For various reasons: Macmillan Publishers Ltd was supposed to publish his book, but Macmillan was also one of the big textbook publishers. Several of their authors then threatened they would no longer publish with Macmillan. Macmillan withdrew, and Velikovsky made a big noise about censorship. He stylized himself as being maligned for his views by the establishment, which attracted many people, including scientists, who felt they were not sufficiently recognized. At this AAAS-sponsored panel discussion, the audience was split (half was pro and half against Velikovsky) and neither side would accept any arguments. Both sides were convinced they were right.
Carl Sagan was on the panel, and he was just as much a prima donna as Velikovsky. Velikovsky would say the theories must be wrong because they do not agree with his evidence, and Sagan would say Velikovsky’s data must be wrong because they do not agree with the theory. Rambling on and on, neither would keep to the time limits, but nobody dared stop them. Let’s say, it was an interesting experience.
HK: But in the end your evidence solved the case, more or less.
PH: You do not solve such cases. Evidence is surprisingly irrelevant; opinions are stronger. What I claimed to be gross errors in a corrupt manuscript, hard-core Velikovskians would take as evidence for the erratic behavior of Venus.
[9] It took place at the AAAS meeting in San Francisco, on February 25, 1974.
AB: But didn’t someone pick up on your results?
PH: You mean people with a professional interest, Assyriologists and historians? Those people would be concerned about the precise date of the observations. Venus phenomena are fairly periodic; they repeat themselves almost exactly after 8 years except that there is a shift in the lunar calendar by 4 days. After 7 or 8 such periods, 56 or 64 years later, they are shifted by about a month, so they are again in step with the moon, plus/minus two days. For example, the Venus data fit well with a beginning of Hammurabi’s reign in 1848 BC (the so-called “long” chronology), but also with 1792, 1784 and 1728 BC (the two “middle” and the “short” chronologies). These are the four most popular chronologies among historians. Around 1980, I came back to the problem and showed that the astronomical evidence overwhelmingly favored the long chronology, and that the middle and short ones in all likelihood were incorrect.
This was based on a relatively delicate statistical argument, combining robust, frequentist and Bayesian methods. Among Assyriologists, some were convinced and some were not. Some distrusted the corrupt data, and some rejected the long chronology because it leaves a dark period in the middle of the second millennium, a hole without historical information. There is still a big discussion which chronology is correct.

AB: In 1980, what made you go back to this problem again? You thought you had some additional insights or computing was better?

PH: Earlier I had used only the Venus data and I wanted to use the month-length data as well. The month-length data had been used already by the first book on the Venus data in the 1920s. But now we had much more data and I thought we could do better. So this was the reason I went back. I was invited to present this material at an Assyriology meeting in London in 1982, which I did. The next effort was when I retired and had more time on hand.

HK: What led to this book in front of us? [Pointing to a book on the table.¹⁰]

¹⁰ Huber, P. J. and De Meis, S. (2004). Babylonian Eclipse Observations from 750 BC to 1 BC. Milan: IsIAO-Mimesis, 2004.

Fig. 9. Houston, 2007. Third Erich L. Lehmann Symposium. Chance encounter with a former Ph.D. Student: Emery Brown, Professor of Computational Neurology, MIT/Harvard.

PH: That book had a very long gestation period, dating back to the time when I helped van der Waerden write his book on Babylonian astronomy. In that context I had to look at the lunar eclipse observation texts which had just been published in 1955. I tried to figure out what they really said. I did that and produced a handwritten manuscript as a basis for remarks in van der Waerden’s book.
Later, in 1973, I broke a leg and was immobilized for a while, so I went over the material once more and wrote a more complete manuscript, in English this time. Some people referred to this manuscript as “Huber’s samizdat.” Somebody claimed it was the most quoted unpublished manuscript in the area, which doesn’t mean much, because not too many people work in that area. From time to time people asked for a copy. It was Salvo De Meis who pushed me into publishing. This was in the late 1990s.

AB: ... and De Meis is ...?

PH: ... a nuclear engineer by profession, but he also does ancient astronomy as a hobby. We got together and expanded the manuscript. Since the former draft, other texts had been published, so we included all of them. We did some data analysis to solve the problem of the difference between civil time and dynamical time. Recall that civil, universal time is essentially Greenwich time, based on the rotation of the earth; ephemeris or dynamical time is based on the uniform time scale underlying the gravitational theories.

AB: Do you see any other projects along these lines, or open problems?

PH: There are many problems, perhaps not enough data. Right now I am looking into something very nonstatistical, namely, the old Babylonian understanding of Sumerian grammar. Old Babylonian grammar texts are fascinating because they are the earliest serious grammatical documents, roughly from Hammurabi’s time. Akkadian, spoken by the Babylonians, is a Semitic language; Sumerian is an agglutinating language without links to any other known language. Sumerian died out approximately 2000 BC as a spoken language, but it continued to be used until the very end of cuneiform, that is, until about year 0.
Just as with the old Babylonian mathematical texts, the interesting thing about these texts is the history of science aspect. They offer extensive, disciplined verbal paradigms with the Sumerian forms on the left-hand side and the corresponding Akkadian forms on the right-hand side. Hence it should be possible to extract the old Babylonian understanding of Sumerian verbal grammar from these texts. This looks like an interesting challenge to me.¹¹

HK: Peter, thank you for these fascinating stories. We wish you success with your adventures in Assyriology and hope you will keep in touch with statistics.

¹¹ Huber, P. J. (2007). On the Old Babylonian Understanding of Grammar: A Reexamination of OBGT VI-X. J. Cuneiform Studies 59 1–17.
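
The Venus-cycle arithmetic mentioned in the interview (phenomena repeating after 8 years with a 4-day shift in the lunar calendar, and falling back in step with the moon after 7 or 8 such periods) can be checked with a short calculation. The Python sketch below is only an illustration and is not Huber's statistical method. The mean lunar month of 29.53 days is an assumption not stated in the interview, and the names `SYNODIC_MONTH_DAYS`, `SHIFT_PER_CYCLE_DAYS`, `CYCLE_YEARS` and `lunar_misfit` are hypothetical, introduced here for the example.

```python
# Assumption (not from the interview): mean length of a lunar month in days.
SYNODIC_MONTH_DAYS = 29.53

# From the interview: Venus phenomena repeat after 8 years,
# shifted by about 4 days in the lunar calendar.
SHIFT_PER_CYCLE_DAYS = 4
CYCLE_YEARS = 8


def lunar_misfit(cycles: int) -> float:
    """Days by which the accumulated shift misses a whole number of lunar months."""
    shift = cycles * SHIFT_PER_CYCLE_DAYS
    nearest_months = round(shift / SYNODIC_MONTH_DAYS)
    return shift - nearest_months * SYNODIC_MONTH_DAYS


# Misfit after 1, 7 and 8 Venus cycles (8, 56 and 64 years).
for cycles in (1, 7, 8):
    years = cycles * CYCLE_YEARS
    print(f"{years:2d} years: misfit {lunar_misfit(cycles):+.2f} days")

# The four candidate dates (BC) for the start of Hammurabi's reign
# quoted in the interview, and the gaps between neighbouring dates.
chronologies_bc = [1848, 1792, 1784, 1728]
gaps = [a - b for a, b in zip(chronologies_bc, chronologies_bc[1:])]
print("gaps between chronologies (years):", gaps)
```

Under these assumptions, 7 cycles (56 years) miss a whole lunar month by roughly 1.5 days and 8 cycles (64 years) by roughly 2.5 days, in line with the "plus/minus two days" quoted in the interview. The gaps between the four chronologies are all multiples of the 8-year cycle. The sketch covers only this calendar arithmetic, not the robust, frequentist and Bayesian argument that Huber used to discriminate between the chronologies.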