Comment: Citation Statistics
Comment on "Citation Statistics" [arXiv:0910.3529]
Authors: Peter Gavin Hall
Statistical Science 2009, Vol. 24, No. 1, 25–26. DOI: 10.1214/09-STS285D. Main article DOI: 10.1214/09-STS285. © Institute of Mathematical Statistics, 2009.

Key words and phrases: Bibliometric analysis, bibliometric data, citation analysis, impact factor, journal ranking, research assessment.

I remember a US colleague commenting, in the mid 1980s, on the predilection of deans and other university managers for assessing academic statisticians' performance in terms of the numbers of papers they published. The managers, he said, "don't have many skills, but they can count." It's not clear whether the management science of assessing research performance in universities has advanced greatly in the intervening quarter century, but there are certainly more things to count than ever before, and there are increasingly sophisticated ways of doing the counting.

The paper by Adler, Ewing and Taylor is rightly critical of many of the practices, and arguments, that are based on counting citations. The authors are to be congratulated for producing a forthright and informative document, which is already being read by scientists in fields outside the mathematical sciences. For example, I mentioned the paper at a meeting of the executive of an Australian science body, and found that its very existence generated considerable interest. Even in fields where impact factors, h-factors and their brethren are more widely accepted than in mathematics or statistics, there is apprehension that the use of those numbers is getting out of hand, and that their implications are poorly understood. The latter point should be of particular concern.
We know, sometimes from bitter experience, of some of the statistical challenges of comparing journals or scientists on the basis of citation data—for example, the data can be very heavy-tailed, and there are vast differences in citation culture among different areas of science and technology. There are major differences even within probability and statistics. However, we have only rudimentary tools for quantifying this variation, and that means that we can provide only limited advice to people who are using citation data to assess the work of others, or who are themselves being assessed using those data.

Therefore, one of the conclusions we should draw from the study by Adler, Ewing and Taylor is that we need to know more. Perhaps, as statisticians, we could undertake a study, possibly funded in part by a grant-awarding agency or our professional societies, into the nature of citation data, the information they contain, and the methods for analysing them if one must. This would possibly require the assistance of companies or organizations that gather such data, for example, Thomson Reuters and the American Mathematical Society. However, without a proper study of the data to determine its features and to develop guidelines for people who are inevitably going to use it, we are all in the dark. This includes the people who sell the data, those who use it to assess research performance and those of us whose performance is judged.

Peter Gavin Hall is Professor of Statistics, University of Melbourne, Melbourne, VIC 3010, Australia (e-mail: halpstat@ms.unimelb.edu.au).
It should be mentioned, however, that too sharp a focus on citation analysis and performance rankings can lead almost inevitably to short- rather than long-term fostering of research excellence. For example, the appropriate time window for analyzing citation data in mathematics and statistics is often far longer than the two to three years found in most impact factor calculations; it can be more like 10–20 years. However, university managers typically object to that sort of window, not least because they wish to assess our performance over the last few years, not over the last decade or so. More generally, focusing sharply on citations to measure performance is not unlike ranking a movie in terms of its box-office receipts. There are many movies, and many research papers, that have a marked long-term impact through a complex process that is poorly represented by a simple average of naive criteria. Moreover, by relying on a formulaic approach to measuring performance we act to discourage the creative young men and women whom we want to take up research careers in statistical science. If they enjoyed being narrowly sized and measured by bean-counters, they'd most likely have chosen a different profession.

To illustrate some of the issues connected with citation analysis I should mention recent experiences in Australia with the use of citation data to assess research performance. In the second half of 2007 the academies, societies and associations representing Australian academics were asked by our federal government to rank national and international journals, as a prelude to a national review of research and to the development of new methods for distributing "overheads" to universities. The request was not uniformly well received by the academic community. For example, I didn't like it.
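The window issue is easy to make concrete. A journal impact factor for a given year is the number of citations received that year to items published in a preceding window (conventionally two years), divided by the number of citable items published in that window. The sketch below generalizes the window length; the journal data are entirely hypothetical.

```python
# Minimal sketch of an impact factor with a configurable citation window.
# All data are hypothetical; real calculations also involve decisions about
# which items count as "citable" that are glossed over here.

def impact_factor(citations_by_year, items_by_year, year, window=2):
    """Citations in `year` to items published in the previous `window` years,
    divided by the number of citable items published in those years."""
    prior_years = range(year - window, year)
    cites = sum(citations_by_year.get(y, 0) for y in prior_years)
    items = sum(items_by_year.get(y, 0) for y in prior_years)
    return cites / items if items else 0.0

# Hypothetical journal: citations received in 2008, by publication year.
citations = {2003: 40, 2004: 55, 2005: 60, 2006: 70, 2007: 90}
items = {2003: 100, 2004: 100, 2005: 110, 2006: 120, 2007: 130}

print(impact_factor(citations, items, 2008, window=2))  # conventional 2-year
print(impact_factor(citations, items, 2008, window=5))  # 5-year variant
```

For a field where papers keep accruing citations for a decade or more, the two numbers can differ substantially, which is the crux of the windowing complaint above.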
However, to the government's credit it did endeavor to consult. Different fields drew up journal rankings in four tiers, using methods (e.g., deliberation by committee) that they deemed appropriate. But the conservative government that proposed this process lost office in November 2007, and a month later the Labor government that replaced it quietly but assiduously set about revising the rankings. They still used four tiers, consisting of the top 5%, next 15%, next 30% and lower 50% of the cohort of journals in a given field. (Selecting the cohort was, and is still, a controversial matter.) However, in many cases the revised rankings differed substantially from the earlier ones. In probability and statistics, and applied mathematics, the revised rankings were worked out by the bureaucracy and by consultants whom the government employed, using five-year journal impact factors apparently computed from purchased data. The resulting ranking departed from accepted norms in a number of important respects, enough to shed significant doubt on the credibility of the whole exercise. Initially the procedures laid down by the Australian Research Council (ARC) for commenting on their revised ranking seriously restricted the ability of the probability and statistics community to respond as a body, for example through a committee. However, thanks to timely intervention by the IMS President in early July 2008, we were given an opportunity to make a submission directly to the ARC.

This enabled us to form a committee to recommend the correction of a number of serious problems. For example, the ARC's revised ranking based on impact factors had dictated that no journals in probability could be in the top tier; probabilists generally publish less, and are cited less, than statisticians.
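The four-tier split (top 5%, next 15%, next 30%, lower 50%) amounts to a percentile cut on whatever score is used to order the cohort. A minimal sketch, with hypothetical journals and scores, and illustrative tier labels:

```python
# Sketch of a four-tier percentile split over a ranked cohort of journals.
# Journal names, scores and tier labels are illustrative, not the ARC's data.

def assign_tiers(scores):
    """Sort journals by descending score; cut at cumulative 5%, 20%, 50%."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    n = len(ranked)
    tiers = {}
    for i, journal in enumerate(ranked):
        frac = (i + 1) / n          # cumulative fraction of the cohort
        if frac <= 0.05:
            tiers[journal] = "Tier 1"   # top 5%
        elif frac <= 0.20:
            tiers[journal] = "Tier 2"   # next 15%
        elif frac <= 0.50:
            tiers[journal] = "Tier 3"   # next 30%
        else:
            tiers[journal] = "Tier 4"   # lower 50%
    return tiers

# Hypothetical cohort of 20 journals scored by a five-year impact factor.
scores = {f"Journal {k}": 2.0 - 0.05 * k for k in range(20)}
tiers = assign_tiers(scores)
```

Note how brittle the scheme is: with a cohort of 20, exactly one journal can be Tier 1, so both the choice of cohort and small perturbations in the scores near each cut move journals between tiers.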
Even within statistics there were a number of what I regarded as significant errors. For example, some high impact factor journals, dedicated to specific fields of application, were placed into much higher tiers than renowned journals that focused more on the development of general statistical methodology. Still other important journals were omitted entirely. The committee set to work to remedy these problems.

As you can imagine, the redistribution of journals among tiers was not without significant debate. I received very strong email messages from, for example, a medical statistician who objected strenuously to Statistics in Medicine being in a lower tier than The Annals of Probability. As he pointed out, the committee revising the ranking had "no objective criterion" for journal ranking other than impact factors, and in Thomson Reuters' most recent (i.e., 2007) list of those factors, The Annals of Probability had an impact factor of only 1.270, whereas Statistics in Medicine enjoyed 1.547. Then there were the upset probabilists, who objected to the large number of statistics journals in the top tier, relative to the small number of probability journals. One probabilist suggested a substantial reduction in the number of statistics journals being considered. Several argued that too much attention was being paid to impact factors. (I was unsuccessful in persuading my statistics colleagues to move far enough away from an impact-factor view of the world to put The Annals of Applied Probability into the top tier, but colleagues on the applied mathematics committee generously adopted the journal and placed it in their first tier.)
As these experiences indicate, the lack of a clear understanding by the probability and statistics community of the strengths and weaknesses of citation analysis is causing more than a few problems. If the Australian government has its way, whether a paper is published in a first- or second-tier journal will influence the standing of the associated research, and will affect the "overhead" component of funding that flows to a university in connection with that work. I think this is quite wrong, but at present we do not have much choice other than to make the best of a bad deal. In that context, if our community does not have a clear and authoritative understanding of the nature, and hence the limitations, of impact factors (and more generally of citation data), then we cannot react in an authoritative way to arguments that we feel are invalid, but are nevertheless strongly held. Frankly, we need to know more about citation data and citation analysis, and that requires investment so that we can investigate the topic.