The normalization of citation counts based on classification systems


Authors: Lutz Bornmann, Werner Marx, Andreas Barth

Accepted for publication in Publications 2013, 1 (ISSN 2304-6775, www.mdpi.com/journal/publications). Open access article.

Lutz Bornmann 1,*,†, Werner Marx 2,†, Andreas Barth 3,†

1 Division for Science and Innovation Studies, Administrative Headquarters of the Max Planck Society, Hofgartenstr. 8, 80539 Munich, Germany; E-Mail: bornmann@gv.mpg.de
2 Max Planck Institute for Solid State Research, Heisenbergstrasse 1, D-70569 Stuttgart, Germany; E-Mail: w.marx@fkf.mpg.de
3 FIZ Karlsruhe, Hermann-von-Helmholtz-Platz 1, D-76344 Eggenstein-Leopoldshafen, Germany; E-Mail: andreas.barth@fiz-karlsruhe.de
† These authors contributed equally to this work.
* Author to whom correspondence should be addressed; E-Mail: bornmann@gv.mpg.de

Abstract: If we want to assess whether a paper has had a particularly high or low citation impact compared with other papers, the standard practice in bibliometrics is to normalize citations with respect to the subject category and the publication year. A number of proposals for improved normalization procedures have been put forward in recent years. Against the background of these proposals, this study describes an ideal solution for the normalization of citation impact: in a first step, the reference set for the publication in question is collated by means of a classification scheme in which every publication is associated with a single principal research field or subfield entry (e.g. via Chemical Abstracts sections) and a publication year. In a second step, percentiles of citation counts are calculated for this set and used to assign a normalized citation impact score to the publications (including the publication in question).

Keywords: bibliometrics; normalized citation indicators; percentiles

1. Introduction

If we wish to assess whether a paper has had a particularly high or particularly low citation impact compared with other papers, the standard practice in bibliometrics is to normalize citation counts, which are field-specific. "Each field has its own publication, citation and authorship practices, making it difficult to ensure the fairness of between-field comparisons. In some fields, researchers tend to publish a lot, often as part of larger collaborative teams. In other fields, collaboration takes place only at relatively small scales, usually involving no more than a few researchers, and the average publication output per researcher is significantly lower. Also, in some fields, publications tend to have long reference lists, with many references to recent work. In other fields, reference lists may be much shorter, or they may point mainly to older work. In the latter fields, publications on average will receive only a relatively small number of citations, while in the former fields, the average number of citations per publication will be much larger" [1]. Citation impact can only be used for cross-field comparisons after a field-specific normalization of the citation impact of papers has been undertaken. In bibliometrics, normalizing procedures use statistical methods to calculate citation impact values which are comparable across different fields and times.

2. The use of reference sets

When normalized citation indicators are generated, two standards are used as reference sets: "the average citation rate of the journal and the average citation rate of the field. In the first case, the citation count of a paper is compared with the average citation rate for the particular journal in the particular year. In the second case, the citation count of a paper is compared with the average citation rate for the particular field or subfield for the particular year" [2, p. 172].
These two standards are then used to calculate relative citation indices, which are called Relative Citation Rates [3]. With this standard, one has to take into account that most bibliometric studies are based on journals: either they are based on individual journals, or journal sets are used, in which individual journals are combined to form a field-specific set. "Primary journals in science are generally agreed to contain coherent sets of papers both in topics and in professional standards. This coherence stems from the fact that many journals are nowadays specialized in quite narrow sub-disciplines and their 'gatekeepers' (i.e. the editors and referees) controlling the journal are members of an 'invisible college' sharing their views on questions like relevance, validity or quality" [4, p. 314]. Even though both standards (normalization based on individual journals or on field-specific journal sets) have already been used in a variety of ways in bibliometrics, there are a number of arguments in favour of using research fields in the normalization rather than individual journals:

1) Indicators represent incentives for academics to shape their publication behaviour in a particular way [5]. Academics are therefore guided by the indicators which are used in research evaluations. Normalization based on individual journals rewards publications in journals of little reputation: in these journals it is easier for individual publications to be above the reference value [6]. The use of an indicator which is normalized on the basis of individual journals therefore encourages academics to publish in journals of lesser reputation.

2) In general, reference values should be used to take account of (or to disregard) factors in the citation analysis which may have an impact on citations but are not related to research quality.
The year of publication, for example, affects the citation impact of a publication, although the year of publication has no bearing on quality. We can assume that a publication from 2000 is not of a higher quality per se than a publication from 2005, even if the older publication is usually cited more often than the more recent one. We also know (see above) that the research field has an influence on citations. The different citation rates between research fields do not reflect differences in quality between the papers in those fields, however. Whereas mean citation impact values for different subject categories reflect only the different citation behaviours within different research fields, the values for different individual journals reflect not only the different behaviours, but also the different journal qualities. We know that certain journals publish (on average) higher-quality papers than other journals [7]. Thus, the citation impact score for a journal is also quality-driven, whereas the citation impact score for a research field is not. As a consequence, results based on standards derived from individual journals are not meaningful without accompanying indicators.

3) Indicators based on normalization to individual journals must therefore always be accompanied by indicators which provide information on the quality of the journals in which the research under evaluation has been published. The normalized score on its own is not meaningful: if two institutions A and B have the same above-average score, it is not clear whether the score is based on normalization to journals with high or low citation impact. Institution A, which was normalized to a high citation impact, would have published in reputational journals and at the same time achieved more citations. This institution would therefore be successful in two respects.
Institution B, which was normalized to a low citation impact, would have published in unimportant journals and exceeded only this low standard. Institution B is in fact in a worse position (in two respects), a fact which is not expressed by the normalized score [8]. Only the quality of the journals enables an assessment of whether an institution has truly achieved a high impact with its publications when it has a comparably low normalized citation impact (because it has published mainly in reputational journals with a high citation impact), or whether it has truly achieved a low impact (because it has published mainly in journals of little reputation with a low citation impact).

4) In bibliometric evaluations, the mean normalized citation impact (of institutions, for example) is often shown as a function of the individual publication years. Since individual journals usually enter into the calculation of normalized impact scores with significantly smaller publication sets than journal sets do, normalized scores based on individual journals exhibit greater variation over the publication years than normalized scores based on the publications in a specific research field. This variation often makes it almost impossible to recognize a true trend over the publication years for normalized scores based on individual journals.

Given these problems, other publications have already recommended that field normalization be given preference over normalization based on single journals. Aksnes [2], for example, writes that "the field average should be considered as a more adequate or fair baseline [than the Relative Citation Rate], a conclusion that is also supported by other studies" (p. 175).
The Council of Canadian Academies [9] recommends that "for an assessment of the scientific impact of research in a field at the national level, indicators based on relative, field-normalized citations (e.g., average relative citations) offer the best available metrics. At this level of aggregation, when appropriately normalized by field and based on a sufficiently long citation window, these measures provide a defensible and informative assessment of the impacts of past research" (p. xv). The preference for the use of a research field instead of an individual journal in bibliometrics does not mean that the discussion on normalized indicators has now come to an end, however. Recent proposals for improving the calculation of normalized impact scores refer primarily to (1) the use of better alternatives to journal sets and (2) the avoidance of the arithmetic average when calculating reference scores.

3. Determination of research fields

In most studies, the determination of research fields is based on a classification of journals into subject categories developed by Thomson Reuters (Web of Science) or Elsevier (Scopus). "The Centre for Science and Technology Studies (CWTS) at Leiden University, the Information Science and Scientometrics Research Unit (ISSRU) at Budapest, and Thomson Scientific [now Thomson Reuters] itself use in their bibliometric analyses reference standards based on journal classification schemes" [10]. Each journal is classified as a whole into either one or several subject categories. The limitations of journal classification schemes become obvious in the case of multidisciplinary journals such as Nature or Science and of highly specialized fields of research. Papers that appear in multidisciplinary journals cannot be assigned exclusively to one field, and for highly specialized research fields no adequate reference values exist.
To overcome the limitations of journal classification schemes, Bornmann et al. [11] and Neuhaus and Daniel [10] proposed an alternative possibility for the compilation of comparable sets of publications (the reference standard) for the papers in question. Their normalization is based on a publication-specific classification in which each publication is associated with at least one single principal field or subfield entry, highlighting the most important aspect of the individual publication [12]. The databases offered by Chemical Abstracts Service (CAS), a division of the American Chemical Society (ACS), are the most comprehensive databases of publicly disclosed research in chemistry and related sciences. The CAS literature database (CAplus) includes both papers and patents published since around 1900. This database covers publications not only in the classical fields of chemistry, but also in many other natural science disciplines such as materials science, physics and biology. CAS has defined a three-level classification scheme which categorizes chemistry-related publications into five broad headings of chemical research (section headings), which are divided into 80 different subject areas called Chemical Abstracts sections (see Table 1). Each of the 80 sections is further divided into a varying number of sub-sections. Each individual paper is assigned to only one section or sub-section according to its main subject field and interest. If the subject matter is also appropriate to other sections, cross-references are provided. Detailed descriptions of all sections can be found on the CAS webpage and in Chemical Abstracts Service [13].
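The compilation of reference sets from such a publication-specific classification can be sketched as follows. This is a minimal illustration with invented records; the field names ("section", "year", "citations") are assumptions for the sketch, not the actual CAplus data model:

```python
from collections import defaultdict

# Hypothetical publication records. The field names ("section", "year",
# "citations") are illustrative assumptions, not the actual CAplus schema.
publications = [
    {"id": "p1", "section": "22", "year": 2010, "citations": 4},
    {"id": "p2", "section": "22", "year": 2010, "citations": 15},
    {"id": "p3", "section": "22", "year": 2010, "citations": 0},
    {"id": "p4", "section": "31", "year": 2010, "citations": 7},
]

def build_reference_sets(pubs):
    """Group publications by (principal section, publication year).

    Because each paper carries exactly one principal section, every
    paper lands in exactly one reference set."""
    sets = defaultdict(list)
    for pub in pubs:
        sets[(pub["section"], pub["year"])].append(pub)
    return dict(sets)

ref_sets = build_reference_sets(publications)
print(sorted(len(v) for v in ref_sets.values()))  # [1, 3]
```

Each key of the resulting mapping identifies one reference set; a paper's citation count is later compared only against the papers in its own set.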
This classification is applied to all publications of the CAplus literature and patent database (see https://www.cas.org/content/ca-sections):

- "Each CA section covers only one broad area of scientific inquiry
- Each abstract in CA appears in only one section
- Abstracts are assigned to a section according to the novelty of the process or substance that is being reported in the literature
- If abstract information pertains to a section(s) in addition to the one assigned, a cross-reference is established"

Table 1. Summary of CA Section Headings. For Organic Chemistry the individual sections are listed for illustration.

Section Heading: Number of Sections
BIOCHEMISTRY (BIO/SC): 20
ORGANIC (ORG/SC): 14
  21. General Organic Chemistry
  22. Physical Organic Chemistry
  23. Aliphatic Compounds
  24. Alicyclic Compounds
  25. Benzene, Its Derivatives, and Condensed Benzenoid Compounds
  26. Biomolecules and Their Synthetic Analogs
  27. Heterocyclic Compounds (One Hetero Atom)
  28. Heterocyclic Compounds (More Than One Hetero Atom)
  29. Organometallic and Organometalloidal Compounds
  30. Terpenes and Terpenoids
  31. Alkaloids
  32. Steroids
  33. Carbohydrates
  34. Amino Acids, Peptides, and Proteins
MACROMOLECULAR (MAC/SC): 12
APPLIED (APP/SC): 18
PHYSICAL, INORGANIC, AND ANALYTICAL (PIA/SC): 16

The number of papers per section and year varies widely. In 2010, for example, the average number of papers per section was 11,869, with 73,273 (section 1) as the highest and 240 (section 32) as the lowest number of papers. From a statistical point of view, this is amply sufficient for a reliable normalization. If sections are too large, it is questionable whether they are sufficiently homogeneous in terms of citation practices (this has to be investigated further): a section may cover sub-areas of chemical research with different citation practices.
In any case, the sub-sections rather than the complete sections can be consulted for normalization, provided that the number of papers per year meets statistical requirements [14]. An advantage of the Chemical Abstracts sections for bibliometric analyses is that indexers assign the relevant sections to the papers intellectually. This classification is not affected by what is called the "indexer effect": according to Braam and Bruil [15], the classification of papers into the 80 Chemical Abstracts sections is in accordance with author preferences for 80% of all papers. The sections of Chemical Abstracts thus seem to provide a promising basis for the description and comparison of publications and of the impact profiles of journals. Hence, for evaluation studies in chemistry and related fields [14,16], comparable papers can be compiled into reference sets using a specific CA section or sub-section which covers the content of this specific publication set to a large extent. In contrast to the classification of journals into journal sets, this procedure also assigns papers from multidisciplinary and wide-scope journals to a specific field. In addition to Chemical Abstracts, there are a number of other specialist databases which categorize publications on a paper-by-paper basis in terms of research fields (e.g. Medline or Scitation). In the field of mathematics, a common Mathematics Subject Classification (MSC) has been developed by the two providers of large mathematical literature databases: Zentralblatt MATH from FIZ Karlsruhe and MathSciNet from the American Mathematical Society (http://msc2010.org). After a revision in 2010, both providers apply the same classification codes to all documents in their databases. At the top level there are 63 subject headings, with a considerable number of specific classification codes below them. Each document is assigned at least one MSC code.
Since the MSC is a rather detailed set of classifications which are systematically applied, it should be well suited to a bibliometric analysis of papers with respect to their research fields. For the majority of the titles included in Scopus, index terms from different specialist areas are added manually. These index terms are adopted from thesauri which the database operator Elsevier itself owns or licenses (e.g. Medline). The terms are added to the publication records in order to improve retrieval in field-specific searches for publications. In addition to the MeSH terms (Medline) for the life sciences and health sciences, the Ei thesaurus, for example, is used for engineering, technology and the physical sciences. In some fields, it may be difficult to find one complete paper classification scheme; in these fields, several schemes must be indicated. For example, in mathematics a major scheme exists, but in economics it is hard to find an appropriate scheme. Moreover, some fields (e.g. the information sciences) do not have a classification scheme that is accepted by everyone.

4. Percentiles of citation counts

Two significant disadvantages are inherent in the calculation of the Relative Citation Rate [7]: (i) As a rule, the distribution of citations over publication sets is skewed to the right. The arithmetic mean value calculated for a reference set is therefore determined by a few highly cited papers. The arithmetic mean as a measure of central tendency is not suitable for skewed data. This is the sole reason why, for example, the University of Göttingen occupies position 2 in the citation impact ranking of the Leiden Ranking 2011/2012; the relevant mean score for this university "turns out to have been strongly influenced by a single extremely highly cited publication" [17, p. 2425].
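The sensitivity of the arithmetic mean to a single highly cited paper can be illustrated with a toy reference set; all citation counts below are invented:

```python
import statistics

# Invented citation counts for a small reference set containing one
# extreme outlier, mimicking a right-skewed citation distribution.
citations = [0, 1, 1, 2, 2, 3, 3, 4, 500]

mean = statistics.mean(citations)    # pulled up by the single outlier
median = statistics.median(citations)

print(round(mean, 2))  # 57.33
print(median)          # 2
```

A paper with, say, 10 citations would look below average against the mean-based baseline, yet it outperforms almost the entire reference set; this is exactly the distortion that percentile-based indicators avoid.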
(ii) The quotient permits merely a statement about whether a publication is cited more or less often than the average of the reference set. Other attributes which could describe the citation impact of a publication as excellent or outstanding are based on (arbitrary) rules of thumb with no relationship to statistical citation distributions [18].

Using percentiles (or percentile rank classes) to normalize citation impact allows better comparisons of the impact of publications than normalization using the arithmetic mean [19-22]. The percentile provides information about the citation impact the publication in question has had compared with other publications. A percentile is a value below which a certain proportion of observations fall: the higher the percentile for a publication, the more citations it has received compared with publications in the same research field and publication year. The percentile for a publication is determined using the distribution of the percentile ranks over all publications: for example, a value of 90 means that the publication in question is among the 10% most cited publications; the other 90% of the publications have achieved less impact. A value of 50 represents the median and thus an average citation impact compared with the other publications (from the same research field and publication year). For a publication set under study, each publication in the set must be normalized using its specific reference set of publications from the same field and publication year. In other words, each publication receives its specific percentile, which is calculated on the basis of its specific reference set. However, this normalization using percentiles is sensitive to the coverage of the bibliographic database.
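A minimal percentile computation along these lines can be sketched as follows. This uses one common convention (share of the reference set cited strictly less often); published indicators differ in how they handle ties [21,22], and the citation counts are invented:

```python
def percentile(citation_count, reference_set):
    """Percentage of publications in the reference set cited less often
    than the publication in question. One common convention; published
    indicators differ in how they treat tied citation counts."""
    below = sum(1 for c in reference_set if c < citation_count)
    return 100.0 * below / len(reference_set)

# Invented citation counts of a reference set: papers from the same
# research field and publication year.
ref = [0, 1, 2, 3, 5, 8, 13, 21, 34, 55]

print(percentile(34, ref))  # 80.0 -> among the 20% most cited
print(percentile(3, ref))   # 30.0
```

Because each publication is evaluated against its own (field, year) reference set, the resulting scores are comparable across fields and publication years.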
For instance, the more local or national journals (typically of low impact) a database contains, the easier it becomes for a publication in an international journal (typically of higher impact) to have a high percentile rank [23]. There are simply more lowly cited publications, and therefore a publication that is sufficiently highly cited will end up in a higher percentile. What is even more problematic is that there can be differences between fields in database coverage. In some fields there may be many local or national journals covered by a database, making it relatively easy to end up in a high percentile, whereas in other fields there may be only a few local or national journals, making it more difficult to end up in a high percentile. In some fields, a database such as Web of Science also covers popular magazines (e.g., Forbes and Fortune in business). These magazines, which can hardly be considered scientific, receive few citations, and it therefore becomes relatively easy for other publications in these fields to end up in high percentiles. These issues should be considered when a certain database is selected for a specific research evaluation study.

5. Conclusions

Given the new possibilities for calculating reference values and the strengths and weaknesses of existing standard indicators, several recent research papers have proposed alternative solutions for the normalization of citation counts. One objective of current research is to compare the different methods empirically and to find the "best" field-normalizing method. For example, Leydesdorff et al. [24] compare normalization by counting citations in proportion to the length of the reference list (1/N of references) with rescaling by dividing citation scores by the arithmetic mean of the citation rate [25].
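The two approaches compared by Leydesdorff et al. [24] can be sketched as follows. This is a simplified illustration with invented numbers, not the implementation used in that study:

```python
def rescale(citations, field_mean):
    """Rescaling: divide the raw citation count by the arithmetic mean
    citation rate of the reference set (here called the field mean)."""
    return citations / field_mean

def fractional_credit(reference_list_lengths):
    """Fractional counting: each citing paper contributes 1/N citations,
    where N is the length of its own reference list."""
    return sum(1.0 / n for n in reference_list_lengths)

# Invented numbers: a paper with 12 citations in a field whose papers
# average 6 citations each.
print(rescale(12, 6))  # 2.0 (twice the field average)

# The same paper credited fractionally by three citing papers whose
# reference lists contain 10, 20 and 40 items.
print(round(fractional_credit([10, 20, 40]), 6))  # 0.175
```

Rescaling normalizes on the cited side (against the reference set's mean), whereas fractional counting normalizes on the citing side (against each citing paper's reference-list length).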
The former normalization method uses the citing papers as the reference sets across fields and journals, and then attributes citations fractionally from this perspective. In the latter normalization method, proposed by Radicchi et al. [26], the normalized (field-specific) citation count is c_f = c / c_0, in which c is the raw citation count and c_0 is the average number of citations per unit (article, journal, etc.) for this field or, more generally, this subset. The results of Leydesdorff, Radicchi, Bornmann, Castellano and de Nooy [24] show, for example, that rescaling outperforms fractional counting of citations.

Our approach is based on a high-quality classification system which is applied intellectually and systematically to all publications in a given database. The normalization of the citation impact can be carried out in two steps: in a first step, the reference set for the publication in question is collated by means of a classification scheme in which every publication is associated with a single principal field or subfield entry, e.g. via CA sections. In a second step, percentiles are calculated for this set and are then used to assign a normalized citation impact score to the publication in question. This approach offers a simple operational solution for the normalization of the citation impact [7]. It provides a significant improvement with respect both to existing solutions (journal- or field-based) and to other approaches currently under investigation. The major advantages are the application of a systematic high-quality classification system, the simplicity of the procedure, and, most importantly, the balance or fairness of the resulting citation counts.

Conflict of Interest

The authors declare no conflict of interest.

References

1. Waltman, L.; van Eck, N.J. A systematic empirical comparison of different approaches for normalizing citation impact indicators. (February 6).
2. Aksnes, D.W. Citation rates and perceptions of scientific contribution. Journal of the American Society for Information Science and Technology 2006, 57, 169-185.
3. Schubert, A.; Braun, T. Relative indicators and relational charts for comparative assessment of publication output and citation impact. Scientometrics 1986, 9, 281-291.
4. Schubert, A.; Braun, T. Cross-field normalization of scientometric indicators. Scientometrics 1996, 36, 311-324.
5. Bornmann, L. Mimicry in science? Scientometrics 2010, 86, 173-177.
6. Vinkler, P. The case of scientometricians with the "absolute relative" impact indicator. J. Informetr. 2012, 6, 254-264.
7. Bornmann, L.; Mutz, R.; Marx, W.; Schier, H.; Daniel, H.-D. A multilevel modelling approach to investigating the predictive validity of editorial decisions: Do the editors of a high-profile journal select manuscripts that are highly cited after publication? Journal of the Royal Statistical Society - Series A (Statistics in Society) 2011, 174, 857-879.
8. van Raan, A.F.J. Measurement of central aspects of scientific research: Performance, interdisciplinarity, structure. Measurement 2005, 3, 1-19.
9. Council of Canadian Academies. Informing research choices: Indicators and judgment: The expert panel on science performance and research funding; Council of Canadian Academies: Ottawa, Canada, 2012.
10. Neuhaus, C.; Daniel, H.-D. A new reference standard for citation analysis in chemistry and related fields based on the sections of Chemical Abstracts. Scientometrics 2009, 78, 219-229.
11. Bornmann, L.; Mutz, R.; Neuhaus, C.; Daniel, H.-D. Use of citation counts for research evaluation: Standards of good practice for analyzing bibliometric data and presenting and interpreting results. Ethics in Science and Environmental Politics 2008, 8, 93-102.
12. van Leeuwen, T.N.; Calero Medina, C. Redefining the field of economics: Improving field normalization for the application of bibliometric techniques in the field of economics. Res. Evaluat. 2012, 21, 61-70.
13. Chemical Abstracts Service. Subject coverage and arrangement of abstracts by sections in Chemical Abstracts; Chemical Abstracts Service (CAS): Columbus, OH, USA, 1997.
14. Bornmann, L.; Schier, H.; Marx, W.; Daniel, H.-D. Is interactive open access publishing able to identify high-impact submissions? A study on the predictive validity of Atmospheric Chemistry and Physics by using percentile rank classes. Journal of the American Society for Information Science and Technology 2011, 62, 61-71.
15. Braam, R.R.; Bruil, J. Quality of indexing information: Authors' views on indexing of their articles in Chemical Abstracts online CA-file. J. Inf. Sci. 1992, 18, 399-408.
16. Bornmann, L.; Daniel, H.-D. Selecting manuscripts for a high-impact journal through peer review: A citation analysis of communications that were accepted by Angewandte Chemie International Edition, or rejected but published elsewhere. Journal of the American Society for Information Science and Technology 2008, 59, 1841-1852.
17. Waltman, L.; Calero-Medina, C.; Kosten, J.; Noyons, E.C.M.; Tijssen, R.J.W.; van Eck, N.J.; van Leeuwen, T.N.; van Raan, A.F.J.; Visser, M.S.; Wouters, P. The Leiden Ranking 2011/2012: Data collection, indicators, and interpretation. Journal of the American Society for Information Science and Technology 2012, 63, 2419-2432.
18. Leydesdorff, L.; Bornmann, L.; Mutz, R.; Opthof, T. Turning the tables on citation analysis one more time: Principles for comparing sets of documents. Journal of the American Society for Information Science and Technology 2011, 62, 1370-1381.
19. Bornmann, L.; Leydesdorff, L.; Mutz, R. The use of percentiles and percentile rank classes in the analysis of bibliometric data: Opportunities and limits. J. Informetr. 2013, 7, 158-165.
20. Rousseau, R. Basic properties of both percentile rank scores and the I3 indicator. Journal of the American Society for Information Science and Technology 2012, 63, 416-420.
21. Schreiber, M. Uncertainties and ambiguities in percentiles and how to avoid them. Journal of the American Society for Information Science and Technology 2013, 64, 640-643.
22. Waltman, L.; Schreiber, M. On the calculation of percentile-based bibliometric indicators. Journal of the American Society for Information Science and Technology 2013, 64, 372-379.
23. Schubert, T.; Michels, C. Placing articles in the large publisher nations: Is there a "free lunch" in terms of higher impact? Journal of the American Society for Information Science and Technology 2013, n/a-n/a.
24. Leydesdorff, L.; Radicchi, F.; Bornmann, L.; Castellano, C.; de Nooy, W. Field-normalized impact factors: A comparison of rescaling versus fractionally counted IFs. Journal of the American Society for Information Science and Technology, in press.
25. Leydesdorff, L.; Bornmann, L. How fractional counting of citations affects the impact factor: Normalization in terms of differences in citation potentials among fields of science. Journal of the American Society for Information Science and Technology 2011, 62, 217-229.
26. Radicchi, F.; Fortunato, S.; Castellano, C. Universality of citation distributions: Toward an objective measure of scientific impact. Proceedings of the National Academy of Sciences 2008, 105, 17268-17272.

© 2013 by the authors; licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution license (http://creativecommons.org/licenses/by/3.0/).
