Bibliometrics training from Thomson Reuters
I attended a training course held at Oxford University last Friday: it was a session delivered through Mimas, and provided by Thomson Reuters (TR) who publish citation data.
The session began with reference to two university rankings which I have blogged about in the past: the ARWU from ShanghaiRanking and the Times Higher Education (THE) World University Ranking, both of which use Thomson Reuters' citation data. There are other university rankings, of course: QS used to provide THE's ranking and now have their own World University Ranking, and there is the Webometrics Ranking Web of World Universities, which does look at citation data but uses data from TR's competitor Elsevier, available in their product Scopus. And there are other rankings too, which are not at all interested in citations... but the two mentioned on Friday were ARWU and THE.
ARWU's approach is interesting: they look at whether any researchers have published in two particular high profile and cross-disciplinary journal titles, "Nature" and "Science". Our trainer also mentioned that ARWU seem to use other citation data, possibly from TR's Essential Science Indicators product. THE's ranking methodology shows that about a third of their ranking score is due to citation data from TR. More reading on university rankings: "International ranking systems for universities and institutions: a critical appraisal" http://dx.doi.org/10.1186/1741-7015-5-30 looks particularly interesting and it cites some other important looking articles on the topic. All of them are too old to shed light on THE's latest methodology, though, and the way in which THE have normalised for discipline seems to me to be particularly significant.
Our trainer did suggest that we can create our own normalisation by searching for articles in a given journal or from a given subject set, and in a particular year and then creating a report on Web of Science, which would tell us the expected citation rate for that set of articles. (This is the small link towards the top right hand side of the screen "Create citation report", which you can do for any set of results in Web of Science.) I think it's unlikely that the THE did this: they would probably have bought the raw data to manipulate, or at least have purchased it through InCites where you can get reports on expected citation rates.
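To make the idea of a do-it-yourself normalisation concrete, here is a minimal sketch in Python. It assumes that the "expected citation rate" for a comparison set (same journal or subject, same year) is simply the average number of citations per article in that set, and that an article is normalised by dividing its own citation count by that average. The function names and the citation counts are made up for illustration; they are not taken from Web of Science, InCites or THE's methodology.

```python
from statistics import mean


def expected_citation_rate(baseline_counts):
    """Average citations per article in a comparison set.

    Assumption: the comparison set is a list of citation counts for
    articles from the same journal or subject, published in the same year.
    """
    if not baseline_counts:
        raise ValueError("the comparison set must contain at least one article")
    return mean(baseline_counts)


def normalised_citation_score(article_citations, baseline_counts):
    """Citations for one article divided by the expected rate for its set.

    A score above 1 means the article is cited more than the average
    article in the comparison set; below 1 means it is cited less.
    """
    expected = expected_citation_rate(baseline_counts)
    if expected == 0:
        raise ValueError("expected citation rate is zero, cannot normalise")
    return article_citations / expected


# Hypothetical citation counts for five articles in one subject and year.
baseline = [0, 2, 4, 6, 8]

print(expected_citation_rate(baseline))        # average of the set
print(normalised_citation_score(6, baseline))  # one article's count against that average
```

The point of dividing by a discipline-specific average is that an article with 6 citations can look strong in a slow-citing field and ordinary in a fast-citing one.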
Criticising the measurements
When using these kinds of citation metrics, or indeed any bibliometrics, you need to bear in mind the source of your data, and our presenter did show us some slides indicating that 40% of the journals in Web of Science carry the vast majority of all citations. TR do add new journal titles to their collection (and they drop some), and they evaluate about 2,500 new titles each year for suitability. They have records for all citations from the journals they index, i.e. including those to journals which they do not index. This means that they have data to indicate when a journal they have not indexed is in fact attracting lots of citations, and therefore ought to be covered...
But we're still only talking about journals and conference proceedings, in the main. TR have mentioned a couple of times recently that they are planning some kind of citations index for books to be launched next year, but they are playing their cards very close to their chests about their source of data for any such index!
We spoke about self-citations and whether these ought to be included when citations are counted. I would recommend self-citing from a "bibliometrics optimisation" perspective, although of course there are other reasons than citation measurements to self-cite or not.
A colleague from Warwick who was also at the session, Professor Robert Lindley who heads our Institute for Employment Research, also suggested that TR stop referring to the measure of how many articles an author (or unit) has published as a measure of "Productivity". It is a volume of output, perhaps, but even then only of particular outputs, so it would be best to label it as just what it is: the number of journal items published. TR also suggest an "Efficiency" rating, which is the percentage of papers with citations as opposed to those without any, and an "Impact" rating, which is the average number of citations per paper (as used for Journal Impact Factors). Pitfall to avoid: this citation impact is not at all the same as impact in the context of the REF. REF impact is about effects outside the scholarly community, whilst TR's measurement of citations captures an activity that clearly happens within the scholarly community.
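The three measures described above are simple enough to work out by hand, and a short Python sketch shows how little is behind each label. It follows the definitions as given in the session: number of items published, percentage of papers with at least one citation, and average citations per paper. The function names and the sample counts are my own illustration, not TR's.

```python
def items_published(citation_counts):
    """What TR call "Productivity": just the number of journal items."""
    return len(citation_counts)


def efficiency(citation_counts):
    """Percentage of papers that have at least one citation."""
    if not citation_counts:
        raise ValueError("need at least one paper")
    cited = sum(1 for count in citation_counts if count > 0)
    return 100 * cited / len(citation_counts)


def citation_impact(citation_counts):
    """Average number of citations per paper."""
    if not citation_counts:
        raise ValueError("need at least one paper")
    return sum(citation_counts) / len(citation_counts)


# Hypothetical author with five papers; each number is that paper's citations.
papers = [0, 0, 3, 5, 12]

print(items_published(papers))   # 5 papers
print(efficiency(papers))        # 3 of 5 papers cited, i.e. 60.0 per cent
print(citation_impact(papers))   # 20 citations over 5 papers, i.e. 4.0
```

Note how one well-cited paper lifts the average for the whole set, which is another reason to read an "Impact" figure alongside the other two.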
Journal Impact Factors
The calculation of the Journal Impact Factor was explained, and the purpose of the 5 year Journal Impact Factor as well, which was useful for me to pick up on: I wondered why there were two measures! The original one was measured over two years, and a graph showing, by discipline, the average time it takes for citations to an article to appear made it clear that for some disciplines the peak number of citations comes more than two years after publication. In other words, measuring a journal's impact over a two year period was advantaging journals from the disciplines which are quickest to cite. These are primarily science, technology and medicine journals, so the 5 year Journal Impact Factor could be really useful for those involved in cross-disciplinary and interdisciplinary research who are looking to target the journals which will get their articles the best reach within the scholarly community as a whole. The 5 year JIF is a better measure if you are trying to compare journals from different disciplines... although of course, it can never take into account the relative value of a citation from each discipline, or indeed the fact that citations from some disciplines will happen in books or other kinds of publication which WoS doesn't (currently) index...
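For anyone who wants to see the arithmetic, here is a small Python sketch of the calculation as I understand it: the citations received in a given year to items the journal published in the preceding two (or five) years, divided by the number of citable items it published in those same years. The function name and all the figures are invented for illustration, and the sketch ignores TR's rules about which items count as "citable".

```python
def impact_factor(jif_year, citations_by_pub_year, items_by_pub_year, window=2):
    """Journal Impact Factor for jif_year over the preceding `window` years.

    citations_by_pub_year: citations received during jif_year, keyed by the
        publication year of the cited items.
    items_by_pub_year: number of citable items the journal published,
        keyed by publication year.
    """
    years = range(jif_year - window, jif_year)
    citations = sum(citations_by_pub_year.get(year, 0) for year in years)
    items = sum(items_by_pub_year.get(year, 0) for year in years)
    if items == 0:
        raise ValueError("no citable items published in the window")
    return citations / items


# Hypothetical journal in a slow-citing discipline: 50 items every year,
# with older items picking up more citations during 2010 than recent ones.
items = {2005: 50, 2006: 50, 2007: 50, 2008: 50, 2009: 50}
citations_in_2010 = {2005: 150, 2006: 140, 2007: 110, 2008: 60, 2009: 40}

print(impact_factor(2010, citations_in_2010, items, window=2))  # (60 + 40) / 100
print(impact_factor(2010, citations_in_2010, items, window=5))  # 500 / 250
```

With these made-up numbers the two year figure comes out at 1.0 and the five year figure at 2.0, which is exactly the effect the graph was illustrating: a journal whose articles are cited slowly looks much weaker on the two year measure.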
There was more about the H-index and some useful slides that I hope to get a link to in the near future.