May 05, 2009

What should we be measuring …and why

More thoughts on repository statistics!

My basic reasons for looking at repository statistics are:

1) Can assess and demonstrate that you are meeting aims/targets (and set such targets).
2) Can gain interest/approval/support on the back of large numbers!
3) Providing authors with information about who is looking at their work could motivate them to deposit.
4) Might generate some competitive spirit!
5) Identifying popular content might help in measuring citation impact of repository deposit.

Looking back at the basic reasons:

1) Aims and targets need to be set for the next 12 months, now that we have emerged from our JISC-funded project. I can only aim for things that I can measure, so this becomes a circular argument! Ideally, I would like to be able to ensure that we are getting deposits of all appropriate items, across the whole University, and to know that we can handle such numbers. So what I really need to be able to do is to measure what the University's authors are actually producing.

I need to know about numbers of visitors to WRAP, and whether or not these can be boosted, in order to meet the goal of WRAP being a showcase for Warwick research.

Measuring how people get to WRAP is important, because if they all come via Google and bypass our metadata entirely, then this might cause us to review our metadata creation workflows. The value of metadata goes beyond bringing visitors to the repository, however, and that also needs to be documented.
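As a rough illustration of what measuring how people get to WRAP could look like, here is a minimal Python sketch that sorts referrer URLs into a few broad groups. The hostname `wrap.example.ac.uk`, the list of search engines and the sample referrers are all made up for the example; real referrer data would come from the web server logs or an analytics export.

```python
from collections import Counter
from urllib.parse import urlparse

# Hypothetical values: replace with the repository's real hostname and
# whichever search engines matter locally.
REPOSITORY_HOST = "wrap.example.ac.uk"
SEARCH_ENGINES = ("google.", "yahoo.", "bing.")


def classify_referrer(referrer):
    """Return 'direct', 'internal', 'search' or 'other' for a referrer URL."""
    if not referrer or referrer == "-":
        # No referrer recorded: typed URL, bookmark, or referrer withheld.
        return "direct"
    host = (urlparse(referrer).hostname or "").lower()
    if host == REPOSITORY_HOST:
        return "internal"
    if any(name in host for name in SEARCH_ENGINES):
        return "search"
    return "other"


def summarise_referrers(referrers):
    """Count how many visits fall into each referrer group."""
    return Counter(classify_referrer(r) for r in referrers)


# Made-up referrers, for illustration only.
sample = [
    "http://www.google.co.uk/search?q=warwick+research",
    "http://wrap.example.ac.uk/view/subjects/",
    "-",
    "http://www.example.org/links.html",
    "http://scholar.google.com/scholar?q=open+access",
]
summary = summarise_referrers(sample)
print(summary)
```

A large "search" share landing directly on item pages would be the sort of evidence that visitors are bypassing our browse and search pages, though it says nothing about the other uses of metadata.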

2) Shouting about large numbers is fairly crude as a way of getting attention, so a crude measurement such as Google Analytics (GA) is probably appropriate. Having said that, the Apache logs record higher numbers, so I should be reporting those numbers rather than the GA ones!

3) Providing information to authors. Well, GA is entirely inappropriate for that. I can provide some information for some authors, and that has been welcome. But the ideal scenario would be for authors to be able to access such information for themselves, whenever they want. And it really is a huge gap in the knowledge that I can share with authors if I can't tell them about accesses to the actual pdf files. I'm not sure how much interest authors show in statistics at those repositories that do let authors check for themselves. Authors here aren't clamouring for figures about who is accessing their work: some are pleased when I write to them with figures, but that is probably because I only write to our top content authors, so I'm always spreading good news!

Generally, authors want to know whether visitors are indeed academic, which is often very difficult to tell, but GA does give me some clues. Being able to tell authors a little bit about visitors to WRAP is reassuring for them, and whilst addressing their every concern is more than I can manage, not knowing about pdf file visitors is a huge gap.
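One way the gap around pdf accesses might be filled is by counting file requests directly in the Apache access logs. The sketch below is only that, a sketch: it assumes the logs are in Apache's common or combined format, the file paths and log lines are invented, and it counts successful GET requests rather than people, so crawlers and repeat downloads would inflate the figures.

```python
import re
from collections import Counter

# Picks out the request line and status code from an Apache common or
# combined format log entry, e.g. "GET /path/file.pdf HTTP/1.1" 200
LOG_PATTERN = re.compile(
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)


def count_pdf_downloads(log_lines):
    """Count successful (status 200) GET requests per PDF path."""
    counts = Counter()
    for line in log_lines:
        match = LOG_PATTERN.search(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        path = match.group("path").split("?", 1)[0]  # drop any query string
        if (match.group("method") == "GET"
                and match.group("status") == "200"
                and path.lower().endswith(".pdf")):
            counts[path] += 1
    return counts


# Made-up log lines, for illustration only.
sample_log = [
    '192.0.2.1 - - [05/May/2009:10:00:00 +0100] "GET /123/1/WRAP_paper_a.pdf HTTP/1.1" 200 51234 "-" "Mozilla/5.0"',
    '192.0.2.2 - - [05/May/2009:10:05:00 +0100] "GET /123/1/WRAP_paper_a.pdf HTTP/1.1" 200 51234 "http://www.google.co.uk/" "Mozilla/5.0"',
    '192.0.2.3 - - [05/May/2009:10:06:00 +0100] "GET /123/ HTTP/1.1" 200 8000 "-" "Mozilla/5.0"',
    '192.0.2.4 - - [05/May/2009:10:07:00 +0100] "GET /456/1/WRAP_paper_b.pdf HTTP/1.1" 404 300 "-" "Mozilla/5.0"',
]
pdf_counts = count_pdf_downloads(sample_log)
for path, count in pdf_counts.most_common():
    print(count, path)
```

Figures like these would need some filtering of known robots before being passed on to authors, which this sketch does not attempt.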

Authors are also concerned about their publishers, and it would be great to be able to demonstrate that repositories like WRAP don't harm publisher interests. This would not only reassure authors but perhaps also reassure publishers, and it would make the business of populating a repository so much simpler if publishers were supportive.

4) The competitive spirit could be between individuals or departments or even institutions. It could be based upon numbers of items in the repository, or numbers of visits or all sorts of different criteria. The competitive spirit ought to be directed towards appropriate measures. Focussing on numbers of items in the repository is probably enough for now: our main goal is to grow the repository.

Some element of benchmarking against other institutions is also going to be important when it comes to resourcing decisions. This will mean measuring how many items we have, of what type, and whether they are full text or metadata only. Measuring how fast the collection is growing will help us to plan our workflows accordingly, and will also be useful for benchmarking.
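Measuring growth need not be complicated. As a minimal sketch, assuming we can export a list of deposit dates from the repository (the dates below are invented), monthly deposit counts and a running total could be worked out like this:

```python
from collections import Counter
from datetime import date


def deposits_per_month(deposit_dates):
    """Return ((year, month), count) pairs in chronological order."""
    counts = Counter((d.year, d.month) for d in deposit_dates)
    return sorted(counts.items())


def cumulative_totals(monthly_counts):
    """Turn monthly counts into a running total of items held."""
    total = 0
    running = []
    for month, count in monthly_counts:
        total += count
        running.append((month, total))
    return running


# Made-up deposit dates, for illustration only.
sample_dates = [
    date(2009, 2, 10), date(2009, 2, 20),
    date(2009, 3, 5), date(2009, 3, 15), date(2009, 3, 25),
    date(2009, 4, 1),
]
monthly = deposits_per_month(sample_dates)
totals = cumulative_totals(monthly)
for (year, month), total in totals:
    print(f"{year}-{month:02d}: {total} items")
```

The same counts could be split by item type, or by full text versus metadata only, if the export includes those fields.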

5) Measuring impact on citation: this is something that we claim as a benefit of repository deposit. I am always careful to claim this only insofar as it is common sense that more readily accessible work will be read more, and that more widely read research will be cited more. Even so, departments are asking me for evidence that repository deposit will boost citations. The repository tends to come up in departmental meetings alongside departments' concerns about raising citations, so it is no surprise that the two are so closely associated. Evidence of this sort of impact would be highly influential in terms of encouraging deposit, if I could find it. I believe that the problem is that, by the time a repository has had its effect, it will be only one of a number of factors influencing higher citations.

What I can hope to do is to prove that the most highly visited items in the repository become the most highly cited. I need to know which items are most highly visited, and to look at the reasons why that might be.
