All entries for Monday 19 January 2009
January 19, 2009
Reports and Statistics I need from WRAP
1) A six-monthly report of data changes in WRAP, showing which records have been altered since the date they were added to the live repository (for sharing data with Warwick’s Research Support Services). Not currently possible.
2) A graph showing the pattern of new record creation and repository growth over the last x months or year. I can get this from ROAR (http://www.roar.org).
3) A monthly report of all records added since the previous month, with data in specific formats to suit RSS’ (Research Support Services’) InfoEd system (and/or other departments at Warwick). Key issues with sharing with RSS: we would need to store a staff number (or a key to call up the staff number) for each Warwick staff member among the authors, and WRAP lacks the security needed for such data. Also, page range is currently exported as, e.g., 51-72, whereas RSS need it as "start page 51, end page 72". More investigation into the technical possibilities for data sharing is needed. It may be significant that InfoEd attaches information about publications (and other activities) to a person’s profile, whereas WRAP attaches information about authors to a record describing a publication.
4) Statistics on visitors to WRAP: what they are clicking on, where they come from, etc. Google Analytics (GA) does this well enough for me: I can see where they’re clicking, which keywords brought them to WRAP and to which part of it, and who their network provider is (which is a clue to academic interest, and also helps to identify internal interest). I can see which countries and cities visitors are in. I can do all this at a per-paper level, but I have to know which paper(s) I want to look at.
5) The ability to look at features like those listed above for a set of data (e.g. all papers by one author, or all papers for a particular department). Departments and authors may well want to know who is looking at their work in WRAP. I can look at a particular paper, but not at a set: I would have to collate reports for each paper in some way. IRStats should be able to do this, if we were to install it successfully on WRAP… although it may require some change in our workflow. At the moment, most papers are added to WRAP by our very own administrator, since authors use a separate (and simple) submission form. Authors do not upload data about their own publications, and therefore the papers are not attached to separate accounts in WRAP. I believe that IRStats would need each author’s papers to be held under a separate account in order to produce reports on all of that author’s papers. Our administrator could create accounts in authors’ names and then log in as the author before creating the record… but that all presupposes that we can get IRStats to work, and that it does work as I expect.
6) It would also be better for me (and for those interested in the data) if I did not have to look up statistics such as those GA already provides, and those interested could instead look them up themselves, on demand. In theory, I can grant access to the GA reports to anyone with a Google account… although this requires some intervention from me. And Google Analytics is great for those who know how to use it, but I can see academics being put off learning how to use it. There are barriers to authors getting data about all the wonderful good WRAP is doing in bringing an audience to their work!
7) GA is great for looking at the site and our HTML pages, but tells us nothing about PDF/Word document downloads. The difference between “the most downloaded document” and “the most looked at record” could be very important indeed, if any correlation with citations is to be explored. Also, I can tell from GA if someone has followed the link to the DOI on a particular record. I can’t tell, though, whether anyone has followed the link from within the PDF file to the full-text published version.
8) What people are searching for through the repository's own search form, and which fields they search by. GA can only tell me whether people click through from our Advanced form to the Simple search one, and whether people follow the link from our home page to search the repository in the first place. Thus far, not many people are searching, and we expect that people will search not through our form but on search engines like Google, using keywords which GA does record and report, so this isn’t particularly crucial.
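The page range mismatch mentioned in item 3 is the kind of thing a small script could handle on an exported file. What follows is only a rough sketch in Python, not anything WRAP or InfoEd provides: the function names are my own invention, and it assumes the exported value is always two whole numbers separated by a hyphen or en dash.

```python
import re


def split_page_range(page_range):
    """Split a page range such as '51-72' into (start, end) integers.

    Assumption: the export always holds two whole numbers separated by a
    hyphen or an en dash. Anything else raises ValueError.
    """
    match = re.fullmatch(r"\s*(\d+)\s*[-\u2013]\s*(\d+)\s*", page_range)
    if match is None:
        raise ValueError(f"Unrecognised page range: {page_range!r}")
    return int(match.group(1)), int(match.group(2))


def format_for_rss(page_range):
    """Return the wording RSS asked for: 'start page X, end page Y'."""
    start, end = split_page_range(page_range)
    return f"start page {start}, end page {end}"


print(format_for_rss("51-72"))  # prints: start page 51, end page 72
```

Records with a single page, an article number, or an abbreviated range (such as 151-72) would need separate handling, which this sketch deliberately leaves out.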
I’m also not sure how to make GA discount visits from members of the WRAP team… but I expect that’s something I ought to look into.
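One possible route to the download counts missing in item 7, and to discounting the team's own visits, would be the web server's access logs rather than GA. I have not tried this on WRAP, so the following is only a sketch in Python. It assumes the logs are in the Apache common or combined format, that full text files end in .pdf, and that the team's IP addresses are known; the addresses and paths shown are made-up examples.

```python
import re
from collections import Counter

# Assumption: Apache common/combined log format, e.g.
# 198.51.100.7 - - [19/Jan/2009:10:00:00 +0000] "GET /path HTTP/1.1" 200 512
LOG_LINE = re.compile(r'^(\S+) \S+ \S+ \[[^\]]*\] "(?:GET|POST) (\S+)[^"]*" (\d{3})')

# Hypothetical address standing in for the WRAP team's machines.
TEAM_IPS = frozenset({"192.0.2.10"})


def count_pdf_downloads(lines, excluded_ips=TEAM_IPS):
    """Count successful PDF requests per path, skipping excluded IPs."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # not a line in the expected format
        ip, path, status = match.groups()
        if ip in excluded_ips or status != "200":
            continue
        path = path.split("?", 1)[0]  # ignore any query string
        if path.lower().endswith(".pdf"):
            counts[path] += 1
    return counts


sample = [
    '198.51.100.7 - - [19/Jan/2009:10:00:00 +0000] "GET /123/1/WRAP_example.pdf HTTP/1.1" 200 51234',
    '192.0.2.10 - - [19/Jan/2009:10:01:00 +0000] "GET /123/1/WRAP_example.pdf HTTP/1.1" 200 51234',
    '203.0.113.5 - - [19/Jan/2009:10:02:00 +0000] "GET /123/ HTTP/1.1" 200 2048',
    '203.0.113.5 - - [19/Jan/2009:10:03:00 +0000] "GET /123/1/WRAP_example.pdf HTTP/1.1" 200 51234',
]
print(count_pdf_downloads(sample))  # Counter({'/123/1/WRAP_example.pdf': 2})
```

Counts produced this way would still include search engine crawlers and repeat requests, so they would be a rough indication rather than a true download figure.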
I’ve learnt a lot about what GA can tell me about WRAP and its visitors. I find it fascinating to delve in every now and again and see what brings people to us. It can be used as a website management tool, to see how to make important links more visible and hence more clicked upon. It can be used in advocacy to authors, explaining why they might want to put work into WRAP, showing that others do look at it.
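For advocacy to a particular author or department, the per-paper figures would first have to be collated into a set, as noted in item 5. Until something like IRStats is working, that collation could in principle be done by hand or with a few lines of script. This is a minimal sketch in Python, assuming the per-paper view counts have already been copied out of GA into a simple list; the paper IDs, author names, and numbers are invented for illustration.

```python
from collections import defaultdict

# Hypothetical per-paper figures, as they might be copied out of GA by hand.
paper_views = [
    {"paper_id": 101, "authors": ["Author A", "Author B"], "views": 40},
    {"paper_id": 102, "authors": ["Author A"], "views": 15},
    {"paper_id": 103, "authors": ["Author C"], "views": 7},
]


def views_by_author(papers):
    """Sum record views per author; co-authored papers count for each author."""
    totals = defaultdict(int)
    for paper in papers:
        for author in paper["authors"]:
            totals[author] += paper["views"]
    return dict(totals)


print(views_by_author(paper_views))
# {'Author A': 55, 'Author B': 40, 'Author C': 7}
```

The same approach would work for departments by swapping the list of authors for a department name, though it does nothing to remove the manual step of gathering the per-paper figures in the first place.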
What I would like to do is to compare our statistics with those of other repositories, at other institutions. It’s not easy to find other repositories that are comparable with ours in their features (full text, mediated metadata, voluntary deposit), never mind such repositories at comparable institutions. But it is possible to find those who are much further ahead of us, and it would be good to see where we might be heading in terms of visitor profiles, whether most visitors come from search engines (as now) or direct links, etc. I would like to know whether the most popular content in others’ repositories is journal articles or unpublished content, and whether there is a particular subject that gets heavier attention than others. So, I would like to be sure that, whatever statistics package we use for WRAP, it is one that would enable us to compare our repository with others. As yet, there is no such package, nor a method of combining statistics packages to achieve this.