August 09, 2011
Writing about web page http://www.rsp.ac.uk/events/romeo-for-publishers/
Thankfully I'm new enough to the whole repository business that I've never had to try to manage or populate an open access repository without the help of SHERPA's RoMEO service, and I hope I never will! So an event presenting a number of new developments, plus the chance to engage with publisher representatives, was too good to miss.
The event itself gave two really clear messages: we are all on the same side, and clarity is everything. The clarity message was raised again and again: all the various players in this community need clarity and consistency in who says what, what means what and what we can do with what (to badly paraphrase Bill Hubbard). Another message that came from both RoMEO and representatives of the repository community (Enlighten Team Leader Marie Cairney) was that, as much as we care about Open Access, we don't mind being told 'no' as long as it's clear that that is what is being said.
Some highlights from the sessions:
- "Change is coming" was the title of the latter part of Bill Hubbard's (Centre for Research Communication) presentation, which highlighted the many areas (peer review, the end of the Big Deal (?), social research tools (Mendeley etc.), demands for free access, cross-discipline research, the possibility of institutions taking more control of the intellectual property they produce, and more) where we might see changes that affect the way we work in the next ten years. No doubt there will be others we haven't thought of yet.
- Azhar Hussain (SHERPA Services) continued the theme of opportunity by highlighting some interesting statistics for RoMEO. The service currently stands at 998 publishers covering 18,000+ journals and bringing in nearly 20,000 visits a month. Also highlighted was the growth of usage from within CRIS systems, something RoMEO is tracking closely.
- Mark Simon from Maney Publishing spoke about the reasons behind the company's decision to 'go green', as well as highlighting the fact that, because Maney broadly publishes for learned societies, the copyright of published work often rests not with Maney itself but with the society. Mark also gave the cost of their 'Gold OA' options (STM journals $2000, humanities journals $800, some tropical medicine journals $500), explaining that the disparity reflects the higher production costs of STM journals and the fact that more people want to publish in them.
- Marie Cairney (Enlighten, Glasgow University) spoke about some of the recent developments to Enlighten, including using the 'Supportworks' software to better track enquiries and embargoes. She also highlighted the changes to publisher policies over the years that have caused problems for her team, most of us can guess which ones she mentioned! Marie's final message was that the more clarity we can get on policy matters, the more deposits we can get.
- Jane Smith (SHERPA Services) spoke on a similar subject and touched on many of the common pitfalls that can occur when contacting publishers to clarify policy. These included: no online policy, no single point of contact, two contradictory responses from different parts of the same company, and more. Jane ended with a plea for publishers to let RoMEO know when their policies change, so the information can go out as quickly as possible, and for copyright agreements/policies to be written in clear English.
- Emily Hall from Emerald was up next. One point clearly made from the outset was that Emerald is a 'green' publisher (it couldn't really have been any other colour!). Emily also spoke about the decision not to offer 'Gold OA' options (not felt to be good for the publisher, or to work for the disciplines they mostly publish in) and touched on issues with file sharing. (Trivia: Emerald's most pirated book is 'Airport Design and Control 2nd Ed.') Emily did mention that Emerald haven't yet been able to 'see' their content in Mendeley (as of this morning listing more than 100 million papers), but they are looking for a way to do this. One idea that came out of the discussion at the end of the talk was for publishers to return versions to authors with coversheets clearly indicating what they can and can't do with each version.
- Peter Millington (SHERPA Services) finished the presentations with a demonstration of a new policy creator tool developed to be used with RoMEO. This tool, based on the repository policy tool created as part of the OpenDOAR suite, would allow publishers to codify their policies in standardised language, helping people to read and understand the policy of their publisher/journal. I for one hope publishers start using this tool as standard. The prototype version of the tool is available now and can be found here.
The breakout session that followed the presentations asked us to consider four questions (and some of our answers):
- How can RoMEO help Publishers? (Track changes to policy; a visual flag for publishers to use on their websites to indicate the 'colour' of the journal; act as a central broker for enquiries, so one service has a direct contact to the publisher that can be accessed by all, creating a RoMEO knowledge base of all the enquiries for all repositories to use)
- How can Publishers help RoMEO? (Nominate a single point of contact, create a page for Repository Staff similar to their pages for 'Librarians', ways to identify academics (see previous blog post), clarity of policy)
- What message do Publishers have for Repository Administrators? (Thank you for the work done checking copyrights, don't be scared to talk to us, always reference and link back to the published item.)
- What message do Repository Administrators have for Publishers? (Clarity (please!), make it clear what is OA content on your website, educate individuals on copyright, communicate with us!)
A full run down of the answers to those four questions can be found at the link above.
The final panel discussion raised interesting questions that we didn't really find answers for! Issues included multimedia items in the repository; including datasets in the repository, or finding ways to link a dataset repository to an outputs repository - DOIs for datasets (see the British Library's project on this topic); and the matter of what to do when corrections and/or retractions are issued by publishers. The last one at least gave me some food for thought!
The event was another valuable day from the RSP, featuring lively discussions on the current situations and challenges facing the repository community, and an invaluable opportunity to meet and have frank discussions with publishing industry representatives. I think both groups got a lot out of the day, along with the realisation that we have a lot more in common than might seem obvious at first glance.
August 08, 2011
Writing about web page http://www.irios.sunderland.ac.uk/index.cfm/2011/8/1/IRIOS-Workshop-Parellel-Sessions
One thing I took away from the workshop session was that both systems, ROP and IRIOS, were doing the right things and going in the right directions, but weren't quite there yet. A big concern to me as an IR manager (and as a former Metadata Librarian) was that the IRIOS system creates yet more unique identifiers (see later in this entry for further discussion of unique IDs). Also, automation of linking projects to outputs can't come fast enough, especially for services like WRAP where we spend a not inconsiderable amount of time tracking down funding information from the papers. However, we could also benefit from taking information from systems such as this, which tie the recording of information about outputs much more closely to the money - always a motivator for people to get data entered correctly!
I think it is telling that more and more of these 'proof of concept' services are being developed using the CERIF data format (after R4R, I'm looking forward to hearing about the MICE project early next month), but the trick with a standard is that it is only a standard if everyone is using it. I don't think we are quite there yet; this coming REF has been such an uncertain process so far that I think there is a much better chance of CERIF being the main deposit format in the next REF. (If I'm still here for the next REF I'll have to look back and see if I was right!)
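For anyone who hasn't met CERIF: it is an XML-based format in which projects, publications and people are separate records, tied together by explicit link entities rather than by nesting. The sketch below builds a minimal record of that shape in Python. The "cf" element names follow CERIF's naming convention, but this is an illustrative simplification, not a schema-valid CERIF document.

```python
import xml.etree.ElementTree as ET

# A minimal, illustrative CERIF-style record linking a project to a
# publication. Element names echo CERIF's "cf" convention; the IDs and
# titles are made up for the example.
cerif = ET.Element("CERIF")

proj = ET.SubElement(cerif, "cfProj")
ET.SubElement(proj, "cfProjId").text = "proj-0001"
ET.SubElement(proj, "cfTitle").text = "Example Funded Project"

publ = ET.SubElement(cerif, "cfResPubl")
ET.SubElement(publ, "cfResPublId").text = "publ-0042"
ET.SubElement(publ, "cfTitle").text = "Example Journal Article"

# The relationship lives in its own link entity, which is what makes the
# grant-to-output linking discussed above machine-readable.
link = ET.SubElement(cerif, "cfProj_ResPubl")
ET.SubElement(link, "cfProjId").text = "proj-0001"
ET.SubElement(link, "cfResPublId").text = "publ-0042"

xml_string = ET.tostring(cerif, encoding="unicode")
print(xml_string)
```

The separation of records from link entities is exactly what lets a funder's grant data and a repository's output data be exchanged and joined without either side restructuring the other's records.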
The afternoon of the workshop was taken up with a number of discussions on a range of topics; below are a few of the notes I took in the two discussions I took part in. To see the full run-down of all of the discussions, please see the link above.
Universal Researcher IDs (URID)
It was generally accepted by all in the discussion that unique IDs for things - be they projects, outputs, researchers or objects - are a good idea in terms of data transfer and exchange. They must be a good idea, as there are so many different ones you can have (in the course of the discussion we mentioned more than eight current projects to create URIDs). Things are much easier to link together if they all bear a single identifier. However, when it comes to people, the added issue of data protection rears its head and can potentially hamper any form of identification that is 'assigned' to a person. A suggested way around this was to let people sign up for identifiers, allowing those who wish to opt out to do so. Ethically the best route, perhaps, but unless a single service is designated we could end up with a system similar to the one we have now, where everyone is signing up for, but not using, a whole array of services. The size of the problem is the size of the current academic community, and global in scope. Some of the characteristics we came up with were that URIDs should: be unique (and semantic-free, given the previously mentioned privacy issues); have a single place that assigns them; have a sustainable authority file; and not be tied to a role. One current service that fulfils many of the above criteria is the UUID service; however, this falls down in that there is no register of assigned IDs, so people can apply for multiple IDs if they forget them (and let's face it, the likelihood of remembering a 128-bit number is pretty low) ... and we're back in the same situation again. I'm not sure there is a single perfect solution to this problem, though my life would be easier if there was!
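The UUID point is easy to demonstrate. A couple of lines of Python show both the appeal (128-bit, semantic-free values) and the flaw discussed above: with no central register, a second request simply yields a second, unrelated identifier.

```python
import uuid

# A random UUID is 128 bits and semantic-free: nothing in the value
# identifies the researcher, which sidesteps the privacy concern.
researcher_id = uuid.uuid4()
print(researcher_id)
print(researcher_id.int.bit_length() <= 128)  # True: fits in 128 bits

# But there is no register of assigned IDs, so a researcher who forgets
# theirs and applies again just gets an unrelated second identifier.
second_id = uuid.uuid4()
print(researcher_id == second_id)  # False: nothing links the two requests
```

Which is exactly why the discussion kept coming back to a single assigning authority with a sustainable authority file, rather than uniqueness alone.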
The REF
This was a free-form discussion that covered the REF, REF preparations and 'life after the REF' in various guises. HEFCE are currently tendering for the data to be used in the REF; needless to say the two services bidding are the expected two, Thomson Reuters and Scopus, but HEFCE will only be buying one lot of data. Bibliometrics were touched upon in relation to the REF: is it better to have two people select a really highly cited paper, or choose two lower-cited papers? There were discussions on the HESA data: checking the data once it comes back from HESA, and the possibility of mapping future HESA data to the REF UoAs for long-term benchmarking, rather than a single snapshot that goes out of date very quickly. Do people's CRIS systems really hold all of the data required for a return? What are the differences between impact as measured/requested by HEFCE and impact as measured by RCUK? Selection policy and training were discussed, including the possibility of sector-wide training; one possible best practice mentioned was training a small core group of people who would handle all of the enquiries centrally. Would it be possible for institutions to get the facilities data on a yearly basis, rather than just before the REF, when they have to try to chase people who may not remember (or may have left) in order to verify the data?
One interesting comment from the discussion was the news that NERC, at least, has seen a big increase in the number of grant applications including a direct cost for Open Access funding. This is particularly interesting because a number of researchers had commented to me that they didn't want to do that, as they feared making their grant applications too expensive.
All in all the day was very interesting for me, both as an introduction to a 'world beyond publications' (as I was attending both for myself and for a member of our Research Support Services department) and as an indication of what we need to do going forward.
Writing about web page http://www.irios.sunderland.ac.uk/index.cfm/2011/7/28/IRIOS-Workshop-Presentations
The IRIOS (Integrated Research Input and Output System) workshop at the JISC headquarters was designed to demonstrate the preliminary look of a system that takes information from the RCUK funders on grant awards and combines it with information from universities' IRs and CRIS systems. The event was attended by research managers, representatives of four RCUK funders and repository managers, and all of the presentations can be seen at the link above.
The event kicked off with a video presentation from Josh Brown of JISC discussing the RIM (Research Information Management) programme of projects. One interesting statistic was the estimate that £85 million/year is spent on submitting grant proposals and administering awards. Once again the savings that could be realised with the use of the CERIF data format were raised, and the fact that REF submissions can be made to HEFCE in CERIF was highlighted as a sign of the growing importance of the standard. IRIOS was highlighted as a step towards a more integrated national system of data management. Josh closed with the news of a further JISC funding call, due to be announced soon, to investigate further uses of CERIF.
Simon Kerridge (Sunderland) was up next to discuss the landscape and background of the project, and the need for interoperability and joined-up thinking between a number of different university departments if we are to make the most of an increasingly competitive environment. He also spoke of the ways in which IRIOS might feed into the RMAS (Research Management and Administration System) project, further enhancing that cloud-based system. Simon finished by touching on the challenges (research data management and unique researcher IDs, anyone?) and opportunities for the future (esp. the JISC funding call).
Gerry Lawson (NERC) was up next with a whistle-stop tour round the RCUK 'Grants on the Web' (GoW) systems. He started with a stern warning: if the funders and institutions don't find a way to match up the data held by both parties, commercial services will fill the gap (for example, Elsevier's SciVal is already starting this process), putting both parties back into the situation where we have to buy our own data back. Other products are also making a start on this, as can be seen in UK PubMed Central's grant lookup tool. Gerry made the vital point that all of the information is available, but linking it is going to take work. The RCUK 'Grants on the Web' system is a start in this process, as it brings together all of the grants from all RCUK funders in a single system. The individual research councils then use this centralised data to populate their individual GoW interfaces, with each interface set up to the specifications of the individual research council. With one exception (AHRC), grant data about individually funded PhDs is not included in the GoW systems, due to the RCUK preference for handling funded PhDs through their network of Doctoral Training Centres. Gerry closed by saying there was a real desire from the RCUK to start linking outputs with funding grants (and expanding into research data and impact measures), especially in relation to monitoring compliance with Open Access mandates. Challenges still remain: a need for a common export format (CERIF); authority files for people, projects and institutions; the issue of department structures within institutions changing over time; etc.
Dale Heenan (ESRC), ably assisted by Darren Hunter (EPSRC), discussed the RCUK 'Research Outcomes Project' (ROP). The project was based on the ESRC's Research Catalogue (running on Microsoft Zentity 2.0), extended to meet the needs of the four pilot councils: AHRC, BBSRC, EPSRC and ESRC. (Worth noting that MRC and STFC use the e-Val system.) The ROP system is designed to create an evidence base to demonstrate the economic impact of funded research, and also to reduce duplication of effort. Upload of data can come from a range of stakeholders - grant holders (PIs or their nominated Co-Is), institutions, IRs etc. - and can cover individual items or bulk uploads. Management information is provided using Microsoft Reporting Services. Future challenges for the system include ways to automate the deposit of research outputs, development/adoption of standards such as CERIF, and ways to pull data from external services like Web of Knowledge, PubMed, Scopus etc.
The main presentation of the day was the demonstration of the IRIOS system by Kevin Ginty and Paul Cranner (Sunderland). The IRIOS project is a 'proof of concept' demonstrator of a GoW-like service using the CERIF data format, and is based on the 'Universities for the North East' project tracking system (CRM). One feature of the service is that four levels of access (hidden, summary, read only, write) can be assigned to three distinct groups (global, individual, groups of users), allowing a fine level of dynamic control over the data contained in the system. All grants and publications have a unique ID that is automatically generated by the system, and any edits made in the current system do not feed back to the system that originated the data. Currently the system only supports manual linking of grants to outputs, but there are plans to look into automating this process. In the future it might be possible to import data from larger databases like Web of Knowledge, but information gathered by the research councils indicates that only 40% of outputs are correctly attributed to the grant that funded the research.
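The four-levels-by-three-principals access model described above can be sketched in a few lines. This is my own toy reconstruction of the idea, not IRIOS code; all the names and the "most specific assignment wins" rule are assumptions for illustration.

```python
# Sketch of an IRIOS-style access model: four access levels that can be
# assigned globally, to an individual, or to a group of users.
LEVELS = ["hidden", "summary", "read_only", "write"]

permissions = {}  # (record_id, principal) -> level

def grant_access(record_id, principal, level):
    assert level in LEVELS
    permissions[(record_id, principal)] = level

grant_access("grant-001", "global", "summary")          # everyone sees a summary
grant_access("grant-001", "user:alice", "write")        # one named user may edit
grant_access("grant-001", "group:funders", "read_only") # a group gets full read access

def effective_level(record_id, user, groups):
    # Assumed precedence: an individual assignment beats a group
    # assignment, which beats the global default.
    principals = [f"user:{user}"] + [f"group:{g}" for g in groups] + ["global"]
    for principal in principals:
        if (record_id, principal) in permissions:
            return permissions[(record_id, principal)]
    return "hidden"  # no assignment at all: the record is invisible

print(effective_level("grant-001", "alice", []))         # write
print(effective_level("grant-001", "bob", ["funders"]))  # read_only
print(effective_level("grant-001", "carol", []))         # summary
```

The appeal of the scheme is that a single record can be simultaneously editable by its owner, readable by a funder, and visible only in summary to everyone else.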
If you would like to try the demonstrator version of the IRIOS system details on how to login can be found here.
Comments on the presentations and information on the workshops is to follow in part two.
July 27, 2011
Writing about web page http://wrap.warwick.ac.uk/36226/
Following the announcement in February that we had reached 4,000 items, WRAP's growth continues to be impressive and is now supported by the development of the University of Warwick Publications service. Visitors are coming from more than 160 different countries every month, and in June 2011 WRAP items were downloaded more than 21,000 times.
Today we announce that WRAP’s 5000th item is:
Mercer, Justine (2009) Junior academic-manager in higher education : an untold story? International Journal of Educational Management, Vol.23 (No.4). pp. 348-359. ISSN 0951-354X http://wrap.warwick.ac.uk/36226/
Authors are encouraged to submit their journal articles to WRAP online at: http://go.warwick.ac.uk/irsubmit
Visit WRAP: http://wrap.warwick.ac.uk
Find out more about WRAP: http://go.warwick.ac.uk/lib-publications
March 16, 2011
Writing about web page http://www2.warwick.ac.uk/alumni/knowledge/themes/03/secure_future/
Quick link to flag up my contribution to the Warwick Knowledge Centre's fortnightly theme of dealing with data.
They asked me for an 800-word article on the pros and cons of storing and accessing research data, which in my hands gained a slight open access slant! A little bit 'research data 101' for any practitioner, but aimed at being a short introduction to the kinds of issues raised by data - awareness raising being the name of the game at the moment. Any questions or comments, I'd be happy to hear them!
January 31, 2011
Writing about web page http://crc.nottingham.ac.uk/
This event, at RIBA, looked at creating an environment of 'joined-up' thinking about research. An area that many of the institutions attending had made a start on, at least between the library and the research support offices, but that all needed to expand to include all the actors in the research cycle, from the research funders down.
The introduction helped to set the scene and emphasised the problem that, too often, the research management we have at the moment is too narrowly focused and does not take into account the full breadth of the issues inherent in 'research' - especially the fact that you cannot look to manage research if you are not also managing teaching. One speaker even posed the question of whether it is possible to 'manage' research at all! Overall it was felt that a dialogue needed to begin between all the areas involved in supporting research, to stop the wasteful duplication of effort that is often present currently.
Three of the case studies introduced a collection of different approaches to 'research management': through a broad and integrated IR (Glasgow), through the Research Information System (Newcastle) and using a full CRIS (St Andrews). The final case study looked at paying for open access publication (Nottingham), as a way of examining how a university can support the dissemination part of the research cycle. The funders were represented by the Wellcome Trust and the Natural Environment Research Council (NERC), who looked at the issues they had in ensuring compliance with their open access policies. Despite the ease of compliance for Wellcome-funded research through publishers and UKPMC, they still only have a compliance rate of 50%. The Wellcome Trust emphasised their current work with publishers to ensure compliance through this route (currently 85% of the Wellcome-funded research in UKPMC came from publisher deposit), as well as the things institutions could be doing in terms of advocacy and awareness raising with their academics, particularly around the funding available within institutions (Warwick readers can find details of the Wellcome Trust open access fund here). Gerry Lawson from NERC looked at the issue from the perspective of a single funder, and at the possibility of monitoring compliance through IR harvesting (interesting, as NERC mandates deposit to NORA, but useful for other funders). This was proposed to take place in the first few months of 2012, covering all outputs from 2011. If this is really going to happen, the funders will need to confirm it soon to allow institutions to prepare!
The group and panel discussions focussed on two questions:
- What do we need to know?
- What do we need to do next?
This lead to some very interesting points:
- Research funders are restricted in the ways they can give money to an institution;
- Libraries are happy to administer central OA funds, but want some guidance from the faculties/departments on the criteria for allocating the limited funds;
- Can funders really do more? After all, the open access requirement is part of the contract that academics sign;
- Funders really need more figures on spend on OA publishing to take the argument with the publishers (subscription charges in relation to revenue for open access) forward;
- Would it help if RCUK and HEFCE pushed for REF2020 to grant eligibility only to OA papers? (80% of the submissions to RAE2008 could have been made OA through their existing journals (but how to pay for this!));
- Standardisation needs to be a much bigger priority to allow these diverse systems to talk to each other better;
- Are sanctions from the funders the best way to push up compliance? Is there a happy medium available?;
- The possibility of extending the writing-up period? RLUK and ARMA to look at creating a request to RCUK to move this forward.
Sadly the discussion ran out of time, but it produced some much-needed enthusiasm for taking some of these points forward in the future. All round a very valuable day (and a chance to meet some new faces from the research support side of things) - many thanks to the CRC for organising. There was a suggestion to run the day again due to the huge demand for places; if they do, I would highly recommend it!
October 22, 2010
Writing about web page http://go.warwick.ac.uk/lib-openaccess
The 4th International Open Access Week is drawing to a close, and looking back on a busy week of events I think we can be quietly proud of the way things have gone here at Warwick. This year we celebrated in a number of ways:
- We held two experimental drop-in sessions which generated some interesting discussion on the citation advantage and how to convince colleagues. As well as a discussion on the importance of accurate metadata!
- I recorded my first conversational podcast for display on the new Knowledge Centre website.
- We hosted a well-attended event, intended for researchers but better attended by library staff. The researchers missed a really excellent talk by Gerry Lawson of the Natural Environment Research Council about funders' views of and attitudes to Open Access, as well as talks by myself and Jen Delasalle on a whole collection of other Open Access topics.
- I was invited to speak at the regular meeting of our subject staff to give them a refresher on WRAP, Open Access and other things! I found this meeting really useful and I think both sides came away with ideas to better support the work of the other, which is always fantastic!
- And finally I celebrated Open Access Week with the addition of two new members of my team who have managed to more than double the size of the WRAP team in one go! The timing was coincidental but it was a great way for the Library and University to demonstrate their commitment to open access and WRAP!
There have been lessons learnt from my first Open Access Week, but I think overall it was a moderate success. The WRAP competition continues to run, and I'll announce the winner early next week!
I'll close with a huge thank you to Gerry Lawson for speaking at Wednesday's event and an equally big thank you to Jen for speaking and for co-organising the whole week!
September 07, 2010
Writing about web page http://www.repositoryfringe.org/
I'm just back from a trip to gloriously sunny Scotland (which was obviously breaking out the good weather for the festival) and the 2010 Repository Fringe Event.
Hosted at the National e-Science Centre (NeSC) in the heart of Edinburgh, the sessions began with Sheila Cannell (Director of Library Services, University of Edinburgh) asking us to consider fireworks. She invited us to join in with the firework display at the end of the Edinburgh festival, which in her words was 'open fireworks' (paid for by a combination of public money and the 'subscriptions' of a few), and to use thinking that would light up the sky. This nicely set the tone for the next couple of days.
The keynote by Tony Hirst (Open University) followed, presenting an outsider's view of repositories on the theme of openness. The central theme of the talk was "content as data", and he urged us to consider new ways to store and present the information in our repositories to our users. New ways to manipulate the data and new ways to present it were central, as well as information we might want to start recording but currently aren't, such as 'open queries' showing users exactly how the charts in an article were generated from the underlying data. In a nice touch, Dr. Hirst finished with a revisit to S.R. Ranganathan's 'Five Laws of Library Science' as he encouraged us to keep our repositories as living organisms, rather than places where research is dumped and forgotten.
The following session by Herbert Van de Sompel (Los Alamos National Laboratory) introduced us to the Memento Project: a way to provide web users with time travel! It is a clever way to allow your web browser to access the web as it would have been on a certain date, using the same URI that you have for the current version, and with as much of the page's original functionality as possible. This is one thing I'm looking forward to experimenting with; if you use Firefox, the link above will lead you to the gadget to try it out for yourself!
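Under the hood, Memento works by content negotiation in the datetime dimension: the client requests a URI but adds an Accept-Datetime header naming the moment in time it wants, and a 'TimeGate' service redirects it to the closest archived snapshot. A minimal sketch of building such a request in Python follows; the TimeGate URL here is hypothetical, and a real client would go on to send the request and follow the redirect.

```python
from datetime import datetime, timezone
from email.utils import format_datetime
from urllib.request import Request

# The moment in the past we want to see the page as it was.
wanted = datetime(2009, 6, 15, tzinfo=timezone.utc)

# A Memento TimeGate takes the original URI and negotiates in time.
# This TimeGate address is made up for the example.
req = Request("http://example.org/timegate/http://www.bbc.co.uk/")

# Accept-Datetime carries the wanted moment in standard HTTP date format.
req.add_header("Accept-Datetime", format_datetime(wanted, usegmt=True))

# urllib normalises header names to "Xxxx-yyyy" capitalisation.
print(req.get_header("Accept-datetime"))  # Mon, 15 Jun 2009 00:00:00 GMT
```

The elegance of the scheme is that nothing about the original URI changes: the same address serves the present by default and the past on request.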
Repo Fringe was my first experience of the Pecha Kucha style of presentations (for those not in the know: 20 slides, 20 seconds a slide, autorunning for 6 mins 40 secs per presentation) and they looked just as nerve-wracking as you might expect! On the first day we had an update on the Open Access Repository Junction, beauty and the Jorum repository, Glasgow's Enlighten repository through the metaphor of cake, the problem of dataset identity, research data management and the Incremental project, and finally the EDINA AddressingHistory project. I will admit I was hard pressed to choose my favourite when the time came to vote! I was also impressed at how many ways there are to approach these sessions, and how much information you can pack into just under seven minutes.
The EPrints team reinforced their reputation for giving some of the more entertaining presentations that any conference is likely to see, with their live demo of EPrints 3.3 and the new Bazaar functionality. A very interesting look at what is to come in the software many of us in the audience are using!
The first round table of the conference for me was on the thorny issue of the relationship between an institution's CRIS (Current Research Information System) and its institutional repository (IR). The talk was sparked by the work done on the CRISPool project, which aimed to create a cross-institutional CRIS to cover the Scottish University Physics Alliance (SUPA) research group. The discussion invited us to consider whether the distinction is a false one, or whether the issue is what functionality best fits where in the system. Is it right that IRs exist when we could all have CRISs? Could we create a centralised, national IR and have all our CRISs harvest from there? Should we be looking to integrate CRIS functionality with IRs? What impact does the REF have on the discussion? In the end we didn't come to any definite answers (not that I think that is the purpose of round tables of this sort), but we all took away something to think about.
Day two began with Chris Awre (University of Hull) discussing hangover cures through the ages (the Romans apparently favoured deep-fried canaries) before moving on to the main meat of his presentation: the Hydra Project, a collaboration between the Universities of Hull, Virginia and Stanford, and Fedora Commons. This unfunded project aims to provide solutions to identified common problems, on the understanding that no single institution can (or needs to) create a full range of content management solutions on its own. For Hydra, collaboration is the key to the success of a project, with each institution providing what it can. The project makes use of Ruby on Rails technology and the work of Project Blacklight, an open source 'next-gen discovery tool', to allow a more sophisticated search function.
The second round table of the event was focused on linking data to research articles. This is an area that we are looking to move into in the future, so I was fascinated to hear comments and opinions from places that already had systems running. From the responses of the attendees I was not alone in this; many institutions seem to realise that this is an important area and that the implications of a project like this can be huge. The keyword here was always going to be linking, but linking what to what? What is a dataset? Given the clear difference between a dataset associated with an article and a working dataset, can we pull out only the data that was used in the article and store it with the article, without losing the meaning of the data? The point was made that the cost of storage (while large) pales beside the cost of curating many small things, as with curation there is a cost associated with each item. We discussed the fact that with archives the expectation is that you just put things inside them, whereas with repositories you have the added issue of people trying to reuse the data. In the current age of research funding cuts, the reuse of data is going to become critical, as fewer and fewer institutions will be able to afford to run an experiment again from scratch! The issue of trust was discussed: can we trust a conversion of a dataset for preservation? Will it have maintained all of the formulae inherent in the dataset? The spectre of 'ClimateGate' was raised: will the availability of the data safeguard against this in the future? If we are linking to things inside a dataset, do we have the functionality to 'cite' a small part of the larger whole without making the link meaningless? All this and metadata schemas were touched upon in a stimulating discussion that could have run a lot longer than it did.
Again we came to no conclusions, but everyone I spoke to afterwards came away with at least one thing to think about that they hadn't considered before!
The second round of Pecha Kucha talks were as interesting as the first and covered: the Ready for REF project, looking at the XML output needed for REF reporting; JISC RePosit, working to simplify the deposit process through research information systems like Symplectic; more on research data management, this time from the Edina team and looking particularly at the creation of training tools; the JISC CETIS service and its approaches to open educational resources; ShareGeo and Digimap; and finally the work of the SONEX think tank.
Possibly the most challenging presentation of the event came from Michael Foreman (University of Edinburgh), introducing the concept of 'topic models'. The approach, drawn from a paper by Blei and Lafferty (2009) on their work with articles in JSTOR, allows people to create maps of related documents based on statistical analysis of the frequency of words within each article. A lot of the meat of the statistics stretched my understanding to its limit, but anyone (and everyone in the room certainly did) could see the value of work of this kind as we search for ever more automated ways to describe the content of items in our repositories and how they relate to one another.
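As a toy illustration of the underlying idea (my own simplified sketch, not the Blei and Lafferty model itself, and the article names are made up): relating documents by the statistics of their word frequencies can start from something as simple as comparing bag-of-words vectors. A real topic model goes much further, but the flavour is something like this:

```python
from collections import Counter
from math import sqrt

def term_vector(text):
    """Lower-cased bag-of-words frequency vector for a document."""
    return Counter(text.lower().split())

def cosine_similarity(a, b):
    """Cosine similarity between two term-frequency vectors (0.0 to 1.0)."""
    dot = sum(a[term] * b[term] for term in a if term in b)
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Hypothetical repository items, stood in for by their abstracts.
docs = {
    "article_1": "open access repository deposit policy",
    "article_2": "repository deposit and open access mandates",
    "article_3": "climate data from historic ship logs",
}

# Rank every other article by word-frequency similarity to article_1,
# most similar first -- a crude 'map' of related documents.
target = term_vector(docs["article_1"])
ranked = sorted(
    ((name, cosine_similarity(target, term_vector(text)))
     for name, text in docs.items() if name != "article_1"),
    key=lambda pair: pair[1], reverse=True,
)
print(ranked)
```

Even this crude version surfaces article_2 (which shares repository vocabulary) above article_3 (which shares none); the statistical machinery in the paper is what turns raw word counts into coherent 'topics'.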
The closing presentation from Kevin Ashley (Digital Curation Centre) gave us a round-up of the presentations that had gone before, a round-up of the development of the repository world as a whole and, as a way of looking forward, revisited the idea of citing data. He urged us to be aware that we are "standing on the shoulders of giants" and also to remember that sometimes fireworks are a good way to burn a lot of money very quickly! Curation issues were raised: what do we keep? How long do we keep it for? Repositories have not yet had to consider throwing things away, but we may have to at some point! The concept of the value of data being unknowable was also raised, with the example of data from ships' logs being used three times: first to navigate, then to tell historians about economic and trade conditions, and most recently to discover evidence of climate change. Again we came back to the idea of the 'data behind the graph', the information in the article that we just can't get hold of, as well as the fact that people don't always realise that data can be changing all the time; nothing is truly static.
Overall the two days in Edinburgh were packed with many interesting things, but the biggest thing I took away was that there is always a different way of looking at something, and that you should never forget your foundations.
July 12, 2010
Writing about web page http://or2010.fecyt.es/publico/Home/index.aspx
There will be a full report of the event going up here soon, but I thought I'd get a few of the highlights (non-football related, I'm afraid) up in advance. Presented in no particular order, here are some of the things I took away from the conference.
- News that Spain's new law for Science, Technology and Innovation will mandate open access publishing, in a repository, of all publicly funded research no more than 12 months after completion; it is (hopefully) to be ratified later this year (Proyecto de Ley de la Ciencia, la Tecnología y la Innovación, Article 36).
- The 'buzzword' of the conference was 'linked data': why you should use it, how to code it and, most of all, how to share it.
- There is a need for awareness that the published paper is only part of the process; research is not just about the results but also about the process of getting them. It is just as valuable to researchers for us to archive this data as well.
- Everyone knows what the problems and issues are in the broad areas of repositories and Open Access, and the solutions are as numerous as the problems. However, at the moment development is moving so fast that people don't have much choice about waiting for their preferred option to be ready.
- Some institutions want their mandate in place before they even have a repository. This has definitely helped them, in that they are now starting the repository from a position of community engagement, but I can see problems if the building of the repository is delayed.
- Interoperability and integration with other library systems were highlighted as particular issues and concerns and a number of presentations touched on this, bringing us again back to linked-data.
- Repository drivers (particularly in terms of research assessment) are sometimes driving repositories away from the 'core' or 'ideal' of open access to research.
- Non-text research outputs lead to non-standard repositories. Possibly obvious, but it's worth bearing in mind we don't all have the same challenges, and that even if we think we've got it worked out, unexpected deposits can play havoc with systems. Also it is to our advantage not to get locked into the idea of a single output type.
- Disambiguation is the next big challenge and a number of different projects were presented in this area, both in session and as posters.
- Libraries in general and repositories in particular need to be aware that each discipline has its own 'language'. We need to strive to be the common language that allows them all to communicate, not another language for them to learn.
- The more we can move into researchers' preferred working environments instead of forcing them to learn a new one, the better; lessons can be learnt from the social networking world (hands up how many of you have linked all your accounts so you only have to update one!?!).
- The Carrot vs Stick debate: both approaches work and some institutions are using some very big sticks indeed!
- Digital Preservation doesn't have to be hard, but you do have to want to do it!
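Since 'linked data' was the buzzword of the conference, here is a minimal sketch of what it means in practice (my own illustrative example with made-up example.org identifiers, not something shown at the event): statements about resources expressed as subject-predicate-object triples, serialised here in N-Triples syntax using plain Python. A real implementation would use a library such as rdflib.

```python
# Each linked-data statement is a (subject, predicate, object) triple.
# URIs are angle-bracketed; literal values are quoted. The dcterms and
# foaf predicates are standard vocabularies; the example.org URIs are
# purely hypothetical.
triples = [
    ("<http://example.org/article/42>",
     "<http://purl.org/dc/terms/title>",
     '"Open access and repositories"'),
    ("<http://example.org/article/42>",
     "<http://purl.org/dc/terms/creator>",
     "<http://example.org/person/7>"),
    ("<http://example.org/person/7>",
     "<http://xmlns.com/foaf/0.1/name>",
     '"A. Researcher"'),
]

def to_ntriples(triples):
    """Serialise triples in N-Triples syntax: one dot-terminated statement per line."""
    return "\n".join(f"{s} {p} {o} ." for s, p, o in triples)

print(to_ntriples(triples))
```

The point of the shared URIs is that the article's creator and the person record are the *same* node, so anyone else's data that uses the same identifier links up automatically; that is the sharing the conference kept coming back to.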
Finally, congratulations to Richard Davis and Rory McNicoll of the University of London Computer Centre for winning the 'Developers' Challenge' (for details see here) with a tool to hugely increase the number of useful links out of a repository record. Also to Colin Smith, Chris Yates and Sheila Chudasama of the Open University for winning the poster contest (available here).
December 14, 2009
Last week I attended a meeting with some publishers, and it seems to me that there is considerable potential for confusion, amongst those not involved in repository management, about what repository deposit actually means. The two main areas of confusion seem to be:
1) Not all content in all repositories is necessarily open access. Some repositories have metadata-only records along with some records which also have full text items available on open access. Some also have full text items that are locked such that only repository staff and the author can see them, or such that only members of the institution can see them. Some repositories add a "request a copy" button to their records so that those who can't see the locked full text can request it from the author. Sometimes the locked access is in order to meet a publisher's requirement or sometimes it is because the author prefers that requests are sent to him/herself so that s/he can know who is reading his/her work.
Publishers' agreements with authors and their information about what can and can't be done usually refer to whether repository deposit is allowed or not. I suspect that more of them would allow repository deposit if the article were locked to be accessible only within the institution or only to the author and repository staff.
2) Just because an item is available on open access, that does not mean that it is available for further copying by anyone! Publishers might also be more inclined to allow repository deposit and open access availability if they knew that allowing this is not granting permission for others to on-copy from the repository. Some repositories do also ask authors to grant a Creative Commons (CC) licence for the use of the article they deposit, and when this is the case then the article will also be available for further copying. Authors can do this when it is clear that they own the copyright themselves. Those repositories which do use the CC licence don't all expect every single item they hold to be deposited with such a licence, although perhaps that would be an ideal scenario. WRAP isn't one of those repositories which asks authors to sign a CC licence, for now. It would just be another hurdle to deposit and our main aim is to make the works available without subscription barrier.
Publishers' agreements with authors who have paid for their article to be made available on open access on the publishers' site do not state that repository deposit is also allowed, although it seems that (some, at least) do expect that to be the case without their stating it. Perhaps their agreements with the authors do grant copyright back to the authors and that's why they expect it, but it's not always clear to repository managers that this is the case.
We don't put open access articles into the WRAP repository unless permission is expressly granted by the publisher or clearly owned and granted by the author. Open access seems to have been conflated with waiving of copyright, but copyright still exists in open access works. BioMed Central are very clear that their open access articles can be further copied, and they state how, etc., so they're an example of how open access should be handled by publishers, in my opinion. This is another reason that I wouldn't consider deposit in WRAP to be a form of publication. WRAP has no copyright ownership over the works it holds: that still rests with the rights owners.
For WRAP, we are clear that we want full text to be made available on open access for all journal articles and for as many PhD theses as possible. We don't have metadata-only records for journal articles, but we do for theses, and we also allow theses to be deposited but locked to repository staff only. The works in WRAP are not made available with any particular licence, and rights owners would still need to be consulted before further copying could be done.
It seems to me that there are so many different flavours of repository, all with ever so slightly different aims and purposes and so we're all doing slightly different things with them. No wonder there is so much potential for confusion! In any case, I was very glad to begin speaking to publishers as I did last week with some representatives from the Highwire publishers, in my role as Chair of the UK Council of Research Repositories.