All 3 entries tagged Reflections
July 26, 2012
Writing about web page http://or2012.ed.ac.uk/
The second day started early with my final workshop, the Place of Software in Data Repositories, which focused on work done by the Software Sustainability Institute on the role of software in the research process. It again illustrated the slow move away from researchers' rewards being tied to traditional publications, and an acknowledgment that research is now a diverse process involving a huge array of skills. But for researchers to gain the rewards for their work there needs to be a systematic way of storing these things and making them available. The host of the workshop, Neil Chue Hong from the University of Edinburgh, spoke briefly on a new idea, the 'software metapaper', as a way to cite all the different parts of a project. The 'metapaper' is a neat way around one of the problems that has been discussed in terms of citing datasets: the fact that some journals limit the number of references you can use (which seems counter-intuitive to me but that's another blog post!). The 'metapaper' will create a complete record of a project, citing within it any publications, methodology, datasets or software objects that might all be published in different places, and bringing them together into a single citable object. The first journal of this kind, the Journal of Open Research Software, is due to launch soon. The long term preservation of software arising from JISC funded projects was also mentioned as an issue that JISC is beginning to grapple with now.
Breakout groups were centred around the range of factors that need to be considered when making software available in a repository. This brought up many of the main issues that we had been discussing in relation to datasets: versioning, external dependencies (software is not Perl code alone), drivers to deposit and trust issues all came up again. Key amongst these challenges was the issue of sustainability, and also of reuse of the software by the software customers. Much of the discussion centred around what exactly you need to archive for the curation of software: just the text document containing the code? Or would you need to host executable files and the associated virtual machine interfaces as well? What does a trusted repository look like? One interesting issue that also came out of this was the need to store malicious software, for training purposes and testing, while flagging these items really clearly in an open repository for fear of a range of problems! This morning was a great eye opener on the range of questions specialist types of material can raise for repositories, making it ever more clear that generalist repositories, like our institutional repository, may not be suitable places to try and store everything.
The main conference started in the afternoon with a fantastic keynote opening by Cameron Neylon, the new director of advocacy at the Public Library of Science (PLoS). I had so much to say about this that it has its own post (to follow)!
Following the keynote was the 'Poster Minute Madness', a brilliant idea for presenting posters at a conference and a way to get people excited about the content of the posters. A surprisingly nerve-wracking experience for presenters despite being only 60 seconds long! (Our poster can be found in WRAP, self-archived on the way to the conference.) As when I last saw this at OR10 in Madrid, I was blown away by the range of activities being undertaken by repositories around the world and the exciting projects people are thinking up! Highlights for me were:
- Brian Kelly and Jen Delasalle on social networks and repositories
- Chris Awre and others on history data management plans
- Helen Kenna and Karen Bates on Salford's digital archives
- QUT's poster on enhanced usage stats (very much our ideal situation)
But all of the posters were well worth the time taken to read them; I was just disappointed that I didn't get more time talking to people about my poster or talking to others about theirs! The poster reception followed the last events of the day at the stunning Playfair Library.
From here the conference started the parallel sessions, which as usual reminded me of being at a music festival where three of the bands you want to see are all playing at the same time on different stages! (Here I'll add a huge thank you to the organizers for videoing all the presentations so I could watch the ones I missed!) In the end I plumped for the sessions on the development of shared services, which gave an interesting view of a number of countries that are using national shared services as the base of their repository infrastructure. For every advantage of this kind of service I think of, I'm reminded of the really rich, heterogeneous environment we have in the UK, where every repository works a little differently for different people, and I think it's worth the frustrations that always arise when you try to make the systems talk to each other! It was good to hear about the progress of the UK Repository Net+ project, which looks like it has the potential to do a lot of good for repositories in the UK, and the news from the World Bank of their aggressive Open Access policies is also really encouraging!
July 16, 2012
Writing about web page http://or2012.ed.ac.uk/
This is the first of a series of blog posts on my reflections on the 7th International Conference on Open Repositories. I've split the post by the days of the conference mainly to avoid this being the longest blog post ever and to make it easier to refer to later.
Day one was taken up with half day workshops, a fantastic idea that allowed a level of interaction that some of the later sessions couldn't. All the workshops seemed to feature great discussions on relevant topics and a great comparison of different practices in different countries and institutions. My day one workshops were:
- ISL1: Islandora - Getting Started
- DCC: Institutional Repositories & Data - Roles and Responsibilities
- And an optional evening workshop on EDINA's Repository Junction Broker project.
The Islandora workshop was fascinating! I'd not seen very much of the software or its potential before and the workshop was a great introduction to everything about it, from the architecture and underlying metadata to the different Drupal options for customising the front end. Their system of 'solution packs', Drupal modules that allow you to drop support for different functionality and content types into the system, is a great idea and gives the system a degree of flexibility not found in other systems yet (although the EPrints Bazaar might get there soon). They demo-ed a books solution pack for paged content as well as discussing forthcoming solution packs for institutional repository (IR) functionality and Digital Humanities projects. Islandora maintain a web-based sandbox environment, wiped clean each evening, to allow people to experiment, and I'm looking forward to playing with it as we scope new software for future projects. I also like the fact that the software is completely open source, following the replacement of the ABBYY OCR software with the open source equivalent, Tesseract. Islandora, as the 'new' player in the market, is managing to provide the same functionality that the other systems do along with a collection of exciting add-ons. However, I do see that as you add the extra functionality you have to maintain a number of additional modules as well as the core software, which could have resource implications down the line.
The afternoon workshop run by the Digital Curation Centre was a nice mix of presentations on the current thinking of a number of projects from around the world and group debate on the week's 'hot button' topic of Research Data Management (RDM). This topic was to come up time and again in the week as most of the talks and discussions touched on it at least a little. As the title suggested, the main thrust of the discussion was around who was responsible for what! Discussions covered a range of topics and some of the messages that came out most strongly for me were:
- Use the discipline data centres as much as possible, no IR (data or otherwise) can, or should, do everything.
- Knowing where the other data centres are is essential.
- Try not to get bogged down trying to 'fix' everything first time, fix what you can and work on the rest later or you could end up doing nothing.
- Interesting point from Chris Awre at Hull: use the IR as a starting point for discussions to move the researcher's thinking from what you have to what they need.
- Try to get into the researchers' workflows as early as possible as it makes creating the metadata easier for the researcher, which in turn helps the archive.
- Are repositories qualified to appraise the data deposited with them?
I'll admit that the whole area of RDM is a scary one but it was good to realise that there are both (a) a lot of people out there feeling the same and (b) a lot of assistance there for when it's needed. The idea of just getting something in place and fixing the rest later feels a bit counter-intuitive to me but, on the other hand, it's what I've been doing with WRAP's development over the last two years; it's just that someone else had to take the first step!
The final workshop of the day was an informal one in the evening discussing the development of EDINA's Repository Junction Broker project, which is going to form part of the services offered by the UK Repository Net+. The discussion centred around extending the middleware tool developed by EDINA to allow publishers to feed deposits directly into repositories as a service to researchers. As ever this sounds like a fantastic idea and the debate was active and enthusiastic as the various stakeholders discussed how to make this work for both repositories and publishers. Certainly, as far as WRAP is concerned, if what we need to do is get our SWORD2 endpoint up and running then that is what we have to do; the services offered by the Repository Junction are far too good to miss out on! I'll be watching this develop with interest....
More on day two soon....
August 08, 2011
Writing about web page http://www.irios.sunderland.ac.uk/index.cfm/2011/8/1/IRIOS-Workshop-Parellel-Sessions
One thing I took away from the workshop session was that both systems ROP and IRIOS were doing the right things and going in the right directions but weren't quite there yet. A big concern to me as an IR manager (and as a former Metadata Librarian) was that the IRIOS system creates yet more unique identifiers (see later in this entry for further discussion of unique IDs). Also automation of the project linking to outputs can't come fast enough, especially for services like WRAP where we spend a not inconsiderable amount of time tracking down funding information from the papers. However we could also benefit from taking information from systems such as this, which tie the recording of information about outputs much more closely to the money, which is always a motivator for people to get data entered correctly!
I think it is telling that more and more of these 'proof of concept' services are being developed using the CERIF data format (after R4R I'm looking forward to hearing about the MICE project early next month) but the trick with a standard is that it is only a standard if everyone is using it. I don't think we are quite there yet. This coming REF has been such an uncertain process so far that I think there is a lot more chance of CERIF being the main deposit format in the next REF. (If I'm still here for the next REF I'll have to reflect back on this and see if I was right!)
The afternoon of the workshop was taken up with a number of breakout discussions on a range of topics. Below are a few of the notes I took in the two discussions I took part in. To see the full rundown of all of the discussions please see the link above.
Universal Researcher IDs (URID)
It was generally accepted by all in the discussion that unique IDs for things, be they projects, outputs, researchers or objects, were a good idea in terms of data transfer and exchange. They must be a good idea as there are so many different ones you can have (in the course of the discussion we mentioned more than eight current projects to create URIDs). Things are much easier to link together if they all bear a single identifier. However, when it comes to people the added issue of data protection rears its head and can potentially hamper any form of identification if it is 'assigned' to the person. A way round this was suggested: allow people to sign up for identifiers, thus allowing those who wish to opt out to do so. Ethically the best route perhaps, but unless a single service was designated we could end up with a system similar to the one we have now, where everyone is signing up to, but not using, a whole array of services. The size of the problem is the size of the current academic community, and it is global in scope. Some of the characteristics of URIDs we came up with were that they must: be unique (and semantic free, given the previously mentioned privacy issues), have a single place that assigns them, have a sustainable authority file, and not be tied to a role. One current service in place that fulfils many of the above criteria is the UUID service; however, this falls down in that there is no register of assigned IDs, so people can apply for multiple IDs if they forget them (and let's face it, the likelihood of remembering a 128-bit number is kind of low) ... and we're back in the same situation again. I'm not sure there is a single perfect solution to this problem, though my life would be easier if there was!
This was a free form discussion that covered the REF, REF preparations and 'Life after the REF' in various guises. HEFCE are currently tendering for the data to be used in the REF; needless to say the two services bidding are the expected two, Thomson Reuters and Scopus, but HEFCE will only be buying one lot of data. Bibliometrics were touched upon in relation to the REF: is it better to have two people select a really highly cited paper or choose two lower cited papers? Discussions on the HESA data: checking the data once it comes back from HESA, and the possibilities of mapping the future HESA data to the REF UoA for long term benchmarking rather than a single point that goes out of date very quickly. Do people's CRIS systems really hold all of the data required for a return? What are the differences between the impact as measured/requested by HEFCE and the impact measured by RCUK? Selection policy and training: the possibility of sector wide training, with possible best practice mentioned in the idea to train a small core group of people who would handle all of the enquiries centrally. Would it be possible for institutions to get the facilities data on a yearly basis, rather than getting it just before the REF and then having to chase people who may not remember (or have left) to try and verify the data?
One interesting comment from the discussion was the news that NERC, at least, has seen a big increase in the number of grant applications including a direct cost for Open Access funding. This is particularly interesting as a number of comments had been made to me that researchers didn't want to do that as they feared making their grant application too expensive.
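To make the 'no register' point concrete, here is a minimal sketch using Python's standard `uuid` module. The function name `mint_researcher_id` is purely hypothetical and is not part of any real URID service; the sketch only assumes that identifiers are minted as random (version 4) UUIDs. It shows that one person can mint a second ID with nothing to connect it to the first.

```python
import uuid


def mint_researcher_id() -> uuid.UUID:
    """Mint a random (version 4) UUID. No central register is consulted."""
    return uuid.uuid4()


# The same (hypothetical) researcher asks for an ID twice,
# having forgotten the first one.
first = mint_researcher_id()
second = mint_researcher_id()

print(first)            # 36 characters: 32 hex digits plus 4 hyphens
print(second)
print(first == second)  # nothing ties the two IDs to the same person
```

Without an authority file recording who holds which ID, the two values are just two unrelated 128-bit numbers, which is exactly the duplication problem described above.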
All in all the day was very interesting for me as an introduction to a 'world beyond publications' (as I was attending both for myself and for a member of our Research Support Services department) and as an indication of what we need to do to go forward.