{"id":363,"date":"2009-09-23T21:33:02","date_gmt":"2009-09-23T20:33:02","guid":{"rendered":"http:\/\/blog.soton.ac.uk\/keepit\/?p=363"},"modified":"2009-09-23T21:33:02","modified_gmt":"2009-09-23T20:33:02","slug":"data-repositories-the-next-new-wave","status":"publish","type":"post","link":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/2009\/09\/23\/data-repositories-the-next-new-wave\/","title":{"rendered":"Data repositories: the next new wave"},"content":{"rendered":"<p><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-full wp-image-417\" src=\"http:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-content\/blogs.dir\/sites\/52\/2009\/09\/dcc_logo.png\" alt=\"dcc_logo\" width=\"220\" height=\"70\" \/>Should institutional repositories do data curation? Underlying this question is: what is a repository, and is that changing?<\/p>\n<p>Without going into full detail, there are some straws in the wind. First, back when I was working for the Repositories Support Project, Tony Hey, VP Microsoft and formerly head of our school (Electronics and Computer Science) at Southampton University, gave the keynote at the\u00a0RSP Repository Softwares Day:\u00a0Repositories \u2013 Past, Present and Future (<a href=\"http:\/\/www.rsp.ac.uk\/events\/RepositorySoftwareDay2009\/tony_hey_microsoft.ppt\">slides<\/a>). Look in particular at the part on the future.<\/p>\n<p>Unprompted, I dashed off a report on the meeting for the RSP Web site. It wasn&#8217;t used &#8211; clearly I wasn&#8217;t <a title=\"Another Triumph for the Repositories Support Project, 24 March 2009\" href=\"http:\/\/www.rsp.ac.uk\/news\/news2009-03-20SoftwaresDayNewsItem.php\" target=\"_self\">effusive<\/a> enough about the event &#8211; but my main point was this: If repositories are the new wave of scholarly communication, the next\u00a0new wave was glimpsed in a keynote presentation by Tony Hey. 
He pointed to\u00a0&#8216;cloud computing&#8217; as the way forward, with the potential for recasting\u00a0the way repositories are structured and managed.<\/p>\n<p>Hey&#8217;s view was reinforced by Dave Flanders of JISC in a closing panel\u00a0session.\u00a0So how does the cloud transform repositories? The cloud provides the technical\u00a0infrastructure, so that repositories don&#8217;t have to, and it offers\u00a0flexibility, leaving repositories to focus on what they want to do,\u00a0whether this is to restructure repositories to reflect institutional\u00a0priorities, to be led within institutional structures such as schools\u00a0and departments, or to manage different digital types such as research\u00a0materials and learning objects. Put the repository in the cloud and\u00a0then ask the questions again, is how Flanders saw it.<\/p>\n<p>More recently, at the end of July, I went to the <a title=\"The Sessions\" href=\"http:\/\/wiki.repositoryfringe.org\/index.php\/The_Sessions\" target=\"_self\">Edinburgh\u00a0Repository Fringe<\/a> to join the DataShare and DCC data workshops. Instead of hearing about the latest exciting repository developments in other sessions, we ploughed into data management documentation. While I didn&#8217;t find the open forum of the DataShare workshop to be particularly enlightening, the underlying support document, the <a title=\"Policy-making for Research Data in Repositories: A Guide, DISC-UK DataShare, May 2009\" href=\"http:\/\/www.disc-uk.org\/docs\/guide.pdf\" target=\"_self\">policy-making guide<\/a> produced by the project, is helpful. What&#8217;s good is this guide recognises that everyone produces data; it&#8217;s not just for specialists, and the multilayered presentation makes it more approachable.<\/p>\n<p>Next morning it was time for\u00a0Digital Curation 101 &#8216;Lite&#8217;. 
Despite lasting three hours, this was clearly a skim, yet somehow\u00a0it gave a sense of the <a title=\"DCC Digital Curation 101 Workshop\" href=\"http:\/\/www.dcc.ac.uk\/events\/digital-curation-101-2008\/\" target=\"_self\">full 101 course<\/a> that the Digital Curation Centre has compiled.\u00a0I was impressed. The short team exercises revealed that there are others in the institutions who are approaching these issues from a completely different angle to repositories. There&#8217;s the clue.<\/p>\n<p>Nevertheless, I came away thinking these data management issues, which are institution-wide and transformational in scale, are not going to happen in the next year, the remaining timeframe of the KeepIt digital preservation project. Our <a title=\"Tag: exemplar profiles, Diary, various\" href=\"http:\/\/blog.soton.ac.uk\/keepit\/tag\/exemplar-profiles\/\" target=\"_self\">exemplar repositories<\/a> are not going to be transformed in that time. Perhaps I should drop it.<\/p>\n<p>Then Dorothea Salo unwittingly opened my mind to the prospect again. A major theme of Salo&#8217;s latest blog incarnation is data curation and it connects well, rather unusually, with institutions and their repositories. <a href=\"http:\/\/scienceblogs.com\/bookoftrogool\/2009\/08\/if_not_now_when.php\" target=\"_self\">If not now, when?<\/a> (27 Aug 2009):\u00a0&#8220;who&#8217;s going to do data curation &#8230; we can have a pretty good idea who&#8217;s not going to do it: anybody who isn&#8217;t <em>right this very minute<\/em> planning to do it. This is no time for analysis paralysis.&#8221;<\/p>\n<p>So when we convened earlier this month to redraft our project training plan I resolved to put the case for including data management to our exemplar repositories. I noted that each of these repositories exemplifies a different aspect of data management. I\u00a0suggested that JISC and DCC, as well as UKRDS, research funders and,\u00a0eventually, institutions will be the drivers for this. 
Coincidentally,\u00a0immediately after the meeting\u00a0a <a title=\"Data's shameful neglect, Nature, 10 September 2009\" href=\"http:\/\/www.nature.com\/nature\/journal\/v461\/n7261\/full\/461145a.html\" target=\"_self\">Nature editorial<\/a> came to light saying essentially the same thing. How can we not go forward after this admonishment?<\/p>\n<p>So, finally, here is my take on how repositories may be changing. We have to separate people and content or data from systems and infrastructure. At the moment we tend to take a systems-based approach\u00a0(e.g. is it EPrints or DSpace) to managing a thinly-defined type of content, and the focus is the <em>repository<\/em>. Yet as these repositories grow institutionally, that is, to represent and present all the substantive activities and outputs of the institution, we can see the expansion and <a title=\"Repositories at the crossroads, JISC Preserv project\" href=\"http:\/\/preserv.eprints.org\/guide\/repository\/\" target=\"_self\">transformation of the system<\/a> in the &#8216;cloud&#8217;, and the emergence of <a title=\"Repository storage: the impact of data growth and diversity, JISC Preserv project\" href=\"http:\/\/preserv.eprints.org\/guide\/storage\/\">intermediate services<\/a> to manage repository systems within this flexible infrastructure.<\/p>\n<p>There are already many people supporting systems and IT infrastructure in institutions; there are <a title=\"Data deletion: it happens, Diary, September 16, 2009\" href=\"http:\/\/blog.soton.ac.uk\/keepit\/2009\/09\/16\/data-deletion-it-happens\/\" target=\"_self\">fewer people designated to manage data<\/a> and support data creators. We can already see in our exemplar repositories the types of data that might be managed: arts, science (crystallography), teaching materials, research papers, etc., and probably within disciplines and sub-disciplines for some data types. 
The people responsible for these repositories tend to be called repository managers, but they are not systems experts; they are data experts. We need many more data experts across the institution.<\/p>\n<p>As repositories grow they will essentially become teams of data experts working with data producers. There will be repository managers, but they will be team leaders, coordinating data and systems teams, rather than the repository managers we know today. What kind of people will they be? Salo has an idea (<a href=\"http:\/\/scienceblogs.com\/bookoftrogool\/2009\/08\/the_accidental_informaticist.php\" target=\"_self\">The accidental informaticist<\/a>, 17 Aug 2009): &#8220;can-do souls comfortable with a lot of uncertainty and able to learn fast.&#8221;<\/p>\n<p>It is my expectation that we need to allow the repository managers of our exemplars to develop as people rather than simply as fronts for repository systems. This project is unlikely to see that process complete, but if we have the vision we can at least make a start. It will be a major topic\u00a0and a big challenge to embrace it, but at least we know who to turn to\u00a0for help.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Should institutional repositories do data curation? Underlying this question is: what is a repository, and is that changing? Without going into full detail, there are some straws in the wind. 
First, back when I was working for the Repositories Support Project, Tony Hey, VP Microsoft and formerly head of our &hellip;<\/p>\n","protected":false},"author":869,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[4],"tags":[266,267],"class_list":["post-363","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-data-curation","tag-data-management"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/posts\/363","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/users\/869"}],"replies":[{"embeddable":true,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/comments?post=363"}],"version-history":[{"count":0,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/posts\/363\/revisions"}],"wp:attachment":[{"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/media?parent=363"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/categories?post=363"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/tags?post=363"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}