{"id":2449,"date":"2009-06-02T11:10:28","date_gmt":"2009-06-02T10:10:28","guid":{"rendered":"http:\/\/blog.soton.ac.uk\/keepit\/?p=80"},"modified":"2009-06-02T11:10:28","modified_gmt":"2009-06-02T10:10:28","slug":"keepit-repositories-initial-survey-ecrystals","status":"publish","type":"post","link":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/2009\/06\/02\/keepit-repositories-initial-survey-ecrystals\/","title":{"rendered":"KeepIt repositories initial survey: eCrystals"},"content":{"rendered":"<p><a href=\"http:\/\/ecrystals.chem.soton.ac.uk\/\"><img loading=\"lazy\" decoding=\"async\" class=\"alignright size-medium wp-image-81\" src=\"http:\/\/blog.soton.ac.uk\/keepit\/files\/2009\/06\/logo_ec.jpg\" alt=\"eCrystals logo\" width=\"142\" height=\"55\" \/><\/a><\/p>\n<p>Our third repository exemplar is eCrystals, which manages scientific, specifically\u00a0crystallography, data that might be referred to broadly as e-data\u00a0or e-science.<\/p>\n<p>To recap the purpose of\u00a0these initial surveys of the four exemplar repositories in the KeepIt project, we are seeking to characterise the repositories not in terms of their preservation activity but in terms of factors that will influence possible preservation strategies.<\/p>\n<p>Since this repository operates somewhat differently from the others, some brief background information is needed. The repository is\u00a0operated\u00a0by the\u00a0National Crystallography Service (NCS), based at the University of Southampton, and so is a national service. It manages two types of data: the &#8216;raw&#8217; data generated directly by crystal analysis, and the results data\u00a0&#8216;derived&#8217;\u00a0from the raw data.<\/p>\n<p>NCS offers two types of experimental service to its users:<\/p>\n<ol>\n<li>Full determination, where NCS generates raw data and works up derived data into results. 
This is deposited in eCrystals (generally initially embargoed);<\/li>\n<li>Data Collection only, where NCS collects the raw data and turns it into the first stage of derived data. This derived data is then sent to users and they work it up into results.\u00a0None of the user-derived or results data is deposited in eCrystals.<\/li>\n<\/ol>\n<p>The future plan is to use eCrystals for service 2 as well: NCS deposits first-stage derived data, the user picks it up, turns it into a result and deposits the result into eCrystals.<\/p>\n<p><a href=\"http:\/\/ecrystals.chem.soton.ac.uk\/\" target=\"_self\">http:\/\/ecrystals.chem.soton.ac.uk\/<\/a><\/p>\n<p>ROAR: <a href=\"http:\/\/roar.eprints.org\/index.php?url=http:\/\/ecrystals.chem.soton.ac.uk\/\" target=\"_self\">http:\/\/roar.eprints.org\/index.php?url=http:\/\/ecrystals.chem.soton.ac.uk\/<\/a><\/p>\n<p><em>Current status of repository<\/em><\/p>\n<p>Was funded by JISC, under the eCrystals open-access archive project, to the end of March 2009.<\/p>\n<p>Archive service provided as part of NCS at Southampton University, funded by EPSRC. This funding is subject to periodic review in the forthcoming research cycle.<\/p>\n<p><em>Mission<\/em><\/p>\n<p>eCrystals Southampton is the archive for Crystal Structures generated by the Southampton Chemical Crystallography Group and the\u00a0EPSRC UK National Crystallography Service (NCS).<\/p>\n<p><em>Management structure and decision-making, reporting tree<\/em><\/p>\n<p>Management structure of the repository is headed by the director of the national service.<\/p>\n<p><em>Staffing (no. FTE)<\/em><\/p>\n<p>0.5 FTE systems administrator<\/p>\n<p><em>Policy<\/em> (documentation, e.g. 
mandate, format policy, retention policy, take down policy?)<\/p>\n<p>Embedded into <a href=\"http:\/\/www.ncs.chem.soton.ac.uk\/pub_pol.htm\" target=\"_self\">NCS Publication Policy<\/a>:\u00a0&#8220;We have created an archival method for crystal structure data which is designed to reside on Institutional Repositories.<\/p>\n<p>&#8220;At the present time, we are operating two archives. One is a private resource, visible only within the Southampton firewall, which is used as a comprehensive laboratory management and data archival system, to which we now routinely upload all completed and validated crystal structure determination outputs. The other archive is an open access resource, visible <a title=\"eCrystals repository\" href=\"http:\/\/www.ecrystals.chem.soton.ac.uk\" target=\"_self\">externally<\/a>, which we are now using as a direct route to dissemination of structural data. Each entry is assigned a Digital Object Identifier (DOI) so that the entry may be referenced in any future publication.&#8221;<\/p>\n<p>For journal publications that report and link to crystal structure determinations presented in the repository, the policy recognises it is important to satisfy publishers and the public that the repository will have the same stability and longevity as journal publications.<\/p>\n<p>The &#8220;two&#8221; archives referred to here are concerned with just the derived and results data, not raw data. The difference between them today is effectively that one is embargoed and the other is not. 
The &#8220;raw&#8221; data is stored at the Atlas Data Store at the Rutherford Appleton Laboratory, essentially a closed repository that is used as an internal store, but this data is available on request (by email \/ post \/ Dropbox-type solutions).<\/p>\n<p><em>Planning the repository (formal planning approach?)<\/em><\/p>\n<p>Repository founded on JISC project planning and design<\/p>\n<p>Data architecture carefully matched to crystallography requirements<\/p>\n<p>Investigated preservation needs and options: A study of <a href=\"http:\/\/wiki.ecrystals.chem.soton.ac.uk\/images\/a\/a7\/ECrystalsCuration.pdf\" target=\"_self\">Curation and Preservation Issues in the eCrystals Data Repository and Proposed Federation<\/a><\/p>\n<p>Further preservation reports due:<\/p>\n<ol>\n<li>Representation Information for Crystallography Data;<\/li>\n<li>Preservation Planning for Crystallography Data;<\/li>\n<li>Preservation Metadata for Crystallography Data<\/li>\n<\/ol>\n<p><em>Budget (contingency for preservation?)<\/em><\/p>\n<p>Budget covers storage of raw data.<\/p>\n<p>Results (derived) data not formally budgeted, but this is to be reviewed.<\/p>\n<p><em> Infrastructure (institutional, network, etc.)<\/em><\/p>\n<p>eCrystals server managed by the systems administrator.<\/p>\n<p>Archive is backed up nightly.\u00a0No offsite backups of server. The backup is held within the chemistry department, in a building connected by a corridor.<\/p>\n<p>Personal curation culture &#8211; analysis of crystal structures performed on a series of Linux boxes<\/p>\n<p><em>Tools, services and support (which v. EPrints?) 
<\/em><\/p>\n<pre>version:\u00a0eprints-3.0.3-rc-1<\/pre>\n<p>Reconfigured repository software, core code modified, \u00a0bespoke standalone code and third-party Web services used.<\/p>\n<p><em> Storage (current, strategy?)<\/em><\/p>\n<p>Record of the raw data\u00a0back to about 2002, including frame images, at the Atlas Data Store at the Rutherford Appleton Laboratory.<\/p>\n<p>Testing storage of raw data (500 GB) from\u00a0just the last couple of years\u00a0on (School of ECS, Southampton) Honeycomb server (Honeycomb hardware platform discontinued by Sun, support continues to 2013).<\/p>\n<p>Data from 1998-2002 is on USB disks stored in our lab, migrated from CDs written at the time of generation.<\/p>\n<p>Institutional solution preferred.<\/p>\n<p><em>Content profile &#8211; volume, types, formats (content control?) <\/em><\/p>\n<p>The information contained within each entry of this archive is all the fundamental and derived data resulting from a single crystal X-ray structure determination, but excluding the raw images.<\/p>\n<pre>21\/05\/09<\/pre>\n<pre>archive: 480, buffer: 26, inbox: 65, deletion: 7, eprint: 578<\/pre>\n<pre>document: 565<\/pre>\n<p><a href=\"http:\/\/roar.eprints.org\/index.php?action=profile&amp;url=http:\/\/ecrystals.chem.soton.ac.uk\/\" target=\"_self\"> Preserv format profile<\/a> (large number of files &#8216;unknown&#8217; to profiling tool)<\/p>\n<p><em> Growth projections (scaling up?) <\/em><\/p>\n<p>Plan to expand remit of repository to cover user-derived data (see above).<\/p>\n<p>Scientific instrumentation has a typical lifespan of 10 years. As equipment is renewed there is likely to be an order of magnitude increase in data volumes.<\/p>\n<p><em> Future plans for the repository (any major changes planned?) 
<\/em><\/p>\n<p>Change storage model &#8211; cloud?<\/p>\n<p>Target more learned society involvement.<\/p>\n<p><strong> Summary <\/strong><\/p>\n<ul>\n<li>Part of national service provision<\/li>\n<li>Detailed repository data architecture design developed over several project iterations<\/li>\n<li>Highly customised (EPrints) repository software<\/li>\n<li>Well informed and proactive on preservation<\/li>\n<li>Funding uncertainties pending review<\/li>\n<\/ul>\n<p><strong> Proposed actions <\/strong><\/p>\n<ul>\n<li>Review storage provision<\/li>\n<li>Examine infrastructure options and prospects, strengthen current arrangements<\/li>\n<li>Assess scope for policy provision beyond publishing policy<\/li>\n<li>Consider how to cultivate and embed personal curation practices endemic in this field of science<\/li>\n<li>Produce more complete profile of deposit formats<\/li>\n<li>Consult on upgrade to EPrints v3.2 when available, or assess how to integrate preservation-support tools from this version in the customised repository software<\/li>\n<\/ul>\n<p>Thanks to Simon Coles, Manjula Patel and Richard Stephenson for sharing and clarifying this information.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Our third repository exemplar is eCrystals, which manages scientific, specifically\u00a0crystallography, data that might be referred to broadly as e-data\u00a0or e-science. 
To recap the purpose of\u00a0these initial surveys of the four exemplar repositories in the KeepIt project, we are seeking to characterise the repositories not in terms of their preservation activity &hellip;<\/p>\n","protected":false},"author":869,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_et_pb_use_builder":"","_et_pb_old_content":"","_et_gb_content_width":"","footnotes":""},"categories":[4],"tags":[274,281,283,327],"class_list":["post-2449","post","type-post","status-publish","format-standard","hentry","category-uncategorized","tag-ecrystals","tag-exemplar-profiles","tag-exemplar-surveys","tag-science-repositories"],"jetpack_featured_media_url":"","_links":{"self":[{"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/posts\/2449","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/users\/869"}],"replies":[{"embeddable":true,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/comments?post=2449"}],"version-history":[{"count":0,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/posts\/2449\/revisions"}],"wp:attachment":[{"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/media?parent=2449"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/categories?post=2449"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/generic.wordpress.soton.ac.uk\/test-media\/wp-json\/wp\/v2\/tags?post=2449"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}