Research data documentation and training materials

The final within-project version of the Orbital Research Data Management training materials is now live on the Orbital Researcher Dashboard website. The materials were written collaboratively by the Orbital project team, and draw on a lot of existing RDM training and guidance material from across the web (in particular, from the DCC).

We intend that these materials will continue to be maintained and developed as part of the new University-wide research information service mentioned in a previous blog post.

Screenshot of the Researcher Dashboard

The training materials can be accessed at https://orbital.lincoln.ac.uk/ and cover the following areas:

  1. What is research data?
  2. The research data lifecycle
  3. Policies affecting your research data
  4. Data Management Planning (DMP)
  5. Data search and discovery tools
  6. Data storage and security
  7. Legal and ethical issues
  8. Tools for working with your data
  9. Data publishing and citation
  10. Licences for sharing your data
  11. Data curation and preservation
  12. Workshops and training events
  13. Help and support

The source text for each page is stored in Markdown format in an open GitHub repository (at http://github.com/unilincoln/rdm). The page admin tools in the Researcher Dashboard can then be used to link a page to its source document, which is then formatted in the University’s Common Web Design.
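The link between a Markdown source file and a formatted page can be sketched roughly as follows. This is a minimal illustration in Python, not the Researcher Dashboard's actual code: the function names, the branch name and the file name are all hypothetical, and the toy renderer (headings and paragraphs only) stands in for a full Markdown library.

```python
import html
from urllib.parse import quote
from urllib.request import urlopen

RAW_BASE = "https://raw.githubusercontent.com"  # GitHub's raw-content host


def raw_url(owner, repo, branch, path):
    """Build the raw-content URL for a file in a public GitHub repository."""
    return f"{RAW_BASE}/{owner}/{repo}/{branch}/{quote(path)}"


def fetch_markdown(url):
    """Download the Markdown source as text (needs network access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")


def render_minimal(markdown_text):
    """Toy renderer: '#' headings and blank-line-separated paragraphs only."""
    blocks = []
    for block in markdown_text.strip().split("\n\n"):
        text = " ".join(line.strip() for line in block.splitlines())
        if text.startswith("#"):
            # The number of leading '#' characters gives the heading level.
            level = min(len(text) - len(text.lstrip("#")), 6)
            title = html.escape(text.lstrip("#").strip())
            blocks.append(f"<h{level}>{title}</h{level}>")
        elif text:
            blocks.append(f"<p>{html.escape(text)}</p>")
    return "\n".join(blocks)


def wrap_in_template(title, body_html):
    """Drop rendered HTML into a page shell (a stand-in for the site design)."""
    return (
        f"<html><head><title>{html.escape(title)}</title></head>"
        f"<body>\n{body_html}\n</body></html>"
    )
```

A page would then be produced with something like `wrap_in_template("What is research data?", render_minimal(fetch_markdown(raw_url("unilincoln", "rdm", "master", "example.md"))))`, where the branch `master` and the file `example.md` are placeholders rather than real paths in the repository. Keeping the text in the repository means every change to a page is versioned, and the site only needs to know where the source lives.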

These web pages will be used to support the ongoing RDM training for postgraduate students, which will shortly be rolled out to University staff.

Orbital project team meeting: notes

Here are the notes of the most recent Orbital project team meeting (31 January 2013).

Present: Nick Jackson, Harry Newton, Paul Stainthorp, Joss Winn.

The project team discussed the following development tasks. The aim is for the following to be completed by the end of February 2013:

  • Demonstrable AMS-CKAN-EPrints workflow in Orbital Bridge (a minimal but operational RDM infrastructure);
  • Researcher dashboard to include projects and project metadata;
  • Users able to display and create datasets in CKAN from within Orbital Bridge (N.B. need to check changes to CKAN APIs between versions);
  • Demonstrator using the DataCite test API (until a budget is agreed for use of the live DataCite service);
  • Ability to publish dataset metadata to EPrints Repository, with a complete ‘publish’ UI in Orbital Bridge (to be tested on the University’s upgraded EPrints 3.3 Repository in March) – questions over versioning/locking of deposited metadata to be resolved;
  • Researcher dashboard to include analytics from EPrints, CKAN, AMS, and bibliometric/citation services – add links to external profiles (Scopus, WoS, ORCID, Google Scholar) in the first instance. ACTION: JW to contact Planning to discuss reporting from the researcher dashboard (also data.lincoln.ac.uk; bibliometrics).

JW presented the Orbital business case to the University Senior Management Team on 14th January 2013. JW to work with the Dean of Research (Lisa Mooney) / Deputy V-C (Ieuan Owen) to discuss ongoing resourcing for RDM.

ICT are undertaking a major cloud scoping study, including RDM storage requirements.

The draft RDM policy is to be presented to the Research & Enterprise committee in April.

NJ, HN and PS are working on the display of RDM training and documentation in Orbital Bridge, with versioned text stored as Markdown in GitHub. Pages in Orbital can be linked to GitHub.

The next RDM training for postgraduate students will take place on 6th March 2013. ACTION: PS to embed a calendar feed of training events on the Orbital website.

Upcoming events:

Orbital training and documentation

I’ve been quiet—too quiet—about the Orbital project recently. While I’ve not been blogging, Joss, Nick and Harry have overseen several fairly important developments:

As Orbital-the-product (coherent set of products, really) develops, my own focus between now and the end of the project (March 2013) will be on Orbital-the-service: training, support, documentation, and implementation of RDM policy at the University of Lincoln. I’ll work closely with the Research & Enterprise department on these aspects.

Four-level hierarchy of documentation

As part of this strand of the project (which cuts across workpackages 7, 11, and 12), I want to consider the following:

  1. The current usability of ownCloud, CKAN, EPrints, etc. – what ‘sticking plaster’ help materials do we need to provide right now (if any?).
  2. How the production of documentation fits into the software development release cycle (“change management”?) – particularly so in an agile/iterative environment, and how we ensure we meet our responsibility to ‘leave no feature undocumented’ as well as provide adequate contextual information on RDM. Related: I’m thinking about a four-level hierarchy of documentation (see right): how do the different levels relate to each other (how do we ensure internal consistency?), and how do we ensure all four levels are covered?
  3. [How] should we contribute to an (OKFN-co-ordinated) open research [data] handbook initiative (c.f. the Open Data Handbook; Data Journalism Handbook) instead of—or as well as—writing our own operational help guides? Contributing to and re-consuming community-written RDM materials will be more efficient than writing our own guidebook from scratch, but we need to make sure our local documentation is relevant to Lincoln.
  4. I’ve already started collating a list of other people’s RDM help materials (Joss has collected many more) – I’ll publish the list to this blog soon. I’ll be looking to see what we can re-use. There are some very good, openly-licensed training materials available, but I don’t want us to use them uncritically.
  5. How do we use our (still not-yet-accepted) RDM policy as a jumping-off point for training events?
  6. What did we learn from our recent(ish) Data Asset Framework exercise? How can we use researchers’ priorities as identified in the DAF to inform training? Should we re-run the exercise and/or follow it up with more detailed discussions?
  7. It is possible/likely that we will shortly have a new member of staff to work with the Lincoln Repository and the University’s REF submission. What responsibility might that person have for RDM training and support?

Next I need to organise a meeting with the Research & Enterprise department to plan our ‘version 0.1’ training programme, possibly consisting of (i) a discussion of the issues raised in our DAF survey and people’s current RDM practice, (ii) a discussion of the RDM policy, and (iii) presentation of the various VRE tools available (CKAN, ownCloud, EPrints, DataCite, DMPOnline). We’ll probably pilot this on a group of willing PhD students in the School of Engineering.

Orbital notes, 24 May 2012

The Orbital project team met today (24 May 2012) and agreed the following:

  • Documentation
    • User documentation will focus on the “why”s of Research Data Management, rather than being a point-and-click guide to the Orbital UI (which should not require detailed explanations).
    • JW will create a changelog (human-readable text file) for each major release of Orbital, so that the documentation for each feature is reviewed if that feature is updated.
    • PS will lead on writing documentation (as HTML pages, stored in the GitHub repository), with documentation for release v0.N completed and available by the launch of v0.N+1.
    • PS will email colleagues from the Library and Research/Enterprise for assistance on writing documentation.
  • Training
    • JW will invite Melanie Bullock and David Sheppard on to the Orbital working group. JW is meeting Annalisa Jones to discuss RDM training for staff.
  • Releases/development
    • Orbital v0.1.1 (including bug fixes) met all of the initial ‘minimum viable product’ requirements specified by Dr Tom Duckett, and also includes the basics of project administration.
    • v0.2 will include improvements to the file upload/management, project management, and license management interfaces, as well as a clearer distinction between language files and operating code.
    • NJ demoed the current version of Orbital to Siemens staff. He now has access to Siemens machine data for testing within Orbital.
    • The group discussed the LNCD plans for internal servers/private cloud, and the associated disk space requirements and costs.
  • Integration
    • The current version of the DMPOnline tool has been installed on a test server. The group discussed our approach to integration between external tools/software (such as DMPOnline, R, Gephi) and Orbital.
    • NJ is going to email Adrian Richardson at the DCC to ask when the DMPOnline APIs will become available.
  • RDM policy
    • JW presented the draft policy to the University RIEC committee. The committee have been asked to send comments to Joss. (One comment at the committee meeting was that a policy too geared around the requirements of the Research Councils may not be appropriate for Lincoln, which generates a lot of non-RC income. However, it was noted that the good practice specified by the RCs is good practice for management of all research data, whatever the funding source.)
  • Conferences and meetings
    • The group discussed the recent DAF survey which we conducted at the University of Lincoln.
    • JW will convene a sub-group to consider the responses in detail, and plan follow-up interviews.
  • Business case
    • JW is currently gathering costs for long-term data storage. This will form the first strand of the Orbital business case, which will be presented to University SMT (along with the agreed RDM policy) in September 2012.

Jenkins, build my software!

Orbital is going to be a big bit of software, with lots of things doing lots of other things. A big part of putting together such a large bit of software – alongside our Pivotal Tracker instance – is the regular process of ‘building’ the software from source code into something that can actually be used, testing it and getting it onto our development servers so that we can actually see what it’s doing. As part of Orbital we’re taking a step into what is a relatively unexplored frontier for the development team here at Lincoln – Continuous Integration.

Continuous Integration means that as we develop our software it’s constantly being built, tested and deployed to make sure that it’s behaving as expected. We’re using the popular Jenkins server to manage everything that’s going on as part of this process, as well as provide reports on what’s happened. We’re slowly adding more things to the list of what’s actually happening when the magic starts, but here’s what we’re going to be doing by the end of the project every single time that somebody makes a change to our codebase:

  • Ensure that the source code is available from GitHub.
  • Invoke Phing to do all kinds of additional goodness as part of an automated build, including:
    • Run unit tests on our code using PHPUnit.
    • Verify that the code adheres to certain style standards (we use the CodeIgniter Style Guide) using PHP_CodeSniffer. Specifically, we’re using Thomas Ernest’s implementation of the guide.
    • Run a whole battery of analysis that looks for messy code structure and duplicate code.
    • Automatically build the technical documentation using DocBlox. This isn’t the end-user documentation, but it does tell us exactly what all our code is supposed to be doing so that we have a reference.
    • Perform token replacement on the resultant codebase. This means that we can keep the code repository clear of all environment and institution specific configuration, since these are replaced as we perform a build.
  • Deploy the built codebase to our development and testing platform so that we can actually use it.
  • Tell us the results of all of the above in a variety of pretty graphs and reports.
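Of the steps above, token replacement is probably the least familiar, so here is a rough sketch of the idea. It is written in Python purely for illustration and is not the Phing configuration we use: the `@@NAME@@` token syntax, the function names and the example values are all assumptions.

```python
import re
from pathlib import Path

# Hypothetical token syntax: @@NAME@@ (upper-case letters, digits, underscores).
TOKEN_PATTERN = re.compile(r"@@([A-Z0-9_]+)@@")


def replace_tokens(text, values):
    """Substitute every @@NAME@@ token; fail loudly on an unknown name."""
    def lookup(match):
        name = match.group(1)
        if name not in values:
            raise KeyError(f"no value configured for token {name}")
        return values[name]

    return TOKEN_PATTERN.sub(lookup, text)


def build_tree(source_dir, target_dir, values, suffixes=(".php",)):
    """Copy source_dir into target_dir, filling tokens in matching files."""
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    for src in source_dir.rglob("*"):
        if not src.is_file():
            continue
        dest = target_dir / src.relative_to(source_dir)
        dest.parent.mkdir(parents=True, exist_ok=True)
        if src.suffix in suffixes:
            filled = replace_tokens(src.read_text(encoding="utf-8"), values)
            dest.write_text(filled, encoding="utf-8")
        else:
            # Other files are copied byte for byte, untouched.
            dest.write_bytes(src.read_bytes())
```

A build for a development server might call `build_tree("src", "build", {"BASE_URL": "https://dev.example.org/"})`, with a different set of values for each environment. Raising an error on an unknown token, rather than leaving it in place, means a missing setting breaks the build instead of reaching a server.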
