This is the poster we’ll be presenting at the JISC MRD Launch Meeting in Nottingham next week.
Meeting our users, the Engineers
Paul, Nick and I had a great meeting with the two principal Engineering users last week, where we set out our broad objectives and discussed their involvement on the Orbital Project. It’s always been our intention to work with three types of user: academic staff, a commercial research partner and a PhD student. At that meeting, we met Chris Bingham, Prof. of Energy Conversion, and Stuart Watson, Head of Remote Monitoring and Diagnostics at Siemens. Later this week, we’ll be meeting Dr Wing-Kuen Ling, Reader in Optimisation and Symbolic Dynamics, and one of his PhD students, who are also interested in contributing to the Orbital project.
At last week’s meeting we discussed Chris and Stuart’s current practice of working on sensor data from Siemens’ turbines, which involves a combination of physically secured machines, secure web services and sanitised data. As is common practice when working with commercial partners, the resulting research papers go through an approval process with the commercial partner prior to being submitted for publication, and data is routinely abstracted so that confidential and commercially sensitive data isn’t made public.
We discussed how these current practices might be improved over and above the ‘baseline’ method used now. Chris and Stuart both felt that improvements could be made around physical access to the data (possibly PKI card integration) and a system that does not encourage copies being made of the data. There should be no need for Engineers to take data away with them; rather, it should always be available from a single data store. We also discussed the use of the Cloud for storing data, and both Stuart and Chris acknowledged that attitudes towards Cloud Computing were changing and that it’s worth considering.
Their measure of success for our research data infrastructure is whether it increases productivity and overcomes some of the barriers to access and sharing of the data that currently exist. They also expressed an interest in how the infrastructure can help manage related artefacts, such as presentations and research papers. Ideally, they want something that helps manage all aspects of their research environment rather than fragmenting it into disparate systems.
Actions from the meeting were to introduce the project at the next all-staff meeting of the School of Engineering (done), arrange to meet the Developer of Siemens’ in-house software and, as mentioned above, speak to Dr Ling about his involvement as a user on the project, recognising that his area of research is different to Chris and Stuart’s and presents us with a different type of data and workflow. Finally, we also agreed to invite Dr. Mansur Darlington of the ERIM project to hold an extended meeting in late January with Engineering staff to discuss the outcomes of the ERIM project.
Notes from RDMF7 workshop
I’ve been at the University of Warwick today, for a workshop organised by the Digital Curation Centre (DCC), entitled RDMF7: Incentivising Data Management & Sharing. There appeared to be a wide range of attendees, from data curators & data scientists, ICT/database folk, actual researchers and academics, as well as at least one fellow library/repository rat.
Unfortunately I was only able to attend part of the event (which ran over two days). The following notes have been reconstructed from the Twitter stream (hashtag #RDMF7)!
The first speaker I heard was Ben Ryan of the funding council, the EPSRC. He talked about the “long-established” principles of responsible data management [links below]… this may be my own interpretation of Ben’s presentation, but I don’t think I was imagining undertones of “…so there’s really no excuse!”. He also covered individual and institutional motivations for taking care of data [much more about which later], policy and the enforcement of policy, dataset discoverability/metadata, funding (including the EPSRC’s expectation that institutions will make room in existing budgets to meet the costs of RDM), and embargo periods (inc. researchers’ entitlement to a period of “privileged use of the data they have collected, to enable them to publish” first – important to stress this in order to allay fears/get researchers on board?).
Some links:
- UKRIO (UK Research Integrity Office) Code of Practice for Research
- RCUK (Research Councils UK) Common Principles on Data Policy
- EPSRC Policy Framework on Research Data
Next up was Miggie Pickton, ‘queen bee’ of the University of Northampton’s repository (and self-described RDM “novice”, indeed!), talking about their participation in the multi-institution, JISC-funded KeepIt project, which aimed to design “not one repository but many that, viewed as a whole, represent all the content types that an institutional repository might present (research papers, science data, arts, teaching materials and theses).” This work led almost by chance to Northampton’s undertaking of a university-wide audit of its research data management processes using the DCC’s Data Asset Framework (DAF) methodology. This helped them to make the case for an institutional research data management working group and [eventually, and not without resistance] to establish a mandatory, central policy for RDM. (Show of hands at this point: how many other institutions have completed a DAF? I counted perhaps only three, Lincoln certainly not being amongst them. Q. Should the University of Lincoln complete a Data Asset Framework exercise as part of the Orbital project?)
After coffee, we heard a third presentation from Neil Beagrie of (management consultancy partnership) Charles Beagrie Ltd. Neil delivered a very comprehensive explanation of the KRDS (“Keeping Research Data Safe”) project, which has developed both an activity model and a benefits analysis toolkit for the management and preservation-of-access to ‘long-lived data’. I have to come clean here and admit that I was a little bewildered by the detail: much of it went through both ears without sticking to the brain on the way through. I need to go back over the tweets more carefully and have a look at the KRDS toolkit and reports at: beagrie.com/krds.php
The morning’s presentations over, we split into three groups for breakout discussion.
I attached myself to the second of the three groups, led by (JISC programme manager for Orbital) Simon Hodson; our job was to consider the question: “What really are the sticks and carrots that will make a long-term difference to the pursuit of structured data management processes?” After spending some time picking apart the terminology, and what each of the various ‘processes’ might include, we had a wide-ranging (and allocated-time-overrunning) discussion about the things that genuinely motivate scientists, universities, and funding councils(!) to care about RDM; about some of the problems caused by the complexity and inconsistency of metadata for datasets; also about the issue of citations/digital object identifiers for data (how those citations might be treated by publishers and citation data services) and how that relates to any notions of ‘peer review’ in experimental data.
As requested, our group came up with three actions which we believe will help address the question of motivation:
- Data citation – publishers should consistently include e.g. DOIs for datasets in final published articles, so that citations of the data can be measured.
- Measurement of RDM “maturity” – departments and whole institutions should adopt a standardised quality mark for research data management, to give [potential] researchers, funding bodies, and the public confidence in their ability to handle data appropriately.
- Discovery – the research councils (probably) should push for common metadata standards for describing datasets and underlying data-generating research/experimental processes.
Lunch followed, and I had time to hear two more presentations in the afternoon before I had to run for a bus:
Catherine Moyes of the Malaria Atlas Project: in effect, demonstrating what really clear and consistent management of large-scale (geo)data looks like. This seems to consist of an extremely rigorous approach to requesting, tracking, and licensing data from the contributors of the project’s data… and an equally strict (but in a good way) expectation of clarity when dealing with requests from third parties to use the data. If that all comes across as restrictive, I’d point to Catherine’s slide on ‘legalities’ of the data that the Malaria Atlas Project has released openly – it’s about as open as it gets, with no registration needed, no terms & conditions placed on re-use of the published data, and all software/artefacts released under very permissive and free licences (Creative Commons or GNU). N.B. the Orbital project should look at the Malaria Atlas Project’s “data explorer”, available via map.ox.ac.uk, as an example of a really nifty set of applications built on top of openly accessible and re-usable data.
Finally (and I’m sorry I only got to hear part of his presentation), University of So’ton chemistry professor Jeremy Frey on their IDMB (Institutional Data Management Blueprint) Project—southamptondata.org—and some rather funny anecdotes about the underlying knowledge, expectations, and problems faced by researchers managing their own data, which emerged when they were surveyed as part of the above project.
Lots to take in (lots). But some useful suggestions for Orbital, which I’ll be bringing to the next project meeting: and plenty more reading material which I’ll add to the project reading list asap.
—Paul Stainthorp, lead researcher on the Orbital project.
1st Steering Group meeting. Plan for the future
We met this morning for our first Steering Group meeting of the Orbital Project. Following a discussion about the objectives of the MRD programme in general, the main agenda point was to discuss the Project Plan prior to me sending it to JISC. I will publish the Plan on this website once it has been signed off.
Questions were raised by the Steering Group specific to the research data of Engineers and the confidential and commercial nature of their work. Our School of Engineering was established through a partnership with Siemens and therefore the research undertaken by some of our researchers uses data provided under strict confidentiality agreements. The Orbital project has always been aware of this and it is one of the interesting challenges which we highlighted in our bid to JISC. It raises very important questions over ownership, authenticity, privacy and liability. Further discussions on this topic will be forthcoming.
Another point was raised by Dr. James Murray, our IP Manager, around the use of open licenses for documentation and code and whether the infrastructure we develop might have any commercial value. On a project of this size, it’s an important question and one I had given some thought to. Personally, I admire the way that the University of Southampton has created a commercial service around their open source EPrints software, which we use and subscribe to at Lincoln. I was asked if we might invite someone from EPrints Services to come to discuss their experience with the Steering Group at our next meeting in February. I was pleased that this was brought up at this early stage as developing a Business Case for Orbital is not only vital to the long-term sustainability of our work, but a required output of the project, too. Given the project team’s preference for employing and publishing open source software, I’m keen that a Business Model based on open source software be given thorough consideration. It’s very early days to be thinking about this, but such considerations do take time to work out, too.
Finally, Prof. Andrew Hunter, Head of the College of Science and our Senior User, identified other areas of our STEM research that would benefit from the work of Orbital. This is not something we need to concentrate on right now in this MRD pilot project, but it, too, is an important consideration in planning for the long-term deployment and use of Orbital.
Project Planning: Quality Assurance
I am currently completing the Orbital Project Plan prior to submission to JISC next week. The writing of a Project Plan, using JISC’s template, which I think is a derivation of PRINCE2 documentation, is undeniably a useful exercise in defining the project we’re embarking on. It is also undeniably a tedious process, as it requires a particular style of thinking and writing: granular, incremental and forward looking, yet reflective; ambitious and creative yet restrained; serious yet mostly in a dumb tabular form. I find myself having intense bursts of concentrated writing and then having to step away from the document to restore myself both physically and mentally.
I’m currently at the Quality Assurance section of the document, which is in a tabular format to aid both author and reader. However, what I really want to write is this, taken from the Agile Manifesto:
We follow these principles:

Our highest priority is to satisfy the customer
through early and continuous delivery
of valuable software.

Welcome changing requirements, even late in
development. Agile processes harness change for
the customer’s competitive advantage.

Deliver working software frequently, from a
couple of weeks to a couple of months, with a
preference to the shorter timescale.

Business people and developers must work
together daily throughout the project.

Build projects around motivated individuals.
Give them the environment and support they need,
and trust them to get the job done.

The most efficient and effective method of
conveying information to and within a development
team is face-to-face conversation.

Working software is the primary measure of progress.

Agile processes promote sustainable development.
The sponsors, developers, and users should be able
to maintain a constant pace indefinitely.

Continuous attention to technical excellence
and good design enhances agility.

Simplicity–the art of maximizing the amount
of work not done–is essential.

The best architectures, requirements, and designs
emerge from self-organizing teams.

At regular intervals, the team reflects on how
to become more effective, then tunes and adjusts
its behavior accordingly.
I think I will print this on A3 and stick it to our wall.
What attracts me most to Agile methods of software development (I lean towards XP) is the emphasis on human interaction and the focus on values such as trust, respect, simplicity, autonomy and courage. All too often when running a project, the objective of delivering the product dominates and diminishes the creative and social process of producing something that improves our environment.
For me, as Project Manager, the Orbital Project is not only an interesting Research and Development project but also an opportunity to practise a method of human sociability and creativity over a defined period of time. Although I’ve tried to use attributes of Agile methods on projects in the past, this is the first time that I’ve started a project from scratch with this as the principal method, and a project where I know the Lead Researcher and Lead Developer are likewise keen to work in this way.
This blog is a record of our project over the next 18 months or so. For my part, I’ll be reflecting honestly about the ups and downs of running the project and learning to work closely with people according to the principles quoted above. I’m sure we will fail at times and the process will get lost to the product, but we will learn, even during those times. And gradually, we’ll get better and produce better.