1. The Documents
SWORD: Facilitating Deposit Scenarios
http://dlib.org/dlib/january12/lewis/01lewis.html
A review of deposit use cases found during work around SWORD and SONEX in 2012. Each use case is supported by a number of real-world
examples.
Data Deposit Scenarios
http://swordapp.org/2012/07/data-deposit-scenarios/
A review of research data deposit scenarios, documented during work around SWORD and SONEX in 2012. We identified data types, sources,
and target repositories and documented them. The document lists some real-world examples of data deposit. It then goes on to list
generalised requirements for data deposit and carries out a gap analysis on SWORDv2.
SWORD Community Development Document
https://docs.google.com/document/d/1Rh80CbH3F7P8pqK4CqyEMpi-efDclETRMNqyqPO71Z0/edit
A document circulated by Jisc in 2016 to gather input on case studies, implementations, requirements and sustainability options. This
document was written piecemeal by a number of contributors working with SWORD, so in some cases it is direct input from the implementer base.
SWORD Statement of Requirements
https://docs.google.com/document/d/1fajFcmFL4jRw4ym_pQTIyec8tqy8NKeLFhemdfqRRts/edit
A short document pulled together from various notes gathered by Jisc on SWORD, going into 2017.
2. Working Principles
From those documents we devised a short set of working principles that are not requirements or usage patterns, but which would be worth
keeping in mind as the project progressed.
- The more optional features, the harder true interoperability
- The simpler the better - aim to remove any unused features from SWORDv2
- Research data support is key, though not at the expense of existing features
- Make it easy for the community to engage and developers to pick up
- Make it easy to maintain and extend
- Be clear about the distinction between protocol and implementation
- A single document, as simple as possible, describing the protocol
- Pay attention to anti-patterns: only one file, only one metadata schema, etc.
- Prioritise current, validated and pressing use cases
- Make it easy to relate implementations to the parts of the protocol
- Minimise the effort to implement against a repository (as few special features as possible)
3. Usage Patterns
A usage pattern is our term for a single unit of functionality that we wanted to support.
It is smaller than a use case and larger than a user story.
All usage patterns were derived from analysis of the source documents, and from implementation experience with SWORDv2.
Full set here (41 in total)
Some examples below.
- Research data deposit
  - Researchers should be able to easily deposit data for publication, discovery, safe storage, long-term archiving and preservation
- Transmission of data meeting metadata standards
  - The protocol should support the transfer of well-understood data formats and profiles such as PCDM, METS, RIOXX, etc.
- Automated machine-to-machine deposit
  - Autonomous systems should be able to communicate with each other as needed via the protocol
- Man-in-the-middle broker
  - Deposits should be possible via a brokerage service or other intermediary, which stands between the depositor and the target archive(s)
- Real-time file-storage
  - The protocol should offer the facilities to enable the repository to behave like a real-time file store for user-facing systems
- Monitor workflow progress
  - Be able to track the state of an item as it is in the repository - whether it is in a workflow, in the archive, or if other actions have happened to it
- Arbitrarily large files
  - Deposited files may be very large
- Send files by reference
  - Send one or more links to files to be ingested and attached to an item. Some links may not need to be ingested; a reference may just need to be created
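As an illustration of the "Send files by reference" pattern, the sketch below models a deposit made up of links, where some links are to be fetched and stored by the repository and others are only to be recorded as references. The FileReference class, its ingest flag, the partition_references function and the example.org URLs are all hypothetical names chosen for this example; they are not taken from SWORDv2 or from any SWORD specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FileReference:
    """A link sent in place of file content (hypothetical structure)."""
    url: str
    ingest: bool  # True: the repository should download and store the file


def partition_references(
    refs: List[FileReference],
) -> Tuple[List[str], List[str]]:
    """Split links into those to download and those to record as references only."""
    to_ingest = [r.url for r in refs if r.ingest]
    link_only = [r.url for r in refs if not r.ingest]
    return to_ingest, link_only


# Example deposit: one file to be ingested, one to be linked only.
refs = [
    FileReference("https://example.org/data/survey.csv", ingest=True),
    FileReference("https://example.org/data/large-archive.tar", ingest=False),
]
to_ingest, link_only = partition_references(refs)
```

Keeping the ingest decision on each link, rather than on the deposit as a whole, is one way to let a single deposit mix both kinds of reference, as the usage pattern describes.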
4. Requirements
The requirements are clean single statements about the capabilities that the specification must have.
Full set here (66 in total)
Some examples below.
- Deposit must support arbitrarily large files
- Metadata-only and Content + Metadata deposits should support one or more metadata formats/profiles
- Authentication for any action must not require human intervention on any individual request
- Be able to retrieve information about an item's workflow state in the repository
- All information should be machine-readable: use of markup (e.g. XML), and URIs for status information
- Be able to determine when an item which was previously deposited is no longer available in the repository
- Be able to retrieve the latest version of a file which has been replaced one or more times
- Be able to deposit content which is not yet ready to archive or put into an ingest workflow
- Allow files to be deposited in segments which are re-assembled on the server-side
- Be able to provide one or more files (individual files or packages) as URI references for the repository to download as part of the deposit process.
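The segmented deposit requirement above can be sketched as follows: the client splits a large file into numbered segments, and the server stores them in any order and re-assembles them once all have arrived. This is a minimal in-memory sketch under stated assumptions, not the protocol's actual mechanism. The names split_into_segments and SegmentedUpload, the 1-based segment numbering and the SHA-256 integrity check are all assumptions made for illustration.

```python
import hashlib
from typing import Dict, Iterator, Tuple


def split_into_segments(data: bytes, segment_size: int) -> Iterator[Tuple[int, bytes]]:
    """Client side: yield (segment_number, chunk) pairs, numbered from 1."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    for number, start in enumerate(range(0, len(data), segment_size), start=1):
        yield number, data[start:start + segment_size]


class SegmentedUpload:
    """Server side: collect segments in any order, then re-assemble them."""

    def __init__(self, expected_segments: int, expected_sha256: str) -> None:
        self.expected_segments = expected_segments
        self.expected_sha256 = expected_sha256
        self._segments: Dict[int, bytes] = {}

    def receive(self, number: int, chunk: bytes) -> None:
        if not 1 <= number <= self.expected_segments:
            raise ValueError(f"segment number {number} out of range")
        # Re-sending a segment simply overwrites the earlier copy.
        self._segments[number] = chunk

    def is_complete(self) -> bool:
        return len(self._segments) == self.expected_segments

    def assemble(self) -> bytes:
        if not self.is_complete():
            raise ValueError("cannot assemble: segments are missing")
        data = b"".join(
            self._segments[n] for n in range(1, self.expected_segments + 1)
        )
        # Verify the re-assembled file against the digest declared up front.
        if hashlib.sha256(data).hexdigest() != self.expected_sha256:
            raise ValueError("checksum mismatch after re-assembly")
        return data


# Usage: split a payload, deliver the segments out of order, re-assemble.
payload = b"research data " * 1000
segments = list(split_into_segments(payload, segment_size=4096))
upload = SegmentedUpload(len(segments), hashlib.sha256(payload).hexdigest())
for number, chunk in reversed(segments):
    upload.receive(number, chunk)
assembled = upload.assemble()
```

Numbering the segments explicitly lets the server accept them out of order or retried, which matters for very large files sent over unreliable connections.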