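There are four major new features in SWORDv3: Concurrency Control, support for arbitrary metadata formats, Segmented File Upload and By-Reference Deposit.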
Servers MAY implement Concurrency Control, to prevent clients from unintentionally overwriting data.
The server provides the ETag header on every response, which contains a unique version number for the Object.
The client must then provide the If-Match header with every request to change data, which reflects the latest ETag.
Objects may change for a number of reasons after their initial creation.
Servers are not required to support Concurrency Control.
Clients MUST check response headers for the presence of an ETag. Presence of the ETag indicates that the server requires the client to pay attention to its concurrency control procedures, and to carry out later requests with an If-Match header.
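The ETag and If-Match exchange described above can be sketched from the client's side. This is a minimal illustration rather than anything the specification defines: the function name `conditional_headers` is hypothetical, and it only builds the header set for the next modifying request without sending it.

```python
from typing import Dict, Mapping, Optional


def conditional_headers(
    last_response_headers: Mapping[str, str],
    extra: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Build headers for a POST/PUT/DELETE, adding If-Match when an ETag was seen."""
    headers = dict(extra or {})
    # HTTP header names are case-insensitive, so compare in lower case.
    etag = next(
        (value for name, value in last_response_headers.items()
         if name.lower() == "etag"),
        None,
    )
    if etag is not None:
        # The server applies concurrency control: echo the ETag back.
        headers["If-Match"] = etag
    return headers
```

If the previous response carried no ETag, the function returns the extra headers unchanged, which matches the case where the server does not support Concurrency Control.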
Where a server implements Concurrency Control:
An ETag MUST be provided for each SWORD resource: the Object, the Metadata, the FileSet and any Files.
The client must provide the ETag that it expects to represent the current version with every request to change the resource (POST, PUT, DELETE) by placing it in the If-Match header.
If the ETag supplied by the client in the If-Match header does not match the current ETag for the resource, the deposit will fail.
When a resource changes, its ETag MUST change.
When a resource changes, the ETags of all resources within which it is contained MUST change.
SWORD allows the client to deposit arbitrary metadata onto the server through agnostic support for metadata formats.
The server can list Metadata formats that it will accept in the acceptMetadata field of the Service Document.
If no acceptMetadata field is present, the client MUST assume the server only supports the default SWORD metadata format (http://purl.org/net/sword/3.0/types/Metadata).
{
    "acceptMetadata": [
        "http://purl.org/net/sword/3.0/types/Metadata"
    ]
}
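A client might resolve the accepted formats along the following lines. This is a sketch only: the function name `accepted_metadata_formats` is hypothetical, and the Service Document is assumed to have already been parsed from JSON into a dictionary.

```python
from typing import Any, List, Mapping

DEFAULT_METADATA_FORMAT = "http://purl.org/net/sword/3.0/types/Metadata"


def accepted_metadata_formats(service_document: Mapping[str, Any]) -> List[str]:
    """Return the metadata format identifiers the client may send."""
    formats = service_document.get("acceptMetadata")
    if formats is None:
        # No acceptMetadata field: assume only the default SWORD format.
        return [DEFAULT_METADATA_FORMAT]
    return list(formats)
```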
During deposit, the client SHOULD specify a Metadata-Format header which contains the identifier for the format. For example, if supplying the default SWORD metadata format:
Metadata-Format: http://purl.org/net/sword/3.0/types/Metadata
POST /Service-URL HTTP/1.1
Authorization: ...
Content-Disposition: ...
Content-Type: application/json
Digest: ...
Metadata-Format: http://purl.org/net/sword/3.0/types/Metadata

[Metadata Document]

HTTP/1.1 201
Content-Type: application/json

[Resource created, responds with Status Document]
SWORD provides a default metadata format which MUST be supported by the server.
It is serialised as JSON with a JSON-LD @context.
It contains dc and dcterms vocabulary elements, and any other arbitrary elements added by the client.
It does not pre-suppose any particular profile of usage of these vocabulary elements.
{
    "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",
    "@id" : "http://example.com/object/1/metadata",
    "@type" : "Metadata",
    "dc:title" : "The title",
    "dcterms:abstract" : "This is my abstract",
    "dc:contributor" : "A.N. Other"
}
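Putting the default format and the deposit headers together, a client could prepare a metadata deposit roughly as follows. This is a hedged sketch: `build_metadata_deposit` is a hypothetical helper, the `@id` is omitted on the assumption that the server assigns it, and the hex-encoded SHA-256 Digest simply mirrors the shape of the example values in this document rather than a confirmed encoding rule.

```python
import hashlib
import json
from typing import Dict, Tuple

SWORD_CONTEXT = "https://swordapp.github.io/swordv3/swordv3.jsonld"
DEFAULT_METADATA_FORMAT = "http://purl.org/net/sword/3.0/types/Metadata"


def build_metadata_deposit(fields: Dict[str, str]) -> Tuple[Dict[str, str], bytes]:
    """Return (headers, body) for depositing default-format SWORD metadata."""
    document = {"@context": SWORD_CONTEXT, "@type": "Metadata"}
    document.update(fields)  # e.g. dc: and dcterms: elements
    body = json.dumps(document).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Content-Disposition": "attachment; metadata=true",
        "Metadata-Format": DEFAULT_METADATA_FORMAT,
        # Assumption: hex digest, matching the look of the examples here.
        "Digest": "SHA-256=" + hashlib.sha256(body).hexdigest(),
    }
    return headers, body


headers, body = build_metadata_deposit({
    "dc:title": "The title",
    "dcterms:abstract": "This is my abstract",
})
```

The returned headers and body can be handed to whichever HTTP client is in use; Authorization is left out because it depends on the server.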
A client can supply a different format, such as MODS, by setting the Metadata-Format header accordingly:
POST /Service-URL HTTP/1.1
Content-Type: application/xml
Content-Disposition: attachment; metadata=true
Digest: SHA-256=74b2851bd2760785b0987ba219debea69c228353f7ccc67a2bdcd9819f97fc71
Metadata-Format: http://www.loc.gov/mods/v3

<mods xmlns="http://www.loc.gov/mods/v3">
    <originInfo>
        <place>
            <placeTerm type="code" authority="marccountry">nyu</placeTerm>
            <placeTerm type="text">Ithaca, NY</placeTerm>
        </place>
        <publisher>Cornell University Press</publisher>
        <copyrightDate>1999</copyrightDate>
    </originInfo>
</mods>
If a client has a very large file that it wishes to transfer to the server by value, then it may be beneficial to do this in several small operations, rather than as a single large operation.
In order to transfer a large file, the client can break it down into a number of equally sized segments of binary data (the final segment may be a different size to the rest). It can then initialise a Segmented File Upload with the server, and then transfer the segments. The server will reconstitute these segments into a single file, and then the client may deposit this file by-reference.
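Breaking a file into equally sized segments can be sketched as below. The generator name `iter_segments` is hypothetical, and the sketch assumes a buffered binary stream (such as one returned by `open(path, "rb")`), where `read(n)` returns fewer than `n` bytes only at the end of the file.

```python
import io
from typing import BinaryIO, Iterator


def iter_segments(stream: BinaryIO, segment_size: int) -> Iterator[bytes]:
    """Yield consecutive segments; only the final one may be shorter."""
    if segment_size <= 0:
        raise ValueError("segment_size must be positive")
    while True:
        chunk = stream.read(segment_size)
        if not chunk:
            break  # end of file reached
        yield chunk


# Ten bytes in segments of four: two full segments and a shorter final one.
segments = list(iter_segments(io.BytesIO(b"abcdefghij"), 4))
```

Reading lazily keeps only one segment in memory at a time, which matters for the very large files this feature targets.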
Servers MAY support Segmented File Upload. To do so, a server must provide a staging area where file segments can be uploaded prior to the client requesting a specific deposit operation. It advertises this in the Service Document:
{
    "maxAssembledSize": 30000000000000,
    "maxSegmentSize": 16777216000,
    "maxSegments": 1000,
    "minSegmentSize": 1,
    "staging": "http://example.com/staging",
    "stagingMaxIdle": 3600
}
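Before initialising an upload, a client may want to check its plan against these limits. The following is a sketch under stated assumptions: `plan_segments` is a hypothetical helper, a missing limit is treated as "no limit", and the final (possibly shorter) segment is not checked against `minSegmentSize`, since this document does not say how that case is handled.

```python
from typing import Any, Mapping


def plan_segments(file_size: int, segment_size: int,
                  service_document: Mapping[str, Any]) -> int:
    """Return the number of segments needed, or raise ValueError on a limit breach."""
    if file_size <= 0 or segment_size <= 0:
        raise ValueError("file_size and segment_size must be positive")
    # Integer ceiling division avoids float rounding on very large files.
    count = (file_size + segment_size - 1) // segment_size

    max_assembled = service_document.get("maxAssembledSize")
    if max_assembled is not None and file_size > max_assembled:
        raise ValueError("file exceeds maxAssembledSize")
    max_segment = service_document.get("maxSegmentSize")
    if max_segment is not None and segment_size > max_segment:
        raise ValueError("segment exceeds maxSegmentSize")
    min_segment = service_document.get("minSegmentSize")
    if min_segment is not None and segment_size < min_segment:
        raise ValueError("segment is below minSegmentSize")
    max_segments = service_document.get("maxSegments")
    if max_segments is not None and count > max_segments:
        raise ValueError("too many segments for maxSegments")
    return count
```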
Obtain the Staging-URL from the Service Document, from which to request a Temporary-URL.
Request a Temporary-URL from the Service, via a Segmented Upload Initialisation request.
Upload all the file segments to the Temporary-URL.
Carry out the desired deposit operation as a By-Reference deposit, using the Temporary-URL as the by-reference file.
POST /Staging-URL HTTP/1.1

HTTP/1.1 201

[Temporary-URL created]

POST /Temporary-URL HTTP/1.1
Authorization: ...
Content-Disposition: ...
Content-Length: ...
Digest: ...

[Segment to be added to the Resource.]

HTTP/1.1 204

[Segment Received]
At any point after creating a Temporary-URL, the client may request information on the state of their Segmented File Upload. This can be done via a GET to the Temporary-URL.
{
    "@context": "https://swordapp.github.io/swordv3/swordv3.jsonld",
    "@id": "http://example.com/temporary/1",
    "@type": "Temporary",
    "received": [
        1,
        2,
        4
    ],
    "expecting": [
        3,
        5
    ],
    "assembledSize": 10000000,
    "segmentSize": 2000000
}
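A client can use this response to decide which segments still need to be sent. The helpers below are hypothetical names, and treating an empty `expecting` list as "upload complete" is an assumption drawn from the field names rather than a stated rule.

```python
from typing import Any, List, Mapping


def missing_segments(status: Mapping[str, Any]) -> List[int]:
    """Return the segment numbers the server is still expecting, in order."""
    return sorted(status.get("expecting", []))


def is_upload_complete(status: Mapping[str, Any]) -> bool:
    """Assumption: nothing left in 'expecting' means every segment arrived."""
    return not status.get("expecting")


status = {
    "received": [1, 2, 4],
    "expecting": [3, 5],
    "assembledSize": 10000000,
    "segmentSize": 2000000,
}
to_send = missing_segments(status)
```

With the example response above, `to_send` holds segments 3 and 5, so the client would upload those before attempting the By-Reference deposit.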
By-Reference Deposit is when the client provides the server with URLs for Files which it would like the server to retrieve asynchronously.
This could be useful in a number of contexts, such as when the files are very large, and are stored on specialist staging hardware, or where the files are already readily available elsewhere.
Servers MAY support By-Reference deposit. If a server supports By-Reference it SHOULD indicate this in the Service Document using the field byReferenceDeposit:
{
    "byReferenceDeposit": true
}
Clients may use a By-Reference Deposit anywhere a by-value deposit could be carried out. Instead of sending any Binary content, the client sends the By-Reference Document containing one or more (depending on context) URLs to files which the server can retrieve.
{
    "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",
    "@type" : "ByReference",
    "byReferenceFiles" : [
        {
            "@id" : "http://www.otherorg.ac.uk/by-reference/file.zip",
            "contentType" : "application/zip",
            "contentLength" : 123456,
            "contentDisposition" : "attachment; filename=file.zip",
            "packaging" : "http://purl.org/net/sword/packaging/SimpleZip",
            "digest" : "SHA256=....",
            "ttl" : "2018-04-16T00:00:00Z",
            "dereference" : true
        }
    ]
}
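Assembling such a document programmatically might look like the sketch below. The function name `by_reference_document` is hypothetical, and requiring only `@id` on each entry is an assumption; this document does not state which fields are mandatory.

```python
import json
from typing import Any, Dict, List

SWORD_CONTEXT = "https://swordapp.github.io/swordv3/swordv3.jsonld"


def by_reference_document(files: List[Dict[str, Any]]) -> Dict[str, Any]:
    """Wrap one or more file descriptions in a By-Reference Document."""
    for entry in files:
        if "@id" not in entry:
            # Assumption: the URL is the one field the server cannot do without.
            raise ValueError("each by-reference file needs an '@id' URL")
    return {
        "@context": SWORD_CONTEXT,
        "@type": "ByReference",
        "byReferenceFiles": [dict(entry) for entry in files],
    }


document = by_reference_document([{
    "@id": "http://www.otherorg.ac.uk/by-reference/file.zip",
    "contentType": "application/zip",
    "contentLength": 123456,
    "dereference": True,
}])
payload = json.dumps(document)  # request body for the deposit
```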
If carrying out a Segmented File Upload, the final deposit stage is to send the Temporary-URL to the server as part of a By-Reference deposit.
{
    "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",
    "@type" : "ByReference",
    "byReferenceFiles" : [
        {
            "@id" : "[Temporary-URL]",
            "contentType" : "application/zip",
            "contentLength" : 123456,
            "contentDisposition" : "attachment; filename=file.zip",
            "packaging" : "http://purl.org/net/sword/packaging/SimpleZip",
            "digest" : "SHA256=...."
        }
    ]
}
The server receives a By-Reference Document with one or more files listed and creates records for each of these files that it plans to dereference.
The server responds to the client with the appropriate response for the action.
At its own pace, the server obtains all the files that are marked for dereference.
Once the Files are downloaded and processed, the server sets the file status appropriately in the Status Document.
If there is an error in downloading or otherwise processing the file, the server sets the status to error and provides a meaningful log message.
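The server-side steps above could be sketched as follows. Everything here is illustrative: `dereference_files` and the injected `fetch` callable are hypothetical, the records are plain dictionaries standing in for whatever the server stores, and the `"ingested"` status value is an assumption (only the `error` status is named in this document).

```python
from typing import Any, Callable, Dict, List


def dereference_files(records: List[Dict[str, Any]],
                      fetch: Callable[[str], bytes]) -> None:
    """Retrieve every record marked for dereference and update its status in place."""
    for record in records:
        if not record.get("dereference"):
            continue  # not marked for retrieval, leave untouched
        url = record["@id"]
        try:
            content = fetch(url)
        except Exception as exc:
            # Record the failure with a log message the client can act on.
            record["status"] = "error"
            record["log"] = f"Could not retrieve {url}: {exc}"
        else:
            record["size"] = len(content)
            record["status"] = "ingested"  # hypothetical success status
```

Passing `fetch` in as a parameter keeps the sketch independent of any particular HTTP library and lets the retrieval run in a background worker at the server's own pace.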