Version: SWORD 3.0

Last modified: 2021-09-01 09:54

See also: SWORD 3.0 Behaviours, which provides a denormalised view of the specification's protocol operations and is especially useful for implementers.

1. Credits

Technical Lead: Richard Jones, Cottage Labs

Community Lead: Neil Jefferies, University of Oxford

Funded By: NII, Jisc, EBSCO

Funder Liaisons: Masaharu Hayashi, NII; Dom Fripp, Jisc; Christopher Spalding, EBSCO

Technical Advisory Group: Adam Rehin, Adrian Stevenson, Alan Stiles, Alex Dutton, Catherine Jones, Claire Knowles, David Moles, David Wilcox, Eoghan Ó Carragáin, Erick Peirson, Gertjan Filarski, Goosyara Kovbasniy, Graham Triggs, Hideaki Takeda, Jan van Mansum, Jauco Noordzij, Jochen Schirrwagen, John Chodacki, Justin Simpson, Lars Holm Nielsen, Marisa Strong, Martin Wrigley, Masaharu Hayashi, Masud Khokhar, Mike Jackson, Morane Gruenpeter, Neil Chue Hong, Paul Walk, Peter Sefton, Ralf Claussnitzer, Ricardo Otelo Santos Saraiva Cruz, Richard Rodgers, Scott Wilson, Shannon Searle, Stephanie Taylor, Stuart Lewis, Tomasz Parkola, Vitali Peil

2. Introduction

SWORD 3.0 is a protocol enabling clients and servers to communicate around complex digital objects, especially with regard to supporting the deposit of these objects into a service like a digital repository. Complex digital objects consist of both Metadata and File content, where the Files may be in a variety of formats, there may be many files, and some may be very large. The protocol defines semantics for creating, appending, replacing, deleting, and retrieving information about these complex resources. It also enables servers to communicate regarding the status of treatment of deposited content, such as exposing ingest workflow information.

The first major version of SWORD [SWORD 1.3] built upon the Resource creation aspects of AtomPub [AtomPub] to enable fire-and-forget package deposit onto a server.

This approach, where the depositor has no further interaction with the server, is of significant value in certain use cases, but in others it is insufficient. Consider, for example, a depositor who wishes to construct a digital artifact file by file over a period of time before deciding that it is time to archive it. In these cases, a higher level of interactivity between the participating systems is required, and this is the role that SWORD 2.0 [SWORD 2.0] was subsequently developed to fulfil.

As the use cases for SWORD developed further, it became clear that the increasing size of the files that repositories were being asked to handle was an issue. As a result of this, and the fact that the technological approach for SWORD 2.0 was starting to show its age, a new version, SWORD 3.0, has been developed. This is a radical departure from SWORD 2.0, eliminating ties with AtomPub, and moving to a much stricter REST+JSON approach, utilising JSON-LD for alignment with Linked Data. Its key functional differences from SWORD 2.0 are:

Support for By-Reference file deposit
Support for Segmented file deposit
More advanced native packaging and metadata formats

3. Notational Conventions

The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in [RFC2119].

4. Terminology

4.1. URLs

File-URL: A single Binary File within the Object
FileSet-URL: The aggregate of all Binary Files associated with the Object which are available for SWORD protocol operations to be carried out on them
Metadata-URL: The Metadata resource associated with the Object
Object-URL: An Object that exists on the server, probably as a result of a deposit operation, which is a container for Metadata and zero or more Files.
Service-URL: The location of the document which describes the server's capabilities for the user, and which may accept initial deposits
Staging-URL: A URL provided by the Service where clients can initialise segmented file upload requests
Temporary-URL: A staging area where file segments can be uploaded to the server prior to a deposit operation, obtained from the Staging-URL

4.2. Document Types

Service Document: Describes the capabilities of the server with respect to the user
Metadata Document: A format for depositing and retrieving object Metadata
By-Reference Document: A format for describing one or more files to be deposited By-Reference.
Metadata+By-Reference Document: A single expression of both the Metadata and the By-Reference file deposits.
Status Document: A document describing the current status of the Object and its components
Binary File: An opaque binary file
Packaged Content: A serialisation of the entire Object, consisting of its Metadata and Binary Files.
Error Document: Describes an error that occurred while processing a request.
Segmented File Upload Document: A document describing the current status of a Segmented File Upload

4.3. Namespaces

http://purl.org/net/sword/3.0/: All SWORDv3 extensions are defined within this namespace. This Namespace also serves to identify the SWORD version for a given Service.
http://purl.org/net/sword/3.0/terms/: All predicates associated with SWORDv3
http://purl.org/net/sword/3.0/package/: All packaged formats defined by SWORDv3
http://purl.org/net/sword/3.0/error/: All error documents defined by SWORDv3
http://purl.org/net/sword/3.0/state/: Terms used to describe the state of Objects in SWORDv3
http://purl.org/net/sword/3.0/types/: Namespace for all document types used in SWORDv3
http://purl.org/net/sword/3.0/filestate/: Terms used to describe the state of Files in SWORDv3
http://purl.org/net/sword/3.0/discovery/: Terms used for auto-discovery of SWORDv3 services
http://purl.org/dc/elements/1.1/: The Simple Dublin Core elements. This document uses the prefix dc for this namespace name; for example dc:title
http://purl.org/dc/terms/: The Extended Dublin Core terms. This document uses the prefix dcterms for the namespace name; for example dcterms:abstract

5. Structure of SWORD Objects

Objects, as represented by SWORD, have the following structure:

Figure 1: Structure of a SWORD Object

All Metadata and all types of Files are contained within the Object
Some Files are expressions of the Object's metadata.
There is a single Metadata Resource which is the abstract representation of the Metadata through which SWORD operations are carried out
Some Files are expressions of the Object as Packaged Content
Some Files are considered part of the "FileSet", which means they are available for SWORD operations to be carried out on them (replaced, removed)
The Object may contain arbitrary other Files which do not fit into the above categories

The SWORD Object is expressed as JSON via the Status Document, along with all its supporting metadata and workflow information.

Each of the three primary File categories can be identified by their rel values, as they appear in the Status Document:

Metadata Expressions: http://purl.org/net/sword/3.0/terms/formattedMetadata
FileSet Files: http://purl.org/net/sword/3.0/terms/fileSetFile
Packaged Content Expressions: http://purl.org/net/sword/3.0/terms/packagedContent
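As an illustration of how a client might use these rel values, the following minimal sketch groups links by File category. It assumes, since this section does not show the Status Document layout, that links are supplied as a list of dictionaries with an "@id" and a "rel" (a string or a list of strings). The function name and the choice to match on the final path segment of each rel URI are simplifications made for this example only.

```python
from collections import defaultdict

# Final path segment of each rel URI listed above -> File category.
REL_TERMS = {
    "formattedMetadata": "Metadata Expressions",
    "fileSetFile": "FileSet Files",
    "packagedContent": "Packaged Content Expressions",
}


def categorise_links(links):
    """Group link identifiers by File category, based on their rel values.

    `links` is assumed to be a list of dicts carrying "@id" and "rel";
    that shape is an assumption of this sketch.
    """
    categories = defaultdict(list)
    for link in links:
        rels = link.get("rel", [])
        if isinstance(rels, str):
            rels = [rels]
        for rel in rels:
            # Compare only the last path segment of the rel URI.
            term = rel.rstrip("/").rsplit("/", 1)[-1]
            if term in REL_TERMS:
                categories[REL_TERMS[term]].append(link.get("@id"))
    return dict(categories)
```

Links whose rel values match none of the three terms are simply left out, which corresponds to the "arbitrary other Files" case described above.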

6. HTTP Headers

These are the HTTP headers used by SWORD, and their meanings within the context of the protocol. Where a Default Value is specified, the client or server MUST assume that value if the header is not provided explicitly in a request or response.

Header	Usage
Authorization	To pass any HTTP authorization headers, such as the content for basic auth
Content-Disposition	Used to transmit information to the server which tells it the nature of the deposit, and any associated parameters
Content-Length	Length of the content in the current payload
Content-Type	Mimetype of the content being delivered
Digest	Checksum for the deposited content. MUST include SHA-256, and allows for other formats such as MD5 and SHA (SHA-1) if still needed by the server.
ETag	Object version identifier, as provided by the server on GET requests and any requests which modify the object and return.
If-Match	Used to provide the server’s Object version identifier (ETag) for the version on which this request is intended to act. If the supplied ETag does not match, the version on the server has changed since the client’s last operation and the server MUST reject the update. The client will need to retrieve the latest ETag and re-issue the request, taking into account any changes.
In-Progress	Whether this operation is part of a larger deposit operation, and the server should expect subsequent related requests before injecting the item into any ingest workflows. Default Value: false
Location	URI for the location where the requested or deposited content can be found
On-Behalf-Of	Username of any user the action is being carried out on behalf of
Packaging	URI unambiguously identifying the packaging profile. Default Value: http://purl.org/net/sword/3.0/package/Binary
Slug	Suggested identifier for the item
Metadata-Format	URI unambiguously identifying the metadata format/schema/profile. Default Value: http://purl.org/net/sword/3.0/types/Metadata
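The Default Value rule can be sketched as a small lookup. This is a minimal illustration only: the function name is hypothetical, and the case-insensitive matching of header names reflects general HTTP practice rather than anything stated in the table.

```python
# Default Values taken from the header table above, keyed by lower-cased name.
DEFAULT_VALUES = {
    "in-progress": "false",
    "packaging": "http://purl.org/net/sword/3.0/package/Binary",
    "metadata-format": "http://purl.org/net/sword/3.0/types/Metadata",
}


def effective_header(headers, name):
    """Return the supplied header value, else its Default Value, else None."""
    wanted = name.lower()
    for key, value in headers.items():
        if key.lower() == wanted:
            return value
    return DEFAULT_VALUES.get(wanted)
```

Headers without a Default Value, such as Slug, yield None when absent.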

7. Protocol Operations

This section lists the actual on-the-wire protocol operations that are part of SWORDv3. Actual usage of each of these operations is dependent on the action that you wish to take. See Protocol Requirements for the rules which govern how to use these Protocol Operations.

The full set of protocol operations is available as an OpenAPI definition [OpenAPI], available as JSON and YAML.

7.1. Error Responses

The following error responses are possible against some or all of the HTTP Requests. In each case an Error Document MUST be returned by the server with details as to the root cause of the error.

400 (BadRequest) - The server could not understand your request. Either your headers or content body are wrong or malformed. (see Error Types: BadRequest, ByReferenceFileSizeExceeded, ContentMalformed, InvalidSegmentSize, MaxAssembledSizeExceeded, SegmentLimitExceeded, UnexpectedSegment)
401 (Unauthorized) - You have not provided authentication information, please do so (see Error Types: AuthenticationRequired)
403 (Forbidden) - You are not authorised to access this resource, or the operation you requested is not possible in this context (see Error Types: AuthenticationFailed, Forbidden)
404 (Not Found) - There is no resource available at the URL you requested
405 (Method Not Allowed) - The HTTP method you requested on the resource is not permitted/available in this context (see Error Types: MethodNotAllowed, SegmentedUploadTimedOut)
410 (Gone) - The resource existed in the past but is no longer present at the URL you requested
412 (Precondition Failed) - There is a problem implementing the request as-is. This can happen for the following reasons: your checksums may not match, you may have requested mediated deposit when the server does not support that, your headers may not be consistent with each other, your If-Match headers may not match the current ETag, or your Segmented Upload Initialisation request may not be within parameters acceptable to the server. (see Error Types: ByReferenceNotAllowed, DigestMismatch, ETagNotMatched, ETagRequired, OnBehalfOfNotAllowed)
413 (Payload Too Large) - Your request body exceeds the size allowed by the server (see Error Types: MaxUploadSizeExceeded)
415 (Unsupported Media Type) - The metadata format is not the same as that identified in Metadata-Format and/or it is not supported by the server, or the packaging format is not the same as that identified in Packaging and/or it is not supported by the server (see Error Types: ContentTypeNotAcceptable, FormatHeaderMismatch, MetadataFormatNotAcceptable, PackagingFormatNotAcceptable)
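For implementers, the pairing of Error Types with status codes listed above can be captured in a lookup table. The sketch below transcribes that list; the variable and function names are illustrative and not part of the protocol.

```python
# Error Type -> HTTP status code, transcribed from the list above.
ERROR_STATUS = {
    "BadRequest": 400,
    "ByReferenceFileSizeExceeded": 400,
    "ContentMalformed": 400,
    "InvalidSegmentSize": 400,
    "MaxAssembledSizeExceeded": 400,
    "SegmentLimitExceeded": 400,
    "UnexpectedSegment": 400,
    "AuthenticationRequired": 401,
    "AuthenticationFailed": 403,
    "Forbidden": 403,
    "MethodNotAllowed": 405,
    "SegmentedUploadTimedOut": 405,
    "ByReferenceNotAllowed": 412,
    "DigestMismatch": 412,
    "ETagNotMatched": 412,
    "ETagRequired": 412,
    "OnBehalfOfNotAllowed": 412,
    "MaxUploadSizeExceeded": 413,
    "ContentTypeNotAcceptable": 415,
    "FormatHeaderMismatch": 415,
    "MetadataFormatNotAcceptable": 415,
    "PackagingFormatNotAcceptable": 415,
}


def status_for(error_type):
    """Return the HTTP status code associated with an Error Type."""
    return ERROR_STATUS[error_type]


def error_types_for(status):
    """Return the Error Types listed for a status code, sorted by name."""
    return sorted(name for name, code in ERROR_STATUS.items() if code == status)
```

Status codes 404 and 410 have no named Error Types in the list, so error_types_for returns an empty list for them.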

7.2. Redirects

Some requests may result in redirect codes being sent to the client; the server MAY respond to any request with a suitable redirect. These are the redirect codes that are used, and what they mean:

301 (Moved Permanently) - The URL you requested has changed, re-send this request and all future requests to the new URL
307 (Temporary Redirect) - The URL you requested has temporarily changed, re-send this request to the new URL
308 (Permanent Redirect) - The URL you requested has changed, re-send this request and all future requests to the new URL
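A client might act on these codes roughly as follows. This is a sketch under the assumption that the client keeps a stored URL for each resource; the function name and return shape are hypothetical.

```python
PERMANENT_REDIRECTS = {301, 308}
TEMPORARY_REDIRECTS = {307}


def follow_redirect(status, location, known_url):
    """Return (url_for_this_request, url_to_remember) for a response.

    Permanent redirects update the URL used for future requests;
    temporary redirects only affect the request being re-sent.
    """
    if status in PERMANENT_REDIRECTS:
        return location, location
    if status in TEMPORARY_REDIRECTS:
        return location, known_url
    # Not a redirect code covered here: keep using the known URL.
    return known_url, known_url
```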

7.3. HTTP Requests

These are the HTTP requests that are covered by the SWORD protocol.

Each request MAY be responded to by the server with a redirect code (see above). Each request MAY also generate an error; possible errors are listed for each section, please refer to the section above for details on the meanings of errors.

7.3.1. GET Service-URL

Retrieve the Service Document

Headers

Authorization
On-Behalf-Of

Responses

Code	Description
200	Service Document Body application/json
401
403
404
410

7.3.2. POST Service-URL

Make a new Object

Headers

Authorization
Content-Disposition
Content-Length
Content-Type
Digest
In-Progress
Metadata-Format
On-Behalf-Of
Packaging
Slug

Body

Content used to create new Object. This can be one of: Metadata, By-Reference, Metadata+By-Reference, Binary File, Packaged Content, Empty Body

Responses

Code	Description
201	Resource created, responds with Status Document Headers Location - Object-URL ETag - version identifier Body application/json
202	Resource accepted for processing, responds with Status Document Headers Location - Object-URL ETag - version identifier Body application/json
400
401
403
404
405
412
413
415
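As a rough illustration, a client could assemble this request with the Python standard library as shown below. Several details are assumptions of the sketch rather than statements of this section: the Digest value is written as "SHA-256=" followed by a base64-encoded digest, the Content-Disposition value is passed in by the caller because its permitted values are not defined here, and the function name is hypothetical. The request is only constructed, not sent.

```python
import base64
import hashlib
from urllib.request import Request


def build_create_request(service_url, body, content_type, content_disposition,
                         in_progress=False, packaging=None, slug=None,
                         authorization=None):
    """Build (but do not send) a POST to the Service-URL for a new Object."""
    # Assumption: Digest carries a base64-encoded SHA-256 of the body.
    digest = base64.b64encode(hashlib.sha256(body).digest()).decode("ascii")
    headers = {
        "Content-Type": content_type,
        "Content-Disposition": content_disposition,
        "Content-Length": str(len(body)),
        "Digest": "SHA-256=" + digest,
        "In-Progress": "true" if in_progress else "false",
    }
    # Optional headers are only added when the caller supplies them.
    if packaging is not None:
        headers["Packaging"] = packaging
    if slug is not None:
        headers["Slug"] = slug
    if authorization is not None:
        headers["Authorization"] = authorization
    return Request(service_url, data=body, headers=headers, method="POST")
```

The resulting object could then be passed to urllib.request.urlopen, with the 201 or 202 response handled as described in the response table.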

7.3.3. GET Object-URL

Retrieve the Status information for the Object

Headers

Authorization
On-Behalf-Of

Responses

Code	Description
200	Status Document Headers ETag - version identifier Body application/json
400
401
403
404
410
412

7.3.4. POST Object-URL

Append data to an Object

Headers

Authorization
Content-Disposition
Content-Length
Content-Type
Digest
If-Match
In-Progress
On-Behalf-Of
Packaging
Metadata-Format

Body

Content to be appended to the Object. This can be one of: Metadata, By-Reference, Metadata+By-Reference, Binary File, Packaged Content, Empty Body

Responses

Code	Description
200	Content appended, responds with Status Document Headers Location - The File-URL of the Original Deposit File if present ETag - version identifier Body application/json
202	Content accepted for append, responds with Status Document Headers Location - The File-URL of the Original Deposit File if present ETag - version identifier Body application/json
400
401
403
404
405
412
413
415

7.3.5. PUT Object-URL

Replace the Object

Headers

Authorization
Content-Disposition
Content-Length
Content-Type
Digest
If-Match
In-Progress
On-Behalf-Of
Packaging
Metadata-Format

Body

Content to replace the Object. This can be one of: Metadata, By-Reference, Metadata+By-Reference, Binary File, Packaged Content, Empty Body

Responses

Code	Description
200	Replace carried out, responds with Status Document Headers ETag - version identifier Body application/json
202	Replace accepted for action, responds with Status Document Headers ETag - version identifier Body application/json
400
401
403
404
405
412
413
415

7.3.6. DELETE Object-URL

Delete the Object

Headers

Authorization
If-Match
On-Behalf-Of

Responses

Code	Description
202	Delete request accepted for processing Body None
204	Object Deleted Body None
400
401
403
404
405
412

7.3.7. GET Metadata-URL

Retrieve the Metadata

Headers

Authorization
On-Behalf-Of

Responses

Code	Description
200	Metadata Document Headers ETag - version identifier Body application/json
400
401
403
404
405
410
412

7.3.8. PUT Metadata-URL

Replace the Metadata

Headers

Authorization
Content-Disposition
Content-Length
Content-Type
Digest
If-Match
On-Behalf-Of
Metadata-Format

Body

Content to replace the Metadata. This must be a Metadata Document.

Responses

Code	Description
204	Metadata Replaced, no response body Headers ETag - version identifier Body None
400
401
403
404
405
412
413
415

7.3.9. DELETE Metadata-URL

Delete the metadata of an Object

Headers

Authorization
If-Match
On-Behalf-Of

Responses

Code	Description
202	Delete request accepted for processing Body None
204	Metadata Deleted Body None
400
401
403
404
405
412

7.3.10. PUT FileSet-URL

Replace the FileSet

Headers

Authorization
Content-Disposition
Content-Length
Content-Type
Digest
If-Match
On-Behalf-Of
Packaging

Body

Content to replace the FileSet. This can be one of: By-Reference, Binary File, Empty Body

Responses

Code	Description
202	FileSet replacement accepted for processing, no response body Headers ETag - version identifier Body None
204	FileSet Replaced, no response body Headers ETag - version identifier Body None
400
401
403
404
405
412
413
415

7.3.11. DELETE FileSet-URL

Delete the FileSet

Headers

Authorization
If-Match
On-Behalf-Of

Responses

Code	Description
202	Delete request accepted for processing Body None
204	FileSet Deleted Body None
400
401
403
404
405
412

7.3.12. GET File-URL

Retrieve an individual File

Headers

Authorization
On-Behalf-Of

Responses

Code	Description
200	Binary File Headers ETag - version identifier Body */*
400
401
403
404
405
410
412

7.3.13. PUT File-URL

Replace an individual File

Headers

Authorization
Content-Disposition
Content-Length
Content-Type
Digest
If-Match
On-Behalf-Of

Body

Content to replace the File. This can be one of: By-Reference, Binary File, Empty Body

Responses

Code	Description
204	Binary File replaced, no response body Headers ETag - version identifier Body None
400
401
403
404
405
410
412
413
415

7.3.14. DELETE File-URL

Delete an individual File

Headers

Authorization
If-Match
On-Behalf-Of

Responses

Code	Description
202	Delete request accepted for processing Body None
204	Binary File Deleted Body None
400
401
403
404
405
412

7.3.15. POST Staging-URL

Create a Temporary-URL for Segmented File Upload

Headers

Authorization
Content-Disposition
On-Behalf-Of

Responses

Code	Description
201	Temporary-URL created Headers Location - The Temporary-URL to which Segmented File Upload requests can be sent Body None
400
401
403
404
412
413

7.3.16. GET Temporary-URL

Retrieve Information on a Segmented File Upload

Headers

Authorization
On-Behalf-Of

Responses

Code	Description
200	Segmented File Upload Document Body application/json
400
401
403
404
410

7.3.17. POST Temporary-URL

Upload a File Segment

Headers

Authorization
Content-Disposition
Content-Length
Digest
On-Behalf-Of

Body

Segment to be added to the Resource.

Responses

Code	Description
204	Segment Received Body None
400
401
403
404
405
412

7.3.18. DELETE Temporary-URL

Abort a Segmented File Upload

Headers

Authorization
On-Behalf-Of

Responses

Code	Description
202	Delete request accepted for processing Body None
204	Temporary File Deleted Body None
400
401
403
404

8. Protocol Requirements

This section describes the requirements of every kind of operation that you can do with SWORDv3. Each section in Requirement Groups identifies which Request Conditions have what requirements. To determine the requirements for a specific request, identify each block below which is relevant to your request, and this will provide the overall protocol requirements for that operation.

Converting the below into a set of requirements for a specific request is time consuming, so this has been done for you in the SWORDv3 Behaviours Document. If you are implementing a SWORD client or server it is STRONGLY RECOMMENDED that you work from that document rather than the normalised requirements below.

There are 3 key aspects of the Request Conditions where requirements can be applied, and these are:

Request: The operation that you are performing on the resources
Content: The body content of the request, such as Metadata, By-Reference, Metadata+By-Reference, Binary File, Packaged Content, Empty Body
Resource: Service-URL, Object-URL, Metadata-URL, FileSet-URL, File-URL, Staging-URL, Temporary-URL

When combined for a specific request, these aspects tell you the exact requirements. For example: Creating (Request) a new Object by request to the Service-URL (Resource) with Packaged Content (Content)

Each of these aspects of the Request Conditions is presented below according to a hierarchy. For a specific aspect, you must import the requirements for it and all its parents in the hierarchy, to obtain all the requirements for the request.

For each Request Condition, up to 4 kinds of requirement are present:

Protocol Operation: which of the protocol operations MUST be used for this request
Request Requirements: constraints applied to the client request
Server Requirements: constraints applied to how the server handles the request
Response Requirements: constraints applied to how the server responds to the request

See the document SWORDv3 Behaviours to see each of the behaviours SWORDv3 is capable of with its requirements fully expanded.

8.1. Requirement Hierarchies

The hierarchy for the Request is:

All
- Retrieve
- Modify
  - Create
  - Update
    - Append
    - Replace
- Complete
- Delete

The hierarchy for the Content is:

All
- Body
  - JSON
    - Metadata
    - By-Reference
    - MD+BR
  - Binary
    - Binary File
    - Packaged Content
    - File Segment
- Empty Body

The hierarchy for the Resource is:

All
- Deposit
  - Service-URL
  - Object-URL
  - Components
    - Metadata-URL
    - FileSet-URL
    - File-URL
- Staging
  - Staging-URL
  - Temporary-URL

So, for example, when considering a Request Condition such as "Creating Objects with Packaged Content", this would take requirements as follows:

For the Request, as a Create, it takes requirements from Create, Modify and All
For the Content, as Packaged Content, it takes requirements from Packaged Content, Binary, Body and All
For the Resource, as an operation on the Service-URL, it takes requirements from Service-URL, Deposit and All
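The inheritance rule can be sketched in code by recording each node's parent and walking upwards. The parent tables below transcribe the three hierarchies above; the function names and the dictionary shape used for Request Conditions are illustrative choices, not part of the specification.

```python
# Each hierarchy as a child -> parent mapping (None marks the root).
REQUEST_PARENT = {
    "All": None, "Retrieve": "All", "Modify": "All", "Create": "Modify",
    "Update": "Modify", "Append": "Update", "Replace": "Update",
    "Complete": "All", "Delete": "All",
}
CONTENT_PARENT = {
    "All": None, "Body": "All", "JSON": "Body", "Metadata": "JSON",
    "By-Reference": "JSON", "MD+BR": "JSON", "Binary": "Body",
    "Binary File": "Binary", "Packaged Content": "Binary",
    "File Segment": "Binary", "Empty Body": "All",
}
RESOURCE_PARENT = {
    "All": None, "Deposit": "All", "Service-URL": "Deposit",
    "Object-URL": "Deposit", "Components": "Deposit",
    "Metadata-URL": "Components", "FileSet-URL": "Components",
    "File-URL": "Components", "Staging": "All",
    "Staging-URL": "Staging", "Temporary-URL": "Staging",
}


def lineage(parents, node):
    """Return the node followed by all of its ancestors, up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = parents[node]
    return chain


def group_applies(group, condition):
    """True if a requirement group contributes requirements to a request.

    Both arguments are dicts with "request", "content" and "resource" keys.
    """
    return (
        group["request"] in lineage(REQUEST_PARENT, condition["request"])
        and group["content"] in lineage(CONTENT_PARENT, condition["content"])
        and group["resource"] in lineage(RESOURCE_PARENT, condition["resource"])
    )
```

For the example above, lineage(REQUEST_PARENT, "Create") yields Create, Modify and All, matching the first item in the list.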

8.2. Requirement Groups

Request Conditions:

Request: All
Content: All
Resource: All

Request Requirements

MAY specify Authorization and On-Behalf-Of headers (i.e. if authenticating this request)

Server Requirements

If Authorization (and optionally On-Behalf-Of) headers are provided, MUST authenticate the request

Error Responses

If no authentication credentials were supplied, but were expected, MUST respond with a 401 (AuthenticationRequired)
If authentication fails with supplied credentials, MUST respond with a 403 (AuthenticationFailed)
If the server does not allow this method in this context at this time, MAY respond with a 405 (MethodNotAllowed)
If the server does not support On-Behalf-Of deposit and the On-Behalf-Of header has been provided, MAY respond with a 412 (OnBehalfOfNotAllowed)

Request Conditions:

Request: Retrieve
Content: All
Resource: Service-URL

Protocol Operation

GET Service-URL

Response Requirements

If Authorization (and optionally On-Behalf-Of) headers are provided, MUST only list Service-URLs in the Service Document for which a deposit request would be permitted
MUST respond with a valid Service Document or a suitable error response

Request Conditions:

Request: Retrieve
Content: All
Resource: Object-URL

Protocol Operation

GET Object-URL

Response Requirements

MUST respond with a valid Status Document or a suitable error response
MUST include ETag header if implementing concurrency control

Request Conditions:

Request: Retrieve
Content: All
Resource: Components

Response Requirements

MUST include ETag header if implementing concurrency control

Request Conditions:

Request: Retrieve
Content: All
Resource: Metadata-URL

Protocol Operation

GET Metadata-URL

Response Requirements

MUST respond with a valid Metadata Document (see definition below) or a suitable error response

Request Conditions:

Request: Retrieve
Content: All
Resource: File-URL

Protocol Operation

GET File-URL

Response Requirements

MUST respond with a File (which may be Packaged Content, a Binary File, a Metadata Document, or any other file that the server exposes) or a suitable error response

Request Conditions:

Request: Modify
Content: All
Resource: All

Request Requirements

MUST provide the Content-Disposition header, with the appropriate value for the request

Request Conditions:

Request: Modify
Content: All
Resource: Deposit

Response Requirements

MUST include ETag header if implementing concurrency control

Request Conditions:

Request: Modify
Content: Body
Resource: All

Request Requirements

MUST provide the Digest header
SHOULD provide the Content-Length

Server Requirements

MUST verify that the content matches the Digest header
MUST verify that the supplied content matches the Content-Length if this is provided

Error Responses

If one or more of the digests provided by the client that the server checked did not match, MAY respond with 412 (DigestMismatch). Note that servers MAY NOT check digests in real-time.
If the body content could not be read correctly, MAY return a 400 (ContentMalformed)
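A server-side digest check might look like the following sketch. It assumes the Digest header is a comma-separated list of "ALGORITHM=base64value" entries, an encoding this section does not spell out, and it only checks the algorithms it knows. The function names are hypothetical.

```python
import base64
import hashlib

# Algorithms this sketch is able to check; others in the header are skipped.
SUPPORTED = {"SHA-256": hashlib.sha256, "SHA": hashlib.sha1}


def parse_digest_header(value):
    """Split a Digest header into {ALGORITHM: encoded_value}."""
    digests = {}
    for part in value.split(","):
        # Split on the first "=" only, since base64 padding also uses "=".
        name, sep, encoded = part.strip().partition("=")
        if sep:
            digests[name.upper()] = encoded
    return digests


def check_digests(body, digest_header):
    """Return (412, "DigestMismatch") if a checked digest differs, else None."""
    for name, encoded in parse_digest_header(digest_header).items():
        factory = SUPPORTED.get(name)
        if factory is None:
            continue  # the server chooses which digests it checks
        expected = base64.b64encode(factory(body).digest()).decode("ascii")
        if expected != encoded:
            return (412, "DigestMismatch")
    return None
```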

Request Conditions:

Request: Modify
Content: Body
Resource: Deposit

Request Requirements

MUST provide the Content-Type header

Server Requirements

If all preconditions are met, MUST either accept the deposit request immediately, queue the request for processing, or respond with an error

Response Requirements

MUST include one or more File-URLs for the deposited content in the Status document. The behaviour of these File-URLs may vary depending on the type of content deposited (e.g. ByReference and Segmented Uploads do not need to be immediately retrievable)

Error Responses

If the Content-Type header contains a format that the server cannot accept, MUST respond with 415 (ContentTypeNotAcceptable)
If the body content is larger than the maximum allowed by the server, MAY return 413 (MaxUploadSizeExceeded)

Request Conditions:

Request: Modify
Content: Metadata
Resource: All

Request Requirements

SHOULD provide the Metadata-Format header
MUST provide only the Metadata Document

Server Requirements

If no Metadata-Format header is provided, MUST assume this is the standard SWORD format: http://purl.org/net/sword/3.0/types/Metadata

Error Responses

If the Metadata-Format header indicates a format the server does not support, MUST return 415 (MetadataFormatNotAcceptable)

Request Conditions:

Request: Modify
Content: Metadata
Resource: Deposit

Error Responses

If the Metadata-Format header does not match the format found in the body content, MAY return 415 (FormatHeaderMismatch)

Request Conditions:

Request: Modify
Content: By-Reference
Resource: All

Request Requirements

MUST provide the By-Reference Document

Server Requirements

If downloading copies of the files in the By-Reference Document, MUST do this asynchronously to the deposit request

Error Responses

If rejecting the request due to the announced file size, MUST respond with a 400 (ByReferenceFileSizeExceeded)
If the server does not support By-Reference, MUST respond with a 412 (ByReferenceNotAllowed)

Request Conditions:

Request: Modify
Content: By-Reference
Resource: File-URL

Request Requirements

MUST only include a single By-Reference File in the By-Reference Document

Server Requirements

If more than one By-Reference File is present, MUST reject the request.

Error Responses

If rejecting the request due to the presence of more than one By-Reference File in the By-Reference Document, MUST respond with a 400 (BadRequest)

Request Conditions:

Request: Modify
Content: MD+BR
Resource: All

Request Requirements

SHOULD provide the Metadata-Format header
MUST provide the Metadata+By-Reference Document

Server Requirements

If no Metadata-Format header is provided, MUST assume this is the standard SWORD format: http://purl.org/net/sword/3.0/types/Metadata
If downloading copies of the files in the By-Reference Document, MUST do this asynchronously to the deposit request

Error Responses

If rejecting the request due to the announced file size, MUST respond with a 400 (ByReferenceFileSizeExceeded)
If the server does not support By-Reference, MUST respond with a 412 (ByReferenceNotAllowed)

Request Conditions:

Request: Modify
Content: MD+BR
Resource: Deposit

Error Responses

If the Metadata-Format header does not match the format found in the body content, MAY return 415 (FormatHeaderMismatch)
If the Metadata-Format header indicates a format the server does not support, MUST return 415 (MetadataFormatNotAcceptable)

Request Conditions:

Request: Modify
Content: Binary
Resource: Deposit

Server Requirements

If accepting the request MUST attach the supplied file to the Object as an originalDeposit

Request Conditions:

Request: Modify
Content: Binary File
Resource: All

Request Requirements

MAY provide the Packaging header, and if so MUST be the Binary format identifier
MUST provide Binary File body content

Server Requirements

The server SHOULD NOT attempt to unpack the file

Request Conditions:

Request: Modify
Content: Packaged Content
Resource: All

Request Requirements

MUST provide the Packaging header
MUST provide Packaged Content in the request body

Server Requirements

The server MAY attempt to unpack the file, and create derivedResources from it.

Error Responses

If the server does not accept packages in the format identified in the Packaging header, MUST respond with a 415 (PackagingFormatNotAcceptable)

Request Conditions:

Request: Modify
Content: Packaged Content
Resource: Deposit

Error Responses

If the Packaging header does not match the format found in the body content, SHOULD return 415 (FormatHeaderMismatch). Note that the server may not be able to inspect the package during the request-response, so MAY NOT return this response.

Request Conditions:

Request: Modify
Content: Empty Body
Resource: All

Request Requirements

MAY provide the Content-Length header with value 0
MUST NOT include any body content

Request Conditions:

Request: Create
Content: All
Resource: Service-URL

Protocol Operation

POST Service-URL

Request Requirements

MAY provide the Slug header
MAY provide the In-Progress header

Server Requirements

If a Slug header is provided, MAY use this as the identifier for the newly created Object.
If accepting the request MUST create a new Object
If no In-Progress header is provided, MUST assume that it is false
If In-Progress is true, SHOULD expect further updates to the item, and not progress it through any ingest workflows yet.

Response Requirements

MUST respond with a Location header, containing the Object-URL
MUST respond with a valid Status Document or a suitable error response
Status Document MUST be available on GET to the Object-URL in the Location header immediately (irrespective of whether this is a 201 or 202 response)
MUST respond with a 201 if the item was created immediately, a 202 if the item was queued for import, or raise an error.
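On the client side, the response requirements above could be handled along these lines. This is a sketch only; the function name and the returned dictionary are hypothetical, and `headers` is assumed to be a plain dict of response headers.

```python
def interpret_create_response(status, headers):
    """Summarise a response to POST Service-URL.

    Raises ValueError for anything other than a 201 or 202 that carries
    a Location header.
    """
    if status not in (201, 202):
        raise ValueError("unexpected status for create: %d" % status)
    location = headers.get("Location")
    if not location:
        raise ValueError("create response is missing the Location header")
    return {
        "object_url": location,        # the Object-URL; Status Document is available here
        "etag": headers.get("ETag"),   # present if the server implements concurrency control
        "queued": status == 202,       # True when the item was queued for import
    }
```

Because the Status Document must be available immediately at the Object-URL for both codes, a client can poll that URL after a 202 to follow progress.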

Request Conditions:

Request: Create
Content: Metadata
Resource: Service-URL

Server Requirements

If accepting the request MUST populate the Object with the supplied Metadata

Request Conditions:

Request: Create
Content: By-Reference
Resource: Service-URL

Server Requirements

If accepting the request MUST attach the By-Reference files to the Object.

Request Conditions:

Request: Create
Content: MD+BR
Resource: Service-URL

Server Requirements

If accepting the request MUST populate the Object with the supplied Metadata, and attach the By-Reference files to it.

Request Conditions:

Request: Create
Content: Empty Body
Resource: Staging-URL

Protocol Operation

POST Staging-URL

Server Requirements

If all preconditions are met, MUST create a resource to which the client can upload file segments
MUST reject the request if the conditions of the upload are not acceptable

Response Requirements

MUST respond with a 201 to indicate that the Segmented Upload has been initialised, or raise an error.
MUST respond with a Location header containing the Temporary-URL where the client can upload file segments

Error Responses

If the proposed final assembled file size is larger than the server's limit, MUST return 400 (MaxAssembledSizeExceeded)
If the proposed segment size is not within the parameters the server supports, MUST return 400 (InvalidSegmentSize)
If the proposed number of segments is not within the parameters the server supports, MUST return 400 (SegmentLimitExceeded)
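
The three error conditions above could be checked on the server along the following lines. This is a sketch under stated assumptions: the function `check_segmented_upload` is hypothetical, the limits are assumed to be held in a dictionary keyed by the Service Document field names (maxAssembledSize, maxSegmentSize, minSegmentSize, maxSegments, maxUploadSize), and an absent limit is treated as unbounded.

```python
def check_segmented_upload(assembled_size, segment_size, segment_count, limits):
    """Return None if the proposed upload is acceptable, else (status, error).

    `limits` is a hypothetical dict keyed by Service Document field names.
    """
    max_assembled = limits.get("maxAssembledSize")
    if max_assembled is not None and assembled_size > max_assembled:
        return (400, "MaxAssembledSizeExceeded")

    # Defaults mirror the Service Document: minimum 1 byte, and a maximum
    # that falls back to maxUploadSize when maxSegmentSize is omitted.
    min_segment = limits.get("minSegmentSize", 1)
    max_segment = limits.get("maxSegmentSize", limits.get("maxUploadSize"))
    if segment_size < min_segment or (
        max_segment is not None and segment_size > max_segment
    ):
        return (400, "InvalidSegmentSize")

    max_segments = limits.get("maxSegments")
    if max_segments is not None and segment_count > max_segments:
        return (400, "SegmentLimitExceeded")

    return None
```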

Request Conditions:

Request: Update
Content: All
Resource: Deposit

Request Requirements

MUST include the If-Match header, if the server implements concurrency control

Server Requirements

MUST reject the request if the If-Match header does not match the current ETag of the resource

Error Responses

For servers implementing concurrency control, if the If-Match header does not match the current ETag, MUST respond with 412 (ETagNotMatched)
For servers implementing concurrency control, if no If-Match header is provided, MUST respond with 412 (ETagRequired)
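
A server implementing concurrency control might apply the two error rules above in this order. The sketch assumes a simple string comparison between the If-Match value and the current ETag; the function name `check_if_match` is hypothetical, and handling of weak validators or lists of ETags is deliberately left out.

```python
def check_if_match(if_match, current_etag, concurrency_control=True):
    """Return None if the update may proceed, else (status, error).

    `if_match` is the value of the If-Match header, or None if it was absent.
    """
    if not concurrency_control:
        # Servers without concurrency control do not inspect the header.
        return None
    if if_match is None:
        return (412, "ETagRequired")
    if if_match != current_etag:
        return (412, "ETagNotMatched")
    return None
```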

Request Conditions:

Request: Update
Content: Body
Resource: Components

Response Requirements

MUST respond with a 204 if the replacement was deposited immediately, a 202 if the replacement was queued for import, or raise an error.

Request Conditions:

Request: Update
Content: Body
Resource: Object-URL

Request Requirements

MAY provide the In-Progress header

Server Requirements

If no In-Progress header is provided, MUST assume that it is false

Response Requirements

MUST respond with a valid Status Document or a suitable error response
MUST include ETag header if implementing concurrency control
MUST respond with a 200 if the request was accepted immediately, a 202 if the request was queued for processing, or raise an error.

Request Conditions:

Request: Append
Content: All
Resource: Object-URL

Protocol Operation

POST Object-URL

Request Conditions:

Request: Append
Content: Binary
Resource: Object-URL

Response Requirements

MUST respond with a Location header, containing the File-URL of the Original Deposit File

Request Conditions:

Request: Append
Content: Metadata
Resource: Object-URL

Server Requirements

If accepting the new Metadata MUST add the Metadata to the item, and only treat this as an extension to existing Metadata. The server MUST NOT overwrite or otherwise remove existing Metadata.

Request Conditions:

Request: Replace
Content: All
Resource: Object-URL

Protocol Operation

PUT Object-URL

Request Conditions:

Request: Replace
Content: Binary
Resource: Object-URL

Server Requirements

If accepting the new File, MUST remove all existing Files from the Object and replace with the new File. The new File should be marked as an originalDeposit. The server MUST also remove all Metadata, so the Metadata Resource contains no fields.

Request Conditions:

Request: Replace
Content: Metadata
Resource: Object-URL

Server Requirements

If accepting the new Metadata, MUST remove all existing Files from the Object, and MUST replace the existing Metadata with the new.

Request Conditions:

Request: Replace
Content: By-Reference
Resource: Object-URL

Server Requirements

If accepting the new By-Reference files, MUST remove all existing Files from the Object and replace with the By-Reference files. The server MUST remove the existing Files immediately, even before the By-Reference files have dereferenced. The new files MUST be marked as originalDeposits. The server MUST also remove all Metadata, so the Metadata Resource contains no fields.

Request Conditions:

Request: Replace
Content: MD+BR
Resource: Object-URL

Server Requirements

If accepting the new Metadata and By-Reference files, MUST remove all existing Files from the Object and replace with the By-Reference files. The server MUST remove the existing Files immediately, even before the By-Reference files have dereferenced. The server MUST also replace all existing Metadata with the new Metadata.

Request Conditions:

Request: Replace
Content: Metadata
Resource: Metadata-URL

Server Requirements

If accepting the new Metadata MUST entirely replace the existing Metadata with the new.

Request Conditions:

Request: Replace
Content: By-Reference
Resource: FileSet-URL

Server Requirements

If accepting the new By-Reference Files, MUST replace the existing FileSet with the new files. The server MUST remove all the old files immediately, even before the new By-Reference files have been dereferenced. The new Files MUST be marked as originalDeposits

Request Conditions:

Request: Replace
Content: Binary File
Resource: FileSet-URL

Server Requirements

If accepting the new File, MUST replace the existing FileSet with a single new File. The File MUST be marked as an originalDeposit

Request Conditions:

Request: Replace
Content: By-Reference
Resource: File-URL

Server Requirements

If accepting the new By-Reference File, MUST replace the existing File. The server MAY keep the previous file as an older version. The new file MUST be marked as an originalDeposit

Request Conditions:

Request: Replace
Content: Binary File
Resource: File-URL

Server Requirements

If accepting the new File, MUST replace the existing File. The server MAY keep the previous file as an older version. The new File MUST be marked as an originalDeposit

Request Conditions:

Request: Append
Content: By-Reference
Resource: Object-URL

Server Requirements

If accepting the request, MUST attach all the By-Reference files to the Object as originalDeposits

Request Conditions:

Request: Append
Content: MD+BR
Resource: Object-URL

Server Requirements

If accepting the request, MUST attach all the By-Reference files to the Object as originalDeposits, and MUST add the Metadata to the item, and only treat this as an extension to existing Metadata. The server MUST NOT overwrite or otherwise remove existing Metadata.

Request Conditions:

Request: Replace
Content: All
Resource: Metadata-URL

Protocol Operation

PUT Metadata-URL

Request Conditions:

Request: Replace
Content: All
Resource: FileSet-URL

Protocol Operation

PUT FileSet-URL

Request Conditions:

Request: Replace
Content: All
Resource: File-URL

Protocol Operation

PUT File-URL

Request Conditions:

Request: Delete
Content: All
Resource: All

Response Requirements

MUST respond with a 204 if the delete is successful, 202 if the delete is queued for processing, or raise an error

Request Conditions:

Request: Delete
Content: All
Resource: Object-URL

Protocol Operation

DELETE Object-URL

Request Conditions:

Request: Delete
Content: All
Resource: FileSet-URL

Protocol Operation

DELETE FileSet-URL

Request Conditions:

Request: Delete
Content: All
Resource: File-URL

Protocol Operation

DELETE File-URL

Request Conditions:

Request: Delete
Content: All
Resource: Metadata-URL

Protocol Operation

DELETE Metadata-URL

Request Conditions:

Request: Delete
Content: All
Resource: Temporary-URL

Protocol Operation

DELETE Temporary-URL

Request Conditions:

Request: Complete
Content: Empty Body
Resource: Object-URL

Protocol Operation

POST Object-URL

Request Requirements

MUST provide the header In-Progress: false
MAY provide the Content-Length header with value 0
MUST NOT include any body content

Server Requirements

MAY inject the content into any ingest workflows

Response Requirements

MUST respond with a 204 or a suitable error
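
For illustration, a client could construct the Complete request with Python's standard library as sketched below. The function name `build_complete_request` is hypothetical, the request is only built and not sent, and any authentication headers a real server requires are omitted.

```python
import urllib.request


def build_complete_request(object_url):
    """Build (but do not send) a Complete request for an Object-URL."""
    return urllib.request.Request(
        object_url,
        data=None,  # the request MUST NOT include any body content
        method="POST",
        headers={
            "In-Progress": "false",  # required for the Complete operation
            "Content-Length": "0",   # optional, makes the empty body explicit
        },
    )
```

Passing the result to `urllib.request.urlopen` would send it; a 204 response indicates success.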

Request Conditions:

Request: Append
Content: Body
Resource: Temporary-URL

Protocol Operation

POST Temporary-URL

Server Requirements

MUST reject the request if the segment is incorrect or unexpected: for example, all segments were already received, or the segment is a different size than expected.

Response Requirements

MUST respond with a 204 or a suitable error

Error Responses

If the provided segment is not the final segment and is not the size that the client had indicated on initialisation, MUST return 400 (InvalidSegmentSize)

Request Conditions:

Request: Append
Content: File Segment
Resource: Temporary-URL

Server Requirements

If all preconditions are met, MUST accept the file segment, and record the receipt of it
MUST be prepared to accept file segments in any order, and in parallel
MUST be able to store the incoming file segments as they arrive, and then reconstitute them into a single file when all segments have been received.

Error Responses

If the Temporary-URL has expired, SHOULD return 410 (SegmentedUploadTimedOut). Servers may also return 404 with no further explanation.
If the segment was not expected, for example all the expected segments have already been sent, or a segment with this segment number has already been received, MUST return 400 (UnexpectedSegment)
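
The segment-handling rules above can be sketched as a small in-memory tracker. Everything here is illustrative: the class `SegmentedUpload` is hypothetical, segment numbers are assumed to start at 1, segments are held in memory rather than on disk, and no locking is shown for parallel requests.

```python
class SegmentedUpload:
    """Hypothetical in-memory record of one segmented upload."""

    def __init__(self, segment_size, segment_count):
        self.segment_size = segment_size
        self.segment_count = segment_count
        self.segments = {}  # segment number (assumed 1-based) -> bytes

    def receive(self, number, data):
        """Record a segment; return None on success, else (status, error)."""
        out_of_range = not 1 <= number <= self.segment_count
        if out_of_range or number in self.segments:
            return (400, "UnexpectedSegment")
        is_final = number == self.segment_count
        # Only the final segment may differ from the declared segment size.
        if not is_final and len(data) != self.segment_size:
            return (400, "InvalidSegmentSize")
        self.segments[number] = data
        return None

    def is_complete(self):
        return len(self.segments) == self.segment_count

    def assemble(self):
        """Join the segments in numeric order once all have arrived."""
        if not self.is_complete():
            raise ValueError("not all segments have been received")
        return b"".join(
            self.segments[n] for n in range(1, self.segment_count + 1)
        )
```

Because segments are stored by number, the order of arrival does not affect the assembled file.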

Request Conditions:

Request: Retrieve
Content: Empty Body
Resource: Temporary-URL

Protocol Operation

GET Temporary-URL

Response Requirements

MUST respond with a 200 or a suitable error
If successful, MUST respond with a Segmented File Upload Document describing the current state of the upload.

9. Documents

9.1. JSON-LD Context

SWORD defines the semantics of its documents using JSON-LD [JSON-LD]. You can see the full JSON-LD Context here

9.2. Service Document

The Service Document defines the capabilities and operational parameters of the server as a whole, or of a particular Service-URL.

The Service Document consists of a set of properties at the root, and a list of "services". Each service may define a Service-URL and/or additional properties and further nested "services". To keep the serialised document brief, the Service Document MAY specify properties at the root, which MUST be taken to hold true for all nested "services" (at any level below) unless a lower service definition overrides them. A service which sits beneath the root of the Service Document and above another Service MAY also redefine properties, and those overrides MUST be considered to cascade down to the Services beneath it.

A Service Document can be retrieved either for the root of the service, or from any Service within the hierarchy of Services available. If the root Service Document is requested, the full list of Services, including all their children, MUST be provided. If the URL of a Service is requested, it MUST only provide information about itself and its children.
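
The cascading of properties through nested services can be sketched as a recursive merge. This is an illustration only: the function `resolve_services` is hypothetical, and the choice of which fields are treated as local to a single service (and so not cascaded) is an assumption of this sketch rather than something the specification enumerates.

```python
# Fields that describe one service only and are not cascaded to children.
# This selection is an assumption made for the sketch.
LOCAL_FIELDS = {"@context", "@id", "@type", "dc:title", "dcterms:abstract",
                "parent", "services"}


def resolve_services(node, inherited=None):
    """Return {service @id: effective properties} for all nested services."""
    own = {k: v for k, v in node.items() if k not in LOCAL_FIELDS}
    # Properties set at this level override anything inherited from above.
    effective = {**(inherited or {}), **own}
    resolved = {}
    for child in node.get("services", []):
        child_own = {k: v for k, v in child.items() if k not in LOCAL_FIELDS}
        resolved[child["@id"]] = {**effective, **child_own}
        # Recurse so that the child's overrides cascade to its own children.
        resolved.update(resolve_services(child, effective))
    return resolved
```

Calling `resolve_services` on a parsed root Service Document gives each nested service the properties a client should assume for it.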

The full JSON Schema [JSON-SCHEMA] can be downloaded here.

An example of the Service Document:

{
  "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",

  "@id" : "http://example.com/service-document",
  "@type" : "ServiceDocument",

  "dc:title" : "Site Name",
  "dcterms:abstract" : "Site Description",

  "root" : "http://example.com/service-document",
  "acceptDeposits": true,

  "version": "http://purl.org/net/sword/3.0",
  "maxUploadSize" : 16777216000,
  "maxByReferenceSize" : 30000000000000000,
  "maxSegmentSize" : 16777216000,
  "minSegmentSize" : 1,
  "maxAssembledSize" : 30000000000000,
  "maxSegments" : 1000,

  "accept" : ["*/*"],
  "acceptArchiveFormat" : ["application/zip"],
  "acceptPackaging" : ["*"],
  "acceptMetadata" : ["http://purl.org/net/sword/3.0/types/Metadata"],

  "collectionPolicy" : {
    "@id" : "http://www.myorg.ac.uk/collectionpolicy",
    "description" : "...."
  },
  "treatment" : {
    "@id" : "http://www.myorg.ac.uk/treatment",
    "description" : "..."
  },

  "staging" : "http://example.com/staging",
  "stagingMaxIdle" : 3600,

  "byReferenceDeposit" : true,
  "onBehalfOf" : true,

  "digest" : ["SHA-256", "SHA", "MD5"],
  "authentication": ["Basic", "OAuth", "Digest", "APIKey"],

  "services" : [
    {
      "@id": "http://swordapp.org/deposit/43",

      "dc:title" : "Deposit Service Name",
      "dcterms:abstract" : "Deposit Service Description",

      "root" : "http://example.com/service-document",
      "parent" : "http://example.com/service-document",
      "acceptDeposits": true,

      "services" : []
    }
  ]
}

The fields available are defined as follows:

Field	Type	Description
@context	string	The JSON-LD Context for this document. MUST be present.
@id	string	The URL of the service document you are looking at. MUST be present.
@type	string	JSON-LD identifier for the document type. This field is used to define the type of the document, and in this case should always be 'ServiceDocument'. MUST be present.
accept	array	List of Content Types which are acceptable to the server. MUST be present. '*/*' for any content type, or a list of acceptable content types.
acceptArchiveFormat	array	List of Archive Formats that the server can unpack. If the client sends a package using a different format, the server MAY treat it as a Binary File. SHOULD be present. '*' for any archive format (not recommended), or a list of acceptable formats. If this is omitted, the client MUST assume the server only supports application/zip.
acceptDeposits	boolean	Does the Service accept deposits? SHOULD be present. If omitted, the client MUST assume that the service does not accept deposits.
acceptMetadata	array	List of Metadata Formats which are acceptable to the server. SHOULD be present. '*' for any metadata format, or a list of acceptable metadata formats. Acceptable metadata formats SHOULD be an IRI for a known format, or any other identifying string if no IRI exists. If this is omitted, the client MUST assume the server only supports the standard SWORD metadata format: http://purl.org/net/sword/3.0/types/Metadata
acceptPackaging	array	List of Packaging Formats which are acceptable to the server. SHOULD be present. '*' for any packaging format, or a list of acceptable packaging formats. Acceptable packaging formats SHOULD be an IRI for a known format, or any other identifying string if no IRI exists. If this is omitted, the client MUST assume the server only supports the 3 required SWORD packaging formats (see the section Packaging Formats)
authentication	array	List of authentication schemes supported by the server. SHOULD be present. If not provided the client MUST assume the server does not support authentication.
byReferenceDeposit	boolean	Does the server support By-Reference deposit? SHOULD be present. If omitted, the client MUST assume the server does not support By-Reference deposit.
collectionPolicy	object	URL and description of the server’s collection policy. MAY be present.
collectionPolicy.@id	string	Collection Policy URL
collectionPolicy.description	string	Collection Policy Description
dc:title	string	The title or name of the Service. MUST be present.
dcterms:abstract	string	A description of the service. MAY be present.
digest	array	The list of digest formats that the server will accept. MUST be present, and MUST include SHA-256, MAY include any others.
maxAssembledSize	integer	Maximum size in bytes as an integer for the total size of an assembled segmented upload. SHOULD be present. If omitted and segmented upload is supported, the client MUST assume the server will accept a file of any size.
maxByReferenceSize	integer	Maximum size in bytes as an integer for files uploaded by reference. SHOULD be present. If omitted, the client MUST assume the server will accept a file of any size.
maxSegmentSize	integer	Maximum size in bytes as an integer for an individual segment in a segmented upload. MAY be present. If omitted and segmented upload is supported, the client MUST assume the maximum segment size is the same as maxUploadSize.
maxSegments	integer	Maximum number of segments that the server will accept for a single segmented upload, if segmented upload is supported. SHOULD be present. If omitted, the client MUST assume the server will accept any number of segments.
maxUploadSize	integer	Maximum size in bytes as an integer for files being uploaded. SHOULD be present. If omitted, the client MUST assume the server will accept an upload of any size.
minSegmentSize	integer	Minimum size in bytes as an integer for an individual segment in a segmented upload. MAY be present. If omitted and segmented upload is supported, the client MUST assume the minimum segment size is 1 byte.
onBehalfOf	boolean	Does the server support deposit on behalf of other users (mediation)? SHOULD be present. If omitted, the client MUST assume the server does not support On-Behalf-Of deposit.
root	string	The URL for the root Service Document. MUST be present.
services	array	List of Services contained within the parent service. MAY be present.
staging	string	The URL where clients may stage content prior to deposit, in particular for segmented upload. MAY be present. If omitted, the client MUST assume the server does not support Segmented Upload.
stagingMaxIdle	integer	The minimum time a server will hold on to an incomplete Segmented File Upload since it last received any content before deleting it. SHOULD be present. If omitted, the client MUST assume that the server will hold on to the incomplete file indefinitely. Servers MAY delete the unfinished upload at any time after the minimum time stated here has elapsed.
treatment	object	URL and description of the treatment content can expect during deposit. MAY be present.
treatment.@id	string	Treatment URL
treatment.description	string	Treatment Description
version	string	The version of the SWORD protocol this server supports. MUST be present.
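
Several fields in the table above carry a value that the client must assume when the field is omitted. A client could apply those defaults roughly as follows. This is a sketch: the names `CLIENT_DEFAULTS`, `with_client_defaults` and `segment_size_bounds` are hypothetical, and only defaults stated explicitly in the table are encoded.

```python
# Values the client MUST assume when the server omits the field.
CLIENT_DEFAULTS = {
    "acceptArchiveFormat": ["application/zip"],
    "acceptDeposits": False,
    "acceptMetadata": ["http://purl.org/net/sword/3.0/types/Metadata"],
    "authentication": [],
    "byReferenceDeposit": False,
    "onBehalfOf": False,
}


def with_client_defaults(service):
    """Return a copy of a service entry with omitted fields defaulted."""
    return {**CLIENT_DEFAULTS, **service}


def segment_size_bounds(service):
    """Return (min, max) segment size in bytes, or None if unsupported.

    A max of None means the client must assume any size is accepted.
    """
    if "staging" not in service:
        # No staging URL means Segmented Upload is not supported.
        return None
    minimum = service.get("minSegmentSize", 1)
    maximum = service.get("maxSegmentSize", service.get("maxUploadSize"))
    return (minimum, maximum)
```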

9.3. Metadata Document

The default SWORD Metadata document allows the deposit of a standard, basic metadata document constructed using the DCMI terms [DCMI]. This Metadata document can be sent when creating an Object initially, when appending to the metadata, or when replacing the metadata or the Object as a whole.

The format of the document is simple and extensible (see the Metadata Formats section). The dc and dcterms vocabularies are supported, and servers MUST support this metadata format.

The full JSON Schema [JSON-SCHEMA] can be downloaded here.

An example of the Metadata Document:

{
  "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",

  "@id" : "http://example.com/object/1/metadata",
  "@type" : "Metadata",

  "dc:title" : "The title",
  "dcterms:abstract" : "This is my abstract",
  "dc:contributor" : "A.N. Other"
}

The fields available are defined as follows:

Field	Type	Description
@context	string	The JSON-LD Context for this document. MUST be present.
@id	string	The URL of the Metadata Document you are looking at. MUST be present.
@type	string	JSON-LD identifier for the document type. This field is used to define the type of the document, and in this case should always be 'Metadata'. MUST be present.
^dc:.+$	string	Properties from the DC namespace. MAY be present.
^dcterms:.+$	string	Properties from the DCTERMS namespace. MAY be present.

When sending this document, the client MUST provide a Content-Disposition header of the form:

Content-Disposition: attachment; metadata=true

Additionally, when sending this document the client SHOULD provide the Metadata-Format header with the identifier for the format: http://purl.org/net/sword/3.0/types/Metadata

Metadata-Format: http://purl.org/net/sword/3.0/types/Metadata

If the client omits the Metadata-Format header, the server MUST assume that it is the above format.
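
As a sketch of the header requirements above, a client might prepare the headers for a Metadata deposit and pick out the DC and DCTERMS properties of a document as follows. The helper names `metadata_headers` and `dc_fields` are hypothetical; the regular expression simply combines the two field patterns from the table.

```python
import re

DEFAULT_FORMAT = "http://purl.org/net/sword/3.0/types/Metadata"

# Combines the ^dc:.+$ and ^dcterms:.+$ patterns from the field table.
DC_FIELD = re.compile(r"^(dc|dcterms):.+$")


def metadata_headers(metadata_format=DEFAULT_FORMAT):
    """Headers a client sends alongside a Metadata Document."""
    return {
        "Content-Disposition": "attachment; metadata=true",
        "Metadata-Format": metadata_format,
    }


def dc_fields(document):
    """Return only the dc: and dcterms: properties of a Metadata Document."""
    return {k: v for k, v in document.items() if DC_FIELD.match(k)}
```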

9.4. By-Reference Document

The By-Reference document allows the client to send a list of one or more files that the server will fetch asynchronously. The By-Reference document can be sent when creating an Object initially, or when appending to or replacing the FileSet in the Object, or replacing the Object as a whole.

The full JSON Schema [JSON-SCHEMA] can be downloaded here.

An example of the By-Reference Document:

{
  "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",

  "@type" : "ByReference",

  "byReferenceFiles" : [
    {
      "@id" : "http://www.otherorg.ac.uk/by-reference/file.zip",
      "contentType" : "application/zip",
      "contentLength" : 123456,
      "contentDisposition" : "attachment; filename=file.zip",
      "packaging" : "http://purl.org/net/sword/3.0/package/SimpleZip",
      "digest" : "SHA256=....",
      "ttl" : "2018-04-16T00:00:00Z",
      "dereference" : true
    }
  ]
}

The fields available are defined as follows:

Field	Type	Description
@context	string	The JSON-LD Context for this document. MUST be present.
@type	string	JSON-LD identifier for the document type. This field is used to define the type of the document, and in this case should always be 'ByReference'. MUST be present.
byReferenceFiles	array	List of files to deposit By-Reference. MUST be present and contain one or more entries.
byReferenceFiles[].@id	string	The URL of the file to be retrieved and deposited. MUST be present.
byReferenceFiles[].contentDisposition	string	Content-Disposition as it would have been supplied if this were a regular file deposit. MUST be present.
byReferenceFiles[].contentLength	integer	Content-Length as it would have been supplied if this were a regular file deposit. SHOULD be present.
byReferenceFiles[].contentType	string	The Content-Type of the file to be retrieved and deposited. MUST be present.
byReferenceFiles[].dereference	boolean	Should the server dereference the file (i.e. download it and store it locally) or should it simply maintain a link to the external resource? MUST be present. Note that servers MAY choose to do both, irrespective of the value here, though if `false`, the server should make the external link available to users accessing the resource.
byReferenceFiles[].digest	string	Digest as it would have been supplied if this were a regular file deposit. MUST be present.
byReferenceFiles[].packaging	string	The packaging format of the file, or the Binary file identifier. SHOULD be present. If this is not provided, the server MUST assume this is the Binary format: http://purl.org/net/sword/3.0/package/Binary
byReferenceFiles[].ttl	string	A timestamp which indicates when the file will no longer be available (Time To Live). MUST be formatted as UTC big-endian date as per [NOTE-datetime]. If no date is provided, the server MAY assume the file will be available indefinitely.

When sending this document, the client MUST provide a Content-Disposition header of the form:

Content-Disposition: attachment; by-reference=true
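
A client might assemble a By-Reference Document and check the required fields before sending it, along the following lines. This is a minimal sketch: the names `by_reference_document`, `format_ttl` and `REQUIRED_FILE_FIELDS` are hypothetical, and the check covers only the fields the table marks as MUST be present.

```python
from datetime import datetime, timezone

# Fields marked "MUST be present" for each entry in byReferenceFiles.
REQUIRED_FILE_FIELDS = ("@id", "contentDisposition", "contentType",
                        "dereference", "digest")


def format_ttl(moment):
    """Format a timezone-aware datetime as a UTC timestamp for the ttl field."""
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")


def by_reference_document(files):
    """Build a By-Reference Document from a list of file descriptions."""
    if not files:
        raise ValueError("byReferenceFiles must contain one or more entries")
    for entry in files:
        missing = [f for f in REQUIRED_FILE_FIELDS if f not in entry]
        if missing:
            raise ValueError(f"file entry is missing required fields: {missing}")
    return {
        "@context": "https://swordapp.github.io/swordv3/swordv3.jsonld",
        "@type": "ByReference",
        "byReferenceFiles": list(files),
    }
```

The resulting dictionary would be serialised as JSON and sent with the Content-Disposition header given above.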

9.5. Metadata + By-Reference Document

If the client wishes to send both Metadata and By-Reference content to the server in a single request, it can do so provided that the Metadata format is expressed as JSON, such as the default SWORD metadata format.

This operation is provided only as a convenience. If the client wishes to send a metadata format that is not or cannot be expressed as JSON, the operation is not available, and a separate Metadata Deposit and By-Reference Deposit should be carried out instead.

To do this, the client may include the Metadata and By-Reference documents embedded in a single JSON document, structured as shown below. The entire Metadata document (including its JSON-LD @context when using the default format) is embedded in a field entitled metadata, and the entire By-Reference document (again, with its JSON-LD @context) is embedded in a field entitled by-reference.

When a document of this form is sent, the client MUST set the Content-Disposition header appropriately, to alert the server of its required behaviour.

An example of the Metadata + By-Reference Document:

{
  "metadata" : {
    "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",
    "@type" : "Metadata",

    "dcterms:abstract" : "....",
    "dc:contributor" : "...",
    "etc..." : "...."
  },

  "by-reference" : {
    "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",
    "@type" : "ByReference",

    "byReferenceFiles" : []
  }
}

When sending this document, the client MUST provide a Content-Disposition header of the form:

Content-Disposition: attachment; metadata=true; by-reference=true

Additionally, when sending this document the client SHOULD provide the Metadata-Format header with the identifier for the format:

Metadata-Format: http://purl.org/net/sword/3.0/types/Metadata

If the client omits the Metadata-Format header, the server MUST assume that it is the default format: http://purl.org/net/sword/3.0/types/Metadata
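
The embedding described in this section can be sketched as a function that wraps the two documents and returns the matching headers. The function name `metadata_and_by_reference` is hypothetical, and both input documents are assumed to be already-built dictionaries in the formats defined in the preceding sections.

```python
import json


def metadata_and_by_reference(metadata_doc, by_reference_doc):
    """Return (headers, body) for a combined Metadata + By-Reference deposit."""
    body = {
        "metadata": metadata_doc,          # entire Metadata document
        "by-reference": by_reference_doc,  # entire By-Reference document
    }
    headers = {
        "Content-Disposition": "attachment; metadata=true; by-reference=true",
        "Metadata-Format": "http://purl.org/net/sword/3.0/types/Metadata",
    }
    return headers, json.dumps(body)
```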

9.6. Status Document

The Status Document is provided in response to a deposit operation on a Service-URL, and is returned each time the client takes action on the Object-URL. It can also be retrieved at any subsequent point by a GET on the Object-URL. It gives the client detailed information about the content and current state of the item.

The full JSON Schema [JSON-SCHEMA] can be downloaded here.

An example of the Status Document:

{
  "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",

  "@id" : "http://www.myorg.ac.uk/sword3/object/1",
  "@type" : "Status",
  "eTag" : "...",

  "metadata" : {
    "@id" : "http://www.myorg.ac.uk/sword3/object/1/metadata",
    "eTag" : "..."
  },
  "fileSet" : {
    "@id" : "http://www.myorg.ac.uk/sword3/object/1/fileset",
    "eTag" : "..."
  },

  "service" : "http://www.myorg.ac.uk/sword3",

  "state" : [
    {
      "@id" : "http://purl.org/net/sword/3.0/state/inProgress",
      "description" : "the item is currently inProgress"
    }
  ],

  "actions" : {
    "getMetadata" : true,
    "getFiles" : true,
    "appendMetadata" : true,
    "appendFiles" : true,
    "replaceMetadata" : true,
    "replaceFiles" : true,
    "deleteMetadata" : true,
    "deleteFiles" : true,
    "deleteObject" : true
  },

  "links" : [
    {
      "@id" : "http://www.myorg.ac.uk/col1/mydeposit.html",
      "rel" : ["alternate"],
      "contentType" : "text/html"
    },
    {
      "@id" : "http://www.myorg.ac.uk/sword3/object/1/package.zip",
      "rel" : ["http://purl.org/net/sword/3.0/terms/originalDeposit"],
      "contentType" : "application/zip",
      "packaging" : "http://purl.org/net/sword/3.0/package/SimpleZip",
      "depositedOn" : "[timestamp]",
      "depositedBy" : "[user identifier]",
      "depositedOnBehalfOf" : "[user identifier]",
      "byReference" : "http://www.otherorg.ac.uk/by-reference/file.zip",
      "status" : "http://purl.org/net/sword/3.0/filestate/ingested",
      "log" : "[any information associated with the deposit that the client should know]"
    },
    {
      "@id" : "http://www.myorg.ac.uk/sword3/object/1/file1.pdf",
      "rel" : [
        "http://purl.org/net/sword/3.0/terms/fileSetFile",
        "http://purl.org/net/sword/3.0/terms/derivedResource"
      ],
      "contentType" : "application/pdf",
      "derivedFrom" : "http://www.myorg.ac.uk/sword3/object/1/package.zip",
      "dcterms:relation" : "http://www.myorg.ac.uk/repo/123456789/file1.pdf",
      "dcterms:replaces" : "http://www.myorg.ac.uk/sword3/object/1/versions/file1.1.pdf",
      "eTag" : "..."
    },
    {
      "@id" : "http://www.myorg.ac.uk/sword3/object/1/package.1.zip",
      "rel" : ["http://purl.org/net/sword/3.0/terms/packagedContent"],
      "contentType" : "application/zip",
      "packaging" : "http://purl.org/net/sword/3.0/package/SimpleZip"
    },
    {
      "@id" : "http://www.swordserver.ac.uk/col1/mydeposit/metadata.mods.xml",
      "rel" : ["http://purl.org/net/sword/3.0/terms/formattedMetadata"],
      "contentType" : "application/xml",
      "metadataFormat" : "http://www.loc.gov/mods/v3"
    },
    {
      "@id" : "http://www.myorg.ac.uk/sword3/object/1/versions/file1.1.pdf",
      "rel" : ["http://purl.org/net/sword/3.0/terms/derivedResource"],
      "contentType" : "application/pdf",
      "dcterms:isReplacedBy" : "http://www.myorg.ac.uk/sword3/object/1/file1.pdf",
      "versionReplacedOn" : "[xsd:dateTime]"
    },
    {
      "@id" : "http://www.myorg.ac.uk/sword3/object/1/reference.zip",
      "rel" : [
        "http://purl.org/net/sword/3.0/terms/byReferenceDeposit",
        "http://purl.org/net/sword/3.0/terms/originalDeposit",
        "http://purl.org/net/sword/3.0/terms/fileSetFile"
      ],
      "byReference" : "http://www.otherorg.ac.uk/by-reference/file2.zip",
      "log" : "Any information on the download, especially if it failed",
      "eTag" : "...",
      "status" : "http://purl.org/net/sword/3.0/filestate/ingested"
    }
  ]
}

The fields available are defined as follows:

Field	Type	Description
@context	string	The JSON-LD Context for this document. MUST be present.
@id	string	The Object-URL for this document. MUST be present.
@type	string	JSON-LD identifier for the document type. This field is used to define the type of the document, and in this case should always be 'Status'. MUST be present.
actions	object	Container for the list of actions that are available against the object for the client. MUST be present.
actions.appendFiles	boolean	Whether the client can issue a request to append one or more files (individually or via a package) to the item. MUST be present.
actions.appendMetadata	boolean	Whether the client can issue a request to append to the metadata of the item. MUST be present.
actions.deleteFiles	boolean	Whether the client can issue a request to delete files in the item. This may be a single file or all files. MUST be present
actions.deleteMetadata	boolean	Whether the client can issue a request to delete all the item metadata. MUST be present
actions.deleteObject	boolean	Whether the client can issue a request to delete the entire object. MUST be present.
actions.getFiles	boolean	Whether the client can issue a request to retrieve any/all files in the item (both Binary Files and Packaged Content). MUST be present.
actions.getMetadata	boolean	Whether the client can issue a request to retrieve the item metadata. MUST be present.
actions.replaceFiles	boolean	Whether the client can issue a request to replace files in an item. This may be a single file or all of the files. MUST be present
actions.replaceMetadata	boolean	Whether the client can issue a request to replace the item metadata. MUST be present
eTag	string	The current ETag for the Object. MUST be present if the repository enforces concurrency control.
fileSet	object	Information about the identifier/version of the Object's FileSet. MUST be present.
fileSet.@id	string	The FileSet-URL for this Object. MUST be present.
fileSet.eTag	string	The ETag for the FileSet. MUST be present if the server supports concurrency control.
links	array	List of link objects referring to the various files, both content and metadata, available on the object. MUST be present if there are one or more links available to the client.
links[].@id	string	The URL of the resource. MUST be present.
links[].byReference	string	The external URL of the location a By-Reference deposit was retrieved from. SHOULD be present if this is an Original Deposit that was deposited By-Reference, or is an active By-Reference deposit.
links[].contentType	string	Content type of the resource. SHOULD be present.
links[].dcterms:isReplacedBy	string	URL to a newer version of the file in the same Object, if this is present as a resource. SHOULD be present if a newer version is present.
links[].dcterms:relation	string	URL to a non-SWORD access point to the file. MAY be present. For example, the URL from which an end-user would download the file via the website. This related URL does not need to support any of the SWORD protocol operations, and indeed may even be on a server or application which has no SWORD support. The primary use case is to redirect the user to the web front end for the repository.
links[].dcterms:replaces	string	URL to an older version of the file in the same Object, if this is also present as a resource. SHOULD be present, if an older version of the file is present
links[].depositedBy	string	Identifier for the user that deposited the item. SHOULD be present if this is an Original Deposit.
links[].depositedOn	string	Timestamp of when the deposit happened. SHOULD be present if this is an Original Deposit. If present, MUST be formatted as UTC big-endian date as per [NOTE-datetime].
links[].depositedOnBehalfOf	string	Identifier for the user that the item was deposited on behalf of. SHOULD be present if this is an Original Deposit that was done On-Behalf-Of another user
links[].derivedFrom	string	Reference to URL of resource from which the current resource was derived, for example, if extracted from a package that was deposited. SHOULD be present, if the resource is derived from another resource
links[].eTag	string	The eTag of the resource MUST be present if the server supports concurrency control and the resource is available to the client to modify
links[].log	string	Any information associated with the deposit that the client should know. MAY be present
links[].packaging	string	The package format identifier if the resource is a package. SHOULD be present if the resource is a package
links[].rel	array	The relationship between the resource and the object. MUST be present. Note that multiple relationships are supported.
links[].status	string	The status of the resource, with regard to ingest. SHOULD be present. For example, packaged resources which are still being unpacked and ingested may announce their status here. Likewise, by-reference deposits may do the same. MUST be one of the allowed status URIs. Any associated information to go along with the status, especially if the status is an error, SHOULD be in links[].log. If no value is provided, the client MUST assume that the item is in the status: http://purl.org/net/sword/3.0/filestate/ingested
links[].versionReplacedOn	string	Date that the current resource was replaced by a newer resource SHOULD be present if dcterms:isReplacedBy is present
metadata	object	Information about the identifier/version of the Object's Metadata MUST be present if the server permits any operations on metadata.
metadata.@id	string	The Metadata-URL for this Object MUST be present if the server permits any operations on metadata
metadata.eTag	string	The ETag for the Metadata MUST be present if the server supports concurrency control and the Metadata-URL is present
service	string	The URL for the service to which this item was deposited (the Service-URL) MUST be present. This is the URL from which the client can retrieve information about the settings for the server that are relevant to this item (e.g. max upload sizes, etc)
state	array	List of states that the item is in on the server. At least one state MUST be present, using the SWORD state vocabulary. Other states using server-specific vocabularies may also be used alongside.
state[].@id	string	Identifier for the state. MUST be present. At least one such identifier MUST be from the SWORD state vocabulary.
state[].description	string	Human readable description of the state MAY be present

9.6.1. Available `rel` types and their meanings

alternate

An alternate, non-SWORD URL which will allow the user to access the same object. For example, this could be the URL of the landing page in the repository for the item.

http://purl.org/net/sword/3.0/terms/originalDeposit

The resource (file or package) was explicitly deposited via some deposit operation.

The relevant properties of the link section for any resource with this rel are

packaging
depositedOn
depositedOnBehalfOf
status
log
dcterms:relation
dcterms:replaces
dcterms:isReplacedBy
versionReplacedOn
eTag
byReference

http://purl.org/net/sword/3.0/terms/derivedResource

A file which was unpacked or otherwise derived from another deposited resource, and which itself was not explicitly deposited through some deposit operation. The main usage would be to identify files which were extracted from a deposited zip file.

The relevant properties of the link section for any resource with this rel are

derivedFrom
dcterms:relation
dcterms:replaces
dcterms:isReplacedBy
versionReplacedOn
eTag

http://purl.org/net/sword/terms/packagedContent

A resource which makes this object available packaged in the specified package format on HTTP GET. This is not a resource which has been deposited or derived (though it may be very similar to an originally deposited package), it is one which the server makes available as a service to the client. Packages may be pre-built or assembled on the fly - that responsibility rests with the server.

The relevant properties of the link section for any resource with this rel are

packaging

http://purl.org/net/sword/3.0/terms/formattedMetadata

A resource which makes this object’s metadata available, serialised in the specified metadata format on HTTP GET. This is not a resource which has been deposited or derived (though it may be very similar to the originally deposited metadata), it is one which the server makes available as a service to the client. Metadata documents may be pre-built or assembled on the fly - that responsibility rests with the server.

The relevant properties of the link section for any resource with this rel are

metadataFormat

http://purl.org/net/sword/3.0/terms/byReferenceDeposit

A file which is currently being downloaded from an external reference. Often will also have the rel for originalDeposit, and once all segments have been uploaded the byReferenceDeposit rel can be removed.

The relevant properties of the link section for any resource with this rel are

byReference
status

http://purl.org/net/sword/3.0/terms/fileSetFile

A File which can be considered by the client to be part of the FileSet. Files in this state are available for modification via the SWORD protocol, and should be considered to form the actual "content" of the Object.

9.6.2. Required SWORD State Information

state/@id MUST contain one of:

http://purl.org/net/sword/3.0/state/accepted: for records accepted for processing but not yet created
http://purl.org/net/sword/3.0/state/inProgress: for records that have been deposited, but for which the deposit has not yet completed
http://purl.org/net/sword/3.0/state/inWorkflow: for records that are in the server’s ingest workflow
http://purl.org/net/sword/3.0/state/ingested: for records that are in the server’s archive state, whatever that might mean (e.g. published to the web)
http://purl.org/net/sword/3.0/state/rejected: for records that have been rejected from the server’s workflow
http://purl.org/net/sword/3.0/state/deleted: for tombstone records

The state field is a list, so it may also contain other states that are server-specific in addition to the SWORD values.

9.6.3. Ingest Statuses for Individual Files

Some files, when deposited, may be processed asynchronously to the client’s request; examples include large files that require unpacking and by-reference deposits. In these cases, the client will not receive feedback on the state or success of its deposit in the request/response exchange. Instead, the client may monitor the files via the Status document, where for each appropriate file (Original Deposits) a “status” field will provide information on the current status of processing for that file.

The following statuses are permitted. Servers SHOULD provide one of these for each relevant file:

http://purl.org/net/sword/3.0/filestate/pending: the server has not yet started to process this file. It may be in a queue, or it may still be in the process of deposit via a Segmented Upload.
http://purl.org/net/sword/3.0/filestate/downloading: the server has started to download your By-Reference file, and is not yet complete
http://purl.org/net/sword/3.0/filestate/unpacking: the server has started unpacking your Packaged Content, and is not yet finished
http://purl.org/net/sword/3.0/filestate/error: there was an error either downloading or unpacking your file; information should be available in the “log” field to aid the client in understanding what went wrong.
http://purl.org/net/sword/3.0/filestate/ingested: the file has been successfully ingested
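
As an illustration of how a client might read these statuses, the following is a minimal Python sketch, assuming a Status document that has already been parsed into a dictionary. The helper names `file_statuses` and `still_processing` are hypothetical and are not defined by this specification.

```python
# Sketch only: helper names are hypothetical, not part of the SWORD specification.
FILESTATE = "http://purl.org/net/sword/3.0/filestate/"
ORIGINAL_DEPOSIT = "http://purl.org/net/sword/3.0/terms/originalDeposit"


def file_statuses(status_doc):
    """Map the URL of each Original Deposit to its ingest status URI."""
    result = {}
    for link in status_doc.get("links", []):
        if ORIGINAL_DEPOSIT not in link.get("rel", []):
            continue
        # A missing status is read as "ingested", as required for links[].status.
        result[link["@id"]] = link.get("status", FILESTATE + "ingested")
    return result


def still_processing(status_doc):
    """Return the URLs of files whose processing has not yet finished."""
    unfinished = {FILESTATE + s for s in ("pending", "downloading", "unpacking")}
    return [url for url, st in file_statuses(status_doc).items() if st in unfinished]


example = {
    "links": [
        {"@id": "http://example.com/file/1", "rel": [ORIGINAL_DEPOSIT],
         "status": FILESTATE + "unpacking"},
        {"@id": "http://example.com/file/2", "rel": [ORIGINAL_DEPOSIT]},
    ]
}
print(still_processing(example))
```

A client polling the Status document could repeat this check until the returned list is empty, and then look for any file in the error status.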

9.7. Segmented File Upload Document

A client may request information on an ongoing Segmented File Upload at any point via a GET to the Temporary-URL.

The full JSON Schema [JSON-SCHEMA] can be downloaded here.

An example of the Segmented File Upload Document:

{
    "@context": "https://swordapp.github.io/swordv3/swordv3.jsonld",
    "@id": "http://example.com/temporary/1",
    "@type": "Temporary",

    "received": [
        1,
        2,
        4
    ],
    "expecting": [
        3,
        5
    ],
    "assembledSize": 10000000,
    "segmentSize": 2000000
}

The fields available are defined as follows:

Field	Type	Description
@context	string	The JSON-LD Context for this document MUST be present.
@id	string	The Temporary-URL for this document MUST be present
@type	string	JSON-LD identifier for the document type. This field is used to define the type of the document, and in this case should always be 'Temporary'. MUST be present.
assembledSize	integer	The expected size in bytes of the final resulting assembled file. MUST be present.
expecting	array	The list of integers identifying the segments which are expected and that have not yet been deposited. MUST be present if there are any segments remaining to be uploaded
received	array	The list of integers identifying the segments that have been successfully uploaded so far. MUST be present if one or more segments have been uploaded
segmentSize	integer	The expected size in bytes of the segments (except the final one) that will be uploaded. MUST be present.
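
A client could use this document to decide which segments to send next. The following Python sketch assumes the document has been parsed into a dictionary; the function names are hypothetical, and deriving the segment count from the two size fields is an assumption that holds only when all segments except the last are the same size.

```python
import math


def expected_segment_count(doc):
    """Number of segments implied by assembledSize and segmentSize (an assumption)."""
    return math.ceil(doc["assembledSize"] / doc["segmentSize"])


def outstanding_segments(doc):
    """Segments still to upload; absent lists are treated as empty."""
    received = set(doc.get("received", []))
    expecting = doc.get("expecting", [])
    return sorted(n for n in expecting if n not in received)


# The example document from this section, reduced to the relevant fields.
doc = {
    "received": [1, 2, 4],
    "expecting": [3, 5],
    "assembledSize": 10000000,
    "segmentSize": 2000000,
}
print(expected_segment_count(doc), outstanding_segments(doc))
```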

9.8. Error Document

An error document is returned at any point that a synchronous operation fails.

The full JSON Schema [JSON-SCHEMA] can be downloaded here.

An example of the Error Document:

{
  "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",

  "@type" : "BadRequest",

  "timestamp" : "[timestamp]",
  "error" : "error summary",
  "log" : "text log of any debug information for the client"
}

The fields available are defined as follows:

Field	Type	Description
@context	string	The JSON-LD Context for this document MUST be present
@type	string	JSON-LD identifier for the document type. This field is used to define the type of the document, and in this case should be one of the allowed Error Document types. MUST be present.
error	string	A short summary/title for the error MUST be present
log	string	Some detail as to the error, with any information that might help resolve it. SHOULD be present
timestamp	string	When the error occurred. MUST be formatted as UTC big-endian date as per [NOTE-datetime]. MUST be present

See Error Types for details of what errors can be reported in the @type field.
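
As a rough illustration, a server might assemble an Error Document as in the Python sketch below. The function name `make_error_document` is hypothetical, and the timestamp layout (`YYYY-MM-DDThh:mm:ssZ`) is an assumption about which [NOTE-datetime] form an implementation would choose.

```python
from datetime import datetime, timezone


def make_error_document(error_type, summary, log=None, when=None):
    """Build an Error Document as a dictionary ready for JSON serialisation."""
    when = when or datetime.now(timezone.utc)
    doc = {
        "@context": "https://swordapp.github.io/swordv3/swordv3.jsonld",
        "@type": error_type,
        # Assumed layout: complete date plus time in UTC with a trailing "Z".
        "timestamp": when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ"),
        "error": summary,
    }
    if log is not None:
        # "log" is only a SHOULD, so it is left out when there is nothing to say.
        doc["log"] = log
    return doc


example_error = make_error_document(
    "BadRequest",
    "error summary",
    log="text log of any debug information for the client",
    when=datetime(2021, 9, 1, 9, 54, 0, tzinfo=timezone.utc),
)
```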

10. Authentication and Authorisation

It is strongly RECOMMENDED that SWORD servers support authentication and authorisation for requests.

SWORD servers are not restricted in the forms of authentication that they employ, and there is no minimum requirement or default supported approach.

10.1. Announcing Support for Authentication Schemes

Servers SHOULD enumerate the authentication schemes that they support in the Service Document, in the field authentication, and MUST draw from the IANA registry of HTTP auth scheme names [IANA Auth] where one is available.

Where an authentication scheme is in use by the server which is not covered by the IANA registry, such as a custom API-token-based approach, the server MAY indicate this in whatever way seems most appropriate.

For example, a Server which supports Basic, Digest and OAuth authentication, as well as a custom API-Key approach could indicate as follows:

{
  "authentication": [
    "Basic",
    "OAuth",
    "Digest",
    "APIKey"
  ]
}

Servers MAY also choose to support On-Behalf-Of deposit, which means that the authenticating user is providing content to the server, as if another user were actually carrying out this request. A use case for this would be when a known third-party deposit tool is sending content to a server and has been authorised by another user to add content on their behalf.

If a server supports On-Behalf-Of deposit, it SHOULD indicate this in the Service Document with the field onBehalfOf set to true. If this field is not present clients MUST assume that the server does not support On-Behalf-Of deposit.

{
  "onBehalfOf": true
}

10.2. Authentication and Authorisation in requests

When carrying out authenticated requests, Authorization headers MUST be sent with every request to the server - the server is not responsible for maintaining state for the client. The server is responsible for authenticating and authorising every request individually. Clients may choose also to send Cookie headers, and servers may support these, but support for Cookies is explicitly outside this specification.

When an On-Behalf-Of deposit is received, the server MUST ensure that the user identified in that header is valid with respect to the associated Authorization header. For example, when using OAuth2, the On-Behalf-Of user MUST match the user for which the token in the Authorization header was granted.

10.3. Authentication and Authorisation responses

There are two possible error responses to a request from the perspective of authentication:

If the request does not supply any credentials, and the server is expecting to authenticate requests, then a 401 (AuthenticationRequired) response MUST be returned.
If the request contains credentials and the server is unable to authenticate the client based on those credentials, then a 403 (AuthenticationFailed) response MUST be returned.
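
The two cases above can be sketched as a small decision function. This is an illustrative Python sketch for a server that requires authentication; the function name and its boolean inputs are hypothetical.

```python
def auth_error(credentials_supplied, authenticated):
    """Return (HTTP status, SWORD error type) for a failed check, or None if it passed.

    Assumes the server is expecting to authenticate the request.
    """
    if not credentials_supplied:
        return 401, "AuthenticationRequired"
    if not authenticated:
        return 403, "AuthenticationFailed"
    return None
```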

10.4. Recording Depositing Users

In all cases (On-Behalf-Of or not) where a user has authenticated to make a deposit, servers SHOULD preserve the user's identity in the depositedBy property of the Original Deposit in the Status document. In On-Behalf-Of deposit, the value given in the On-Behalf-Of header SHOULD be used for the value of the depositedOnBehalfOf property of the Original Deposit in the Status document.

Note that recording a user's identity in this way does not have to contain enough information for the client to directly identify the user, and implementers should take note of privacy legislation when choosing what information to expose in these fields.

11. Transport Security

It is strongly RECOMMENDED that servers implement modern transport layer security, whether authenticating requests or not. If you are carrying out authenticated protocol operations you MUST implement TLS.

12. Error Types

The following are the error types that are available (to place in @type in the Error Document), their associated HTTP Status Code, and the legitimate reasons for returning that error:

Error Type	Error Code	HTTP Name	Reason
AuthenticationFailed	403	Forbidden	The request supplied invalid credentials
AuthenticationRequired	401	Unauthorized	The request supplied no credentials, when the server was expecting to authenticate the request.
BadRequest	400	BadRequest	The request did not meet the standard specified by the SWORD protocol. This error can be used when no other error is appropriate
ByReferenceFileSizeExceeded	400	BadRequest	The client supplied a By-Reference deposit file, which specified a file size which exceeded the server's limit
ByReferenceNotAllowed	412	PreconditionFailed	The client attempted to carry out a By-Reference deposit on a server which does not support it
ContentMalformed	400	BadRequest	The body content of the request was malformed in some way, such that the server cannot read it correctly.
ContentTypeNotAcceptable	415	UnsupportedMediaType	The `Content-Type` header specifies a content type of the request which is in a format that the server cannot accept.
DigestMismatch	412	PreconditionFailed	One or more of the Digests that the server checked did not match the deposited content
ETagNotMatched	412	PreconditionFailed	The client supplied an `If-Match` header which did not match the current `ETag` for the resource being updated.
ETagRequired	412	PreconditionFailed	The client did not supply an `If-Match` header, when one was required by the server
Forbidden	403	Forbidden	The client requested an operation that is not permitted by the server in this context.
FormatHeaderMismatch	415	UnsupportedMediaType	The `Metadata-Format` or `Packaging` header does not match what the server found when looking at the Metadata or Packaged Content supplied in a request.
InvalidSegmentSize	400	BadRequest	The client sent a segment that was not the final segment, and was not the size that it indicated segments would be, or during segmented upload initialisation, the client specified a segment size which was not between `minSegmentSize` and `maxSegmentSize`.
MaxAssembledSizeExceeded	400	BadRequest	During a segmented upload initialisation, the client specified a total file size which is larger than the maximum assembled file size supported by the server
MaxUploadSizeExceeded	413	PayloadTooLarge	The request supplied body content which is larger than that supported by the server.
MetadataFormatNotAcceptable	415	UnsupportedMediaType	The `Metadata-Format` header specifies a metadata format for the request which is in a format that the server cannot accept
MethodNotAllowed	405	MethodNotAllowed	The request is for a method on a resource that is not permitted. This may be permanent, temporary, and may depend on the client’s credentials
OnBehalfOfNotAllowed	412	PreconditionFailed	The request contained an `On-Behalf-Of` header, although the server indicates that it does not support this.
PackagingFormatNotAcceptable	415	UnsupportedMediaType	The `Packaging` header specifies a packaging format for the request which is in a format that the server cannot accept
SegmentedUploadTimedOut	410	Gone	The client's segmented upload URL has timed out. Servers MAY also respond to this with a 404 and no explanation.
SegmentLimitExceeded	400	BadRequest	During a segmented upload initialisation, the client specified a total number of intended segments which is larger than the limit specified by the server
UnexpectedSegment	400	BadRequest	The client sent a segment that the server was not expecting; in particular the server may have received all the segments it was expecting, and this is an extra one

13. Content Disposition

SWORD uses the Content-Disposition header in client requests to indicate to the server information about the payload being delivered. Traditionally Content-Disposition is an HTTP response header, but it makes sense in the PUSH context of SWORD to use this as a request header. We follow [RFC6266] for its usage.

Implementers should also note [RFC5987] if sending filenames which require characters outside the ISO-8859-1 character set.

The general format of a Content-Disposition header is as follows:

Content-Disposition: [disposition type]; [disposition param]=[value]; ...

The rules below define how to generate the correct Content-Disposition for a given set of Request Conditions. If you are implementing a SWORD client or server it is STRONGLY RECOMMENDED that you work from the SWORDv3 Behaviours Document, as this lays out the Content-Disposition requirements per-request, rather than in the form of the normalised requirements below.

There are three general deposit operations in SWORD:

A direct upload of some content, which may be Metadata, a By-Reference document, or a Binary File (which may itself be Packaged Content)
A Segmented Upload Initialisation
A File Segment for a Segmented Upload

Each of these has a different Content-Disposition, which makes it clear to the server what it should do with that content.

There are two aspects which control what the Content-Disposition should be:

The Upload Type
The Content

The requirements below define what Disposition Type and Parameters are required for each kind of request. The requirements should be interpreted according to the following hierarchy for each of the above aspects:

The hierarchy for the Upload Type is:

All
- Direct Deposit
- Segmented Upload Initialisation
- File Segment Upload

The hierarchy for the Content is:

All
- Body
  - JSON
    - Metadata
    - By-Reference
    - MD+BR
  - Binary
    - Binary File
    - Packaged Content
    - File Segment
- Empty Body

So, for example, if delivering a Metadata+By-Reference Document (MD+BR) as a Direct Deposit, you would take into account the following requirements:

With an Upload Type of Direct Deposit: Direct Deposit and All
With a Content type of MD+BR: MD+BR, JSON, Body and All
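
The lookup in this example amounts to walking from a category up to the root of its hierarchy. A minimal Python sketch follows, with the two hierarchies transcribed from the lists above; the names `CONTENT_PARENT`, `UPLOAD_PARENT` and `applicable` are hypothetical.

```python
# Each entry maps a category to its parent in the hierarchy.
UPLOAD_PARENT = {
    "Direct Deposit": "All",
    "Segmented Upload Initialisation": "All",
    "File Segment Upload": "All",
}
CONTENT_PARENT = {
    "Body": "All", "Empty Body": "All",
    "JSON": "Body", "Binary": "Body",
    "Metadata": "JSON", "By-Reference": "JSON", "MD+BR": "JSON",
    "Binary File": "Binary", "Packaged Content": "Binary", "File Segment": "Binary",
}


def applicable(category, parents):
    """List the category and all of its ancestors, most specific first."""
    chain = [category]
    while chain[-1] in parents:
        chain.append(parents[chain[-1]])
    return chain


print(applicable("Direct Deposit", UPLOAD_PARENT))
print(applicable("MD+BR", CONTENT_PARENT))
```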

The requirements are:

Request Conditions:

Upload Type: Direct Deposit
Content: All

Disposition Type

attachment

Request Conditions:

Upload Type: Direct Deposit
Content: Metadata

Param

metadata=true - Indicates that the body content of the request contains Metadata. A Direct Deposit containing Metadata MUST contain this parameter.

Request Conditions:

Upload Type: Direct Deposit
Content: By-Reference

Param

by-reference=true - Indicates that the body content of the request contains By-Reference files. A Direct Deposit containing By-Reference files MUST contain this parameter.

Request Conditions:

Upload Type: Direct Deposit
Content: MD+BR

Param

metadata=true - Indicates that the body content of the request contains Metadata. A Direct Deposit containing Metadata MUST contain this parameter.
by-reference=true - Indicates that the body content of the request contains By-Reference files. A Direct Deposit containing By-Reference files MUST contain this parameter.

Request Conditions:

Upload Type: Direct Deposit
Content: Binary File

Param

filename=[filename] - Indicates the intended filename of the deposited file. MAY be present, and if present the server SHOULD respect it, unless this is an update to an existing file, in which case the server MAY ignore this parameter. If using a character set outside of ISO-8859-1, you MUST use filename* instead.

Request Conditions:

Upload Type: Direct Deposit
Content: Packaged Content

Param

filename=[filename] - Indicates the intended filename of the deposited file. MAY be present, and if present the server SHOULD respect it, unless this is an update to an existing file, in which case the server MAY ignore this parameter. If using a character set outside of ISO-8859-1, you MUST use filename* instead.

Request Conditions:

Upload Type: Segmented Upload Initialisation
Content: Empty Body

Disposition Type

segment-init

Param

size=[bytes] - The total size of the final file. This MUST be sent so that the server can determine when all the bytes of the file have been uploaded.
digest=[digest] - The Digest information for the resulting file as a whole, after assembly. This MUST be present, and MUST be in the same form as if it were the HTTP header you would use if depositing this file as a whole.
segment_count=[n] - The total number of segments that will be sent to the Temporary-URL. This MUST be present. Later, any segment uploads with segment_number greater than this number MUST be rejected by the server.
segment_size=[bytes] - The size of each segment (except the final segment) that the client will be sending. This MUST be present. If a non-final segment is sent with a different size, this MUST be rejected by the server.

Request Conditions:

Upload Type: File Segment Upload
Content: File Segment

Disposition Type

segment

Param

segment_number=[n] - The position in the full sequence of this segment. This MUST be present. It MUST be an integer, and MUST start counting at 1. The full list of segment numbers MUST be a sequential list of integers.

The following examples show a number of key cases:

A Metadata Deposit

Content-Disposition: attachment; metadata=true

A By-Reference Deposit

Content-Disposition: attachment; by-reference=true

A Metadata+By-Reference Deposit

Content-Disposition: attachment; metadata=true; by-reference=true

A Binary File Deposit

Content-Disposition: attachment; filename=[filename]

A Segmented Upload Initialisation

Content-Disposition: segment-init; size=[bytes]; digest=[digest]; segment_count=[n]; segment_size=[bytes]

A File Segment Upload

Content-Disposition: segment; segment_number=[n]
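
The header values shown in these examples could be generated with helpers along the following lines. This Python sketch uses hypothetical function names and assumes an ASCII filename that needs no quoting or filename* encoding; a real client would need to handle those cases as per [RFC6266] and [RFC5987].

```python
def direct_deposit(metadata=False, by_reference=False, filename=None):
    """Content-Disposition value for a Direct Deposit."""
    parts = ["attachment"]
    if metadata:
        parts.append("metadata=true")
    if by_reference:
        parts.append("by-reference=true")
    if filename is not None:
        # Assumption: the filename is plain ASCII with no characters needing quoting.
        parts.append(f"filename={filename}")
    return "; ".join(parts)


def segment_init(size, digest, segment_count, segment_size):
    """Content-Disposition value for a Segmented Upload Initialisation."""
    return (f"segment-init; size={size}; digest={digest}; "
            f"segment_count={segment_count}; segment_size={segment_size}")


def segment(segment_number):
    """Content-Disposition value for a File Segment Upload."""
    if segment_number < 1:
        raise ValueError("segment_number starts counting at 1")
    return f"segment; segment_number={segment_number}"
```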

14. Content Digests

In order to ensure that the content transmitted via SWORD is correct when it arrives at its destination, clients MUST provide Digests that servers MUST check against incoming content.

14.1. Announcing Support For Digests

Servers can announce support for the Digest formats that they support in the Service Document as follows:

{
  "digest": [
    "SHA-256",
    "SHA",
    "MD5"
  ]
}

The Server SHOULD list all the digest formats that it supports. Servers MUST support at least SHA-256 and MAY support any other digest formats.

The Digest formats MUST be identified as per the IANA HTTP Digest Algorithm values: [IANA Digest]

14.2. Transmitting Digests

SWORD uses the recommendations of [RFC3230] for transmitting base64 encoded Digests of request bodies.

For every request where there is a request body, the client MUST attach the Digest header with the appropriate content:

Digest: SHA-256=MzA1ZmIzMDJiZjA4MzUzYTg5ZGY4NDIxMjcyY2JmZTEwNzM5ODdmMjJhY2Y1ZDc5NzFhOTY3MmM1MGNkN2ZlMA==

Note that the client MAY send multiple digests from different algorithms, separated by commas in the header:

Digest: SHA-256=MzA1ZmIzMDJiZjA4MzUzYTg5ZGY4NDIxMjcyY2JmZTEwNzM5ODdmMjJhY2Y1ZDc5NzFhOTY3MmM1MGNkN2ZlMA==, MD5=ZjQxNjA3N2M3MDdhODJkZGJlMGE0YTk2NGRjZWEyNWE=

The server MUST validate at least one digest and SHOULD validate all of them; where it validates only some, it MAY choose its preferred format to validate against.
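
A client could compute the Digest header for a request body roughly as in the Python sketch below. The function name `digest_header` is hypothetical. The sketch assumes the raw digest bytes are base64 encoded, following one reading of [RFC3230]; implementers should confirm the encoding a particular server expects before relying on it.

```python
import base64
import hashlib

# Maps the IANA digest algorithm names used above to hashlib algorithm names.
HASHLIB_NAMES = {"SHA-256": "sha256", "SHA": "sha1", "MD5": "md5"}


def digest_header(body, algorithms=("SHA-256",)):
    """Build a Digest header value for the given request body bytes."""
    parts = []
    for alg in algorithms:
        raw = hashlib.new(HASHLIB_NAMES[alg], body).digest()
        # Assumption: base64 of the raw digest bytes, not of the hex string.
        parts.append(f"{alg}={base64.b64encode(raw).decode('ascii')}")
    return ", ".join(parts)


headers = {"Digest": digest_header(b"example body", ("SHA-256", "MD5"))}
```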

15. Concurrency Control

Servers MAY choose to implement concurrency control, in order to ensure that clients do not accidentally overwrite or make changes that conflict with other changes which may have happened to the Object since it was first deposited. Note that this does not prevent clients causing damage to Objects, only that it cannot be so easily done by accident.

Objects may change for a number of reasons after their initial creation, such as:

Additional requests by the original depositing client to modify the Object
Requests to modify the Object from other clients with authorisation to do so
Modifications to the Object from agents on the server-side, such as administrators, etc.

In order to provide concurrency control, SWORD follows [RFC7232], and specifically uses the ETag and If-Match headers.

On each request for a resource, or when the Status document is retrieved, the ETag for the resource MUST be returned. The ETag gives the client an opaque identifier for the current version of that resource. When the resource is being updated by the client (for example, it is replacing a File), the ETag that the client expects to be the current one MUST be sent in the If-Match header. The server MUST then compare that with its actual current ETag for the resource. If they match, the request can go ahead, otherwise the Server MUST respond with an error (412).

Note that ETags, and Concurrency Control in general, are only applicable from the Object downwards. There are no requirements for use of ETag or If-Match headers on Service-URLs.

15.1. Announcing Support for Concurrency Control

The server does not have to announce support for concurrency control in the Service Document. Clients MUST check response headers for the presence of an ETag. Presence of the ETag indicates that the server requires the client to pay attention to its concurrency control procedures, and to carry out later requests with an If-Match header.

If supporting concurrency control, Servers MUST provide an ETag on all responses to requests (GET, POST, PUT) against resources from the Object and below.

15.2. Procedures around Concurrency Control

If a server supports Concurrency Control, it MUST behave in accordance with the following rules.

An ETag MUST be provided for each SWORD resource: the Object, the Metadata, the FileSet and any Files.
The ETag is a resource-level version identifier; it MUST be the same for all expressions of the resource. For example, all serialised Metadata documents (such as in JSON, or in XML) MUST have the same ETag as the Metadata resource, and as each other.
The client MUST send the ETag that it expects to represent the current version with every request to change the resource (POST, PUT, DELETE) by placing it in the If-Match header
If the ETag supplied by the client in the If-Match header does not match the current ETag for the resource, the Server MUST respond with a 412 (Precondition Failed) error
If the ETag supplied by the client in the If-Match header does match the current ETag for the resource, the request MUST go ahead as normal.
The server MUST include the ETag in the HTTP headers of the response to every GET request for a resource.
The server MUST include the ETags for the resources in the appropriate places in the Status document.
If a resource is modified, its ETag MUST change
If a resource is modified, the ETags of all resources within which it is contained MUST change.
The server MAY choose between strong and weak ETags, at its discretion
The server is not required to track previous ETag values for a resource.

15.3. Resource Hierarchy for ETag Regeneration

If the ETag of a resource changes, the ETags of the resources above it (up to the level of the Object) MUST also change. This is to prevent a change at a higher level (e.g. an Object replacement) overwriting a change at a lower level (e.g. addition of a single file).

The Object hierarchy is as follows:

Object
- Metadata
- FileSet
  - File

So, for example, if the Metadata is updated, then the Metadata and Object ETags MUST change, but the FileSet and File ETags need not. Similarly, if a File ETag changes, then the FileSet and Object ETags MUST also change, while the Metadata ETag need not.
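
The regeneration rule and the If-Match comparison can be sketched as follows. This Python sketch is a simplification under stated assumptions: it models a single File, uses random UUIDs as opaque ETag values, and compares ETags as plain strings without handling weak validators. All names in it are hypothetical.

```python
import uuid

# Parent of each resource level in the Object hierarchy.
PARENT = {"File": "FileSet", "FileSet": "Object", "Metadata": "Object"}


def regenerate_etags(etags, modified):
    """Return a copy of etags with new values for `modified` and every level above it."""
    updated = dict(etags)
    level = modified
    while level is not None:
        updated[level] = uuid.uuid4().hex
        level = PARENT.get(level)
    return updated


def check_if_match(current_etag, if_match):
    """Return (HTTP status, SWORD error type) if the update must be refused, else None."""
    if if_match is None:
        return 412, "ETagRequired"
    if if_match != current_etag:
        return 412, "ETagNotMatched"
    return None


before = {"Object": "o1", "Metadata": "m1", "FileSet": "s1", "File": "f1"}
after_metadata_update = regenerate_etags(before, "Metadata")
```

Here `after_metadata_update` carries new values for Metadata and Object, while the FileSet and File entries are left as they were.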

16. Continued Deposit

Some systems may wish to give the client more control over the ingest process, and SWORD uses the In-Progress HTTP header to allow the client to indicate that a deposit should not yet be injected into any post-submission or pre-ingest workflow. The In-Progress header MUST take the value true or false, and if it is not present the server MUST assume that it is false and behave as described below.

An example use case for this is that the client may be embedded into a system which uses the SWORD server as a storage layer, but which cannot acquire all of the content for a "finished" item in one deposit operation. Consider a user-facing system which encourages users to upload files one at a time through some web interface, which causes each file to be directly deposited onto the SWORD server. At the start of the deposit the client asserts that deposit is In-Progress: true, and then proceeds to upload files. If uploading them to the Object-URL the client continues to assert In-Progress: true on each request (if depositing to other URLs this is not necessary). This goes on until the user confirms that they have uploaded all the relevant files, or navigates away from the page. At that stage, the client can issue a blank HTTP POST request to the SWORD server, with In-Progress: false to complete the deposit.

Note that the In-Progress header is intended to indicate to the server that further content will be coming in which is associated with the existing content, before it can be considered "complete". It is not intended to provide workflow control, and clients MUST NOT assume that asserting In-Progress: true will have any specific effect on the state of the item.

16.1. Deposit Complete

If In-Progress is false, the server MAY assume that it can carry on processing the deposit as it sees fit.

16.2. Deposit Incomplete

If In-Progress is true, the server SHOULD expect the client to provide further updates to the item at some undetermined time in the future. Details of how this is implemented are dependent on the server's purpose. For example, a repository system may hold items which are marked In-Progress in a workspace until such time as a client request indicates that the deposit is complete.

16.3. Completing a Previously Incomplete Deposit

The client can assert that a deposit process has completed by issuing an HTTP POST to the Object-URL with a blank request body and with the In-Progress header set to false (it may simply omit the header altogether too, as this is treated as In-Progress: false by the server). The client MAY specify a Content-Length: 0 HTTP header, and MUST NOT include any body content.

The client MAY provide an In-Progress header with a value of false
The client MAY provide an On-Behalf-Of header

Once the server has processed the request it MUST respond with status code 204 (No Content), or a suitable error.
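
A completion request of this kind could be described by a client as in the Python sketch below. It only builds a description of the request rather than sending it; the function names are hypothetical, and the Authorization header that an authenticated request would also carry is omitted for brevity.

```python
def completion_request(object_url, on_behalf_of=None):
    """Describe the empty POST that marks a deposit as complete."""
    headers = {"In-Progress": "false", "Content-Length": "0"}
    if on_behalf_of is not None:
        headers["On-Behalf-Of"] = on_behalf_of
    # The body is empty: no content may be sent with this request.
    return {"method": "POST", "url": object_url, "headers": headers, "body": b""}


def completion_succeeded(status_code):
    """A 204 (No Content) response indicates the server processed the request."""
    return status_code == 204


request = completion_request("http://example.com/object/1")
```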

17. Segmented File Upload

If a client has a very large file that it wishes to transfer to the server by value, then it may be beneficial to do this in several small operations, rather than as a single large operation. Large uploads are at higher risk of failure, depending on a variety of factors, and there is no guarantee that a SWORD server will be able to resume a partial upload.

In order to transfer a large file, the client can break it down into a number of equally sized segments of binary data (the final segment may be a different size to the rest). It can then initialise a Segmented File Upload with the server, and then transfer the segments. The server will reconstitute these segments into a single file, and then the client may deposit this file by-reference.

Segments can be uploaded in any order, and can be uploaded one at a time or in parallel.

17.1. Announcing Support for Segmented File Upload

Servers MAY support Segmented File Upload. To do so, the server must provide a staging area where file segments can be uploaded prior to the client requesting a specific deposit operation. The server MUST include a staging field in the Service Document with a URL for where the client can initialise its Segmented File Upload. It SHOULD also specify how long it will retain an unfinished Segmented File Upload, before assuming that the client will not complete it, with the stagingMaxIdle field. In addition, the server SHOULD specify the size parameters of the segments using maxSegmentSize, minSegmentSize, maxAssembledSize and maxSegments:

{
  "maxAssembledSize": 30000000000000,
  "maxSegmentSize": 16777216000,
  "maxSegments": 1000,
  "minSegmentSize": 1,
  "staging": "http://example.com/staging",
  "stagingMaxIdle": 3600
}
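
A client could use these fields to choose a segment size before initialising an upload. The following is a sketch only: the function names are hypothetical, the limits are treated as inclusive (an assumption), and maxUploadSize is used as the fallback when maxSegmentSize is absent.

```python
from typing import Tuple


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division, exact even for very large file sizes."""
    return -(-a // b)


def plan_segments(file_size: int, service_doc: dict, preferred_size: int) -> Tuple[int, int]:
    """Return (segment_size, segment_count) that fit the advertised limits.

    Raises ValueError when the file cannot be uploaded within those limits.
    """
    max_assembled = service_doc.get("maxAssembledSize")
    if max_assembled is not None and file_size > max_assembled:
        raise ValueError("file is larger than maxAssembledSize")

    # Clamp the preferred size into the advertised segment size range.
    size = max(preferred_size, service_doc.get("minSegmentSize", 1))
    max_segment = service_doc.get("maxSegmentSize", service_doc.get("maxUploadSize"))
    if max_segment is not None:
        size = min(size, max_segment)

    # If that needs too many segments, grow the segment size to compensate.
    max_segments = service_doc.get("maxSegments")
    if max_segments is not None and ceil_div(file_size, size) > max_segments:
        size = ceil_div(file_size, max_segments)
        if max_segment is not None and size > max_segment:
            raise ValueError("file cannot be split within maxSegments and maxSegmentSize")

    return size, ceil_div(file_size, size)


service_doc = {
    "maxAssembledSize": 30000000000000,
    "maxSegmentSize": 16777216000,
    "maxSegments": 1000,
    "minSegmentSize": 1,
}
plan = plan_segments(10_000_000_000, service_doc, preferred_size=8 * 1024 * 1024)
```

With the limits above, a 10 GB file cannot use 8 MiB segments without exceeding maxSegments, so the sketch raises the segment size until the count fits.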

17.2. Outline of Process for Segmented File Upload

Obtain the Staging-URL from the Service, from which to request a Temporary-URL
If the client is creating a new Object, the Staging-URL can be found in the staging field in the Service Document. If an Object already exists, the client should find the Service-URL from the service field in the Status Document, then GET this URL to obtain the appropriate Service Document, and subsequently get the Staging-URL from the staging field.

Request a Temporary-URL from the Service, via a Segmented Upload Initialisation request
Send a POST request to the Staging-URL, as per POST Staging-URL, with the appropriate Content-Disposition (see below). The server will respond with a Temporary-URL in the Location header.

Upload all the file segments to the Temporary-URL
Send one or more POST requests to the Temporary-URL as per POST Temporary-URL, with the appropriate Content-Disposition (see below), until all file segments have been uploaded.

Carry out the desired deposit operation as a By-Reference deposit, using the Temporary-URL as the by-reference file
Refer to the section By-Reference Deposit for more information on this approach. Deposits of content hosted at Temporary-URLs SHOULD NOT contain the ttl or dereference fields in the By-Reference Document, and if they are included, the server MUST ignore them.

17.3. Segmented Upload Initialisation

Before sending any segments to the server, the client must initialise the process. This is done by sending a POST request to the Staging-URL as per POST Staging-URL.

The requirements of the protocol for a Segmented Upload Initialisation are:

Protocol Operation

POST Staging-URL

Request Requirements

MAY specify Authorization and On-Behalf-Of headers (i.e. if authenticating this request)
MUST provide the Content-Disposition header, with the appropriate value for the request
MAY provide the Content-Length header with value 0
MUST NOT include any body content

Server Requirements

If Authorization (and optionally On-Behalf-Of) headers are provided, MUST authenticate the request
If all preconditions are met, MUST create a resource to which the client can upload file segments
MUST reject the request if the conditions of the upload are not acceptable

Response Requirements

MUST respond with a 201 to indicate that the Segmented Upload has been initialised, or raise an error.
MUST respond with a Location header containing the Temporary-URL where the client can upload file segments

See the section Content Disposition for detailed information on the Content-Disposition header. Based on that section, the supplied Content-Disposition would be:

Content-Disposition: segment-init; size=[bytes]; digest=[digest]; segment_count=[n]; segment_size=[bytes]

The server MAY choose to reject the Segmented Upload Initialisation request at this stage, for a variety of reasons - for example, it may have a limit on the total number of segments it will accept, or the total size may exceed a maximum file size for assembled files. In these cases, the server MUST respond with one of the appropriate Error Types.

If the request is successful, the server will respond with a Temporary-URL in the Location header, and the segments themselves can be uploaded to that URL.
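
The Content-Disposition value for initialisation could be assembled as in the sketch below. The function name is hypothetical, and the digest form used here (algorithm name, "=", hex-encoded value, mirroring the Digest example elsewhere in this specification) is an assumption; the section Content Disposition is the authority on the exact format. The file is held in memory for brevity, whereas a real client would hash a large file in chunks.

```python
import hashlib


def segment_init_disposition(data: bytes, segment_size: int) -> str:
    """Build the Content-Disposition value for a Segmented Upload Initialisation."""
    size = len(data)
    # Ceiling division: the final segment may be smaller than the rest.
    segment_count = -(-size // segment_size)
    # Assumption: digest of the whole file, hex-encoded with an algorithm prefix.
    digest = "SHA-256=" + hashlib.sha256(data).hexdigest()
    return (
        f"segment-init; size={size}; digest={digest}; "
        f"segment_count={segment_count}; segment_size={segment_size}"
    )


header = segment_init_disposition(b"a" * 10, segment_size=4)
```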

17.4. Uploading File Segments

Segments may be uploaded in any order and may also be parallelised. Segments MUST all be the same size, with the exception of the final segment, which MUST be the same size as or smaller than the other segments. Segment size MUST be smaller than the maxSegmentSize if specified, and if not then smaller than the maxUploadSize specified in the Service Document. Segments MUST be larger than the minSegmentSize also specified in the Service Document.

The requirements of the protocol for File Segment Upload are:

Protocol Operation

POST Temporary-URL

Request Requirements

MAY specify Authorization and On-Behalf-Of headers (i.e. if authenticating this request)
MUST provide the Content-Disposition header, with the appropriate value for the request
MUST provide the Digest header
SHOULD provide the Content-Length

Server Requirements

If Authorization (and optionally On-Behalf-Of) headers are provided, MUST authenticate the request
MUST verify that the content matches the Digest header
MUST verify that the supplied content matches the Content-Length if this is provided
MUST reject the request if the segment is incorrect or unexpected: for example, all segments were already received, or the segment is a different size than expected.
If all preconditions are met, MUST accept the file segment, and record the receipt of it
MUST be prepared to accept file segments in any order, and in parallel
MUST be able to store the incoming file segments as they arrive, and then reconstitute them into a single file when all segments have been received.

Response Requirements

MUST respond with a 204 or a suitable error

Error Responses

If no authentication credentials were supplied, but were expected, MUST respond with a 401 (AuthenticationRequired)
If authentication fails with supplied credentials, MUST respond with a 403 (AuthenticationFailed)
If the server does not allow this method in this context at this time, MAY respond with a 405 (MethodNotAllowed)
If the server does not support On-Behalf-Of deposit and the On-Behalf-Of header has been provided, MAY respond with a 412 (OnBehalfOfNotAllowed)
If one or more of the digests provided by the client that the server checked did not match, MAY respond with 412 (DigestMismatch). Note that servers MAY NOT check digests in real-time.
If the body content could not be read correctly, MAY return a 400 (ContentMalformed)
If the provided segment is not the final segment and is not the size that the client had indicated on initialisation, MUST return 400 (InvalidSegmentSize)
If the Temporary-URL has expired, SHOULD return 410 (SegmentedUploadTimedOut). Servers may also return 404 and no further explanation.
If the segment was not expected, for example all the expected segments have already been sent, or a segment with this segment number has already been received, MUST return 400 (UnexpectedSegment)

See the section Content Disposition for detailed information on the Content-Disposition header. Based on that section, the supplied Content-Disposition would be:

Content-Disposition: segment; segment_number=[n]

The Content-Type header MUST just be application/octet-stream.

The Digest header MUST contain the Digest for the File Segment itself, so the server can confirm successful transfer of the segment.
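
A client might prepare the individual segment requests as sketched below. The generator name is hypothetical, segment numbering is assumed to start at 1, and the Digest value is hex-encoded to mirror the Digest example elsewhere in this specification; a server may expect a different encoding.

```python
import hashlib
from typing import Dict, Iterator, Tuple


def iter_segment_requests(data: bytes, segment_size: int) -> Iterator[Tuple[Dict[str, str], bytes]]:
    """Yield (headers, body) pairs, one per segment, in segment order."""
    for offset in range(0, len(data), segment_size):
        chunk = data[offset:offset + segment_size]
        number = offset // segment_size + 1  # assumption: numbering starts at 1
        headers = {
            "Content-Disposition": f"segment; segment_number={number}",
            "Content-Type": "application/octet-stream",
            "Content-Length": str(len(chunk)),
            # The digest covers this segment only, not the whole file.
            "Digest": "SHA-256=" + hashlib.sha256(chunk).hexdigest(),
        }
        yield headers, chunk


requests_to_send = list(iter_segment_requests(b"abcdefghij", segment_size=4))
```

Because each pair is self-describing, the resulting requests can be POSTed to the Temporary-URL in any order or in parallel.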

17.5. Retrieving Information about a Segmented File Upload

At any point after creating a Temporary-URL, the client may request information on the state of their Segmented File Upload. This can be done via a GET to the Temporary-URL.

This will return a document as described in Segmented File Upload Document.

The requirements for this operation are:

Protocol Operation

GET Temporary-URL

Request Requirements

MAY specify Authorization and On-Behalf-Of headers (i.e. if authenticating this request)

Server Requirements

If Authorization (and optionally On-Behalf-Of) headers are provided, MUST authenticate the request

Response Requirements

MUST respond with a 200 or a suitable error
If successful, MUST respond with a Segmented File Upload Document describing the current state of the upload.

Error Responses

If no authentication credentials were supplied, but were expected, MUST respond with a 401 (AuthenticationRequired)
If authentication fails with supplied credentials, MUST respond with a 403 (AuthenticationFailed)
If the server does not allow this method in this context at this time, MAY respond with a 405 (MethodNotAllowed)
If the server does not support On-Behalf-Of deposit and the On-Behalf-Of header has been provided, MAY respond with a 412 (OnBehalfOfNotAllowed)

NOTE that you cannot retrieve an actual copy of the full or partially uploaded Segmented File Upload from the Temporary-URL at any point.

17.6. Aborting an Upload

If, part way through a segmented upload (or even after completion), the client wishes to abort, it can send a DELETE request to the Temporary-URL, with the following requirements:

Protocol Operation

DELETE Temporary-URL

Request Requirements

MAY specify Authorization and On-Behalf-Of headers (i.e. if authenticating this request)

Server Requirements

If Authorization (and optionally On-Behalf-Of) headers are provided, MUST authenticate the request

Response Requirements

MUST respond with a 204 if the delete is successful, 202 if the delete is queued for processing, or raise an error

Error Responses

If no authentication credentials were supplied, but were expected, MUST respond with a 401 (AuthenticationRequired)
If authentication fails with supplied credentials, MUST respond with a 403 (AuthenticationFailed)
If the server does not allow this method in this context at this time, MAY respond with a 405 (MethodNotAllowed)
If the server does not support On-Behalf-Of deposit and the On-Behalf-Of header has been provided, MAY respond with a 412 (OnBehalfOfNotAllowed)

If a client submits the Temporary-URL as a By-Reference deposit to the server after completing the upload, the client SHOULD NOT delete the Temporary-URL itself; the server SHOULD take responsibility for this. If the client deletes the resource before the By-Reference deposit has completed, the server SHOULD record an error against the ingest.

17.7. Incomplete Upload Retention

Servers SHOULD delete incomplete Segmented File Uploads after a specified amount of time (in the Service Document), if they are not finalised with all segments.

17.8. Completed Upload Retention

Servers SHOULD delete completed Segmented File Uploads after a specified amount of time (in the Service Document). Servers MUST be able to tell when they have been given one of their own Temporary-URLs as a By-Reference deposit, and not delete that resource until after it has been ingested.

If a Temporary-URL is used in a By-Reference deposit, this should reset the idle counter on the server for that file, and the server SHOULD NOT delete the file until after the idle period has expired. This allows clients to be able to reference the file in multiple deposits should that be necessary.
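
One possible way for a server to track these retention rules is sketched below. The class and method names are hypothetical, timestamps are plain seconds supplied by the caller, and max_idle corresponds to the stagingMaxIdle value from the Service Document.

```python
class StagedUpload:
    """Tracks when a staged Segmented File Upload may be deleted."""

    def __init__(self, max_idle: float, now: float) -> None:
        self.max_idle = max_idle
        self.last_activity = now
        self.pending_ingests = 0

    def touch(self, now: float) -> None:
        """Record activity, such as the receipt of a segment."""
        self.last_activity = now

    def referenced(self, now: float) -> None:
        """The Temporary-URL was used in a By-Reference deposit."""
        self.pending_ingests += 1
        self.last_activity = now  # using the file resets the idle counter

    def ingested(self) -> None:
        """A By-Reference deposit of this upload has finished ingesting."""
        self.pending_ingests -= 1

    def may_delete(self, now: float) -> bool:
        """True only when no ingest is pending and the idle period has passed."""
        return self.pending_ingests == 0 and now - self.last_activity > self.max_idle


upload = StagedUpload(max_idle=3600, now=0)
upload.referenced(now=3000)
```

Keeping the pending-ingest counter separate from the idle timer lets the same upload be referenced by several deposits without being removed part way through any of them.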

17.9. Errors

Servers MUST respond with Error documents under the following circumstances (in addition to the standard errors that may arise through using the protocol):

An initialisation request is sent which specifies a total size larger than that allowed by the server (MaxAssembledSizeExceeded)
An initialisation request is sent which specifies a segment size larger than that allowed by the server (MaxUploadSizeExceeded)
An initialisation request is sent which specifies a segment size smaller than that allowed by the server (InvalidSegmentSize)
An initialisation request is sent which specifies a segment count larger than that allowed by the server (SegmentLimitExceeded)
An upload request is sent after the total_size has been reached (MethodNotAllowed)
An upload request is sent after the segment_count has been reached (MethodNotAllowed)
A segment is received which is not the final segment and is not the same as the expected segment size (InvalidSegmentSize)
A segment is received which is the final segment which is larger than the other segment sizes (InvalidSegmentSize)
A segment is received which is larger than that allowed by the server (InvalidSegmentSize)
A segment is received which is smaller than that allowed by the server (InvalidSegmentSize)
A segment number is received which is not in the allowed range (SegmentLimitExceeded)

The server MAY respond with an Error document under the following circumstances:

The Temporary-URL has timed out, and the server will no longer receive updates to it (SegmentedUploadTimedOut)

If any other errors occur asynchronously, such as in reassembling or unpacking the resulting file, servers MUST provide an error status field and suitable log information in the link record in the Status document.
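
The checks listed above could be expressed on the server side roughly as follows. This is a sketch under assumptions: the function names are hypothetical, limits are read from a dictionary shaped like the Service Document fields, limits are treated as inclusive, segment numbers start at 1, and only the name of the Error Type is returned rather than a full Error document.

```python
from typing import Optional, Set


def check_initialisation(size: int, segment_size: int, segment_count: int, limits: dict) -> Optional[str]:
    """Return the Error Type name for a bad initialisation, or None if acceptable."""
    if "maxAssembledSize" in limits and size > limits["maxAssembledSize"]:
        return "MaxAssembledSizeExceeded"
    if "maxSegmentSize" in limits and segment_size > limits["maxSegmentSize"]:
        return "MaxUploadSizeExceeded"
    if "minSegmentSize" in limits and segment_size < limits["minSegmentSize"]:
        return "InvalidSegmentSize"
    if "maxSegments" in limits and segment_count > limits["maxSegments"]:
        return "SegmentLimitExceeded"
    return None


def check_segment(number: int, length: int, segment_size: int,
                  segment_count: int, received: Set[int]) -> Optional[str]:
    """Return the Error Type name for a bad segment, or None if acceptable."""
    if not 1 <= number <= segment_count:
        return "SegmentLimitExceeded"
    if number in received:
        return "UnexpectedSegment"
    is_final = number == segment_count
    if not is_final and length != segment_size:
        return "InvalidSegmentSize"
    if is_final and length > segment_size:
        return "InvalidSegmentSize"
    return None


limits = {"maxAssembledSize": 30000000000000, "maxSegmentSize": 16777216000,
          "maxSegments": 1000, "minSegmentSize": 1}
```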

18. By-Reference Deposit

By-Reference Deposit is when the client provides the server with URLs for Files which it would like the server to retrieve asynchronously to the deposit request itself. This could be useful in a number of contexts, such as when the files are very large, and are stored on specialist staging hardware, or where the files are already readily available elsewhere, and there is no need to push them through a by-value deposit.

18.1. Announcing Support for By Reference Deposit

Servers MAY support By-Reference deposit. If a server supports By-Reference it SHOULD indicate this in the Service Document using the field byReferenceDeposit:

{
  "byReferenceDeposit": true
}

18.2. Options for By-Reference Deposit

Clients may use a By-Reference Deposit anywhere a by-value deposit could be carried out. Instead of sending any Binary content, the client sends the By-Reference Document containing one or more (depending on context) URLs to files which the server can retrieve.

See the document SWORDv3 Behaviours for an expansion of the Protocol Requirements for requests to deposit By-Reference.

The Content Disposition for a By-Reference deposit is:

Content-Disposition: attachment; by-reference=true

18.2.1. Usage with Segmented File Upload

If carrying out a Segmented File Upload, the final deposit stage is to send the Temporary-URL to the server as part of a By-Reference deposit. In this case the client SHOULD omit the ttl and dereference fields from the By-Reference Document, thus:

{
  "@context" : "https://swordapp.github.io/swordv3/swordv3.jsonld",

  "@type" : "ByReference",

  "byReferenceFiles" : [
    {
      "@id" : "[Temporary-URL]",
      "contentType" : "application/zip",
      "contentLength" : 123456,
      "contentDisposition" : "attachment; filename=file.zip",
      "packaging" : "http://purl.org/net/sword/packaging/SimpleZip",
      "digest" : "SHA256=...."
    }
  ]
}

The server MUST recognise one of its own Temporary-URLs, and should implement ingest in the most efficient way possible. Because a copy of the actual Segmented File Upload cannot be retrieved from the Temporary-URL via GET, the server MUST have another way to retrieve the content of those uploads. The server MUST NOT delete the resource until after it has been successfully ingested (i.e. the stagingMaxIdle time should be ignored when the server has received the resource as a By-Reference deposit).
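
A client could generate such a By-Reference Document as sketched here. The function name and the example values are hypothetical; the field names follow the example document above, and the ttl and dereference fields are deliberately left out.

```python
import json
from typing import Optional


def by_reference_document(temporary_url: str, content_type: str, content_length: int,
                          filename: str, digest: str, packaging: Optional[str] = None) -> dict:
    """Build a By-Reference Document that points at a Temporary-URL."""
    file_entry = {
        "@id": temporary_url,
        "contentType": content_type,
        "contentLength": content_length,
        "contentDisposition": f"attachment; filename={filename}",
        "digest": digest,
    }
    if packaging is not None:
        file_entry["packaging"] = packaging
    # No ttl or dereference fields: they SHOULD be omitted for Temporary-URLs.
    return {
        "@context": "https://swordapp.github.io/swordv3/swordv3.jsonld",
        "@type": "ByReference",
        "byReferenceFiles": [file_entry],
    }


# Hypothetical Temporary-URL and digest placeholder, for illustration only.
document = by_reference_document(
    "http://example.com/staging/upload1", "application/zip", 123456,
    "file.zip", "SHA256=....",
)
body = json.dumps(document, indent=2)
headers = {"Content-Disposition": "attachment; by-reference=true"}
```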

18.3. Server-Side Processing of By Reference Deposits

The following is the procedure that MUST be followed by servers implementing By-Reference deposit.

The server receives a By-Reference Document with one or more files listed
The server creates records for each of these files that it plans to dereference, which then become visible in the Status Document. Files marked by the client not to be dereferenced are considered metadata, and MAY NOT appear in the Status Document. All other supplied Files MUST have the status pending in the Status Document.
The server responds to the client with the appropriate response for the action (See Protocol Operations and Protocol Requirements)
At its own pace, taking into account the ttl of the Files, the server obtains all the files that are marked for dereference and validates them against their Digest and any other supporting information such as contentType, contentLength, and packaging. During the download the server SHOULD set the status to downloading. The server SHOULD be able to resume an interrupted download.
Once the Files are downloaded and processed, the server MUST set the status to ingested. If the Files need unpacking first, the server SHOULD first set the status to unpacking and then ingested when this operation is complete. The server MUST also remove the byReferenceDeposit rel.
If there is an error in downloading or otherwise processing the file, the server MUST set the status to error and SHOULD provide a meaningful log message.
The server MAY continue to record the original URL of the file if desired.
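
The status changes in this procedure can be modelled as a small state machine. The sketch below is one possible reading, not a normative definition: the function and constant names are hypothetical, a direct move from pending to a later state is permitted because the downloading status is only a SHOULD, the byReferenceDeposit rel is removed once the download phase is over, and eTag handling is left out.

```python
from typing import Optional

FILESTATE = "http://purl.org/net/sword/3.0/filestate/"
BY_REFERENCE_REL = "http://purl.org/net/sword/3.0/terms/byReferenceDeposit"

# Which states each state may move to; ingested and error are terminal.
TRANSITIONS = {
    "pending": {"downloading", "unpacking", "ingested", "error"},
    "downloading": {"unpacking", "ingested", "error"},
    "unpacking": {"ingested", "error"},
    "ingested": set(),
    "error": set(),
}


def advance(record: dict, new_state: str, log: Optional[str] = None) -> dict:
    """Return a copy of a link record moved to new_state, or raise ValueError."""
    current = record["status"][len(FILESTATE):]
    if new_state not in TRANSITIONS.get(current, set()):
        raise ValueError(f"cannot move from {current} to {new_state}")
    updated = dict(record, status=FILESTATE + new_state)
    if new_state != "downloading":
        # The file is no longer waiting to be dereferenced.
        updated["rel"] = [rel for rel in record["rel"] if rel != BY_REFERENCE_REL]
    if log is not None:
        updated["log"] = log
    return updated


record = {
    "@id": "http://www.myorg.ac.uk/sword3/object1/reference.zip",
    "byReference": "http://www.otherorg.ac.uk/by-reference/file2.zip",
    "rel": [BY_REFERENCE_REL, "http://purl.org/net/sword/3.0/terms/originalDeposit"],
    "status": FILESTATE + "pending",
}
downloading = advance(record, "downloading")
unpacking = advance(downloading, "unpacking")
ingested = advance(unpacking, "ingested")
```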

18.3.1. Representation in the Status Document

While a By-Reference File is being processed, it MUST be represented in the Status Document under the link field. The following sections show how it is represented.

On Initial Deposit

{
  "@id": "http://www.myorg.ac.uk/sword3/object1/reference.zip",
  "byReference": "http://www.otherorg.ac.uk/by-reference/file2.zip",
  "eTag": "1",
  "rel": [
    "http://purl.org/net/sword/3.0/terms/byReferenceDeposit",
    "http://purl.org/net/sword/3.0/terms/originalDeposit",
    "http://purl.org/net/sword/3.0/terms/fileSetFile"
  ],
  "status": "http://purl.org/net/sword/3.0/filestate/pending"
}

During Download

{
  "@id": "http://www.myorg.ac.uk/sword3/object1/reference.zip",
  "byReference": "http://www.otherorg.ac.uk/by-reference/file2.zip",
  "eTag": "1",
  "rel": [
    "http://purl.org/net/sword/3.0/terms/byReferenceDeposit",
    "http://purl.org/net/sword/3.0/terms/originalDeposit",
    "http://purl.org/net/sword/3.0/terms/fileSetFile"
  ],
  "status": "http://purl.org/net/sword/3.0/filestate/downloading"
}

During Unpacking

{
  "@id": "http://www.myorg.ac.uk/sword3/object1/reference.zip",
  "byReference": "http://www.otherorg.ac.uk/by-reference/file2.zip",
  "eTag": "2",
  "rel": [
    "http://purl.org/net/sword/3.0/terms/originalDeposit",
    "http://purl.org/net/sword/3.0/terms/fileSetFile"
  ],
  "status": "http://purl.org/net/sword/3.0/filestate/unpacking"
}

After Completion

{
  "@id": "http://www.myorg.ac.uk/sword3/object1/reference.zip",
  "byReference": "http://www.otherorg.ac.uk/by-reference/file2.zip",
  "eTag": "2",
  "rel": [
    "http://purl.org/net/sword/3.0/terms/originalDeposit",
    "http://purl.org/net/sword/3.0/terms/fileSetFile"
  ],
  "status": "http://purl.org/net/sword/3.0/filestate/ingested"
}

In Case of Error

{
  "@id": "http://www.myorg.ac.uk/sword3/object1/reference.zip",
  "byReference": "http://www.otherorg.ac.uk/by-reference/file2.zip",
  "eTag": "2",
  "log": "There was an error ingesting your file",
  "rel": [
    "http://purl.org/net/sword/3.0/terms/originalDeposit",
    "http://purl.org/net/sword/3.0/terms/fileSetFile"
  ],
  "status": "http://purl.org/net/sword/3.0/filestate/error"
}

18.4. Responsibilities of the client/reference server

To provide deposit By-Reference, the reference server, where the file is initially hosted, SHOULD:

Support resumable downloads
Hold the file for sufficient time for the repository to retrieve it

To use By-Reference deposit, the client SHOULD:

Follow up on the deposit to determine if the dereference of the file has been successful
Be able to take suitable onward action if there is an error
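
To follow up on a deposit, a client could inspect the by-reference records from a retrieved Status Document, as in this sketch. The function names are hypothetical, and the input is assumed to be the list of link records already extracted from the Status Document.

```python
from typing import Dict, List, Tuple


def by_reference_progress(links: List[dict]) -> Dict[str, List[dict]]:
    """Group by-reference link records by the last part of their status IRI."""
    progress: Dict[str, List[dict]] = {}
    for link in links:
        if "byReference" not in link:
            continue  # not a by-reference file
        state = link.get("status", "").rsplit("/", 1)[-1]
        progress.setdefault(state, []).append(link)
    return progress


def failed_references(links: List[dict]) -> List[Tuple[str, str]]:
    """Return (original URL, log message) for each file that ended in error."""
    errors = by_reference_progress(links).get("error", [])
    return [(link["byReference"], link.get("log", "")) for link in errors]


links = [
    {"@id": "http://www.myorg.ac.uk/sword3/object1/a.zip",
     "byReference": "http://www.otherorg.ac.uk/by-reference/a.zip",
     "status": "http://purl.org/net/sword/3.0/filestate/ingested"},
    {"@id": "http://www.myorg.ac.uk/sword3/object1/b.zip",
     "byReference": "http://www.otherorg.ac.uk/by-reference/b.zip",
     "log": "There was an error ingesting your file",
     "status": "http://purl.org/net/sword/3.0/filestate/error"},
]
failures = failed_references(links)
```

The client can then decide on onward action, for example re-hosting the file and submitting a new By-Reference deposit.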

19. Metadata Deposit

SWORD allows the client to deposit arbitrary metadata onto the server through agnostic support for metadata formats. A metadata format is any document which expresses metadata in a given serialisation. SWORD has a default format which MUST be supported by the server, which consists of the set of DCMI Terms [DCMI] expressed as JSON (see Metadata Document).

In general, the form of metadata consists of several aspects:

The serialisation, such as to JSON or XML
The vocabulary of the metadata, such as Dublin Core, or MODS (sometimes the vocabulary and the serialisation will be conflated here)
The profile of the metadata, such as the RIOXX profile for DC (+extensions)

Any format (combining the 3 aspects above) may be represented by an IRI in the protocol, or an opaque string if no IRI exists or can be minted.

SWORD does not require that the server be able to disseminate any metadata in a format other than the default format. Metadata in the default format can be obtained from GET Metadata-URL. If the server chooses to make other metadata formats available, this SHOULD be listed in the links section of the Status Document. See Representing Other Formats in the Service Document for details.

19.1. Announcing Support for Metadata Formats

The server can list Metadata formats that it will accept in the acceptMetadata field of the Service Document.

If no acceptMetadata field is present, the client MUST assume the server only supports the default SWORD metadata format (http://purl.org/net/sword/3.0/types/Metadata).

{
  "acceptMetadata": [
    "http://purl.org/net/sword/3.0/types/Metadata"
  ]
}

19.2. Indicating Metadata Format to the Server

During deposit, the client SHOULD specify a Metadata-Format header which contains the identifier for the format. For example, if supplying the default SWORD metadata format:

Metadata-Format: http://purl.org/net/sword/3.0/types/Metadata

If this header is not present the server MUST assume it has the above value.
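
The defaulting rules for acceptMetadata and Metadata-Format can be sketched as follows. The function names are hypothetical; header names are compared case-insensitively, as is usual for HTTP.

```python
from typing import Dict, List

DEFAULT_METADATA_FORMAT = "http://purl.org/net/sword/3.0/types/Metadata"


def accepted_metadata_formats(service_doc: dict) -> List[str]:
    """Client side: a missing acceptMetadata field means default format only."""
    return list(service_doc.get("acceptMetadata", [DEFAULT_METADATA_FORMAT]))


def effective_metadata_format(headers: Dict[str, str]) -> str:
    """Server side: a missing Metadata-Format header means the default format."""
    lowered = {name.lower(): value for name, value in headers.items()}
    return lowered.get("metadata-format", DEFAULT_METADATA_FORMAT)


def can_deposit(service_doc: dict, metadata_format: str) -> bool:
    """The default format MUST be supported; others must be advertised."""
    if metadata_format == DEFAULT_METADATA_FORMAT:
        return True
    return metadata_format in accepted_metadata_formats(service_doc)
```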

20. Metadata Formats

20.1. Default Format

In order to provide a baseline of interoperability, SWORD provides a default metadata format which MUST be supported by the server. This document has the following aspects (as per Metadata Deposit):

It is serialised as JSON and with a JSON-LD @context
It contains dc and dcterms vocabulary elements, and any other arbitrary elements added by the client
It does not pre-suppose any particular profile of usage of these vocabulary elements.

Clients MAY choose to extend this document with their own metadata fields, though the server MAY NOT understand them, and MAY ignore them.

When using this Metadata Format, the client should identify it in the Metadata-Format header with the following IRI:

http://purl.org/net/sword/3.0/types/Metadata

20.2. Depositing Other Formats

In addition to the standard SWORD metadata format described above, SWORD can support the deposit of arbitrary metadata schemas and serialisations.

Clients who wish to ensure that their servers support all the metadata they send them should consider minting a new identifier for their format, and looking for servers to declare explicit support for it.

Clients should not expect that servers will keep their metadata in the format it is provided. Servers can and will store the metadata in their internal formats as needed.

The following is a minimal example of the deposit of a MODS XML metadata file while creating a new Object:

POST Service-URL
Content-Type: application/xml
Content-Disposition: attachment; metadata=true
Digest: SHA-256=74b2851bd2760785b0987ba219debea69c228353f7ccc67a2bdcd9819f97fc71
Metadata-Format: http://www.loc.gov/mods/v3

<mods xmlns:mods="http://www.loc.gov/mods/v3">
  <originInfo>
    <place>
      <placeTerm type="code" authority="marccountry">nyu</placeTerm>
      <placeTerm type="text">Ithaca, NY</placeTerm>
    </place>
    <publisher>Cornell University Press</publisher>
    <copyrightDate>1999</copyrightDate>
  </originInfo>
</mods>

If the server supports the MODS Metadata-Format, identified with the IRI http://www.loc.gov/mods/v3 then it will be able to create a new Object from this XML document, and populate the Metadata from the data therein.

20.3. Representing Other Formats in the Service Document

A server is not required to retain or be able to disseminate the metadata delivered to it by the client in the format it is provided. Alternative metadata formats to the default format MAY be accepted (as defined by the acceptMetadata field in the Service Document), but the server is not required to be able to serve that metadata format as well.

If the server chooses to expose metadata in alternative formats to the default, it may do so by providing them as links in the links section of the Status Document. To do this:

Provide a link to the serialised metadata
Specify a rel type of http://purl.org/net/sword/3.0/terms/formattedMetadata
Specify the contentType as needed
Specify the metadataFormat as the format identifier for the metadata schema.

For example, to reflect the metadata from the previous section back to the client:

{
  "@id": "http://www.swordserver.ac.uk/col1/mydeposit/metadata.mods.xml",
  "contentType": "application/xml",
  "metadataFormat": "http://www.loc.gov/mods/v3",
  "rel": [
    "http://purl.org/net/sword/3.0/terms/formattedMetadata"
  ]
}
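
A client looking for alternative metadata representations might filter the Status Document as sketched below. The function name is hypothetical, and the list of link records is assumed to be held under a key named links.

```python
from typing import List, Optional

FORMATTED_METADATA = "http://purl.org/net/sword/3.0/terms/formattedMetadata"


def formatted_metadata_links(status_doc: dict, metadata_format: Optional[str] = None) -> List[dict]:
    """Return link records exposing metadata, optionally for one format only."""
    matches = []
    for link in status_doc.get("links", []):
        if FORMATTED_METADATA not in link.get("rel", []):
            continue
        if metadata_format is not None and link.get("metadataFormat") != metadata_format:
            continue
        matches.append(link)
    return matches


status_doc = {"links": [{
    "@id": "http://www.swordserver.ac.uk/col1/mydeposit/metadata.mods.xml",
    "contentType": "application/xml",
    "metadataFormat": "http://www.loc.gov/mods/v3",
    "rel": [FORMATTED_METADATA],
}]}
mods_links = formatted_metadata_links(status_doc, "http://www.loc.gov/mods/v3")
```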

21. Packaged Content Deposit

SWORD allows you to deposit both Files and Metadata simultaneously through support of Packaged Content. SWORD does not place any limitations on the number or type of packaging formats that the client/server support, though see the section Packaging Formats for the packages that MUST be supported by the server.

21.1. Announcing Support for Packaged Content Deposit

The Service Document uses the acceptPackaging field to indicate that a Service will accept deposits of a particular packaging format, and the acceptArchiveFormat field to indicate the serialisation/compression formats that it understands.

Clients should refer to the treatment description in the Service Document to find out the treatment for a particular packaging type.

Package formats SHOULD be identified by an IRI, but MAY be identified by an arbitrary string.

If no acceptPackaging field is supplied the client MUST assume that the server does not formally support any package formats, and should expect everything to be treated as per the server's policies with regard to the mimetype as per the accept element.

If no acceptArchiveFormat field is supplied the client MUST assume that the server supports application/zip only.

{
  "accept": [
    "*/*"
  ],
  "acceptArchiveFormat": [
    "application/zip"
  ],
  "acceptPackaging": [
    "*"
  ]
}
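
On the client side, these defaults could be applied as in the following sketch. The function names are hypothetical, and treating "*" in acceptPackaging as a wildcard that matches any packaging format is an assumption based on the example above.

```python
def packaging_support(service_doc: dict) -> dict:
    """Apply the defaults that the client MUST assume for missing fields."""
    return {
        # No acceptPackaging: no package formats are formally supported.
        "acceptPackaging": list(service_doc.get("acceptPackaging", [])),
        # No acceptArchiveFormat: application/zip only.
        "acceptArchiveFormat": list(service_doc.get("acceptArchiveFormat", ["application/zip"])),
    }


def is_advertised(service_doc: dict, packaging: str, archive_mime: str) -> bool:
    """True if both the packaging format and the archive format are advertised."""
    support = packaging_support(service_doc)
    packaging_ok = "*" in support["acceptPackaging"] or packaging in support["acceptPackaging"]
    return packaging_ok and archive_mime in support["acceptArchiveFormat"]


SIMPLE_ZIP = "http://purl.org/net/sword/3.0/package/SimpleZip"
```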

21.2. Package support during resource creation

When depositing Packaged Content, the client SHOULD indicate the archive file MIME type using the Content-Type header, and SHOULD also give information about content packaging using the Packaging header.

The value of the Packaging header SHOULD match one of the values the server has advertised as acceptable for the service.

If a server receives a POST with an unacceptable Packaging header value, it MUST either reject the POST by returning an HTTP response with a status code of 415 (Unsupported Media Type) and a SWORD Error document with URI http://purl.org/net/sword/3.0/error/PackagingFormatNotAcceptable, or store the content without further processing.
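
The server's two options can be sketched as a single decision function. The function name, the returned action labels, and the store_unprocessed policy flag are hypothetical; a missing Packaging header is treated here as acceptable, since the header is only a SHOULD for clients.

```python
from typing import Iterable, Optional, Tuple


def packaging_decision(packaging: Optional[str], accept_packaging: Iterable[str],
                       store_unprocessed: bool = False) -> Tuple[str, Optional[int]]:
    """Return (action, HTTP status) for an incoming packaged deposit.

    action is "process", "store" (keep without further processing) or "reject".
    """
    accepted = set(accept_packaging)
    if packaging is None or "*" in accepted or packaging in accepted:
        return "process", None
    if store_unprocessed:
        # Server policy: keep the content but do not unpack or interpret it.
        return "store", None
    # Otherwise reject with 415 and a PackagingFormatNotAcceptable Error document.
    return "reject", 415


accepted = ["http://purl.org/net/sword/3.0/package/SimpleZip"]
decision = packaging_decision("http://example.com/packaging/Unknown", accepted)
```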

21.3. Package description in Status documents

Status documents can speak about packaging in two distinct ways, depending on whether an element in the links list refers to a file that was deposited, or a file that is available for retrieval by the client (or both).

When a package has been deposited as the Original Deposit, it SHOULD record the packaging format and content type alongside it in the record.

{
  "@id": "http://www.myorg.ac.uk/sword3/object1/package.zip",
  "contentType": "application/zip",
  "packaging": "http://purl.org/net/sword/3.0/package/SimpleZip",
  "rel": [
    "http://purl.org/net/sword/3.0/terms/originalDeposit"
  ]
}

Similarly, when a package has been created by the server from the Object’s content and made available to the client as a service, the packaging format and content type MUST be presented alongside it:

{
  "@id": "http://www.myorg.ac.uk/sword3/object1/package.1.zip",
  "contentType": "application/zip",
  "packaging": "http://purl.org/net/sword/3.0/package/SimpleZip",
  "rel": [
    "http://purl.org/net/sword/terms/packagedContent"
  ]
}

22. Packaging Formats

There are 3 packaging formats that all SWORD implementations MUST support.

22.1. Binary

URI: http://purl.org/net/sword/3.0/package/Binary

This format indicates that the package should be interpreted as an opaque blob, and the server SHOULD NOT attempt to extract any content from it. This is typically for use when depositing single files, which do not need unpacking of any kind.

Servers MAY choose, nonetheless, to extract content from Binary packages, if they have the capabilities, such as metadata from images, structural information from text documents, etc.

22.2. SimpleZip

URI: http://purl.org/net/sword/3.0/package/SimpleZip

This format indicates that the package is a compressed set of one or more files in an arbitrary directory structure. The nature of the compression and the structure of the compressed content is not specified.

Servers MAY choose to extract the content from SimpleZip packages, and present the individual file components as derivedResources, if desired.

22.3. SWORDBagIt

URI: http://purl.org/net/sword/3.0/package/SWORDBagIt

This format is a profile of the BagIt directory structure, which has in turn been serialised (which may include compression). The nature of the serialisation/compression is not specified, though if the client wishes the server to extract the content, it SHOULD use one of the formats specified in the Service Document field acceptArchiveFormat.

A SWORD BagIt Profile is available which describes the outline structure of the bag.

SwordBagIt
|-- bag-info.txt
|-- bagit.txt
|-- data
|   |-- bitstreams ...
|   \-- directories ...
|       \-- bitstreams ...
|-- manifest-sha-256.txt
|-- metadata
|   \-- sword.json
\-- tagmanifest-sha-256.txt

This allows us to represent the item as a combination of an arbitrary structure of bitstreams in the data directory (similar to SimpleZip), and the metadata in the sword default format in metadata/sword.json. A manifest (and tagmanifest) of sha-256 checksums is required, as well as the bagit.txt file and a bag-info.txt file. Note that although listed, the bag-info.txt is not used by SWORD to transfer metadata. All metadata MUST appear in metadata/sword.json.

The content of sword.json is exactly as defined in the SWORD default Metadata. Note that use of fetch.txt is not supported here.

The server SHOULD unpack this file, and action at least the Metadata. The contents of the data directory MAY be unpackaged into derivedResources if the server desires. It is RECOMMENDED that the contents of the data directory be a flat file structure, to aid mutual comprehension by servers/clients.
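
A bag with this layout could be assembled and serialised as a zip file as sketched below. This is illustrative only: the function name, the BagIt-Version value, the use of a top-level SwordBagIt directory inside the archive, and the contents of bag-info.txt are assumptions, and the SWORD BagIt Profile remains the authority on what a conforming bag must contain.

```python
import hashlib
import io
import json
import zipfile
from typing import Dict


def build_sword_bagit(payload: Dict[str, bytes], metadata: dict, bag_name: str = "SwordBagIt") -> bytes:
    """Return the bytes of a zip file holding a SWORDBagIt-style bag."""
    def checksum_lines(files: Dict[str, bytes]) -> bytes:
        # One "<sha-256 hex>  <path relative to the bag>" line per file.
        lines = [f"{hashlib.sha256(content).hexdigest()}  {path}\n"
                 for path, content in sorted(files.items())]
        return "".join(lines).encode("utf-8")

    # Payload files live under data/ with a flat structure, as recommended.
    data_files = {f"data/{name}": content for name, content in payload.items()}
    total_bytes = sum(len(content) for content in data_files.values())

    tag_files = {
        "bagit.txt": b"BagIt-Version: 1.0\nTag-File-Character-Encoding: UTF-8\n",
        "bag-info.txt": f"Payload-Oxum: {total_bytes}.{len(data_files)}\n".encode("utf-8"),
        "manifest-sha-256.txt": checksum_lines(data_files),
        # All metadata travels in metadata/sword.json, not in bag-info.txt.
        "metadata/sword.json": json.dumps(metadata, indent=2).encode("utf-8"),
    }
    all_files = dict(tag_files)
    all_files.update(data_files)
    all_files["tagmanifest-sha-256.txt"] = checksum_lines(tag_files)

    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        for path, content in all_files.items():
            archive.writestr(f"{bag_name}/{path}", content)
    return buffer.getvalue()


# Hypothetical metadata content, for illustration only.
bag = build_sword_bagit({"file.txt": b"hello"}, {"dc:title": "Example deposit"})
```

The resulting bytes could then be deposited with a Content-Type of application/zip and the SWORDBagIt packaging identifier.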

23. Auto-Discovery

In order to help potential clients discover a server’s capabilities, SWORD RECOMMENDS that the following auto-discovery features be embedded in any web interfaces associated with the service provider.

23.1. For Services

Embed an html link with a rel value of http://purl.org/net/sword/3.0/discovery/Service in any page which represents a deposit Service.

<html:link rel="http://purl.org/net/sword/3.0/discovery/Service" href="[Service-URL]"/>

23.2. For Objects

Embed an html link with a rel value of http://purl.org/net/sword/3.0/discovery/Object in any page which represents a deposited resource.

<html:link rel="http://purl.org/net/sword/3.0/discovery/Object" href="[Object-URL]"/>
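
A client could look for these links with Python's standard HTML parser, as in this sketch. The class and function names are hypothetical; both plain link elements and prefixed html:link elements are recognised.

```python
from html.parser import HTMLParser
from typing import Dict, List

DISCOVERY_PREFIX = "http://purl.org/net/sword/3.0/discovery/"


class SwordLinkFinder(HTMLParser):
    """Collects SWORD auto-discovery links, keyed by "Service" or "Object"."""

    def __init__(self) -> None:
        super().__init__()
        self.found: Dict[str, List[str]] = {}

    def handle_starttag(self, tag, attrs):
        # Accept <link> as well as a prefixed <html:link>.
        if tag.split(":")[-1] != "link":
            return
        attributes = dict(attrs)
        rel = attributes.get("rel") or ""
        href = attributes.get("href")
        if rel.startswith(DISCOVERY_PREFIX) and href:
            self.found.setdefault(rel[len(DISCOVERY_PREFIX):], []).append(href)


def discover(html: str) -> Dict[str, List[str]]:
    """Return the discovery links found in an HTML page."""
    finder = SwordLinkFinder()
    finder.feed(html)
    return finder.found


# Hypothetical page and Service-URL, for illustration only.
page = ('<html><head><link rel="http://purl.org/net/sword/3.0/discovery/Service" '
        'href="http://example.com/service"/></head><body></body></html>')
found = discover(page)
```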

23.3. Well-Known URI

For any server which wishes to expose its main or root Service-URL via Well-Known URIs [RFC8615], provide a redirect (307) from /.well-known/swordv3 (PROVISIONAL) to your root Service-URL.
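
From the client's perspective, discovery through the well-known location might look like the sketch below. The function names are hypothetical, the path is assumed to take the RFC 8615 form /.well-known/swordv3 at the root of the host, and the HTTP request itself is left to the caller, who passes in the response status and Location header.

```python
from typing import Optional
from urllib.parse import urljoin


def well_known_url(site_url: str) -> str:
    """Return the well-known SWORD location on the same host as site_url."""
    return urljoin(site_url, "/.well-known/swordv3")


def service_url_from_redirect(site_url: str, status: int, location: Optional[str]) -> Optional[str]:
    """Resolve the root Service-URL from the response to a well-known GET."""
    if status != 307 or not location:
        return None  # the server does not expose the well-known location
    # The Location header may be relative to the well-known URL.
    return urljoin(well_known_url(site_url), location)


# Hypothetical host, for illustration only.
probe = well_known_url("https://repo.example.org/browse/item/1")
```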