API Send Secure Documents

ArmoredEnvoy™ API Documentation

Document revision date: October 7, 2013

Copyright © Armatu 2014, all rights reserved.

 

Introduction

Detailed reporting about document access

Notably, no special software is needed by end users for controlled document access, only a modern web browser.

The Armored Envoy service has three parts:

  1. Vault: An online data store for digital documents and other content during transmission
  2. AE API: An application programming interface (API) with functions for
    1. adding content to and removing content from the data store
    2. manipulating metadata such as expiration times and access control
    3. generating uniform resource identifiers (URIs) for access to specific content
    4. downloading reports about content access
  3. AE Gateway web app: A secure web application that provides an end-user interface to enforce access controls and deliver content when accessed via the URIs created by the API

This document describes the ArmoredEnvoy API.

About document sensitivity

The AE API described in this document is intended to provide security appropriate for documents up to the secret level, plus hooks for appropriately strong challenges. Producer/viewer top secret cryptography is external as of this revision.

For comparison, iPost’s iMM email solution is appropriate for unclassified and confidential levels.

Four levels of document sensitivity are defined below. Level names are borrowed from the US DoD but their meanings may differ.

  1. Unclassified – equivalent to ordinary email or web page. Document contains no confidential information, need not be encrypted in storage, and may be delivered via unencrypted channels. Delivery is directed to the target audience but snooping or accidental disclosure is not a concern, so no verification of viewer identity or content encryption is required.
  2. Confidential – like HTTP/S to a public website. Document might contain personally identifiable information (PII) but does not contain information that has compliance requirements (PCI, HIPAA, etc.). Document need not be encrypted in storage but should be delivered over a secure channel such as end-to-end SSL. Snooping is not permitted but simplicity of access for the audience is more important than prevention of accidental disclosure, so possession of a document request is sufficient to verify viewer identity.
  3. Secret – similar to HTTP/S to a password protected site. Document might contain sensitive confidential information having compliance requirements, must be encrypted while stored (“at rest”) and delivered over a secure channel, and accidental disclosure should be prevented. Therefore, each viewer must both possess a document request and answer a challenge. However, this authentication does not necessarily involve a pre-arranged password or other cryptographic exchange; out-of-band shared knowledge (e.g., PII) such as a phone number may be sufficient at the Producer’s discretion.
  4. Top Secret – equivalent to S/MIME email with public key encryption. Same as Secret except that the challenge in this case typically involves stronger assurance of identity, such as a password or OAuth token. The document should be encrypted by the producer (or by the operator on the producer’s behalf), re-encrypted for storage, and fully decrypted only by the recipient after delivery, which may require additional out-of-band communication between the producer and each viewer (for example, see the W3C’s Web Cryptography API).

Terminology

Instance – one complete functional set of Armored Envoy servers and services, provided to one Armored Envoy customer (typically a single Producer as defined below).

AE implementation means the programming code and deployment arrangements of an AE instance.

Document – any static digital content, such as a single byte, or plain text, or a video.

Credentials are information which identify and permit access for specific individuals or component processes which interact with or within the AE implementation.

Sensitive Information – The following are sensitive information:

  • Documents and metadata
  • Identities of all viewers or a single viewer
  • Credentials
  • The collection of all viewer tokens or handles for an audience, or a single handle
  • The collection of all monikers for a producer, or a single moniker

The following are explicitly not sensitive information:

  • AE implementation
  • A single viewer token

Moniker – a name assigned to a document to obscure its actual name, used when the actual name might be a hint as to its value to an attacker

Link – an http or https URI, can be pre-authorized or not

Vault – Secure store service for documents. Resolves requests on handles.

Handle – a pre-authorized link that directly returns a document

Audience – the set of individuals who are each intended by the Producer to be able to obtain a handle.

Viewer – an individual member of the audience.

Operator – an individual who loads a document into the service and specifies the audience for it, thereby creating (or enabling the creation of) a set of handles.

Producer – an entity that originates documents, typically the employer of the operator

User – either a viewer or an operator, depending on context

Attacker – A hypothetical individual who is not a user but who attempts to gain unauthorized access to sensitive information.

Viewer token – any data representing shared knowledge (whether or not secret) between the producer and the audience, which demonstrates that a viewer meets the policy for requesting a handle

Placeholder – a URI template representing a document request but which typically cannot be resolved directly; instead the placeholder must first be filled in with the viewer token

Document request – an HTTP/S URI created by completing a placeholder with a viewer token, used to obtain a handle

Challenge – a user interaction which, when successfully completed, is considered to have validated the user’s identity

Gateway – an intermediate server to which the document request resolves; responsible for resolving the challenge and thereafter providing the handle

The following diagram shows how the various URL forms relate to one another and to documents. Some terminology used in this diagram is explained in the next section (API for Producer). Identifiers such as “ipo.st”, “94949” and “piltdown.harry” are examples only.

[Diagram: relationship of the URL forms to one another and to documents]

API for Producer

Some terminology used in this section is drawn from the REST API Design Rulebook by Mark Massé (O’Reilly), ISBN 978-1-449-31050-9.

Data Model

Information about the producer identity is encoded in the authority (i.e., host) and/or document root (AKA docroot) of the URL used to access the API, and is not considered part of the data model. This is referred to as the apiRoot in examples below.

The document store has a flat namespace without hierarchy. In Rulebook terms, the API does not support creation and management of collections. However, for API naming purposes a single top-level collection called “documents” is used (see note 1). Producers may use a hierarchical form of document naming but this is not represented directly in the data model. When a new document is created, the producer’s original name for that document becomes metadata, and an opaque identifier (the moniker) is assigned to the document.

Each record (see note 2) in the data model contains the following:

  • Original name (see above)
  • Media type (HTTP Content-Type value, e.g. application/pdf)
  • Content (the actual document)
  • Expiration date
  • Access control description (for audience — access for operator is via the API itself)
  • Challenge (optional — JSON struct, see Challenge Specification below)
  • Fulfillment (optional — JSON struct)
  • Producer-defined data bundle (JSON)

Notes

  1. This is similar to the object namespace adopted by Amazon S3, which provides a single level of collections (buckets): “An Amazon S3 bucket has no directory hierarchy such as you would find in a typical computer file system. You can, however, create a logical hierarchy by using object key names that imply a folder structure.”
  2. A document in the data model is a record of the current state of one object in the document vault. A handle returns the object from the vault, in contrast to API calls which manipulate data model records.

REST Definition

Refer to URI template syntax for clarification of formats below. “Message” refers to the HTTP request body, see REST Rulebook. “Response” refers to a successful result and is expressed in JSON format, with textual descriptions of values in Arial font and literals or URI templates in consolas font. Some error responses are still to be defined as of this document revision, except as noted. All API requests are directed to an authority (host) plus docroot, together designated here as the apiRoot, which is different for each instance of the service.

Log in and out of Armored Envoy services

Request: POST /login

Message: multipart/form-data

{?email}

{?key}

Response: code 200

Authorization: {token}

{message}

Semantics: Method to login a user to the API; returns an informational message as the content. The key is a PEM block which is generated and sent to you when you configure your AE instance. The email address is used to distinguish users for auditing purposes.

An authorization token is returned in the Authorization header. After login, this token is used to populate the Authorization header of subsequent API requests, in standard HTTP Basic authentication format. Note that the token provided by /login must be base64-encoded before it is used in outgoing Authorization headers. The token is returned unencoded in case the HTTP Auth library you use applies that encoding itself.
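As an illustration of the encoding step described above, the following minimal Python sketch builds the outgoing header from the value returned by /login. The function name and the sample token are hypothetical, and the sketch assumes the token is plain text that can be encoded as UTF-8 before base64 is applied.

```python
import base64


def basic_auth_header(token: str) -> dict:
    """Return an Authorization header carrying the base64-encoded login token."""
    # Assumption: the un-encoded token from /login is treated as UTF-8 text.
    encoded = base64.b64encode(token.encode("utf-8")).decode("ascii")
    return {"Authorization": f"Basic {encoded}"}


# Hypothetical token value, for illustration only.
headers = basic_auth_header("example-token")
```

If your HTTP library already base64-encodes Basic credentials, pass it the unencoded token instead of using a helper like this one.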

The authorization token is time-limited. The default expiration time is 1 hour; a different expiration can be selected at AE instance deployment time.

Failure occurs on an invalid PEM key, returning an informational message with code 401.

Request: POST /logout

Message: none

Response: empty, code 200

Semantics: Immediately expires the authorization token contained in the Authorization header.

Insert a document into ArmoredEnvoy

Document insertion is atomic, that is, in the absence of a catastrophic system failure, no part of the document content remains in the document vault after any failed document insertion.

Possible failure cases include: expiration date already in the past; malformed audience, fulfillment, or challenge structure; metadata too large (limits are yet to be defined as of this revision).

Request: POST /documents

Message: multipart/form-data

{?name}

{?expiration}

{?audience}

{?fulfillment}

{?challenge}

{?landingpage}

{?metadata}

followed by the document content with media type

Response (see note 1): code 201

Last-Modified: The current time

Content-Location: https://{apiRoot}/documents/{docId}

Content-Type: application/json

{
  "id" : ID value (numeric?) generated by the service,
  "hash" : {
    "algorithm" : Keyword (at this time always "SHA-1"),
    "value" : hash string computed on the contents of the stored document
  },
  "expiration" : From the expiration parameter or default 1 year from now,
  "links" : {
    "self" : {
      "href" : "https://{apiRoot}/documents/{docId}",
      "rel" : "self"
    },
    "document" : {
      "href" : "https://{apiRoot}/documents/{docId}/document",
      "rel" : "edit"
    },
    "audience" : {
      "href" : "https://{apiRoot}/documents/{docId}/audience",
      "rel" : "edit"
    },
    "fulfillment" : {
      "href" : "https://{apiRoot}/documents/{docId}/fulfillment",
      "rel" : "edit"
    },
    "challenge" : {
      "href" : "https://{apiRoot}/documents/{docId}/challenge",
      "rel" : "edit"
    },
    "landingpage" : {
      "href" : "https://{apiRoot}/documents/{docId}/landingpage",
      "rel" : "edit"
    },
    "metadata" : {
      "href" : "https://{apiRoot}/documents/{docId}/metadata",
      "rel" : "edit"
    },
    "placeholder" : {
      "href" : "https://{gateway}/{docId}{?viewerToken}",
      "rel" : "alternate"
    }
  }
}

Notes

  1. The difference between “self” and “placeholder” is that “self” is only accessible via an authenticated API connection, whereas “placeholder” is a publishable link (see Document Access UX for Audience). The viewerToken in the placeholder is not filled in by the API server; it represents a template value to be filled in when the placeholder is used. All other variables, such as the gateway, are filled in by the API server.
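A caller may wish to confirm that the “hash” structure in the response matches the content it uploaded. The following Python sketch shows one way to do so. It assumes the “value” field is a hexadecimal digest string, which this revision does not state, and the function name is hypothetical.

```python
import hashlib


def document_matches_hash(content: bytes, hash_record: dict) -> bool:
    """Compare local content against the "hash" structure of a document record."""
    # Map the API keyword to a hashlib name; only "SHA-1" is defined at this revision.
    algorithms = {"SHA-1": "sha1"}
    name = algorithms.get(hash_record["algorithm"])
    if name is None:
        raise ValueError(f"unsupported hash algorithm: {hash_record['algorithm']}")
    digest = hashlib.new(name, content).hexdigest()
    # Assumption: the service returns the digest as a hexadecimal string.
    return digest == hash_record["value"].lower()
```

Dispatching on the “algorithm” keyword keeps the check usable if later revisions introduce additional algorithms.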

Retrieve a document record or data from ArmoredEnvoy

Request: GET /documents/{docId}

Message: none

Response: same as Create above (see note 1), except code 200

Request (see note 2): GET /documents/{docId}/document

Message: none

Response: document from vault with media type, code 200

Request: GET /documents/{docId}/audience

Message: none

Response: audience object with media type (application/json), code 200

Request: GET /documents/{docId}/fulfillment

Message: none

Response: fulfillment object with media type (application/json), code 200

Request: GET /documents/{docId}/challenge

Message: none

Response: challenge object with media type (application/json), code 200

Request: GET /documents/{docId}/landingpage

Message: none

Response: landingpage object with media type (application/json), code 200

Request: GET /documents/{docId}/metadata

Message: none

Response: metadata object with media type (typically application/json), code 200

Notes

  1. The hash structure returned for a document must always use the same algorithm that was returned upon initial document insertion. That is, the API caller should not be required to recompute a previously returned hash in order to determine that the document is identical. This is only important for future revisions where more than one algorithm may be in use.
  2. In some cases this action may not be allowed, i.e., the only way to retrieve a document from the vault is through the placeholder. Of course, anyone who can update the audience and challenge can give themselves access to a placeholder.

Browse documents

Armored Envoy instances can be configured for low security or high security applications. In the high security case, browsing documents is not permitted; the caller must possess a valid {docId} to retrieve any information.

Request: GET /documents

Message: none

Response: empty, code 405

Allow: POST

In a low security instance, a list of document IDs may be retrieved.

Request: GET /documents

Message: none

Response: code 200

Content-Type: application/json; charset=utf-8

{
  "msg" : "list of retrieved documents",
  "documents" : [
    {
      "{docId}" : "{name}"
    },
    {
      "{docId}" : "{name}"
    },
    … and so on …
  ]
}

Update a document in ArmoredEnvoy

Request (see note 1): POST /documents/{docId}

Message: multipart/form-data, as Create above (see note 2), except document content is optional

Response: same as Create above, except code 200

Semantics: any subset of the fields passed to Create may be included, and only the included fields are updated. A new placeholder (see note 2) is generated even if the document content is not updated (this implies copying or renaming the existing content). The old copy in the vault is deleted (in the future, archival may be required, but not for the first implementation).

Request (see note 3): PUT /documents/{docId}/document{?lastModified}

Message: document contents with media type

Response: empty, code 204

Location: https://{apiRoot}/documents/{docId}/document

Semantics: overwrites the document in the vault, in place. If a lastModified parameter is present and the Last-Modified date of the existing document is more recent than the value in the lastModified parameter, the document is NOT overwritten (response code 304). The format of the lastModified value is the same as that of the HTTP Last-Modified header.
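The lastModified precondition can be sketched in Python as follows. This is an illustration of the rule as stated, not the service's implementation; the function name is hypothetical, and both arguments are assumed to be well-formed dates in the HTTP Last-Modified format.

```python
from email.utils import parsedate_to_datetime
from typing import Optional


def put_document_status(stored: str, last_modified: Optional[str]) -> int:
    """Return 204 if the overwrite proceeds, 304 if it must be skipped."""
    if last_modified is None:
        return 204  # no precondition supplied: always overwrite
    # Both values use the HTTP Last-Modified date format.
    if parsedate_to_datetime(stored) > parsedate_to_datetime(last_modified):
        return 304  # the stored copy is newer than the caller expects
    return 204
```

For example, a caller that last saw the document at 11:00 GMT would receive 304 if the stored copy was modified at 12:00 GMT on the same day.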

Request: PUT /documents/{docId}/audience

Message: audience object with media type (application/json)

Response (see note 4): empty, code 204

Location: https://{apiRoot}/documents/{docId}/audience

Request: PUT /documents/{docId}/fulfillment

Message: fulfillment object with media type (application/json)

Response: empty, code 204

Location: https://{apiRoot}/documents/{docId}/fulfillment

Request (see note 5): PUT /documents/{docId}/challenge

Message: challenge object with media type (application/json)

Response: empty, code 204

Location: https://{apiRoot}/documents/{docId}/challenge

Request: PUT /documents/{docId}/landingpage

Message: landingpage object with media type (application/json)

Response: empty, code 204

Location: https://{apiRoot}/documents/{docId}/landingpage

Request: PUT /documents/{docId}/metadata

Message: metadata object with media type (application/json)

Response: empty, code 204

Location: https://{apiRoot}/documents/{docId}/metadata

Notes

  1. PUT is not allowed to this URI (do nothing, response code 405).
  2. Unlike GET, a new hash.algorithm may be returned here. It’s a different document.
  3. POST to a document renders any extant placeholders (and corresponding document requests) invalid; that is, future document requests to the gateway based on the old placeholders must yield a 404 response. PUT to any data field (including {docId}/document) does NOT alter the placeholder or document request.
  4. PUT is not allowed to alter the audience type or viewerToken (see Audience Specification below) except to set viewerToken to null. If either the viewerToken in the new audience object is defined and differs from the stored value, or the type differs, return response code 409 (conflict) and do not alter the stored value. Audience type and viewer token must be updated at the same time as the challenge, via POST, so that format checking can occur and a new placeholder can be returned.
  5. PUT is not allowed to change the challenge consumer.type or access.type from “literal” to “field” (see Challenge Specification and Audience Specification below). If either of these changes is attempted, return response code 409. Changing from “field” to “literal” is allowed.
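The restrictions in notes 4 and 5 can be expressed as two small checks. The Python sketch below is one possible reading of those notes; the function names are hypothetical, and the objects are assumed to be the parsed JSON audience and challenge structures.

```python
def audience_put_status(stored: dict, new: dict) -> int:
    """Return 409 if a PUT would change the audience type or viewerToken, else 204."""
    if new.get("type") != stored.get("type"):
        return 409
    token = new.get("viewerToken")
    # Setting viewerToken to null is the one permitted change.
    if token is not None and token != stored.get("viewerToken"):
        return 409
    return 204


def challenge_put_status(stored: dict, new: dict) -> int:
    """Return 409 if consumer.type or access.type goes from "literal" to "field"."""
    for key in ("consumer", "access"):
        old_type = stored.get("data", {}).get(key, {}).get("type")
        new_type = new.get("data", {}).get(key, {}).get("type")
        if old_type == "literal" and new_type == "field":
            return 409
    return 204
```

In both cases a 409 result means the stored value is left unaltered and the change must be made through POST instead.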

Delete a document record or data from ArmoredEnvoy

Request: DELETE /documents/{docId}{?lastModified}

Message: none

Response: empty, code 204

Semantics: deletes the document record and all associated data (especially the document in the vault). If a lastModified parameter is present and the Last-Modified date of the existing document is more recent than the value in the lastModified parameter, the document is NOT deleted (response code 304).

Request: DELETE /documents/{docId}/document

Message: none

Response: empty, code 409

Semantics: deleting only the document in the vault is not allowed. The whole record must be deleted instead.

Request: DELETE /documents/{docId}/audience

Message: none

Response: empty, code 204

Request: DELETE /documents/{docId}/fulfillment

Message: none

Response: empty, code 204

Request: DELETE /documents/{docId}/challenge

Message: none

Response: empty, code 204

Request: DELETE /documents/{docId}/landingpage

Message: none

Response: empty, code 204

Request: DELETE /documents/{docId}/metadata

Message: none

Response: empty, code 204

Controller URIs

Request: POST /documents/{docId}/deliver

Message: multipart/form-data, producer-defined

Response: code 202 (“in progress”)

Semantics: initiates the process defined by the fulfillment, passing it the form-data from this call, the audience object, and the placeholder URI (see Fulfillment Specification below).

Download audit trail

Logs may be downloaded for all documents, or for a single document by appending a document ID to the URL path.

Request: GET /logs{/docId}{?start_date,end_date}

Message: none

Response: code 200

Content-Type: application/zip

Semantics: The optional start and end dates determine which logs are returned. Each archive returned contains three log files: API server, Gateway server, and direct Vault access. Each file in the archive is text in tab-separated columns. API and Gateway logs are available in near real time; logs for Vault access may be delayed by up to several hours.

Log format:

API access logs:

{docId} {end time} {start time} {remote IP} {REST method} {request URI} {HTTP status} {hit number} {publisher id}

Gateway access logs:

{docId} {end time} {start time} {remote IP} {REST method} {request URI (without get parameters)} {HTTP status} {hit number} {publisher id}

Vault access logs:

{docId} {time} {remote IP} {operation} {key (anon name)} {request URI} {http protocol used} {HTTP status} {error code} {bytes sent} {object size} {total time} {referer} {user agent}
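Because the API and Gateway logs share the same nine tab-separated columns, one small parser can handle both. The following Python sketch is illustrative only; the dictionary key names are assumptions chosen to mirror the column list above, and the sample values in the usage note are invented.

```python
API_LOG_COLUMNS = [
    "docId", "end_time", "start_time", "remote_ip", "method",
    "request_uri", "http_status", "hit_number", "publisher_id",
]


def parse_access_log(text: str) -> list:
    """Split tab-separated API or Gateway log text into one dict per line."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        # Pair each column name with the corresponding tab-separated value.
        rows.append(dict(zip(API_LOG_COLUMNS, line.split("\t"))))
    return rows
```

A line whose sixth and seventh columns are /documents/94949 and 200 would yield a dict with "request_uri" set to "/documents/94949" and "http_status" set to "200". All values are kept as strings; the timestamp format is not specified in this revision.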

Landing Page Specification

The landingpage object may be null (undefined). When it is defined, it is a JSON structure whose fields are a type and landing page data. The type name identifies the format of the data. Two types of landing page are currently defined:

  1. No landing page

The gateway displays a simple default page including (an anchor for) the handle URL. This is represented by the JSON null value.

  2. URL

The landing page is externally defined. The gateway displays this page in an iframe or the equivalent, below the handle URL. The data is a URL template which must have the scheme “https” and which may specify fields from the audience record.

{
  "type" : "url",
  "href" : "https://{location}{?audienceFields}"
}

The “audienceFields” are filled from the audience records in the same way that the “viewerToken” is appended to the document request URL.

Challenge Specification

The challenge object may be null (undefined). When it is defined, it is a JSON structure whose fields are a type and the challenge data. The type name identifies the format of the data. Three types of challenge are defined:

  1. No challenge

The document must have an audience object (JSON) defined. If the viewer token matches a member of the audience (see Audience Specification), the challenge succeeds. There is no user interaction in this case. This is represented by the JSON null value.

  2. HTTP Auth

The document must have a challenge object (JSON), and may have an audience object. The challenge describes the authorization realm. If the challenge also describes a password, no audience object is required. Otherwise a member of the audience must match, and the audience data for the viewer must include the password.

{
  "type" : "HTTP-Auth",
  "data" : {
    "realm" : Authority scope (most browsers display as part of the prompt),
    "scheme" : "basic" / "digest",
    "access" : {
      "type" : "literal" / "field",
      "value" : user-pass string or name of audience field
    }
  }
}

The gateway server handles the HTTP Auth protocol directly for this challenge type. The exchange may use Basic authentication but must occur over the HTTP/S protocol. If challenge.data.access.type = “field” then there must be an audience and audience.type must be “records” (see Audience Specification below).

RFC 2617 defines the user-pass string as a “userid” and “passwd” separated by a colon (:). For the “basic” scheme (access.value is clear text) the “userid” exchanged in the Auth protocol is discarded when access.value does not contain a colon character, and the value is the passwd only. This allows the password to be any string in the audience data, such as a phone number. For the “digest” scheme the userid with colon must be part of the user-pass string (because it is required in order to create and compare the digest) and the producer and viewer are expected to have agreed on the userid and passwd in advance.
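The “basic” scheme rule above (discard the userid when access.value has no colon) can be sketched as follows. This Python example is an interpretation of that paragraph, not the gateway's actual code; the function name is hypothetical.

```python
import hmac


def basic_credentials_match(access_value: str, userid: str, passwd: str) -> bool:
    """Check credentials from a Basic exchange against access.value."""
    if ":" in access_value:
        # access.value is a full user-pass string: compare "userid:passwd".
        candidate = f"{userid}:{passwd}"
    else:
        # No colon: the userid is discarded and only the passwd is compared.
        candidate = passwd
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(candidate.encode("utf-8"), access_value.encode("utf-8"))
```

With access.value set to a phone number such as “4154445511”, any userid is accepted as long as the password equals that number.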

  3. OAuth

The document must have a challenge object, and may also have an audience object. In OAuth parlance, the document viewer is the “User”, the Armored Envoy gateway acts as the “Consumer”, the challenge object describes the “Service Provider”, and either the challenge or the audience object describes the “Consumer Key” and “Shared Secret”. It is expected that typically the challenge object contains the Consumer Key and the audience object contains a different Shared Secret for each viewer.

{
  "type" : "OAuth",
  "data" : {
    "provider" : {
      "href" : URI of OAuth Service Provider
    },
    "consumer" : {
      "type" : "literal" / "field",
      "value" : Consumer Key or name of audience field
    },
    "access" : {
      "type" : "literal" / "field",
      "value" : Shared Secret or name of audience field
    }
  }
}

If either of challenge.data.access.type or challenge.data.consumer.type has the value “field” then there must be an audience and audience.type must be “records” (see Audience Specification below). The value in the named field in the audience record is the string to use as the Consumer Key and/or Shared Secret. Otherwise (“literal”) it is up to the Service Provider to distinguish among viewers.
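The “literal” versus “field” distinction can be illustrated with a small resolver. The Python sketch below assumes the challenge and audience objects have already been parsed from JSON and that the matching audience record for the viewer has been found; the function name and error choices are hypothetical.

```python
from typing import Optional


def resolve_challenge_value(spec: dict, audience: Optional[dict], record: Optional[dict]) -> str:
    """Return the literal value, or look it up in the viewer's audience record."""
    if spec["type"] == "literal":
        return spec["value"]
    if spec["type"] == "field":
        # A "field" reference is only meaningful for a "records" audience.
        if audience is None or audience.get("type") != "records":
            raise ValueError('"field" requires an audience of type "records"')
        if record is None or spec["value"] not in record:
            raise KeyError(f"audience record has no field {spec['value']!r}")
        return record[spec["value"]]
    raise ValueError(f"unknown type: {spec['type']!r}")
```

The same helper applies to both challenge.data.consumer and challenge.data.access, since the two share the type/value shape.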

As specified in the “OAuth dance” the gateway (Consumer) redirects the User to the Service Provider for authorization; the Service Provider then redirects the User back to the gateway again with an appropriate verifier. The gateway then requests from the Service Provider an “Access Token” with expiration time. If an Access Token is returned, the gateway creates a handle for the originally requested document and redirects the User there.

In a typical OAuth exchange, the Access Token would then be used to make requests of the Service Provider. For Armored Envoy there are no further requests; only the expiration time is needed, during which the gateway creates document handles for that User. Once the Access Token expires, the dance must be repeated.

Future directions for challenges may include direct support for OpenID, and for proprietary challenges such as Google Authenticator.

Audience Specification

The audience object may be null (undefined). When it is defined, it is a JSON structure whose fields are a type name, a template for the viewer token, and the audience data. The type name identifies the format of the audience data.

The viewer token template is a URI template query component expression describing the query parameters that the gateway should expect to find in a valid document request. Typically this includes the fields named in challenge.data.access or challenge.data.consumer, but it may also include other fields. The set of fields in the viewer token is expected to uniquely identify a member of the audience; if it does not, the correct challenge is not guaranteed to be issued when the gateway processes a document request. Note that the entire audience is expected to view the same single document, so unique identity is not otherwise a security consideration.
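To make the relationship between the viewer token template and a document request concrete, the following Python sketch expands a query expression such as “{?telephone,first_name}” and appends it to the gateway URL. It assumes the expanded viewer token simply takes the place of {?viewerToken} in the placeholder, and it handles only the simple comma-separated form of query expression; the function names are hypothetical.

```python
import re
from urllib.parse import quote


def expand_query_template(template: str, values: dict) -> str:
    """Expand a form-style query expression such as "{?telephone,first_name}"."""
    match = re.fullmatch(r"\{\?([^}]*)\}", template)
    if match is None:
        raise ValueError(f"not a query expression: {template!r}")
    pairs = []
    for name in match.group(1).split(","):
        if name in values:  # undefined variables are omitted
            pairs.append(f"{name}={quote(str(values[name]), safe='')}")
    return "?" + "&".join(pairs) if pairs else ""


def document_request(gateway: str, doc_id: str, viewer_token: str, values: dict) -> str:
    """Complete the placeholder https://{gateway}/{docId}{?viewerToken}."""
    return f"https://{gateway}/{doc_id}{expand_query_template(viewer_token, values)}"
```

Using the example identifiers from this document, a viewer with telephone “4154445511” and first_name “Harry” would get https://ipo.st/94949?telephone=4154445511&first_name=Harry.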

ArmoredEnvoy supports three types of audience:

  1. No audience

The producer is expected to use the placeholder returned in the document record (see Insert a document above) to construct document requests, and to arrange for their delivery without use of this API.

  2. An opaque audience object to be passed to the fulfillment service

In this case the challenge must use challenge.data.access.type = “literal” as described above. The viewerToken shown below is an example only; it can be any query component or null, and it is not checked for accuracy by the API when storing an octet-stream object.

{
  "type" : "octet-stream",
  "viewerToken" : "{?email_address}",
  "data" : base64-encoded string to be passed to fulfillment service
}

  3. An array of JSON structures representing sets of name/value pairs

(Cf. dataRecords structure in iPost® XMLRPC importListData API). We define audience.data.records in anticipation of adding other descriptive fields to audience.data in the future. Each element of the array must include at least the fields named in the viewerToken, and all elements should have the same set of name/value pairs.

The following is an example only; in practice the field names of the elements of audience.data.records are caller-defined.

{
  "type" : "records",
  "viewerToken" : "{?telephone,first_name}",
  "data" : {
    "records" : [
      {
        "email_address" : "h.piltdown@ipo.st",
        "first_name" : "Harry",
        "last_name" : "Piltdown",
        "telephone" : "4154445511"
      },
      … and so on …
    ]
  }
}

Lookups on the audience data are performed by searching the records array for any match to the set of desired fields. All contexts discussed in this specification expect a single match to be returned, but future contexts may allow for multiple matches.
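The lookup described above amounts to a filter over the records array. The Python sketch below is a minimal illustration that enforces the single-match expectation; the function name is hypothetical, and treating zero or multiple matches as an error is an assumption about how a caller might handle those cases.

```python
def find_viewer(records: list, wanted: dict) -> dict:
    """Return the single audience record whose fields match all wanted values."""
    matches = [
        record for record in records
        if all(record.get(name) == value for name, value in wanted.items())
    ]
    if len(matches) != 1:
        raise LookupError(f"expected exactly one match, found {len(matches)}")
    return matches[0]
```

Here the wanted fields would typically be the query parameters taken from a document request, for example telephone and first_name.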

Fulfillment Specification

The fulfillment object may be null (undefined). When it is defined, it is a JSON structure that describes how ArmoredEnvoy should communicate with the fulfillment service. Note that this occurs only as a result of activating the deliver controller (POST /documents/{docId}/deliver) and all necessary authentication tokens must be included in the form-data passed in that request. No authentication information for fulfillment is stored in the document records.

There are two varieties of fulfillment object:

  1. A generic fulfillment description. This consists of a document using URI template syntax, which is filled in with data from the input form-data and from the audience object and is then sent to a URI for the fulfillment service with POST.
  2. A custom fulfillment description which differs for each fulfillment service and is created via an integration project with the fulfillment vendor and/or producer.

The first custom fulfillment object will define an interface to the iPost® XMLRPC iPost Mailing Manager™ (iMM™) triggered mailing API.

Document Access UX for Audience

  1. User receives an email message containing a cryptographically encoded link that identifies
    1. the viewer and
    2. a placeholder for the document
  2. User clicks on the encoded link;
    1. the browser contacts the server, which decodes the link and builds a document request from the placeholder;
    2. the browser is then redirected to the document request
  3. The document request resolves to the gateway;
    1. if there is a challenge, the gateway presents it;
    2. when the user is validated, a handle is created and
    3. the browser is directed to the handle
  4. The browser downloads the document and presents it to the user.