HTTP Gateway Protocol Specification
Introduction
The HTTP Gateway Protocol is an extension of the Internet Computer Protocol that allows conventional HTTP clients to interact with the Internet Computer network. This is important for software such as web browsers to be able to fetch and render client-side canister code, including HTML, CSS, and JavaScript as well as other static assets such as images or videos. The HTTP Gateway does this by translating between standard HTTP requests and API canister calls that the Internet Computer Protocol will understand.
Such an HTTP Gateway could be a stand-alone proxy, it could be implemented in web browsers (natively, via a plugin or a service worker) or in other ways. This document describes the interface and semantics of this protocol independent of a concrete HTTP Gateway so that all HTTP Gateway Protocol implementations can be compatible.
Overview
An HTTP request by an HTTP client is handled by these steps:
- An HTTP client makes a request.
- The HTTP Gateway intercepts the request.
- The HTTP Gateway resolves the canister ID that the request is intended for.
- The HTTP Gateway Candid encodes the HTTP request.
- The HTTP Gateway invokes the canister via a query call to the http_request canister method.
- The canister handles the request and returns an HTTP response, encoded in Candid.
- The HTTP Gateway Candid decodes the response for inspection and further processing.
- If requested by the canister, the HTTP Gateway sends the request again via an update call to http_request_update.
- If applicable, the HTTP Gateway fetches further response body data via streaming query calls.
- If applicable, the HTTP Gateway validates the certificate of the response.
- The HTTP Gateway returns the decoded response to the HTTP client.
Canister ID Resolution
The HTTP Gateway needs to determine the canister ID for each incoming request before it can forward the request to the Internet Computer. The mechanism by which the canister ID is resolved is not prescribed by this specification and may vary across implementations. Some examples of how a canister ID can be obtained include:
- Extracted from the hostname (e.g., a canister ID encoded as a subdomain).
- Looked up via DNS (e.g., a TXT record at a well-known subdomain).
- Retrieved from a static mapping configured in the gateway.
- Provided via an HTTP response header returned during a pre-flight lookup.
If the HTTP Gateway cannot determine a canister ID for a request, it may handle the request as a standard Web2 request or return an error, depending on the implementation.
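As an illustration of the first and third options above, the following is a minimal sketch, not a prescribed mechanism. The function name resolve_canister_id, the STATIC_MAPPING table, and the example hostnames are assumptions made for this example; the regular expression is only a loose structural check on the textual ID and does not validate it further.

```python
import re
from typing import Optional

# Textual canister IDs are groups of five base32 characters separated by
# dashes, ending in a shorter group (for example "ryjl3-tyaaa-aaaaa-aaaba-cai").
# This is a loose shape check only; other dashed labels can also match it.
CANISTER_ID_RE = re.compile(r"^(?:[a-z2-7]{5}-)+[a-z2-7]{1,5}$")

# Hypothetical static mapping that a gateway operator might configure.
STATIC_MAPPING = {"example.com": "ryjl3-tyaaa-aaaaa-aaaba-cai"}


def resolve_canister_id(hostname: str) -> Optional[str]:
    """Return a canister ID for the hostname, or None if none can be found."""
    host = hostname.lower().split(":")[0]  # drop an optional port
    if host in STATIC_MAPPING:
        return STATIC_MAPPING[host]
    # Otherwise look for a subdomain label that has the shape of a canister ID.
    for label in host.split("."):
        if CANISTER_ID_RE.match(label):
            return label
    return None
```

A caller that receives None would then fall back to whatever the implementation chooses, such as returning an error.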
API Boundary Node Resolution
An API Boundary Node forwards Candid-encoded HTTP requests to the relevant replica node. Any requests to the Internet Computer made by an HTTP Gateway are forwarded through these API boundary nodes. The hostname of the API boundary nodes is always icp-api.io.
HTTP Request Encoding
An HTTP request is encoded using the following Candid interface:
type HeaderField = record { text; text; };
type HttpRequest = record {
  method: text;
  url: text;
  headers: vec HeaderField;
  body: blob;
  certificate_version: opt nat16;
};
The full Candid interface is described in Canister HTTP Interface.
- The method field contains the HTTP method in all upper case letters, e.g. "GET".
- The url field contains the URL from the HTTP request line, i.e. without protocol or hostname, and includes query parameters.
- The headers field contains the headers of the HTTP request.
- The body field contains the body of the HTTP request (without any content encodings processed by the HTTP Gateway).
- The certificate_version field indicates the maximum supported version of response verification.
  - A value of 2 will request the current standard of response verification, while a missing version or a value of 1 will request the legacy standard.
  - Current HTTP Gateway implementations will always request version 2, but older HTTP Gateways may still request version 1.
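The field rules above can be sketched as follows. This only assembles the record fields in memory; the actual Candid serialization is left to a Candid library and is not shown. The names HttpRequest (as a Python TypedDict) and to_http_request are assumptions made for this example.

```python
from typing import List, Optional, Tuple, TypedDict


class HttpRequest(TypedDict):
    """In-memory mirror of the HttpRequest record fields."""
    method: str
    url: str
    headers: List[Tuple[str, str]]
    body: bytes
    certificate_version: Optional[int]


def to_http_request(method: str, target: str,
                    headers: List[Tuple[str, str]], body: bytes) -> HttpRequest:
    """Build the record fields from the parts of an incoming HTTP request."""
    return HttpRequest(
        method=method.upper(),      # all upper case, e.g. "GET"
        url=target,                 # request-line URL: path and query only
        headers=list(headers),      # passed through unchanged
        body=body,                  # raw body, no content decoding applied
        certificate_version=2,      # request the current verification standard
    )
```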
Query Calls
The encoded HTTP request is sent as a query call according to the HTTPS Interface via the API Boundary Node resolved according to API Boundary Node Resolution.
HTTP Response Decoding
An HTTP response is decoded from the result of the query call using the following Candid interface:
type HeaderField = record { text; text; };
type HttpResponse = record {
  status_code: nat16;
  headers: vec HeaderField;
  body: blob;
  upgrade : opt bool;
  streaming_strategy: opt StreamingStrategy;
};
The full Candid interface is described in Canister HTTP Interface.
- The HTTP response status code is taken from the status_code field.
- The HTTP response headers are taken from the headers field.
- The HTTP response body is initialized with the value of the body field and further assembled as per the response body streaming protocol.
Notes:
- Not all HTTP Gateway implementations may be able to pass on all forms of headers. In particular, Service Workers are unable to pass on forbidden headers.
- HTTP Gateways may add additional headers. In particular, the following headers may be set:
  - access-control-allow-origin: *
  - access-control-allow-methods: GET, POST, HEAD, OPTIONS
  - access-control-allow-headers: DNT,User-Agent,X-Requested-With,If-Modified-Since,Cache-Control,Content-Type,Range,Cookie
  - access-control-expose-headers: Content-Length,Content-Range
  - x-cache-status: MISS
Response Verification
The HTTP Gateway will primarily be used to load static assets needed to run frontend canister code, so both low latency and security are essential for providing a good experience to end users. Query calls are more performant but less secure than update calls.
Response verification fills the security gap left by query calls. It is a versioned subprotocol that allows an HTTP Gateway to verify a certified response received as a result of performing a query call to the Internet Computer. Two versions are currently supported: the current version of response verification is covered in this section, and the legacy version is covered in another section. The legacy version only includes a mapping of the request URL to the response body, so it is limited in what it can verify. The current version builds on the legacy version by optionally including the following extra parameters in the certification process:
- Request URL query params
- Request method
- Request headers
- Response status code
- Response headers
Response Verification Outline
- Case-insensitive search for the IC-Certificate response header.
  - If no such header is found, verification fails.
  - If the header value is not structured as per the certificate header, verification fails.
- Parse the certificate and tree fields from the IC-Certificate header value as per the certificate header.
- Perform certificate validation.
- Parse the version field from the IC-Certificate header value as per the certificate header.
  - If the version field is missing or equal to 1 then proceed with legacy response verification.
  - If the version field is equal to 2 then continue.
  - Otherwise, verification fails.
- Parse the expr_path field from the IC-Certificate header value as per the certificate header.
- The parsed expr_path must be valid as per Expression Path; otherwise, verification fails.
- Case-insensitive search for the IC-CertificateExpression header.
  - If no such header is found, verification fails.
  - If the header value is not structured as per the certificate expression header, verification fails.
- Let expr_hash be the label of the node in the tree at path expr_path.
  - If no such label exists, verification fails.
- If expr_hash does not match the sha256 hash of the IC-CertificateExpression header value, verification fails.
- If no_certification is set, verification succeeds.
- Let response_hash be the response hash calculated according to Response Hash Calculation.
- If no_request_certification is set:
  - If the expr_hash label node has an empty leaf node at the subpath ["", response_hash], verification succeeds.
  - Otherwise, verification fails.
- Let request_hash be the request hash calculated according to Request Hash Calculation.
  - If there is not an empty leaf node at the subpath [request_hash, response_hash], verification fails.
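The tree lookups at the end of the outline can be sketched as follows, covering only the steps from expr_hash onwards. This is a simplified illustration under strong assumptions: the hash tree is modelled as nested Python dicts keyed by byte labels, an empty leaf is the empty byte string, and header parsing and certificate validation have already succeeded. The names lookup and verify_v2 are hypothetical.

```python
import hashlib
from typing import List

# Assumption: nested dicts stand in for the hash tree. Real hash trees are
# CBOR-encoded and may contain pruned nodes, which this sketch ignores.


def lookup(tree, path: List[bytes]):
    """Walk the path; return the node found there, or None if it is absent."""
    node = tree
    for label in path:
        if not isinstance(node, dict) or label not in node:
            return None
        node = node[label]
    return node


def verify_v2(tree, expr_path: List[bytes], expr_header_value: str,
              request_hash: bytes, response_hash: bytes,
              no_certification: bool, no_request_certification: bool) -> bool:
    # The expression hash must appear as a label below expr_path.
    expr_hash = hashlib.sha256(expr_header_value.encode("utf-8")).digest()
    expr_node = lookup(tree, list(expr_path) + [expr_hash])
    if expr_node is None:
        return False
    if no_certification:
        return True
    if no_request_certification:
        # The request hash is replaced by the empty label.
        return lookup(expr_node, [b"", response_hash]) == b""
    return lookup(expr_node, [request_hash, response_hash]) == b""
```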
The Certificate Header
The IC-Certificate header is a structured header according to RFC 8941 with the following mandatory fields:
- certificate: Base64 encoded string of self-describing, CBOR-encoded bytes that decode into a valid certificate.
- tree: Base64 encoded string of self-describing, CBOR-encoded bytes that decode into a valid hash tree as per certificate encoding.
The following additional fields are mandatory for response verification version 2 and upwards:
- version: String representation of an integer that represents the version of response verification that was used to build the tree.
- expr_path: Base64 encoded string of self-describing, CBOR-encoded bytes that decode into an array of strings.
Expression Path
The decoded expr_path field of The Certificate Header is an array of strings that corresponds to a path in the tree field of the same header:
- The first segment is always http_expr.
- The last segment is always <$> or <*>.
- No segment, aside from the last segment, will be <$> or <*>.
- Each segment between http_expr and <$> or <*> will contain a percent-encoded segment of the current request URL.
- The path must be the most specific path for the current request URL in the tree, i.e. a lookup of more specific paths must return Absent as per lookup.
- An expr_path that ends in <$> is an exact match for the current request URL.
- <*> is treated as a wildcard, so an expr_path that ends in <*> is a partial match for the current request URL.
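The structural rules above can be checked with a short sketch like the one below. It does not cover the "most specific path" rule, which needs lookups in the tree. How the URL path is split into segments is an assumption here (the leading slash is dropped and the rest is split on "/"), and the function names are hypothetical.

```python
from typing import List

TERMINATORS = ("<$>", "<*>")


def is_well_formed_expr_path(expr_path: List[str]) -> bool:
    """Check the first segment, the last segment, and the segments between."""
    if len(expr_path) < 2 or expr_path[0] != "http_expr":
        return False
    if expr_path[-1] not in TERMINATORS:
        return False
    return all(segment not in TERMINATORS for segment in expr_path[:-1])


def matches_request_path(expr_path: List[str], url: str) -> bool:
    """Exact match for "<$>", prefix match for "<*>"."""
    if not is_well_formed_expr_path(expr_path):
        return False
    path = url.split("?", 1)[0]          # ignore the query string
    if path.startswith("/"):
        path = path[1:]
    url_segments = path.split("/")       # still percent-encoded
    middle = expr_path[1:-1]
    if expr_path[-1] == "<$>":
        return middle == url_segments
    return url_segments[:len(middle)] == middle
```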
Certificate Validation
Certificate validation is performed as part of response verification as per Canister Signatures and Certification. It is expanded on here concerning response verification for completeness:
- Case-insensitive search for a response header called IC-Certificate.
- The value of the header corresponds to the format described in the certificate header section.
- The decoded certificate must pass the following validations:
  - The certificate is signed by the root key of the NNS subnet or by a subnet delegation signed by that same root key.
  - If the certificate contains a subnet delegation, the delegation must be valid for the given canister.
  - The timestamp at the /time path must be recent, e.g. within the last 5 minutes.
  - The subnet state tree in the certificate must reveal the canister’s certified data.
- The root hash of the decoded tree must match the canister’s certified data.
The Certificate Expression Header
The IC-CertificateExpression header carries additional information instructing the HTTP Gateway how to reconstruct the certification. It can instruct the HTTP Gateway to:
- Exclude the complete request/response pair or the request only.
- Include specific request headers.
- Include specific request URL query parameters.
- Include or exclude specific response headers.
The format of the IC-CertificateExpression header is as follows:
IC-CertificateExpression: default_certification(ValidationArgs{<literal field values>})
The value of this header must have valid CEL syntax, such that default_certification could be implemented as a function provided by the HTTP Gateway to validate the certification.
The properties supplied to this function are as follows:
- certified_request_headers - a list of request header names to include. This list can be empty.
  - Mutually exclusive with the no_request_certification property.
- certified_query_parameters - a list of request URL query parameter names to include. This list can be empty.
  - Mutually exclusive with the no_request_certification property.
- certified_response_headers - a list of response header names to include.
  - Must not include IC-Certificate or IC-CertificateExpression.
  - Mutually exclusive with the response_header_exclusions property.
- response_header_exclusions - a list of response header names to exclude. All other headers are included.
  - Must not include IC-Certificate or IC-CertificateExpression.
  - Mutually exclusive with the certified_response_headers property.
- no_request_certification - disables certification of the request for this HTTP response.
  - Mutually exclusive with the certified_request_headers and certified_query_parameters properties.
  - This feature has security implications. If it is used on a path that serves dynamic content using the upgrade to update call feature, malicious replica nodes can always return the certified response, instead of setting the upgrade flag on the response.
- no_certification - disables certification for this HTTP request/response pair.
  - This feature has security implications. Clients will not be able to verify the authenticity of the response content if it is used. Dynamic content can be returned securely by making use of the upgrade to update feature. Only use no_certification if the content is dynamic, the latency of an update call is too high and the impact of a malicious response on that path is benign.
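To make the header format concrete, here is a small sketch that assembles a header value for one common shape: a list of certified response headers, with request certification either configured or disabled. The helper names string_list and certificate_expression are hypothetical. One assumption is labeled in the code: list elements are joined with commas, as CEL list literals require, since the grammar in this section does not spell out a separator.

```python
from typing import List, Optional


def string_list(items: List[str]) -> str:
    # Assumption: elements are comma-separated, as in CEL list literals.
    return "[" + ",".join('"' + item + '"' for item in items) + "]"


def certificate_expression(
    certified_response_headers: List[str],
    certified_request_headers: Optional[List[str]] = None,
    certified_query_parameters: Optional[List[str]] = None,
) -> str:
    """Build a header value that certifies the listed response headers.

    If both request lists are None, request certification is disabled.
    """
    if certified_request_headers is None and certified_query_parameters is None:
        request = "no_request_certification:Empty{}"
    else:
        request = (
            "request_certification:RequestCertification{certified_request_headers:"
            + string_list(certified_request_headers or [])
            + ",certified_query_parameters:"
            + string_list(certified_query_parameters or [])
            + "}"
        )
    response = (
        "response_certification:ResponseCertification{certified_response_headers:"
        "ResponseHeaderList{headers:" + string_list(certified_response_headers) + "}}"
    )
    return ("default_certification(ValidationArgs{certification:Certification{"
            + request + "," + response + "}})")
```

The sha256 hash of the resulting string is what the verification outline compares against expr_hash, so a canister and an HTTP Gateway must agree on it byte for byte.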
The ValidationArgs object has the following Protocol Buffer 3 definition:
message ResponseHeaderList {
  repeated string headers = 1;
}
message RequestCertification {
  repeated string certified_request_headers = 1;
  repeated string certified_query_parameters = 2;
}
message ResponseCertification {
  oneof response_headers {
    ResponseHeaderList certified_response_headers = 1;
    ResponseHeaderList response_header_exclusions = 2;
  }
}
message Certification {
  oneof request {
    RequestCertification request_certification = 1;
    Empty no_request_certification = 2;
  }
  ResponseCertification response_certification = 3;
}
message ValidationArgs {
  oneof certification {
    Certification certification = 1;
    Empty no_certification = 2;
  }
}
The syntax of the header is defined by the following EBNF:
CHAR = /[^\0\n"]/
STRING = '"', { CHAR }, '"'
STRING-LIST = '[', { STRING }, ']'
RESPONSE-HEADER-LIST = 'ResponseHeaderList{headers:', STRING-LIST, '}'
REQUEST-CERTIFICATION = 'RequestCertification{certified_request_headers:', STRING-LIST, ',certified_query_parameters:', STRING-LIST, '}'
RESPONSE-CERTIFICATION = 'ResponseCertification{', ('response_header_exclusions:' | 'certified_response_headers:'), RESPONSE-HEADER-LIST, '}'
CERTIFICATION = 'Certification{', ('no_request_certification:Empty{}' | 'request_certification:', REQUEST-CERTIFICATION), ',response_certification:', RESPONSE-CERTIFICATION, '}'
VALIDATION-ARGS = 'ValidationArgs{', ('no_certification:Empty{}' | 'certification:', CERTIFICATION), '}'
HEADER-VALUE = 'default_certification(', VALIDATION-ARGS, ')'
HEADER = 'IC-CertificateExpression:', HEADER-VALUE
Request Hash Calculation
The request hash is calculated as follows:
- Let request_headers_hash be the representation-independent hash of the request headers:
  - The header names are lower-cased.
  - Only include headers listed in the certified_request_headers field of the certificate expression header.
    - If the field is empty or no value was supplied, no headers are included.
  - Headers can be repeated and each repetition should be included.
  - Include an additional :ic-cert-method header that contains the HTTP method of the request.
  - Include an additional :ic-cert-query header that contains a value according to the following steps:
    - Parse the query string and build a list of tuples (<query_param_name>, <query_param_value>) while maintaining the order.
    - Exclude all tuples where <query_param_name> does not exactly match a value listed in the certified_query_parameters field of the certificate expression header. If certified_query_parameters is empty then the resulting list of tuples should also be empty.
    - Concatenate each <query_param_name> with the corresponding <query_param_value> and then concatenate all of these concatenations using the original separators and order.
    - Calculate the sha256 hash of the UTF-8 representation of the resulting string.
- Let request_body_hash be the sha256 of the request body.
- Concatenate request_headers_hash and request_body_hash and calculate the sha256 of that concatenation.
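The :ic-cert-query value and the final concatenation can be sketched as follows. The representation-independent hash of the headers is taken as an input and not computed here. Two assumptions are made: the query string uses "&" between parameters and "=" between name and value, and parameter names are compared as they appear in the URL, without percent-decoding. The function names are hypothetical.

```python
import hashlib
from typing import List


def ic_cert_query_value(query_string: str,
                        certified_query_parameters: List[str]) -> bytes:
    """sha256 over the certified query parameters, in their original order."""
    kept = []
    pairs = query_string.split("&") if query_string else []
    for pair in pairs:
        name = pair.split("=", 1)[0]
        if name in certified_query_parameters:
            kept.append(pair)  # keeps the original "name=value" text
    return hashlib.sha256("&".join(kept).encode("utf-8")).digest()


def request_hash(request_headers_hash: bytes, request_body: bytes) -> bytes:
    """sha256(request_headers_hash || sha256(request_body))."""
    request_body_hash = hashlib.sha256(request_body).digest()
    return hashlib.sha256(request_headers_hash + request_body_hash).digest()
```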
Response Hash Calculation
The response hash is calculated as follows:
- Let response_headers_hash be the representation-independent hash of the response headers:
  - The header names are lower-cased.
  - The IC-Certificate header is always excluded.
  - The IC-CertificateExpression header is always included.
  - If the no_certification field of the certificate expression header is present:
    - This request/response pair is exempt from certification and the response hash calculation can be skipped altogether.
  - If the certified_response_headers field of the certificate expression header is present:
    - All headers listed by certified_response_headers are included (except for the IC-Certificate header).
    - All others are excluded (except for the IC-CertificateExpression header).
  - If the response_header_exclusions field of the certificate expression header is present:
    - All headers listed (except for the IC-CertificateExpression header) are excluded from the certification.
    - All other headers (except for the IC-Certificate header) are included in the certification.
  - Headers can be repeated and each repetition should be included.
  - Include an additional :ic-cert-status header that contains the numerical HTTP status code of the response.
- Let response_body_hash be the sha256 of the response body.
- Concatenate response_headers_hash and response_body_hash and calculate the sha256 of that concatenation.
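The header selection rules above can be sketched as a filter that produces the list of headers to be hashed. The representation-independent hash itself is not computed here. The function name headers_to_certify is hypothetical, and comparing the listed names case-insensitively is an assumption of this sketch.

```python
from typing import List, Optional, Tuple, Union

Header = Tuple[str, Union[str, int]]


def headers_to_certify(headers: List[Tuple[str, str]], status_code: int,
                       certified_response_headers: Optional[List[str]] = None,
                       response_header_exclusions: Optional[List[str]] = None
                       ) -> List[Header]:
    """Select the response headers that take part in the response hash."""
    selected: List[Header] = []
    for name, value in headers:
        lower = name.lower()
        if lower == "ic-certificate":
            continue                              # always excluded
        if lower == "ic-certificateexpression":
            selected.append((lower, value))       # always included
        elif certified_response_headers is not None:
            if lower in {h.lower() for h in certified_response_headers}:
                selected.append((lower, value))
        elif response_header_exclusions is not None:
            if lower not in {h.lower() for h in response_header_exclusions}:
                selected.append((lower, value))
        # If neither list is given, no further headers are selected.
    selected.append((":ic-cert-status", status_code))
    return selected
```

Repeated headers are kept as separate entries because the loop never merges values by name.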
Multiple CEL Expression Hashes Per Expression Path
Adding one CEL expression hash per expression path should be the default and most common case as it is the most secure approach. It is, however, possible to add multiple CEL expression hashes per expression path, if the flexibility is needed by a canister. This feature is quite dangerous and must be used with extreme caution. By adding a second CEL expression hash, a canister is giving a malicious replica node freedom to choose a different CEL expression hash for a request than what is intended by the canister. This could be used to expose a potential vulnerability that does not exist if the intended CEL expression hash is used. This should only be used in cases where the difference in CEL expression hashes is benign and will not pose a security threat to the canister, or there is not sufficient overlap between the CEL expressions to allow the replica node to freely choose between them.
Multiple Response Hashes Per Request Hash
Similar to Multiple CEL Expression Hashes Per Expression Path, this is a feature that is intended to allow for flexibility when it is needed. It is also dangerous and must be used with great care. By adding multiple response hashes for a single request hash, a malicious replica can freely choose between any of those response hashes for that request hash. This should only be used in cases where the difference between the responses is benign and will not pose a security threat to the canister.
Response Body Streaming
The HTTP Gateway protocol has provisions to transfer further chunks of the body data from the canister to the HTTP Gateway, to overcome the message limit of the Internet Computer. This streaming protocol is independent of any possible streaming of data between the HTTP Gateway and the HTTP client. The HTTP Gateway may assemble the response as a whole before passing it on, or pass the chunks on directly, on the TCP or HTTP level, as it sees fit. When the HTTP Gateway is certifying the response, it must not pass on uncertified chunks.
If the streaming_strategy field of the HttpResponse is set, the HTTP Gateway then uses further query calls to obtain further chunks to append to the body:
If the function reference in the callback field of the streaming_strategy is not a method of the given canister, the HTTP Gateway fails the request.
Else, it makes a query call to the given method, passing the token value given in the streaming_strategy as the argument.
That method returns a StreamingCallbackHttpResponse. The body therein is appended to the body of the HTTP response. This is repeated as long as the method returns some token in the token field until that field is null.
The type of the token value is chosen by the canister; the HTTP Gateway obtains the Candid type of the encoded message from the canister and uses it when passing the token back to the canister. This generic use of Candid is not covered by the Candid specification, and may not be possible in some cases (e.g. when using “future types”). Canister authors may have to use “simple” types.
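The chunk loop described above can be sketched as follows. The query_callback parameter is a hypothetical stand-in for the query call to the canister's callback method; it takes a token and returns the chunk body together with the next token, or None when there are no further chunks. The max_chunks guard is an addition of this sketch and is not part of the protocol.

```python
from typing import Any, Callable, Optional, Tuple

# Hypothetical stand-in for a query call to the streaming callback method.
QueryCallback = Callable[[Any], Tuple[bytes, Optional[Any]]]


def assemble_body(initial_body: bytes, token: Optional[Any],
                  query_callback: QueryCallback,
                  max_chunks: int = 1000) -> bytes:
    """Append streamed chunks to the initial body until the token is None."""
    body = bytearray(initial_body)
    chunks = 0
    while token is not None:
        if chunks >= max_chunks:
            raise RuntimeError("streaming callback returned too many chunks")
        chunk, token = query_callback(token)  # token is opaque to the gateway
        body.extend(chunk)
        chunks += 1
    return bytes(body)
```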
Upgrade to Update Calls
If the canister sets upgrade = opt true in the HttpResponse reply from the http_request call, then the HTTP Gateway ignores all other fields of the response. The HTTP Gateway performs an update call to http_request_update, passing an HttpUpdateRequest record as the argument, and uses the resulting response from http_request_update instead. The HttpUpdateRequest record is identical to the original HttpRequest, with the certificate_version field excluded.
The value of the upgrade field returned from http_request_update is ignored.
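The upgrade flow can be sketched as follows. The query_call and update_call parameters are hypothetical stand-ins for the calls made through the API boundary node, and requests and responses are modelled as plain dicts mirroring the Candid records.

```python
from typing import Any, Callable, Dict

Call = Callable[[str, Dict[str, Any]], Dict[str, Any]]


def handle_request(request: Dict[str, Any],
                   query_call: Call, update_call: Call) -> Dict[str, Any]:
    """Query first; repeat as an update call if the canister asks for it."""
    response = query_call("http_request", request)
    if response.get("upgrade") is True:
        # HttpUpdateRequest is the HttpRequest without certificate_version.
        update_request = {key: value for key, value in request.items()
                          if key != "certificate_version"}
        # All other fields of the query response are discarded.
        response = update_call("http_request_update", update_request)
        # The upgrade field of this second response is ignored.
    return response
```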
Legacy Response Verification
Version 1 response verification only supports verifying a request path and response body pair with only one response per request path. This is quite restrictive in the number of scenarios it can support. For example, redirection or client-side caching is not safe since the status code and headers required to verify responses of that nature are not included in the certification. Upon a query call to a canister’s http_request method, a single malicious node or boundary node can modify these parts of the HTTP response, leading to the following issues:
- Apps cannot load the service worker when embedded within iFrames.
- The use of redirects and cookies is unsafe as they can be manipulated by malicious nodes.
- This is unexpected for developers and will lead to vulnerabilities in apps sooner or later.
- The effectiveness of security headers (such as Content Security Policy) is diminished as they can be omitted or modified by malicious nodes.
Response Verification version 2 overcomes these issues.
The steps for response verification are as follows:
- See the response verification outline for the full subprotocol description.
- Assert that the canister returning the response does not have support for response verification v2 via Response Verification Version Assertion.
  - If the canister reports that it has support for response verification v2, verification fails.
  - Otherwise, continue.
- The path ["http_assets", <url>] exists in the tree and is a leaf with a value, where <url> is the utf8-encoded URL from the HttpRequest.
- Otherwise, the path ["http_assets", "/index.html"] must exist in the tree and be a leaf.
- That leaf must contain the SHA-256 hash of the decoded body.
  - If the streaming_strategy field of the HttpResponse is set, all chunks are streamed and concatenated according to response body streaming before decoding.
  - The body is decoded according to the Content-Encoding header if present. Supported values for the Content-Encoding header include gzip and deflate.
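The final comparison in the legacy steps, hashing the decoded body, can be sketched as follows. It assumes the body has already been fully assembled from any streamed chunks, that deflate data is zlib-wrapped, and that a body with any other encoding is hashed as received. The function name decoded_body_hash is hypothetical.

```python
import gzip
import hashlib
import zlib
from typing import Optional


def decoded_body_hash(body: bytes, content_encoding: Optional[str]) -> bytes:
    """Return the SHA-256 hash of the body after undoing Content-Encoding."""
    encoding = (content_encoding or "").strip().lower()
    if encoding == "gzip":
        body = gzip.decompress(body)
    elif encoding == "deflate":
        body = zlib.decompress(body)  # assumes zlib-wrapped deflate data
    return hashlib.sha256(body).digest()
```

Verification of the body then succeeds only if the value of the leaf found in the tree equals this hash.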
Response Verification Version Assertion
Canisters can report the supported versions of response verification using (public) metadata sections available in the system state tree. This metadata will be read by the HTTP Gateway using a read_state request. The metadata section must be a (public) custom section with the name supported_certificate_versions and contain a comma-delimited string of versions, e.g., 1,2. This is treated as an optional, additional layer of security for canisters supporting multiple versions. If the metadata has not been added (i.e., the read_state request succeeds and the lookup of the metadata section in the read_state response certificate returns Absent), then the HTTP Gateway will allow for whatever version the canister has responded with.
The request for the metadata will only be made by the HTTP Gateway if there is a downgrade. If the HTTP Gateway requests v2 and the canister responds with v2, then a request will not be made. If the HTTP Gateway requests v2 and the canister responds with v1, a request will be made. If a request is made, the HTTP Gateway will not accept any response from the canister that is below the max version supported by both the HTTP Gateway and the canister. This will guarantee that a canister supporting both v1 and v2 will always have v2 security when accessed by an HTTP Gateway that supports v2.
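The downgrade check can be sketched as a single decision function. The name accept_response_version and its parameters are hypothetical. The supported_metadata argument is the comma-delimited metadata string, or None when the lookup returned Absent; in a real gateway it would only be fetched when a downgrade is detected. How to treat metadata that lists no version the gateway supports is not covered above, so the sketch accepts the response in that case as an assumption.

```python
from typing import Optional


def accept_response_version(requested: int, responded: int,
                            supported_metadata: Optional[str],
                            gateway_max: int = 2) -> bool:
    """Decide whether a response of the given version may be accepted."""
    if responded >= requested:
        return True   # no downgrade, so the metadata is not consulted
    if supported_metadata is None:
        return True   # metadata is Absent: allow whatever was returned
    canister_versions = {int(v) for v in supported_metadata.split(",") if v.strip()}
    # Highest version supported by both sides. Assumption: fall back to the
    # responded version if the metadata lists nothing the gateway supports.
    common_max = max((v for v in canister_versions if v <= gateway_max),
                     default=responded)
    return responded >= common_max
```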
Canister HTTP Interface
The full Candid interface that a canister is expected to implement is as follows:
type HeaderField = record { text; text; };
type HttpRequest = record {
  method: text;
  url: text;
  headers: vec HeaderField;
  body: blob;
  certificate_version: opt nat16;
};
type HttpUpdateRequest = record {
  method: text;
  url: text;
  headers: vec HeaderField;
  body: blob;
};
type HttpResponse = record {
  status_code: nat16;
  headers: vec HeaderField;
  body: blob;
  upgrade : opt bool;
  streaming_strategy: opt StreamingStrategy;
};
// Each canister that uses the streaming feature gets to choose their concrete
// type; the HTTP Gateway will treat it as an opaque value that is only fed to
// the callback method
type StreamingToken = /* application-specific type */
type StreamingCallbackHttpResponse = record {
  body: blob;
  token: opt StreamingToken;
};
type StreamingStrategy = variant {
  Callback: record {
    callback: func (StreamingToken) -> (opt StreamingCallbackHttpResponse) query;
    token: StreamingToken;
  };
};
service : {
  http_request: (request: HttpRequest) -> (HttpResponse) query;
  http_request_update: (request: HttpUpdateRequest) -> (HttpResponse);
}
You can also download the file.
Not all of this interface is required. The following sections detail what can be optionally omitted depending on the requirements of the canister in question.
Note. Composite query methods can be used instead of query methods to allow for calling composite query methods of other canisters on the same subnet when processing an HTTP request, e.g., the canister can export
http_request: (request: HttpRequest) -> (HttpResponse) composite_query;
instead of
http_request: (request: HttpRequest) -> (HttpResponse) query;
Response Verification Interface
The certificate_version field of the HttpRequest interface is optional depending on the version of response verification that the canister is implementing. It is omitted in older canisters that do not implement response verification version 2 or later.
type HttpRequest = record {
  // ...
  certificate_version: opt nat16;
};
Upgrade to Update Calls Interface
The http_request_update method of the service interface along with the upgrade field of the HttpResponse interface is optional depending on whether the canister needs to use the upgrade to update calls feature. Note that the HttpUpdateRequest type is the same as the HttpRequest type, but excludes the certificate_version field since this should not affect the response to an update call from a canister.
type HttpUpdateRequest = record {
  method: text;
  url: text;
  headers: vec HeaderField;
  body: blob;
};
type HttpResponse = record {
  // ...
  upgrade : opt bool;
  // ...
};
service : {
  // ...
  http_request_update: (request: HttpUpdateRequest) -> (HttpResponse);
}
Response Body Streaming Interface
The StreamingToken, StreamingCallbackHttpResponse, and StreamingStrategy interfaces along with the streaming_strategy field of the HttpResponse interface are optional depending on whether the canister needs to use the response body streaming feature.
type HttpResponse = record {
  // ...
  streaming_strategy: opt StreamingStrategy;
};
// Each canister that uses the streaming feature gets to choose their concrete
// type; the HTTP Gateway will treat it as an opaque value that is only fed to
// the callback method
type StreamingToken = /* application-specific type */
type StreamingCallbackHttpResponse = record {
  body: blob;
  token: opt StreamingToken;
};
type StreamingStrategy = variant {
  Callback: record {
    callback: func (StreamingToken) -> (opt StreamingCallbackHttpResponse) query;
    token: StreamingToken;
  };
};
Minimum Canister Interface
If none of the above optional features are needed by a canister, the minimum Candid interface that it needs to implement is as follows:
type HeaderField = record { text; text; };
type HttpRequest = record {
  method: text;
  url: text;
  headers: vec HeaderField;
  body: blob;
};
type HttpResponse = record {
  status_code: nat16;
  headers: vec HeaderField;
  body: blob;
};
service : {
  http_request: (request: HttpRequest) -> (HttpResponse) query;
}