For Development HEAD DRAFTSearch (procedure/syntax/module):

Next: , Previous: , Up: Library modules - Utilities   [Contents][Index]

12.46 rfc.http - HTTP client

Module: rfc.http

This module provides a simple client API for HTTP/1.1, defined in RFC2616, "Hypertext Transfer Protocol – HTTP/1.1" (https://www.ietf.org/rfc/rfc2616.txt).

Current API implements only a part of HTTP/1.1 protocol. Support for some advanced features such as persistent connection may be added in the future versions.

If you’re looking a library to write HTTP server, see Gauche-makiki (https://github.com/shirok/Gauche-makiki).

Condition Type: <http-error>

{rfc.http} This type of condition is raised when the server terminates connection prematurely or server’s response has invalid header fields. Inherits <error>.


12.46.1 Http client mid-level API

Function: http-get server request-uri :key sink flusher redirect-handler secure …
Function: http-head server request-uri :key redirect-handler secure …
Function: http-post server request-uri body :key sink flusher redirect-handler secure …
Function: http-put server request-uri body :key sink flusher redirect-handler secure …
Function: http-delete server request-uri :key sink flusher redirect-handler secure …

{rfc.http} Send http GET, HEAD, POST, PUT and DELETE requests to the http server, respectively, and returns the server’s reply.

By default, if the server returns 300, 301, 302, 303, 305 and 307 status, these procedures attempts to fetch the redirected URL by the "location" reply message header if it is allowed by RFC2616. This behavior can be turned off or customized by the redirect-handler keyword argument; see the "keyword arguments" heading below for the details.

Required arguments: The server argument specifies http server name in a string. A server name can be optionally followed by colon and a port number. You can use IP address, too; for IPv6, you have to surround the address in brackets.

Additionally, you can specify "unix:/path" where /path is the absolute path to the unix domain socket; this allows to connect to httpd listening on unix domain sockets. Examples: "w3c.org", "mycompany.com:8080", "192.168.0.1:8000", "[::1]:8000"

The request-uri argument can be a string or a list. If it is a string, it’s request-uri specified in RFC2616; usually, this is the path part of http url. The string is passed to the server as is, so the caller must properly convert character encodings and perform necessary url encodings.

If request-uri is a list, it must be in the following form:

(path (name value) ...)

Here, path is a string specifying up to the path component of the request-uri. From provided alist of names and values, http procedures compose a query string in application/x-www-form-urlencoded format as defined in HTML4, and append it to path. For example, the following two requests have the same effect. Note that url escaping is automatically handled in the second call.

(http-get "example.com" "/search?q=foo%20bar&n=20")

(http-get "example.com" '("/search" (q "foo bar") (n 20)))

If request-encoding keyword argument is also given, names and values are converted into the specified character encoding before url escaping. If it is omitted, gauche’s internal character encoding is used.

Some procedures take the third argument, body, to specify the body of the request message. It can be a string, which will be copied verbatim to the request body, or a list, which will be encoded in multipart/form-data message.

If body is a list, it is a list of parameter specs. Each parameter spec is either a list of name and value, e.g. ("submit" "OK") or a name followed by keyword-value list, e.g. ("upload" :file "logo.png" :content-type "image/png").

The first form is for the convenience. It is also compatible to the query parameter list in request-uri, so that you can use the same format for GET and POST request. Each value is put in a MIME part with text/plain media type, with the character encoding specified by request-encoding keyword argument described below.

The second form allows further control over each MIME part’s attributes. The following keywords are treated specially.

:value

Specifies the value of the parameter. The convenience form, (name val), is just an abbreviation of (name :value val).

:file

Specifies the pathname of the file, whose content is inserted as the value of the parameter. Useful to upload a file. This option has precedence over :value. MIME type of the part is set to application/octet-stream unless specified otherwise.

:content-type

Overrides the MIME type of the part. A charset parameter is added to the content-type if not given in this argument.

:content-transfer-encoding

Specifies the value of content-transfer-encoding; currently the following values are supported: 7bit, binary, quoted-printable and base64. If omitted, binary is used.

Other keywords are used as the header of the MIME part.

Return values: All procedures return three values.

The first value is the status code defined in RFC2616 in a string (such as "200" for success, "404" for "not found").

The second value is a list of parsed headers—each element of list is a list of (header-name value …), where header-name is a string name of the header (such as "content-type" or "location"), and value is the corresponding value in a string. The header name is converted to lowercase letters. The value is untouched except that "soft line breaks" are removed, as defined in RFC2822. If the server returns more than one headers with the same name, their values are consolidated to one list. Except that, the order of the header list in the second return value is the same as the order in the server’s reply.

The third value is for the message body of the server’s reply. By default, it is a message body itself in a string. If the server’s reply doesn’t have a body, the third value is #f. You can change how the message body is handled by keyword arguments; for example, you can directly store the returned message body to a file without creating intermediate string. The details are explained below.

Keyword arguments: By default, these procedures only attaches "Host" header field to the request message. You can give keyword arguments to add more header fields.

(http-get "foo.bar.com" "/index.html"
  :accept-language "ja"
  :user-agent "My Scheme Program/1.0")

The following keyword arguments are recognized by the procedure and do not appear in the request headers.

request-encoding

When a list is given to the request-uri or body arguments, the characters in names and values of the parameters are first converted to the character encoding specified by this keyword argument, then encoded into application/x-www-form-urlencoded or multipart/form-data MIME formats. If this argument is omitted, Gauche’s internal character encoding is used.

For multipart/form-data, you can override character encodings for individual parameters by giving content-type header. See the description of body arguments above.

If you give a string to request-uri or body, it is used without encoding conversion. It is caller’s responsibility to ensure desired character encodings are used.

proxy

Specify http proxy server in a string of a form hostname or hostname:port. If omitted, the value of the parameter http-proxy is used.

redirect-handler

Specifies how the redirection is handled when the server responds with 3xx status code. You can pass #f, #t or a procedure. The default is #t.

If #f is given, no redirect attempt will be made; the 3xx status code and response is just returned from http-* procedures as they are.

If a procedure is given, it is called when the response status code is 3xx. The procedure takes four arguments, the request method (in symbol, e.g. GET), the response status code (in string, e.g. "302"), the parsed response headers and the response body (a string if there’s a body, or #f if the response doesn’t have a body).

The procedure can return a pair or #f. If it is a pair, it should be (method . url), where method is a symbol (e.g. GET) and url is a string representing url. If a pair is returned, the http-* procedures tries to send the request with the given method (it allows a redirection of POST request to be GET, for example). If it is #f, no further attempt of redirection is made.

If redirect-handler is #t, which is the default, then it works as if the value of the parameter http-default-redirect-handler is passed to redirect-handler. The parameter contains a procedure with reasonable default behavior. See the http-default-redirect-handler entry below for the details.

A loop in redirection is detected automatically and <http-error> is thrown.

no-redirect

This is an obsoleted keyword argument kept only for the backward compatibility. If a true value is given, it has the same effect as specifying #f to redirect-handler.

secure

If a true value is given, the secure connection is used. The value specifies the secure transport agent to establish https connection. It can be #t or a symbol tls or stunnel. If #f is given (default), non-secure plain http is used. See the “Secure connection” section below.

auth-user, auth-password

If given, the authorization header using Basic Authentication (RFC2617) is added to the request. In future, we might add support for other authentication scheme.

sink, flusher

You can customize how the reply message body is handled by these keyword arguments. You have to pass an output port to sink, and a procedure that takes two arguments to flusher.

When the procedure starts receiving the message body, it feeds the received chunk to sink. When the procedure receives entire message body, flusher method is called with sink and a list of message header fields (in the same format to be returned in the second value from the procedure). The return value of flusher becomes the third return value from the procedure.

So, the default value of sink is a newly opened string port and the default value of flusher is (lambda (sink headers) (get-output-string sink)).

The following example saves the message body directly to a file, without allocating (potentially very big) string buffer.

(call-with-output-file "page.html"
  (lambda (out)
    (http-get "www.schemers.org" "/"
       :sink out :flusher (lambda _ #t))))

12.46.2 Http client utilities

The module also provides some utility procedures.

Parameter: http-user-agent :optional value

{rfc.http} The value of this parameter is used as a default value to pass to the user-agent header. The default value is something like gauche.http/*, where * is Gauche’s version. An application is encouraged to set this parameter appropriately.

Parameter: http-proxy :optional value

{rfc.http} This value is used as the default http proxy name by http-get etc. The default value is #f (no proxy).

Parameter: http-default-redirect-handler :optional value

{rfc.http} Specifies the behavior of redirection if no redirect-handler keyword argument is given to the http-* procedures. If you change this value, it must be a procedure that follows the protocol of redirect-handler; see the description of http-* procedures above.

The default behavior is as follows:

300, 301, 305, 307

Redirect to the url given to the location header only if the original request method is GET or HEAD.

302

Redirect to the url given to the location header. If the original request method is HEAD, it is used again. Otherwise, GET method is used.

Strictly speaking, this is a violation of RFC2616. However, as the note in RFC2616 says, many user agent do this, so we follow the flock. (We may change this in future.)

303

Redirect to the url given to the location header. If the original request method is HEAD, it is used again. Otherwise, GET method is used.

other than above

No redirection is made.

The following code is an example of intercepting the default behavior in a specific request:

(http-get server uri
  :redirect-handler
  (^[method status headers body]
    (if (and (equal? status "302")
             (not (member method '(GET HEAD))))
        #f
        ((http-default-request-handler) method status headers body))))
Function: http-compose-query path params :optional encoding

{rfc.http} A helper procedure to create a request-uri from a list of query parameters. Encoding specifies the character encodings to be used.

(http-compose-query "/search" '((q "$foo") (n 20)))
 ⇒ "/search?q=%24foo&n=20"

(http-compose-query "" '((x "a b") (x 2)))
 ⇒ "?x=a%20b&x=2"

If path is #f, only the query parameter part is returned (compare the following example and the last example):

(http-compose-query #f '((x "a b") (x 2)))
 ⇒ "x=a%20b&x=2"

This is built on top of uri-compose-query in rfc.uri (see rfc.uri - URI parsing and construction).

Function: http-compose-form-data params port :optional encoding

{rfc.http} A helper procedure to create multipart/form-data from a list of parameters. The format of params argument is the same as the list format of body argument of http request procedures. The result is written to an output port port, and the boundary string used to compose MIME message is returned. Alternatively you can pass #f to the port to get the result in a string. In that case, two values are returned, the MIME message string and the boundary string.

Encoding specifies the character encodings to be used. When omitted, Gauche’s native encoding is used.

(define p (open-output-string))

(http-compose-form-data '((name "Preludes and Fugues")
                          (composer "Shostakovich, Dmitri")
                          (opus "87"))
                         p)
  ⇒ "boundary-fh87o52rp6zkubp2uhdmo"

(get-output-string p)
  ⇒
  "\r\n--boundary-fh87o52rp6zkubp2uhdmo\r\nContent-type: te
   xt/plain; charset=utf-8\r\nContent-transfer-encoding: bi
   nary\r\ncontent-disposition: form-data; name=title\r\n\r\n
   Preludes and Fugues\r\n--boundary-fh87o52rp6zkubp2uhdmo...
;; (result is truncated)
Function: http-status-code->description code

{rfc.http} Returns a brief description of http status code code, which may be an integer or a string (e.g. "404"). If code isn’t one of known code, #f is returned.

(http-status-code->description 404)
  ⇒ "Not Found"

12.46.3 Secure http connection

When you pass a true value to secure keyword argument, the request-making APIs such as http-get use a secure connection. That is, it connects with https instead of http. The actual value for the keyword argument can be one of the followings:

#t
tls

The rfc.tls module is used for the secure connection. See rfc.tls - Transport layer security, for the details—you might need to set CA certificate bundle path.

stunnel

The external process stunnel is spawned and used for the secure connection.

#f

Secure connection is not used.

If specified secure connection subsystem isn’t available in the running Gauche, an error is signaled. Use the following procedure to check if you can use secure connections:

Function: http-secure-connection-available? :optional type

{rfc.http} The type argument may be tls or stunnel. If omitted, tls is assumed. Returns #t if running Gauche can use secure connection of the given type, #f otherwise.


Next: , Previous: , Up: Library modules - Utilities   [Contents][Index]


For Development HEAD DRAFTSearch (procedure/syntax/module):
DRAFT