rfc.http - HTTP client
This module provides a simple client API for HTTP/1.1, defined in RFC2616, "Hypertext Transfer Protocol – HTTP/1.1" (https://www.ietf.org/rfc/rfc2616.txt).
The current API implements only a part of the HTTP/1.1 protocol. Support for some advanced features, such as persistent connections, may be added in future versions.
If you’re looking for a library to write an HTTP server, see Gauche-makiki (https://github.com/shirok/Gauche-makiki).
Condition Type: <http-error> {rfc.http}
This type of condition is raised when the server terminates the connection prematurely or the server’s response has invalid header fields. Inherits <error>.
• Http client mid-level API
• Http client utilities
• Secure http connection
Http client mid-level API
Function: http-get, http-head, http-post, http-put, http-delete {rfc.http}
Send an http GET, HEAD, POST, PUT or DELETE request, respectively, to the http server, and return the server’s reply.
By default, if the server returns a 300, 301, 302, 303, 305 or 307 status, these procedures attempt to fetch the redirected URL given in the "location" reply message header, provided RFC2616 allows it. This behavior can be turned off or customized by the redirect-handler keyword argument; see the "keyword arguments" heading below for the details.
Required arguments: The server argument specifies the http server name as a string. The server name can optionally be followed by a colon and a port number. You can also use an IP address; an IPv6 address must be surrounded by brackets.
Additionally, you can specify "unix:/path", where /path is the absolute path to a unix domain socket; this lets you connect to an httpd listening on a unix domain socket.
Examples: "w3c.org", "mycompany.com:8080", "192.168.0.1:8000", "[::1]:8000"
The request-uri argument can be a string or a list. If it is a string, it is the request-uri as specified in RFC2616; usually, this is the path part of the http url. The string is passed to the server as is, so the caller must properly convert character encodings and perform any necessary url encoding.
If request-uri is a list, it must be in the following form:
(path (name value) ...)
Here, path is a string specifying the request-uri up to the path component. From the provided alist of names and values, the http procedures compose a query string in application/x-www-form-urlencoded format, as defined in HTML4, and append it to path.
For example, the following two requests have the same effect.
Note that url escaping is automatically handled in the second call.
(http-get "example.com" "/search?q=foo%20bar&n=20")
(http-get "example.com" '("/search" (q "foo bar") (n 20)))
If the request-encoding keyword argument is also given, names and values are converted into the specified character encoding before url escaping. If it is omitted, Gauche’s internal character encoding is used.
Some procedures take a third argument, body, to specify the body of the request message. It can be a string, which is copied verbatim to the request body, or a list, which is encoded as a multipart/form-data message.
If body is a list, it is a list of parameter specs. Each parameter spec is either a list of a name and a value, e.g. ("submit" "OK"), or a name followed by a keyword-value list, e.g. ("upload" :file "logo.png" :content-type "image/png").
The first form is for convenience. It is also compatible with the query parameter list in request-uri, so you can use the same format for GET and POST requests. Each value is put in a MIME part with the text/plain media type, using the character encoding specified by the request-encoding keyword argument described below.
The second form allows further control over each MIME part’s attributes. The following keywords are treated specially.
:value
Specifies the value of the parameter. The convenience form, (name val), is just an abbreviation of (name :value val).
:file
Specifies the pathname of a file whose content is inserted as the value of the parameter. Useful for uploading a file. This option takes precedence over :value. The MIME type of the part is set to application/octet-stream unless specified otherwise.
:content-type
Overrides the MIME type of the part. A charset parameter is added to the content-type if not given in this argument.
:content-transfer-encoding
Specifies the value of content-transfer-encoding; currently the following values are supported: 7bit, binary, quoted-printable and base64. If omitted, binary is used.
Other keywords are used as the header of the MIME part.
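The two parameter-spec forms can be tried out without contacting a server by feeding them to http-compose-form-data, which is described later in this section and accepts the same list format. The following is only a sketch: the parameter names and values are invented for illustration, and the exact MIME headers in the output may vary between Gauche versions.

```scheme
(use rfc.http)

;; Illustrative parameter specs; names and values are made up.
(define sample-params
  '(("submit" "OK")                      ; convenience form: (name value)
    ("comment" :value "hello"            ; keyword form with explicit attributes
               :content-type "text/plain")))

;; Passing #f as the port returns two values:
;; the MIME message string and the boundary string.
(define-values (form-body form-boundary)
  (http-compose-form-data sample-params #f))
```

Printing form-body shows one MIME part per parameter spec, separated by the boundary string in form-boundary.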
Return values: All procedures return three values.
The first value is the status code defined in RFC2616, as a string (such as "200" for success or "404" for "not found").
The second value is a list of parsed headers. Each element of the list has the form (header-name value …), where header-name is the name of the header as a string (such as "content-type" or "location"), and value is the corresponding value as a string. The header name is converted to lowercase. The value is untouched except that "soft line breaks" are removed, as defined in RFC2822. If the server returns more than one header with the same name, their values are consolidated into one list. Apart from that, the order of the header list in the second return value is the same as the order in the server’s reply.
The third value is the message body of the server’s reply. By default, it is the message body itself as a string. If the server’s reply doesn’t have a body, the third value is #f. You can change how the message body is handled with keyword arguments; for example, you can store the returned message body directly in a file without creating an intermediate string. The details are explained below.
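As a sketch of how the three return values might be consumed, the code below receives them with receive and looks up a header in the parsed header list. Both header-ref and fetch-content-type are hypothetical helpers written for this illustration; they are not part of rfc.http.

```scheme
(use rfc.http)

;; Hypothetical helper: return the first value of the header NAME
;; (a lowercase string) from the parsed header list, or #f if absent.
(define (header-ref headers name)
  (let1 entry (assoc name headers)
    (and entry (cadr entry))))

;; Hypothetical wrapper: fetch PATH from SERVER and return the reply's
;; content type if the status code is "200", or #f otherwise.
(define (fetch-content-type server path)
  (receive (status headers body) (http-get server path)
    (and (equal? status "200")
         (header-ref headers "content-type"))))
```

Note that the status code is compared with equal? because it is returned as a string, not a number.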
Keyword arguments:
By default, these procedures attach only the "Host" header field to the request message. You can give keyword arguments to add more header fields.
(http-get "foo.bar.com" "/index.html" :accept-language "ja" :user-agent "My Scheme Program/1.0")
The following keyword arguments are recognized by the procedure and do not appear in the request headers.
request-encoding
When a list is given to the request-uri or body argument, the characters in the names and values of the parameters are first converted to the character encoding specified by this keyword argument, then encoded into the application/x-www-form-urlencoded or multipart/form-data MIME format. If this argument is omitted, Gauche’s internal character encoding is used.
For multipart/form-data, you can override the character encoding of individual parameters by giving a content-type header. See the description of the body argument above.
If you give a string to request-uri or body, it is used without encoding conversion. It is the caller’s responsibility to ensure that the desired character encodings are used.
proxy
Specifies the http proxy server as a string of the form hostname or hostname:port. If omitted, the value of the parameter http-proxy is used.
redirect-handler
Specifies how redirection is handled when the server responds with a 3xx status code. You can pass #f, #t or a procedure. The default is #t.
If #f is given, no redirect attempt is made; the 3xx status code and response are simply returned from the http-* procedures as they are.
If a procedure is given, it is called when the response status code is 3xx. The procedure takes four arguments: the request method (as a symbol, e.g. GET), the response status code (as a string, e.g. "302"), the parsed response headers, and the response body (a string if there’s a body, or #f if the response doesn’t have one).
The procedure can return a pair or #f. If it returns a pair, the pair should be (method . url), where method is a symbol (e.g. GET) and url is a string representing the url. In that case, the http-* procedures try to send the request to url with the given method (this allows a POST request to be redirected as GET, for example). If it returns #f, no further redirection is attempted.
If redirect-handler is #t, which is the default, it works as if the value of the parameter http-default-redirect-handler were passed to redirect-handler. The parameter contains a procedure with reasonable default behavior. See the http-default-redirect-handler entry below for the details.
A loop in redirection is detected automatically, and <http-error> is thrown.
no-redirect
This is an obsolete keyword argument kept only for backward compatibility. If a true value is given, it has the same effect as specifying #f for redirect-handler.
secure
If a true value is given, a secure connection is used. The value specifies the secure transport agent used to establish the https connection. It can be #t, or the symbol tls or stunnel. If #f is given (the default), non-secure plain http is used. See the “Secure http connection” section below.
auth-user, auth-password
If given, an authorization header using Basic Authentication (RFC2617) is added to the request. Support for other authentication schemes may be added in the future.
sink, flusher
You can customize how the reply message body is handled with these keyword arguments. You have to pass an output port to sink, and a procedure that takes two arguments to flusher.
When the procedure starts receiving the message body, it feeds each received chunk to sink. When the procedure has received the entire message body, flusher is called with sink and a list of message header fields (in the same format as the second return value of the procedure). The return value of flusher becomes the third return value of the procedure.
The default value of sink is a newly opened output string port, and the default value of flusher is (lambda (sink headers) (get-output-string sink)).
The following example saves the message body directly to a file, without allocating a (potentially very big) string buffer.
(call-with-output-file "page.html"
  (lambda (out)
    (http-get "www.schemers.org" "/"
      :sink out :flusher (lambda _ #t))))
Http client utilities
The module also provides some utility procedures.
Parameter: http-user-agent {rfc.http}
The value of this parameter is used as the default value of the user-agent header. The default value is something like gauche.http/*, where * is Gauche’s version. Applications are encouraged to set this parameter appropriately.
Parameter: http-proxy {rfc.http}
This value is used as the default http proxy name by http-get etc. The default value is #f (no proxy).
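Because these defaults are parameters, they can be rebound for a limited dynamic extent with parameterize instead of being passed to every call. The sketch below assumes the user-agent parameter is the one named http-user-agent; the helper call-with-client-settings, the agent string, and the proxy address are invented for illustration.

```scheme
(use rfc.http)

;; Hypothetical helper: run THUNK with an application-specific
;; user agent and proxy in effect. The values are placeholders.
(define (call-with-client-settings thunk)
  (parameterize ((http-user-agent "my-crawler/0.1")
                 (http-proxy "proxy.example.com:8080"))
    (thunk)))
```

Requests made inside the thunk would pick up both settings; outside it, the previous values are restored.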
Parameter: http-default-redirect-handler {rfc.http}
Specifies the behavior of redirection if no redirect-handler keyword argument is given to the http-* procedures.
If you change this value, it must be a procedure that follows the protocol of redirect-handler; see the description of the http-* procedures above.
The default behavior is as follows:
300, 301, 305, 307
Redirect to the url given in the location header only if the original request method is GET or HEAD.
302
Redirect to the url given in the location header. If the original request method is HEAD, it is used again. Otherwise, the GET method is used.
Strictly speaking, this is a violation of RFC2616. However, as the note in RFC2616 says, many user agents do this, so we follow the flock. (We may change this in the future.)
303
Redirect to the url given in the location header. If the original request method is HEAD, it is used again. Otherwise, the GET method is used.
Other status codes
No redirection is made.
The following code is an example of intercepting the default behavior in a specific request:
(http-get server uri
  :redirect-handler
  (^[method status headers body]
    (if (and (equal? status "302")
             (not (member method '(GET HEAD))))
      #f
      ((http-default-redirect-handler) method status headers body))))
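A handler can also be written from scratch rather than wrapping the default one. The following sketch follows the handler protocol described earlier (four arguments, returning a (method . url) pair or #f). The name https-only-redirect and its policy, which follows redirects only for GET and HEAD requests and only to https urls, are assumptions made for this example.

```scheme
(use rfc.http)
(use srfi-13) ; for string-prefix?

;; Hypothetical redirect handler: follow a redirect only when the
;; original method is GET or HEAD and the target is an https url.
;; Returns (method . url) to redirect, or #f to stop.
(define (https-only-redirect method status headers body)
  (let1 loc (assoc "location" headers)
    (and loc
         (memq method '(GET HEAD))
         (string-prefix? "https://" (cadr loc))
         (cons method (cadr loc)))))
```

It would be passed as :redirect-handler https-only-redirect to any of the http-* procedures. Since the handler is an ordinary procedure, it can be exercised with hand-written header lists before being used on real requests.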
Function: http-compose-query {rfc.http}
A helper procedure to create a request-uri from a list of query parameters. Encoding specifies the character encoding to be used.
(http-compose-query "/search" '((q "$foo") (n 20)))
 ⇒ "/search?q=%24foo&n=20"
(http-compose-query "" '((x "a b") (x 2)))
 ⇒ "?x=a%20b&x=2"
If path is #f, only the query parameter part is returned (compare the following example with the last example above):
(http-compose-query #f '((x "a b") (x 2))) ⇒ "x=a%20b&x=2"
This is built on top of uri-compose-query in rfc.uri (see rfc.uri - URI parsing and construction).
Function: http-compose-form-data {rfc.http}
A helper procedure to create multipart/form-data from a list of parameters. The format of the params argument is the same as the list format of the body argument of the http request procedures. The result is written to the output port port, and the boundary string used to compose the MIME message is returned. Alternatively, you can pass #f as port to get the result as a string. In that case, two values are returned: the MIME message string and the boundary string.
Encoding specifies the character encoding to be used. When omitted, Gauche’s native encoding is used.
(define p (open-output-string))
(http-compose-form-data
  '((title "Preludes and Fugues")
    (composer "Shostakovich, Dmitri")
    (opus "87"))
  p)
 ⇒ "boundary-fh87o52rp6zkubp2uhdmo"
(get-output-string p)
 ⇒ "\r\n--boundary-fh87o52rp6zkubp2uhdmo\r\nContent-type: text/plain; charset=utf-8\r\nContent-transfer-encoding: binary\r\ncontent-disposition: form-data; name=title\r\n\r\nPreludes and Fugues\r\n--boundary-fh87o52rp6zkubp2uhdmo... ;; (result is truncated)
Function: http-status-code->description {rfc.http}
Returns a brief description of the http status code code, which may be an integer or a string (e.g. "404"). If code isn’t one of the known codes, #f is returned.
(http-status-code->description 404) ⇒ "Not Found"
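Since the http-* procedures return the status code as a string, the description lookup can be applied to it directly. The helper below is a minimal sketch, with status-line being a hypothetical name, that builds a human-readable status line and falls back to a placeholder when the code is unknown.

```scheme
(use rfc.http)

;; Hypothetical helper: format a status code (integer or string)
;; together with its description, handling the #f case for unknown codes.
(define (status-line code)
  (let1 desc (http-status-code->description code)
    (if desc
      (format "~a ~a" code desc)
      (format "~a (unknown status)" code))))
```

This kind of helper is convenient in log messages, where a bare "404" is less readable than "404 Not Found".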
Secure http connection
When you pass a true value to the secure keyword argument, the request-making APIs such as http-get use a secure connection. That is, they connect with https instead of http. The actual value of the keyword argument can be one of the following:
#t
tls
The rfc.tls module is used for the secure connection. See rfc.tls - Transport layer security, for the details; you might need to set the CA certificate bundle path.
stunnel
The external process stunnel is spawned and used for the secure connection.
#f
Secure connection is not used.
If the specified secure connection subsystem isn’t available in the running Gauche, an error is signaled. Use the following procedure to check whether you can use secure connections:
Function: http-secure-connection-available? {rfc.http}
The type argument may be tls or stunnel. If omitted, tls is assumed. Returns #t if the running Gauche can use a secure connection of the given type, #f otherwise.
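One way to avoid the error mentioned above is to probe for an available subsystem before making a request. The following sketch assumes the availability check is the procedure named http-secure-connection-available?; pick-secure-agent is a hypothetical helper, and the preference order (tls first, then stunnel) is only an illustration.

```scheme
(use rfc.http)

;; Hypothetical helper: return a value suitable for the secure
;; keyword argument, preferring tls, or #f if neither is available.
(define (pick-secure-agent)
  (cond [(http-secure-connection-available? 'tls) 'tls]
        [(http-secure-connection-available? 'stunnel) 'stunnel]
        [else #f]))
```

The result could then be passed as :secure (pick-secure-agent). Keep in mind that a result of #f silently falls back to plain http, so an application that requires https should check for #f and report an error instead.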