Redoubt: network protocol version 1
Redoubt uses a subset of the HTTP/1.1 protocol (see
RFC 2068). This
allows a server to be implemented using CGI scripts and a normal HTTP server
like Apache. For security reasons all connections should be made over SSL.
Some servers may accept plain HTTP connections, though.
Authentication
Authentication is neither mandated nor enforced by the protocol. However,
running without authentication is not advisable, since anyone who can reach
the server could then read, overwrite, or delete the archive.
One of the following authentication methods should be used
(not all of them have been implemented yet):
- Basic Authentication:
This scheme (see RFC 2617)
uses the standard HTTP authentication mechanism. As it
sends passwords in clear text over the wire, it is not advisable to
use it unless anonymous SSL (i.e. SSL without client-side authentication)
is used.
- Digest Authentication:
This authentication system (see RFC 2617)
does not send passwords in clear text over the wire, and is
more secure than basic authentication. It still allows
man-in-the-middle attacks.
- SSL with public keys:
Most web servers allow public key authentication with SSL. This
is one of the most secure ways to authenticate and encrypt
traffic.
- SSH and HTTP tunneling:
If none of the above authentication systems can be used, it
is always possible to send the traffic through an SSH tunnel.
In this scenario SSH can use either passwords or public key
authentication.
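To illustrate why Basic Authentication needs an encrypted connection, the following minimal sketch builds the Authorization header value defined in RFC 2617 and then recovers the credentials from it. The function name and the example credentials are hypothetical and not part of the protocol.

```python
import base64


def basic_auth_header(username: str, password: str) -> str:
    """Return the value of an HTTP Basic 'Authorization' header."""
    # Basic authentication only base64-encodes "user:password"; nothing is encrypted.
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return "Basic " + token.decode("ascii")


# Hypothetical credentials, for illustration only.
header = basic_auth_header("alice", "secret")

# Anyone who can observe the header can decode the credentials again.
recovered = base64.b64decode(header.split(" ", 1)[1]).decode("utf-8")
```

Because decoding needs no key, the header is only as private as the connection that carries it.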
Base URL
All URLs a backup client uses may have to be prefixed by a fixed string (as in
/backup/username/) called the base URL. The base URL is server specific and is
usually only necessary for servers that are not dedicated backup servers. In
this document it is written as /<base-url>.
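A client will probably want a single helper that prefixes every resource with the base URL. The sketch below is one possible way to do that; the function name is hypothetical, and the handling of an empty base URL is an assumption for servers that need no prefix.

```python
def resource_url(base_url: str, name: str = "") -> str:
    """Join the server-specific base URL with a resource name."""
    # Normalise to exactly one leading slash and no trailing slash.
    base = "/" + base_url.strip("/")
    if base == "/":
        # Assumption: an empty base URL means resources live at the server root.
        return "/" + name
    return f"{base}/{name}"


options_url = resource_url("/backup/username/", "options")
listing_url = resource_url("/backup/username/")
root_options_url = resource_url("", "options")
```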
Fields
The following fields should be supported by the client and server and always
used if appropriate.
- Range
- Content-MD5
- Content-Length
- Server / Client
All connections should specify 'no-cache' so that intermediate caches do not
interfere with requests. Multiple requests per connection should be supported
by the server and the client for higher throughput, but are not strictly required.
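As a rough illustration of the fields above, the sketch below computes the headers a client might attach to a request body. It assumes Content-MD5 carries the base64-encoded MD5 digest (as defined in RFC 1864) and that 'no-cache' is sent as a Cache-Control header; the text above does not spell out either detail, and the function name is hypothetical.

```python
import base64
import hashlib


def body_headers(body: bytes) -> dict[str, str]:
    """Build Content-Length, Content-MD5 and no-cache headers for a body."""
    digest = hashlib.md5(body).digest()  # raw 16-byte digest, not the hex string
    return {
        "Content-Length": str(len(body)),
        # Assumption: base64 of the raw digest, as in RFC 1864.
        "Content-MD5": base64.b64encode(digest).decode("ascii"),
        # Assumption: 'no-cache' is expressed through Cache-Control.
        "Cache-Control": "no-cache",
    }


headers = body_headers(b"encrypted block data")
```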
Filenames
Filenames include important information and must conform to the
Perl regular expression shown below. They consist of five fields,
separated by '.':
- Random Filename:
Every filename on the client is mapped to a random filename. This avoids
leaking information through filenames. Only the client keeps the
mapping, in a file. If that information is lost, all files need to be
downloaded by the client and decrypted in order to recreate the mapping.
The field must be between 5 and 50 characters long. It is recommended
that a random string is converted to a suitable filename with base64
(all '/' need to be replaced with '-').
- Filetype:
The archive stores four kinds of files: full files, meta data only,
deletions, and hardlink information (the format of those files is
described elsewhere). In order to speed up recovery, the type of file
is encoded in the filename using one character: 'F', 'M', 'D' or 'L'.
- Generation:
Every version of a file gets a new generation number. This field contains
the time when the file was backed up in seconds since the epoch.
- Block:
Large files must be broken up into pieces. This field contains the
block number for this part of the file and the total number of blocks
used. The two numbers are separated by a '-' sign.
- Redundancy and Mirror information:
In order to recreate a file the client needs to know how it was stored
on different servers. This field stores this information. Its format
depends on the redundancy algorithm used (mirror, raid, ...) and
generally contains between 2 and 5 characters.
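The five fields can be assembled as in the following sketch. The function names, the choice of 24 random bytes, and the redundancy value "m1" are assumptions made for illustration; only the field order, the '.' separator, and the base64 recommendation come from the description above.

```python
import base64
import secrets

FILE_TYPES = {"F", "M", "D", "L"}


def random_name(num_bytes: int = 24) -> str:
    """Return a random name field (24 bytes encode to 32 characters)."""
    # A multiple of 3 bytes avoids base64 '=' padding in the name;
    # 6 to 36 bytes keep the result within 5 to 50 characters.
    if num_bytes % 3 != 0 or not 6 <= num_bytes <= 36:
        raise ValueError("num_bytes must be a multiple of 3 between 6 and 36")
    encoded = base64.b64encode(secrets.token_bytes(num_bytes)).decode("ascii")
    return encoded.replace("/", "-")


def make_filename(name: str, filetype: str, generation: int,
                  block: int, total_blocks: int, redundancy: str) -> str:
    """Join the five fields with '.' in the order given above."""
    if filetype not in FILE_TYPES:
        raise ValueError("filetype must be one of F, M, D, L")
    return f"{name}.{filetype}.{generation}.{block}-{total_blocks}.{redundancy}"


# Hypothetical values: generation in seconds since the epoch, block 1 of 3.
example = make_filename("abcde", "F", 1000000000, 1, 3, "m1")
```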
The filename must match the following Perl regular expression:
([A-Za-z0-9+-]{5,50})\.([FMDL])\.([0-9]+)\.([0-9]+-[0-9]+)\.([A-Za-z0-9+-]+)
Note that the server will make use of some of the data in the filename (in
particular the generation information). The rest of the fields are only
for clients; standardizing them makes it easier to develop clients that
can interoperate with each other.
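A client or server could split a filename into its fields roughly as follows. This sketch assumes the field separators are literal dots and that the block field is two decimal numbers joined by '-'; it also captures the two block numbers as separate groups for convenience, which differs slightly from the grouping in the expression above. The names ArchiveName and parse_filename are hypothetical.

```python
import re
from typing import NamedTuple, Optional

# Assumption: dots are literal separators; block number and total are split
# into two capture groups.
FILENAME_RE = re.compile(
    r"([A-Za-z0-9+-]{5,50})\.([FMDL])\.([0-9]+)\.([0-9]+)-([0-9]+)\.([A-Za-z0-9+-]+)"
)


class ArchiveName(NamedTuple):
    name: str
    filetype: str
    generation: int
    block: int
    total_blocks: int
    redundancy: str


def parse_filename(filename: str) -> Optional[ArchiveName]:
    """Return the parsed fields, or None if the filename does not conform."""
    match = FILENAME_RE.fullmatch(filename)
    if match is None:
        return None
    name, filetype, generation, block, total, redundancy = match.groups()
    return ArchiveName(name, filetype, int(generation), int(block), int(total), redundancy)


parsed = parse_filename("abcde.F.1000000000.1-3.m1")
```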
Options
The client sends a GET request for the URL /<base-url>/options.
The server returns an ASCII file (content-type text/plain)
that lists options and their values. The option name is separated by a tab
from the value, which extends to the end of the line.
The following options are sent out (a server may include additional ones;
clients should just ignore options they don't recognize).
Option | Meaning |
VERSION | Protocol version the server supports, currently always 1. |
SPACE | Total space in bytes for this archive user. |
USED | Total bytes already used in this backup. |
FILES | Total number of files currently stored. |
RESTORE_BANDWIDTH | Current restore bandwidth in kbits/second for this archive user. |
BACKUP_BANDWIDTH | Current backup bandwidth in kbits/second for this archive user. |
MAX_FILE_SIZE | Maximum size a file can have. |
Note that the OPTIONS request that is part of the HTTP protocol cannot
be used, as it does not allow extensions. The client should not be surprised
if other fields and information are sent by the server.
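Parsing the options reply is straightforward. The sketch below is one way a client might do it; the function name and the sample reply (including the X_EXTRA option) are made up for illustration.

```python
def parse_options(text: str) -> dict[str, str]:
    """Parse 'NAME<tab>value' lines into a dictionary."""
    options = {}
    for line in text.splitlines():
        if "\t" not in line:
            continue  # skip blank or malformed lines
        # Split on the first tab only; the value runs to the end of the line.
        name, value = line.split("\t", 1)
        options[name] = value
    return options


# Hypothetical server reply; X_EXTRA stands for an option the client does not know.
reply = "VERSION\t1\nSPACE\t1000000\nUSED\t250000\nX_EXTRA\tfoo bar\n"
options = parse_options(reply)

# The client reads only the options it understands and ignores the rest.
free_bytes = int(options["SPACE"]) - int(options["USED"])
```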
Listing
The client can request a full listing of all files on the server. This should
only be done to either check the archive to find dead or missing files, or
to start a full restore.
The listing is requested by sending a GET request for
/<base-url>/
The server responds with an ASCII file (content type text/plain) that lists
each filename in the index on a line by itself.
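When preparing a restore, a client may want to know the newest generation stored for each random name. The following sketch derives that from a listing; it relies only on the '.'-separated filename layout described earlier, and the function name and sample listing are hypothetical.

```python
def latest_generations(listing: str) -> dict[str, int]:
    """Map each random name to the highest generation found in a listing."""
    latest: dict[str, int] = {}
    for line in listing.splitlines():
        fields = line.strip().split(".")
        # A conforming filename has five fields; the third is the generation.
        if len(fields) != 5:
            continue
        generation_text = fields[2]
        if not (generation_text.isascii() and generation_text.isdigit()):
            continue
        name, generation = fields[0], int(generation_text)
        if generation > latest.get(name, -1):
            latest[name] = generation
    return latest


# Hypothetical listing with two generations of one file and a stray line.
listing = (
    "abcde.F.100.1-1.m1\n"
    "abcde.F.200.1-1.m1\n"
    "fghij.M.150.1-1.m1\n"
    "\n"
    "not a filename\n"
)
newest = latest_generations(listing)
```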
Requesting a file
The client sends a GET request with a URL of the form
/<base-url>/filename, and the
server returns the file (if it exists). The server should support
Range: requests and always include the file size and md5sum in the response
headers. The content-type should always be application/octet-stream.
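A partial download can be requested with a standard HTTP byte range. The sketch below only constructs the request object with Python's urllib and does not send it; the host name, the function name, and the use of Cache-Control for 'no-cache' are assumptions.

```python
import urllib.request
from typing import Optional


def build_get(server: str, base_url: str, filename: str,
              first: Optional[int] = None,
              last: Optional[int] = None) -> urllib.request.Request:
    """Build a GET request for a stored file, optionally for a byte range."""
    parts = [p for p in (base_url.strip("/"), filename) if p]
    url = server.rstrip("/") + "/" + "/".join(parts)
    request = urllib.request.Request(url, method="GET")
    request.add_header("Cache-Control", "no-cache")  # assumed header for 'no-cache'
    if first is not None:
        # HTTP byte ranges are inclusive; an open end means "to the end of the file".
        end = "" if last is None else str(last)
        request.add_header("Range", f"bytes={first}-{end}")
    return request


# Hypothetical server and filename: ask for the first 1024 bytes.
request = build_get("https://backup.example.org", "/backup/username/",
                    "abcde.F.1000000000.1-3.m1", 0, 1023)
```

Passing the request to urllib.request.urlopen would send it; a server that honours the range normally answers with status 206 and only the requested bytes.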
Storing a file
The client sends a PUT request with the URL
/<base-url>/filename and the file as the body of the request, with a
content-type of application/octet-stream. It may send only part of the file
using Range:, and the server is expected to support this.
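Before storing, a client has to break files that exceed the server's limit into blocks, each of which is then sent with its own PUT request. The sketch below shows one way to do the split; numbering blocks from 1 and storing an empty file as a single empty block are assumptions, since the text does not specify either, and the function name is hypothetical.

```python
def split_blocks(data: bytes, max_size: int) -> list[tuple[int, int, bytes]]:
    """Split data into (block_number, total_blocks, chunk) tuples."""
    if max_size <= 0:
        raise ValueError("max_size must be positive")
    # Ceiling division; an empty file still yields one (empty) block.
    total = max(1, -(-len(data) // max_size))
    return [
        (index + 1, total, data[index * max_size:(index + 1) * max_size])
        for index in range(total)  # assumption: blocks are numbered from 1
    ]


blocks = split_blocks(b"abcdefgh", 3)
```

Each tuple provides the two numbers for the block field of the filename and the bytes to send as the request body.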
Deleting a file
The client sends a DELETE request with the URL
/<base-url>/filename. The server removes
the file from the archive.
Checking for a file
If the client wants to check whether a file is still in the archive, it can issue a
HEAD request with the same URL used to GET or PUT
the file. The server should return the same headers
it would send for a GET request, including the md5sum of the file (or of the range of
bytes requested).
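Once the client has the checksum from a HEAD (or GET) response, it can compare it against its local copy. The sketch below assumes the server reports the md5sum in a Content-MD5 header as a base64-encoded digest (RFC 1864); if a server sends a hex string instead, the comparison would need to be adapted. The function name is hypothetical.

```python
import base64
import hashlib


def matches_content_md5(data: bytes, content_md5: str) -> bool:
    """Return True if data hashes to the digest given in a Content-MD5 value."""
    try:
        expected = base64.b64decode(content_md5, validate=True)
    except ValueError:
        return False  # not valid base64, so it cannot match
    return hashlib.md5(data).digest() == expected


# Simulate the header value a server might report for a stored block.
stored = b"block data"
reported = base64.b64encode(hashlib.md5(stored).digest()).decode("ascii")

intact = matches_content_md5(stored, reported)
corrupted = matches_content_md5(b"other data", reported)
```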