RSYNC(5)                      File Formats Manual                     RSYNC(5)

NAME
     rsync – rsync wire protocol

DESCRIPTION
     The rsync protocol described in this document relates to the BSD-licensed
     openrsync(1), a re-implementation of the GPL-licensed reference utility
     rsync(1).  It is compatible with version 27 of the reference.

     In this document, the "client process" refers to the utility as run on
     the operator's local computer.  The "server process" is run on either the
     local or remote computer, depending upon the file locations given on the
     command line.

     There are a number of options in the protocol that are dictated by
     command-line flags.  These will be noted as -D for devices, -g for group
     ids, -l for links, -n for dry-run, -o for user ids, -r for recursion, -v
     for verbose, and --delete for deletion (before).

   Data types
     The binary protocol encodes all data in little-endian format.  Integers
     are signed 32-bit, shorts are signed 16-bit, bytes are unsigned 8-bit.  A
     long is variable-length.  For values less than the maximum integer, the
     value is transmitted and read as a 32-bit integer.  For values greater,
     the value is transmitted first as a maximum integer, then a 64-bit signed
     integer.
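
     The integer and long encodings above can be sketched as follows.  This
     is a minimal illustration, not the openrsync(1) implementation.  The
     helper names are hypothetical, and LONG_MARKER is an assumption: the
     sketch takes the "maximum integer" marker to be the all-ones 32-bit
     pattern (-1 when read as signed).  Change the constant if the marker is
     read differently.

```python
import struct

INT32_MAX = 0x7FFFFFFF
# Assumption: the value that announces a following 64-bit integer.
LONG_MARKER = -1


def write_int(out, value):
    """Write a signed 32-bit little-endian integer."""
    out.write(struct.pack("<i", value))


def read_int(src):
    """Read a signed 32-bit little-endian integer."""
    return struct.unpack("<i", src.read(4))[0]


def write_long(out, value):
    """Write a long: 4 bytes when it fits, else the marker plus 8 bytes."""
    if 0 <= value <= INT32_MAX and value != LONG_MARKER:
        write_int(out, value)
    else:
        write_int(out, LONG_MARKER)
        out.write(struct.pack("<q", value))


def read_long(src):
    """Read a long written by write_long()."""
    first = read_int(src)
    if first != LONG_MARKER:
        return first
    return struct.unpack("<q", src.read(8))[0]
```

     Small values therefore cost four bytes on the wire and large values
     twelve.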

     There are three types of checksums: long (slow), short (fast), and whole-
     file.  The fast checksum is a derivative of Adler-32.  The slow checksum
     is MD4, made over the checksum seed first (serialised in little-endian
     format), then the data.  The whole-file applies MD4 to the file first,
     then the checksum seed at the end (also serialised in little-endian
     format).

   Multiplexing
     Most rsync transmissions are wrapped in a multiplexing envelope protocol.
     It is composed as follows:

     1.   envelope header (4 bytes)
     2.   envelope payload (arbitrary length)

     The first byte of the envelope header consists of a tag.  If the tag is
     7, the payload is normal data.  Otherwise, the payload is out-of-band
     server messages.  If the tag is 1, it is an error on the sender's part
     and must trigger an exit.  The rest of the header carries the payload
     length, which limits message payloads to a 24-bit integer size,
     0x00ffffff.
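
     A sketch of wrapping and unwrapping one envelope follows.  The layout
     is an assumption based on the reference implementation, not on the text
     above: the header is treated as a single little-endian 32-bit integer
     whose most significant byte is the tag and whose low 24 bits are the
     payload length.  The function names are hypothetical.

```python
import struct

TAG_DATA = 7              # tag for normal data, per the text above
MAX_PAYLOAD = 0x00FFFFFF  # 24-bit payload limit, per the text above


def pack_envelope(payload, tag=TAG_DATA):
    """Prefix payload with a header (assumed layout: tag << 24 | length)."""
    if not 0 <= tag <= 0xFF:
        raise ValueError("tag must fit in one byte")
    if len(payload) > MAX_PAYLOAD:
        raise ValueError("payload exceeds the 24-bit length limit")
    return struct.pack("<I", (tag << 24) | len(payload)) + payload


def unpack_envelope(data):
    """Split one envelope off the front of data.

    Returns (tag, payload, rest), where rest is whatever follows.
    """
    if len(data) < 4:
        raise ValueError("incomplete envelope header")
    (header,) = struct.unpack("<I", data[:4])
    tag = header >> 24
    length = header & MAX_PAYLOAD
    if len(data) < 4 + length:
        raise ValueError("incomplete envelope payload")
    return tag, data[4:4 + length], data[4 + length:]
```

     A reader would call unpack_envelope() repeatedly, passing payloads
     tagged 7 to the protocol parser and handling the rest as messages.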

     The only data not using this envelope are the initial handshake between
     client and server.

   File list
     A central part of the protocol is the file list, which is generated by
     the sender.  It consists of all files that must be sent to the receiver,
     either explicitly as given or recursively generated.

     The file list itself consists of filenames and attributes (mode, time,
     size, etc.).  Filenames must be relative to the destination root and not
     be absolute or contain backtracking.  So if a file is given to the sender
     as ../../foo/bar, it must be sent as foo/bar.
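
     One way to produce such names is sketched below.  How openrsync(1)
     cleans names is not stated here, so the approach (normalise, then drop
     any root and backtracking components) and the function name are
     assumptions made for illustration.

```python
import posixpath


def relative_name(path):
    """Return path without a leading root or ".." components."""
    # normpath() collapses "a/../b" and "./a", leaving ".." only at the
    # front of a relative path.
    normal = posixpath.normpath(path)
    parts = [p for p in normal.split("/") if p not in ("", ".", "..")]
    return "/".join(parts)
```

     With this sketch, ../../foo/bar becomes foo/bar and /etc/passwd becomes
     etc/passwd.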

     The file list should be cleaned of inappropriate files prior to sending.
     For example, if -l is not specified, symbolic links may be omitted.
     Directory entries without -r may also be omitted.  Duplicates may be
     omitted.

     The receiver must not assume that the file list is clean.  It should not
     omit inappropriate files from the file list (which would affect the
     indexing), but may omit them during processing.

     Prior to being sent from sender to receiver, and upon being received, the
     file list must be lexicographically sorted, such as with strcmp(3).
     Subsequent references to the file are by index in the sorted list.
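
     The sort is byte-wise, as strcmp(3) compares unsigned bytes rather than
     using locale rules.  The following sketch (hypothetical function name,
     file names held as byte strings) sorts a list and builds the index that
     later messages refer to.

```python
def sort_file_list(names):
    """Sort byte-string names as strcmp(3) would and index them."""
    # Python compares bytes objects by unsigned byte value, as strcmp does.
    ordered = sorted(names)
    index = {name: i for i, name in enumerate(ordered)}
    return ordered, index
```

     Both sides must arrive at the same order, since an index is meaningful
     only against an identically sorted list.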

   Client process
     The client can operate in sender or receiver mode depending upon the
     command-line source and destination.

     If the destination directory (sink) is remote, the client is in sender
     mode: the client will push its data to the server.  If the source file is
     remote, it is in receiver mode: the server pushes to the client.  If
     neither is remote, the client operates in sender mode.  These cases are
     mutually exclusive.

     When the client starts, regardless of its mode, it first handshakes with
     the server.  This exchange is not multiplexed.

     1.   send local version (integer)
     2.   receive remote version (integer)
     3.   receive random seed (integer)

     Following this, the client multiplexes when reading from the server.
     Transmissions sent from client to server are not multiplexed.  It then
     enters the Update exchange protocol.

   Server process
     The server can operate in sender or receiver mode depending upon how the
     client starts the server.  This may be directly from the parent process
     (when invoked for local files) or indirectly via a remote shell.

     When in sender mode, the server pushes data to the client.  (This is
     equivalent to receiver mode for the client.)  In receiver mode, the
     opposite is true.

     When the server starts, regardless of the mode, it first handshakes with
     the client.  This exchange is not multiplexed.

     1.   send local version (integer)
     2.   receive remote version (integer)
     3.   send random seed (integer)

     Following this, the server multiplexes when writing to the client.
     (Transmissions received from the client are not multiplexed.)  It then
     enters the Update exchange protocol.
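
     The client and server handshakes mirror each other, which the following
     sketch shows side by side.  It assumes file-like objects for each
     direction and sends no multiplexing envelope, as the handshake is
     unmultiplexed.  The function names are hypothetical.

```python
import struct

PROTOCOL_VERSION = 27  # version named in DESCRIPTION


def write_int(out, value):
    """Write a signed 32-bit little-endian integer."""
    out.write(struct.pack("<i", value))


def read_int(src):
    """Read a signed 32-bit little-endian integer."""
    data = src.read(4)
    if len(data) != 4:
        raise EOFError("short read during handshake")
    return struct.unpack("<i", data)[0]


def client_handshake(src, out):
    """Send our version, then read the server's version and seed."""
    write_int(out, PROTOCOL_VERSION)
    remote_version = read_int(src)
    seed = read_int(src)
    return remote_version, seed


def server_handshake(src, out, seed):
    """Send our version, read the client's version, then send the seed."""
    write_int(out, PROTOCOL_VERSION)
    remote_version = read_int(src)
    write_int(out, seed)
    return remote_version
```

     Only after these calls return would either side enable multiplexing on
     the server-to-client direction.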

   Update exchange
     When the client or server is in sender mode, it begins by conditionally
     sending the exclusion list.  At this time, this is always empty.

     1.   if --delete and the client, exclusion list zero (integer)

     It then sends the File list.  Prior to being sent, the file list should
     be lexicographically sorted.

     1.   status byte (byte)
     2.   inherited filename length (optional, byte)
     3.   filename length (integer or byte)
     4.   file (byte array)
     5.   file length (long)
     6.   file modification time (optional, time_t, integer)
     7.   file mode (optional, mode_t, integer)
     8.   if -o, the user id (integer)
     9.   if -g, the group id (integer)
     10.  if a special file and -D, the device “rdev” type (integer)
     11.  if a symbolic link and -l, the link target's length (integer)
     12.  if a symbolic link and -l, the link target (byte array)

     The status byte may consist of the following bits and determines which of
     the optional fields are transmitted.

     0x01    A top-level directory.  (Only applies to directory files.)  If
             specified, the matching local directory is for deletions.
     0x02    Do not send the file mode: it is a repeat of the last file's
             mode.
     0x08    Like 0x02, but for the user id.
     0x10    Like 0x02, but for the group id.
     0x20    Inherit some of the prior file name.  Enables the inherited
             filename length transmission.
     0x40    Use full integer length for file name.  Otherwise, use only the
             byte length.
     0x80    Do not send the file modification time: it is a repeat of the
             last file's.

     If the status byte is zero, the file-list has terminated.
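
     The effect of the status bits on the entry layout can be sketched as
     follows.  The flag names are hypothetical labels for the values listed
     above.  The device and symbolic link fields are left out because they
     depend on the file type rather than on the status byte.

```python
from enum import IntFlag


class Status(IntFlag):
    TOP_LEVEL = 0x01
    SAME_MODE = 0x02
    SAME_UID = 0x08
    SAME_GID = 0x10
    SAME_NAME = 0x20
    LONG_NAME = 0x40
    SAME_TIME = 0x80


def fields_sent(status, with_uid=False, with_gid=False):
    """List, in order, the fields that follow a given status byte.

    with_uid and with_gid stand for the -o and -g flags.
    """
    if status == 0:
        return []  # a zero status byte terminates the file list
    fields = []
    if status & Status.SAME_NAME:
        fields.append("inherited filename length")
    if status & Status.LONG_NAME:
        fields.append("filename length (integer)")
    else:
        fields.append("filename length (byte)")
    fields += ["file", "file length"]
    if not status & Status.SAME_TIME:
        fields.append("modification time")
    if not status & Status.SAME_MODE:
        fields.append("mode")
    if with_uid and not status & Status.SAME_UID:
        fields.append("user id")
    if with_gid and not status & Status.SAME_GID:
        fields.append("group id")
    return fields
```

     For example, a status byte of 0x82 (mode and modification time both
     repeated) leaves only the name and the file length to be read.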

     If -o has been specified, the sender sends the list of all users
     encountered in the file list.  Identifier zero ("root") is never
     transmitted, as it would prematurely end the list.  This list may be
     incomplete or empty: the server is not obligated to properly fill it in
     with all relevant users.

     1.   user identifier or zero to indicate end of set (integer)
     2.   non-zero length of user name (byte)
     3.   user name (prior length)

     The same sequence is then sent for groups if -g has been specified.
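
     Reading one such list, for users or for groups, can be sketched as
     below.  The function name is hypothetical and the input is assumed to
     be a file-like object already stripped of multiplexing.

```python
import struct


def read_id_list(src):
    """Read (identifier, name) pairs until the zero terminator."""
    entries = []
    while True:
        (ident,) = struct.unpack("<i", src.read(4))
        if ident == 0:
            return entries  # zero marks the end of the set
        (length,) = struct.unpack("<B", src.read(1))
        entries.append((ident, src.read(length)))
```

     Since the list may be incomplete, a receiver should be prepared for
     identifiers in the file list that have no name here.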

     The sender then sends any IO error values, which for openrsync(1) is
     always zero.

     1.   constant zero (integer)

     The server sender then reads the exclusion list, which is always zero.

     1.   if server, constant zero (integer)

     Following that, the sender receives data regarding the receiver's copy of
     the file list contents.  This data is not ordered in any way.  Each of
     these requests starts as follows:

     1.   file index or -1 to signal a change of phase (integer)

     The phase starts in phase 1, then proceeds to phase 2, and phase 3
     signals an end of transmission (no subsequent blocks).  If a phase change
     occurs, the sender must write back the -1 constant integer value and
     increment its phase state.
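
     The sender's phase handling can be sketched as a small loop.  It
     assumes the request indexes have already been decoded into an iterable
     and that write_int is a caller-supplied function that writes one
     integer to the receiver.  Both names are hypothetical.

```python
def serve_requests(indexes, write_int):
    """Yield file indexes to service, acknowledging phase changes."""
    phase = 1
    for index in indexes:
        if index != -1:
            yield index  # an ordinary request for this file
            continue
        write_int(-1)    # echo the phase change back to the receiver
        phase += 1
        if phase == 3:   # phase 3 signals the end of transmission
            return
```

     Anything that follows the second phase change is never consumed.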

     Blocks are read as follows:

     1.   block index (integer)

     In dry-run (-n) mode, the sender may immediately write back the index
     (integer) to skip the following.

     1.   number of blocks (integer)
     2.   block length in the file (integer)
     3.   long checksum length (integer)
     4.   terminal (remainder) block length (integer)

     And for each block:

     1.   short checksum (integer)
     2.   long checksum (bytes of checksum length)
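
     Parsing the block description for one file can be sketched as follows.
     The function and key names are hypothetical.  The short checksum is
     read unsigned so that it compares as a bit pattern, which is a choice
     of this sketch and not something the protocol requires.

```python
import struct


def read_int(src):
    """Read a signed 32-bit little-endian integer."""
    return struct.unpack("<i", src.read(4))[0]


def read_block_set(src):
    """Read the four header integers, then one checksum pair per block."""
    count = read_int(src)
    block_length = read_int(src)
    checksum_length = read_int(src)
    remainder = read_int(src)
    blocks = []
    for _ in range(count):
        (short,) = struct.unpack("<I", src.read(4))
        blocks.append((short, src.read(checksum_length)))
    return {
        "block_length": block_length,
        "checksum_length": checksum_length,
        "remainder": remainder,
        "blocks": blocks,
    }
```

     A block count of zero yields an empty block list, as would be expected
     when the receiver has no copy of the file to compare against.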

     The sender then compares the two files, block by block, and updates the
     receiver with mismatches as follows.

     1.   file index (integer)
     2.   number of blocks (integer)
     3.   block length (integer)
     4.   long checksum length (integer)
     5.   remainder block length (integer)

     Then for each block:

     1.   data chunk size (integer)
     2.   data chunk (bytes)
     3.   block index subsequent to chunk or zero for finished (integer)

     Following this sequence, the sender sends the following:

     1.   whole-file long checksum (16 bytes)

     The sender then either handles the next queued file or, if the receiver
     has written a phase change, the phase change step.

     The receiver may need to request a redo for files that did not match the
     final whole-file long checksum.  It should do so at this time; otherwise,
     it sends another phase change.

     If the sender is the server, then the sender must send statistics whether
     -v has been specified or not.

     1.   total bytes read (long)
     2.   total bytes written (long)
     3.   total size of files (long)

     Finally, the sender must read a final constant-value integer.

     1.   end-of-sequence -1 value (integer)

     If in receiver mode, the inverse of the above (write instead of read,
     read instead of write) is performed.

     The receiver begins by conditionally writing, then reading, the exclusion
     list count, which is always zero.

     1.   if client, send zero (integer)
     2.   if receiver and --delete, read zero (integer)

     The receiver then proceeds with reading the File list as already defined.
     Following the list, the receiver reads the IO error, which must be zero.

     1.   constant zero (integer)

     The receiver must then sort the file names lexicographically.

     If there are no files in the file list at this time, the receiver must
     exit prior to sending per-file data.  Otherwise, it proceeds with the
     file blocks.

     For file blocks, the receiver must look at each file that is not up to
     date (a file is up to date if it has the same file size and timestamp)
     and send it to the server.  Symbolic links and directory entries are
     never sent to the server.
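
     The up-to-date test can be sketched as below.  The function name is
     hypothetical, and treating a missing local file as needing an update is
     an assumption of the sketch.  Timestamps are compared in whole seconds,
     in keeping with the integer time values of the file list.

```python
import os


def needs_update(path, size, mtime):
    """Return True if the local file differs in size or timestamp."""
    try:
        st = os.lstat(path)
    except FileNotFoundError:
        return True  # assumption: nothing local, so request the file
    return not (st.st_size == size and int(st.st_mtime) == mtime)
```

     Only files for which this returns true would have their block
     checksums sent.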

     After the second phase has completed, there is an optional redo phase
     that may follow.  This proceeds in an identical fashion to the second
     phase, but with the checksum length increased to the full MD4 size.  This
     phase is primarily designed to catch files that may have changed while
     the transfer was in progress, but it also catches incorrectly selected
     blocks from the second phase.  Such blocks may happen due to relatively
     rare collisions in the short hash and the two-byte version of the long
     hash that the second phase uses.  Regardless of needing to redo any
     files, the receiver sends one more end of phase marker to signal the end
     of file transfers.

     Following the optional redo phase and prior to writing the end-of-data
     signal, the client receiver reads statistics.  This is performed
     regardless of -v, but the statistics will only be written out if -v has
     been specified.

     1.   total bytes read (long)
     2.   total bytes written (long)
     3.   total size of files (long)

     Finally, the receiver must send the constant end-of-sequence marker.

     1.   end-of-sequence -1 value (integer)

   Sender and receiver asynchrony
     The sender and receiver need not work in lockstep.  The receiver may send
     file update requests as quickly as it parses them, and respond to the
     sender's update notices on demand.  Similarly, the sender may read as
     many update requests as it can, and service them in any order it wishes.

     The sender and receiver synchronise state only at the end of phase.

     The reference rsync(1) takes advantage of this with a two-process
     receiver, one for sending update requests (the generator) and another for
     receiving.  openrsync(1) uses an event-loop model instead.

SEE ALSO
     openrsync(1), rsync(1), rsyncd(5)

BUGS
     Time values are sent as 32-bit integers.

     When in server mode and when communicating to a client with a newer
     protocol (>27), the phase change integer (-1) acknowledgement must be
     sent twice by the sender.  This is probably a bug in the reference
     implementation.

macOS 15.2                     January 16, 2025                     macOS 15.2