tar

        Read/Write UNIX TAR and CPIO Format Files

Usage:  tar [-acCMtxXyh] [-#!ADFjJLNpPqrRsSTvV] [-fQwWZ-] [-#n]
              [-B blksize] [-Hon] [-Hoff] [-b sex] [-bL] [-bB]
              [-d dir] [-E endset] [-ff] [-I include] [-m map]
              [-O offset] [ tarfile ] [ file1 file2 ... ]

   tar is used to read or write a simple archive format popular
   for exchanging files between dissimilar machines.

   tar normally expects the archive to be in a file specified by
   the tarfile operand.

   When adding files, the names are in the user's normal file
   name space and wildcards can be used in the normal fashion.

   When listing or extracting files, the file names that follow
   are considered to be case-sensitive in the name space of
   what's in the archive and must match the complete path
   specified there.  Full wildcarding is supported.  For example,

      tar -x myarchive.tar ".../*.[ch]"

   would cause any .c or .h files anywhere in the archive to be
   extracted.  (The "..." construct matches any number of
   directory levels and the "[ ]" construct matches any
   character in the enclosed set.)  Notice that if wildcards are
   used, they should be enclosed in single or double quotes so
   the C shell won't try expanding them before tar sees them.
   Also, if you want to specify a character that's normally a
   wildcard as an ordinary character, you will need to "double-
   escape" it.  For example, to extract a file named
   "mail[2008]", you would need to type:

      tar -x myarchive.tar mail^^[2008^^]

   to ensure that the escape character (even if it was inside
   quotes) is actually passed through the C shell to tar.
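
   As a rough illustration of the matching described above, the
   following Python sketch translates an archive wildcard into a
   regular expression.  It is not tar's implementation:  the
   function name is hypothetical, and treating "..." as zero or
   more directory levels and "*" and "?" as never crossing a "/"
   are assumptions made for the example.

```python
import re

def archive_pattern_to_regex(pattern: str) -> "re.Pattern[str]":
    """Approximate an archive wildcard as a case-sensitive regex.

    Assumed semantics (not taken from tar itself): ".../" matches zero
    or more directory levels, "*" and "?" stay within one path
    component, and "[set]" matches one character from the set.
    """
    out = []
    i = 0
    while i < len(pattern):
        if pattern.startswith(".../", i):
            out.append(r"(?:[^/]+/)*")      # any number of directories
            i += 4
        elif pattern[i] == "*":
            out.append(r"[^/]*")
            i += 1
        elif pattern[i] == "?":
            out.append(r"[^/]")
            i += 1
        elif pattern[i] == "[" and pattern.find("]", i + 1) != -1:
            end = pattern.find("]", i + 1)
            body = pattern[i + 1:end].replace("\\", "\\\\")
            out.append("[" + body + "]")
            i = end + 1
        else:
            out.append(re.escape(pattern[i]))  # ordinary character
            i += 1
    return re.compile("".join(out))

# The complete path stored in the archive must match, hence fullmatch().
rx = archive_pattern_to_regex(".../*.[ch]")
```

   Under these assumptions, rx.fullmatch("src/lib/util.c") and
   rx.fullmatch("main.h") both succeed, while "src/readme.txt"
   and "src/UTIL.C" do not match.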

   When extracting files, this version of tar incorporates logic
   to interactively crunch up a filename in the archive into
   something legal on an NT filesystem.  If -F is specified, FAT
   naming rules are enforced.  Otherwise, HPFS or NTFS rules
   are assumed, meaning long filenames are assumed to be legal.
   Any renamings will be listed in a .map file.

   When reading an archive, this version of tar automatically
   detects whether it was written in CPIO or TAR format and
   what bytesex was used.

   tar also incorporates logic to automatically convert between
   the \n line endings used in an archive and the \r\n line
   endings used under NT unless the file appears to be
   binary, based on its content.  The environment variables
   TARBINARY and TARASCII can also be used to specify sets of
   files by name which are to be considered binary or ASCII,
   respectively, regardless of content.  Each of these variables
   may contain a list of wildcards.  If a filename or just the
   tail of it (i.e., just the name + extension, leaving off the
   preceding path) matches one of the wildcards in the list,
   that file is considered to be of the specified type.  If a
   filename matches both lists or if it matches neither list,
   the usual test based on file content will be made.  Files
   that receive line end conversions are highlighted in the
   listings produced by tar in the ASCIICONVERT color for easy
   review.
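
   The decision rule for TARBINARY and TARASCII can be sketched
   in Python as follows.  This is only an illustration:  the
   function name is hypothetical, the wildcard lists are passed
   in already split (how the environment variables separate
   their entries is not shown here), and Python's fnmatch is
   used as a stand-in for tar's own wildcard matching.

```python
import fnmatch
import posixpath

def classify(name, tarbinary, tarascii):
    """Return "binary", "ascii" or "content".

    "content" means neither list decided the question, so the usual
    test based on the file's content would be applied.
    """
    tail = posixpath.basename(name)      # name + extension, no path

    def matches(patterns):
        # A list matches if any wildcard matches the full name or its tail.
        return any(
            fnmatch.fnmatchcase(candidate, pat)
            for pat in patterns
            for candidate in (name, tail)
        )

    is_binary = matches(tarbinary)
    is_ascii = matches(tarascii)
    if is_binary == is_ascii:            # matched both lists or neither
        return "content"
    return "binary" if is_binary else "ascii"
```

   For example, with tarbinary = ["*.gif"] and tarascii =
   ["*.txt"], the name "src/logo.gif" is classified as binary by
   its tail alone, while "data.bin" falls through to the content
   test.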

   There is no limit on the overall length of an archive except
   whatever limit may be imposed by the filesystem if the archive
   is written to disk.  The filesize limit for individual files
   within an archive is determined by the archive format:  for
   tar archives, the limit is 8.4 million petabytes, essentially
   unlimited; for CPIO binary and new portable CPIO archives, the
   limit is 4G bytes; for CPIO ASCII archives, the limit is 8G
   bytes.  (But when using tar for interchange with other
   systems, bear in mind that those other systems may impose
   their own smaller limits.)

   When adding files to an archive, timestamps outside the legal
   range (January 1, 1970 to February 7, 2106) for a tar archive
   will be truncated to these dates.
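
   The upper date coincides with the largest value of an
   unsigned 32-bit count of seconds since January 1, 1970.  The
   Python sketch below checks that arithmetic and shows the
   truncation as a simple clamp.  The 32-bit interpretation is
   an inference from the dates, and the names are hypothetical.

```python
from datetime import datetime, timedelta, timezone

TAR_MIN = 0              # January 1, 1970 00:00:00 GMT
TAR_MAX = 2**32 - 1      # assumed: largest unsigned 32-bit second count

def clamp_timestamp(seconds: int) -> int:
    """Truncate a timestamp (seconds since 1970) to the legal range."""
    return max(TAR_MIN, min(TAR_MAX, seconds))

# Date arithmetic only, so this does not depend on the platform's time_t.
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
latest = EPOCH + timedelta(seconds=TAR_MAX)   # falls on February 7, 2106
```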

Tape Drives:

   The tarfile can be the tape device, specified by its special
   file name, \\.\tape0 (or \\.\tape1, \\.\tape2, etc.,
   if you have more than one), or via the -# option.

   When reading/writing to a tape, tar rewinds the tape when it
   starts up and rewinds again and then ejects when it finishes
   unless -N is specified.

Basic Commands:

   -a         Add files to the end of the archive.  If the
              archive is on a tape device, this operation may
              not be possible, depending on whether your drive
              supports repositioning and rewriting the last
              physical block on the tape.  For example, it
              works with DAT drives but not with QIC drives.
              If -a does not work with your drive, you'll have
              to use -c instead.
   -c         Create a new archive, truncating any existing
              archive to zero bytes before writing to it.
   -C         Copy entire archive segments (including headers and
              any padding) to stdout.  After the last
              segment, write a trailer to mark the end of the
              archive.  (If you intend to concatenate archives,
              use the -Z option to suppress writing the trailer.)
   -M         Just build a mapfile for renaming files in the
              archive to NT conventions; don't extract
              anything.
   -t         List the contents of the archive.  This is the
              default.
   -x         Extract files from the archive.  Default is all
              files in the archive.
   -X         Extract everything EXCEPT the specified files from
              the archive.
   -y         Extract the specified files in the archive to
              stdout.
   -h         Help.  (This screen.)

Basic Options:

   -#         Use the default tape device, \\.\tape0.
   -#n        Use the n-th tape device, where n is a single
              decimal digit.  For example, -#1 means tar should
              use \\.\tape1.
   -!         Non-interactive.  Files are renamed as necessary
              for NT conventions.  (Particularly useful
              with -M when trying to read a new, large archive
              file.)
   -A         The Archive bit is reset for any files or direct-
              ories copied to a TAR or CPIO archive file.  (When
              extracting files, the -A option is ignored and the
              Archive bit is always set.)
   -B blksize Use the specified blocksize when creating a new
              archive.  Default is 10240 bytes if supported
              by the device.  When reading or adding to an
              existing archive on tape, tar tries to determine
              and use whatever blocksize was used when the
              archive was created.  How it does that depends on
              what release of Windows NT you're running and
              whether your drive supports variable blocksizes.
              If you're running NT 3.51 or later and variable
              blocksizes are supported, this option is ignored
              and the actual blocksize is determined directly
              using variable blocksize support.  Otherwise,
              tar first tries this specified blocksize; if that
              doesn't work, it tries all the possible multiples
              of 512 bytes up to the maximum supported on your
              machine.
   -D         Dim.  Don't insert ANSI escape sequences into the
              output to highlight anything.
   -F         FAT filesystem naming when extracting or building
              the map file.
   -Hon       Hardware compression on, if supported.  (Default is
              to use the current setting for compression.)
   -Hoff      Hardware compression off.
   -j         New portable System V CPIO ASCII format.
   -J         New portable System V CPIO ASCII format with
              checksum.
   -L         Long listing similar to ls -L showing the attri-
              butes, timestamp and length of each file in the
              archive.
   -N         No rewind or eject.  If the tarfile is on a tape
              device, tar normally rewinds the tape at the start
              and then rewinds and ejects at the end.  This
              option turns that off.
   -p         CPIO format, using binary headers.
   -P         CPIO format, using ASCII headers.
   -q         Quiet.  tar normally prints the header of each
              file as it's extracted (-x) or added (-a or -c) to
              the archive. This option turns that off.
   -r         CarriageReturn/NewLine expansion is turned off.
              (Default is normally to convert any \n characters
              not preceded by a \r in the archive to \r\n
              combinations under NT unless the file
              appears to be binary.)
   -R         CarriageReturn/NewLine expansion is forced ON, even
              for files that appear to be binary.
   -s         Read the archive from stdin when listing the table
              of contents or extracting.  Write the archive to
              stdout when adding files.  (Implies non-inter-
              active.)
   -S         Stop if a file is encountered that cannot be
              extracted.  Normally, a warning message is given
              but processing continues.
   -T         Total the sizes of all selected files.
   -v         Verbose.  Like -L, but also show the offset of each
              file from the beginning of the archive and what
              archive format and bytesex was used.  Also turns
              on warnings about line-end conversions being turned
              off on binary files.
   -V         Don't use variable block I/O even if the drive
              claims it supports it.  Useful as a workaround
              if your drive's firmware has a bug.
   --         End of options.
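
   The line-end expansion controlled by -r and -R can be
   pictured with the Python sketch below.  The conversion rule
   (a \n not preceded by a \r becomes \r\n) follows the
   description of -r above; the binary test shown is a
   hypothetical placeholder, since the actual content test is
   not described here, and the mode names are invented for the
   example.

```python
import re

# A \n that is not already preceded by a \r.
_BARE_LF = re.compile(rb"(?<!\r)\n")

def expand_line_ends(data: bytes) -> bytes:
    """Convert bare \\n to \\r\\n, leaving existing \\r\\n pairs alone."""
    return _BARE_LF.sub(b"\r\n", data)

def looks_binary(data: bytes) -> bool:
    """Placeholder heuristic (an assumption): a NUL byte means binary."""
    return b"\0" in data

def convert_on_extract(data: bytes, mode: str = "auto") -> bytes:
    """mode is "auto" (default), "off" (like -r) or "force" (like -R)."""
    if mode == "off":
        return data
    if mode == "auto" and looks_binary(data):
        return data
    return expand_line_ends(data)
```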

Advanced Options:

   -b sex     Byte sex in the archive:  abcd (little-endian),
              badc (big-endian), cdab or dcba.  Default is to
              autosense bytesex in existing archives and to use
              abcd for new archives.
   -bL        Little-Endian bytesex.  (An alias for -b abcd.)
   -bB        Big-Endian bytesex.  (An alias for -b badc.)

              Note:  To write an archive intended to be read
                     on a RISC or Motorola-based UNIX machine,
                     use -b badc or -bB (big-endian).

   -d dir     Default destination drive and directory when
              extracting files.
   -E endset  Offset at which to stop reading the archive file.
   -f         Fullpath option.  Put the full pathname (minus any
              disk prefix) specified on the command line into the
              archive header when adding.  (In this context, the
              full path means the full name given on the command
              line, not the fully-qualified name starting from
              the root directory.)  When extracting, use the full
              pathname given in the header to determine where the
              files will go.
   -ff        Another variation on the fullpath option that will
              put the entire pathname, even including the drive
              letter into the tar archive.  The resulting name
              isn't really legal in a tar file, but it's useful
              for doing backups of several drives at once.
   -I include Files to be added to or read from the archive are
              specified in the include file.  If the name of
              the include file is given as "-", the names
              will be read from stdin.  If more than one -I
              include file is given, the lists of names they 
              hold will be concatenated, one after another.
              Any files specified on the command line will be
              added onto the end.
   -m map     Specific filename to be used for showing mappings
              from names in the archive to names used on
              NT.  (If -M is specified, but -m isn't
              used to specify a name for the mapfile, the
              default is to paste a .map extension onto the name
              of the tar file; if -s is specified, i.e., the tar
              file doesn't have a name, no map file is used
              unless -m is given.)
   -O offset  Offset at which to start reading the archive file.
              Given in bytes from beginning of the file.
   -Q         Very Quiet.  tar normally warns of any garbled
              sections that it skipped; this turns off those
              warnings also.
   -w         Share all files being copied to the archive for
              read/write access by other processes.  (Default
              is to do that only with files already open by
              another process.)
   -W         Warnings.  Show just the files that can't be
              extracted to NT because of their file types.
              (Shown in the FOREIGNFILES color.)
   -Z         Suppress writing the trailer normally written
              following the last segment extracted from an
              archive with the -C option.  (Useful for
              concatenating segments extracted from several
              separate archives.)

Examples:

   1. To list the contents of a tar file on tape, showing the
      timestamps and sizes of the files:

         tar -L \\.\tape0

   2. To extract everything on the tape into the current
      directory, again showing timestamps and sizes:

         tar -xL \\.\tape0

   3. To copy all the *.c files in the current directory to a
      new tar tape, overwriting anything that may already be
      on the tape, again showing timestamps and sizes:

         tar -cL \\.\tape0 *.c

   4. Same as (3), but write it in big-endian format, suitable
      for a UNIX RISC machine:

         tar -cLbB \\.\tape0 *.c

   5. Same as (3), but adding files to an existing archive
      on the tape rather than overwriting it:

         tar -aL \\.\tape0 *.c

      Note:  Adding to an archive on tape isn't supported by
             all types of tape drives.  See the comments
             regarding the -a operation above.

   6. Extract everything on a tar-format floppy into the
      current directory:

         dskread a: | tar -xsL

   7. Write all the *.c files in the current directory to a
      tar-format floppy in big-endian format, verifying each
      write operation along the way:

         tar -csbB *.c | dskwrite -vx a:

TAR Format:

   Tar files are organized as a series of 512-byte blocks.
   Individual files always start on a block boundary with a
   header block followed by the uncompressed data in the file.
   At the end of the archive are two blocks filled with binary
   zeros.  The header has the following format, packed with
   individual fields byte-aligned:

      typedef struct {
            char  name[100],
                  mode[8],
                  userid[8],
                  groupid[8],
                  filesize[12],
                  timestamp[12],
                  checksum[8],
                  linkflag,
                  linkname[100];
            union {
               char  unused_chars[255];
               struct {
                  char  magic[6],
                        version[2],
                        username[32],
                        groupname[32],
                        devmajor[8],
                        devminor[8],
                        prefix[155];
                  } ustar;
               } u;
            } tar_header;
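
   As an illustration of the layout, the header can be split
   into its raw fields with Python's struct module.  This is a
   sketch rather than tar's own code; the field names follow the
   typedef above, the last 255 bytes are kept as one opaque
   field called "u", and the helper names are hypothetical.

```python
import struct

# name, mode, userid, groupid, filesize, timestamp, checksum,
# linkflag, linkname, then the 255-byte union.  "=" disables padding,
# so the total is exactly one 512-byte block.
TAR_HEADER = struct.Struct("=100s8s8s8s12s12s8sc100s255s")

FIELDS = ("name", "mode", "userid", "groupid", "filesize",
          "timestamp", "checksum", "linkflag", "linkname", "u")

def split_header(block: bytes) -> dict:
    """Return the raw bytes of each header field, keyed by name."""
    if len(block) != TAR_HEADER.size:
        raise ValueError("a tar header is exactly 512 bytes")
    return dict(zip(FIELDS, TAR_HEADER.unpack(block)))

def text(field: bytes) -> str:
    """Decode a field up to its first null byte."""
    return field.split(b"\0", 1)[0].decode("ascii")

# Build a small header to try it out; struct pads short fields with nulls.
example = TAR_HEADER.pack(b"hello.c", b"0000644", b"", b"",
                          b"00000000012", b"", b"", b"0", b"", b"")
fields = split_header(example)
```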

   Traditionally, everything in a tar header is in ASCII with
   nulls and spaces to punctuate the fields and numbers are
   always in octal.  But eleven octal digits (plus a space) in
   the filesize field would only allow a maximum value of 8.59GB,
   which is certainly smaller than may be supported on many
   modern systems, including Windows.  Thus, a popular extension
   supported by this tar is to interpret numeric fields as
   binary if the high bit is set in the first character.
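
   The arithmetic behind that limit, and one way to read both
   forms of a numeric field, can be sketched in Python.  The
   binary layout assumed here (big-endian, with the marker bit
   cleared before conversion) follows the common GNU tar
   convention for non-negative values; it is an assumption about
   the extension, not a statement of how this tar encodes it.

```python
OCTAL_LIMIT = 8**11          # 8589934592 bytes, about 8.59GB

def parse_numeric(field: bytes) -> int:
    """Decode a tar numeric field in either octal or binary form."""
    if field and field[0] & 0x80:
        # Binary form (assumed big-endian): drop the marker bit, then
        # read the whole field as one unsigned integer.
        return int.from_bytes(bytes([field[0] & 0x7F]) + field[1:], "big")
    digits = field.strip(b" \0")         # nulls and spaces punctuate
    return int(digits.decode("ascii"), 8) if digits else 0

def encode_binary(value: int, width: int = 12) -> bytes:
    """Encode a non-negative value in the assumed binary form."""
    raw = value.to_bytes(width, "big")
    if raw[0] & 0x80:
        raise ValueError("value too large for the field")
    return bytes([raw[0] | 0x80]) + raw[1:]
```

   With this sketch, a 10GB size that cannot be written as
   eleven octal digits round-trips through encode_binary and
   parse_numeric unchanged.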

   The mode, user and group ids aren't meaningful on NT
   and are ignored when extracting and just filled in with
   read/write for owner, owned by root when adding.  The
   timestamp is in seconds since Jan 1 00:00:00 GMT 1970.  The
   checksum is calculated as if that field contained spaces.
   The linkflag tells the file type, reported in the long listing
   as one of the following:

      -    Normal File
      D    Directory
      L    Link (not a separate file, just another name
           for one that already exists)
      S    Symbolic Link
      C    Character Device
      B    Block Device
      F    FIFO

   Under NT, only the normal files and directories have
   any meaning.  Directories are normally highlighted.  The other
   file types are normally reported in bright red but otherwise
   ignored.
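
   The checksum rule mentioned above can be sketched in Python.
   In the standard tar format the checksum is the sum of all 512
   header bytes taken as unsigned values, with the 8-byte
   checksum field counted as spaces.  The function name is
   hypothetical, and the offset is derived from the field sizes
   in the typedef.

```python
CHECKSUM_OFFSET = 100 + 8 + 8 + 8 + 12 + 12   # bytes before checksum = 148
CHECKSUM_LENGTH = 8

def header_checksum(block: bytes) -> int:
    """Sum the header bytes as if the checksum field held spaces."""
    if len(block) != 512:
        raise ValueError("a tar header is exactly 512 bytes")
    end = CHECKSUM_OFFSET + CHECKSUM_LENGTH
    blanked = block[:CHECKSUM_OFFSET] + b" " * CHECKSUM_LENGTH + block[end:]
    return sum(blanked)      # iterating bytes yields unsigned values
```

   Because the field is blanked first, the result is the same
   whether or not a checksum has already been stored in the
   header.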

   The last 255 bytes may contain either all binary zeros or
   the new "USTAR" trailer, used when the filename is longer
   than 100 characters.  In USTAR format, the magic field
   contains the null-terminated string "ustar", the version
   is "00" (without a null) and, if the prefix field is not
   null, the actual pathname is formed by concatenating the
   prefix + a slash + the name.  If the prefix is null, the
   name field is used alone.
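
   The pathname rule can be sketched in Python as follows.  The
   offsets are computed from the field sizes in the typedef
   above; the function names are hypothetical and the code is
   only an illustration of the rule, not tar's implementation.

```python
# Offsets within the last 255 bytes of the header.
MAGIC = slice(0, 6)
PREFIX_START = 6 + 2 + 32 + 32 + 8 + 8        # = 88
PREFIX = slice(PREFIX_START, PREFIX_START + 155)

def _cstr(field: bytes) -> str:
    """Decode a field up to its first null byte."""
    return field.split(b"\0", 1)[0].decode("ascii")

def full_path(name_field: bytes, last255: bytes) -> str:
    """Form the pathname from the name field and the USTAR prefix."""
    name = _cstr(name_field)
    if last255[MAGIC] != b"ustar\0":
        return name                  # no USTAR trailer present
    prefix = _cstr(last255[PREFIX])
    return prefix + "/" + name if prefix else name
```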

   When writing USTAR format, the username and groupname
   are null, the devmajor is 0 and devminor is 1.  When
   reading USTAR format, all the fields except the prefix
   are ignored.

   If the filename is too long even in USTAR format, tar will
   use the GNU extension convention of writing a special prefix
   consisting of a header marked with a special linkflag
   indicating that the data which follows is the full name of
   the next file in the archive.

CPIO Format:

   If -p is specified, tar will read and write CPIO format files,
   using binary headers of the following format:

      typedef struct {
            short  magic,     /* Always 0x71c7 == Octal 070707 */
                   dev;       /* Device containing directory
                                 entry for this file. */
            ushort inode,     /* UNIX inode number. */
                   mode,
                   userid,
                   groupid,
                   nlink,
                   rdev;     /* Device ID for special files. */
            ulong  timestamp;
            ushort namelen;  /* including trailing null. */
            ulong  filesize;
            char   name[ namelen rounded to word ];
            } cpio_header;

   The dev, inode, mode, userid, groupid, nlink and rdev fields
   are not meaningful on NT and are ignored when
   extracting and filled in with 1, 1, read/write by owner,
   0, 0, 1 and 0, respectively, when writing.

   If -P is specified, tar will read and write CPIO format files
   using the alternate ASCII format headers, where each ushort is
   written as a 6-character octal number, each ulong as an 11-
   character octal number, and name is null-terminated.

   In a CPIO file, data immediately follows the header and is not
   padded to a block boundary.
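
   For the ASCII header variant, a minimal Python sketch of a
   reader might look like this.  It assumes the two short fields
   (magic and dev) are also written as 6-character octal
   numbers, which gives a 76-byte fixed header; the names are
   hypothetical.

```python
# (field name, width in octal characters), in header order.
CPIO_ASCII_FIELDS = (
    ("magic", 6), ("dev", 6), ("inode", 6), ("mode", 6),
    ("userid", 6), ("groupid", 6), ("nlink", 6), ("rdev", 6),
    ("timestamp", 11), ("namelen", 6), ("filesize", 11),
)
HEADER_LEN = sum(width for _, width in CPIO_ASCII_FIELDS)   # 76

def parse_cpio_ascii(buf: bytes) -> dict:
    """Decode one ASCII CPIO header found at the start of buf."""
    values = {}
    pos = 0
    for field, width in CPIO_ASCII_FIELDS:
        values[field] = int(buf[pos:pos + width].decode("ascii"), 8)
        pos += width
    name_end = pos + values["namelen"]       # namelen counts the null
    values["name"] = buf[pos:name_end - 1].decode("ascii")
    values["data_offset"] = name_end         # data follows, unpadded
    return values
```

   The 11-character octal filesize is what gives this format
   its 8G byte limit for individual files.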

Portable CPIO ASCII Format:

   If -j or -J is specified, tar will read and write archives
   using headers defined by the portable CPIO ASCII format
   introduced with UNIX System V:

      typedef   char   long_hex[8];
      
      typedef struct new_cpio_ascii_header_str {
            char       magic[6];
            long_hex   inode,
                       mode,
                       userid,
                       groupid,
                       nlink,
                       timestamp,
                       filesize,
                       dev_major,
                       dev_minor,
                       rdev_major,
                       rdev_minor,
                       namelen,    /* including trailing null. */
                       checksum;
            char       name[ namelen+1 ];
            } new_cpio_ascii_header;

   The magic field always contains either "070701" (normal) or
   "070702" (checksum variation.)  The inode, mode, userid,
   groupid, nlink, dev and rdev fields are not meaningful on
   NT and are ignored when extracting and filled in with
   1, read/write by owner, 0, 0, 1, 1, and 0, respectively, when
   writing.

   The header, including the filename, is padded to the next
   DWORD (4-byte) boundary.  The data section immediately follows
   and is also padded to the next DWORD boundary.

   The difference between the -j and -J options is that if -J
   is specified, the checksum field is filled in with a simple
   32-bit sum of all the bytes in the file, each taken as an
   unsigned 8-bit value.
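
   The padding and checksum rules for this format can be
   sketched in Python.  The fixed header length is computed from
   the typedef (a 6-character magic plus thirteen 8-character
   fields); the function names are hypothetical.

```python
NEWC_HEADER_LEN = 6 + 13 * 8        # fixed part of the header = 110

def pad4(n: int) -> int:
    """Round n up to the next DWORD (4-byte) boundary."""
    return (n + 3) & ~3

def newc_layout(namelen: int, filesize: int):
    """Return (data offset, next header offset) for one entry.

    namelen includes the trailing null; offsets are relative to the
    start of this entry's header.
    """
    data_start = pad4(NEWC_HEADER_LEN + namelen)
    next_header = pad4(data_start + filesize)
    return data_start, next_header

def newc_checksum(data: bytes) -> int:
    """32-bit sum of the file's bytes, each taken as unsigned 8-bit."""
    return sum(data) & 0xFFFFFFFF
```

   For an 8-byte name (including its null) and 5 bytes of data,
   the data starts at offset 120 and the next header at 128.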


Colors:

   You may set your own choices for screen colors using these
   environment variables:

    Name           Use                              Default
    ASCIICONVERT   ASCII files receiving line       Bright Yellow
                   end conversion
    COLORS         Normal screen colors             <null string>
    DIRECTORIES    Directories                      Bright
    FOREIGNFILES   Filetypes not supported by NT    Bright Red
    READONLYDIRS   Directories marked read-only     same as DIRECTORIES
    READONLYFILES  Files marked read-only           same as COLORS

   Colors recognized are black, red, green, yellow, blue, magenta
   (or red blue), cyan (or blue green) or white.  Foreground and
   background colors may also be bright, dim or reverse.  The
   names of the colors and the words bright, dim, reverse and on
   may be in either upper or lower or mixed case.

   Either or both the foreground and background colors may be
   specified; if you don't specify a value, it's considered
   transparent and inherits the color underneath it.  DIRECTORIES,
   FOREIGNFILES and ASCIICONVERT inherit from COLORS.  If COLORS
   is null, tar uses the current screen colors it finds at startup.
   Specifying COLORS=none turns off all use of color.

   If the -D (dim) option is specified, all highlighting is
   turned off, regardless of the settings for these environment
   variables.
