        Read/Write UNIX TAR and CPIO Format Files

Usage:  tar [-acCMtxXyh] [-#!ADFjJLNpPqrRsSTvV] [-fQwWZ-] [-#n]
            [-B blksize] [-Hon] [-Hoff] [-b sex] [-bL] [-bB] [-d dir]
            [-E endset] [-ff] [-I include] [-m map] [-O offset]
            [ tarfile ] [ file1 file2 ... ]

    tar is used to read or write a simple archive format popular for
    exchanging files between dissimilar machines.

    tar normally expects the archive to be in a file specified by the
    tarfile operand.

    When adding files, the names are in the user's normal file name
    space and wildcards can be used in the normal fashion.

    When listing or extracting files, the file names that follow are
    considered to be case-sensitive in the name space of what's in the
    archive and must match the complete path specified there. Full
    wildcarding is supported. For example,

        tar -x myarchive.tar ".../*.[ch]"

    would cause any .c or .h files anywhere in the archive to be
    extracted. (The "..." construct matches any number of directory
    levels and the "[ ]" construct matches any character in the
    enclosed set.)

    Notice that if wildcards are used, they should be enclosed in
    single or double quotes so the C shell won't try expanding them
    before tar sees them. Also, if you want to specify a character
    that's normally a wildcard as an ordinary character, you will need
    to "double-escape" it. For example, to extract a file named
    "mail[2008]", you would need to type:

        tar -x myarchive.tar mail^^[2008^^]

    to ensure that the escape character (even if it was inside quotes)
    is actually passed through the C shell to tar.

    When extracting files, this version of tar incorporates logic to
    interactively crunch up a filename in the archive into something
    legal on an NT filesystem. If -F is specified, FAT naming rules are
    enforced. Otherwise, HPFS or NTFS rules are assumed, meaning long
    filenames are assumed to be legal. Any renamings will be listed in
    a .map file.
    When reading an archive, this version of tar automatically detects
    whether it was written in CPIO or TAR format and what bytesex was
    used.

    tar also incorporates logic to automatically convert between the
    \n line endings used in an archive and the \r\n line endings used
    under NT unless the file appears to be binary, based on its
    content.

    The environment variables TARBINARY and TARASCII can also be used
    to specify sets of files by name which are to be considered binary
    or ASCII, respectively, regardless of content. Each of these
    variables may contain a list of wildcards. If a filename or just
    the tail of it (i.e., just the name + extension, leaving off the
    preceding path) matches one of the wildcards in the list, that file
    is considered to be of the specified type. If a filename matches
    both lists or if it matches neither list, the usual test based on
    file content will be made.

    Files that receive line end conversions are highlighted in the
    listings produced by tar in the ASCIICONVERT color for easy review.

    There is no limit on the overall length of an archive except
    whatever limit may be imposed by the filesystem if the archive is
    written to disk. The filesize limit for individual files within an
    archive is determined by the archive format: for tar archives, the
    limit is 8.4 million petabytes, essentially unlimited; for CPIO
    binary and new portable CPIO archives, the limit is 4G bytes; for
    CPIO ASCII archives, the limit is 8G bytes. (But when using tar for
    interchange with other systems, bear in mind that those other
    systems may impose their own smaller limits.)

    When adding files to an archive, timestamps outside the legal range
    for a tar archive (January 1, 1970 to February 7, 2106) will be
    truncated to these dates.

Tape Drives:

    The tarfile can be the tape device, specified by its special file
    name, \\.\tape0 (or \\.\tape1, \\.\tape2, etc., if you have more
    than one), or via the -# option.
    When reading/writing to a tape, tar rewinds the tape when it starts
    up and rewinds again and then ejects when it finishes unless -N is
    specified.

Basic Commands:

    -a          Add files to the end of the archive. If the archive is
                on a tape device, this operation may not be possible,
                depending on whether your drive supports repositioning
                and rewriting the last physical block on the tape. For
                example, it works with DAT drives but not with QIC
                drives. If -a does not work with your drive, you'll
                have to use -c instead.
    -c          Create a new archive, truncating any existing archive
                to zero bytes before writing to it.
    -C          Copy entire archive segments (including headers and
                any padding) to stdout. After the last segment, write
                a trailer to mark the end of the archive. (If you
                intend to concatenate archives, use the -Z option to
                suppress writing the trailer.)
    -M          Just build a mapfile for renaming files in the archive
                to NT conventions; don't extract anything.
    -t          List the contents of the archive. This is the default.
    -x          Extract files from the archive. Default is all files in
                the archive.
    -X          Extract everything EXCEPT the specified files from the
                archive.
    -y          Extract the specified files in the archive to stdout.
    -h          Help. (This screen.)

Basic Options:

    -#          Use the default tape device, \\.\tape0.
    -#n         Use the n-th tape device, where n is a single decimal
                digit. For example, -#1 means tar should use
                \\.\tape1.
    -!          Non-interactive. Files are renamed as necessary for NT
                conventions. (Particularly useful with -M when trying
                to read a new, large archive file.)
    -A          The Archive bit is reset for any files or directories
                copied to a TAR or CPIO archive file. (When extracting
                files, the -A option is ignored and the Archive bit is
                always set.)
    -B blksize  Use the specified blocksize when creating a new
                archive. Default is 10240 bytes if supported by the
                device. When reading or adding to an existing archive
                on tape, tar tries to determine and use whatever
                blocksize was used when the archive was created.
                How it does that depends on what release of Windows NT
                you're running and whether your drive supports variable
                blocksizes. If you're running NT 3.51 or later and
                variable blocksizes are supported, this option is
                ignored and the actual blocksize is determined directly
                using variable blocksize support. Otherwise, tar first
                tries this specified blocksize; if that doesn't work,
                it tries all the possible multiples of 512 bytes up to
                the maximum supported on your machine.
    -D          Dim. Don't insert ANSI escape sequences into the output
                to highlight anything.
    -F          FAT filesystem naming when extracting or building the
                map file.
    -Hon        Hardware compression on, if supported. (Default is to
                use the current setting for compression.)
    -Hoff       Hardware compression off.
    -j          New portable System V CPIO ASCII format.
    -J          New portable System V CPIO ASCII format with checksum.
    -L          Long listing similar to ls -L showing the attributes,
                timestamp and length of each file in the archive.
    -N          No rewind or eject. If the tarfile is on a tape device,
                tar normally rewinds the tape at the start and then
                rewinds and ejects at the end. This option turns that
                off.
    -p          CPIO format, using binary headers.
    -P          CPIO format, using ASCII headers.
    -q          Quiet. tar normally prints the header of each file as
                it's extracted (-x) or added (-a or -c) to the archive.
                This option turns that off.
    -r          CarriageReturn/NewLine expansion is turned off.
                (Default is normally to convert any \n characters not
                preceded by a \r in the archive to \r\n combinations
                under NT unless the file appears to be binary.)
    -R          CarriageReturn/NewLine expansion is forced ON, even for
                files that appear to be binary.
    -s          Read the archive from stdin when listing the table of
                contents or extracting. Write the archive to stdout
                when adding files. (Implies non-interactive.)
    -S          Stop if a file is encountered that cannot be extracted.
                Normally, a warning message is given but processing
                continues.
    -T          Total the sizes of all selected files.
    -v          Verbose.
                Like -L, but also show the offset of each file from the
                beginning of the archive and what archive format and
                bytesex was used. Also turns on warnings about line-end
                conversions being turned off on binary files.
    -V          Don't use variable block I/O even if the drive claims
                it supports it. Useful as a workaround if your drive's
                firmware has a bug.
    --          End of options.

Advanced Options:

    -b sex      Byte sex in the archive: abcd (little-endian), badc
                (big-endian), cdab or dcba. Default is to autosense
                bytesex in existing archives and to use abcd for new
                archives.
    -bL         Little-Endian bytesex. (An alias for -b abcd.)
    -bB         Big-Endian bytesex. (An alias for -b badc.)

                Note: To write an archive intended to be read on a
                RISC or Motorola-based UNIX machine, use -b badc or
                -bB (big-endian).
    -d dir      Default destination drive and directory when
                extracting files.
    -E endset   Offset at which to stop reading the archive file.
    -f          Fullpath option. Put the full pathname (minus any disk
                prefix) specified on the command line into the archive
                header when adding. (In this context, the full path
                means the full name given on the command line, not the
                fully-qualified name starting from the root directory.)
                When extracting, use the full pathname given in the
                header to determine where the files will go.
    -ff         Another variation on the fullpath option that will put
                the entire pathname, even including the drive letter,
                into the tar archive. The resulting name isn't really
                legal in a tar file, but it's useful for doing backups
                of several drives at once.
    -I include  Files to be added to or read from the archive are
                specified in the include file. If the name of the
                include file is given as "-", the names will be read
                from stdin. If more than one -I include file is given,
                the lists of names they hold will be concatenated, one
                after another. Any files specified on the command line
                will be added onto the end.
    -m map      Specific filename to be used for showing mappings from
                names in the archive to names used on NT.
                (If -M is specified, but -m isn't used to specify a
                name for the mapfile, the default is to paste a .map
                extension onto the name of the tar file; if -s is
                specified, i.e., the tar file doesn't have a name, no
                map file is used unless -m is given.)
    -O offset   Offset at which to start reading the archive file.
                Given in bytes from the beginning of the file.
    -Q          Very Quiet. tar normally warns of any garbled sections
                that it skipped; this turns off those warnings also.
    -w          Share all files being copied to the archive for
                read/write access by other processes. (Default is to
                do that only with files already open by another
                process.)
    -W          Warnings. Show just the files that can't be extracted
                to NT because of their file types. (Shown in the
                FOREIGNFILES color.)
    -Z          Suppress writing the trailer normally written following
                the last segment extracted from an archive with the -C
                option. (Useful for concatenating segments extracted
                from several separate archives.)

Examples:

    1.  To list the contents of a tar file on tape, showing the
        timestamps and sizes of the files:

            tar -L \\.\tape0

    2.  To extract everything on the tape into the current directory,
        again showing timestamps and sizes:

            tar -xL \\.\tape0

    3.  To copy all the *.c files in the current directory to a new
        tar tape, overwriting anything that may already be on the
        tape, again showing timestamps and sizes:

            tar -cL \\.\tape0 *.c

    4.  Same as (3), but write it in big-endian format, suitable for a
        UNIX RISC machine:

            tar -cLbB \\.\tape0 *.c

    5.  Same as (3), but adding files to an existing archive on the
        tape rather than overwriting it:

            tar -aL \\.\tape0 *.c

        Note: Adding to an archive on tape isn't supported by all
        types of tape drives. See the comments regarding the -a
        operation above.

    6.  Extract everything on a tar-format floppy into the current
        directory:

            dskread a: | tar -xsL

    7.
        Write all the *.c files in the current directory to a
        tar-format floppy in big-endian format, verifying each write
        operation along the way:

            tar -csbB *.c | dskwrite -vx a:

TAR Format:

    Tar files are organized as a series of 512-byte blocks. Individual
    files always start on a block boundary with a header block
    followed by the uncompressed data in the file. At the end of the
    file are two blocks filled with binary zeros.

    The header has the following format, packed with individual fields
    byte-aligned:

        typedef struct {
            char name[100],
                 mode[8],
                 userid[8],
                 groupid[8],
                 filesize[12],
                 timestamp[12],
                 checksum[8],
                 linkflag,
                 linkname[100];
            union {
                char unused_chars[255];
                struct {
                    char magic[6],
                         version[2],
                         username[32],
                         groupname[32],
                         devmajor[8],
                         devminor[8],
                         prefix[155];
                    } ustar;
                } u;
            } tar_header;

    Traditionally, everything in a tar header is in ASCII with nulls
    and spaces to punctuate the fields and numbers are always in
    octal. But eleven octal digits (plus a space) in the filesize field
    would only allow a maximum value of 8.59 GB, which is certainly
    smaller than may be supported on many modern systems, including
    Windows. Thus, a popular extension supported by this tar is to
    interpret numeric fields as binary if the high bit is set in the
    first character.

    The mode, user and group ids aren't meaningful on NT and are
    ignored when extracting and just filled in with read/write for
    owner, owned by root when adding. The timestamp is in seconds since
    Jan 1 00:00:00 GMT 1970. The checksum is calculated as if that
    field contained spaces.

    The linkflag tells the file type, reported in the long listing as
    one of the following:

        -   Normal File
        D   Directory
        L   Link (not a separate file, just another name for one
            that already exists)
        S   Symbolic Link
        C   Character Device
        B   Block Device
        F   FIFO

    Under NT, only the normal files and directories have any meaning.
    Directories are normally highlighted. The other file types are
    normally reported in bright red but otherwise ignored.
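    The checksum rule and the binary numeric extension described above
    can be sketched in C as follows. This is only an illustration, not
    the code used by this tar. The function names and the offset
    constants are hypothetical; the offsets are derived by adding up
    the field sizes in the tar_header layout shown earlier. Summing
    the bytes as unsigned values and reading the binary form as
    big-endian are assumptions based on common tar practice.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical constants derived from the tar_header field sizes:
   name[100] + mode[8] + userid[8] + groupid[8] + filesize[12] +
   timestamp[12] = 148, so the checksum field starts at byte 148. */
enum { TAR_BLOCK = 512, CHKSUM_OFF = 148, CHKSUM_LEN = 8 };

/* Sum all 512 header bytes, counting each byte of the checksum field
   as if it held a space. (Assumption: bytes are summed as unsigned.) */
unsigned tar_checksum(const unsigned char *block)
{
    unsigned sum = 0;

    assert(block != NULL);
    for (size_t i = 0; i < TAR_BLOCK; i++) {
        if (i >= CHKSUM_OFF && i < CHKSUM_OFF + CHKSUM_LEN)
            sum += ' ';
        else
            sum += block[i];
    }
    return sum;
}

/* Decode a numeric header field. By default the field is octal ASCII,
   possibly with leading spaces and ended by a null or space. If the
   high bit of the first byte is set, the field is read as binary.
   (Assumption: big-endian byte order; values wider than 64 bits wrap.) */
uint64_t tar_number(const unsigned char *field, size_t len)
{
    uint64_t value = 0;
    size_t i = 0;

    assert(field != NULL && len > 0);
    if (field[0] & 0x80) {
        value = field[0] & 0x7f;            /* drop the flag bit */
        for (i = 1; i < len; i++)
            value = (value << 8) | field[i];
        return value;
    }
    while (i < len && field[i] == ' ')      /* skip leading spaces */
        i++;
    for (; i < len && field[i] >= '0' && field[i] <= '7'; i++)
        value = (value << 3) | (uint64_t)(field[i] - '0');
    return value;
}
```

    A reader would typically call tar_checksum on each 512-byte header
    block and compare the result against tar_number applied to the
    8-byte checksum field, then use tar_number on the 12-byte filesize
    field to find out how many data blocks follow.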
    The last 255 bytes may contain either all binary zeros or the new
    "USTAR" trailer, used when the filename is longer than 100
    characters. In USTAR format, the magic field contains the
    null-terminated string "ustar", the version is "00" (without a
    null) and, if the prefix field is not null, the actual pathname is
    formed by concatenating the prefix + a slash + the name. If the
    prefix is null, the name field is used alone. When writing USTAR
    format, the username and groupname are null, the devmajor is 0 and
    devminor is 1. When reading USTAR format, all the fields except
    the prefix are ignored.

    If the filename is too long even in USTAR format, tar will use the
    GNU extension convention of writing a special prefix consisting of
    a header marked with a special linkflag indicating that the data
    which follows is the full name of the next file in the archive.

CPIO Format:

    If -p is specified, tar will read and write CPIO format files,
    using binary headers of the following format:

        typedef struct {
            short  magic,       /* Always 0x71c7 == Octal 070707 */
                   dev;         /* Device containing directory entry
                                   for this file. */
            ushort inode,       /* UNIX inode number. */
                   mode,
                   userid,
                   groupid,
                   nlink,
                   rdev;        /* Device ID for special files. */
            ulong  timestamp;
            ushort namelen;     /* including trailing null. */
            ulong  filesize;
            char   name[ namelen rounded to word ];
            } cpio_header;

    The dev, inode, mode, userid, groupid, nlink and rdev fields are
    not meaningful on NT and are ignored when extracting and filled in
    with 1, 1, read/write by owner, 0, 0, 1 and 0, respectively, when
    writing.

    If -P is specified, tar will read and write CPIO format files
    using the alternate ASCII format headers, where each ushort is
    written as a 6-character octal number, each ulong as an
    11-character octal number, and name is null-terminated.

    In a CPIO file, data immediately follows the header and is not
    padded to a block boundary.
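    As a rough sketch of the ASCII header layout just described, the
    C function below formats the fixed part of a header, following the
    field order of cpio_header with 6 octal digits per short or ushort
    and 11 per ulong. The function name and the constant are
    hypothetical, and the mode value 0100600 (regular file, read/write
    by owner) is an assumption about how "read/write by owner" is
    encoded; the other fill-in values are the ones listed above. The
    name and its trailing null would be written immediately after
    these 76 characters.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* 8 fields of 6 octal digits + two of 11 + one more of 6 = 76 chars. */
enum { CPIO_ASCII_HDR = 76 };

/* Format the fixed part of a CPIO ASCII header into out. Returns 76 on
   success, or -1 if the buffer is too small or a value needs more
   digits than its field allows. */
int cpio_ascii_header(char *out, size_t outsize, const char *name,
                      unsigned long timestamp, unsigned long filesize)
{
    unsigned namelen;
    int n;

    assert(out != NULL && name != NULL);
    namelen = (unsigned)strlen(name) + 1;   /* includes trailing null */
    n = snprintf(out, outsize,
                 "%06o%06o%06o%06o%06o%06o%06o%06o%011lo%06o%011lo",
                 070707u,    /* magic */
                 1u,         /* dev */
                 1u,         /* inode */
                 0100600u,   /* mode (assumed encoding, see above) */
                 0u,         /* userid */
                 0u,         /* groupid */
                 1u,         /* nlink */
                 0u,         /* rdev */
                 timestamp, namelen, filesize);
    return (n == CPIO_ASCII_HDR && (size_t)n < outsize) ? n : -1;
}
```

    The 11-digit octal filesize field is what gives the 8G byte limit
    for CPIO ASCII archives; a larger value would need a twelfth digit
    and the sketch reports that as an error.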
Portable CPIO ASCII Format:

    If -j or -J is specified, tar will read and write archives using
    headers defined by the portable CPIO ASCII format introduced with
    UNIX System V:

        typedef char long_hex[8];

        typedef struct new_cpio_ascii_header_str {
            char     magic[6];
            long_hex inode,
                     mode,
                     userid,
                     groupid,
                     nlink,
                     timestamp,
                     filesize,
                     dev_major,
                     dev_minor,
                     rdev_major,
                     rdev_minor,
                     namelen,   /* including trailing null. */
                     checksum;
            char     name[ namelen+1 ];
            } new_cpio_ascii_header;

    The magic field always contains either "070701" (normal) or
    "070702" (checksum variation). The inode, mode, userid, groupid,
    nlink, dev and rdev fields are not meaningful on NT and are ignored
    when extracting and filled in with 1, read/write by owner, 0, 0, 1,
    1, and 0, respectively, when writing.

    The header, including the filename, is padded to the next DWORD
    (4-byte) boundary. The data section immediately follows and is
    also padded to the next DWORD boundary.

    The difference between the -j and -J options is that if -J is
    specified, the checksum field is filled in with a simple 32-bit
    sum of all the bytes in the file, each taken as an unsigned 8-bit
    value.

Colors:

    You may set your own choices for screen colors using these
    environment variables:

        Name            Use                             Default

        ASCIICONVERT    ASCII files receiving line      Bright Yellow
                        end conversion
        COLORS          Normal screen colors            <null string>
        DIRECTORIES     Directories                     Bright
        FOREIGNFILES    Filetypes not supported by NT   Bright Red
        READONLYDIRS    Directories marked read-only    same as DIRECTORIES
        READONLYFILES   Files marked read-only          same as COLORS

    Colors recognized are black, red, green, yellow, blue, magenta (or
    red blue), cyan (or blue green) or white. Foreground and background
    colors may also be bright, dim or reverse. The names of the colors
    and the words bright, dim, reverse and on may be in either upper
    or lower or mixed case.
    Either or both the foreground and background colors may be
    specified; if you don't specify a value, it's considered
    transparent and inherits the color underneath it. DIRECTORIES,
    FOREIGNFILES and ASCIICONVERT inherit from COLORS.

    If COLORS is null, tar uses the current screen colors it finds at
    startup. Specifying COLORS=none turns off all use of color.

    If the -D (dim) option is specified, all highlighting is turned
    off, regardless of the settings for these environment variables.
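
    Returning to the Portable CPIO ASCII Format section: the DWORD
    padding and the checksum it describes can be sketched in C as
    below. All three function names are hypothetical. The figure of
    110 bytes for the fixed part of the header comes from adding up
    the layout shown in that section (a 6-byte magic plus thirteen
    8-character fields); namelen is taken to include the trailing
    null, as the comment in the layout states.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Number of padding bytes needed to bring len up to the next DWORD
   (4-byte) boundary; zero if len is already aligned. */
size_t cpio_pad4(size_t len)
{
    return (4 - (len & 3)) & 3;
}

/* Size of a new portable CPIO header including the filename and its
   padding. The fixed part is 6 + 13 * 8 = 110 bytes. */
size_t cpio_newc_header_size(size_t namelen)
{
    size_t n = 110 + namelen;
    return n + cpio_pad4(n);
}

/* Simple 32-bit sum of all the bytes in the file data, each taken as
   an unsigned 8-bit value. The sum wraps modulo 2^32. */
uint32_t cpio_data_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;

    assert(data != NULL || len == 0);
    for (size_t i = 0; i < len; i++)
        sum += data[i];
    return sum;
}
```

    For a file named "a.c", namelen is 4, so the header occupies 114
    bytes plus 2 bytes of padding; the file data then starts at offset
    116 from the beginning of the header and is itself followed by
    cpio_pad4(filesize) bytes of padding.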