xz is a general-purpose data compression tool with syntax similar to the older and more popular gzip and bzip2. It typically achieves a better compression ratio than either.
We are going to cover 13 examples of xz here, showing you common tasks that can be completed and just how easy it is to use.
Requirements
Before starting you will need the xz package installed. It is usually installed by default, but you can install it now if required.
RHEL:
yum install xz
Debian:
apt-get install xz-utils
Example XZ Commands
-
1. Compress a single file
This will compress file.txt and create file.txt.xz. Note that this will remove the original file.txt file.
xz file.txt
-
2. Compress multiple files at once
This will compress all files specified in the command. Note again that this will remove the original files, turning file1.txt, file2.txt and file3.txt into file1.txt.xz, file2.txt.xz and file3.txt.xz.
xz file1.txt file2.txt file3.txt
To instead compress all files within a directory, see example 10 below.
-
3. Compress a single file and keep the original
You can instead keep the original file and create a compressed copy with the -k or --keep flag.
xz -k file.txt
This can also be done with the -c flag as below.
xz -c file.txt > file.txt.xz
The -c flag writes the compressed copy of file.txt to stdout, which is then redirected to file.txt.xz, keeping the original file.txt in place.
-
4. Decompress an xz compressed file
To reverse the compression process and get the original file back, you can use the xz command itself or unxz, which is also part of the xz package.
xz -d file.txt.xz
OR
unxz file.txt.xz
Both of these commands will produce the same result, decompressing file.txt.xz to file.txt, removing the compressed file.txt.xz file.
Similar to example 3, it is possible to decompress a file and keep the original .xz file as below.
unxz -k file.txt.xz
-
5. List compression information
With the -l or --list flag we can see useful information about a compressed .xz file, such as its compressed and uncompressed size and the compression ratio, which shows how much space the compression is saving.
[root@centos ~]# xz -l linux-3.18.19.tar.xz
Strms  Blocks   Compressed Uncompressed  Ratio  Check   Filename
    1       1     79.7 MiB    553.9 MiB  0.144  CRC64   linux-3.18.19.tar.xz
In this example, an xz copy of the Linux kernel has compressed to 14.4% of its original size, taking up 79.7 MiB of space rather than 553.9 MiB.
-
6. Verbose information
The -v or --verbose flag can be specified to show progress information for a running operation. In the example below we are compressing the Linux kernel with xz -v, which shows the current percentage, the compressed and uncompressed amounts processed so far, the compression ratio, the speed in MiB/s, the elapsed time and the estimated time until completion. This is very useful when compressing a large file.
[root@centos test]# time xz -v linux-3.18.19.tar
linux-3.18.19.tar (1/1)
  21.4 %       18.6 MiB / 118.6 MiB = 0.157   1.3 MiB/s       1:28   5 min 30 s
The percentage and estimated time remaining are only shown for files that have a known size; they cannot be shown if you are piping content into xz.
-
7. Adjust compression level
The level of compression applied to a file using xz can be specified as a value between 0 (less compression) and 9 (best compression). Using option 0 will complete faster, but the space saved will not be optimal. Using option 9 will take longer to complete, however you will have the largest amount of space saved.
The example below compares -0 and -9. As shown, while -0 finishes much faster, its ratio is around 6 percentage points worse (approximately 35 MiB more space required).
[root@centos ~]# time xz -0v linux-3.18.19.tar
linux-3.18.19.tar (1/1)
  100 %       112.1 MiB / 553.9 MiB = 0.202    12 MiB/s       0:44

real    0m44.533s
user    0m44.084s
sys     0m0.420s

[root@centos ~]# time xz -9v linux-3.18.19.tar
linux-3.18.19.tar (1/1)
  100 %        77.0 MiB / 553.9 MiB = 0.139   1.1 MiB/s       8:20

real    8m20.859s
user    8m20.007s
sys     0m0.778s
-0 can also be specified with the flag --fast, while -9 can also be specified with the flag --best. These are provided for backwards compatibility with older tools and should be avoided. By default xz uses a compression level of -6, which is slightly biased towards higher compression at the expense of speed. When selecting a value between 0 and 9, consider what is more important to you: the amount of space saved or the amount of time spent compressing. The default -6 option provides a fair trade-off.
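To get a feel for the trade-off on your own data, the comparison above can be repeated on a small scale. The following is a minimal sketch, not a benchmark: sample.txt is a hypothetical file generated only for illustration, and the sizes you see will depend on your data and xz version.

```shell
#!/bin/sh
# Sketch: compare xz levels 0, 6 and 9 on a generated sample file.
# sample.txt and the sample.N.xz names are made up for this example.
set -eu
workdir=$(mktemp -d)
cd "$workdir"

# Build a little over 1 MB of compressible text.
seq 1 200000 > sample.txt

for level in 0 6 9; do
    # -c writes to stdout, so sample.txt is left untouched.
    xz -"$level" -c sample.txt > "sample.$level.xz"
    printf 'level %s: %s bytes\n' "$level" "$(wc -c < "sample.$level.xz")"
done
```

Prefixing the xz command with time, as in the examples above, shows how long each level takes as well.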
-
8. Extreme mode
The -e or --extreme flag can be specified to use more CPU when encoding to increase the compression ratio, however this will take more time. This uses a slower variant of the selected compression level (-0 … -9) to improve the ratio, however it can be possible for this to make the ratio worse depending on the data.
Below we will compare the difference between compressing the Linux kernel with a compression level of 0 with and without extreme.
[root@centos ~]# time xz -0v linux-3.18.19.tar
linux-3.18.19.tar (1/1)
  100 %       112.1 MiB / 553.9 MiB = 0.202    12 MiB/s       0:44

real    0m44.533s
user    0m44.084s
sys     0m0.420s

[root@centos ~]# time xz -0ve linux-3.18.19.tar
linux-3.18.19.tar (1/1)
  100 %        89.9 MiB / 553.9 MiB = 0.162   1.4 MiB/s       6:27

real    6m27.445s
user    6m26.298s
sys     0m1.136s
As shown, this particular example took a lot longer to complete and improved the ratio by only 4 percentage points. For further comparison, below we have the same file compressed with the default -6 option. This took slightly less time than compression level 0 with extreme, yet compresses even better.
[root@centos ~]# time xz -6v linux-3.18.19.tar
linux-3.18.19.tar (1/1)
  100 %        79.7 MiB / 553.9 MiB = 0.144   1.4 MiB/s       6:23

real    6m23.128s
user    6m22.491s
sys     0m0.610s
While -e will take longer to complete, it can be used to save extra space if required. Below is the result of compression level 6 with extreme specified. It takes even longer and only slightly improves the compression ratio.
[root@centos ~]# time xz -6ve linux-3.18.19.tar
linux-3.18.19.tar (1/1)
  100 %        79.2 MiB / 553.9 MiB = 0.143   1.0 MiB/s       8:51

real    8m52.038s
user    8m51.373s
sys     0m0.627s
-
9. Increase xz performance with threads
In the version tested here, xz runs as a single thread by default. This means that if you have 4 CPU cores available it will only use the resources of 1 CPU core. By using the -T or --threads flag you can specify the number of worker threads to use, potentially increasing xz performance. The number of threads can be set specifically, or the special value of 0 can be used, which will use as many threads as there are CPU cores available on the system.
The example below compares running xz with a compression level of 6 and the extreme flag set, first as the default single thread and then with the special value 0 specified.
[root@centos ~]# time xz -6ve linux-3.18.19.tar
linux-3.18.19.tar (1/1)
  100 %        79.2 MiB / 553.9 MiB = 0.143   1.0 MiB/s       8:51

real    8m52.038s
user    8m51.373s
sys     0m0.627s

[root@centos ~]# time xz -6ve --threads=0 linux-3.18.19.tar
linux-3.18.19.tar (1/1)
  100 %        80.2 MiB / 553.9 MiB = 0.145   4.0 MiB/s       2:16

real    2m16.994s
user    8m43.762s
sys     0m0.488s
The test server has 4 CPU cores, so the work completes almost 4 times faster: 2 minutes and 16 seconds compared to 8 minutes and 51 seconds. This is a significant performance improvement, although the resulting compressed file is slightly larger when run with multiple threads. The number of threads could also be specified as --threads=4 on this server, which would have the same result, or it could be set to some other number to use a specific amount of CPU resources.
-
10. Compress a directory
With the help of the tar command, we can create a tar file of a whole directory and xz the result. We can perform the whole lot in one step, as the tar command allows us to specify a compression method to use.
tar cJvf etc.tar.xz /etc/
This example creates a compressed etc.tar.xz file of the entire /etc/ directory. The tar flags are as follows, ‘c’ creates a new tar archive, ‘J’ specifies that we want to compress with xz, ‘v’ provides verbose information, and ‘f’ specifies the file to create. The resulting etc.tar.xz file contains all files within /etc/ compressed using xz. Note that there is a difference between the lower case j (which will use bzip2) and the upper case J (which will use xz).
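The same J flag works when listing or extracting an archive. The sketch below uses a small throwaway directory rather than /etc/; the demo and restore names are hypothetical. It also shows piping tar into xz, which is one way to choose a compression level other than the xz default, since tar runs xz with its default settings unless told otherwise.

```shell
#!/bin/sh
# Sketch: create, list and extract a .tar.xz archive.
# The demo/ and restore/ directories are made up for this example.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p demo/sub
echo "alpha" > demo/a.txt
echo "beta" > demo/sub/b.txt

# c = create, J = compress with xz, f = archive file name.
tar cJf demo.tar.xz demo

# t = list the archive contents without extracting anything.
tar tJf demo.tar.xz

# x = extract; -C selects the directory to extract into.
mkdir restore
tar xJf demo.tar.xz -C restore

# Alternative: write the tar stream to stdout and compress it with xz -9.
tar cf - demo | xz -9 -c > demo9.tar.xz
```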
-
11. Integrity test
The -t or --test flag can be used to check the integrity of a compressed file.
For an intact file, no output is returned. You can also add the -v option for some verbosity, which will show the percentage of the check.
[root@centos ~]# xz -tv linux-3.18.19.tar.xz
linux-3.18.19.tar.xz (1/1)
  100 %        79.7 MiB / 553.9 MiB = 0.144    55 MiB/s       0:10
I have now manually modified this file with a text editor and added a random value, essentially introducing corruption and it is now no longer valid.
[root@centos ~]# xz -t linux-3.18.19.tar.xz
xz: linux-3.18.19.tar.xz: Compressed data is corrupt
The compressed .xz file makes use of a cyclic redundancy check (CRC) in order to detect errors. The CRC value can be viewed by running xz with the -l flag and the -v flag given twice, as shown below.
[root@centos ~]# xz -lvv linux-3.18.19.tar.xz
linux-3.18.19.tar.xz (1/1)
  Streams:            1
  Blocks:             1
  Compressed size:    79.7 MiB (83,592,736 B)
  Uncompressed size:  553.9 MiB (580,761,600 B)
  Ratio:              0.144
  Check:              CRC64
  Stream padding:     0 B
  Streams:
    Stream    Blocks      CompOffset    UncompOffset        CompSize      UncompSize  Ratio  Check      Padding
         1         1               0               0      83,592,736     580,761,600  0.144  CRC64            0
  Blocks:
    Stream     Block      CompOffset    UncompOffset       TotalSize      UncompSize  Ratio  Check      CheckVal          Header  Flags        CompSize    MemUsage  Filters
         1         1              12               0      83,592,696     580,761,600  0.144  CRC64      09c77823ecc52290      12  --         83,592,675       9 MiB  --lzma2=dict=8MiB
  Memory needed:      9 MiB
  Sizes in headers:   No
  Minimum XZ Utils version: 5.0.0
By default xz makes use of CRC64. This can be changed when compressing with the -C or --check flag.
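Note that xz -t only verifies the compressed file against the check stored inside it; it has no knowledge of the original file. If you still have the original and want to confirm that the two match, for example after a long job that may have been interrupted, you can decompress to stdout and compare. The following is a rough sketch under that assumption; data.txt and truncated.xz are hypothetical names, and the truncated copy is created deliberately to simulate an interrupted job.

```shell
#!/bin/sh
# Sketch: pick a different check type, test integrity from a script,
# and compare the compressed copy against the original.
# All file names here are made up for this example.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
seq 1 50000 > data.txt

# -C sha256 stores a SHA-256 check instead of the default CRC64;
# -k keeps data.txt so it can be compared afterwards.
xz -k -C sha256 data.txt

# xz -t exits with status 0 when the file passes its internal check.
if xz -t data.txt.xz; then
    echo "integrity check passed"
fi

# Decompress to stdout and compare byte for byte with the original.
if xz -dc data.txt.xz | cmp -s - data.txt; then
    echo "contents match the original"
fi

# Keep only the first 100 bytes to simulate a truncated file.
head -c 100 data.txt.xz > truncated.xz
if ! xz -t truncated.xz 2>/dev/null; then
    echo "truncated file rejected"
fi
```

Running xz -l on data.txt.xz afterwards should list the check type in the Check column.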
-
12. Concatenate multiple files
Multiple files can be concatenated into a single .xz file.
xz -c file1.txt > files.xz
xz -c file2.txt >> files.xz
The files.xz file now contains the contents of both file1.txt and file2.txt. If you decompress files.xz you will get a file named ‘files’ which contains the content of both .txt files. The output is similar to running ‘cat file1.txt file2.txt’. If instead you want to create a single file that contains multiple files you can use the tar command which supports xz compression, as covered above in example 10.
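The behaviour described above can be checked with a short script. This is a minimal sketch using two small throwaway files; expected.txt is a hypothetical helper file used only for the comparison.

```shell
#!/bin/sh
# Sketch: append two compressed streams into one .xz file and confirm
# that decompressing it gives the same bytes as cat on the originals.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
echo "first file" > file1.txt
echo "second file" > file2.txt

# -c writes to stdout; >> appends the second stream to files.xz.
xz -c file1.txt > files.xz
xz -c file2.txt >> files.xz

# Decompress to stdout; both files' contents appear in order.
xz -dc files.xz

# Compare against plain concatenation of the originals.
cat file1.txt file2.txt > expected.txt
if xz -dc files.xz | cmp -s - expected.txt; then
    echo "identical to cat file1.txt file2.txt"
fi
```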
-
13. Additional commands included with xz
The xz package provides some very useful commands for working with compressed files, such as xzcat, xzgrep and xzless/xzmore.
As you can probably tell by the names of the commands, these are essentially the cat, grep, and less/more commands, however they work directly on compressed data. This means that you can easily view or search the contents of a compressed file without having to decompress it and then view or search it in a second step.
[root@centos test]# xzcat test.txt.xz
test
example
text
[root@centos test]# xzgrep exa test.txt.xz
example
This is especially useful when searching through or reviewing log files which have been compressed during log rotation.
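As a rough illustration of the log rotation case, the sketch below compresses a tiny made-up log and searches it in place. The file name app.log.1 and its contents are invented for this example, and it assumes xzgrep and xzcat are installed alongside xz.

```shell
#!/bin/sh
# Sketch: search a compressed log without decompressing it to disk.
# app.log.1 and its contents are hypothetical.
set -eu
workdir=$(mktemp -d)
cd "$workdir"
printf '%s\n' \
    'Jan 01 10:00:01 host sshd: Accepted password for alice' \
    'Jan 01 10:05:12 host sshd: Failed password for bob' \
    'Jan 01 10:06:40 host cron: job started' > app.log.1

# Compress the log as a rotation job might; this creates app.log.1.xz.
xz app.log.1

# Print matching lines straight from the compressed file.
xzgrep 'Failed password' app.log.1.xz

# Stream the whole log through another tool, here counting its lines.
xzcat app.log.1.xz | wc -l
```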
Summary
As shown the xz package can be used in a number of helpful ways to compress data and save disk space. For further information on xz you can refer to the xz man page by typing ‘man xz’ at the command line, or leave a comment below!
Well written, crystal clear presentation. Thanks!
No problem!
thanks for the information
great list of examples!
For your #10 above, you should state that xz uses -6 in this case but if you want higher compression to use:
tar cvf - DIRECTORY | xz -9 -c - > FILE.tar.xz
Thanks for this guide.
My compression job was slow and took many hours, and I’m afraid that along the way the process was interrupted and the file may have been truncated.
I wanted to check that the compressed file matches the original file.
Will the integrity test work for this? Or do I need to extract the file and check the md5/crc of the original and decompressed file match?
Is there an integrity test that takes the original file as an input argument?
Thanks
Thanks a lot for this very helpful guide!