OLD | NEW |
1 | 1 |
2 XZ Utils FAQ | 2 XZ Utils FAQ |
3 ============ | 3 ============ |
4 | 4 |
5 Q: What do the letters XZ mean? | 5 Q: What do the letters XZ mean? |
6 | 6 |
7 A: Nothing. They are just two letters, which come from the file format | 7 A: Nothing. They are just two letters, which come from the file format |
8 suffix .xz. The .xz suffix was selected, because it seemed to be | 8 suffix .xz. The .xz suffix was selected, because it seemed to be |
9 pretty much unused. It has no deeper meaning. | 9 pretty much unused. It has no deeper meaning. |
10 | 10 |
11 | 11 |
12 Q: What are LZMA and LZMA2? | 12 Q: What are LZMA and LZMA2? |
13 | 13 |
14 A: LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name | 14 A: LZMA stands for Lempel-Ziv-Markov chain-Algorithm. It is the name |
15 of the compression algorithm designed by Igor Pavlov for 7-Zip. | 15 of the compression algorithm designed by Igor Pavlov for 7-Zip. |
16 LZMA is based on LZ77 and range encoding. | 16 LZMA is based on LZ77 and range encoding. |
17 | 17 |
18 LZMA2 is an updated version of the original LZMA to fix a couple of | 18 LZMA2 is an updated version of the original LZMA to fix a couple of |
19 practical issues. In context of XZ Utils, LZMA is called LZMA1 to | 19 practical issues. In context of XZ Utils, LZMA is called LZMA1 to |
20 emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the | 20 emphasize that LZMA is not the same thing as LZMA2. LZMA2 is the |
21 primary compression algorithm in the .xz file format. | 21 primary compression algorithm in the .xz file format. |
22 | 22 |
23 | 23 |
24 Q: There are many LZMA related projects. How does XZ Utils relate to them? | 24 Q: There are many LZMA related projects. How does XZ Utils relate to them? |
25 | 25 |
26 A: 7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly | 26 A: 7-Zip and LZMA SDK are the original projects. LZMA SDK is roughly |
27 a subset of the 7-Zip source tree. | 27 a subset of the 7-Zip source tree. |
28 | 28 |
29 p7zip is 7-Zip's command line tools ported to POSIX-like systems. | 29 p7zip is 7-Zip's command-line tools ported to POSIX-like systems. |
30 | 30 |
31 LZMA Utils provide a gzip-like lzma tool for POSIX-like systems. | 31 LZMA Utils provide a gzip-like lzma tool for POSIX-like systems. |
32 LZMA Utils are based on LZMA SDK. XZ Utils are the successor to | 32 LZMA Utils are based on LZMA SDK. XZ Utils are the successor to |
33 LZMA Utils. | 33 LZMA Utils. |
34 | 34 |
35 There are several other projects using LZMA. Most are more or less | 35 There are several other projects using LZMA. Most are more or less |
36 based on LZMA SDK. See <http://7-zip.org/links.html>. | 36 based on LZMA SDK. See <http://7-zip.org/links.html>. |
37 | 37 |
38 | 38 |
39 Q: Why is liblzma named liblzma if its primary file format is .xz? | 39 Q: Why is liblzma named liblzma if its primary file format is .xz? |
40 Shouldn't it be e.g. libxz? | 40 Shouldn't it be e.g. libxz? |
41 | 41 |
42 A: When the designing of the .xz format began, the idea was to replace | 42 A: When the designing of the .xz format began, the idea was to replace |
43 the .lzma format and use the same .lzma suffix. It would have been | 43 the .lzma format and use the same .lzma suffix. It would have been |
44 quite OK to reuse the suffix when there were very few .lzma files | 44 quite OK to reuse the suffix when there were very few .lzma files |
45 around. However, the old .lzma format become popular before the | 45 around. However, the old .lzma format became popular before the |
46 new format was finished. The new format was renamed to .xz but the | 46 new format was finished. The new format was renamed to .xz but the |
47 name of liblzma wasn't changed. | 47 name of liblzma wasn't changed. |
48 | 48 |
49 | 49 |
50 Q: Do XZ Utils support the .7z format? | 50 Q: Do XZ Utils support the .7z format? |
51 | 51 |
52 A: No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z | 52 A: No. Use 7-Zip (Windows) or p7zip (POSIX-like systems) to handle .7z |
53 files. | 53 files. |
54 | 54 |
55 | 55 |
(...skipping 10 matching lines...)
66 | 66 |
67 Q: I have many .lzma files. Can I quickly convert them to the .xz format? | 67 Q: I have many .lzma files. Can I quickly convert them to the .xz format? |
68 | 68 |
69 A: For now, no. Since XZ Utils supports the .lzma format, it's usually | 69 A: For now, no. Since XZ Utils supports the .lzma format, it's usually |
70 not too bad to keep the old files in the old format. If you want to | 70 not too bad to keep the old files in the old format. If you want to |
71 do the conversion anyway, you need to decompress the .lzma files and | 71 do the conversion anyway, you need to decompress the .lzma files and |
72 then recompress to the .xz format. | 72 then recompress to the .xz format. |
73 | 73 |
74 Technically, there is a way to make the conversion relatively fast | 74 Technically, there is a way to make the conversion relatively fast |
75 (roughly twice the time that normal decompression takes). Writing | 75 (roughly twice the time that normal decompression takes). Writing |
76 such a tool would take quite a bit time though, and would probably | 76 such a tool would take quite a bit of time though, and would probably |
77 be useful to only a few people. If you really want such a conversion | 77 be useful to only a few people. If you really want such a conversion |
78 tool, contact Lasse Collin and offer some money. | 78 tool, contact Lasse Collin and offer some money. |
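
The decompress-then-recompress route described in the answer above can be sketched with Python's standard lzma module. This is only an illustration, not a tool shipped with XZ Utils: the function name lzma_to_xz and the default preset are assumptions made for the example, and it is the slow path (full decode plus full encode), not the hypothetical fast converter.

```python
import lzma


def lzma_to_xz(lzma_bytes, preset=6):
    """Convert legacy .lzma data to .xz by decoding and re-encoding.

    Hypothetical helper for illustration; holds the whole payload in memory.
    """
    # FORMAT_ALONE is the Python name for the legacy .lzma container.
    raw = lzma.decompress(lzma_bytes, format=lzma.FORMAT_ALONE)
    # Re-encode the uncompressed data into the .xz container (LZMA2).
    return lzma.compress(raw, format=lzma.FORMAT_XZ, preset=preset)
```

For large files, a streaming variant built on lzma.open() would avoid keeping the uncompressed data in memory.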
79 | 79 |
80 | 80 |
81 Q: I have installed xz, but my tar doesn't recognize .tar.xz files. | 81 Q: I have installed xz, but my tar doesn't recognize .tar.xz files. |
82 How can I extract .tar.xz files? | 82 How can I extract .tar.xz files? |
83 | 83 |
84 A: xz -dc foo.tar.xz | tar xf - | 84 A: xz -dc foo.tar.xz | tar xf - |
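
Where neither tar nor xz handles the format, the same pipe idea (decompress first, then feed the result to a tar reader) can be approximated with Python's standard library. This is a sketch under assumptions: the function name read_tar_xz is made up for the example, and it returns file contents in memory instead of writing them to disk.

```python
import io
import lzma
import tarfile


def read_tar_xz(data):
    """Return {member name: content} for regular files in .tar.xz bytes.

    Hypothetical helper; mirrors `xz -dc foo.tar.xz | tar xf -` by
    decompressing a stream and handing it to a sequential tar reader.
    """
    contents = {}
    with lzma.open(io.BytesIO(data)) as xz_stream:
        # Mode "r|" reads the archive as a non-seekable stream, like a pipe.
        with tarfile.open(fileobj=xz_stream, mode="r|") as tar:
            for member in tar:
                if member.isfile():
                    contents[member.name] = tar.extractfile(member).read()
    return contents
```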
85 | 85 |
86 | 86 |
87 Q: Can I recover parts of a broken .xz file (e.g. corrupted CD-R)? | 87 Q: Can I recover parts of a broken .xz file (e.g. a corrupted CD-R)? |
88 | 88 |
89 A: It may be possible if the file consists of multiple blocks, which | 89 A: It may be possible if the file consists of multiple blocks, which |
90 typically is not the case if the file was created in single-threaded | 90 typically is not the case if the file was created in single-threaded |
91 mode. There is no recovery program yet. | 91 mode. There is no recovery program yet. |
92 | 92 |
93 | 93 |
94 Q: Is (some part of) XZ Utils patented? | 94 Q: Is (some part of) XZ Utils patented? |
95 | 95 |
96 A: Lasse Collin is not aware of any patents that could affect XZ Utils. | 96 A: Lasse Collin is not aware of any patents that could affect XZ Utils. |
97 However, due to nature of software patents, it's not possible to | 97 However, due to the nature of software patents, it's not possible to |
98 guarantee that XZ Utils isn't affected by any third party patent(s). | 98 guarantee that XZ Utils isn't affected by any third party patent(s). |
99 | 99 |
100 | 100 |
101 Q: Where can I find documentation about the file format and algorithms? | 101 Q: Where can I find documentation about the file format and algorithms? |
102 | 102 |
103 A: The .xz format is documented in xz-file-format.txt. It is a container | 103 A: The .xz format is documented in xz-file-format.txt. It is a container |
104 format only, and doesn't include descriptions of any non-trivial | 104 format only, and doesn't include descriptions of any non-trivial |
105 filters. | 105 filters. |
106 | 106 |
107 Documenting LZMA and LZMA2 is planned, but for now, there is no other | 107 Documenting LZMA and LZMA2 is planned, but for now, there is no other |
108 documentation that the source code. Before you begin, you should know | 108 documentation than the source code. Before you begin, you should know |
109 the basics of LZ77 and range coding algorithms. LZMA is based on LZ77, | 109 the basics of LZ77 and range-coding algorithms. LZMA is based on LZ77, |
110 but LZMA is a lot more complex. Range coding is used to compress | 110 but LZMA is a lot more complex. Range coding is used to compress |
111 the final bitstream like Huffman coding is used in Deflate. | 111 the final bitstream like Huffman coding is used in Deflate. |
112 | 112 |
113 | 113 |
114 Q: I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma? | 114 Q: I cannot find BCJ and BCJ2 filters. Don't they exist in liblzma? |
115 | 115 |
116 A: BCJ filter is called "x86" in liblzma. BCJ2 is not included, | 116 A: BCJ filter is called "x86" in liblzma. BCJ2 is not included, |
117 because it requires using more than one encoded output stream. | 117 because it requires using more than one encoded output stream. |
118 A streamable version of BCJ2-style filtering is planned. | 118 A streamable version of BCJ2-style filtering is planned. |
119 | 119 |
(...skipping 21 matching lines...)
141 A: See the documentation in XZ Embedded. In short, something like | 141 A: See the documentation in XZ Embedded. In short, something like |
142 this is a good start: | 142 this is a good start: |
143 | 143 |
144 xz --check=crc32 --lzma2=preset=6e,dict=64KiB | 144 xz --check=crc32 --lzma2=preset=6e,dict=64KiB |
145 | 145 |
146 Or if a BCJ filter is needed too, e.g. if compressing | 146 Or if a BCJ filter is needed too, e.g. if compressing |
147 a kernel image for PowerPC: | 147 a kernel image for PowerPC: |
148 | 148 |
149 xz --check=crc32 --powerpc --lzma2=preset=6e,dict=64KiB | 149 xz --check=crc32 --powerpc --lzma2=preset=6e,dict=64KiB |
150 | 150 |
151 Adjust dictionary size to get a good compromise between | 151 Adjust the dictionary size to get a good compromise between |
152 compression ratio and decompressor memory usage. Note that | 152 compression ratio and decompressor memory usage. Note that |
153 in single-call decompression mode of XZ Embedded, a big | 153 in single-call decompression mode of XZ Embedded, a big |
154 dictionary doesn't increase memory usage. | 154 dictionary doesn't increase memory usage. |
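
The xz command lines above map fairly directly onto a custom filter chain in Python's standard lzma module. The sketch below is an assumption-laden illustration, not part of XZ Utils or XZ Embedded: the function name compress_for_embedded and its parameters are invented for the example, while the constants (CHECK_CRC32, FILTER_LZMA2, PRESET_EXTREME, FILTER_POWERPC) are the module's own.

```python
import lzma


def compress_for_embedded(data, dict_size=64 * 1024, bcj_filter=None):
    """Compress data with CRC32 and a small LZMA2 dictionary.

    Hypothetical helper approximating:
        xz --check=crc32 --lzma2=preset=6e,dict=64KiB
    Pass bcj_filter=lzma.FILTER_POWERPC to approximate adding --powerpc.
    """
    filters = []
    if bcj_filter is not None:
        # A BCJ filter must come before LZMA2 in the chain.
        filters.append({"id": bcj_filter})
    filters.append({
        "id": lzma.FILTER_LZMA2,
        "preset": 6 | lzma.PRESET_EXTREME,  # the "6e" preset
        "dict_size": dict_size,             # bounds decompressor memory
    })
    return lzma.compress(
        data,
        format=lzma.FORMAT_XZ,
        check=lzma.CHECK_CRC32,
        filters=filters,
    )
```

Changing dict_size is the knob for the ratio versus decompressor-memory trade-off mentioned above.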
155 | 155 |
156 | 156 |
157 Q: Will xz support threaded compression? | 157 Q: Will xz support threaded compression? |
158 | 158 |
159 A: It is planned and has been taken into account when designing | 159 A: It is planned and has been taken into account when designing |
160 the .xz file format. Eventually there will probably be three types | 160 the .xz file format. Eventually there will probably be three types |
161 of threading, each method having its own advantages and disadvantages. | 161 of threading, each method having its own advantages and disadvantages. |
(...skipping 15 matching lines...)
177 Match finder parallelization is another threading method. It has | 177 Match finder parallelization is another threading method. It has |
178 been in 7-Zip for ages. It doesn't affect compression ratio or | 178 been in 7-Zip for ages. It doesn't affect compression ratio or |
179 memory usage significantly. Among the three threading methods, only | 179 memory usage significantly. Among the three threading methods, only |
180 this is useful when compressing small files (files that are not | 180 this is useful when compressing small files (files that are not |
181 significantly bigger than the dictionary). Unfortunately this method | 181 significantly bigger than the dictionary). Unfortunately this method |
182 scales only to about two CPU cores. | 182 scales only to about two CPU cores. |
183 | 183 |
184 The third method is pigz-style threading (I use that name, because | 184 The third method is pigz-style threading (I use that name, because |
185 pigz <http://www.zlib.net/pigz/> uses that method). It doesn't | 185 pigz <http://www.zlib.net/pigz/> uses that method). It doesn't |
186 affect compression ratio significantly and scales to many cores. | 186 affect compression ratio significantly and scales to many cores. |
187 The memory usage scales linearly when threads are added. It isn't | 187 The memory usage scales linearly when threads are added. This isn't |
188 significant with pigz, because Deflate uses only 32 KiB dictionary, | 188 significant with pigz, because Deflate uses only a 32 KiB dictionary, |
189 but with LZMA2 the memory usage will increase dramatically just like | 189 but with LZMA2 the memory usage will increase dramatically just like |
190 with the independent blocks method. There is also a constant | 190 with the independent-blocks method. There is also a constant |
191 computational overhead, which may make pigz-method a bit dull on | 191 computational overhead, which may make pigz-method a bit dull on |
192 dual-core compared to the parallel match finder method, but with more | 192 dual-core compared to the parallel match finder method, but with more |
193 cores the overhead is not a big deal anymore. | 193 cores the overhead is not a big deal anymore. |
194 | 194 |
195 Combining the threading methods will be possible and also useful. | 195 Combining the threading methods will be possible and also useful. |
196 E.g. combining match finder parallelization with pigz-style threading | 196 E.g. combining match finder parallelization with pigz-style threading |
197 can cut the memory usage by 50 %. | 197 can cut the memory usage by 50 %. |
198 | 198 |
199 It is possible that the single-threaded method will be modified to | 199 It is possible that the single-threaded method will be modified to |
200 create files indentical to the pigz-style method. We'll see once | 200 create files identical to the pigz-style method. We'll see once |
201 pigz-style threading has been implemented in liblzma. | 201 pigz-style threading has been implemented in liblzma. |
202 | 202 |
203 | 203 |
204 Q: How do I build a program that needs liblzmadec (lzmadec.h)? | 204 Q: How do I build a program that needs liblzmadec (lzmadec.h)? |
205 | 205 |
206 A: liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no | 206 A: liblzmadec is part of LZMA Utils. XZ Utils has liblzma, but no |
207 liblzmadec. The code using liblzmadec should be ported to use | 207 liblzmadec. The code using liblzmadec should be ported to use |
208 liblzma instead. If you cannot or don't want to do that, download | 208 liblzma instead. If you cannot or don't want to do that, download |
209 LZMA Utils from <http://tukaani.org/lzma/>. | 209 LZMA Utils from <http://tukaani.org/lzma/>. |
210 | 210 |
211 | 211 |
212 Q: The default build of liblzma is too big. How can I make it smaller? | 212 Q: The default build of liblzma is too big. How can I make it smaller? |
213 | 213 |
214 A: Give --enable-small to the configure script. Use also appropriate | 214 A: Give --enable-small to the configure script. Use also appropriate |
215 --enable or --disable options to include only those filter encoders | 215 --enable or --disable options to include only those filter encoders |
216 and decoders and integrity checks that you actually need. Use | 216 and decoders and integrity checks that you actually need. Use |
217 CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize | 217 CFLAGS=-Os (with GCC) or equivalent to tell your compiler to optimize |
218 for size. See INSTALL for information about configure options. | 218 for size. See INSTALL for information about configure options. |
219 | 219 |
220 If the result is still too big, take a look at XZ Embedded. It is | 220 If the result is still too big, take a look at XZ Embedded. It is |
221 a separate project, which provides a limited but significantly | 221 a separate project, which provides a limited but significantly |
222 smaller XZ decoder implementation than XZ Utils. You can find it | 222 smaller XZ decoder implementation than XZ Utils. You can find it |
223 at <http://tukaani.org/xz/embedded.html>. | 223 at <http://tukaani.org/xz/embedded.html>. |
224 | 224 |