ZIP files now support Unicode filenames

at 2011-01-16 in Unicode, Examples by friebe

The XP Framework's io.archive.zip package reads and creates ZIP files and is tested against ZIP archives created by Info-ZIP 3.0, PHP's Zip class, 7-zip, WinRAR and Windows' "compressed folders".

In SVN head, we have made a few adjustments to support files written by Java's java.util.zip package (which the jar command uses internally):

  • Reading the compressed and uncompressed lengths from the central directory - Java doesn't write them into the local file header, which makes it impossible to stream those files.
  • Supporting character set detection on extraction - up until JDK7 Build 57, the Java API incorrectly wrote entry names in UTF-8 without setting the so-called "Language Encoding bit (EFS)".
  • Supporting writing archives with Unicode names - a JDK build downloaded yesterday still chokes on non-Unicode filenames inside ZIP archives when no charset is given (the default here is "CP437").
This way we can now create JAR files, for example.
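
To illustrate the first two points, here is a minimal Python sketch. It is not the XP Framework's implementation. It uses Python's zipfile module to write an archive to a non-seekable sink, which, like the Java output described above, leaves the sizes out of the local file header and records them in the central directory. WriteOnlyStream, read_local_header, decode_entry_name and build_streamed_archive are hypothetical helpers, and falling back from UTF-8 to CP437 is an assumed heuristic, not necessarily how io.archive.zip detects the character set.

```python
import io
import struct
import zipfile

DATA_DESCRIPTOR_FLAG = 0x0008  # general purpose bit 3: sizes follow the data
EFS_FLAG = 0x0800              # general purpose bit 11: entry name is UTF-8


class WriteOnlyStream:
    """Hypothetical sink without tell()/seek(), so zipfile has to stream."""

    def __init__(self):
        self.chunks = []

    def write(self, data):
        self.chunks.append(bytes(data))
        return len(data)

    def flush(self):
        pass


def read_local_header(data, offset):
    """Return (flags, compressed size, uncompressed size, raw name)."""
    fields = struct.unpack_from("<4sHHHHHIIIHH", data, offset)
    if fields[0] != b"PK\x03\x04":
        raise ValueError("not a local file header")
    flags, csize, usize, name_len = fields[2], fields[7], fields[8], fields[9]
    raw_name = data[offset + 30:offset + 30 + name_len]
    return flags, csize, usize, raw_name


def decode_entry_name(raw_name, flags, fallback="cp437"):
    """Assumed heuristic: trust the EFS bit, else try UTF-8, then fallback."""
    if flags & EFS_FLAG:
        return raw_name.decode("utf-8")
    try:
        # Covers archives whose names are UTF-8 but lack the EFS bit.
        return raw_name.decode("utf-8")
    except UnicodeDecodeError:
        return raw_name.decode(fallback)


def build_streamed_archive(name, payload):
    """Write a one-entry archive to a sink that cannot seek."""
    sink = WriteOnlyStream()
    with zipfile.ZipFile(sink, "w") as archive:
        archive.writestr(name, payload)
    return b"".join(sink.chunks)


data = build_streamed_archive("über.txt", b"Hello")

# The central directory (read by zipfile) carries the real sizes.
with zipfile.ZipFile(io.BytesIO(data)) as archive:
    info = archive.infolist()[0]

# The local file header of the same entry, parsed by hand.
flags, local_csize, local_usize, raw_name = read_local_header(
    data, info.header_offset
)
```

In this sketch the local header has bit 3 set and reports both sizes as 0, while info.file_size and info.compress_size from the central directory hold the actual lengths. Because the name is not plain ASCII, zipfile also sets the EFS bit, so decode_entry_name(raw_name, flags) takes the UTF-8 branch directly.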


