Extreme Compression

  Pesala 19:27 11 Sep 2006

Don't you just hate it when someone tries to tell you that compressed files cannot be compressed any further? Take a look at this 7-zip archive: click here

9.22 Mbytes of JPG files compressed into 9 Kbytes.

An extreme test case, of course, but relevant when compressing large directories of images. In my experience standard ZIP utilities won't compress this file set very much, if at all.

  sean-278262 20:12 11 Sep 2006

9.22 MB of files that are all the same. So, using its algorithms, it recognizes that all the files are identical, compresses them as if they were just one file, and stores the 1,000-odd names. What point did this have?

You don't seem to understand compression very well, I feel.


  Pesala 20:24 11 Sep 2006

If you compressed the same files with ZIP the 9.22 Mbytes would become 9.32 Mbytes.

The ability to recognize duplicated data is the main asset of compression programs. Say, for example, you have a few thousand JPG images or MP3 files in several directories. You may have collected a lot of duplicates and saved them with different file names, or with the same name but in different subdirectories. You decide to back up the lot with a ZIP archive to store on a CD, but you haven't time to weed out the duplicates. Use 7-zip instead, and it will recognize that the data is duplicated and store it only once, as long as its dictionary is big enough to reach back from each duplicate to the earlier copy.
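To make the dictionary point concrete, here is a rough Python sketch. It is not what 7-zip does internally; it only compares two standard-library compressors with very different look-back windows. Random bytes stand in for a JPG (both are essentially incompressible), and the names `one_file`, `copies` and `data` are made up for the illustration.

```python
import lzma
import os
import zlib

# Stand-in for one JPG: random bytes are, like JPG data, essentially incompressible.
one_file = os.urandom(100_000)
copies = 20
data = one_file * copies  # twenty identical "files" stored back to back

# Deflate (the method normally used inside ZIP files) can only look back 32 KB,
# so the previous copy, 100,000 bytes earlier, is out of its reach.
deflate_size = len(zlib.compress(data, 9))

# LZMA at its default preset uses a dictionary of several megabytes,
# so every copy after the first can be stored as references to the earlier one.
lzma_size = len(lzma.compress(data))

print(f"original: {len(data):>9,} bytes")
print(f"deflate:  {deflate_size:>9,} bytes")
print(f"lzma:     {lzma_size:>9,} bytes")
```

The deflate result should stay close to the full 2,000,000 bytes, while the LZMA result should be little more than the size of a single copy. Make `one_file` bigger than the LZMA dictionary and that advantage disappears.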

  sean-278262 20:29 11 Sep 2006

And the point of saving a few MB in a world of 320 GB hard drives? Sorry, I see no real point to this. Unless we are talking about a photo archive that is hundreds of gigabytes, I really can see no point to it.

  Pesala 20:46 11 Sep 2006

Who stores backups on the same hard drive that is used for the original data? What about email attachments, or copying backups to CD or USB drives, or uploading stuff to a website with limited server space?

Depending on the data set, the difference between 7z and ZIP is very noticeable. Try compressing some TrueType fonts, for example. click here: 312 Kbytes as 7z, but nearly 0.98 Mbytes as standard ZIP. Again, the better compression is achieved by recognizing duplicated data. The fonts contain a lot of characters that are identical, as well as many that are not.

  sean-278262 21:05 11 Sep 2006

What you have yet to discuss is compression failures, i.e. the dreaded CRC errors you can often get with .zip files when they are backed up to CD or DVD.

Also you cannot describe the point of 7-zip being better than WinZip by only demonstrating one file. You need to show what WinZip produces from the same files.

  Pesala 12:08 14 Sep 2006

>>Also you cannot describe the point of 7-zip being better than WinZip by only demonstrating one file.<<

I don't have WinZip, but I did already demonstrate that IZarc produces a ZIP file of 9.32 Mbytes from 9.22 Mbytes of JPG files. What you need to demonstrate is that any other compression program can produce the same kind of compression as 7z format using ZIP, RAR, or some other form of compression. Then we can say that they are just as good as 7-zip. Otherwise, 7z format is clearly a better format to use. IZarc can also compress files to 7z format with similar results on file sets containing at least some duplicated files.
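Anyone who wants to check this kind of comparison without installing either program can try a sketch like the one below. Two assumptions to note: Python's standard library cannot write 7z archives, so a tar.xz archive stands in for a solid 7z archive (both use LZMA-family compression over the whole stream), and random bytes stand in for a JPG. The file names and the helpers `zip_size` and `solid_size` are invented for the example. ZIP compresses every member on its own, so it never sees that two members are identical.

```python
import io
import os
import tarfile
import zipfile

payload = os.urandom(50_000)  # stand-in for one already-compressed JPG
names = [f"photo_{i:03d}.jpg" for i in range(20)]  # hypothetical file names


def zip_size(names, payload):
    """Size of a ZIP archive in which each member is deflated separately."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        for name in names:
            zf.writestr(name, payload)
    return len(buf.getvalue())


def solid_size(names, payload):
    """Size of a tar.xz archive, where all members share one compressed stream."""
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w:xz") as tf:
        for name in names:
            info = tarfile.TarInfo(name)
            info.size = len(payload)
            tf.addfile(info, io.BytesIO(payload))
    return len(buf.getvalue())


raw_total = len(payload) * len(names)
zip_total = zip_size(names, payload)
solid_total = solid_size(names, payload)

print(f"raw files: {raw_total:>9,} bytes")
print(f"zip:       {zip_total:>9,} bytes")
print(f"tar.xz:    {solid_total:>9,} bytes")
```

The ZIP total should come out at roughly the raw size, since the per-file headers cost about as much as deflate can save on such data, while the solid archive should be not much bigger than one copy of the payload.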

The extreme case is not typical, of course, but it serves to demonstrate the point that a compression program should detect any kind of duplicated data, and not store it twice. That is how file compression saves space.

