What is an example of a compression

HTTP compression

Data compression is the “packing” of text-based files (e.g. HTML files; HTML is a markup language for building websites). The HTML files are stored uncompressed on the server and, on request from the client (browser), are compressed, transmitted and then decompressed again by the browser. During compression, redundant parts of the documents (repeated character sequences such as HTML tags) are replaced by shorter references, so the documents are reduced to a minimum without loss. During decompression, the redundant information is restored and the browser can display the complete file.

An example to illustrate compression:

Original text: My hat it has three corners, three corners has my hat.
Coded text: My hat it has three corners, -2 -2 -5 -9 -9.

The words “three”, “corners”, “has”, “my” and “hat” were coded here. The numbers indicate where each word already appeared (-2 = two words back). With this procedure, texts can be reduced to the most necessary information.
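The back-reference idea from this example can be sketched in a few lines of Python. This is only an illustration, not how real compressors are implemented: the function names `encode` and `decode` are made up for this sketch, the rhyme is written in lowercase without punctuation, and the sketch assumes the text itself contains no tokens that look like negative numbers.

```python
def encode(words):
    """Replace each repeated word with a negative offset to its previous occurrence."""
    last_seen = {}  # word -> index of its most recent occurrence
    coded = []
    for index, word in enumerate(words):
        if word in last_seen:
            coded.append(str(last_seen[word] - index))  # e.g. "-2" = two words back
        else:
            coded.append(word)
        last_seen[word] = index
    return coded


def decode(tokens):
    """Restore the original words by following the negative offsets."""
    words = []
    for index, token in enumerate(tokens):
        if token.startswith("-") and token[1:].isdigit():
            words.append(words[index + int(token)])
        else:
            words.append(token)
    return words


words = "my hat it has three corners three corners has my hat".split()
coded = encode(words)
print(" ".join(coded))  # my hat it has three corners -2 -2 -5 -9 -9
print(decode(coded) == words)  # True
```

Decoding follows each offset back to a word that has already been restored, which is why nothing is lost.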

 

Use of HTML compression

Nowadays, data compression is used in most online data transmission. It saves a large part of the resources required, such as bandwidth and storage space. The operator of a compressed website benefits in particular, since only a fraction of the bandwidth otherwise required is used. If you run your own servers, an entire website can also be stored in compressed form where storage space is limited. On average, web pages with HTML files are reduced to about ¼ of their original size. The higher the share of text-based content, the higher the potential savings from “packing”.

Pure HTML texts can be reduced to 10-20% of the original size.
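To get a feeling for such figures, the following sketch compresses a small HTML document with Python's standard `gzip` module and prints the size before and after. The HTML is invented sample data and far more repetitive than a typical page, so the ratio it prints is not representative of real websites.

```python
import gzip

# Invented sample page: a list with 200 very similar entries.
rows = "".join(
    f'<li class="item"><a href="/page/{i}">Page {i}</a></li>\n' for i in range(200)
)
html = f"<html><body><ul>\n{rows}</ul></body></html>".encode("utf-8")

packed = gzip.compress(html)
ratio = len(packed) / len(html)

print(f"original: {len(html)} bytes")
print(f"gzip:     {len(packed)} bytes ({ratio:.0%} of the original)")

# Decompression restores the page byte for byte (lossless).
restored = gzip.decompress(packed)
print(restored == html)
```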

Thanks to compression, documents on websites load faster and are available to the user sooner. Packed HTML pages are transmitted with fewer bytes, which saves resources and money for the site operator and shortens loading times. This improves user-friendliness and lowers the bounce rate, since users are more likely to leave pages with long loading times. If a browser does not support compressed pages, they are transmitted in the conventional way (uncompressed).

Advantages of compressed HTML web pages

  • Up to 80% smaller, in individual cases even more
  • The packed file is transmitted to the user faster
  • Shorter loading times and therefore a lower bounce rate
  • Saves money for the site operator: smaller page size = lower traffic costs
  • The server can cope with more users
  • For search engines (e.g. Google), loading time is a ranking factor

Does a compressed image transmission make sense?

Data types such as images, audio or video files are already optimized when they are created. In the JPEG graphic format, for example, the visually relevant image information is kept and less relevant detail is discarded. This is called “lossy compression” because some parts are irreversibly removed. Lossless formats such as PNG basically store only the deviations from predicted values and restore the rest when the image is opened. Repacking such files is therefore of little use: a folder of images that has been compressed again is only minimally smaller than the original folder. For this reason, HTTP compression is usually not applied to images.
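The point that repacking already-compressed data gains little can be tried out with a small experiment. The sketch below does not use real images. As a stand-in, it compresses some generated text once with `zlib` and then compresses the result a second time; the vocabulary and sizes are arbitrary assumptions.

```python
import random
import zlib

# Generated sample text (seeded, so the run is repeatable).
rng = random.Random(0)
vocabulary = ["hat", "three", "corners", "browser", "server",
              "header", "gzip", "page", "image", "text"]
text = " ".join(rng.choice(vocabulary) for _ in range(5000)).encode("utf-8")

once = zlib.compress(text)    # first pass removes the redundancy
twice = zlib.compress(once)   # second pass has almost nothing left to remove

print(f"original:         {len(text)} bytes")
print(f"compressed once:  {len(once)} bytes")
print(f"compressed twice: {len(twice)} bytes")
```

The second pass typically ends up about the same size as the first, or even slightly larger, because the output of the first pass contains hardly any repetition.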

 

Method of HTTP compression

When you call up a website with a browser, the browser tells the server which kinds of responses it can understand. These preferences are listed in the header of the request sent by the browser (client) and are interpreted by the web server.

Accept: Which data types the client (browser) can process, e.g. text/html
Accept-Charset: Which character sets the client (browser) can display, e.g. utf-8
Accept-Language: Which language the client (browser) prefers, e.g. en-US

To indicate in the HTTP request that you would like to receive packed responses where possible, the request also contains the following header:

Accept-Encoding: Which compression methods the client can decompress, e.g. Accept-Encoding: gzip, deflate

The response from the web server states in its HTTP response header which option the server has chosen:

Content-Encoding: Which compression the server has applied, e.g. Content-Encoding: gzip

If the server can fulfil the request, it returns status code 200, indicating that the web page was loaded successfully.
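The negotiation described above can be sketched from the server's point of view. The function `build_response` is a hypothetical helper written for this illustration and does not belong to any web framework. It also simplifies matters by ignoring quality values such as `gzip;q=0.5`, which a real server would have to take into account.

```python
import gzip
import zlib


def build_response(body: bytes, accept_encoding: str):
    """Choose a compression the client offered and return (status, headers, body)."""
    # "gzip, deflate" -> {"gzip", "deflate"}; parameters after ";" are dropped.
    offered = {part.split(";")[0].strip().lower() for part in accept_encoding.split(",")}
    offered.discard("")

    headers = {"Content-Type": "text/html; charset=utf-8"}
    if "gzip" in offered:
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    elif "deflate" in offered:
        body = zlib.compress(body)  # HTTP "deflate" means Deflate data in zlib framing
        headers["Content-Encoding"] = "deflate"
    # Otherwise the body is sent in the conventional way (uncompressed).

    headers["Content-Length"] = str(len(body))
    return 200, headers, body


page = b"<html><body>" + b"<p>Hello</p>" * 100 + b"</body></html>"

status, headers, payload = build_response(page, "gzip, deflate")
print(status, headers["Content-Encoding"], len(payload))

status_plain, headers_plain, payload_plain = build_response(page, "")
print(status_plain, headers_plain.get("Content-Encoding"), len(payload_plain))
```

A client that sends no Accept-Encoding header receives the page unchanged, which matches the fallback for browsers without compression support.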

The most common types of compression

Many compression methods are based on the Lempel-Ziv (LZ) process. In these dictionary-based methods, words are replaced by shorter codes or pointers (Tuesday => Tue, August => 8). As a result, the content is shorter and fewer bytes are required for online transmission. Many of the algorithms used today are based on the Lempel-Ziv method from 1977 (LZ77) and are often combined with entropy coding. With entropy coding, the frequency of the occurring symbols is determined first, and frequent symbols are given shorter codes. Huffman coding and arithmetic coding are both forms of entropy coding and differ only in how they allocate bits.
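How Huffman coding gives frequent symbols shorter codes can be shown with a compact sketch. The function name `huffman_codes` is chosen for this illustration only. Real compressors build and store their code tables differently, so this is not the implementation used by any particular tool.

```python
import heapq
from collections import Counter


def huffman_codes(text):
    """Return a dict mapping each character to its Huffman bit string."""
    freq = Counter(text)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}

    # Heap entries: (weight, tie_breaker, {symbol: code_so_far}).
    # The tie_breaker keeps Python from ever comparing the dicts.
    heap = [(weight, i, {symbol: ""}) for i, (symbol, weight) in enumerate(freq.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        w1, _, left = heapq.heappop(heap)   # the two rarest groups ...
        w2, _, right = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in left.items()}
        merged.update({s: "1" + code for s, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))  # ... become one group
        next_id += 1

    return heap[0][2]


sample = "aaaabbc"
codes = huffman_codes(sample)
encoded = "".join(codes[ch] for ch in sample)

for symbol, code in sorted(codes.items()):
    print(symbol, code)
print(f"{len(encoded)} bits instead of {8 * len(sample)} bits")
```

In the sample, "a" occurs most often and therefore receives the shortest code.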

In solid compression (also called compact or progressive compression), the files are grouped into as few blocks as possible. The more similar the contents, the smaller the result can be. This leads to the smallest possible archive size. Tools such as RAR or 7-Zip can do this (the ZIP support built into Windows cannot). However, it must be noted that all of the content can become unreadable if the archive is damaged.
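The effect of grouping similar files into one block can be approximated with `zlib`. In the sketch below, the 20 "files" are invented pages that share one template. Compressing the concatenated data stands in for a solid archive; a real archive would also have to record where each file begins and ends.

```python
import zlib

# Invented sample data: 20 small pages built from the same template.
template = (
    "<html><head><title>{title}</title></head>"
    "<body><h1>{title}</h1><p>Welcome to page {n}.</p></body></html>"
)
pages = [template.format(title=f"Page {n}", n=n).encode("utf-8") for n in range(20)]

# Each file compressed on its own: repeats across files cannot be used.
individual_total = sum(len(zlib.compress(page)) for page in pages)

# All files compressed as one block: the shared template is stored only once.
solid_total = len(zlib.compress(b"".join(pages)))

print(f"uncompressed:          {sum(len(page) for page in pages)} bytes")
print(f"compressed per file:   {individual_total} bytes")
print(f"compressed as a block: {solid_total} bytes")
```

The block variant is smaller, but a single damaged byte can affect every file that follows it in the block.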

The Deflate method combines the Lempel-Ziv-Storer-Szymanski (LZSS) algorithm with Huffman coding. It guarantees lossless compression of the data, which is why the method is very popular. First, all duplicate strings are searched for and replaced with shorter symbols. The more frequently a character string appears, the shorter its symbol (entropy coding according to Huffman). The larger the data window (the amount of data that is searched), the greater the likelihood of finding a replaceable string. Accordingly, the algorithm then takes longer to compress. If fast compression is preferred, data reduction suffers.
Phil Katz developed the Deflate algorithm for the ZIP file format (English: zipper). The format can be recognized by the file extension .zip. It is used for space-saving archiving and as a container file in which several files are combined into one smaller file. The container can be protected with a password and can store file paths. The files inside the container are compressed individually. As a result, a ZIP file is not compressed to the smallest possible size, but it remains flexible: individual files can be deleted or edited without having to recompress everything.
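The trade-off between speed and data reduction is exposed in Python's `zlib` module as a compression level from 1 (fastest) to 9 (most thorough search). The sketch below compresses generated sample text at three levels and prints the sizes. The sample data is an arbitrary assumption, and the actual numbers depend on the input and on the zlib build in use.

```python
import random
import zlib

# Generated sample text (seeded, so the run is repeatable).
rng = random.Random(1)
tokens = ["<p>", "</p>", "<div>", "</div>", "<span>", "</span>",
          "compression", "browser", "server", "header"]
data = " ".join(rng.choice(tokens) for _ in range(20000)).encode("utf-8")

sizes = {}
for level in (1, 6, 9):
    packed = zlib.compress(data, level)
    assert zlib.decompress(packed) == data  # every level is lossless
    sizes[level] = len(packed)
    print(f"level {level}: {len(packed)} bytes")

print(f"uncompressed: {len(data)} bytes")
```

Higher levels spend more time searching for duplicate strings, which usually yields a somewhat smaller result.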

Conclusion

The advantages of data packing cannot be denied. The most common and popular tools run without problems on Windows and other operating systems and are also available in German, which helps to avoid losing time, money and rankings.