Huffman coding .pdf



Nom original: Huffman coding.pdf

Ce document au format PDF 1.4 a été généré par Impress / LibreOffice 4.3, et a été envoyé sur fichier-pdf.fr le 17/12/2014 à 09:28, depuis l'adresse IP 46.193.x.x. La présente page de téléchargement du fichier a été vue 458 fois.
Taille du document: 273 Ko (16 pages).
Confidentialité: fichier public

Aperçu du document


COMPRESSION
HUFFMAN CODING

Étienne WATTEBLED
Compression 0 %

Contents
I. Introduction
II. What is compression ?
III. What are the uses ?
IV. How does it work ? (Huffman)
Compression
Decompression
V. Conclusion
Compression 5 %

Introduction
Compression is a way in computer science which is :
- very used
and
- very useful
To start : what is compression ?

Compression 10 %

What is compression ?
Is the fact of reducing the size of a folder or a file.
Sometimes :
- you can lose information (jpg images).
- you need an utilitarian (zip folder)

Compression 15 %

What are the uses ?
Principals uses.
1)
To reduce the size.
2)
To conserve a file hierarchy (folder compressed)
3)
To send a folder (folder compressed).

Compression 20 %

How does it work ? - Compression
Huffman is based on using occurrences.
For instance, we suppose :
- that the alphabet is { 'H' ; 'e' ; 'l' ; 'o' ; ' ' ; 'K' ; 'y' ; ',' ; 'h' ; 'w' ; 'a'
; 'r' ; 'u' ; '?' }
- the sentence is : "Hello Kelly, how are you ?"

Compression 25 %

How does it work ? - Compression
If you don't compress :
14 different characters
=> you need 4 bits per character.
"Hello Kelly, how are you ?" (26 characters).
26*4 bits = 104 bits

Compression 30 %

Character

Code

H

0000

e

0001

l

0010

o

0011
0100

K

0101

y

0110

,

0111

h

1000

w

1001

a

1010

r

1011

u

1100

?

1101

How does it work ? - Compression
Huffman : first, to count occurrences

Character

Occurrences

H

1

e

3

l

4

o

3
5

Compression 40 %

K

1

y

2

,

1

h

1

w

1

a

1

r

1

u

1

?

1

How does it work ? - Compression
Then, we need to do a sorted list using the number of
occurrences.
'H'
1

'K'
1

','
1

'h'
1

'w'
1

'a'
1

'r'
1

'u'
1

'?'
1

'y'
2

'e'
3

'o'
3

'l'
4

''
5

Compression 50 %

How does it work ? - Compression
2
'H'
1

'K'
1

','
1

'h'
1

'w'
1

'a'
1

'r'
1

'u'
1

'?'
1

'y'
2

'e'
3

'o'
3

'l'
4

''
5

After, you create a new node which is the result of first + second
node, and no character on it.

Compression 60 %

How does it work ? - Compression
This new node must be insered (list is always sorted).
','
1

'?'
1
'H'
1

'h'
1

'w'
1

'a'
1

'r'
1

'u'
1

2

'y'
2

'e'
3

'o'
3

'l'
4

''
5

'K'
1

The first and second node are now accessible by this new node.
Compression 70 %

How does it work ? - Compression
We need to continue until to have a tree:
26
15

11
7

''
5

8

6
'e'
3

3
'?'
1

'o'
3

'u'
1

'H'
1

'l'
4

4
'y'
2

2

2
'r'
1

4

'K'
1

Compression 70 %

2
'w'
1

2
'a'
1

','
1

'h'
1

How does it work ? - Compression
Now, left is 0 and right is 1. For instance : 'H' is 10100
26
15

11
7

''
5

8

6
'e'
3

3
'?'
1

'o'
3

'u'
1

'H'
1

'l'
4

4
'y'
2

2

2
'r'
1

4

'K'
1

Compression 80 %

2
'w'
1

2
'a'
1

','
1

'h'
1

How does it work ? - Compression
"Hello Kelly, how are you ?"
now, takes :
1*5 + 3*3 + 4*3 + 3*3 + 5*2
+ .... + 1*5 + 1*4
= 92 bits

But in real,
+ correspondance table.

Character

Occurrences Code

H

1

10100 (5)

e

3

011 (3)

l

4

111 (3)

o

3

100 (3)

5

00 (2)

K

1

10101 (5)

y

2

1011 (4)

,

1

11010 (5)

h

1

11011 (5)

w

1

11000 (5)

a

1

11001 (5)

r

1

01010 (5)

u

1

01011 (5)

?

1

0100 (4)

Compression 85 %

How does it work ? - Decompression
Character

Occurrences Code

To decompress : you read bit
per bit and when you find a
letter, you put it and restart
with the rest.

H

1

10100 (5)

e

3

011 (3)

l

4

111 (3)

o

3

100 (3)

5

00 (2)

For example :
101000111111111000010...

K

1

10101 (5)

y

2

1011 (4)

,

1

11010 (5)

h

1

11011 (5)

w

1

11000 (5)

a

1

11001 (5)

r

1

01010 (5)

u

1

01011 (5)

?

1

0100 (4)

1 no letter
10 no letter
101 no letter
1010 no letter
10100 'H'
and so on with the rest...

Compression 90 %

Conclusion




Huffman : one algorithm used for character
string and you don't lose information.
Don't compress small files or small folders
(it will be worse)

Compression 100 %


Huffman coding.pdf - page 1/16
 
Huffman coding.pdf - page 2/16
Huffman coding.pdf - page 3/16
Huffman coding.pdf - page 4/16
Huffman coding.pdf - page 5/16
Huffman coding.pdf - page 6/16
 




Télécharger le fichier (PDF)

Huffman coding.pdf (PDF, 273 Ko)

Télécharger
Formats alternatifs: ZIP



Documents similaires


tm3
huffman coding
cri chap 2 et 3
seaiq transferring your opencpn charts to seaiq open
curriculum vitae
world championship poker 2 manual psp

Sur le même sujet..