Entropy function

Send Close Add comments: (status displays here)

Got it! This site "creationpie.org" uses cookies. You consent to this by clicking on "Got it!" or by continuing to use this website. Note: This appears on each machine/browser from which this site is accessed.

Entropy function

by RS admin@creationpie.org : 1024 x 640

1. Entropy

More: Entropy

2. Entropy

Data (and information) can be represented using symbols that are made up of bits - zeros and ones in a binary code used to represent such data (and information).

A set of symbols (data) has a measure called entropy that approximates the lower bound on the number of bits needed to encode those symbols.

Such entropy constitutes a lack of order or a lack of predictability.

3. Entropy Formula

The following Shannon entropy formula expresses how entropy for a set of symbols is calculated.

S is the entropy
P_i is the probability of a given symbol i in a sequence of symbols appearing

More: Entropy

4. Twenty questions

Most people have played the game of twenty questions.

Here is a simplified version.

Think of a number from 1 to 1,000,000 I will guess that number in 20 yes-no questions.

5. More simplified version

Here is a more simplified version.

Think of a number from 1 to 16 I will guess that number in 4 yes-no questions.

How can you do it? Can you explain why it works?

6. Algorithms

In the study of algorithms, one learns how to use divide and conquer problem solving strategies for a problem space of size n.

Divide the space by two at each step, or binary division of the problem, leads to a solution of order ln₂n or O(ln₂n).
Divide the space of size n into 1 and n-1 leads to a solution of order n or O(n).

7. Question game

For guessing the 1 to 16, or 2⁴ the following hold.

An O(n) solution requires 16 guesses to obtain the solution (in the worst case) and 8.5 guesses on average. Why?
Note: If someone does not remember the numbers guessed, it can take longer, on average.
An O(ln₂n) solution requires exactly 4 guesses. Why?

8. Entropy of splitting lists

Here is a way to create partitioned lists of numbers using list on a lazy range.

The flowing program shows the entropy of each possible partition of a list of 16 elements into two non-empty lists.

9. Python program

Here is the Python code [#1]

import math
def entropyOf1(list1):
	# assume all different items
	len1 = len(list1)
	e1 = len1 * math.log(len1,2)
	return e1
len1 = 16
# make a list of a lazy range generator
list1 = list(range(1,len1+1))
s1 = str(list1)
print("List: {0:s}".format(s1))
print("")
ePos1 = -1
eMin1 = 999.99
for i1 in range(1,len1):
	left1 = list1[:i1]
	lefts1 = str(left1)
	eLeft1 = entropyOf1(left1)
	right1 = list1[i1:]
	rights1 = str(right1)
	eRight1 = entropyOf1(right1)
	print("#{0:d}:".format(i1))
	print("\t{0:0.2f} {1:s}".format(eLeft1,lefts1))
	print("\t{0:0.2f} {1:s}".format(eRight1,rights1))
	eMax1 = max(eLeft1,eRight1)
	print("\t{0:0.2f} (max)".format(eMax1))
	if eMax1 < eMin1:
		ePos1 = i1
		eMin1 = eMax1
print("")
print("Minimum maximum entropy at #{0:d}:{1:0.2f} of {2:d} partitions.".format(ePos1,eMin1,len1-1))
print("")

Here is the output of the Python code.

List: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
#1:
	0.00 [1]
	58.60 [2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
	58.60 (max)
#2:
	2.00 [1, 2]
	53.30 [3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
	53.30 (max)
#3:
	4.75 [1, 2, 3]
	48.11 [4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
	48.11 (max)
#4:
	8.00 [1, 2, 3, 4]
	43.02 [5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
	43.02 (max)
#5:
	11.61 [1, 2, 3, 4, 5]
	38.05 [6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
	38.05 (max)
#6:
	15.51 [1, 2, 3, 4, 5, 6]
	33.22 [7, 8, 9, 10, 11, 12, 13, 14, 15, 16]
	33.22 (max)
#7:
	19.65 [1, 2, 3, 4, 5, 6, 7]
	28.53 [8, 9, 10, 11, 12, 13, 14, 15, 16]
	28.53 (max)
#8:
	24.00 [1, 2, 3, 4, 5, 6, 7, 8]
	24.00 [9, 10, 11, 12, 13, 14, 15, 16]
	24.00 (max)
#9:
	28.53 [1, 2, 3, 4, 5, 6, 7, 8, 9]
	19.65 [10, 11, 12, 13, 14, 15, 16]
	28.53 (max)
#10:
	33.22 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
	15.51 [11, 12, 13, 14, 15, 16]
	33.22 (max)
#11:
	38.05 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
	11.61 [12, 13, 14, 15, 16]
	38.05 (max)
#12:
	43.02 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
	8.00 [13, 14, 15, 16]
	43.02 (max)
#13:
	48.11 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13]
	4.75 [14, 15, 16]
	48.11 (max)
#14:
	53.30 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14]
	2.00 [15, 16]
	53.30 (max)
#15:
	58.60 [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
	0.00 [16]
	58.60 (max)
Minimum maximum entropy at #8:24.00 of 15 partitions.

10. Result

The result is that the best place to partition the list into two non-empty lists such that the minimum of the maximum entropies is in the middle. That is, that splits the list into two equal sized lists.

11. Claude Shannon

Book: The mathematical theory of communication

Claude Shannon (Information scientist)

Claude Shannon's masters thesis was a detailed description of how to build (the first) programmable digital electronic computer. He later founded the modern field of information and communication theory. He worked on encryption technologies during World War II.
Claude Shannon, in the 1950's, used the term "information" to describe the statistical properties of data transmitted from one place to another. He was working for the telephone company AT&T at Bell Labs in order to improve the quality of phone calls. He founded the modern field of (statistical) information - he was not overly concerned with the meaning of the "data" that he called "information".

More: Claude Shannon

Previously, the term "information" was used as the verb "to inform". That is, one is "informed" that such and such is true - what we now call "information".

12. Von Neumann

Whenever the mathematical genius John Von Neumann (conventional computer architectures are called Von Neumann machines) was asked by Claude Shannon what to call a measure of the lack of content in data, Von Neumann is reported to have told Shannon to call it "entropy" because then nobody would understand it.

13. Von Neumann quote

Often reported comments of Von Neumann to Claude Shannon about the word "entropy".

You should call it entropy, for two reasons. In the first place your uncertainty function has been used in statistical mechanics under that name, so it already has a name. In the second place, and more important, no one really knows what entropy really is, so in a debate you will always have the advantage.

14. Statistics

Note that the concept of information developed by Shannon has to do with statistical properties of data.

It does not have anything to do directly with the meaning of the data. That is, information.

So what is called "information theory" might be better called "data theory".

15. Python code

Here is the Python code [#2]

import math
testList1 = [
	[100],
	[50, 50],
	[60, 40],
	[70, 30],
	[80, 20],
	[90, 10],
	[25, 25, 25, 25],
	[40, 30, 20, 10],
	[99, 1],
	]
def entropyGet1(list2):
	sum1 = sum(list2)
	e1 = 0.0
	for x1 in list2:
		x2 = x1 / sum1
		if x2 != 0.0:
			e1 += x2 * math.log(x2,2)
	return - e1
def entropyDo1(list1):
	print("Entropy Distribution")
	print("")
	for list2 in list1:
		e1 = entropyGet1(list2)
		print("{0:5.2f} {1:s}".format(e1,str(list2)))
entropyDo1(testList1)

Here is the output of the Python code.

Entropy Distribution
-0.00 [100]
 1.00 [50, 50]
 0.97 [60, 40]
 0.88 [70, 30]
 0.72 [80, 20]
 0.47 [90, 10]
 2.00 [25, 25, 25, 25]
 1.85 [40, 30, 20, 10]
 0.08 [99, 1]

16. Alphabets

Languages vary in how many letters they contain in making words.

The Greek alphabet has 24 letters.

αβγδεζηθικλμνξοπρστυφχψ

The English alphabet has 26 letters.

ABCDEFGHIJKLMNOPQRSTUVWXYZ

The Russian alphabet has 33 letters

АБВГДЕЖЗИЙКЛМНОПРСТУФХЦЧШЩЪЫЬЭЮЯ

17. Frequency

The average frequency with which letters happen in words (in a given corpus of texts) can be determined and varies from letter to letter.

If each letter appeared with the same frequency, then, for English, the entropy would be as follows.

ln₂ 26 = 4.7004397181411 bits

Does this make sense?

2⁴ = 16 2⁵ = 32

So the log, base 2, of 26 is between 4 and 5 bits.

18. Logarithms

Note that all logarithms are related by a constant factor. Thus, the following are true.

log₂(26) = log₁₀(26)/log₁₀(2)

Here is the Python code [#3]

import math
x1 = 26
e1 = math.log(x1)/math.log(2)
e2 = math.log(x1,10)/math.log(2,10)
e3 = math.log(x1,2)
print("Logs of {0:d}:".format(x1))
print("")
print("\t{0:0.3f} (default base - 10, adjusted for base 2)".format(e1))
print("\t{0:0.3f} (base 10, adjusted for base 2)".format(e2))
print("\t{0:0.3f} (base 2 directly)".format(e3))

Here is the output of the Python code.

Logs of 26:
	4.700 (default base - 10, adjusted for base 2)
	4.700 (base 10, adjusted for base 2)
	4.700 (base 2 directly)

19. Random guessing

On average, guessing a letter in a puzzle, without any other information, requires 13 guesses (for 26 letters).

How many guesses would it take to fill in the missing letters in the following?

H __ L L __ , W __ R L D

20. Semitic languages

Some languages, such as Hebrew, do not have vowels, only consonants. One learns to infer the vowels according to the context in which the consonants appear.

The Hebrew alphabet has 22 letters.

21. English letter guessing

Shannon showed that the English language has an entropy of between 0.6 and 1.3 bits.

The redundancy of natural language helps in removing ambiguity when not all of the text can be transmitted accurately, as in listening to a broadcast with a lot of static noise.

22. Google n-grams

Google has collected a large amount of data on letter sequences in the text of various languages.

Such "grams" are known as "bi-grams" (2 letters) , "tri-grams" (3 letters), and so on. The general term is "n-grams" where n is an integer (greater than one).

23. Security

The bit length of a security key, such as 1024 bit, is not always a maximum entropy security key.

You need the max entropy value to compare security by the bit length of keys.

24. Random numbers

A true random number generator needs a lot of entropy.

25. Perfect passwords

The GRC web site, by Steve Gibson (famous security pod-caster), provides passwords that have high entropy.

Every one is completely random (maximum entropy) without any pattern, and the cryptographically-strong pseudo random number generator we use guarantees that no similar strings will ever be produced again.

See https://www.grc.com/passwords.htm. There are more details there on how maximum entropy is used/achieved.

26. Imperfect random numbers

Pseudo-random numbers appear random but actually have a pattern. True random numbers cannot be generated via strictly computer methods.

Any one who considers arithmetical methods of producing random digits is, of course, in a state of sin.

27. John Von Neumann quotes

From WikiQuote:

In mathematics you don't understand things. You just get used to them.

If people do not believe that mathematics is simple, it is only because they do not realize how complicated life is.

There probably is a God. Many things are easier to explain if there is than if there isn't.

28. About John Von Neumann

From WikiQuote:

He was a really remarkable man. He listened to me talk about this rather obscure subject and in ten minutes he knew more about it than I did. He was extremely quick. David Blackwell

29. Random numbers

Applications:

Security: random numbers required (assumed intelligent adversary - humans)
Simulation: pseudo-random numbers work well (non-intelligent adversary - nature)

Note: Raspberry Pi has hardware random number generator. How well does it work?

30. Genetic entropy

The concept of genetic entropy is the tendency of the coded information in genetic material to accumulate mutations and degenerate over time due to errors in the copying process.

Secular view: genetic bottleneck about 100,000 years ago in Africa
Biblical view: genetic bottleneck about 4,500 years ago at Mount Ararat.

More: Genetic entropy

31. End of page

Entropy function equation

Shannon, Claude (Information scientist)

Logarithms

Python code

Python program

by RS admin@creationpie.org : 1024 x 640