Why Transcendental Compression Is Impossible

By “Transcendental Compression”, I mean the concept of taking p or e or, root 2, any of the Transcendental Numbers or functions and using some part of the sequence of its digits as a dictionary for tokens in a compression algorithm.

For example, using p



you could compress the sequence of numbers



I = 13
N = 60

So if you start at the 13th digit of p, the next 60 characters contain the uncompressed data.

There is no way disprove that there is not an index I into the digits of p that encodes for every movie ever made.

While delirious with the flu I was IMing to a friend in London over MSN on a mobile phone while riding in a taxi on the way home, its a crazy world.

He had pointed out this posting and was working out an algorithm. There are other links like this.

I have a raging flu but by the time I got home I came up with two proofs that this is impossible, and its not because the amount of computing power required would be infinite. 

A theoretical infinite or quantum computer will not save this idea.

Because I have a raging flu these proofs are probably wrong, but they make sense to me in my current head-space.

If this is possible then please feel free to correct me and explain how so I can start building a super cluster of compression servers. If my proofs are Bogus, tell me how I can make them better.

Proof 1: Infinite Compression Is Bogus

A bit is just the number 0 or 1

Assume we have a function C(D) = d

Which can take any sequence of bits D, N bits long and reduce it to a sequence of bits d, n bits long.

Lets assume that N = 2 and n = 1

So you could in theory take any sequence of bits, and compress it, which would halve it,

then compress it again, which would halve it again

then compress it again, which would halve it again

An entire movie could be compressed to 1 bit.

This is of course impossible.

A bit contains only 2 possible values 0 or 1, and I have seen at least 3 movies 🙂

Proof 2: Transcendental Compression Is Bogus

Take any sequence of numbers S length N and start looking for an index I in the digits of p that matches such that you can encode your sequence of digits as

I = Index into p,N = Length of sequence of numbers.

The digits of PI are infinite and random, so there is no upper limit on the values of I.

So the possible value of I is somewhere between 0 and infinity.

Using probability, the average value of I will be  

I = Infinity/Finite.

Which is Infinity.

So on average, the length of your index is infinite.

As you are compressing your data to the number pair “I,N“, it is more than likely that your probably infinite I will be longer than the S you are trying to compress.

This does not sound like a viable compression routine to me.

Here is another more general proof.

About James McParlane

CTO Massive Interactive. Ex Computer Whiz Kid - Now Grumpy Old Guru.
This entry was posted in Coolhunting, Faster Than Light, Rants. Bookmark the permalink.

7 Responses to Why Transcendental Compression Is Impossible

  1. AnonyMouse says:

    First, "transcendental compression" is not really compression. It merely provides a lookup into an already existing outside source of incompressable data. It’s a technicality, to be sure, but it’s an important one.

    Second, you play fast and loose with the term infinite here. It is not the case that "on average, the length of your index is infinite." It is true that the index I will always be *finite* and *arbitrarily large*. This is a key distinction that arises in many incorrect "proofs" in real analysis as well. There is no real average value of I in a theoretical sense. Of course, in an empirical sense, there will be some finite one.

    You are certainly correct that I is potentially much larger than |S| or even S itself, given that I is arbitrarily large. The main problem I still see is the reliance on an outside data source for the values anyway; this not being compression. The data must still exist somewhere, and given the nature of transcendental numbers, that is a nontrivial task itself.

  2. Thanks for the great feedback!

    Yes I agree something ‘large but countable’ is something that makes more sense – was just trying to get across the idea quickly that |I| is unbounded.

  3. Emerson says:

    Ahh… thanks, i assume this relates to our conversation 🙂

    I will attempt to digest and understand it for the betterment of my sanity…

  4. Emerson says:

    Ok, now that ive read it.

    Proof one seems sound, but it doesn’t disprove the transcendental compression (perhaps better described as a hash for the cryptographic pedants amongst us). It only proves that there exists no function C(D) = d.

    The second proof relies on I being larger than S, which will not *always* be the case. Could this not be enhanced by trying to find a multiple of S, A such that A * B = S and where the I of A is smaller than S, so compression is achieved. It then becomes an issue of computational complexity ?

  5. Proof #1 Essential demonstrates that there is no guaranteed compression. At a certain point you can’t compress any more. So yes it shows that C(D) = d. does not exist, but only for for the constraints of "C(D) = d takes any sequence of bits D, N bits long and reduce it to a sequence of bits d, n bits long were n < N" ; which is a recipe for guaranteed compression.

    Proof #2 Concludes that there is no upper bound to I, so |I| will be somewhere between 0 and infinity. It will be finite, but probably very very large.

    If you break the problem up into finding the smallest I, according to Proof #1 you have no guarantee that you will find I such that |I| < |S|.

    So yes it could boil down to computational complexity – but you can’t tell if its pointless to continue. Most likely the ocean will boil first.

  6. Jurijus says:

    Infinite compression formula (B(p^p)*m^d)*s^r
    Let compression any file to size of SMS,
    I write this program, when I will write I show you.
    Just, wait this program.

  7. mio says:

    “sum(n=1 to /infinity) { 10^(-n!)}” is a transcendental number. It can be expressed as an algorithm with finitely many symbols to produce this number, therefore…

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s