Saturday, December 11, 2010

Using entropy analysis to estimate efficiency

We want to actually put entropy analysis to work on a known distribution network, and a fun one to pick is oil imports.  Let's look at oil shipments.

At 10 million barrels a day, our economy requires about 70 oil tankers per week, and a week is the typical usage cycle for oil consumption.  So we can directly compute the information rate for oil tankers: how many unique flow variations can we describe with 70 oil tankers per week?  Using a little binary log arithmetic, 70 oil tankers per week is about a 6 bit computer computing once per week, or 6 baud per week.  In other words, a properly constructed six bit computer would generate as accurate an estimate of oil tanker flow as our ports support.  The quantization error is 1/70, or about 1.4%.
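
As a quick sanity check on that arithmetic, here is a minimal sketch, assuming roughly one million barrels per tanker (my assumption, implied by the 70-tanker figure, not stated above):

```python
import math

# Rough arithmetic behind the 6-bit figure.
# Assumption: a large crude tanker carries on the order of 1 million barrels.
barrels_per_day = 10_000_000
barrels_per_tanker = 1_000_000                                 # assumed capacity
tankers_per_week = barrels_per_day * 7 / barrels_per_tanker    # ~70

bits_per_week = math.log2(tankers_per_week)    # ~6.1 bits, i.e. ~6 baud per week
quantization_error = 1 / tankers_per_week      # ~1.4%

print(f"{tankers_per_week:.0f} tankers/week -> {bits_per_week:.1f} bits/week, "
      f"quantization error {quantization_error:.1%}")
```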

Now, let us assume that oil price is a fair measure of equilibrium (oil prices are a drunkard's walk). We would expect oil prices to generate the same information rate; that is, oil price variation over time should generate novel information at the rate of 6 baud per week with a quantization error of about 1.4%.  Well, we can look directly at oil prices and construct the minimum baud rate needed to recreate the price stream within that accuracy.  When I say oil is choppy, I mean that oil prices do not generate information variation equal to the potential variation in flow.  Currently oil prices generate about 4 baud per week in novel information. (I eyeballed it.)

If prices are less accurate than oil flow capability, what does that mean?  It means the oil ports in the country are now overbuilt; we have more capacity than we are using.

If you are working for an oil company, or for Wal Mart, you will want a direct method to compute entropy from a data series. I will now invent one on the fly.

Take the data series as a graph y(t). Now limit the Y axis to 1024 segments and the time axis to 1024 segments (for example). The y(t) is now on graph paper, essentially, and the goal is to reduce the number of segments in Y and in t, in alternation, computing the variance between the original series and the quantized series as you go. Continue reducing segments until the quantization error starts to exceed your limits. Then take the binary log of the final number of Y segments, divide by the length of a time segment, and you have the baud rate.
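
Here is a minimal sketch of that procedure, assuming NumPy, a weekly-sampled series, and an error limit of 1.5% of the data's range; the halving schedule and the range normalization are my own choices, not something specified above:

```python
import numpy as np

def quantize(y, n_y, n_t):
    """Snap y onto an n_t-by-n_y grid: average y within each of n_t time
    bins, then round those averages to n_y evenly spaced levels."""
    coarse = np.empty(len(y), dtype=float)
    for idx in np.array_split(np.arange(len(y)), n_t):
        coarse[idx] = y[idx].mean()                  # time-axis quantization
    levels = np.linspace(y.min(), y.max(), n_y)      # Y-axis quantization
    return levels[np.abs(coarse[:, None] - levels[None, :]).argmin(axis=1)]

def baud_rate(y, dt_weeks=1.0, max_error=0.015, start=1024):
    """Coarsen the grid in Y and t alternately until the relative RMS error
    exceeds max_error; return bits per week for the last acceptable grid."""
    y = np.asarray(y, dtype=float)
    n_y = n_t = min(start, len(y))
    shrink_y = True
    while n_y > 2 and n_t > 2:
        trial_y, trial_t = (n_y // 2, n_t) if shrink_y else (n_y, n_t // 2)
        err = np.sqrt(np.mean((y - quantize(y, trial_y, trial_t)) ** 2))
        if err / (y.max() - y.min()) > max_error:    # quantization error too big
            break
        n_y, n_t = trial_y, trial_t
        shrink_y = not shrink_y
    seg_len_weeks = len(y) * dt_weeks / n_t          # duration of one t segment
    return np.log2(n_y) / seg_len_weeks              # baud rate in bits per week
```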

If you are at Wal Mart, measure the potential entropy of underwear pack arrivals and compare it to the entropy of the underwear price series. You will be amazed.
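
Using the baud_rate sketch above, the comparison might look like this; load_weekly_series is a placeholder for whatever pulls the data out of your own logs:

```python
# Hypothetical weekly series: packs arriving per week, and the shelf price.
arrivals = load_weekly_series("underwear_arrivals")   # placeholder loader
prices   = load_weekly_series("underwear_prices")     # placeholder loader

flow_baud  = np.log2(arrivals.mean())          # potential information in the flow
price_baud = baud_rate(prices, dt_weeks=1.0)   # information the prices actually carry

print(f"flow supports ~{flow_baud:.1f} baud/week, prices carry ~{price_baud:.1f}")
```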

I might mention that the reader should see the similarity between this and the Levine production chains, where we substitute quantization error for Levine reliability.

What about the drunkard's walk assumption? The point of the QM theory is that we quantize to make a drunkard's walk, and in this case, when the real economy scales the graph paper in Y and T, it can use variable quant sizes, like a Huffman encoder. The Levine production chains are optimal Huffman encoders in the time dependent case.
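
To make the Huffman analogy concrete, here is a toy sketch; the move categories and their frequencies are invented purely for illustration. When most weeks are 'flat', a variable-length code spends far fewer bits per week than a fixed grid does:

```python
import heapq, math
from collections import Counter

def huffman_lengths(counts):
    """Return the code length of each symbol in a Huffman code built from counts."""
    heap = [(c, i, (s,)) for i, (s, c) in enumerate(counts.items())]
    heapq.heapify(heap)
    lengths = dict.fromkeys(counts, 0)
    tie = len(heap)
    while len(heap) > 1:
        c1, _, syms1 = heapq.heappop(heap)
        c2, _, syms2 = heapq.heappop(heap)
        for s in syms1 + syms2:
            lengths[s] += 1              # each merge pushes its symbols one level deeper
        heapq.heappush(heap, (c1 + c2, tie, syms1 + syms2))
        tie += 1
    return lengths

# Hypothetical weekly price moves, bucketed into coarse quant sizes.
moves = ["flat"] * 50 + ["small up"] * 20 + ["small down"] * 20 \
      + ["big up"] * 5 + ["big down"] * 5
counts = Counter(moves)
lengths = huffman_lengths(counts)
avg_bits = sum(counts[s] * lengths[s] for s in counts) / len(moves)
fixed_bits = math.ceil(math.log2(len(counts)))
print(f"Huffman code: {avg_bits:.2f} bits/week vs. fixed grid: {fixed_bits} bits/week")
```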
