Thursday, January 10, 2019

Alphabetize by arithmetic

I take input words and output them sorted, comparing them as unsigned long values. I used the bubble sort routine below:

// A function to implement bubble sort on the packed values.
// swap() is not shown in this post; it is called with pointers to the two long members.
void bubble(Val *arr, int n)
{
   int i, j;
   for (i = 0; i < n-1; i++) {
       //printf("%d \n",i);
       //parray(arr,n);
       // Last i elements are already in place
       for (j = 0; j < n-i-1; j++) {
           //printf("%lx \n",arr[j].l - arr[j+1].l);
           if (arr[j].l > arr[j+1].l)
               swap(&arr[j].l, &arr[j+1].l);
       }
       //parray(arr,n);
   }
}


My console output with the prints. The integer values (printed in hex) are the long member of the union:
typedef union{ unsigned long l; char txt[8];} Val;

[matt@localhost snips]$ ./a.out
sort 7 words. Print out includes symbol length

before
:apples..6170706c65730000  6
:orange..6f72616e67650000  6
:cherry..6368657272790000  6
:banana..62616e616e610000  6
:fruit...6672756974000000  5
:pears...7065617273000000  5
:lemons..6c656d6f6e730000  6

after
:apples..6170706c65730000  6
:banana..62616e616e610000  6
:cherry..6368657272790000  6
:fruit...6672756974000000  5
:lemons..6c656d6f6e730000  6
:orange..6f72616e67650000  6
:pears...7065617273000000  5
[matt@localhost snips]$

I have to build my own support for packed char, which is not yet recognized as a data type.  I can do this with double longs, and get a lower order packed char, for a 16 byte symbol.  One can construct the search path length if one recognizes the sorting algorithm in the symbol table.  One can essentially "Huffman" code your symbols by using common prefixes to shunt groups of symbols down a particular path.  If a byte has bits, then a word has bytes and a long has packed chars.
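To make the 16 byte case concrete, here is a minimal sketch, assuming a symbol is held as two 64-bit words with the first eight characters in the high word and the next eight in the low word. The names Sym16 and sym16_cmp are made up for illustration, and uint64_t stands in for unsigned long so the width is fixed.

```c
#include <stdint.h>

/* Hypothetical 16-character packed symbol: hi holds chars 0..7,
   lo holds chars 8..15, each with its first char in the top byte. */
typedef struct {
    uint64_t hi;
    uint64_t lo;
} Sym16;

/* Compare two symbols by arithmetic only: the high word decides,
   and the low word breaks ties. Returns -1, 0 or 1. */
int sym16_cmp(Sym16 a, Sym16 b)
{
    if (a.hi != b.hi)
        return a.hi < b.hi ? -1 : 1;
    if (a.lo != b.lo)
        return a.lo < b.lo ? -1 : 1;
    return 0;
}
```

Symbols that share their first eight characters have equal high words, so only those ties ever reach the low word comparison.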

My plan is to make a complete symbol table that works on this principle, without overwhelming the package with spaghetti.  Mainly I need a good packing routine to fit the chars in the long where everyone expects them, but that is not how we arrange text today.  Today, in the computer, the most significant byte of plain text somehow ends up in the opposite position from the most significant bit of an int.
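A packing routine along these lines could look like the sketch below. It assumes the first character goes in the most significant byte and unused low bytes stay zero, which is the layout the hex values in the console output show. The signature of packchar is an assumption, and uint64_t stands in for unsigned long.

```c
#include <stddef.h>
#include <stdint.h>

/* Pack up to 8 characters of s into a 64-bit integer.
   The first character lands in the most significant byte;
   any characters past the eighth are ignored. */
uint64_t packchar(const char *s)
{
    uint64_t v = 0;
    size_t i;

    for (i = 0; i < 8 && s[i] != '\0'; i++)
        v |= (uint64_t)(unsigned char)s[i] << (56 - 8 * i);
    return v;
}
```

With this layout, packchar("apples") gives 0x6170706c65730000, the same value shown in the console output, and comparing two packed values as integers agrees with alphabetical order for words of up to 8 characters.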

This will not work: printf("%s\n",pchars[1].txt);

The chars need to be shifted left and reversed for the arithmetic sort to work, so printf sees a zero where it expects the beginning of the word.  We, the computer industry, frigged this up and it may not be repairable.  There is a rule we neglected:  the data 'significance' and the 'address count' have to count together, i and log(i). If log(i) is considered as an address, able to count down to the bit level, that address will not increment smoothly for all data types compared to what humans do in a left to right manner.
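One way around the printf problem is to unpack by shifting instead of reading the bytes in memory order. This is a sketch only: the signatures of unpackchar and print8 are assumptions, and it relies on the first character being in the most significant byte.

```c
#include <stdint.h>
#include <stdio.h>

/* Copy the 8 packed bytes, most significant first, into out
   and terminate the string. out must hold at least 9 bytes. */
void unpackchar(uint64_t v, char out[9])
{
    int i;

    for (i = 0; i < 8; i++)
        out[i] = (char)((v >> (56 - 8 * i)) & 0xff);
    out[8] = '\0';
}

/* Print a packed word as ordinary text, one word per line. */
void print8(uint64_t v)
{
    char buf[9];

    unpackchar(v, buf);
    printf("%s\n", buf);
}
```

Because the shifts work on the integer value, the result does not depend on how the machine lays the long out in memory.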

My goal is to make a 16 character high speed symbol table in under 150 lines of code.  As you can see, I need utility support (print8, packchar, stuff like that) to convert from strings.  Once the symbols are in the packed char table, lookup is very fast, and controllable. We need that in the snippets: we want to load a standard 16 char table into our snippets and completely forget about it.
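One way the lookup could go, as a sketch only: since the packed table is sorted, a binary search on the integer keys finds a symbol in about log2(n) comparisons. Binary search is my choice for the illustration, not a settled design, and the function name lookup and its signature are assumptions. It uses single 8 byte keys to stay short.

```c
#include <stddef.h>
#include <stdint.h>

/* Binary search over a table of packed keys sorted ascending.
   Returns the index of key, or -1 if it is not in the table. */
long lookup(const uint64_t *table, size_t n, uint64_t key)
{
    size_t lo = 0, hi = n;          /* search the half-open range [lo, hi) */

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;

        if (table[mid] == key)
            return (long)mid;
        if (table[mid] < key)
            lo = mid + 1;           /* key can only be in the upper half */
        else
            hi = mid;               /* key can only be in the lower half */
    }
    return -1;
}
```

Each probe is one integer compare, with no character by character string comparison.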

