Saturday, May 8, 2010

The database plan

The database picks up words from the user. It searches hordes of semi-organized lists of plain metadata text. The metadata text follows one simple rule: there should be a group of columns that identify match codes, of short length and in unknown order. I query common mixes of these narrower codes against known data series codes, or known classes of codes, pointing to a much smaller set of possibilities.

In the R environment, lists of known series can be associated with user-defined words.
It is all about entropy, maximizing a limited channel: as the database goes from general to specific, it encodes information with shorter code lengths for the more common terms, leaving longer codes for the fewer, more specific ones.
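
To make the code-length idea concrete, here is a minimal sketch in Python. It assumes the ideal (Shannon) code length of -log2(p) bits for a term seen with probability p. The term list and its frequencies are made up for illustration and are not taken from the actual database.

```python
import math
from collections import Counter

def ideal_code_lengths(terms):
    """Return the ideal (Shannon) code length in bits for each distinct term."""
    counts = Counter(terms)
    total = sum(counts.values())
    # A term seen with probability p ideally costs -log2(p) bits,
    # so common terms get short codes and rare terms get long ones.
    return {term: -math.log2(n / total) for term, n in counts.items()}

# Hypothetical metadata terms; the frequencies are invented for this example.
terms = ["price"] * 4 + ["index"] * 2 + ["urban"] + ["apparel"]
lengths = ideal_code_lengths(terms)

for term, bits in sorted(lengths.items(), key=lambda kv: kv[1]):
    print(f"{term}: {bits:.1f} bits")
```

The most common term here, "price", costs the fewest bits, and the two rare terms cost the most.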

So, in the series id approximation, using SQL views, the engine takes various combinations of the shorter codes and runs approximate SQL IN and SQL LIKE matches against the known list of codes. In R the user isolates groups and names them for easy reference.
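
A rough sketch of that matching step, run through Python's built-in sqlite3 module so that it is self-contained. Everything here is an assumption made for illustration: the table names (known_series, meta), the view name (candidate_patterns), the column layout, and the series ids themselves are invented, and the real engine may glue the codes together differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical list of known series ids.
    CREATE TABLE known_series (series_id TEXT PRIMARY KEY);
    INSERT INTO known_series VALUES
        ('CPU00A1'), ('CPS00A1'), ('CPU01B2'), ('LFS00U3'), ('XXU00A1');

    -- Hypothetical metadata rows holding short match codes.
    CREATE TABLE meta (survey TEXT, seasonal TEXT, item TEXT, title TEXT);
    INSERT INTO meta VALUES
        ('CP', 'U', 'A1', 'all items'),
        ('LF', 'S', 'U3', 'jobless rate');

    -- The view glues the short codes into a LIKE pattern;
    -- '%' stands in for the part of the id that is not known.
    CREATE VIEW candidate_patterns AS
        SELECT title, survey || seasonal || '%' || item AS pattern
        FROM meta;
""")

# LIKE: approximate match of each pattern against the known ids.
like_matches = conn.execute("""
    SELECT p.title, k.series_id
    FROM candidate_patterns AS p
    JOIN known_series AS k ON k.series_id LIKE p.pattern
    ORDER BY k.series_id
""").fetchall()

# IN: keep only the ids whose leading code appears in the metadata.
in_matches = [row[0] for row in conn.execute("""
    SELECT series_id FROM known_series
    WHERE substr(series_id, 1, 2) IN (SELECT survey FROM meta)
    ORDER BY series_id
""")]

print(like_matches)
print(in_matches)
```

The IN query is the coarse cut by class of code, and the LIKE join is the narrower cut that points at individual series.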

So, the user searches through large tables of text, trying to match the keywords typed in. The different views of these tables of text narrow down the search under the mild assumption of increasing accuracy in the left columns. The user can match these tables against actual series id values, obtaining tables of those.
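
One possible reading of the left-column assumption, sketched in Python: a keyword hit counts for more the further left the column it lands in. The function name rank_rows, the weighting scheme, and the sample rows are all hypothetical and only show the shape of the idea.

```python
def rank_rows(rows, keywords):
    """Rank rows of text by keyword hits, weighting left columns more heavily."""
    words = [w.lower() for w in keywords]
    scored = []
    for row in rows:
        n = len(row)
        score = 0
        for i, cell in enumerate(row):
            text = cell.lower()
            hits = sum(1 for w in words if w in text)
            # Column 0 gets weight n, the last column gets weight 1.
            score += hits * (n - i)
        if score > 0:
            scored.append((score, row))
    # Highest score first; rows with no hit at all were already dropped.
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [row for _, row in scored]

# Invented rows: series id, then broad class, then specific description.
rows = [
    ("CPU00A1", "consumer prices", "all items, urban"),
    ("LFS00U3", "labor force", "jobless rate"),
    ("CPU01B2", "consumer prices", "apparel, urban"),
]
top = rank_rows(rows, ["prices", "apparel"])
print(top)
```

The first column of each surviving row can then be matched against the actual series id values.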
