Wednesday, November 30, 2011

The parenthesis has lost its meaning to me

Blocks of graph with a potential name:
Name(some nested block)

The parenthesis goes away in nested order, replaced by the forward pointer system. So we are left with this grammar, blocking some text into a group and potentially giving it a name. Any reason to do this? I say no, so explicit blocks of graph are not in the current version of nested order. Blocking is natural, each link having the potential of a forward pointer.

Done. That gets my TE parser down to 30 lines: open graph, append graph and close_update graph. I make some of these calls depending upon the operator. The results are perfectly formed nested graphs based on the grammar, embedded in the order I make these calls.

Syntax errors in this version will crash the machine.

Microsoft safe routine

They are not too safe. Microsoft writes to my memory in the get_s (get string) call; they think it is safe to write to the user's memory. Microsoft!!! Can we get back to standard lib, please?

Tuesday, November 29, 2011

Command line tested, put to bed

It obeys the grammar, taking a stream of TE text and mapping it to the standard nested format, into a table store ready for execution.

No syntax checking, and gobbledygook will crash it, but OK for version 2.0. I have format data streaming through the system, not yet completely checked out. I have the match function, completely untested. It is a stateless 20 lines of code.
SQL direct, within the grammar, works:

_@(ColSchema:(Colname,..),_(select * from some table;))

Real, honest to god typeface explosion.

And, I am using the underscore, _, to get into the machine. Sql interactive is simple:
_select 'hello world';

Two underscores get you farther into the machine, tools and cheats. No underscores means everything you type will be seen as TE.

I am now running schema streams simultaneously with row input, but that still needs testing.

Simple parsing with nested graph format

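A portion of my TE parser:

if (last_op == ',') {
    close_update_graph(other());   /* finish the previous set element */
    new_graph(other());            /* open the next element as a subgraph */
    append_graph(other(), ts);     /* add the current token to it */
    last_op = op;
    continue;
}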

No expression trees visible. When the expression analyzer needs to link up an expression block, it simply closes and updates the subgraph upon which the expression exists. When it wants to start a new expression, creating a subgraph from the parent does the trick. Pointers are maintained to keep sub graphs nested with their parent graphs properly.

Only three operations allow the machine to compile ultra nested search graphs into nested order: making a subgraph on the old, updating and closing a sub graph on the older, and appending to a sub graph. Given the simple four character operator set, the state transitions are simple, so why bother recursing through the syntax when subgraphing does the trick?
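Here is a minimal sketch of how those three operations could maintain the forward pointers. It is not the parser's actual code: the Triple and Graph layouts, the fixed table sizes and the function signatures are all assumptions made for illustration (the real calls take a thread from other()).

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical triple layout: key, link (the operator), and a forward
   pointer holding the row index one past the end of this node's subgraph. */
typedef struct { const char *key; char link; int pointer; } Triple;

#define MAX_ROWS  64
#define MAX_DEPTH 16

typedef struct {
    Triple rows[MAX_ROWS];
    int    count;            /* next free row */
    int    open[MAX_DEPTH];  /* head rows of subgraphs that are still open */
    int    depth;
} Graph;

/* Append one triple; its forward pointer defaults to the next row. */
static int append_graph(Graph *g, const char *key, char link) {
    assert(g != NULL);
    if (g->count >= MAX_ROWS) return -1;
    int row = g->count++;
    g->rows[row].key = key;
    g->rows[row].link = link;
    g->rows[row].pointer = row + 1;
    return row;
}

/* Open a subgraph: append its head node and remember it as open. */
static int new_graph(Graph *g, const char *key, char link) {
    assert(g != NULL);
    if (g->depth >= MAX_DEPTH) return -1;
    int row = append_graph(g, key, link);
    if (row >= 0) g->open[g->depth++] = row;
    return row;
}

/* Close the innermost open subgraph and update its forward pointer so
   that it skips everything appended since the subgraph was opened. */
static int close_update_graph(Graph *g) {
    assert(g != NULL);
    if (g->depth == 0) return -1;
    int row = g->open[--g->depth];
    g->rows[row].pointer = g->count;
    return row;
}
```

Starting from a zeroed Graph, opening key0, opening key1, appending key2, closing, appending key3 and closing again leaves key0 pointing past the whole sequence and key1 pointing past key2, which is the nested order the text describes.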

I will try to get updates out in a few hours. Right now I am banging complex search strings at the parser.

Done

Coding complete for version 2 of the machine. Total lines, including command line, schema, output filtering, output table formatting; about 700 lines.

The following forms work:
@((MySchema:ColName,key1,key2,key3),SomeTripleGraph)

In that case, schema defaulted to the single column search. Schema can expand:

MySchema:(Part1,Part2.Part3),key1,key2...
Where the output picks matches only from column names Part1,Part2..

This works
MySchema:(SomePattern),_(select * from random_table;)

The underscore gets the attention of the local machine, who always takes it directly to SQL.

The only blocking character supported today is the parenthesis. It can be context specific, in the case above acting as a quote to collect an sql string.

Output, input, schema all use the sequence frame concept, and any given segment of triples is a set element of its parent, and blocks out a segment of the parent graph. So nesting is natural, and subgraphs update parent graph pointers when done, automatically. The frames have the start and end for sql selects to run through. They have the accumulated match and schema function. They keep a cursor for popping input triples and writing output triples, with the forward looking pointer updated automatically.

Fully nested and schema directed searches are available at the command line, like I type them here. Output can be directed to the screen or to the result table. Outputs are always well formed triple sequences. Row counting in triples starts at zero, not at one as in SQL.

By nested search I mean:
(key0.(key1,key2,key3),(key4.key5,key6))

But right now it is very inefficient: the select just runs the limits and the matches are filtered out afterward. It needs some thought here, like making much closer integration with sql. When the thing is solid, there is lots of room to play with triggers and gfun calls from within sql. Version three will have ten times the efficiency, in CPU cycles per match found.

Monday, November 28, 2011

Done with schema

Now doing output filtering. This is the problem where the client specifies set/descent ordering in the pattern select, using the comma/and link traversals. Following the pattern means tracking the descents, and rejecting their future matches when they terminate.

Any triple arriving from sqlite will be accepted for output if it is a continuing true result of the continuing descent. It is a true result based on the previous node, the previous matched node, the modifiers and the current accumulated truthfulness. If grammatical descents can guarantee an exit property at the last node, then that helps. The sql select can always report the previous node of any matched node. So G has a lot of tools, enough tools to confuse me. Remember, on output filtering, descent failure means the machine has to tack back to the previous match, then reset the output row pointer, deleting the rejected path. But the triple frames with their pointers make this a snap.

Then there is output destination: spew triples to the console or to any arbitrary table. I might ditch result, and create an output control operator for clients. Make the configuration of output a variable users can adjust.

Anyway, output is now the order of the day, then testing, testing and testing. Mainly large arbitrary square tables, and console sequences of complex TE language. Include pasting very large queries straight from the notepad editor to the G console, just like sqlite3.
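The tack back can be sketched as a mark and rollback on the output row pointer. This is only an illustration under my own assumptions: the Output struct, the mark stack and the names begin_descent, accept_descent and reject_descent are hypothetical, not taken from the G source.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { const char *key; char link; int pointer; } Triple;

#define OUT_ROWS  128
#define MAX_MARKS 32

typedef struct {
    Triple rows[OUT_ROWS];
    int    count;             /* the output row pointer */
    int    marks[MAX_MARKS];  /* output row pointer saved at each descent start */
    int    depth;
} Output;

/* Write one matched triple to the output; pointer defaults to the next row. */
static int emit(Output *o, const char *key, char link) {
    assert(o != NULL);
    if (o->count >= OUT_ROWS) return -1;
    int row = o->count++;
    o->rows[row].key = key;
    o->rows[row].link = link;
    o->rows[row].pointer = row + 1;
    return row;
}

/* Start following a descent: remember where its output begins. */
static int begin_descent(Output *o) {
    assert(o != NULL);
    if (o->depth >= MAX_MARKS) return -1;
    o->marks[o->depth++] = o->count;
    return 0;
}

/* The descent matched: keep its rows and let the first row's forward
   pointer span the whole accepted path. Returns the rows kept. */
static int accept_descent(Output *o) {
    assert(o != NULL);
    if (o->depth == 0) return -1;
    int start = o->marks[--o->depth];
    if (start < o->count) o->rows[start].pointer = o->count;
    return o->count - start;
}

/* The descent failed: reset the output row pointer to the previous
   match, deleting the rejected path. Returns the rows dropped. */
static int reject_descent(Output *o) {
    assert(o != NULL);
    if (o->depth == 0) return -1;
    int start = o->marks[--o->depth];
    int dropped = o->count - start;
    o->count = start;
    return dropped;
}
```

Because a rejected path is always the tail of the output, deleting it is just moving the row pointer back, which is why the frames make this cheap.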

The Grammar we get in version 2.0:
The punctuation: , . ( ) ! :
Keywords are whatever sqlite3 wants them to be. The schema operator applies to sql column names only. But they can be matched with no problem, just dunno what their meaning is so far, except sql column names.

Working on schema

Full nested searches work with the more inefficient general select, not using gfun call backs on rowid. So I am leaving the general select as it is. Command line script improved so a TE expression is in proper nested order in the table before executing. My word lists are getting longer, I have to use spreadsheets to construct them. So I am anxious to work schema, output filtering and pointer arithmetic so I can more easily generate triple files. On schedule, or ahead, for a version 2 of full search and schema operation over square sql tables.

Triggers have been moved outside G and into install. So, using more advanced output techniques has been moved to the sql arena and I experiment with my M4 macro, just calling the G_EXEC to create a trigger.

For the general case of managing output, the assumption is that there exists a triple sequence, including the default sequence, which will process output on a column element by column element basis. So if the search script mentioned a schema, then G has a pointer to the top triple in the schema and can just run the thing for each row received, the schema doing its self directed selects. So, very simple, probably three lines of code:

If this column equals this name then output that column to result as a triple.

So, the forward grammar identifies a schema subsequence. The machine opens a thread and leaves it in the ready but pending state. On each row received, the machine looks up any pending schema, does a swap thread on that schema, loops, and done.

So, the basic column select schema for arbitrary square sql table in version 2 is about 15 lines of code. This simplicity made possible by the staff at Imagisoft, where we imagine imaginary things.
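A sketch of that column select, under stated assumptions: the row arrives as parallel arrays of column names and values (in the real call back these would presumably come from sqlite3_column_name and sqlite3_column_text), and apply_schema with its parameters is a hypothetical name, not the machine's own.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct { const char *key; char link; int pointer; } Triple;

/* For one received row, walk the schema names in order. If this column
   equals this name, output that column to the result as a triple.
   first_row is the result row where this batch starts, so forward
   pointers stay consistent. Returns the number of triples written. */
static int apply_schema(const char *const *schema, int n_schema,
                        const char *const *col_names,
                        const char *const *col_values, int n_cols,
                        Triple *out, int out_cap, int first_row)
{
    assert(out != NULL && out_cap >= 0);
    int written = 0;
    for (int s = 0; s < n_schema; s++) {
        for (int c = 0; c < n_cols; c++) {
            if (strcmp(schema[s], col_names[c]) != 0) continue;
            if (written >= out_cap) return written;
            out[written].key = col_values[c];   /* the column data is the key */
            out[written].link = ':';            /* schema link */
            out[written].pointer = first_row + written + 1;
            written++;
            break;                              /* next schema name */
        }
    }
    return written;
}
```

Schema names with no matching column are skipped silently here; whether the real machine should report them is a design choice this sketch does not settle.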

Sunday, November 27, 2011

The fundamental match loop

The whole point of G is managing the following loop:

// if that thread is ready, run the select sql
thread = other();
if (thread->match_state == 0) {
    triple(SQL_MASTER_SELECT, 0);
    thread->match_state = 1;
}
else { // else that thread runs to its match point
    current_thread->state = 0;
    current_thread = thread;
    for (; loop() != G_DONE; );
}

Two grammar sequences describing nested graphs have match points at some nodes. The one says, 'I'm ready to try an sql select, is the other guy?' If not, it runs his triples through the machine. The presumption in the grammar is that when the graph opens a descending sequence, then all the priors about output formatting and select filtering have been established, as well as stop points.

I think I have an improved master select:
select self.key,self.link,self.pointer from self,other
where self.key == other.key
and self.rowid == SelfRow() and other.rowid == OtherRow();

Use the gfun call backs on rowid compares. I am shooting for this in version 2, a bit of an upgrade. If I get that working, then the gfun can start following descending set/descent subgraphs. That really simplifies output filtering and makes for efficient graph matching.

Pitchfork warning

Prepare for riots in euro collapse, Foreign Office warns
As the Italian government struggled to borrow and Spain considered seeking an international bail-out, British ministers privately warned that the break-up of the euro, once almost unthinkable, is now increasingly plausible.
Diplomats are preparing to help Britons abroad through a banking collapse and even riots arising from the debt crisis.
The Treasury confirmed earlier this month that contingency planning for a collapse is now under way.
A senior minister has now revealed the extent of the Government’s concern, saying that Britain is now planning on the basis that a euro collapse is now just a matter of time.
“It’s in our interests that they keep playing for time because that gives us more time to prepare,” the minister told the Daily Telegraph.

G predicates in the machine

I guess we are supposed to use the word predicate for what the machine thinks is a link. Never mind, all triple grammar has the same format: key,link,pointer. The triple machine can process any defined triple properly. Some of the triple predicates (links) are operators which the machine handles internally, and some are defined for activating match, format or select functions. Anyway, the internal link values are:
G_EXIT - Nothing left on this triple sequence
G_POP - There are three of these, they pop the next triple from one of three triple tables, all predefined names. The pop operator for a particular table is bound to the table when a triple frame is open on the table and usually not called explicitly.
G_CALL - Call the triple sequence identified by the row in key on the current table, if one knows what one is doing
G_SWAP - Pop from the other table.
Then we have the internals dealing with sql statements, mostly variations on the sqlite3 API set.
G_EXEC - Directly execute the key text as an sql
G_SQL - Step the sql statement in key. The output passes thru the matcher, formatter.
G_CONFIG - Install an sql sequence to the operator specified in key. Included is the opportunity to specify the bind parameters from the two open triple frames, like start, end, match function. This is a multi-step process and violates G grammar.
G_DUP - Make a copy of the operator installed in key and run it.
G_SCRIPT - Recover the installed script for the operator in key from sqlite3, and print it

These are the ones likely to survive version 2.0.

How's my line count?
I have the triple machine solid at 230 lines, and the general thread and grammar handler at 250 lines.  I think I can get version 2.0 in under a total of 520 lines!

How's the software going, you ask?

I have this working (in the lab version):
_@( (k1,k2,k3),((w1,w2,w3),(x4,x5,x6)))

All the parentheses open and close properly with no lost data. In this case, the machine starts left and opens a parenth. It finds the opposing graph is idle, so it points the triple machine to the other graph. The parenth is popped off, and g finds the opposing graph is now ready. The general select and match returns matches between the k keywords and the w keywords. The parenth on the right closes at SQLITE_DONE. The process repeats, g finding the second parenth and opening that to continue, now finding the x word matches. It works because sqls are not nesting internally, so G can easily update the thread row pointers and skip past the completed select blocks.

The one I am working on is this:
_@((k1,k2,k3),(w1,(x2,x3),w2))

In this one, the parentheses nest within a select. So I think I might need the gfun call back to determine proper rowid from within the SQL statement. Thus by managing rowid, the internal sql statement executing is updated thru the nested queries. Dunno all the details yet.

What about output?
The output, without modifying the match function, is the default match on keyword and link, I would think. But any open triple frame that is running has its accumulated match, collect and format operators. They are light weight operators, like NOT and SCHEMA, and would have preceded the open frame, so g knows what to do.
What is the grammar on schema then?

MySchema:(schema pattern),(keyword list)

That is, the schema is applied to each element in the remaining set of elements. So from the machine's point of view, the script:
MySchema:(schema pattern) is a perfectly valid form, though it encloses no select function and has no effect. There is no way for G to look back and find the schema, even if referenced by the word 'MySchema'. G is a forward looking grammar, no backward references.

How about proprietary tables with an unknown number of columns, not nested triples?

One simple solution for version 2.0 is to install a different select, one that uses compound operators to pick off the columns, one by one, and presents them as triples to G. Use install to change the operator on the general select. Automatic set up and install for proprietary tables is still under design, dunno the details.

Swapping threads, even across tables, here is the code:
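void swap() {
    TRIPLE_THREAD *thread;
    thread = current_thread;            /* remember who asked for the swap */
    if (thread != other_thread)
        current_thread = other_thread;
    else
        current_thread = self_thread;
    loop();                             /* run the opposing thread to its match point */
    current_thread = thread;            /* then return to the caller's thread */
}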

The machine just alternates between the two nested graphs. All the machine knows is that each graph has match points which it can recognize. A match request by the current thread causes the swap. It is the grammar of matching, typeface explosion, that must be consistent in the use of match points.

What is that thing loop? It just runs the current triple thread through the triple machine, the grammar of triples is self directed. Each triple is popped off and it may open a new select thread or modify the match, collect, and format function. All of the internal functions of the machine are available as triple predicates. For example, just use the G_SQL operator to execute the general sql statement in a triple: <0> Make that the only triple in your file. G will just execute the sql statement generating tons of output, which it runs through the most recently specified match,collect and format modifiers. Add a schema if you want, how would this look?

MyProprietaryData =
MySchema:(colname1,columnname4.columnname8,columnsname0),

then somehow add in your sql statement as text, specify the G fun and close the grammar.

So there are going to be some ways to kick into the machine internal function, g literal triples made available to the grammar. But the point is, even if we constructed the triple by hand, the machine will gladly accept any proprietary sql statement to generate potential match keys.

Saturday, November 26, 2011

The general match function

select 'Preliminary match',self.rowid,self.key,self.link
from self,other where
(self.rowid >= self.start)
and (self.rowid < self.end) and (other.rowid >= other.start)
and (other.rowid < other.end) and (self.link == other.link) and (self.key == other.key)
union select 'New subquery',... from self,other where (self.rowid >= self.start)
and (self.rowid < self.end) and (other.rowid >= other.start)
and (other.rowid < other.end)
and ((self.link == '(' ) or (other.link == '('));

The machine executes this very inefficient general match function when both tables are triple format. G manages and filters the row output according to lighter weight match and select modifiers. The beginning and end points are known because both tables are nested triples with the forward pointers. The inefficiency comes when matches for terminated descents continue to report. But the counters move forward, so the machine can keep the accumulated match results and knows when to reject failed descents.

When the 'New subquery' is returned, G will simply open a frame and repeat the same query, allowing it to terminate before re-stepping the prior query. This same format can also report any lighter weight link operators that appear.

What are the row limits on the subquery?
I know that because I have the node of the last valid match (we filter extraneous matches). So the table opening the new query and the opposing table both have updated sub segment pointers. So the machine need merely launch the new frames with a duplicated copy of the master select. The binding of the limits is automatic, by reference to the frame.
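The row limits fall straight out of the forward pointer. A small sketch, assuming a hypothetical Frame of [start, end) rows and a sub_frame helper of my own naming; in the real machine the two limits would then be bound to the duplicated master select, for instance with sqlite3_bind_int.

```c
#include <assert.h>
#include <stddef.h>

typedef struct { const char *key; char link; int pointer; } Triple;

/* Hypothetical frame: the half open row range [start, end) a select runs through. */
typedef struct { int start; int end; } Frame;

/* The sub segment owned by the node at `row`: every row after the node,
   up to but not including its forward pointer, clipped to the parent frame. */
static Frame sub_frame(const Triple *table, int row, Frame parent) {
    assert(table != NULL);
    assert(row >= parent.start && row < parent.end);
    Frame f;
    f.start = row + 1;
    f.end = table[row].pointer < parent.end ? table[row].pointer : parent.end;
    if (f.end < f.start) f.end = f.start;   /* a leaf owns an empty segment */
    return f;
}
```

For the graph (k0.(k1,k2),k3) stored as four rows with pointers 3, 2, 3 and 4, the node k0 in parent frame [0,4) owns rows [1,3), and the leaf k1 owns nothing.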

Triple form makes it easy because the sql can actually detect something other than canonical set and descent operators. The forward looking grammar means the machine can accumulate the match function defined by lighter weight operators.


Later, the sql sequences can get a bit smarter and faster, but for now this works in version 2.

G machine changes

1) The byte format for the command line input won't work anymore unless I put the command line into a table store as triples. I didn't like it, but we do not have byte mode, so until then, command lines must go through a triple store.

2) Threading triple sequences with frames works just fine.

3) Working with schema mainly, just now testing the default schema for an unknown target table and a specified schema.

So, yes, this thing should have the basic operators ready in a month for regular searching of square tables.

Friday, November 25, 2011

G grammar and SQL continued

Using the new framing format, we now have linked nests of frames, each frame representing a comma open description of the current pointer and overload for one of the two tables, self and other.

Now, G machine wants to make an assumption. At any given point, when self and other have their top frames in canonical form (comma and dot without overload) then G should execute the general sql: select where keywords match, links maybe match, and no overload operator in either link!

That last condition might come from a compound operator; it interrupts G and notifies G that a light weight operator has intruded. G machine, at the interruption, has all pointers up to date in frames for both self and other. G can restart the query any time after the new operator has run through the triple machine.

So, down in the middle of a 10,000 word match, the client wants a three word sequence match and wants to suddenly deliver it in some special format, so he introduces a schema operator. Not a problem, G should just kick in a new frame and execute the traversals associated with the new schema. When the new frame is exhausted, close that schema frame and restart the top two frames for self and other.

Thusly, we can relieve the grammaticists from having to develop context specific grammar, we just mainly have to agree on some customary results when we mix and entangle light weight operators.

The default treatment of an arbitrary sql table

The requirement of the _@ triple is that there exists at least one table that is in graph triple format. So, I think the assumption here is that the left argument is at least triple format:

_@(TripleStore,ArbitraryTable)

The default graph for the arbitrary table would be: row1.(col1:col2:::),row2.(col1:col2:::) and assumed rectangular, so it's a regular linear structured graph. I think the default might know the keywords col1,col2,.. or maybe not.

But g has at least one conformal graph. Generally the jumping off point is either the command line or the config table, so directing g to the appropriate triple store is manageable.

But the right argument is assumed to be, at least, a linear table of key words by default, with a discoverable column name schema and value type. How does G know? Well for now, it is looking for the first three columns to have names key,link,pointer. That is a match for my g machine.

Without a triple in the first three columns, the client gets data as linear keyword matches or the client imposes a schema. If the second argument does have three triple columns, then the machine still makes no assumptions about the remaining columns. It treats them as a descending set attached to the row triple. In other words, I think the machine assumes row set, column descent when no information is available, but row triple, column set when the three triple columns are there. Not sure.
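That test on the first three column names is tiny. A sketch, assuming the names have already been collected into an array (for example from sqlite3_column_name on a prepared select); is_triple_table and extra_columns are hypothetical helper names, and the exact, case sensitive compare is my simplification.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Returns 1 when the first three columns are named key, link, pointer. */
static int is_triple_table(const char *const *col_names, int n_cols) {
    static const char *const want[3] = { "key", "link", "pointer" };
    assert(col_names != NULL || n_cols == 0);
    if (n_cols < 3) return 0;
    for (int i = 0; i < 3; i++)
        if (strcmp(col_names[i], want[i]) != 0) return 0;
    return 1;
}

/* Columns the machine makes no assumptions about: everything after the
   triple when one is present, otherwise the whole row. */
static int extra_columns(const char *const *col_names, int n_cols) {
    return is_triple_table(col_names, n_cols) ? n_cols - 3 : n_cols;
}
```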

So that leaves us with triple threads

The machine, in its basic form, pops triples from a descending graph stored in an sql table. The execution unit is now rigged to support, essentially, unlimited threads ongoing from multiple tables. The machine, in the lab, can stack executable triple threads, maintain the pointers, and interleave them. Each thread frame blocks out a sub convolution of a g graph. So when the machine pops a new, light weight operator, the machine initializes a frame so it may repeatedly run its descending thread with an overload.

That's mechanical, I could do it with steel and wood. All the machine knows is relative light weightness, and that operators, including any small c helpers, are installed. It is the small c helpers I have been trying to avoid, but they are needed for overload.

But the multiple thread concept is helping out with general matches because blocking characters can establish a key word thread with begin pointers and known terminators. So we can design general SQL selects against preestablished frames at run time. The simple expression: _@((k1,k2,k3),SomeTripleStore) will automatically set up a frame of the key words. The select where keywords match will have its row positions bound on start, to the sql, or have the frame established for calls back from within sql. The general frame, a convolvable sub graph of any g graph, is built in, a given. It helps a lot with pointer arithmetic on output. It is established by default with the parentheses, and implicitly by light weight, installed operators. So, the parenthesis is the default blocking character for now: it starts a frame with '(' and deletes it with ')'. Overloading is inherited by sub frames. Everything is getting simple here at Imagisoft.

Here is my code in the call back for light weight operators:

case '(':
    p = new_triple_thread();
    break;
case ')':
    del_triple_thread();
    break;

See, it is built in, can't move it, the parenthesis has struck a blow for permanence! So any operator can force itself to be lightweight with the parenthesis mark, so let's not define a whole bunch of operators, ok.
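To show how the frames and the inherited overload might fit together, here is a sketch built on my own assumptions: a fixed depth stack, a single character overload per frame, and a pending slot so that an operator such as the colon applies to the next frame opened. None of these names come from the machine itself.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FRAMES 32

typedef struct { int start_row; char overload; } Frame;

typedef struct {
    Frame frames[MAX_FRAMES];
    int   depth;
    char  pending;   /* overload waiting for the next '(' ; 0 means none */
} FrameStack;

/* Overload in force for the innermost frame, or 0 when there is none. */
static char current_overload(const FrameStack *s) {
    assert(s != NULL);
    return s->depth > 0 ? s->frames[s->depth - 1].overload : 0;
}

/* Feed one operator to the stack. '(' starts a frame, ')' deletes it,
   any other operator becomes the pending overload for the next frame.
   Returns 0 on success, -1 on overflow or an unmatched ')'. */
static int on_operator(FrameStack *s, char op, int row) {
    assert(s != NULL);
    switch (op) {
    case '(':
        if (s->depth >= MAX_FRAMES) return -1;
        s->frames[s->depth].start_row = row;
        /* a sub frame inherits the parent's overload unless one is pending */
        s->frames[s->depth].overload = s->pending ? s->pending : current_overload(s);
        s->pending = 0;
        s->depth++;
        return 0;
    case ')':
        if (s->depth == 0) return -1;
        s->depth--;
        return 0;
    default:
        s->pending = op;
        return 0;
    }
}
```

With a zeroed FrameStack, feeding the operators of (MySchema:(a,(b))) leaves the schema overload in force inside both inner frames and gone again once they close.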

Lightweight operators in grammar, some solutions

We are dealing with client form like:
MySchema:(a,b.d.h,r)

Where the colon is a light weight operator that modifies the standard g graph in the parenthesis. The solution is overload the colon and convert the schema into a convolution:

_@Name:(g_graph) -> Name:_@(g_graph) // How the schema is transformed in g machine

where the colon operator overloads the convolution operator with the schema modifier. Otherwise the schema sequence is run like any other convolution, but any of the graph operators will need to be overloaded, mostly with a default call to the schema c code. So we get a simple solution, the g engine and components remain simple, the schema becomes even lighter weight than convolution.

This works simply in the machine. A light weight operator sets up the entry point to a subsequence and establishes the overload selector. This set up remains. When the selecting or matching requires operation of schema, the overload is set and the already existing schema graph, a subset of the client's search graph, runs as a normal subroutine call of triples.

Thus, the grammaticists, knowing what the engine is likely to do, can be a little aware when they design or proliferate lighter weight operators. Namely how much work and complexity do they want in operator overloads. We start with a general custom, you want schema, then impose them on columns of SQL because that is how g machine is going to be based in version 2.0.

The amount of c code schema needs is really minimal; the work is all in sqlite, as schema makes calls to column_text, column_name, etc.

How does that affect wildcards and schema?
What is: (*:) Match and collect all schema, I would think. This works because the wild card is a stationary keyword, not an operator. But, MySchema:(A,B,*) Hmm.., let's say the * means collect one time, then this collects the data by name until B, then it just grabs the next column. That implies a partial ordering we expect humans to do. Dunno, the industry will migrate toward a stable view of all this; but how soon?

Version 2.0, the target
In this version we want clients to type in expressive searches and get results. It should support comma, dot, not, colon, asterisk, and parenthesis, and general searches of the form:
_@(Schema:(Name.(Height,Hair)),SquareTable) //matching by schema

or _@((k1,k2,k3),SomeTableStore) // treating the table store as a linear graph of keywords

and things like: _@((k1,*.*),SqlStoreViewedAsaGraph)

All of these with not thrown in here and there should work in about a month, and the data targets are all SQL tables, square arbitrary tables or triple tables.

So the schema operator as I see it

It overloads a graph g to match only by schema id, but collect the data, in its default operation. So:
_:(A,B.C,E)

Is, as follows, the null key word and schema link applied to all of the key words in the nested order on output (default). Schemas can nest, why not.

How is it used?

_@(_:(A,B),Somelargetable)

How might one declare a schema on an existing graph?
_:(A,B).(Graph)

Some of this is immaterial to the machine, as all it wants is to make sure the current schema populates a nested order, so it can descend it predictably as it picks off the columns of SQL.

But this is not difficult. The g fun has been outfitted with a column scanner by name in the call back. So all it really needs is to pick off the schema names in order. Collect the row, and then just execute the schema sequence and let it pick off the columns it wants.

So, I rigged the machine so that the main execution unit is re-entrant both with respect to the c code and with respect to the table. At any given time the call backs can switch the controlling triple store from one of three: the config, self and other stores. Since each is descending, and since the schema is light weight, I can re-run the schema pattern straight from the table it appears in, over and over again. The machine already supports subroutine calls on triple sequences.

So for now, a schema is an executable sequence in a G graph, not a problem, the machine will handle it naturally.

Thursday, November 24, 2011

Simplifying the engine a bit

Clearly the awkwardness of using sql sequences directly into result is overcoming the future of the engine. I propose the following: sql sequences assigned to operators do not write to result, rather they perform a select and deliver the row to the engine, as in:
select key,link,pointer, and other variables where match and rowid conditions are met on other and self;

Deliver the data, and we put the g function in the general call back, and let it reform the result, maintain pointers and write result. Much better control, allows arbitrary square tables with schema, and so forth. I will make that happen right away.

Done, everything now uses call backs, and it dropped another 20 lines of code from the engine, but it expects more c call back code for managing the output flow to result from the ongoing process.

Collecting SQL columns with the schema operator

We want to view large square tables in graph syntax. How do we interpret a table with many rows that can be selected? In typeface explosion, it looks like this query.

@((col1:col3:col8:),SomeTableStore)

Where the col1 name may be either the actual name or the standard alias, who cares. Some table may have 20 columns, who knows. So table stores are comma separated rows of descending columns? Rows are generally treated as a set, but here we are treating columns as a set. I think columns in a row are a bit heavier than comma separated but a bit lighter than descents.

How can we make an embedded sqlite3 generate triples from variably selected columns?
A compound sql selector run as often as the schema demands. Use a single statement:
insert into result select ?,link,pointer
from squaretable
where other match and row conditions meet;

The question mark here is used in sqlite3 to bind literals to variables before running the statement, and is replaced by squaretable.column. So run that statement as often as the graph matcher clicks through the schema operators. That is a combination of some embedded code working in cooperation with an embedded sql. The embedded code knows about both sql columns and graph schema links. Later, in a future version, make the operation even more embedded. For example, just install a call back on column schema, collect the whole row, then apply the schema to it column by column in the engine. I will likely do this, refer to the grammar page in the engine section.

The grammar tells engine developers that the sequence: col1:col2:col3 can be stacked, and run as a subgraph, clicking off triples from the selected column. That sequence is a simple repeated three node descent. The schema operators appear as a sequence of triples from the table other, for example. The G machine need simply step node by node until it gets a descent termination, then overload the schema operator, and re-run the same three nodes repeatedly on each SQLITE_ROW.

Storing text blobs in each column is a great idea, no need to be triple compatible, run as many columns as you like.

What about: col1:*col4:col8   Grab col1 to col4 and col8, right, if the wildcard means match and select. But it doesn't mean collect all schema. But wait, you say, shouldn't that be:  col1:*:col4:col8  Probably, making sure we select the data, not the schema itself. Yes, I think wild cards are stationary wild keywords, with a possible repeat indefinitely link overload.

How about schema as a very lightweight operator, like convolution. It appears in this form:
:(A,B.(C,D),G)  where the schema refers to the whole pattern in the expression. That way we get both structure and name in schemas. So picking off rows of a table: :(col1,col2,col6.col9.col10,col20)


The graph machine will need to know about wild card, repeat indefinitely, and not operators. They are like overloads. But we want the machine to know about these very mechanically. But a general syntax interpreter for graphs is only a few lines of code. So the machine marks the schema start operator and marks the schema end pattern. It clicks through the pattern emitting insert selects to sql. The schema pattern, by default, indicates the nesting on output result. Thus a square table can be picked through by column, assembling an arbitrary nesting pattern.

Semi automated reading with the graph machine

In this application, the client has a full graph machine running under the browser. He has a specific transformation graph, a widget up in the corner, called the reader. When the client wants to read, he drags this widget over the text and the window is instantly replaced with an ontological derivation of the text. The client can just scan through kilobytes of blog text daily, generating and reading the ontological maps. In fact, the machine should run in batch mode, reading your daily blogs for you and generating key word ontologies.

I am reminded of the Steve Martin movie where he was an LA weatherman. Since the weather never changes he reran yesterday's forecast. We could have automated grabs that take out daily ontologies and automatically generate the associated text for automatic blog posting. Approaching the singularity with a sense of humor.

Wednesday, November 23, 2011

Take bookmarks for example

The client has a list of them, and he clicks on them once in a while. When he does, a tiny search graph is launched to collect the key words a node up and a node down at the destination URL. It is a graph that collects the ontology of another graph and returns that as a graph, back to the client. Where to put it? In the client bookmark triple store. So run a utility store that rewrites the bookmark store in nested order, including the newly acquired ontology where not matched. Or use table links to keep scrap book information linked to URLs. Or grab chunks of text and add them to a utility column on your bookmarks, a general text blob per mark, later accessible by the column selection property with an imposed nested schema.

News sites, wiki sites, corporate sites would produce ready made ontologies, and client widgets grab them. Next time through, the client gets a quick pop up of recent possibilities from the client's local store. Clients set slider bars and do click-throughs to keep the ontologies to maximum entropy encoding. If the client ever grabs text from a return site, just link it in as a terminating text blob in some local ontology, in the client database. Quick, decisive personal information processing of the web, as seen thru graph traversal.

Clickable, slideable, dragging rapidly through graphs visually backed up by high speed local sqlite3. Tagging nodes for deselection, moving nodes up or down, making new node sets. Right on our browsers and local db. Take a search graph, your favorite, right there in your browser and drag it into one or two new web sites. See if the new sites know what your search graph knows.

Another example, search graph normalizers. Wiki might give out graph traversals that expand and normalize a random word search for the user. Google does this, it's a snap with graph traversal. Serious searchers and writers download keyword lists, make them available for specialized search expansion or text matching and deselections. The blog industry would boom, specialized blogs naturally developing shared ontology graphs. Writing code becomes a breeze, as the natural caching of the system is almost always on time to grab a software definition, even as you switch languages.

Dealing with large square tables by tagging them with triples

Existing, large application specific tables should be expanded because G machines can make them look like nested order with triple tags.  Then, using case statements, the sql sequence can further direct the match to a nested order selection of one column.  So there is a general form for specifying the nested order of columns, existing tables have interfaces written with case statements, and these sql sequences can overload the normal set and descent operators.  The rows themselves are indexed in nested order with a triple per row.  Ontology tagging makes proprietary tables more popular, and makes it possible to have as many rows and columns in any nested order, post defined on existing tables. So again, we have the grammar issue.  We can force the nested order on rows, or select from an existing nested order.  Operator weightings and name space agreement come up again.

The other thing about proprietaries, they can run their own G machine and overload all the operators, even overload the alias tables, as long as they conform to graph syntax at the match points.

We need the definitive typeface definition that takes us from the bit graph to the world graph, something we can all write to.

Using the schema operator for high speed nesting

One of the things we can make G do is track a schema pattern in setting the pointer variable on output.  I am not sure of the grammar, but I am looking at a pattern like: A:(B:C:,D:E:) a nested schema. From the point of view of G and the gfun, these are simple pointer arithmetic expressions. On write to result, gfun just skips along this version from the 'other' graph, using the set and descent weightings to determine pointer values, the result being a nested G graph of the specified structure.  In scanning either self or other, these schema become skip ahead patterns for sql functions. But again, the grammar confuses me.  Does this mean force result to look like that, or only when 'self' looks like that?  What wild card properties modify this schema?  How does the grammar express higher level precedence between wild cards that supersede this? From the point of view of G, as long as weightings follow the nested requirement, G is never more than a link away from backing up on anything.  So gfun can manage whatever operator confluence the industry dreams up, within reason.

Consider: !A: this would seem to mean, skip entries matching that schema component.  Of course, this implies that A alone would exclude schema? In other words A implies A!: match A except when it is a schema. But from the point of view of gfun trying to manage matches, the not operator is heavier than the schema.  It will have to be that way all the time, to make gfun mechanical. Not matching all schema? :!* or :*! or *!: It is all typeface explosion to us, but it matters for the simple few lines of match and pointer function we want in gfun. It matters in coordinating sql sequences with the state of the two current nodes in self or other.

By the way, I like the string$ attribute, it tells G we are in byte mode and the triple store is a linear array of  non executable bytes.  Hopefully, there will be a compatible regex, or G will become regex compatible, such that the same typeface explosion grammar holds. G just wants to call regex, and be done with it. I go a bit farther here, and assign a quote operator; in quote mode it is a regex traversal.  There is no other solution within G compatible nested stores, that is all G is about, mutual graph traversal.

Table set up in the current G machine

It is now configured to pop triples from either the command line or the config triple store.  The config triple store, like the self and other triple stores, is a view of its real table name, which I disguise with the _alias tag. So self_alias is the real table, self is the prepared view.  Self_alias is a triple store, as is result (the real name) and other_alias.  But the views include rowid from sqlite3; the aliases don't have to declare that.
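
The table layout can be sketched like this, using Python's standard sqlite3 module. The column names key, link and pointer follow the triple format used in the SQL elsewhere on this blog; the column types and the sample triple are assumptions.

```python
import sqlite3

db = sqlite3.connect(":memory:")
for name in ("self", "other", "config"):
    # The real triple store, disguised with the _alias tag.
    db.execute(f"create table {name}_alias (key text, link integer, pointer integer)")
    # The prepared view exposes sqlite3's implicit rowid as an explicit column;
    # the alias table never has to declare it.
    db.execute(
        f"create view {name} as "
        f"select rowid as rowid, key, link, pointer from {name}_alias"
    )
# result is used under its real name.
db.execute("create table result (key text, link integer, pointer integer)")

db.execute("insert into self_alias values ('cats', 44, 2)")
print(db.execute("select rowid, key, link, pointer from self").fetchall())
```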

The normal procedure for me is to clean out config_alias, delete the whole thing, then execute a command that loads config from an install table written in M4 macro.  It is in M4 that I create my sql sequences and assign them to link operators. So, in general operation, there is going to be a standard install, matching sql sequences to standard G link operators.  The G link field is generally readable, but need not be.  There are three or four reserved operator IDs for G itself, so it can do directed jumps, calls and configures.  These are available in M4, but they are funny looking on the screen.

Which brings up the point.  Does the search syntax have run time variable typing? Is there a typing operator that tells G to go ascii to int? Well maybe schema does it, we already have the colon. So maybe _OP:4 tells G to traverse standard internal G operator 4.  G knows about the key word OP as its reserved namespace? Do we have to get into name space? The underscore refers to the local thingy, so we presume precedence dictates the OP key word is locally known only. But then what is the scope of the local _ operators, and so forth.  Stuff out of my league.

Anyway, I will post the m4 macros a little later.

My blog is coming up on ontology searches

I probably cheat, but so do Google and Apple, right? Anyway, I am right on this. All of the ontology work is going to have to deal with sqlite, it is the major open source repository of data.  SPARQL gets it, that was designed around sqlite. So, yes, an open source adaptive ontology layer on sqlite, just about like the source code to the right, will dominate.

Plus I make the further claim for graph traversals.  It is a Turing machine, all of software can be formalized as a mutual graph traversal. So, there is no other choice than to formalize a type face explosion; a graph traversal bytecode, a triple format. Regex, SPARQL have gone too far down that road to be dismissed.  So, grab  the source code, anything that drives the ontology web will look very close to the thing I published.

Asterisk, the wildest of the operators

I propose it.  Look at the following:

A*B

That should pass the G graph every path between any A and any descending B.  Here I have the asterisk as an operator, not a keyword constant.  That is a bit different than the nominal use of it as a wildcard, meaning substitute for filenames, for example. Does this work? What about:
(a,c,d)*(e,f.g)

Can G pass every path between a,c or d and one of the two e or f.g? Probably,  I am looking at it.  The problem with wildcards is that they descend down for a while then find outlets rejected.  But, with nested order, I think the pointer arithmetic can find the top of the rejected path, and restart output from there.

I am going with this for now, it is the most complicated issue, but its solution will lie naturally in the grammar of operators.  Let the asterisk be the wildest of cards. 
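
To make the A*B idea concrete, here is a small sketch in Python. It assumes a nested graph held as (key, children) pairs rather than a triple store, and the function names are hypothetical. Every path that starts at a node keyed A and ends at a descending node keyed B is passed; a branch that never reaches B simply returns, and the walk restarts from the top of that rejected branch.

```python
def descend(node, end, trail):
    """Walk down from a start node, yielding the trail each time `end` is reached."""
    key, children = node
    trail = trail + [key]
    if key == end and len(trail) > 1:
        yield trail
    for child in children:
        # A rejected branch yields nothing; the loop resumes at the next child.
        yield from descend(child, end, trail)

def paths_between(node, start, end):
    """Yield every path from any node keyed `start` to any descending `end`."""
    key, children = node
    if key == start:
        yield from descend(node, end, [])
    for child in children:
        yield from paths_between(child, start, end)

graph = ("root", [("A", [("x", [("B", [])]), ("B", [])]), ("C", [("B", [])])])
print(list(paths_between(graph, "A", "B")))
```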

But wait!! What about the ?, the question mark. In SPARQL, that means collect that path, not just pass it. How wild is the Question?
(A?B)
Collect everything between A and B? Or just collect one link? Do all these wildcards take count fields? (*=4).... If not specified, it collects only one step? Dunno.

Right now my wild card is asterisk and it passes everything to output. I have this installed:

_@(Some Typed In Graph,*)

Which means, for the local engine, convolve the nested graph I type in with the wild card. The result should be set up with all pointers fixed, exact key word and operations duplicated. I should be able to concatenate these operations:
_@( (G1,G2,G2),*)

right about now. The interpretation here is, direct to the local machine (the underscore), do a convolution (the @) with the two conformal G graphs. Here the wild card asterisk is conformal.  The SQL behind it is:
insert into result select key,link,gfun(1,0) from other;

Where self is me, typing in. Notice in this insert I do not need any Sql trigger, because there are no back links to fix up at set endings.  The simplest form of wild card, but things will get a bit more complex soon.
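
That insert can be tried out with a stand-in for gfun registered as an sqlite3 user function. This is a Python sketch, not the engine's C callback. The meaning given to gfun(1,0) here (return the forward pointer for a plain set element, which is just the following row) is an assumption, as is the PointerState class.

```python
import sqlite3

class PointerState:
    """Hypothetical stand-in for gfun's pointer bookkeeping on the result table."""
    def __init__(self):
        self.row = 0

    def gfun(self, op, arg):
        # Assumed meaning of op 1: this node is a plain set element, so its
        # forward pointer is simply the row that follows it in result.
        if op == 1:
            self.row += 1
            return self.row + 1
        return None

db = sqlite3.connect(":memory:")
db.execute("create table other (key text, link text)")
db.execute("create table result (key text, link text, pointer integer)")
db.executemany("insert into other values (?, ?)", [("a", ","), ("b", ","), ("c", ",")])

state = PointerState()
db.create_function("gfun", 2, state.gfun)   # called once per selected row
db.execute("insert into result select key, link, gfun(1,0) from other")
rows = db.execute("select key, link, pointer from result order by rowid").fetchall()
print(rows)
```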

The little things I notice, though.  In this example, the * is still an operator, where is the keyword?  Well SPARQL wants non-null keywords, so is there always a _* unless the key is unspecified?  (The underscore is the SPARQL non-null keyword).  That is OK for G, G is lightweight, anytime you have a nested null keyword, G is too far up to see it.  So the null key still means local to the position on the graph.  When a null key appears at the top of the graph, then that is local to G.

Tuesday, November 22, 2011

Anybody but RINO

And yet Rep. Paul soldiers on, and you know what? As other candidates – Michele Bachmann, Rick Perry,Herman Cain – dash forward hare-like only to stumble or be run over by the next new thing, Paul is the perpetual tortoise in the race, mild-mannered, confident and unwavering in his positions (no flip-flopper he), advancing steadily toward the first real test in the Iowa caucuses six weeks from now. CS Monitor

The conformal SQL query for G graphs

Consider:
k1,k2,k3,k4@list1,list2
Trying to find occurrences of key words in two lists. Notice right away I need no blocking operators; the @ is the lightest weight operator, and will not trounce on the set operators on the left or right.

How should SQL be written against those comma operators?

How about:

insert or replace into result select self.key,Link,0 from self,other
where self.key == other.key and
ResultRow() == result.rowid and
OtherRow() == other.rowid and
SelfRow() == self.rowid;

First, link, that is a field that SQL can mark up, it can actually change the property of the link and mark output as, say, unselected. Second, there needs to be some agreement between the SQL programmer and G about what the OtherRow, SelfRow,.. functions do. The agreement determines the grammar and weighting properties of operators.

Everything in G is based upon a keyword match, even the rejection criteria are based on matches. So, by prior agreement, the SQL and G folks agree on how the NOT operator is managed. Select the matches, but discard them in the result, by pointer adjustment?   There are also the wildcards, which are not really operators, but special keywords.  These wildcards have to have some agreed definition in terms of, like, match but pass or match but collect.  Even more are the wild cards for repeat indefinitely or just match in place.  I can see the repeat indefinitely property to support wild cards, but in version two, the queries can carry counters:
*.count++.count < MAX
Where we require the limit to be explicit. Graphs keep their own variables, who updates their values?  Probably G will do simple expression.

 SQL routines can also use filters on the link value:
where self.link & MASK != NOT;
So we want to understand how properties and operators ids are carried in the link field. The better we agree, the more likely we get some very serious SQL development behind graph query over the web.
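
One possible layout, sketched in Python with sqlite3. The bit assignments are purely an assumption for illustration: the low byte of link carries the operator id and a higher byte carries property flags, one of which is NOT.

```python
import sqlite3

# Hypothetical layout of the link field (not the engine's actual encoding).
OP_MASK = 0x00FF      # low byte: operator id, here just the ASCII code
PROP_MASK = 0xFF00    # high byte: property flags
NOT = 0x0100          # property flag: node is deselected

db = sqlite3.connect(":memory:")
db.execute("create table self (key text, link integer)")
db.executemany(
    "insert into self values (?, ?)",
    [("cats", ord(",")), ("the", ord(",") | NOT), ("couch", ord("."))],
)

# Same shape as the filter above, with the mask and flag bound as parameters.
selected = db.execute(
    "select key, link & ? from self where (link & ?) != ?",
    (OP_MASK, PROP_MASK, NOT),
).fetchall()
print(selected)
```

Keeping properties and operator ids in separate bit ranges lets a where clause test one without disturbing the other.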

SPARQL is almost there; they have mapped as closely as possible to industry standards, but keep pushing them and we get a complete traversal language.


So we have to understand the value ranges of operators and how we can separate properties from function.

Socialist defense network

WASHINGTON -- Republican presidential hopefuls warned in near unanimity against deep cuts in the nation's defense budget Tuesday night, criticizing President Barack Obama in campaign debate but disagreeing over the extent of reductions the Pentagon should absorb as part of an effort to reduce deficits and repair the frail economy.

Read more: http://www.sacbee.com/2011/11/22/4073546/gop-presidential-rivals-to-debate.html#ixzz1eUtNpzlM

Texas Gov. Rick Perry was harshly critical of the magnitude of potential cuts saying the Obama administration's Pentagon chief, Defense Secretary Leon Panetta, had called them irresponsible. "If he's a man of honor he should resign in protest," Perry said.

Read more: http://www.sacbee.com/2011/11/22/4073546/gop-presidential-rivals-to-debate.html#ixzz1eUtUWR24
Perry retreats to big government socialism.

Latest release

I dumped the latest lab version to the right under G engine. It has the 'call' function that allows multi-line SQL.

At this point, the engine itself is simple and nearly bug free. The real thing to work on, over time, is the gfun call back for match and pointer functions. The engine and the command line have enough utility for me, so I am leaving that code for a while. I will be playing with gfun, only as I need something from it.

Having proposed the engine and TE script, I am waiting on the industry: will SPARQL go to a complete graph traversal system?

Sparql does nested graphs

{
  "head": { "vars": [ "book" , "title" ]
  } ,
  "results": { 
    "bindings": [
      {
        "book": { "type": "uri" , "value": "http://example.org/book/book6" } ,
        "title": { "type": "literal" , "value": "Harry Potter and the Half-Bloo..." }
      } ,
      {
        "book": { "type": "uri" , "value": "http://example.org/book/book1" } ,
        "title": { "type": "literal" , "value": "Harry Potter and the Philosopher's Stone" }
      }
    ]
  }
}
This is from an online SPARQL query. Notice the use of the sophisticated tab display format. For this output to be truly G machine compatible, the same blocking operators need to be used on input and output.


So my G machine sort of hit the sweet spot in the SPARQL world, the idea of a SPARQL adaptive layer over SQLITE3, makes great sense. But TE is the more advanced language. The way to go here is to map the SPARQL query to TE and run that through the triple machine.

Executing multiline SQL statements with G operators

Little that we really want to do with SQL databases is done in one statement, so we need a series of statements, tied to a single operator. In G, version 1.x, the standard method is to issue a series of direct SQL statements with the G_SQL operator. Then tie the operator to a call to the start of the sequence. G will run the sequence and return the machine properly.

I can set up series of sql statement in my M4 processor, as part of the install. In M4 it looks like:

START(MyProc);
SQL(`delete from result;');
SQL(`insert into result select * from other;');
SQL(`select key from result where self.key == result.key;');
END;
INSTALL(Operand,MyProc);


These generate the proper inputs for sqlite to load them to config in triple format. As long as the install stays in the config triple store, this should work. The START macro defines the starting row of the procedure. The END macro makes sure the procedure ends with the exit triple, forcing a return. Whenever I need the procedure, I deliver a triple G_CALL to the row. Nesting shouldn't be a problem, the call to the triple machine is re-entrant (or will be soon!!) and G automatically lets the c compiler manage stacking. In G the select output is delivered to the command line. Installing and configuring G is not something we need to be interactive right away, so the M4 does a fine job.
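
The call mechanism can be sketched roughly as follows, in Python with sqlite3. This only illustrates the idea: the config layout (SQL text in key, operator name in link), the operator name G_EXIT and the helpers install and call are assumptions, not the engine's actual format.

```python
import sqlite3

G_SQL = "G_SQL"
G_EXIT = "G_EXIT"    # hypothetical name for the exit triple's operator

def install(db, statements):
    """Append a procedure to config; return its starting row (what START records)."""
    start = None
    for sql in statements:
        cur = db.execute("insert into config (key, link) values (?, ?)", (sql, G_SQL))
        start = start or cur.lastrowid
    # What END appends: the exit triple that forces a return.
    db.execute("insert into config (key, link) values ('', ?)", (G_EXIT,))
    return start

def call(db, row):
    """G_CALL: run statements from `row` until the exit triple is reached."""
    output = []
    while True:
        key, link = db.execute(
            "select key, link from config where rowid = ?", (row,)
        ).fetchone()
        if link == G_EXIT:
            return output
        output.extend(db.execute(key).fetchall())   # select output goes to the caller
        row += 1

db = sqlite3.connect(":memory:")
db.execute("create table config (key text, link text)")
db.execute("create table other (key text)")
db.execute("create table result (key text)")
db.executemany("insert into other values (?)", [("a",), ("b",)])

my_proc = install(db, [
    "delete from result;",
    "insert into result select * from other;",
    "select key from result;",
])
print(call(db, my_proc))
```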

Let's say we had a list of triple stores that needed searching, and we have an operator set up. Then in TE we have:

(G1.G2.G3...) @ _OPS:Oper

Here, the Gi refer to conformal subgraphs. The _ operator tells me this is linked to the local G machine. The OPS schema tells me to look in the operator list and execute. The operator will re-execute on each of the subgraphs Gi. There is no reason to make the Gi conformal, only that the operator understand their format. The parentheses around the list Gi are unnecessary because the convolution operator @ is lightweight. It will not distribute over any set operators, nor will it commute with any descent operators.

Cute little buggers

Harvard calls them Kilobots.

Cheap looking. Buy them by the horde. I want to send these into orbit and have them build a huge telescope mirror to precision.

Monday, November 21, 2011

More on list processing using SQL and the G machine

Consider the case that the client has a blob of words, organized as a linear graph of triples. This blob may have come from a 10,000 word article, or the news, or from anywhere; and it has been pre-processed to remove offending punctuation marks.

Step one, load the word list into result.
Now, to remove commonly used words the SQL does:

insert into result select self.key,NewLink(result.link),result.rowid
from self,other
where self.rowid == NextSet() and Match(self.key == other.key);

where my syntax is likely horribly wrong. But note the use of function calls back to G? These function calls cause G to alter the pointers based upon the match conditions. Here is what I mean.

Start with the original list:

Cats,do,sleep,on,the,couch,mostly

When the match is a deselect the list becomes:

Cats.do,sleep.on.the,couch,mostly

The discarded words are relinked as descents on the kept words, not deleted. The tops of the descents are the remaining set of key words. Sql can continue to select over them, skipping the discarded. The NextSet() function jumps from the top of one set element to the top of the next. Thus, the usual method is to start with the bare bones list, and run it over successive list filters, swapping words out, descending some words. SQL cannot change order until it decides to swap the table back and re-read it with sequential selects, back to result.
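
The relinking can be shown in a few lines of Python, leaving SQL and pointers out of it. The function names deselect, next_set and to_te are made up for the sketch; next_set plays the role of NextSet(), visiting only the top of each set element.

```python
def deselect(words, stop_words):
    """Relink deselected words as descents (.) under the preceding kept word."""
    groups = []                         # each group: [kept word, discarded, ...]
    for word in words:
        if word.lower() in stop_words and groups:
            groups[-1].append(word)     # descend under the last kept word
        else:
            groups.append([word])       # a new top-level set element
    return groups

def next_set(groups):
    """Visit only the top of each set element, skipping the descents."""
    return [group[0] for group in groups]

def to_te(groups):
    """Write the list back out with comma for set and dot for descent."""
    return ",".join(".".join(group) for group in groups)

words = "Cats,do,sleep,on,the,couch,mostly".split(",")
groups = deselect(words, {"do", "on", "the"})
print(to_te(groups))      # Cats.do,sleep.on.the,couch,mostly
print(next_set(groups))   # ['Cats', 'sleep', 'couch', 'mostly']
```

Nothing is thrown away, so a later filter can still reach the discarded words by descending.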

But the proper design of the callback is such that the various weightings of properties are obvious, hence the matches work as planned. There is match but select, match any, match select, match copy to result. If the industry really gets to a common consensus on operators, then the functional results of queries become much faster, and our search sequences work as planned.

For example, the wildcard. If everyone agrees on the raw, most powerful wild card, then G can skip over entire search sequences when it sees one, not even activate SQL.

What is the wild card?
I say the most raw form of it is the * character in common use. Which I previously noted is also the multiply.

So I went ahead and dumped the latest code to the right there. It has the gfun completely separated out; that is the one the industry will debate: what is a good ontology of calls and match standards and pointer formats to make a sound "table", or even codes for gfun operations.

Looking at the future of gfun, then

The ontology framework depends on gfun for pointer control in nested formats. The gfun keeps the finite state of the last match point for the three triple tables. It must deal with descending weights, or perhaps multiple property chains as SQL skips, but its total variable set looks like:

Each of the three tables have something like:
current match rowid, operator, current match mask property
all in the G machine at any given time. So there is a limited matrix of possibilities for how gfun should perform on any given call. The calls to gfun are likely limited to around ten major calls, differentiated by two or three variables. There are a lot more possibilities for manipulating the limited state information, but the limited set of grammars that we want to support makes it very possible for gfun to have a lookup table, something which can be proven, outside of gfun, to be complete.

The goal here, for software, is to break out gfun into its own module. It is a self contained messaging with SQL, stateless, and via a table may contain all the control G needs ( along with sql operator installs). I can get this thing very close to SPARQL because the basic flow in G has all the paths and control loops. Break out the gfun, combined with operator installs, and tackle a SPARQL implementation.

Why not? make it part of the gfun test and expansion plan for the next day or so.

How do G and SQL crawl the stack?

We have a nested graph in triple store, and want to follow a set of links back to the top, maybe adjusting a variable.  How do we do that?  With the gfun it is not a problem:
select AdjustPointer() where PrecedingNode() == rowid;

It's a snap. SQL maintains the triple store, and the two coordinate in rapid forward and backward link skipping. Very high speed SPARQL queries, creating long, structured lists, marked with relevant operators, all in triple format.
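
One way to see why crawling back to the top is cheap in nested order, sketched in Python with sqlite3. The sketch assumes each row's pointer holds the rowid of the first row past that node's block; under that assumption the enclosing nodes of a row are exactly the earlier rows whose pointer reaches beyond it. The table name graph and the helper ancestors are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("create table graph (key text, pointer integer)")
# root(A(x(B)), C) laid out in nested order.
# pointer = rowid of the first row past the node's own block (an assumption).
db.executemany(
    "insert into graph values (?, ?)",
    [("root", 6), ("A", 5), ("x", 5), ("B", 5), ("C", 6)],
)

def ancestors(db, rowid):
    """Rows whose block encloses `rowid`, nearest first."""
    sql = (
        "select rowid, key from graph "
        "where rowid < ? and pointer > ? order by rowid desc"
    )
    return db.execute(sql, (rowid, rowid)).fetchall()

print(ancestors(db, 4))   # the chain of links from B back to the top
```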

SPARQL and the null character

They say every node has a physical presence, bits in the machine. So they named the null character as the underscore, I think, _. This is also an industry standard, macro started it, and it was used in many assembly languages to mean, just like the other but different.
But I love it, and G adopts _ as its inherent G graph. Want to know what G knows? _

While we are at it, let's make the convolution operator the lightest weight set operator; it always floats to the top, and at the top of what G knows, then, is what G itself knows
: _.(self@other) Written in TE. But since @ is lightweight, this becomes
: _.self,_.other  but @ is commutative, so that can be _.other@_.self.
 Since both other and self are conformal nested directed graphs (Version 1.x), either form is equivalent; G is essentially one graph descending into two subgraphs, connected by the convolution operator. That is G (local).  The question is, do you want G fully expanded?

G simply convolves with the one for a while, then with the other, until they have exhausted each other. This simplicity is built into G; that is why it is only 280 lines. Each of the two G graphs mainly carries its own state information.
Does G know Sql?
_select 'Hello world';   works on my G machine. Later that will become: _SQL:select 'Hello world'; when G internals admit of some specialization, then I use the schema. Hey, I am happy with SPARQL, they are getting it.

What happens to files in the G system?

Dunno, let's ask:
@(Myfile?,FILE$)

See, I have used the attribute operator, $, in my second argument. Take the name Myfile, and go match up with any attribute named FILE. If the result is null, then G dispensed with files, or files don't speak TE.

Really, do files stick around? Sure underneath the raw data store engine. Yet, I can see a formatted disk concept of row/column store, a sort of mechanical linked list.

So why are we keeping table stores around?

Because Sql can scan them at zooming speeds, get at the row quickly with indexing, all under the control of a simple language, very powerful.  The only restriction G imposes is that if SQL spews triples to result, then some simple grammar rules must not be violated.  For example, SQL can spew standard G lists, all on its own, they are simple and they are conformal G graphs.

I must have built a SPARQL machine

Or else the SPARQL folks are getting their research from the same place I got it. But G machine does a SPARQL architecture, once mapped properly. The keywords select, where are buried in the operator script and operator properties. They introduce the match operator, ?, which G machine hereby adopts, thank you very much.  They have the schema operator, :. I don't have to go far to pick these operators,  we have adopted an informal consensus in the industry.

SPARQL adopts the temporary solution to regex, the byte level search and replace. They just adopt the standard as an operator. We want to move beyond that, selecting a common operator set regardless of the byte compaction.

Their language makes clear the SQL engine underneath, like the G machine they intend adapters to graft ontology grammar onto table stores.  I see they use sql compound operators to designate alternative outcomes.  The intent of G engine is to remove the traces of tables, let the sql programmers go hog wild underneath, generating sql microcode.  Keep that entirely separate is the function of the new language Typeface Explosion.  Incomprehensible strings of text is the standard in graph traversal, deal with it.

The important thing here is that SPARQL is a target market for G machines, and the G machine is free source code (280 lines of engine), comes with a TE compatible interface, uses sqlite3 and can implement SPARQL.
Now G machine only generates triple formatted graphs in G format, the same format as its input.  So, when the where clause of SPARQL lists a series of matches, in G the where clause becomes:
where descending_matches;  The entire where clause is driven by the Gfun call backs on descending_node and matches. The general form is always:
insert into result select key,link,pointer from self, other where...;


Underneath, the G machine maintains the match rowids at the last match point. Also, the queries are built around the match of one graph to another. So, formally, the G machine and the SQL programmer understand the grammar of start and stop, as built into TE. So underneath, the where clause is even simpler because it ultimately involves an operation specific match on identical triples. So the form is constant, only the match mask is relevant. The matching masks are set by prior agreement in the properties of operators. So any call to gfun of the form:

where matchop(triple:self,triple:other);
can be made to work.  Mainly because a prior call to gfun allowed gfun to pick off the match properties of the controlling operators on either graph.  Seriously, adopt this approach, adopt some form of TE, and then boom times on the web.

Graph convolutions with SPARQL

Look, the w3c group has produced a language and set of operations for ontology networks. They list graph traversal operators. They say things like this:
The result of a query is a solution sequence, corresponding to the ways in which the query's graph pattern matches the data. There may be zero, one or multiple solutions to a query.
See, here it is. Operator mappings make G machine compatible. Their system would come with operator property specifications, and hopefully, G machine can swap out operator tables to be compatible. Make Sqlite3 a SPARQL machine.
Look at their query form:
PREFIX foaf: some URL
SELECT ?name ?mbox
WHERE
{ ?x foaf:name ?name .?x foaf:mbox ?mbox }

See what they intend? They are adapting the underlying sqlite3 system to ontology. Those ? marks look especially like parameter mappings in sqlite3 embedded. But the language should admit of background pointer arithmetic to make it happen. An approximate version of the query in TE:

URL:(foaf.name,mbox).(name$k2,mbox$k1)
Or some approximate variation defined by wise geeks who know formal grammar. My intent was to limit traversal to URL schemes with mail box and name attributes, limiting the search for words k1,k2. But once geeks get it, then it is balls to the walls, Typeface Explosion, the language of ontology!

Heeeere's Newt

Washington (CNN) - Newt Gingrich tops the list in the race for the GOP nomination, according to a new national survey.

And a CNN/ORC International Poll released on the eve of a CNN presidential debate focusing heavily on national security and foreign affairs also indicates Republicans consider the former House Speaker the most qualified GOP candidate to be Commander-in-Chief. CNN
But wait, what's this?
Newt Gingrich was paid $1.6 million to lobby Congress on behalf of Fannie Mae. Here are ten other politicians who later became lobbyists . NRO
Newt's another one of those damned Communist Republicans!

The Socialist Defense Network

Moody’s Investors Service Inc. of New York lowered the outlook of five states with triple-A bond ratings -- Virginia, Maryland, New Mexico, South Carolina and Tennessee -- in July, partly because of their dependence on defense and other U.S. government revenue, said Bob Kurtter, Moody’s managing director of U.S. public finance.
“Any state or local government that has a significant federal presence is asking that question: Are they vulnerable?” Kurtter said.
Perhaps no state has more at stake than Virginia, which hosts the country’s largest naval base in Norfolk and is home to five of the Navy’s 11 aircraft carriers.

But mostly in the sunny warm regions:
Minnesota ranked 51st among the states and the District of Columbia, with defense spending accounting for 1.1 percent of its gross domestic product. The Defense Department spent $2.8 billion in the state in fiscal 2009, including $1.5 billion in contracts benefiting companies such as Lockheed Martin Corp. (LMT) of Bethesda, Maryland, BAE Systems Plc (BA/) of London, and General Dynamics Corp. (GD) of Falls Church, Virginia. Bloomberg

Minnesota is my favorite state, yet the poor folks suffer high losses in their relationship with DC spenders.

Using the G function in complex SQL filters

Consider the following search in TE language:
@((k1,k2,k3,k4..),(!list0.list2:(list2!list3)))

Which, with some liberal interpretation, I say means: remove the words that match list0, then select the remaining words that match the list2 schema, that is, list2 except those in list3.

How does the G function in the open source, patent free ontology engine help here? I can mark link as selected or unselected, for example, simplifying the where clause.

where gfun(unselected,rowid); // works as a selection clause because gfun had marked the node on a previous unmatch.

But wait, that's not all that the patent free pointer function can do. I can chain the selected words together as I go along, chaining the unselected ones together in a different list. Now, to get the next still selectable word, the SQL statement need only say:

where rowid = gfun(NextAvailable,0);  // Skipping over maybe reams of unselected words without having to move table around.
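
The chaining idea can be sketched in plain C, outside sqlite3. This is only an illustration under assumptions: the Row struct, NO_ROW, chain_selected and next_available are hypothetical names, and the next field stands in for the pointer column that gfun would maintain.

```c
#define NO_ROW (-1)

/* Hypothetical row: the pointer field doubles as a "next selectable row" link. */
typedef struct {
    const char *key;
    int selected;   /* 1 = still selectable, 0 = unselected */
    int next;       /* row index of the next selectable row, or NO_ROW */
} Row;

/* Walk the table once, back to front, threading the selected rows together.
   Returns the index of the first selectable row, or NO_ROW if there is none. */
int chain_selected(Row *rows, int n) {
    int head = NO_ROW;
    for (int i = n - 1; i >= 0; i--) {
        if (rows[i].selected) {
            rows[i].next = head;
            head = i;
        } else {
            rows[i].next = NO_ROW;
        }
    }
    return head;
}

/* Rough analogue of gfun(NextAvailable,0): one pointer hop, no scan. */
int next_available(const Row *rows, int current) {
    return current == NO_ROW ? NO_ROW : rows[current].next;
}
```

Once the chain is built, stepping from one selectable word to the next costs one lookup no matter how many unselected rows sit in between.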

So this pointer arithmetic, and TE symbolic grammar makes a whole lot of sense, and potentially offers huge shortcuts.  Remember, G machine only remembers the current node and one node back.  It does this for three tables in the general format:

result = @(self,other);

Using gfun, G machine keeps sync with SQL procedures; they handshake. Potentially gfun can be even more powerful, chaining multiple threads of SETs and DESCENTs with varying descendencies. The queries will be backed by a machine that can track schema chains and attribute sets, a chain for multiple operators. In gfun, operator weights can be compared and set closures selected appropriately. Normally I would expect gfun to fire five or more times during an update trigger. Gfuns are like the impulses of a large brain, constantly maintaining pointers and tracks. Keep it hidden in the pointer field, away from the SQL side. But everyone in the environment must agree on grammar and precedence for operators.

How are my m4 macros lookin, you ask?



define(SQL_COMMA, ``insert into result select ?,?,?; '')dnl
INSTALL(COMMA,SQL_COMMA,SET_PROPERTY);

There, I have kind of a special triple language. These macros generate the format needed for G machine to configure operators. In this case, the COMMA operator is defined to write itself into result. I am using direct writes to result from the DOT and COMMA operators to test pointer arithmetic.

How's the command line?
It talks TE just fine. I can dot and comma to it all I want, and it can transparently perform SQL statements.

The standard G machine will come with default operators and properties. The command line is always available and has access to all the gfuns. I can type in: select gfun(0,0); This will fire the gfun arithmetic from the command line. The command line input, input from any control files, and any select emits all pass through the same routine, the triple routine. Then the triples are disposed of or handed off by the gh handler. I think it is bug free at the moment, except I am still playing with pointers.

The attribute operator

XML ontologies allow attributes, a special set of one or more elements that add typed information about the ascendant node. So create an operator for them:
(Size$4,height$29,hair$brown)
The G machine still manages delivery in nested order, as long as the relative descendency order for the new operator is agreed on beforehand. So even XML files can all be preserved in nested order, retrieved with the same operators, and constructed with proper pointers.

The issue came up again because Google uses XML for the search API. I need the API to grab Google search ontologies and fill up the G machines.

I dunno?

Sunday, November 20, 2011

I have defined the null keyword

The absence of a key word is a match on the local G machine. After hours of partitioning the selection of obtuse punctuation marks in search of the perfect 'machine' character, I hit upon the solution, nothing.

So subgraph .select * from self; looks to the local machine because it has no keyword for the DOT operator. In this case, my machine submits the search to sqlite and we get what is expected, a spew of triples.

This makes my command line simple: everything is in TE language, dots and commas and returns and keywords, plus any installed and seriously deformed punctuation operators. Anyway, I got sidelined by the command line issue and spent time stepping through and improving the main loop.

The goal is to free form TE expressions into the machine, and I think I am there. I should drop all this on source forge, but the thing is too complicated for me. Simply as a command line manager of SQL bases, the thing will move around. But the thing really takes off when others discover how easy it will be to dump irregular lists into the thing, and have them completely searchable, by path.

Pointer arithmetic working!

Clocking in at 288 lines, including everything for pointers. Testing as we speak.
I split the file up, separating the fancy command line function and dll initialization from the engine.

Saturday, November 19, 2011

Update!!

For those sqlite3 developers on the edge of their seats.
I put the latest lab version on the widget page. Not a complete 1.5, but it has the pointer operations working, and bug fixes.

Variable operations in Typeface Explosion

G machines work problems like:
G = (count=0,*.count++)

count all nodes everywhere. This requires variable operations on the graph G, (not on result) actually updating key values in the nested graph, simple pointers. But G machine also emits forms with variable operations, and G machine must maintain pointers. That problem is coming up, and I do not expect problems.

Then there are byte operations. The plan here is to use the sqlite3 regular expression code to dive into a blob of text in a key, treating it like a simple order G graph. But I am aiming for byte to be a property, not a separate set of operators.

I bring this all up because the rules of grammar in Typeface Explosion are not prewritten, and G machines have great flexibility, but they need agreement. Agreement among people: people generating triples, searching, dumping text. What are the set operations? Mostly functions with the property set (they are lightweight). Do we pick the first one in a set, or each in order, or all matches in any order, or first match only? Humans have to decide how variable expressions execute and how they maintain order. Then the byte operations. We may not be completely compatible with industry standards.

This all involves G machines and hopefully the changes in G machines will be a simple reordering of heaviness properties, and perhaps one or two set properties. In other words, a simple table swap out. So, you see, I am deliberately vague about the rules of typeface explosion. I think the ontology industry will migrate toward common consensus, it is their problem.

Link properties in the land of G

Consider the case that some group agrees on a schema property, say email. Anything having the email schema can be digested by an email application. So they create the schema operator ':'. That becomes an operator, and ontologies in G world can tag their data as in:
G1 = (email:(mail1,mail2,mail3))
Then any other G convolving around looking for mail can search:
G2 = (email:*)

Hey, no problem for G, except will we ever try to avoid email? Do we want to drop all paths that lead to email? Using the not property we have:
G = (!email:.*)

Well, no, we might say. So then what, how does G deal with the higher binding property of ':'? The answer is organizing link functions by property weights. They have to be ordered according to how heavily they descend relative to other ops.

How's the pointer arithmetic coming?
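/* sqlite3 user function gfun(op, x): the G machine's pointer calculator. */
void gfunction(sqlite3_context* p,int n,sqlite3_value** v) {
    int op = sqlite3_value_int(v[0]);   /* which pointer operation to compute */
    int x = sqlite3_value_int(v[1]);    /* operand, not used by the cases below */
    (void)n; (void)x;
    //printf("gfun: %d %d\n",x,m.self_row);
    switch(op) {
    case RANK_SELF:                     /* the row after the current self row */
        sqlite3_result_int(p, selfrow()+1);
        break;
    case FILE_VALUE:                    /* updated value for the prior open set */
        sqlite3_result_int(p, prevvalue());
        break;
    case FILE_POINTER:                  /* file pointer of the previous set */
        sqlite3_result_int(p, prevfile());
        break;
    }                                   /* any other op leaves the result NULL */
}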

This gfun goes with this trigger:
#define TRIGGER_SQL \
"create trigger mytrigger after insert on result "\
"begin "\
"update result set pointer = gfun(PVAL,0) "\
"where rowid = gfun(PROW,0); "\
"end;"

So, you see, anytime we get a new row in the result, the gfuns are called in the proper order to update previous open sets. These are simple math functions: prevfile gets the previous set's file pointer (file as in rank and file), then prevvalue gets the updated value for that prior set. Thus G machine keeps all pointers up to date and carries no state information except the last link.

From the point of view of the web, that means boom times. The web now spews key words in nested order, so the structure of data is always and everywhere available to all applications.

For SQL programmers it means that output should be organized by set and descent. Anytime, start a new set, and fill it with arbitrary length descents. Just adding those two properties to output gives applications most of the structure they need.

Me and my free Visual c

That is the software development package from Bill. I have my sqlite3 dll, G machine, and tools for running M4 and generating triples. Over time I will embed the G machine farther into sqlite3 and make sqlite3 the dominant DMS on the web. Once the industry sees what they can do with nested triples, they will all want sqlite4 to get guaranteed standard G nested triple output. I am betting here, but I think I am right.

Right now, I have the trigger installed. It relies on gfun, the pointer calculator, which I can now add lines to. So, the point here is that I can write a simple C program to make pointer updates on the prior open element sets. Probably have this thing generating perfectly formed nested triples in about a day. At that point, the ontology GUI developers will think, hey, simple triples being spun from a high performance DB. It is an easy adaptation for them, so we will get a GUI right away.

I have fixed a couple of bugs in 1.0 and likely will try to get the source out for both pointers and bug fixes in three days.

Making ontology pointers in sqlite3

Been working the problem but not writing code. The best solution is a combination of triggers and the pointer callback function.

When writing ontology nested orders, the set blocks have to be updated when closed and opened. That means looking back at the output, but that is not too bad. So the G machine does an output to result, the trigger fires and updates any set pointers that have closed.

The whole principle of G machine is that it keeps no state information except about the current nodes, the nodes at the current graph pointers for self and other. So when a subgraph undergoing convolution selects a table jump or a URL jump, or adds another set element, the G machine keeps link state information stored in the top link of an incoming jump and the bottom link of an outgoing jump.

The same principle holds for adding another set element when the elements have variable length. Any new element requires recovering the pointer from the previous element (a simple subtract). The recovered link state info is inserted into the head of the new set element. Sets have closing elements in the grammar, so when a set closes, the link state for the previous open set is carried along. The SQL trigger which does some of this requires two reads and an update.
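
A rough sketch of that back-patching in plain C, with no sqlite3 involved. Everything here is an assumption made for illustration: the ResultTriple and Result types, the function names, and the convention that a closed set head stores the length of its block. The one point it tries to show is that the link state for an enclosing open set can live in the head row itself, so the machine only remembers the innermost open head.

```c
#define MAX_ROWS 64
#define NO_ROW (-1)

typedef struct { const char *key; char link; int pointer; } ResultTriple;

typedef struct {
    ResultTriple rows[MAX_ROWS];
    int n;          /* rows emitted so far */
    int open_head;  /* row of the innermost open set, or NO_ROW */
} Result;

void result_init(Result *r) { r->n = 0; r->open_head = NO_ROW; }

/* Append one triple; returns its row index, or NO_ROW when the table is full. */
int append(Result *r, const char *key, char link) {
    if (r->n >= MAX_ROWS) return NO_ROW;
    r->rows[r->n].key = key;
    r->rows[r->n].link = link;
    r->rows[r->n].pointer = 0;
    return r->n++;
}

/* Open a set: the new head temporarily stores the enclosing open head. */
int open_set(Result *r, const char *key) {
    int row = append(r, key, '(');
    if (row == NO_ROW) return NO_ROW;
    r->rows[row].pointer = r->open_head;
    r->open_head = row;
    return row;
}

/* Close the innermost set: one read to recover the parent, one update to
   back-patch the head with the block length (head + pointer = next row). */
void close_set(Result *r) {
    int head = r->open_head;
    if (head == NO_ROW) return;
    int parent = r->rows[head].pointer;
    r->rows[head].pointer = r->n - head;
    r->open_head = parent;
}
```

With this convention a reader can skip a whole nested block by adding the head's pointer to its row number.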

I should have it done in a few days, and I will call it version 1.1.

Friday, November 18, 2011

I don't get the anti China thing

“U.S. pressure on China has intensified,” said Tim Condon, Singapore-based head of Asian research at ING Groep NV (INGA), saying the shift has “startled” the Chinese. “China can’t ignore the U.S. stance. The only question is how they interpret it.”
The administration’s foreign policy strategy is being refocused on Asia as Obama wraps up wars in Afghanistan and Iraq, and two and a half years after he announced an effort to step up engagement in the Muslim world and on Mideast peace negotiations that largely have fizzled. Bloomberg

The word list I am playing with

I found it on a wiki; it is the most frequent words list. It is the first list I am inserting using the Inagisoft cut and paste method. Here again is what the list looks like:
(1.th,2.be,3.to,4.of,5.and,6.a,7.in,8.that,
9.have,10.I,11.it,12.for,13.not,14.on,15.with)


Written in TE. G machine will eat that list directly, and if the dot and comma operators use the set and descent properties, the pointer field in the triple is set and SQL can uncover any nesting.

What will I do with the list?
I want to remove any word from my list of words that is in that list. That is an operation that SQL can do: @(!(k1,k2,k3...),WordList). That is, in TE I want to set the grammar for the not list so any matches do not pass. I think G might distribute the not property down to the end of the links:
!(k1,k2,k3...) -> (k1.!,k2.!,k3.!...)
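
What the not list should do to WordList can be sketched in plain C. This is an illustration only; in_list and filter_not are hypothetical helpers, not part of the G machine, and the real operation would run as SQL against the triple tables.

```c
#include <string.h>

/* Returns 1 if word appears in list, 0 otherwise. */
static int in_list(const char *word, const char *const *list, int n) {
    for (int i = 0; i < n; i++) {
        if (strcmp(word, list[i]) == 0) return 1;
    }
    return 0;
}

/* Copies into out the words that do NOT match the not list, keeping order.
   out must have room for nwords entries. Returns how many words passed. */
int filter_not(const char *const *words, int nwords,
               const char *const *not_list, int nnot,
               const char **out) {
    int passed = 0;
    for (int i = 0; i < nwords; i++) {
        if (!in_list(words[i], not_list, nnot)) {
            out[passed++] = words[i];
        }
    }
    return passed;
}
```

Each key in the not list acts on its own here, which is the same effect as distributing the not property onto every link.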

Right now the G is solid and I am working the pointer arithmetic and testing cutnpaste with large lists.

Thursday, November 17, 2011

It talks!

The G machine lives, a 200 line piece of open software you embed into sqlite3, complete with keyboard input and punctuation-like graphics. All it needs is a zillion lines of SQL.
I tested Typeface Explosion, like this:
5.one,
36.all,
37.would,
38.there,
39.their,

G machine just thought it was some nested graph. So the command line version works: one can load huge amounts of data lists with arbitrary order, and G will maintain the pointers to recover nested structure. I have made some corrections to the published version, but I really need sourceforge.

Stuck!

I need an ontology view for my G machine soon. Don't have one, looking for one.
Solution!!


I am going to create the space/newline ontology display system for command lines! A descent is a space, a set element is a (*.space.newline) in Typeface Explosion. Basically, for ten lines of code I get a preliminary version of the TE byte code interpreter, and it gets me a standard command line interface to the engine.

User's manual for the G engine:
The user types in a string matching the rules of TE, namely:
All keywords are alphabetic characters or numbers. All operators are distinctly, highly unusual looking punctuation marks, namely the . and , operators for now. Given the built in operators, the user will simply be typing in G graphs in descending order. The enhanced user interface converts these to triples and runs them through the machine, after configuration. The results are displayed with standard tabs and new lines to preserve nested order.

I am up to 250 lines, but I get a lotta leverage from the sqlite3 system.
The TE lexical analysis tool is done! A complex bit of machinery. It takes every alphanumeric you type at it to be the next char in a triple key. As soon as it finds a particularly ugly punctuation mark, it calls that the link. It then jumps into the triple machine.

G does the rest. G generates stuff in the same form you enter it. G only knows nested ontologies. So when some mysterious SQL procedure does a select key,link,pointer, without any insert, then G prints that out in TE syntax. Theoretically two G machines could converse.
Here it is. The idea mainly is to accept input until something looks like an operator, then try it out.
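TRIPLE t;
int c, i = 0;                   /* key[] is the key buffer, declared elsewhere */
do {
    c = getchar();
    if (isalnum(c)) {
        key[i++] = (char)c;     /* next char of the triple key */
    } else if (c != EOF) {
        key[i] = 0;             /* punctuation ends the key */
        t.link = c;             /* and becomes the link */
        t.pointer = 0;
        triple();               /* jump into the triple machine */
        i = 0;                  /* start collecting the next key */
    }
} while (c != '\n' && c != EOF);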

Wednesday, November 16, 2011

How's the software going?

I am about to release version 1.0.  It has the following builtins:

Pop the self stack and execute the triple
Config an operator
Print the script of an operator
Execute an sql script directly
Execute an operator
Jumps, mainly jump relative and jump absolute in self

With these, I can use alter name in configured operators, with no special G code. I call it version one: basically complete table jumping, SQL executing batch processing.

The G machine has utility!

It operates as a controllable batch processor for large and proprietary SQL procedures on undifferentiated tables.  Here is how the current lab version works:
1) It pops a triple from the start of table view self
2) it executes the triple link operator
where triple is the record format:
(key text,link int,pointer int)

What is the link operator?
It is one of the following, to the G machine:
1) Install the SQL procedure in key
2) Execute the procedure identified by link
3) Jump the current table view row to  atoi(key)

Hence, a simple controllable batch processor for sql statements.  Since key is undifferentiated text, just dump a bunch of SQL in it as the initial install, and away you go, programming in triples, each triple may initiate some complex SQL, returning to G  at SQLITE_DONE.
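
The pop and execute loop described above can be sketched in plain C. This is a toy under stated assumptions: the opcode numbers, the Machine struct, and the use of the pointer field to pick the install slot are all made up for illustration, and "executing" a procedure just records its script text instead of handing it to sqlite3.

```c
#include <stdlib.h>

enum { OP_INSTALL = 1, OP_JUMP = 2, FIRST_USER_OP = 3, MAX_OPS = 16, MAX_LOG = 32 };

typedef struct { const char *key; int link; int pointer; } ProgTriple;

typedef struct {
    const char *script[MAX_OPS];  /* installed procedure text, per link number */
    const char *log[MAX_LOG];     /* scripts "executed", in order */
    int nlog;
} Machine;

/* Pops triples from self one at a time and acts on the link operator.
   Stops at the end of the table or after max_steps pops (a guard against
   jump loops). Returns the number of triples popped. */
int run(Machine *m, const ProgTriple *self, int n, int max_steps) {
    int pc = 0, steps = 0;
    while (pc >= 0 && pc < n && steps < max_steps) {
        const ProgTriple *t = &self[pc++];
        steps++;
        if (t->link == OP_INSTALL) {
            /* key holds the SQL text; pointer names the slot (assumption) */
            if (t->pointer >= FIRST_USER_OP && t->pointer < MAX_OPS)
                m->script[t->pointer] = t->key;
        } else if (t->link == OP_JUMP) {
            pc = atoi(t->key);            /* jump the row counter to atoi(key) */
        } else if (t->link >= FIRST_USER_OP && t->link < MAX_OPS &&
                   m->script[t->link] != NULL && m->nlog < MAX_LOG) {
            m->log[m->nlog++] = m->script[t->link];  /* stand-in for running SQL */
        }
    }
    return steps;
}
```

A Machine should start zeroed, for example `Machine m = {0};`, so that every slot is empty before the first install.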

How do we program in triples?
Use the m4 processor. We can cut and paste large SQL procedures right into any simple editor. My M4 macros have utilities to lay out the executing self table. For jumps, one macro increments a row counter as it lays out the triples. Then I run the output through M4 and into sqlite3. I go from simple editor to direct control of sqlite3 for management of large complex groups of SQL procedures.


Here is my typical M4 layout in the lab. It completely builds the tables and views in sqlite3, inserts all the triples and manages the row counter for jumps. SCRIPT refers to some arbitrary blob of text defined before this layout. The output from the M4 processor feeds directly into sqlite3. Then just start the G machine.
TABLE(`result')
TABLE(`self_alias')
TABLE(`other_alias')
VIEW(`self',`self_alias')
VIEW(`other',`other_alias')
define(Tname,self_alias) define(G_row,0)
NEWRECORD ("SCRIPT",2,1);
NEWRECORD ("Entry",operand,0);
NEWRECORD ("Return",3,0);

define(Tname,result) define(G_row,0)
NEWRECORD ("Test1",99,0);
NEWRECORD ("Test2",99,0);
NEWRECORD ("Test3",99,0);

Maintain libraries of batch procedures, each as a string of triples.  Then use alter table rename to self_alias, and run G.  self_alias, actually misnamed, is the real table underneath the view self. There is a jump operator for jump table, but still in lab. Looking at the code you can see swap table, tested but unconnected.
Finally, any arbitrary procedure stored in G machine can execute a specific jump with:
 select "Location",JUMP,0;

Latest from the lab

I loaded the most solid lab version of simple G I have. This version performs: @G(*,..)
Which stands for reading and executing a sequential list of G triples from the table store, one at a time. It includes installing operators, one sqlite3 statement at a time, from the config store. It has the built in operators CONFIG and POP. POP just pops the node from self and looks up the operator. Otherwise, POP maintains its own pointer to self, and POP uses the gfun callback. It does:
select key,link,pointer from self where (gfun(0,0) == rowid);
Since the G engine tracks its own pointer, it effectively single steps down self, one triple at a time.  The G function here delivers what might be called a program counter.

Tuesday, November 15, 2011

A bit about using G

At its most basic, ignoring pointer arithmetic, the G machine simply stores random SQL procedures. Any procedure is executed with a select 0,Operator,0; This is the simplest jump. Predefined parameters are bound by the sqlite3_bind calls and sqlite3_step is called. The jump triple can also be the last node in a config file. So one can configure the new stored SQL procedures and run.

So, whether the SQL procedures are proprietary and do not use triple format, even if they use a proprietary output table, G engine still acts as a simple, controllable stored procedure machine, optimized for sqlite3, using triple format control exchange between SQL and G graph.

A bit about G graphs

G = (a,b.(c,d,(x.y)),e.f.(g,h).k)

A G graph described in Typeface Explosion. Laid out in nested form:
a ,
b . 
    c  ,
    d ,
        x . y
e , f . 
        g  ,
        h  ,
            k . 

The graph is laid out in nested order, just pick off the indentation marks as rank and line count as file. I did not apply the distributive property on k. (Note: pointer arithmetic is still in design phase).
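
Here is a hedged sketch of pulling rank and file out of a TE string in plain C. It makes a simplifying assumption: rank is only the parenthesis depth and file is the order in which keys appear, whereas the layout above also keeps descents on one line. Node and rank_file are hypothetical names.

```c
#include <ctype.h>

typedef struct {
    char key[16];
    int rank;   /* nesting depth: number of open parentheses around the key */
    int file;   /* position of the key in reading order */
} Node;

/* Scans a TE expression and writes one Node per alphanumeric key.
   Returns the number of nodes written (at most max). */
int rank_file(const char *te, Node *out, int max) {
    int depth = 0, n = 0;
    const char *p = te;
    while (*p) {
        if (isalnum((unsigned char)*p)) {
            if (n == max) break;
            int len = 0;
            while (isalnum((unsigned char)*p)) {
                if (len < 15) out[n].key[len++] = *p;  /* truncate long keys */
                p++;
            }
            out[n].key[len] = '\0';
            out[n].rank = depth;
            out[n].file = n;
            n++;
        } else {
            if (*p == '(') depth++;
            else if (*p == ')' && depth > 0) depth--;
            p++;                       /* . and , separate keys, nothing more */
        }
    }
    return n;
}
```

Run on the graph above, x and y come out three levels deep, while k falls back to the top level.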

The G machine provides the gfun call under sqlite3, which can maintain pointers for the output result and perform skips on self and other. So SQL procedures can be written to a common definition of pointer, giving them the ability to skip over nested subgroups. Gfun provides the NextSet function, which skips ahead to the next element in the set, as well as a number of other skipping about functions, and they work on all triple formats in the equation:
result = @(self,other)
Gfun will be able to apply the distributive property on k in version 2.0. In that version, a NextDependentAnd function will be added to the gfun methods.

Monday, November 14, 2011

My first convolution on the G machine

insert into result select key,link,pointer from other;
insert into result select key,link,pointer from self;

Concatenation. I put that operator in the config table, to be loaded and prepped. The G engine came up and popped the first node from config. It followed the install triples until the next triple pulled was a direct execution of the installed operator.

Concatenation doesn't work in the general case for G graphs. But for lists of nearly the same width it works OK, even with a slightly different nesting pattern in each list. Using the rank and file indexing, with operator properties, SQL can do a great job of passing triples at ultra high speed.

More on SQL to G

Working on the engine I got the callback function working well.

An SQL program can signal to the G unit any time by calling Gfun(op,...), where op is the function the SQL procedure wants G to compute. This is working great for match points and pointer arithmetic. But it is still in the lab.

I have a promising method working for table store jumps. The problem here is we want SQL programmers to write and read from single source names so G can swap out the table names but keep the procedures. A tricky problem, and I have a solution working. This has to work, or this is the 'gotcha' that kills the idea, fingers crossed. Graph traversal must cross table stores with relative ease, yet keep the SQL programmer free to innovate.

Interfaces. All G compatible procedures correlate self with other. Those are the actual variable constants I had to pick. Locally all procedures look like:
result=@(self,other)   For example:
select into result from an operation on self and other; Write SQL against these three names and G will work.

Proprietary tables, odd formats with existing procedures against them: those tables will have short interface operators to manage the graph match points. In other words, no hurry to reformat old tables and procedures, just be ready to issue G compatible output and write small legacy interfaces. The only tables G needs, evidently, are result, self and other. But G still reserves config.

I dumped the latest version to the right on the widget page. The additions are expanded use of the gfun call and less use of parameter mapping to procedure statements. Pointers are not tested, but their mechanism sure works in gfun.


How's the software you ask?

Holding at 170 lines, and most of it working.  Slow a bit with the macro, but table swapping and links will be working fine.

Sunday, November 13, 2011

Macro languages and word processors

You are not too geeky when you make macros for your favorite word processor; that is still done! An M4 macro is not much different from a word processor macro. Here is an example of an install layout. This layout expands into a set of G machine configuration triples; it is loading an operator into the machine.

INSTALL(7)
SCRIPT(`select key,link,pointer from ? where rowid == ?;')
PROPERTY(DESCENT)
PARM(Self_key) inc_pc
PARM(Self_rowid) inc_pc

I am telling the G machine that link number 7 is associated with that particular SQL script and uses DESCENT pointer counting. It defines two parameters: the self table, the ontology stream currently in control; and the self rowid, the G machine's copy of where the self graph is on its self traversal. Whenever this type of link is traversed, G will read the next instruction and repeat.

Normally G does not get any rows; G only gets active whenever an operator returns from a longer operation. Then G does pointer arithmetic to catch up. Is the operation up there useful?
Dunno, still in the lab.

What am I doing with SQL, you ask?

Right now looking at queries that maintain three counters on the same G graph. I use one counter to get the start rowid of a nested block, another to get the stop rowid, and the third to get the stuff in between. We make this iterator concept standard for searching fixed form graphs, like dictionary lists. Rank/file indexing makes this a snap.
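
The three counters can be sketched in plain C over an array of ranks, one entry per row. This assumes the nesting depth of each row is already known (the real queries would read it from the pointer field through SQL); block_stop and block_rows are hypothetical names.

```c
/* ranks[i] is the nesting depth (rank) of row i; the row index is the file. */

/* Stop counter: the first row after start whose rank is no deeper than
   the rank of start, i.e. one past the end of the nested block. */
int block_stop(const int *ranks, int n, int start) {
    int stop = start + 1;
    while (stop < n && ranks[stop] > ranks[start]) stop++;
    return stop;
}

/* In-between counter: collects every row nested under start into out
   and returns how many there are. out needs room for n entries. */
int block_rows(const int *ranks, int n, int start, int *out) {
    int stop = block_stop(ranks, n, start);
    int count = 0;
    for (int row = start + 1; row < stop; row++) {
        out[count++] = row;
    }
    return count;
}
```

The start counter is simply the row passed in; everything between start and stop belongs to that block.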

The SQL query scans thousands of records, but still obeys simple SET/DESCENT grammar. The concept here is that the SQL operator on the list elements understands the fixed structure of the list. As the list executes, it can use optimized skimming without disrupting the state of G. Start and stop iterators, with the compound statement, make it simple to add terminators to lists. Even non triple lists, proprietary SQL data, can be handled, as long as the query procedure understands enough G grammar to produce data in nested G triples.

This will work, this will make the SQL industry boom.

How's the software?

I dumped the latest lab version of G machine onto the widget thingie page to the right. It has everything and I mostly stepped thru it all.  I dropped ten bad lines and added pointer function to sqlite3, installs but untested. (about 150 total lines still)

On the M4 macro side, I am still at the level of programming with triples inserted like assembly language. No gestalt coning.  Likely tomorrow I will be throwing hordes of triples at the thing.

Pointer arithmetic, crawling up the graph is a bit harder than crawling down, but certainly no problem.  Untested.

Operators. I have been divining properties and functions. I like the not property, inheritable by all operators; it means reverse the intended direction on the graph. I have the debug property. Then I put the new schema property next highest. Then the rest of the properties (set and sequence), followed by variable and application specific operators.

Distance property. Byte jumps when the quote property is on; these are the smallest jumps the machine makes. Then come the micro jumps. SQL selects do this within procedures, scanning thousands of records. Then the mini jumps, node to node among the two current graphs, then the mini jumps across table stores, then the URL jump. So, I reserved about 15 spots on the table of unreadable fonts, just for jumps, stops, returns and reverses.

Config. This thing will have a standard configuration in the config graph, always locally ready. About three core operators will allow standard graph traversal. Other operators are user definable.

Saturday, November 12, 2011

Pointer arithmetic in G

In the standard G triple: (keyword,link,pointer) The element pointer tracks rank and file indexing for the nested structure. In the operation:
Z = @(X,Y)

Neither X nor Y has or wants any information about the current rank/file pointer on Z. Computing the pointer for Z is simple if the current pointer state information is kept in G.

Hence, my G Machine has an installed sqlite3 function called pointer, it returns the proper value for any given append on Z. Thus, in the world of G where all of G is a result of convolutions on G, then all pointers must be consistent at any time. Just perform:
insert into result keyword,link, pointer() ....;
when emitting to result.