Thursday, November 3, 2011

Gremlin: A graph traversal language

The Benefits of Gremlin
They have it essentially right. Check out their source code for graph traversal:
g = new Neo4jGraph('/tmp/neo4j')

// calculate basic collaborative filtering for vertex 1
m = [:]
g.v(1).out('likes').in('likes').out('likes').groupCount(m)
m.sort{a,b -> a.value <=> b.value}

// calculate the primary eigenvector (eigenvector centrality) of a graph
m = [:]; c = 0;
g.V.out.groupCount(m).loop(2){c++ < 1000}
m.sort{a,b -> a.value <=> b.value}
And this bonus:
Gremlin is Turing complete: Many graph traversal languages can only recognize regular paths through a graph. Gremlin provides numerous memory and computing constructs to support arbitrary path recognition.
Notice the grouping operators, the '.' and the ','?
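To make the first traversal concrete, here is a minimal plain-Python sketch of what the out('likes').in('likes').out('likes').groupCount(m) chain counts. It is not Gremlin and not the Gremlin project's code; the `likes` data and the `collaborative_filter` helper are made up for illustration, and the graph is assumed to be a simple map from people to the things they like.

```python
from collections import Counter

# Hypothetical toy data (an assumption, not from the Gremlin docs).
likes = {
    "alice": {"jazz", "blues"},
    "bob": {"jazz", "rock"},
    "carol": {"blues", "rock", "folk"},
}


def collaborative_filter(likes, user):
    """Count paths user -likes-> item <-likes- other -likes-> item2."""
    # Invert the 'likes' edges so they can be walked backwards,
    # which is what the in('likes') step does.
    liked_by = {}
    for person, items in likes.items():
        for item in items:
            liked_by.setdefault(item, set()).add(person)

    counts = Counter()  # plays the role of the map m filled by groupCount(m)
    for item in likes[user]:              # out('likes')
        for other in liked_by[item]:      # in('likes')
            for item2 in likes[other]:    # out('likes')
                counts[item2] += 1
    return counts


counts = collaborative_filter(likes, "alice")
# Ascending sort by count, like m.sort{a,b -> a.value <=> b.value}
ranking = sorted(counts.items(), key=lambda kv: kv[1])
```

Like the traversal it mirrors, the sketch does not filter out the starting user or the items that user already likes, so those show up in the counts too.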

In the Imagisoft version, the source code above and the target graph are both semantic graphs of the same construction, so the language fits the Imagisoft model:
G3 = convolution(G1,G2)

The Gremlin code sequence is G1, the target ontology is G2, and the resulting match between the two is G3.

In the market:

The marketplace for Gremlin is news farms: web groups that perform sophisticated searches on news text. An aggregating news page might pick up Gremlin and use it to search the mass of news text and derive search ontologies from it. For the Imagisoft ontology engine, the Gremlin source is packed into a nested graph, including nodes for variables and script operators. The nested graph object model manages Gremlin computation correctly because the nested model is already expanded as directed graphs. That is, placing Gremlin script in nested objects causes the variable computations and variable scoping to lay out in natural order.

Integration with the open source ontology engine:

At the core, the Imagisoft ontology engine is simply an execution unit for the match process between two nested object formats. The instruction counters in each object format are subgraph pointers, pointing to the descending graph in G1 and G2. Each instruction node attempts to match the other node, governed by its own filter design. There is a third graph pointer for G3, pointing to the nested object model output. G3 is stack oriented, and all three pointers are available in the Gremlin language. When the convolution is complete, G3 holds the residual mutual traces where matches were completed.
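A rough sketch of that match process, under heavy assumptions: nested objects are modeled as Python dicts with a "label" and a list of "children", each G1 node may carry its own "filter" predicate (defaulting to label equality), and G3 is a plain list used as a stack. The names `convolve`, `label`, `children`, and `filter` are invented here for illustration and are not the Imagisoft engine's actual format.

```python
def convolve(g1, g2):
    """Walk two nested graphs together and return G3, the matched traces."""
    g3 = []  # output stack: one entry per completed match

    def match(p1, p2, path):
        # Each G1 node is governed by its own filter; by default a node
        # matches when the labels are equal.
        accept = p1.get("filter", lambda node: node["label"] == p1["label"])
        if not accept(p2):
            return
        trace = path + (p2["label"],)
        g3.append(trace)
        # Descend: both pointers move into the subgraphs below the match.
        for c1 in p1.get("children", []):
            for c2 in p2.get("children", []):
                match(c1, c2, trace)

    match(g1, g2, ())
    return g3


# G1: the query, with a custom filter on its child node.
g1 = {
    "label": "news",
    "children": [
        {"label": "any-market",
         "filter": lambda node: node["label"].startswith("market")},
    ],
}
# G2: the target ontology.
g2 = {
    "label": "news",
    "children": [
        {"label": "markets", "children": []},
        {"label": "sports", "children": []},
    ],
}

g3 = convolve(g1, g2)
```

Here G3 ends up holding the paths through G2 where the query matched, which is one way to read "residual mutual traces".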

The Imagisoft goal, then, would be to obtain the Gremlin interpreter and adapt it to the execution unit, then create a Gremlin compiler that puts Gremlin code into nested object form. Put the engine on top of an SQL database, and release the thing. As for writing code myself, I can post my sqlite DLL test code, nothing more so far.

Market demand:

An ontology machine into which one can dump gigabytes of text? Built on sqlite3 for fast, optimized scans, with graphs stored and managed in object stores? Integrated with the Gremlin language, in which search, ontology shaping, and statistics all run as graph traversal convolutions? The entire ontology engine would be one piece of code: just download and run the thing on corporate servers. The biggest administrative task would be setting some slide bars to vary the depth and rank of the corporate ontology. The engine would bring with it a Gremlin programming base.
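As a small illustration of graphs sitting on top of sqlite3, the sketch below stores edges in a single table and answers a one-hop and a two-hop traversal with SQL. The `edge` table, its columns, the sample rows, and the helper names are all assumptions made for this example; it uses Python's standard sqlite3 module with an in-memory database and is not the engine's real schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one row per directed, labeled edge.
conn.execute("CREATE TABLE edge (src TEXT, label TEXT, dst TEXT)")
conn.executemany(
    "INSERT INTO edge VALUES (?, ?, ?)",
    [
        ("doc1", "mentions", "oil"),
        ("doc2", "mentions", "oil"),
        ("doc2", "mentions", "gold"),
        ("oil", "is_a", "commodity"),
        ("gold", "is_a", "commodity"),
    ],
)
# An index on (src, label) keeps outgoing-edge scans cheap.
conn.execute("CREATE INDEX edge_src ON edge (src, label)")


def out(conn, vertex, label):
    """One hop: vertices reachable from `vertex` over `label` edges."""
    rows = conn.execute(
        "SELECT dst FROM edge WHERE src = ? AND label = ? ORDER BY dst",
        (vertex, label),
    )
    return [row[0] for row in rows]


def docs_mentioning_kind(conn, kind):
    """Two hops via a self-join: doc -mentions-> term -is_a-> kind."""
    rows = conn.execute(
        "SELECT DISTINCT a.src FROM edge a JOIN edge b ON a.dst = b.src "
        "WHERE a.label = 'mentions' AND b.label = 'is_a' AND b.dst = ? "
        "ORDER BY a.src",
        (kind,),
    )
    return [row[0] for row in rows]


doc2_terms = out(conn, "doc2", "mentions")
commodity_docs = docs_mentioning_kind(conn, "commodity")
```

Each extra traversal step becomes another self-join on the edge table, which is where a compiled Gremlin-style query would plug in.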
