Saturday, October 29, 2011

Designing the corporate ontology engine

The corporate semantic mesh we need contains two types of information: keywords or phrases that you or I might type, and resource pointers. These are the ins and outs, so let's not define any more atoms than that. Internal to the database we can keep click-through statistics, and internally we might have type modifiers on the two basic element types.

Each node, in its minimal form, contains a list of keywords and a forward resource pointer. If a keyword in the list matches the query well enough, the engine pulls up the resource pointer.
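A minimal sketch of such a node, in Python for brevity. The names `Node`, `similarity` and `lookup`, the 0.8 threshold, and the use of `difflib` as the measure of a "sufficient match" are all assumptions made for illustration; the post does not say how matching should be scored.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher
from typing import List, Optional


@dataclass
class Node:
    keywords: List[str]   # words or phrases a user might type
    resource: str         # forward resource pointer (node name or URL)


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two strings, ignoring case."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def lookup(node: Node, query: str, threshold: float = 0.8) -> Optional[str]:
    """Return the node's resource pointer if any keyword matches well enough."""
    if any(similarity(k, query) >= threshold for k in node.keywords):
        return node.resource
    return None


# Hypothetical example node.
node = Node(keywords=["expense report", "travel policy"],
            resource="/finance/expenses")
print(lookup(node, "Expense Report"))    # /finance/expenses
print(lookup(node, "holiday schedule"))  # None
```

The threshold is the knob that decides what counts as "sufficient"; click-through statistics could later be used to tune it.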

Keywords can be typed: common words, proper names and phrases, and built-in types. Resource pointers can be typed too: a relative node name or a URL reference, which may itself lead to another semantic graph or to a returnable URL.
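One way to picture the type modifiers, again as a Python sketch. The enum names and the rule in `classify_pointer` (anything containing `://` is treated as a URL) are assumptions, not part of any published format.

```python
from dataclasses import dataclass
from enum import Enum


class KeywordType(Enum):
    COMMON = "common"      # ordinary word
    PROPER = "proper"      # proper name or phrase
    BUILTIN = "builtin"    # built-in type, for example a date or a number


class PointerType(Enum):
    NODE = "node"          # relative name of another node in the graph
    URL = "url"            # external reference: another graph or a page


@dataclass(frozen=True)
class Keyword:
    text: str
    kind: KeywordType = KeywordType.COMMON


@dataclass(frozen=True)
class Pointer:
    target: str
    kind: PointerType


def classify_pointer(target: str) -> Pointer:
    """Guess the pointer type from its text: a URL scheme means a URL."""
    if "://" in target:
        return Pointer(target, PointerType.URL)
    return Pointer(target, PointerType.NODE)


print(classify_pointer("https://example.com/graph").kind)  # PointerType.URL
print(classify_pointer("../hr/benefits").kind)             # PointerType.NODE
```

Only the two atoms exist here, keywords and pointers; the types are just modifiers on them.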

Store great masses of contiguous graph serially in SQL, with rapid nested-order retrieval. Keep whatever indices are needed, internally, for fast retrieval. Publish a simple format so semantic analyzers and other processes can generate dense ontologies of text for submission.
Using Web SQL or IndexedDB, the engine can locate an entire sub-branch of the ontology and search it in place, skipping through a linear database. We never really need to retrieve an entire tree, and can dispense with the DOM format. In the callback routine from the record base, just execute the graph traversal (queries are represented as graph traversals).
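Here is a rough sketch of nested-order storage, with Python's `sqlite3` standing in for the browser database. The table layout is an assumption: each row holds its preorder position `pos` and a `skip` value giving the first row past its subtree, so a branch is a contiguous range and a non-matching sibling can be jumped over in one step. The sample tree and the function names are hypothetical.

```python
import sqlite3

# Hypothetical tree as (keyword, resource, children).
TREE = ("corp", "/", [
    ("finance", "/finance", [
        ("expenses", "/finance/expenses", []),
        ("payroll", "/finance/payroll", []),
    ]),
    ("hr", "/hr", [
        ("benefits", "/hr/benefits", []),
    ]),
])


def flatten(tree, rows=None):
    """Append the tree to rows in preorder; each row records where its subtree ends."""
    if rows is None:
        rows = []
    keyword, resource, children = tree
    pos = len(rows)
    rows.append(None)                  # reserve the slot until skip is known
    for child in children:
        flatten(child, rows)
    rows[pos] = (pos, len(rows), keyword, resource)  # skip = first row past subtree
    return rows


def load(conn, tree):
    conn.execute("CREATE TABLE node (pos INTEGER PRIMARY KEY, skip INTEGER, "
                 "keyword TEXT, resource TEXT)")
    conn.executemany("INSERT INTO node VALUES (?, ?, ?, ?)", flatten(tree))


def follow(conn, path):
    """Walk a keyword path from the root, jumping over subtrees that do not match."""
    row = conn.execute("SELECT * FROM node WHERE pos = 0").fetchone()
    if row[2] != path[0]:
        return None
    for term in path[1:]:
        child, end = row[0] + 1, row[1]  # first child sits right after its parent
        row = None
        while child < end:
            candidate = conn.execute(
                "SELECT * FROM node WHERE pos = ?", (child,)).fetchone()
            if candidate[2] == term:
                row = candidate
                break
            child = candidate[1]         # no match: skip the whole sibling subtree
        if row is None:
            return None
    return row[3]


def subtree(conn, pos):
    """Return the keywords of an entire branch with a single range scan."""
    (skip,) = conn.execute("SELECT skip FROM node WHERE pos = ?", (pos,)).fetchone()
    cur = conn.execute(
        "SELECT keyword FROM node WHERE pos >= ? AND pos < ? ORDER BY pos",
        (pos, skip))
    return [k for (k,) in cur]


conn = sqlite3.connect(":memory:")
load(conn, TREE)
print(follow(conn, ["corp", "hr", "benefits"]))  # /hr/benefits
print(subtree(conn, 1))                          # ['finance', 'expenses', 'payroll']
```

The query in `follow` is itself a graph traversal: it never builds the tree in memory, it only reads the rows it lands on. In a browser the same walk would run inside the database callback.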

We need an open source version, Semantic Lite, built with JavaScript and the browser database. We all need a simple JavaScript semantic analyzer.
