Friday, November 4, 2011

Google's search syntax

The Google Help Center pages describe more than 15 search options.[29] The Google operators:
  • OR – Search for either one, such as "price high OR low" searches for "price" with "high" or "low".
  • "-" – Search while excluding a word, such as "apple -tree" searches where word "tree" is not used.
  • "+" – (Removed on 10/19/11[20]) Force inclusion of a word, such as "Name +of +the Game" to require the words "of" & "the" to appear on a matching page.
  • "*" – Wildcard operator to match any words between other specific words. Wiki

Google doesn't really have a full ontology engine, but they do have something interesting: Bigtable, their distributed storage system, which runs on top of the Google File System.
They preprocess queries:
Google's search engine normally accepts queries as simple text and breaks the user's text into a sequence of search terms, which will usually be words that are to occur in the results. One can also use Boolean operators, such as quotation marks (") for a phrase, a prefix such as "+" or "-" for qualified terms (the "+" is no longer valid; it was removed from Google on 10/19/11[20]), or one of several advanced operators, such as "site:". The webpages of "Google Search Basics"[21] describe each of these additional queries and options (see below: Search options). Google's Advanced Search web form gives several additional fields which may be used to qualify searches by such criteria as date of first retrieval. All advanced queries transform to regular queries, usually with additional qualified terms.
Which, in the more advanced Imagisoft version, becomes:
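To make the preprocessing step concrete, here is a minimal sketch of breaking query text into typed terms. It is not Google's actual parser; the function name parse_query, the token pattern, and the kind labels are all assumptions made up for illustration.

```python
import re

# Hypothetical token pattern: a quoted phrase, or a run of non-space characters.
_TOKEN = re.compile(r'"([^"]*)"|(\S+)')


def parse_query(text):
    """Split a query into (kind, value) terms.

    The kinds are illustrative labels, not Google's internal names:
    'phrase', 'exclude', 'or', 'qualified' (e.g. site:), and 'word'.
    """
    terms = []
    for match in _TOKEN.finditer(text):
        phrase, word = match.group(1), match.group(2)
        if phrase is not None:
            terms.append(("phrase", phrase))
        elif word == "OR":
            terms.append(("or", word))
        elif word.startswith("-") and len(word) > 1:
            # A leading "-" marks a word that must not appear.
            terms.append(("exclude", word[1:]))
        elif ":" in word and not word.startswith(":"):
            # An advanced operator such as site:example.com.
            name, _, value = word.partition(":")
            terms.append(("qualified", (name, value)))
        else:
            terms.append(("word", word))
    return terms


if __name__ == "__main__":
    for term in parse_query('apple -tree site:example.com "price high" OR low'):
        print(term)
```

The point is only that every operator ends up as an ordinary term with a qualifier attached, which matches the last sentence of the quote above.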

Displaygraph = convolve(convolve('user search terms',QueryExpander),WebOntology)


The user's search word list is assembled as a nested-order, dual-predicate ontology map with embedded operations. That list is then convolved with the part of the Imagisoft ontology network that performs query expansion.  There is only one variable class in the Imagisoft system: the nested ontology graph.

What Google doesn't get is dual link classes: the so-called OR class of links for siblings and sets, and the AND (sequential) dot class of links.  That distinction makes for very fast scans of blocks of ontology, skipping descending tracks already ruled out.  It can be done in conjunction with SQL as the 'micro-code'.
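A rough sketch of that kind of skipping scan, assuming the graph is stored flat in nested (pre-order) form and each node carries a relative offset to its next sibling. The GRAPH data, the (value, skip) layout, and the scan function are hypothetical, not the Imagisoft code.

```python
# Each entry is (value, skip). `skip` is the relative offset to the next
# sibling, so a node's descendants occupy the following skip - 1 slots.
# The layout and the sample data are illustrative assumptions.
GRAPH = [
    ("animals", 4),
    ("cat", 2),
    ("tabby", 1),
    ("dog", 1),
    ("plants", 3),
    ("tree", 2),
    ("oak", 1),
]


def scan(nodes, keep):
    """Return the values visited, skipping subtrees whose root fails `keep`."""
    visited = []
    i = 0
    while i < len(nodes):
        value, skip = nodes[i]
        if keep(value):
            visited.append(value)
            i += 1      # descend: the children follow immediately
        else:
            i += skip   # jump past the whole ruled-out subtree
    return visited


if __name__ == "__main__":
    print(scan(GRAPH, lambda v: v != "animals"))
    print(scan(GRAPH, lambda v: v != "cat"))
```

Once a node is ruled out, its whole block is passed over in a single jump, with no pointer chasing through the descendants.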

A node value is an arbitrary string, but the link operators settle into two classes.  In the nested order, each node,link pair contains a relative pointer to the next sibling in line, and the descending nodes follow immediately. So we have to have:
a,b,c,d  and   d.f.e.g

with the property: a.(b,c) = a.b,a.c, that is, the dot distributes from the left over the comma.
By default we do not have commutation for the sequential operators, but we can have it for the OR class of operators if we decide to.  In either case, the execution engine is compatible with commutative limits on some operators but not others.  As usual, allowing OR commutation will sometimes result in great efficiencies in scanning blocks of ontology.
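The distribution property and the commutation difference can be checked with a small sketch. The tuple encoding ("or", ...) / ("seq", ...) and the expand function are assumptions for illustration. Note that this toy version distributes the dot on both sides, which is more than the property stated above requires.

```python
from itertools import product


def expand(expr):
    """Expand a nested expression into a list of dot-sequences (tuples).

    A plain string is a node value; ("or", x, y, ...) stands for the comma
    class and ("seq", x, y, ...) for the dot class. This encoding is an
    assumption made for the example.
    """
    if isinstance(expr, str):
        return [(expr,)]
    op, *args = expr
    parts = [expand(arg) for arg in args]
    if op == "or":
        # Alternatives are simply listed side by side.
        return [path for alternative in parts for path in alternative]
    if op == "seq":
        # Distribute the dot over every combination of alternatives.
        return [sum(combo, ()) for combo in product(*parts)]
    raise ValueError(f"unknown operator: {op!r}")


if __name__ == "__main__":
    left = expand(("seq", "a", ("or", "b", "c")))                  # a.(b,c)
    right = expand(("or", ("seq", "a", "b"), ("seq", "a", "c")))   # a.b,a.c
    print(left, left == right)
```

If the OR results are treated as a set, b,c and c,b come out the same, while a.b and b.a stay distinct. That is the commutation difference described above.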

But the message here is that Google has nothing. I have done this two or three times; I know this graph traversal business.  I am interested in distributed file systems, however.
