Wednesday, December 14, 2011

Who would use an entropy encoded attribute schema?

Watson would. His best option for posing a question is to select from the question formats used most often over the last 30 years of Jeopardy. IBM likely stored a Huffman-encoded version of each subject, say history, in terms of question formats, probably focusing on the top 15 or so and getting 80% hits. Watson then goes pattern matching with those formats, on a graph, around the particular history ontologies, looking for matches. Watson has likely watched every Jeopardy game ever recorded and memorized the top 15 formats for the top 10 subjects. And spent days pondering, ordering ontology formats from history and other subject texts into sets based on the closest attribute pattern match to Jeopardy. Really, that means grabbing the ordered keyword patterns, doing some convolutions with English grammar, reducing the text, and streaming it onto disk in nested format, all graphed up (technical term). Then reducing the data further to tables upon tables, with the formats limited to the most common set, the 80/20 rule. This machine, with its 750 lines of code, can do that in version 2.5, once someone adds variable expressions so that graphs can count their own appearances.
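To make the idea concrete, here is a minimal sketch in Python of the two steps described above: keep only the most common formats until they cover about 80% of what was observed, then Huffman-encode that short list so the most frequent format gets the shortest code. The format names and counts are made up for illustration, and the helper names `top_formats` and `huffman_codes` are my own. None of this is claimed to be how Watson or this machine actually does it.

```python
import heapq
from collections import Counter
from itertools import count


def top_formats(observed, coverage=0.8):
    """Keep the most frequent formats until their combined share reaches `coverage`."""
    counts = Counter(observed)
    total = sum(counts.values())
    kept, covered = {}, 0
    for fmt, n in counts.most_common():
        if covered / total >= coverage:
            break
        kept[fmt] = n
        covered += n
    return kept


def huffman_codes(freqs):
    """Build a prefix code in which frequent symbols get shorter bit strings."""
    if len(freqs) == 1:
        return {next(iter(freqs)): "0"}
    tie = count()  # unique tie-breaker so the heap never compares dicts
    heap = [(n, next(tie), {sym: ""}) for sym, n in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        n1, _, left = heapq.heappop(heap)
        n2, _, right = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in left.items()}
        merged.update({s: "1" + c for s, c in right.items()})
        heapq.heappush(heap, (n1 + n2, next(tie), merged))
    return heap[0][2]


# Hypothetical tally of question formats seen for one subject (illustrative only).
observed = (
    ["WHO_IS"] * 40 + ["WHAT_IS"] * 25 + ["WHERE_IS"] * 18
    + ["WHEN_WAS"] * 8 + ["HOW_MANY"] * 5 + ["WHICH_OF"] * 4
)

kept = top_formats(observed)   # the "80" side of the 80/20 rule
codes = huffman_codes(kept)    # shortest code for the most common format

for fmt, n in kept.items():
    print(fmt, n, codes[fmt])
```

With these made-up counts, three formats cover 83% of the observations and the other three are dropped. That is the whole trade: a tiny code table in exchange for missing the rare formats.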
