I doubt I need more than 50 lines of code to turn my text scraper into an attachment, and then I will have a machine cross-referencing web pages against plain-text words from word lists, in Step and Skip form.
I stepped over and wrote the few lines of code needed to traverse the text, about 20, plus the copy-and-paste boilerplate for the method switch. Easy. I add in the DLL interface, more boilerplate.
I do not get DLL hell: all the new and improved web scrape tools look the same, like a directed graph. There is no interface change across versions and, absent changes in Match, new and old versions are compatible.
I have this capable macro shell, access to the plain-text web structure across the web, and a Lazy J search scripting tool for advanced searching and card list intersections. The raw, untested code is below. It works off the stack of pointers created when the DOM was decomposed into ignore and collect nodes; I actually descended the DOM, creating a set of pointers tagged Ignore or Collect. So I add the code and report the ExecScrape routine back to join. During the Init cursor method, the user, or another graph, has produced a file name; internet IO is not enabled yet.
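The descent that builds the tagged pointer stack is not shown in this post, so here is a minimal sketch of how it might look. Everything in it is an assumption made for illustration: the Dom and Entry layouts, the Decompose and Flatten names, and the rule that a subtree inherits Ignore from its parent. The one idea taken from the post is that each entry carries a loc, which here is the index of the first entry past that node's subtree, so a later Skip can jump over it.

```c
#include <assert.h>
#include <stddef.h>

enum { Done, Ignore, Collect };   /* entry tags; numeric values are assumed */

/* Hypothetical DOM node: just enough structure to show the walk. */
typedef struct Dom {
    int ignore;               /* nonzero for subtrees the scraper should skip */
    const struct Dom *child;  /* first child, or NULL */
    const struct Dom *next;   /* next sibling, or NULL */
} Dom;

/* One slot of the flat stack. */
typedef struct {
    int code;         /* Ignore, Collect, or Done */
    int loc;          /* index of the first entry past this node's subtree */
    const Dom *node;  /* pointer back into the DOM */
} Entry;

/* Pre-order walk; returns the next free index, or -1 when out of room. */
static int Flatten(const Dom *d, int inherited, Entry *out, int n, int cap) {
    for (; d != NULL; d = d->next) {
        if (n >= cap) return -1;
        int ignore = inherited || d->ignore;
        int self = n++;
        out[self].code = ignore ? Ignore : Collect;
        out[self].node = d;
        n = Flatten(d->child, ignore, out, n, cap);
        if (n < 0) return -1;
        out[self].loc = n;    /* where a Skip from this entry would land */
    }
    return n;
}

/* Fill out[] from the tree and append a Done sentinel.
   Returns the number of entries written, or -1 if cap is too small. */
int Decompose(const Dom *root, Entry *out, int cap) {
    assert(out != NULL && cap > 0);
    int n = Flatten(root, 0, out, 0, cap);
    if (n < 0 || n >= cap) return -1;
    out[n].code = Done;
    out[n].loc = n;
    out[n].node = NULL;
    return n + 1;
}
```

With the stack laid out this way, the traversal code never touches the tree again; it only moves an index through the flat array.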
My macro machine allows files= "file1 file2 file3", or I can go edit the environment save file, adding the line without the quotes: files file1 file2,file3. Then I write the utility GetCursorFile, which reads the file into a big buffer, and the cursor will point to the first character. Here is where piping between console commands would help, but I don't have it. Instead, I have the editor, and I can lay out all my cursors in slots and establish the slot order. I use a static map of the cursor stack, good enough.
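For what GetCursorFile has to do, a minimal sketch follows. The FileCursor struct, its field names, and the 0 / -1 return convention are assumptions for illustration, since the real PCursor layout is not shown here. It uses only standard C library calls: the file is read whole into one buffer and the cursor is left on the first character.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical cursor over a file buffer; not the real PCursor. */
typedef struct {
    char *buff;     /* whole file contents, NUL-terminated */
    size_t length;  /* number of bytes read */
    char *pos;      /* current character; starts at buff */
} FileCursor;

/* Read the named file into one big buffer.
   Returns 0 on success, -1 on any failure. The caller frees c->buff. */
int GetCursorFile(FileCursor *c, const char *name) {
    assert(c != NULL && name != NULL);
    FILE *f = fopen(name, "rb");
    if (f == NULL) return -1;
    if (fseek(f, 0, SEEK_END) != 0) { fclose(f); return -1; }
    long size = ftell(f);
    if (size < 0) { fclose(f); return -1; }
    rewind(f);
    char *buff = malloc((size_t)size + 1);   /* +1 for the terminator */
    if (buff == NULL) { fclose(f); return -1; }
    size_t got = fread(buff, 1, (size_t)size, f);
    fclose(f);
    buff[got] = '\0';
    c->buff = buff;
    c->length = got;
    c->pos = buff;                           /* cursor on the first character */
    return 0;
}
```

Reading the whole file up front keeps the Step and Skip methods down to pointer arithmetic, at the cost of holding the file in memory.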
int EvalScrape(PCursor self, int method, void *data) {
  int i = self->current;      /* index into the Ignore/Collect pointer stack */
  int offset = self->state;
  switch (method) {
  case Step:                  /* advance one entry */
    if (stack[i].code == Done) return (Done);
    self->current++;
    break;
  case Skip:                  /* jump to the location stored in this entry */
    if (stack[i].code == Done) return (Done);
    self->current = stack[i].loc + offset;
    break;
  case Fetch:                 /* strcpy */
  case Append:                /* input only */
  case Init:
  case Set:
  default:
    break;
  }
  return (Null);
}
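Since the routine above is untested and leans on types defined elsewhere, here is a self-contained sketch of the same Step and Skip logic that can be compiled on its own. The Node and Cursor structs, the enum values, and the CountCollected driver are assumptions for illustration; in particular, using Skip only on Ignore entries is my reading of the design, not something the routine itself enforces.

```c
#include <assert.h>
#include <stddef.h>

enum { Null = 0, Done = 1 };        /* return codes; values assumed */
enum { Ignore = 2, Collect = 3 };   /* entry tags; values assumed */
enum { Step, Skip };                /* the two traversal methods */

/* Hypothetical stack entry: a tag plus the index to land on when skipping. */
typedef struct { int code; int loc; } Node;

/* Hypothetical cursor: state plays the role of the offset added on Skip. */
typedef struct { const Node *stack; int current; int state; } Cursor;

/* Same shape as the Step and Skip cases: stop at Done, otherwise move. */
int Eval(Cursor *self, int method) {
    assert(self != NULL && self->stack != NULL);
    const Node *n = &self->stack[self->current];
    if (n->code == Done) return Done;
    if (method == Step) self->current++;
    else if (method == Skip) self->current = n->loc + self->state;
    return Null;
}

/* Example driver: count Collect entries, jumping over Ignore subtrees. */
int CountCollected(Cursor *self) {
    int count = 0;
    for (;;) {
        int code = self->stack[self->current].code;
        if (code == Collect) count++;
        if (Eval(self, code == Ignore ? Skip : Step) == Done) break;
    }
    return count;
}
```

Given a stack whose entries are Collect, Ignore (loc 4), two entries inside the ignored subtree, Collect, then Done, the driver visits entries 0, 1, 4 and 5 and never looks at the ignored pair.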