...

The IKANOW platform provides a JSON-based configuration language for building sources, a UI that helps with configuring and connecting the JSON elements, and various utility applications that help configure specific input types (RSS, DataSift).

...

Using the flow builder to create sources

New sources

From the Source Editor, create a "New Source" (top right button) and select the "Empty Source Template" (one of the "user/shared templates", if created using the flow builder).

...

The source is then saved, tested, and published as normal using the source editor - consult the linked documentation for more details.

...

Simply select the source to edit in the source editor, and press the "FLOW UI" button.

...

Code Block
{
	"name": "string", // e.g. "pdf_extract"
	"type": "string", // e.g. "input", which generates the component path "input/pdf_extract"
	"fields": [ // each field maps to an input port
		{
			"fieldname": "string", // the field name of the input
			"type": "string" // one of "string", "boolean", "int", "float"
		},
		//..
	],
	"sourceBuilder": function(flowElement, source, pipeline, lastSourceElement) {
		// User callback: takes the JSON "flowElement" and uses it to create a new source pipeline element in "pipeline"
		// (pipeline == source.processingPipeline; "source" is provided so you can set tags and other source-level properties).
		// "lastSourceElement" points to the last element returned from "sourceBuilder", or to pipeline[pipeline.length-1] if that was null.
	},
	"sourceValidator": function(flowElement, source, pipeline, lastSourceElement) {
		// Same parameters as above; called before "sourceBuilder".
		// If it returns a non-null string, source building is interrupted and the string is returned to the user.
	}
}
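To illustrate the template above, here is a minimal sketch of a filled-in component together with a small driver that mimics the documented call order (validator first, then builder). The `runComponent` driver, the `settings` key on the pipeline element, and the "Max pages" field are hypothetical names chosen for illustration; they are not taken from the platform.

```javascript
// Hypothetical component definition following the template above.
var pdfExtract = {
    "name": "pdf_extract",
    "type": "input", // component path becomes "input/pdf_extract"
    "fields": [
        { "fieldname": "Max pages", "type": "int" } // hypothetical input
    ],
    "sourceValidator": function(flowElement, source, pipeline, lastSourceElement) {
        var state = flowElement.state || {};
        if (null == state['Max pages']) return "'Max pages' must be set";
        return null; // null means "valid", so building continues
    },
    "sourceBuilder": function(flowElement, source, pipeline, lastSourceElement) {
        var element = {
            display: "Extract text from PDF",
            settings: { maxPages: flowElement.state['Max pages'] } // hypothetical key
        };
        pipeline.push(element);
        return element;
    }
};

// Hypothetical driver sketching the documented call order.
function runComponent(component, flowElement, source, lastSourceElement) {
    var pipeline = source.processingPipeline;
    var error = component.sourceValidator(flowElement, source, pipeline, lastSourceElement);
    if (null != error) return { error: error, last: lastSourceElement };
    var built = component.sourceBuilder(flowElement, source, pipeline, lastSourceElement);
    // Fall back to the last pipeline element when the builder returns nothing.
    var last = (null != built) ? built : pipeline[pipeline.length - 1];
    return { error: null, last: last };
}

var source = { processingPipeline: [] };
var failed = runComponent(pdfExtract, { state: {} }, source, null);
var ok = runComponent(pdfExtract, { state: { 'Max pages': 10 } }, source, null);
console.log(failed.error);    // 'Max pages' must be set
console.log(ok.last.display); // Extract text from PDF
```

In this sketch the failed validation leaves the pipeline untouched, so only the second call adds an element.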
Supported types
  • The following type values are supported:

    "conditional" - creates an if-then-else element supporting splits in the flow logic.

    "input", "globals", "extractors", "text", "metadata", "entities", "storage" - mostly used to order components in a hierarchy and to validate the overall flow.

  • The "fields" describe the input parameters by specifying a "fieldname" and a "type" for each input.


Please note that outputs are created based on the component's type, e.g. conditional elements have multiple outputs, whereas all other components have 1 output.
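As a sketch of how the declared field types could be enforced, the hypothetical helper below checks the values in `flowElement.state` (keyed by "fieldname") against a component's "fields" list and returns a message suitable for a "sourceValidator". The helper name `checkFieldTypes`, its error messages, and the sample fields are assumptions for illustration; the platform may validate inputs differently.

```javascript
// Hypothetical helper: returns null if every supplied value matches its
// declared type, otherwise an error message describing the first mismatch.
function checkFieldTypes(fields, state) {
    var checks = {
        "string": function(v) { return typeof v === "string"; },
        "boolean": function(v) { return typeof v === "boolean"; },
        "int": function(v) { return typeof v === "number" && isFinite(v) && Math.floor(v) === v; },
        "float": function(v) { return typeof v === "number" && isFinite(v); }
    };
    var values = state || {};
    for (var i = 0; i < fields.length; i++) {
        var field = fields[i];
        var value = values[field.fieldname];
        if (null == value) continue; // unset inputs are not checked here
        if (!Object.prototype.hasOwnProperty.call(checks, field.type)) {
            return "Unknown type '" + field.type + "' for field '" + field.fieldname + "'";
        }
        if (!checks[field.type](value)) {
            return "Field '" + field.fieldname + "' must be of type " + field.type;
        }
    }
    return null;
}

var sampleFields = [
    { "fieldname": "Regex", "type": "string" },
    { "fieldname": "Max pages", "type": "int" }
];
console.log(checkFieldTypes(sampleFields, { "Regex": "^a", "Max pages": 3 }));   // null
console.log(checkFieldTypes(sampleFields, { "Regex": "^a", "Max pages": 2.5 })); // Field 'Max pages' must be of type int
```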

Conditional Element

Conditional elements can be used to create a split component that handles if-then-else logic.

Each conditional element should produce a pipeline element with a "criteria" script attribute that is evaluated by the harvest control logic, e.g.:

Code Block
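"sourceBuilder": function(flowElement, source, pipeline, lastSourceElement) {
    // Make sure there is a state object to read the input values from
    if (null == flowElement.state) flowElement.state = {};
    // Build the test expression from the "Fieldname to test" and "Regex" inputs (both are inserted verbatim)
    var critString = "_doc[" + flowElement.state['Fieldname to test'] + "].matches(/" + flowElement.state['Regex'] + "/)";
    var element = {
        display: "Check a document field",
        criteria: "$SCRIPT( return " + critString + ";)"
    };
    // Append the conditional element to the source pipeline
    pipeline.push(element);
},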
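To see what this builder produces, the sketch below wraps the same logic in a standalone function and runs it with sample state. The function names, the sample field name and regex, and the companion `validateConditional` function (which rejects a regex that does not compile) are illustrative assumptions rather than part of the platform.

```javascript
// Same logic as the sourceBuilder above, as a standalone function.
function buildConditional(flowElement, source, pipeline, lastSourceElement) {
    if (null == flowElement.state) flowElement.state = {};
    var critString = "_doc[" + flowElement.state['Fieldname to test'] + "].matches(/" + flowElement.state['Regex'] + "/)";
    var element = {
        display: "Check a document field",
        criteria: "$SCRIPT( return " + critString + ";)"
    };
    pipeline.push(element);
}

// Hypothetical sourceValidator: a non-null string would interrupt the build.
function validateConditional(flowElement, source, pipeline, lastSourceElement) {
    var state = flowElement.state || {};
    if (!state['Fieldname to test']) return "'Fieldname to test' is required";
    try {
        new RegExp(state['Regex']); // throws a SyntaxError if the pattern is invalid
    } catch (e) {
        return "Invalid regex: " + e.message;
    }
    return null;
}

var pipeline = [];
var flowElement = { state: { 'Fieldname to test': "title", 'Regex': "^Report" } };
if (null == validateConditional(flowElement, {}, pipeline, null)) {
    buildConditional(flowElement, {}, pipeline, null);
}
console.log(pipeline[0].criteria);
// $SCRIPT( return _doc[title].matches(/^Report/);)
```

Note that the field name is inserted verbatim, so the generated script reads `_doc[title]`. Whether the value has to be entered with quotes depends on how the harvester evaluates the script, which this sketch does not assume.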

Please note that the flow builder process evaluates conditional components and logical splits and adds "$SETPATH()", "$PATH()" and "$SCRIPT()" to the criteria. These are used internally to keep track of branches in the logic.

The sourceBuilder and sourceValidator functions

...