The Source Editor is displayed when you click "Source editor" in the Manager interface.


Description:

Use the Source Manager to create, edit, validate, test, and publish your sources.

The Sources page of the Infinit.e Source Manager provides a simple interface for adding and testing new sources, saving templates for future sources, and managing existing ones. Future iterations of the tool are intended to provide direct support for the difficult parts of source writing, such as writing JavaScript and regular expressions.

Note that the grey lines can be dragged to increase or decrease the size of the editor window.

Field | Description | Notes
Filter Window

The "Filter" text box will by default search the source titles, but it can also search the following fields:

  • URL: type "url:<url fragment>" 
    • (note that URLs from the processing pipeline or feed configuration objects won't be searched unless you are currently editing them).
  • Community IDs: type "community:<community-id>"
  • ID: type "id:<source _id field>"
  • Tags: type "tags:<tag fragment>"
  • key, title, description, mediaType and extractType: use the same syntax, "fieldName:<field value fragment>"
    • (note title is the default if no prefix is specified)
  • Suspended sources:
    • "suspended:true" to see manually suspended sources
    • "fullQuarantined:true" to see unauthorized sources (a source can be quarantined automatically because it produces too many errors, or because an administrator has disabled it)
    • "tempQuarantined:true" to see sources quarantined for the day (because of a possibly transient source error)
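
The prefix syntax above can be illustrated with a small JavaScript sketch. It is not the Source Manager's actual implementation: the function name `parseFilter`, the `KNOWN_PREFIXES` list, and the case-sensitive prefix matching are assumptions made only to show how a "prefix:fragment" string maps onto a field and a search fragment, with the title as the fallback.

```javascript
// Hypothetical helper (not part of the tool): splits a filter string into
// the field to search and the fragment to look for.
// The prefixes are the ones listed above; matching is assumed case-sensitive.
var KNOWN_PREFIXES = [
  "url", "community", "id", "tags", "key", "title", "description",
  "mediaType", "extractType", "suspended", "fullQuarantined", "tempQuarantined"
];

function parseFilter(text) {
  var trimmed = text.trim();
  var sep = trimmed.indexOf(":");
  if (sep > 0) {
    var prefix = trimmed.slice(0, sep);
    if (KNOWN_PREFIXES.indexOf(prefix) !== -1) {
      return { field: prefix, fragment: trimmed.slice(sep + 1) };
    }
  }
  // No recognised prefix: the title is searched by default.
  return { field: "title", fragment: trimmed };
}
```

For example, under these assumptions `parseFilter("tags:news")` yields the field "tags" with the fragment "news", while a plain string such as "weather feed" is treated as a title search.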
 
Source Functions  
  
  
  
Title  
Share ID  
Description  
Tags  
Owner  
Community  
Test Parameters

Full text:

By default, the full text of a document is not returned (it can be quite long). For testing text extractors (e.g. "boilerpipe" vs "none" vs "AlchemyAPI"), or for testing "unstructured analysis" transformations, the text may be useful or even essential; in these cases, enable this check box.

Number of Documents:

The maximum number of documents that will be enriched and returned. The smaller the number of documents, the quicker the API call returns.

Update test Mode:

 

 
  
   
   
  



 

Code Editor

The editor window displays the source code for editing.

Description:

Field | Description | Notes
JSON | This is the full source, including all fields. |
New Source Pipeline
  • "JS" - The global script that all other elements can use - all of the logic can be written in here as separate functions, and then the scriptlets in other pipeline elements can be simple calls to these functions, to maximize the maintainability of the code in the source.
 
  • "LS" - If you are generating Logstash sources, you can write the Logstash configuration directly in here
 
  • "UI" (currently only supported in the enterprise build) - brings up the source builder GUI
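
As a rough illustration of the "JS" pattern described above, the global script might hold a few reusable functions, so that each scriptlet elsewhere in the pipeline reduces to a one-line call. The function names (`cleanText`, `firstMatch`, `extractAuthor`) and the "Author:" text layout are hypothetical, and this page does not describe how document content is exposed to scriptlets, so the sketch only shows the shared functions themselves.

```javascript
// Global script ("JS" tab): hypothetical helpers shared by all pipeline elements.

// Collapses runs of whitespace into single spaces and trims both ends.
function cleanText(value) {
  return String(value).replace(/\s+/g, " ").trim();
}

// Returns the cleaned first capture group of `pattern` in `text`,
// or null when the pattern does not match.
// `pattern` must contain at least one capture group.
function firstMatch(text, pattern) {
  var m = pattern.exec(text);
  return m ? cleanText(m[1]) : null;
}

// Extracts the value of an (assumed) "Author: ..." line from a block of text.
function extractAuthor(text) {
  return firstMatch(text, /Author:\s*([^\n]+)/);
}
```

A scriptlet in another pipeline element could then consist of nothing more than a call such as `extractAuthor(someText)`, which keeps the extraction logic in one place.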
 
   
Legacy Sources
  • "JS-U" - the Unstructured Analysis Module allows content to be transformed by "scriptlets" (xpath/regex/javascript) into document metadata. This view shows only the javascript maintained in "unstructuredAnalysis.script" - all of the logic can be written in here as separate functions, and then the scriptlets can be simple calls to these functions, to maximize the maintainability of the code in the source.
 
  • "JS-S" - the Structured Analysis Module allows content to be transformed by "scriptlets" (xpath/regex/javascript) into document metadata. This view shows only the javascript maintained in "structuredAnalysis.script" - all of the logic can be written in here as separate functions, and then the scriptlets can be simple calls to these functions, to maximize the maintainability of the code in the source.
 
  • "JS-RSS" - the Feed Harvester can use javascript (and xpath) to create multiple documents out of a single received feed. This view shows only the javascript maintained in "rss.searchConfig.globals" - all of the logic can be written in here as separate functions, and then the scriptlets can be simple calls to these functions, to maximize the maintainability of the code in the source. This tab is only visible if the "searchConfig" field of "rss" is specified; use "Save Source" to reset visibility if it changes during editing.
 
Check format

If run on the "JS-U" or "JS-S" tab, the javascript in "unstructuredAnalysis.script" or "structuredAnalysis.script" respectively is checked instead of the source JSON.

This validation is run automatically before the source is saved, tested, enabled/disabled, or published, and when switching between the JSON and JS tabs. Note that the automatic validation runs only on the JSON, not on the javascript.
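
To give a feel for what a JSON-only check of this kind involves, here is a minimal JavaScript sketch. It is an assumption-laden illustration, not the tool's validation code: the function name `checkSourceJson` and the shape of its result are hypothetical, and it only tests that the text parses as a JSON object, without inspecting any source fields or embedded scripts.

```javascript
// Hypothetical pre-save check (not the tool's implementation):
// reports whether the editor text parses as a JSON object.
function checkSourceJson(jsonText) {
  try {
    var parsed = JSON.parse(jsonText);
    if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
      return { ok: false, error: "Source must be a JSON object" };
    }
    return { ok: true, error: null };
  } catch (e) {
    // JSON.parse throws a SyntaxError on malformed input.
    return { ok: false, error: e.message };
  }
}
```

Because a check like this never executes the scripts embedded in the source, javascript errors would only surface when the source is tested, which matches the note above.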

 
   
   

 

Test Results Pop-Up

When you click the "Test Source" button, the test results pop-up is displayed.

The test results pop-up contains two text elements:

  • A status message including the number of documents returned, any errors or warnings encountered etc.
  • The JSON of the extracted and enriched documents (/wiki/spaces/INF/pages/3899780), if the test was successful.
    • Future versions of the tool will allow the documents to be viewed in widgets in the main GUI, providing a much easier interface to validate the source.

Based on the test results, the source can then be refined until the desired functionality is obtained.

 

Related Procedural Documentation:

Source Editor