Source Editor Interface

Source Editor

The Source Editor opens when you click Source Editor in the Manager interface.

 

Description:

Use the Source Manager to create, edit, validate, test, and publish your sources.

The Source Editor provides a simple interface for adding and testing new sources, saving templates for future sources, and managing existing ones.

 

Note that the grey lines can be dragged to increase or decrease the size of the editor window.

Field | Description | Notes

Filter Window

Text box that lets you filter the list of sources.

By default, the "Filter" text box searches the source titles, but it can also search the following fields:

Search Field | To Search Using This Field...
URL | Type "url:<url fragment>". URLs from the processing pipeline or feed configuration objects are not searched unless you are currently editing them.
Community IDs | Type "community:<community-id>"
ID | Type "id:<source _id field>"
Tags | Type "tags:<tag fragment>"
key, title, description, mediaType, extractType | Use the same "fieldName:<field value fragment>" syntax. title is the default if no prefix is specified.
Suspended sources | Type one of the following:

  • "suspended:true" to see manually suspended sources
  • "fullQuarantined:true" to see unauthorized sources (this can happen automatically because they error too much, or if they are disabled by an administrator)
  • "tempQuarantined:true" to see sources quarantined for the day (because of a possibly transient source error)
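
The prefix syntax above can be illustrated with a small sketch. This is not the editor's actual implementation: the function name `parseFilter` and the returned object shape are assumptions made for illustration, and only the field prefixes come from the table above.

```javascript
// Hypothetical sketch of the "field:<fragment>" filter syntax.
// Field prefixes are taken from the table; the parsing logic is assumed.
const SEARCHABLE_FIELDS = [
  "url", "community", "id", "tags",
  "key", "title", "description", "mediaType", "extractType",
  "suspended", "fullQuarantined", "tempQuarantined",
];

function parseFilter(text) {
  const trimmed = text.trim();
  const colon = trimmed.indexOf(":");
  if (colon > 0) {
    const prefix = trimmed.slice(0, colon);
    if (SEARCHABLE_FIELDS.includes(prefix)) {
      // Everything after the first colon is the fragment to match.
      return { field: prefix, fragment: trimmed.slice(colon + 1) };
    }
  }
  // No recognized prefix: fall back to searching the title.
  return { field: "title", fragment: trimmed };
}

console.log(parseFilter("url:example.com/feed"));
console.log(parseFilter("suspended:true"));
console.log(parseFilter("news"));
```

Splitting on the first colon only means a URL fragment that itself contains colons is kept intact after the "url:" prefix.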
  



 
Source Functions

Buttons used to perform source functions. For more information on any of them, see section Source Editor.

  • Suspend Source
  • Test Source (see Test Results Pop-Up below)
  • Save Source
  • Save As Template
  • Publish Source
Title | Source title.

Share ID

System-generated Share ID associated with the source. A share is an entry in the Community Edition (CE) database used for saving data, e.g. query history (JSON), map reduce JARs (BINARY), widget SWFs (BINARY).

Example:

53ac8b38e4b015f8f58175ec

Description | Description of the source, added at source creation time.
tags | A list of metadata tags, used for searching.
Owner | Source owner.
Community | Community with which the source is associated.
Test Parameters

Full text:

By default, the full text of a document is not returned (it can be quite long). However, when testing text extractors (e.g. "boilerpipe" vs "none" vs "AlchemyAPI") or "unstructured analysis" transformations, the full text may be useful or even essential. In these cases, enable the checkbox.

Number of Documents:

The maximum number of documents that will be enriched and returned. The smaller the number of documents, the quicker the API call returns.

Update Test Mode:

todo

 
  
   

Code Editor

The editor window displays the source code for editing.

Description:

Use the code editor to perform basic editing of the source in JSON or JS format.

The CE platform supports both newer sources (source pipeline) and legacy sources.  UI elements vary based on source type, as described below.

Field | Description | Notes
JSON | This is the full source, including all fields.
New Source Pipeline
  • "JS" - The global script that all other elements can use - all of the logic can be written in here as separate functions, and then the scriptlets in other pipeline elements can be simple calls to these functions, to maximize the maintainability of the code in the source.
 
  • "LS" - For Logstash sources, you can write the Logstash configuration directly here
 
  • "SRC UI" (currently only supported in the enterprise build) - brings up the source builder GUI
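
The "JS" pattern described above (logic in global functions, scriptlets reduced to simple calls) can be sketched as follows. All function names, the `doc` object, and its fields are hypothetical stand-ins chosen for illustration; the platform's real scriptlet variables are not described on this page.

```javascript
// Global script ("JS" tab): all logic lives in named, reusable functions.
// Function and field names are hypothetical.
function normalizeTitle(rawTitle) {
  // Collapse runs of whitespace and trim the ends.
  return String(rawTitle || "").replace(/\s+/g, " ").trim();
}

function extractHashtags(text) {
  const matches = String(text || "").match(/#\w+/g) || [];
  // Lower-case and de-duplicate, keeping first-seen order.
  return Array.from(new Set(matches.map((tag) => tag.toLowerCase())));
}

// Stand-in for a document handed to a scriptlet (shape is assumed).
const doc = {
  title: "  Breaking   news ",
  fullText: "Storm hits #Coast, see #coast and #Weather",
};

// Scriptlets in other pipeline elements then reduce to one-line calls:
const cleanedTitle = normalizeTitle(doc.title);
const hashtags = extractHashtags(doc.fullText);

console.log(cleanedTitle);
console.log(hashtags);
```

Keeping the scriptlets to single calls means a fix to a function in the global script applies everywhere that function is used.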
 
   
Legacy Sources
  • "JS-U" - the Unstructured Analysis Module allows content to be transformed by "scriptlets" (xpath/regex/javascript) into document metadata. This view shows only the javascript maintained in "unstructuredAnalysis.script" - all of the logic can be written in here as separate functions, and then the scriptlets can be simple calls to these functions, to maximize the maintainability of the code in the source.
 
  • "JS-S" - the Structured Analysis Module allows content to be transformed by "scriptlets" (xpath/regex/javascript) into document metadata. This view shows only the javascript maintained in "structuredAnalysis.script" - all of the logic can be written in here as separate functions, and then the scriptlets can be simple calls to these functions, to maximize the maintainability of the code in the source.
 
  • "JS-RSS" - (only visible if the "searchConfig" field of "rss" is specified; use "Save Source" to reset visibility if it changes during editing) the Feed Harvester can use javascript (and xpath) to create multiple documents out of a single received feed. This view shows only the javascript maintained in "rss.searchConfig.globals" - all of the logic can be written in here as separate functions, and then the scriptlets can be simple calls to these functions, to maximize the maintainability of the code in the source.
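
To make the relationship between the legacy tabs and the source JSON concrete, here is a minimal sketch. The three field paths are the ones named above; the script contents, the helper names `scriptForTab` and `isRssTabVisible`, and the rest of the source object are placeholders, not the editor's own code.

```javascript
// Skeleton of a legacy source showing only the fields the three tabs edit.
// Field paths come from the text above; all values are placeholders.
const legacySource = {
  title: "Example legacy source",
  unstructuredAnalysis: { script: "function cleanText(t) { return t.trim(); }" },
  structuredAnalysis: { script: "function toDate(s) { return new Date(s); }" },
  rss: { searchConfig: { globals: "function splitFeed(f) { return [f]; }" } },
};

// Each tab shows the javascript stored at one path in the source.
const TAB_PATHS = {
  "JS-U": ["unstructuredAnalysis", "script"],
  "JS-S": ["structuredAnalysis", "script"],
  "JS-RSS": ["rss", "searchConfig", "globals"],
};

function scriptForTab(source, tab) {
  const path = TAB_PATHS[tab];
  if (!path) return undefined;
  let node = source;
  for (const key of path) {
    if (node == null || typeof node !== "object") return undefined;
    node = node[key];
  }
  return node;
}

// "JS-RSS" is only visible when the "searchConfig" field of "rss" is specified.
function isRssTabVisible(source) {
  return Boolean(source.rss && source.rss.searchConfig);
}

console.log(scriptForTab(legacySource, "JS-U"));
console.log(isRssTabVisible(legacySource));
```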
 
Check format

Validates the JSON of the source. If run on the "JS-U" or "JS-S" tabs, the javascript in "unstructuredAnalysis.script" or "structuredAnalysis.script", respectively, is checked instead.

This validation runs automatically before the source is saved, tested, enabled/disabled, or published, and when switching between the JSON/JS tabs. Note that the automatic validation runs only on the JSON, not on the javascript.
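
As a rough illustration of the two kinds of check (JSON versus javascript), the sketch below uses the standard `JSON.parse` and `Function` constructor. It is an assumption about how such a check could work, not the editor's implementation, and the platform's own script engine may accept a different javascript dialect. The helper names `checkJson` and `checkScript` are hypothetical.

```javascript
// Sketch of a JSON syntax check, as run on the JSON tab.
function checkJson(text) {
  try {
    JSON.parse(text);
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

// Sketch of a script syntax check, as run on a JS tab.
function checkScript(code) {
  try {
    // Compiles the code as a function body without running it;
    // throws a SyntaxError if the code does not parse.
    new Function(code);
    return { ok: true };
  } catch (err) {
    return { ok: false, error: err.message };
  }
}

console.log(checkJson('{"title": "My source"}'));
console.log(checkJson("{title: 'My source'}"));
console.log(checkScript("function f() { return 1; }"));
console.log(checkScript("function f( {"));
```

Because the automatic validation covers only the JSON, a script with a syntax error could still be saved; running the check manually on the JS tabs would catch it earlier.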

 
Scrub

Allows you to scrub the selected source.

Scrubbing a source removes all fields added by the server after publishing, retaining only the actual ingest logic. Scrub a source before copying it or saving it as a template.
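
Conceptually, scrubbing resembles the sketch below. The list of server-added fields is not given on this page, so the names in `ASSUMED_SERVER_FIELDS` are purely hypothetical placeholders, as is the helper name `scrubSource`.

```javascript
// Hypothetical list of server-added fields; the real list is not documented here.
const ASSUMED_SERVER_FIELDS = ["_id", "created", "modified", "harvest"];

// Returns a shallow copy of the source without the server-added fields.
// The original object is left unchanged.
function scrubSource(source, serverFields = ASSUMED_SERVER_FIELDS) {
  const scrubbed = {};
  for (const [key, value] of Object.entries(source)) {
    if (!serverFields.includes(key)) scrubbed[key] = value;
  }
  return scrubbed;
}

const published = {
  _id: "53ac8b38e4b015f8f58175ec",
  created: "placeholder timestamp",
  title: "Example source",
  tags: ["news"],
};

console.log(scrubSource(published));
```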

 
Revert

Allows you to revert the selected source.

For published sources, the "Revert" button in the top right-hand corner of the code editor overwrites the existing temporary share with the current version of the source in the database.

 

 

Test Results Pop-Up

Clicking the "Test Source" button opens a pop-up with the test results.

The test results pop-up contains two text elements:

  • A status message including the number of documents returned, any errors or warnings encountered, etc.
  • The JSON of the extracted and enriched documents, if the test was successful.

Based on the test results, the source can then be refined until the desired functionality is obtained.

 
