...
Source management is intrinsically a complex process (particularly when taking advantage of Infinit.e's customization engine).
The Infinit.e.Manager Sources page provides a simple interface for adding and testing new sources, saving templates for future sources, and managing existing ones. Future iterations of the tool will provide direct support for the more difficult parts of source writing, such as writing JavaScript and regular expressions.
Note that the grey lines can be dragged to increase or decrease the size of the editor window.
Create New Source
...
...
...
...
...
...
...
...
Edit Existing Sources
To edit an existing source, click its name in the list of sources on the left-hand side of the page.
...
Info |
---|
If copying the logic of an existing source, it is recommended to first "scrub" it to remove any server-added fields (particularly "_id" and "key", which can overwrite the existing source). |
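
The scrubbing step described above can be pictured with a minimal sketch. It assumes only what the note states, namely that "_id" and "key" are server-added fields. The helper name `scrubSource`, the sample field values, and the other fields in the sample object are hypothetical, and the real "Scrub" button may remove more fields than these two.

```javascript
// Hypothetical helper, not part of Infinit.e: removes the two server-added
// fields named in the documentation. The real "Scrub" button may strip more.
var SERVER_ADDED_FIELDS = ["_id", "key"];

function scrubSource(source) {
  // Work on a deep copy so the original source object is left untouched.
  var copy = JSON.parse(JSON.stringify(source));
  for (var i = 0; i < SERVER_ADDED_FIELDS.length; i++) {
    delete copy[SERVER_ADDED_FIELDS[i]];
  }
  return copy;
}

// Illustrative source object; all values here are made up.
var existing = {
  _id: "hypothetical-id",
  key: "hypothetical.source.key",
  title: "Example source",
  description: "Illustrative only"
};

var scrubbed = scrubSource(existing);
console.log(JSON.stringify(scrubbed, null, 2));
```

Because the scrubbed copy no longer carries "_id" or "key", saving it should create a new source rather than overwrite the one it was copied from.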
...
...
...
There are 3 tabs that can be edited:
- "JSON" - this is the full source including all fields
- New source pipeline:
  - "JS" - The global script that all other elements can use - all of the logic can be written in here as separate functions, and then the scriptlets in other pipeline elements can be simple calls to these functions, to maximize the maintainability of the code in the source.
  - "UI" (currently only supported in the enterprise build) - brings up the source builder GUI
- Legacy sources:
  - "JS-U" - the Unstructured Analysis Module allows content to be transformed by "scriptlets" (xpath/regex/javascript) into document metadata. This view shows only the javascript maintained in "unstructuredAnalysis.script" - all of the logic can be written in here as separate functions, and then the scriptlets can be simple calls to these functions, to maximize the maintainability of the code in the source.
  - "JS-S" - the Structured Analysis Module allows content to be transformed by "scriptlets" (xpath/regex/javascript) into document metadata. This view shows only the javascript maintained in "structuredAnalysis.script" - all of the logic can be written in here as separate functions, and then the scriptlets can be simple calls to these functions, to maximize the maintainability of the code in the source.
  - "JS-RSS" - (only visible if the "searchConfig" field of "rss" is specified; use "Save Source" to reset visibility if it changes during editing) the Feed Harvester can use javascript (and xpath) to create multiple documents out of a single received feed. This view shows only the javascript maintained in "rss.searchConfig.globals" - all of the logic can be written in here as separate functions, and then the scriptlets can be simple calls to these functions, to maximize the maintainability of the code in the source.
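
The pattern recommended for the script tabs (logic in named functions, scriptlets reduced to simple calls) might look like the sketch below. The function names and the `text` variable are hypothetical. The variables actually exposed to scriptlets by the harvester are not described in this section, so `text` only stands in for "the content being processed". The code uses conservative ES5-style syntax because the script engine version is not stated here.

```javascript
// --- Global script (the kind of code that would live in a script tab) ---

// Collapse runs of whitespace so later pattern matching is more predictable.
function normalizeWhitespace(text) {
  return String(text).replace(/\s+/g, " ").replace(/^ | $/g, "");
}

// Return the first thing that looks like an email address, or null if none.
function extractFirstEmail(text) {
  var match = /[\w.+-]+@[\w-]+(\.[\w-]+)+/.exec(text);
  return match ? match[0] : null;
}

// --- Scriptlet (what a pipeline element would contain) ---
// "text" is a stand-in for the harvested content (assumption, see above).
var text = "Contact:   alice@example.com \n for details";
var scriptletResult = extractFirstEmail(normalizeWhitespace(text));
console.log(scriptletResult);
```

Keeping the scriptlet to a single call means a fix to `extractFirstEmail` is made once in the global script instead of in every element that uses it.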
...
If run on the "JS-U" or "JS-S" tab, the javascript in "unstructuredAnalysis.script" or "structuredAnalysis.script" (respectively) is checked instead.
...
...
...
...
...
...
...
...
...
...
...
...
As can be seen from the above screen capture, the pop-up contains two text elements:
...
Based on the test results, the source can then be refined until it behaves as desired.
...
...
- Future versions of the tool will allow the documents to be viewed in widgets in the main GUI, providing a much easier interface to validate the source.
...
...
Note that templates are saved into your personal community only, but you can see any templates shared across any of the communities to which you belong. To share a template you have created with one of your communities, use the file uploader.
Info |
---|
Before turning a source into a template, that existing source should be "scrubbed" first (middle right, "Scrub" button) - otherwise the presence of the "_id"/"key" fields will mean that the old source is modified instead of a new one being created. |
...
...
...
...
...
...
Editing a previously approved source may not require further moderation if only display fields have been modified; otherwise the source is suspended pending approval, as above.
Note that once a source has been published, its status can be monitored from "<ROOT URL>/InfiniteSourceMonitor.html" (e.g. http://infinite.ikanow.com/InfiniteSourceMonitor.html), provided you are logged into the main GUI or source builder.
After publishing a share, you should get an alert saying that the source has been published and the working copy "share" has been deleted. If you don't get this alert, then it is likely that an internal configuration error has occurred - contact your system administrator to get it fixed.
...
...
...
...
"Scrubbing" sources
...
...
Enabling/disabling sources
Sources can be disabled by setting their "searchCycle_secs" to a negative number. This button just automates that process.
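
As a rough illustration of what the button automates, the sketch below flips the sign of "searchCycle_secs". Only the rule "negative means disabled" comes from the text above. The helper names are hypothetical, and two points are assumptions: that re-enabling restores the positive value of the same magnitude, and that the field is always a non-zero number.

```javascript
// Hypothetical helpers, not part of Infinit.e.

// A source counts as disabled when searchCycle_secs is a negative number.
function isSourceDisabled(source) {
  return typeof source.searchCycle_secs === "number" && source.searchCycle_secs < 0;
}

// Return a copy of the source with the sign of searchCycle_secs set
// according to "enabled". Assumption: the magnitude is preserved.
function setSourceEnabled(source, enabled) {
  var copy = JSON.parse(JSON.stringify(source));
  var current = copy.searchCycle_secs;
  if (typeof current !== "number" || current === 0) {
    // Missing or zero values are outside what this sketch covers.
    throw new Error("sketch only handles a non-zero numeric searchCycle_secs");
  }
  var magnitude = Math.abs(current);
  copy.searchCycle_secs = enabled ? magnitude : -magnitude;
  return copy;
}

// Illustrative source; the values are made up.
var active = { title: "Example source", searchCycle_secs: 3600 };
var disabled = setSourceEnabled(active, false);
var reenabled = setSourceEnabled(disabled, true);
console.log(disabled.searchCycle_secs, reenabled.searchCycle_secs);
```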
...
...
...
...
...
...
...
...
...
...
Monitoring sources
A graphical utility for monitoring sources is available from the home page (Source Monitor link). It opens in a new tab and is pictured below. It is not possible to change any source information from this GUI.
A subset of this information can also be accessed from the Source Manager dialog of the main GUI.
The colors have the following meanings:
- Green: successfully harvested ("success")
- Blue: in progress ("in_progress")
  - or partially harvested ("success_iteration"): the most recent harvest cycle completed, but not all available documents were harvested because of document/cycle limitations
- Red: harvested with errors ("error")
- Yellow: not yet seen by a harvester, or currently unapproved.
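
The color legend above can be summarized as a small lookup, sketched here for reference. The status strings are the ones quoted in the list. The function name is hypothetical, and treating any other or missing status as yellow is an assumption, since the text only says yellow covers sources not yet seen by a harvester or currently unapproved.

```javascript
// Status strings are taken from the legend; everything else is illustrative.
var STATUS_COLORS = {
  success: "green",
  in_progress: "blue",
  success_iteration: "blue", // cycle completed, but document/cycle limits were hit
  error: "red"
};

// Assumption: any status not listed above (including none at all) maps to
// yellow, standing for "not yet seen by a harvester, or currently unapproved".
function colorForStatus(status) {
  return Object.prototype.hasOwnProperty.call(STATUS_COLORS, status)
    ? STATUS_COLORS[status]
    : "yellow";
}

console.log(colorForStatus("success_iteration"));
```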
...
Panel |
---|
Related Reference Documentation: |