Overview

The Source Editor allows you to perform many useful source management actions.  Using the Source Editor you can create, edit, validate, test and publish sources so that the data is ready for visualization.

Source management is intrinsically a complex process (particularly when taking advantage of Infinit.e's customization engine).

Warning

The Source Editor GUI is not currently compatible with IE. It is compatible with Chrome, Firefox, and Safari.

 

Using the Source Editor

Creating a New Source

You can use the Source Editor to create a new source in either of two ways:

  • Using a template
  • Starting with a blank source

Creating a New Source From Template

To create a new source from a template

  1. Click on the New Source button in the upper right hand corner of the page.  The Infinit.e.Manager application will forward you to the Create New Source page.
  2. Choose a dropdown option from the "Create a New Source" box, and click on View template.
  3. Fill out the title/description/tags/community fields. 
  4. Click Create Source.

You will be able to modify the source on the next page before it starts running.

 

Info

When copying an existing source into the New Source window, the existing source should be "scrubbed" first (middle right, "Scrub" button). Otherwise the presence of the "_id"/"key" fields will result in the old source being modified rather than a new one being created.

Creating a New Source From Blank Template

To create a new source from a blank template

  1. Click on the New Source button in the upper right hand corner of the page.  The Create New Source page is displayed.
  2. Fill out the title/description/tags/community fields.
  3. Click Create Source.  You can build a source from scratch or paste in an existing source on the next page.

Editing Existing Sources

When you edit sources you should be aware of the types of documents that appear in the sources list.  The following source document types are possible:

  • Published sources
  • Editable copies of published sources
  • Shares that are not yet published.  Shares are denoted by "(*)".
Info

If copying the logic of an existing source, it is recommended to first "scrub" it to remove any server-added fields (particularly "_id" and "key", which can overwrite the existing source).

Info

"private" sources ("isPublic":"false") do not have all fields displayed unless you are an admin, community moderator, or the source owner. If you are none of these, testing such sources (or using them as the basis for a new source) is likely to fail. Contact the source owner to get a full copy.

 

 

 

To edit an existing source

  1. Click on the source's name in the list of Sources found on the left hand side of the page.
  2. Edit the source using one of the applicable editor tabs (eg JSON, JS, SRC UI).

For more information about the editor tabs, see section Source Editor Interface.  For detailed information about the Source Builder, see section Source Builder User Interface.

Sharing Sources

By default only you can see your temporary copies of sources (so, for example, you cannot share links to sources being edited). You can use the file uploader to share sources in either read or read-write modes.

    To share a source

    1. Navigate your browser to the File Uploader interface.
    2. For "Filter On", select the JSON type "source".
    3. Select the source of choice from the filtered results.
    4. Share with the community of choice.  The user that you intend to share the source with must be a member of that community, and must have at least content publisher permissions in order to make changes.

    Info

    When co-authoring sources, there is no automatic synchronization. If two users make changes concurrently, work can be lost.

    Validating the Source Format

    You can use the source editor to check if the JSON is valid.

    To check that the source JSON format is valid

    • Select the "Check Format" button (middle right).

    Info

    The automatic validation runs only on the JSON, not on the JavaScript.

    For more information, see section Source Editor Interface.
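
For checking a source outside the editor, the following is a minimal sketch of an equivalent local check. It is not the editor's own implementation: it simply uses Python's standard `json` module to report whether the text parses, and the `check_source_format` helper and the sample "title" field are hypothetical names chosen for illustration. Like the "Check Format" button, it says nothing about any embedded JavaScript.

```python
import json


def check_source_format(text):
    """Return (True, None) if text parses as JSON, else (False, description)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno/colno locate the first character the parser could not accept.
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None


# A well-formed fragment (the field names are illustrative only).
ok, problem = check_source_format('{"title": "Example", "isPublic": true}')

# A trailing comma is a common hand-editing mistake and is not valid JSON.
bad, why = check_source_format('{"title": "Example",}')
```

Running a check like this before pasting a source into the editor can catch simple syntax slips early.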

    Testing a Source

    Once a first draft of a source is complete, it should be tested to see which documents it extracts and how it enriches them with additional metadata, entities, associations, etc.

    Two parameters can be set for testing the sources:

    • "Full text": by default, the full text of a document is not returned (it can be quite long). For testing text extractors (eg "boilerpipe" vs "none" vs "AlchemyAPI"), or for testing "unstructured analysis" transformations, the text may be useful or even essential; in these cases, enable this check box.
    • "Number of documents": the maximum number of documents that will be enriched and returned. The smaller the number of documents, the quicker the API call returns.

    To test a source

    1. Configure the test parameters as required.  
    2. Click on the Test Source button to start the testing process.

    Info

    It can take a few minutes for the processed documents to be returned. Temporarily setting the "waitTimeOverride_ms" field of the "rss" object to be 1000 (ie 1s) can be useful during the debug stages.

     

    Info

    Note that the first time you test a source, you are likely to get an error accompanied by a request from the browser to allow/deny the window from launching pop-ups. Select "Allow always" or the equivalent, refresh the browser if necessary, and press the test button again.

    3. Review the content of the test results pop-up.

    Based on the test results, the source can then be refined until the desired functionality is obtained.
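
The debugging tip about "waitTimeOverride_ms" can be sketched as a small helper that produces a temporary debug copy of a source. This is an illustration only: it assumes the "rss" object sits at the top level of the source JSON, and the `with_fast_wait` helper and the sample "title" field are hypothetical.

```python
import copy


def with_fast_wait(source, wait_ms=1000):
    """Return a copy of source whose "rss" object has waitTimeOverride_ms set.

    Assumes "rss" is a top-level object in the source JSON (an assumption of
    this sketch). The original source is left untouched so the override is
    easy to discard once debugging is finished.
    """
    debug_source = copy.deepcopy(source)
    debug_source.setdefault("rss", {})["waitTimeOverride_ms"] = wait_ms
    return debug_source


original = {"title": "Example feed", "rss": {}}
debug = with_fast_wait(original)  # 1000 ms, ie 1s, as suggested for debugging
```

Working on a copy keeps the temporary setting out of the version that is eventually published.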

    Saving Sources as Templates

    The Sources page allows you to save sources as templates to streamline the process of creating new sources that share common attributes.

    To save a source as a template

    • Click on the Save Source as Template button.  Your new template will be available in the Source Templates drop down on the Create New Source page.

    The template is shared with the source's community - if you don't want to share with anybody else then set the dropdown to be your personal community before saving it as a template.

    Publishing Sources

    Sources need to be "published" to the system in order for the Infinit.e Core Server to begin harvesting. Once you have created and tested a source, or edited and tested an existing source, you can publish the source.

    If you submit (publish) a new source to a community you do not own, then it is initially added in a "pending" state. An email is sent to the community owners and moderators, and they are given the option of allowing the source or not.

    Editing a source that has previously been approved may not require further moderation if only display fields have been modified; otherwise the source is suspended pending approval, as above.

    Note that once a source has been published, its status can be monitored from "<ROOT URL>/InfiniteSourceMonitor.html" (eg http://infinite.ikanow.com/InfiniteSourceMonitor.html), provided you are logged into the main GUI or source builder.

    After publishing a share, you should get an alert saying that the source has been published and the working copy "share" has been deleted. If you don't get this alert, then it is likely that an internal configuration error has occurred - contact your system administrator to get it fixed.

    To publish a source

    1. Click on the Publish Source button. Provided that the source is valid, it will be published.

    Reverting Sources

    For published sources, the "revert" button in the top right hand corner of the code editor overwrites the existing temporary share with the current version of the source in the database. This can be useful for two reasons:

    • To discard unwanted manual changes 
    • (If there are no changes) to update the "harvest" status block

    To revert a source

    • From the code editor, click on Revert.  The current version of the source in the database overwrites the temporary share.

    Scrubbing Sources

    Scrubbing sources removes all fields added by the server after publishing, just retaining the actual ingest logic. It should be used before copying/templating.

    To scrub the source

    • From the code editor, click on Scrub.  Any extraneous fields are removed.

     

    Info

    If you accidentally scrub the source and then save it, you can get back to the original published source by deleting the share and then re-selecting the source.
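
As a rough illustration of what scrubbing does, the sketch below removes server-added fields from a source held as a Python dictionary. It covers only "_id" and "key", the two fields this page names explicitly; the real Scrub button removes all server-added fields, and the full list is not given here. The `scrub` helper and the sample values are hypothetical.

```python
# Only the two fields named in this guide; the editor's Scrub button removes
# every server-added field, so this tuple is an incomplete assumption.
SERVER_ADDED_FIELDS = ("_id", "key")


def scrub(source):
    """Return a copy of source without the known server-added fields."""
    return {name: value for name, value in source.items()
            if name not in SERVER_ADDED_FIELDS}


published = {
    "_id": "hypothetical-id",    # placeholder value
    "key": "hypothetical.key",   # placeholder value
    "title": "Example",
    "description": "Sample source used for illustration",
}
scrubbed = scrub(published)
```

Because "_id" and "key" are gone from the copy, pasting it into the New Source window creates a new source rather than modifying the old one.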

    Suspending Sources

    You can suspend a source to remove the source and its documents from queries.

    To suspend a source

    • From the source editor, click on Suspend Source.  The source will no longer be searched, and its documents will no longer be made available to queries from the visualization GUI.

     

    Info

    Sources can be suspended by setting their "searchCycle_secs" to a negative number. This button just automates that process.

    Info

    Note that this button only affects the un-published version of the source (ie the corresponding share). The source should be published to apply the change - you are automatically prompted for this.
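
The manual equivalent of the Suspend Source button can be sketched as follows. The text only states that "searchCycle_secs" must be negative; negating the existing value (so the original interval can be recovered later) and falling back to -1 when no value is set are choices made for this sketch, not documented behaviour. Both helper names are hypothetical.

```python
import copy


def suspend(source):
    """Return a copy of source with a negative searchCycle_secs."""
    suspended = copy.deepcopy(source)
    cycle = suspended.get("searchCycle_secs")
    # Keep the magnitude so the interval is recoverable; -1 is an arbitrary
    # negative placeholder when no cycle has been set.
    suspended["searchCycle_secs"] = -abs(cycle) if cycle else -1
    return suspended


def is_suspended(source):
    """True if the source carries a negative searchCycle_secs."""
    cycle = source.get("searchCycle_secs")
    return isinstance(cycle, (int, float)) and cycle < 0


active = {"title": "Example", "searchCycle_secs": 3600}
paused = suspend(active)
```

As with the button, a change made this way affects only the working copy until the source is published again.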

    Deleting a Source's Documents

    Deleting a source's documents leaves the source intact but deletes all of the documents harvested so far. It can only be performed on sources you own, unless you are a community moderator or an admin.

    To delete a source's documents

    • From the Source Editor, click on Delete Docs.

     

    Info

    Use with caution. Also for sources with many documents, this operation may take some time (eg 10 minutes for 500,000 documents).

    Deleting Sources or Shares

    You can use the source editor to delete shares or sources.

    To delete a share

    1. Click the delete button in the sources list window.
    2. When prompted, click on Ok.
    • If the share has been published the share is deleted but the published source is left alone and will appear in the Sources list.
    • If the share has not been published the share will simply be deleted and will disappear from the Sources list.

    To delete a source

    1. Click the delete button in the sources list window
    2. When prompted, click on Ok.

    If you confirm the deletion, the system deletes the published source and all harvested documents associated with it.

     

    Info

    Deleting a published source will also delete all documents associated with that source. In some cases those documents will not be retrievable (eg old URLs from an RSS feed). This should therefore be used with caution. Also for sources with many documents, this operation may take some time (eg 10 minutes for 500,000 documents).

     


     
