The Source Editor allows you to perform many useful source management actions. Using the Source Editor you can create, edit, validate, test and publish sources so that the data is ready for visualization.
The Source Editor GUI is not currently compatible with IE. It is compatible with Chrome, Firefox, and Safari.
If you publish a new source, or submit (publish) a source to a community you do not own, the source is initially added in a "pending" state. An email is sent to the community owners and moderators for approval.
Changes to display fields for previously approved sources do not require further approval.
Once a source has been published, you can monitor its status using the Source Monitor.
After publishing a share, you should get an alert saying that the source has been published and the working copy "share" has been deleted. If you don't get this alert, then it is likely that an internal configuration error has occurred and you should contact your system administrator.
You can use the Source Editor to create a new source in several ways.
To create a new source from template
You will be able to modify the source on the next page before it starts running.
When copying an existing source into the New Source window, the existing source should be "scrubbed" first. Otherwise the presence of the "_id"/"key" fields will result in the old source being modified rather than a new one being created.
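As an illustration of what scrubbing protects against, the following minimal Python sketch strips the two identity fields named above from a copy of a source before it is reused. The function name `scrub_source` and the sample field values are hypothetical, and the sketch only handles "_id" and "key"; the server may add other fields that the editor's own scrub action also removes.

```python
import copy

# Assumption: only the two fields named in the text are handled here.
# The real scrub action may remove additional server-added fields.
IDENTITY_FIELDS = ("_id", "key")


def scrub_source(source: dict) -> dict:
    """Return a deep copy of the source without its identity fields."""
    scrubbed = copy.deepcopy(source)
    for field in IDENTITY_FIELDS:
        # pop with a default so that already-scrubbed sources are accepted
        scrubbed.pop(field, None)
    return scrubbed


# Hypothetical example source; the values are placeholders.
existing = {"_id": "abc123", "key": "example.key", "title": "Example feed"}
new_source = scrub_source(existing)
```

Because the function works on a copy, the original dictionary keeps its "_id" and "key" fields and only the copy is safe to paste into the New Source window.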
To create a new source from a blank template
When you edit sources you should be aware of the types of documents that appear in the sources list. The following source document types are possible:
Authorization Requirements:
"private" sources ("isPublic":"false") do not have all fields displayed unless you are an admin, community moderator, or the source owner. In this case, it is likely that testing them or using them as the basis for a new source will fail. Contact the source owner to get a full copy.
To edit an existing source
For more information about the editor tabs, see section Source Editor Interface. For detailed information about the Source Builder, see section Source Builder User Interface.
By default only you can see your temporary copies of sources (so for example you cannot share links to sources being edited). You can use the File Uploader to share sources in either read or read-write modes.
To share a source
When co-authoring sources, there is no automatic synchronization. If two users make changes concurrently, work can be lost.
You can use the Source Editor to check whether the source JSON is valid.
To check that the source JSON format is valid
The automatic validation runs only on the JSON, not on the JavaScript.
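The same kind of check can be reproduced outside the editor. The sketch below uses Python's standard `json` module and is not the editor's own validator; the function name `check_source_json` and the "script" field in the usage example are hypothetical.

```python
import json


def check_source_json(text: str):
    """Return (True, None) if text parses as JSON, else (False, message)."""
    try:
        json.loads(text)
    except json.JSONDecodeError as err:
        # lineno and colno locate the first syntax error found by the parser
        return False, f"line {err.lineno}, column {err.colno}: {err.msg}"
    return True, None


# A trailing comma is a JSON syntax error and is reported.
bad_ok, bad_message = check_source_json('{"title": "Example", }')

# Broken JavaScript inside a string value is still valid JSON, so it passes.
js_ok, js_message = check_source_json('{"script": "function( {"}')
```

The second call shows why JSON validation alone cannot catch script errors: the parser treats the JavaScript as an opaque string.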
For more information, see section Source Editor Interface.
Once a first draft of a source is complete, it should be tested to see which documents it extracts and how it enriches them with additional metadata, entities, and associations.
Two parameters can be set for testing the sources:
"Full text": By default, the full text of a document is not returned because it can be quite long. However, for testing text extractors (eg "boilerpipe" vs "none" vs "AlchemyAPI"), or for testing metadata generation, it may be useful to enable "Full text" mode.
"Number of documents": the maximum number of documents that will be enriched and returned. The smaller the number of documents, the quicker the API call is returned.
To test a source
Based on the test results, the source can then be refined until the desired functionality is obtained.
The first time you test a source, you are likely to get an error accompanied by a browser prompt asking whether to allow or deny pop-ups from the window.
The Sources page allows you to save sources as templates to streamline the process of creating new sources that share common attributes.
To save a source as a template
Authorization Requirements:
The template is shared with the source's community. If you do not want to share it with anybody else, set the dropdown to your personal community before saving the source as a template.
Sources need to be "published" to the system in order for the Community Edition Core Server to begin harvesting. Once you have created and tested a source, or edited and tested an existing source, you can publish the source.
To publish a source
The "revert" button in the top right hand corner of the code editor, for published sources, overwrites the existing temporary share with the current version of the source in the database. This can be useful for two reasons:
To revert a source
Scrubbing sources removes all fields added by the server after publishing, just retaining the actual ingest logic. It should be used before copying/templating.
To scrub the source
If you accidentally scrub the source and then save it, you can get back to the original published source by deleting the share and then re-selecting the source.
You can suspend a source to remove the source and its documents from queries.
To suspend a source
Note that this button only affects the unpublished version of the source (ie the corresponding share). The source must be published to apply the change; you are automatically prompted for this.
Deleting a source's documents leaves the source intact but deletes all of the documents harvested so far. It can only be performed on sources you own unless you are a community moderator or an admin.
To delete a source's documents
Use with caution. For sources with many documents, this operation may take some time (eg 10 minutes for 500,000 documents).
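To get a rough feel for how long a deletion might take, the quoted figure can be scaled to other document counts. The sketch below assumes the time grows linearly with the number of documents, which the text does not state; actual times depend on the server and its load, and the function name is hypothetical.

```python
# Reference point quoted in the text: about 10 minutes for 500,000 documents.
REFERENCE_DOCS = 500_000
REFERENCE_MINUTES = 10.0


def estimate_delete_minutes(doc_count: int) -> float:
    """Rough estimate, ASSUMING deletion time scales linearly with doc count."""
    if doc_count < 0:
        raise ValueError("doc_count must be non-negative")
    return doc_count * REFERENCE_MINUTES / REFERENCE_DOCS


# Under the linear assumption, 2,000,000 documents would take about 40 minutes.
estimate = estimate_delete_minutes(2_000_000)
```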
You can use the Source Editor to delete shares or sources.
To delete a share
To delete a source
If you confirm the deletion the system will then delete the published source and all harvested documents associated with it.
Deleting a published source will also delete all documents associated with that source. In some cases those documents will not be retrievable (eg old URLs from an RSS feed). This should therefore be used with caution. Also, for sources with many documents, this operation may take some time (eg 10 minutes for 500,000 documents).