/config/source/test?numReturn=<documents-to-return>&returnFullText=<1|0|true|false> (POST)
Info

Returns documents harvested and enriched according to the POSTed source object. This call can be used to test and debug a source configuration before saving the object to be harvested (at which point problems become harder both to debug and to fix, eg by deleting documents).

Authentication

Required, see Auth - Login

Arguments

numReturn (optional)
Number of processed documents to return (defaults to 10). Note that this does not affect how many documents are harvested, only how many are enriched, so the call may still take a while when run on large directories/databases/fileshares with slow IO.

returnFullText (required)
If "false" (default) or "0", the full text of each document is not returned (which makes the response easier to read). If "true" or "1", the populated "fullText" field is returned (which is sometimes necessary, eg to debug the text extraction or cleansing configuration).
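
As an illustration of how the two query parameters combine, the following is a minimal Python sketch that builds the request URL. The helper name build_test_url and the API_ROOT constant are hypothetical (the host is simply the one used in this page's curl example); only the parameter names and the documented defaults come from this page.

```python
from urllib.parse import urlencode

# Assumption: host and "/api" prefix copied from the curl example on this page.
API_ROOT = "http://infinite.ikanow.com/api"


def build_test_url(num_return=10, return_full_text=False, api_root=API_ROOT):
    """Return the /config/source/test URL with both query parameters set.

    num_return defaults to 10 and return_full_text to False, mirroring the
    defaults documented above.
    """
    params = {
        "numReturn": int(num_return),
        "returnFullText": "true" if return_full_text else "false",
    }
    return f"{api_root}/config/source/test?{urlencode(params)}"


print(build_test_url(num_return=1, return_full_text=True))
```

Requesting the full text only while debugging extraction or cleansing keeps the normal responses short.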

Examples

Method.Post

Example using curl:

Code Block
curl -XPOST 'http://infinite.ikanow.com/api/config/source/test?numReturn=1' -d '{ "json": {...} }'
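
For scripted testing, roughly the same request can be prepared from Python's standard library. This is only a sketch: the function name make_test_request is hypothetical, the empty object stands in for the real source JSON (which this page elides), and the login step described under Authentication is not shown.

```python
import json
from urllib.request import Request


def make_test_request(payload, url):
    """Build (but do not send) a POST request carrying the JSON payload.

    No Content-Type header is set here, so urllib falls back to its
    form-encoded default when the request is sent, much as curl -d does.
    """
    body = json.dumps(payload).encode("utf-8")
    return Request(url, data=body, method="POST")


# Placeholder payload: replace the empty object with the real source JSON.
req = make_test_request(
    {"json": {}},
    "http://infinite.ikanow.com/api/config/source/test?numReturn=1",
)
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it; that call is left out because it needs a logged-in session.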

Actionscript

...

Code Block
{
	response: {
		action: "Test Source"
		success: true
		message: "successfully returned 4 docs: source=<url> extracted=4 updated=0 deleted=0 urlerrors=0. <Warnings/non-fatal errors>"
		time: 10
	}
	data: {
		// source JSON, including the generated communityIds, key, and _id fields.
	}
}
Error Response
Code Block
{
	response: {
		action: "Test Source"
		success: false
		message: "returned 0 docs: 4 file error(s).\n\n<List of errors>"
		time: 10
	}
}
Code Block
{
	response: {
		action: "Test Source"
		success: false
		message: "Source error: <error message>"
		time: 10
	}
}
Common error messages:
  • Unable to serialize Source JSON: indicates that the JSON object POSTED (or passed via URL parameter) is invalid. Try using JSON Lint or a similar tool to debug.
  • The source ID is invalid: For update requests, a source "_id" or "key" has been specified that does not match any in the database (this often happens with adding a source where the "_id" or "key" from a different system has been left in).
  • Source error: A major problem occurred before enrichment started, eg authentication or path problems.
  • "successfully returned 0 docs <...>" (0, or fewer than the number requested): Also normally indicates that an error occurred during harvesting, but on a per-document basis rather than for the entire source.
  • "[...] urlerrors=<more than 0>": These errors have occurred during the enrichment process, eg the third party text/entity extractor failed, or there were errors in the structured/unstructured analysis handlers. The lines following this notification are error messages that should give some idea what went wrong. 
    • (Note that errors can appear even if "urlerrors=0", eg non-fatal problems that caused an entity/association to be dropped).
  • (curl/wget returns nothing: Normally an indication that a POST has been used with no URL parameter, or a POST has been used with no content. Alternatively if using a command-line json lint tool, it may indicate that there are characters not handled by the tool or it believes the JSON is invalid.)
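
The checks in the list above can be roughly automated. The sketch below is a hypothetical helper (the name diagnose and the returned labels are not part of the API); it only pattern-matches the message strings quoted on this page, so treat it as a starting point rather than a complete classifier.

```python
import re


def diagnose(message):
    """Map a response message to a rough diagnosis, following the list above."""
    if message.startswith("Source error:"):
        return "source-level failure before enrichment"
    if "Unable to serialize Source JSON" in message:
        return "invalid source JSON"
    returned = re.search(r"returned (\d+) docs", message)
    url_errors = re.search(r"urlerrors=(\d+)", message)
    # A non-zero urlerrors count points at the enrichment stage.
    if url_errors and int(url_errors.group(1)) > 0:
        return "enrichment errors"
    # Zero returned documents usually means per-document harvest errors.
    if returned and int(returned.group(1)) == 0:
        return "per-document harvest errors"
    return "no known problem detected"


print(diagnose("Source error: <error message>"))
print(diagnose("returned 0 docs: 4 file error(s)."))
```

Since non-fatal warnings can appear even when urlerrors=0, the text after the counters is still worth reading when no problem is detected.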