Overview

This toolkit element lets you use regular expressions or JavaScript to set the document metadata fields (e.g. title, description, publishedDate).

TODO

Format

TODO convert to JSON

{
	"display": string,
	"docMetadata": {} // see DocumentSpecPojo below
}
//////////////////////////////////
 
	public static class DocumentSpecPojo {
		public String title; // The string expression or $SCRIPT(...) specifying the document title
		public String description; // The string expression or $SCRIPT(...) specifying the document description
		public String publishedDate; // The string expression or $SCRIPT(...) specifying the document publishedDate
		public String fullText; // The string expression or $SCRIPT(...) specifying the document fullText
		public String displayUrl; // The string expression or $SCRIPT(...) specifying the document displayUrl
		public Boolean appendTagsToDocs; // If true (*NOT* the default), source tags are appended to the document
		public StructuredAnalysisConfigPojo.GeoSpecPojo geotag; // Specify a document level geo-tag
	}

Legacy documentation:

TODO

Description

Legacy documentation:

TODO

The following date formats are currently supported (for example, for the publishedDate field):

		if (null == _allowedDatesArray_startsWithLetter) 
		{
			_allowedDatesArray_startsWithLetter = new String[] {
					DateFormatUtils.SMTP_DATETIME_FORMAT.getPattern(),
					
					"MMM d, yyyy hh:mm a",
					"MMM d, yyyy HH:mm",
					"MMM d, yyyy hh:mm:ss a",
					"MMM d, yyyy HH:mm:ss",
					"MMM d, yyyy hh:mm:ss.SS a",
					"MMM d, yyyy HH:mm:ss.SS",
					
					"EEE MMM dd HH:mm:ss zzz yyyy",
					"EEE MMM dd yyyy HH:mm:ss zzz",
					"EEE MMM dd yyyy HH:mm:ss 'GMT'Z (zzz)",					
			};					
			_allowedDatesArray_numeric_1 = new String[] {
					"yyyy-MM-dd'T'HH:mm:ss'Z'",
					DateFormatUtils.ISO_DATE_FORMAT.getPattern(),
					DateFormatUtils.ISO_DATE_TIME_ZONE_FORMAT.getPattern(),
					DateFormatUtils.ISO_DATETIME_FORMAT.getPattern(),
					DateFormatUtils.ISO_DATETIME_TIME_ZONE_FORMAT.getPattern()
			};
			_allowedDatesArray_numeric_2 = new String[] {					
					"yyyyMMdd",
					"yyyyMMdd hh:mm a",
					"yyyyMMdd HH:mm",
					"yyyyMMdd hh:mm:ss a",
					"yyyyMMdd HH:mm:ss",
					"yyyyMMdd hh:mm:ss.SS a",
					"yyyyMMdd HH:mm:ss.SS",
					// Julian, these are unlikely
					"yyyyDDD",
					"yyyyDDD hh:mm a",
					"yyyyDDD HH:mm",
					"yyyyDDD hh:mm:ss a",
					"yyyyDDD HH:mm:ss",
					"yyyyDDD hh:mm:ss.SS a",
					"yyyyDDD HH:mm:ss.SS",
				};
			_allowedDatesArray_stringMonth = new String[] {
					"dd MMM yy",
					"dd MMM yy hh:mm a",
					"dd MMM yy HH:mm",
					"dd MMM yy hh:mm:ss a",
					"dd MMM yy HH:mm:ss",
					"dd MMM yy hh:mm:ss.SS a",
					"dd MMM yy HH:mm:ss.SS",
				};
			_allowedDatesArray_numericMonth = new String[] {
					"MM dd yy",
					"MM dd yy hh:mm a",
					"MM dd yy HH:mm",
					"MM dd yy hh:mm:ss a",
					"MM dd yy HH:mm:ss",
					"MM dd yy hh:mm:ss.SS a",
					"MM dd yy HH:mm:ss.SS",
			};
		}
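
As a rough illustration of how a list of patterns like the ones above can be applied, the sketch below tries each pattern in turn with java.text.SimpleDateFormat and reports the first one that parses the whole input. The class name DateFormatProbe, the method names, and the shortened pattern list are hypothetical and chosen for illustration only; this is not the platform's actual parsing code, which may group or order the patterns differently.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

public class DateFormatProbe {
    // A small, hypothetical subset of the patterns listed above
    static final String[] PATTERNS = {
        "yyyy-MM-dd'T'HH:mm:ss'Z'",
        "MMM d, yyyy HH:mm:ss",
        "dd MMM yy",
        "yyyyMMdd"
    };

    // Returns the first pattern that parses the whole input, or null if none does
    static String firstMatchingPattern(String text) {
        for (String pattern : PATTERNS) {
            if (parseStrict(text, pattern) != null) {
                return pattern;
            }
        }
        return null;
    }

    // Strict parse: non-lenient, and the pattern must consume every character
    static Date parseStrict(String text, String pattern) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ENGLISH);
        fmt.setLenient(false);
        fmt.setTimeZone(TimeZone.getTimeZone("UTC")); // assumption: treat zone-less dates as UTC
        ParsePosition pos = new ParsePosition(0);
        Date parsed = fmt.parse(text, pos);
        return (parsed != null && pos.getIndex() == text.length()) ? parsed : null;
    }

    public static void main(String[] args) {
        System.out.println(firstMatchingPattern("2013-05-01T12:30:00Z"));
        System.out.println(firstMatchingPattern("01 May 13"));
        System.out.println(firstMatchingPattern("not a date"));
    }
}
```

Requiring the parse to consume the full string avoids accepting a value that only matches a prefix of a shorter pattern. If no pattern matches, a custom function such as the one described below is the fallback.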

 

If the date doesn't match one of these formats, add a function along the following lines in the globals script:

// substitute YOUR.DATE.FIELD and the date format
function createPubDate(metadata) {
    var date = metadata.YOUR.DATE.FIELD;
    var parsedDate = new java.text.SimpleDateFormat('MM/dd/yyyy hh:mm:ss a (zzz)').parse(date);
    return '' + parsedDate.toString();
}

and then call it from the docMetadata.publishedDate field, for example:

{
	"docMetadata": {
		//...	
		"publishedDate": "$SCRIPT( createPubDate(_doc.metadata) )"
		//...
	}
}
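
Because the globals script calls java.text.SimpleDateFormat directly, it can help to check a custom pattern against a sample value in plain Java before putting it into createPubDate. The sketch below does that for the pattern used above; the class name PubDatePatternCheck, the helper parseOrNull, and the sample value are hypothetical and only serve as an illustration.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.Locale;

public class PubDatePatternCheck {
    // Returns the parsed date, or null if the value does not fully match the pattern
    static Date parseOrNull(String pattern, String value) {
        SimpleDateFormat fmt = new SimpleDateFormat(pattern, Locale.ENGLISH);
        fmt.setLenient(false); // reject out-of-range fields such as month 13
        ParsePosition pos = new ParsePosition(0);
        Date parsed = fmt.parse(value, pos);
        return (parsed != null && pos.getIndex() == value.length()) ? parsed : null;
    }

    public static void main(String[] args) {
        String pattern = "MM/dd/yyyy hh:mm:ss a (zzz)"; // same pattern as in createPubDate
        String sample = "05/01/2013 02:30:00 PM (GMT)"; // hypothetical sample value

        Date parsed = parseOrNull(pattern, sample);
        System.out.println(parsed == null ? "no match" : "parsed: " + parsed.getTime());

        // A value in a different layout is rejected rather than guessed at
        System.out.println(parseOrNull(pattern, "2013-05-01") == null ? "no match" : "parsed");
    }
}
```

If the sample value does not parse here, it will not parse inside the script either, so the pattern should be adjusted first.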

Examples

TODO
