...
Whether a "title" string is specified affects how the Web Extractor generates documents. The behavior also depends on whether a links or splitter element is specified downstream in the source pipeline. For more information about links and splitter, see Follow Web links.
Links or Splitter is Not Included:
When neither a links nor a splitter element is included downstream, specifying a "title" for an extraUrls entry causes the Web Extractor to process the included URL as a web page. When no title is specified, the URL is treated as an RSS feed. This lets you mix RSS feeds and web pages within the same source configuration.
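The rule above can be sketched as a small model. This is only an illustration, not the Web Extractor's actual code: the enclosing "web" key, the example URLs, and the helper names classify_extra_url and classify_all are assumptions made for the example. Only extraUrls, url, and "title" come from the text, and the sketch covers only the case where no links or splitter element follows.

```python
import json

# Hypothetical source fragment. The "web" key and both URLs are assumptions
# for illustration; only "extraUrls", "url", and "title" come from the text.
source_fragment = {
    "web": {
        "extraUrls": [
            # A title is given, so this entry is processed as a web page.
            {"url": "http://example.com/news/article.html", "title": "Example article"},
            # No title is given, so this entry is treated as an RSS feed.
            {"url": "http://example.com/feed.rss"},
        ]
    }
}


def classify_extra_url(entry):
    """Model how one extraUrls entry is treated when no links or splitter
    element is included downstream.

    Assumption: "specified" means the "title" key is present. How an empty
    title string is handled is not stated in the text.
    """
    return "web page" if "title" in entry else "RSS feed"


def classify_all(fragment):
    """Map each URL in the fragment to its modelled treatment."""
    return {e["url"]: classify_extra_url(e) for e in fragment["web"]["extraUrls"]}


if __name__ == "__main__":
    print(json.dumps(classify_all(source_fragment), indent=2))
```

Under these assumptions, the first entry is classified as a web page and the second as an RSS feed, which is the mixed configuration described above.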
Links or Splitter is Included:
If a links element is included downstream, specifying a "title" causes the Web Extractor to treat the URL as a web page. The original page is preserved as a document, and links can still be followed according to how the links element is set up.
...