...
- Plugin manager documentation
- Information about the built-in JavaScript engine
- Developer information about building Java Hadoop plugins
- An IKANOW blog post discussing using jsfiddle to visualize custom analytics
- (contains links to some other relevant blog posts about running analytics on Infinit.e datasets, including this one about doing temporal/sentiment analytics on emails)
Exporting the data (and alerting, and backups)
...
Using the API
An important feature of the Infinit.e platform is that its data is open: our user interfaces and applications use our open RESTful API, so any other client can get the same data.
The primary method of getting at the data is via the query API call, and that linked page shows some examples of making the call in JavaScript and ActionScript. In addition we have a beta (i.e. undocumented!) Java driver here (we use it internally, so it is well supported). There are more general examples of using the API in different languages here.
In the context of using the query API to support bulk export of the data, this section of our knowledgebase describes how to use the "curl" command line utility (in Linux, or MacOS, or cygwin on Windows platforms) to script getting all the data out.
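As a rough illustration of the same bulk-export idea in a scripting language, the sketch below pages through results and writes them to disk one JSON document per line. It deliberately does not show the HTTP request itself: `fetch_page(skip, limit)` is a hypothetical callable standing in for one query API call, and the assumption that results can be requested in skip/limit pages should be checked against the query API documentation linked above.

```python
import json
from typing import Callable, Iterable, Iterator, List


def export_all(fetch_page: Callable[[int, int], List[dict]],
               page_size: int = 100) -> Iterator[dict]:
    """Yield every document by requesting successive pages.

    fetch_page(skip, limit) is a hypothetical callable (not part of any
    Infinit.e driver) that performs one query call and returns the list
    of documents in that page.
    """
    skip = 0
    while True:
        page = fetch_page(skip, page_size)
        for doc in page:
            yield doc
        # A short (or empty) page means there is nothing left to fetch.
        if len(page) < page_size:
            break
        skip += len(page)


def write_json_lines(docs: Iterable[dict], path: str) -> int:
    """Write one JSON document per line and return the number written."""
    count = 0
    with open(path, "w", encoding="utf-8") as out:
        for doc in docs:
            out.write(json.dumps(doc) + "\n")
            count += 1
    return count
```

A caller would supply its own `fetch_page` that wraps the real query call (for example via curl or an HTTP library) and then run `write_json_lines(export_all(fetch_page), "export.jsonl")`.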
MongoDB dumps and backups
The underlying data store for Infinit.e is the popular NoSQL database called MongoDB. If you have ssh access to the server then you can use mongodump or mongoexport to get at the data. This image describes the database format.
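For scripted dumps, a minimal sketch of wrapping mongodump is shown below. The `--out`, `--db` and `--collection` options are standard mongodump flags; the database and collection names are left as parameters because they are not listed on this page, and the output directory is a hypothetical example. It assumes mongodump is on the PATH of the server you have ssh access to.

```python
import subprocess
from typing import List, Optional


def mongodump_command(out_dir: str,
                      db: Optional[str] = None,
                      collection: Optional[str] = None) -> List[str]:
    """Build a mongodump command line.

    With no db given, mongodump dumps every database on the local server.
    A collection is only passed when a db is also given, since mongodump
    needs the database name to locate the collection.
    """
    cmd = ["mongodump", "--out", out_dir]
    if db:
        cmd += ["--db", db]
        if collection:
            cmd += ["--collection", collection]
    return cmd


def run_dump(out_dir: str,
             db: Optional[str] = None,
             collection: Optional[str] = None) -> None:
    """Run mongodump and raise CalledProcessError if it fails."""
    subprocess.run(mongodump_command(out_dir, db, collection), check=True)
```

For example, `run_dump("/tmp/infinite-dump")` (a hypothetical path) would dump everything; pass `db` and `collection` once you have looked up the names in the database format description.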
It is worth noting while discussing Mongo that a nightly backup of the data is generated (at 1am) and stored at "/opt/db-home/" as "db_backup_<<hostname>>_most_recent.tgz". Currently nothing is done with this file (i.e. it is overwritten nightly). It is recommended that you upload this to S3 regularly (it was not possible to pre-configure this because of AWS restrictions). More details on the backup process are provided here.
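One way to do the recommended S3 upload is a small script run from cron some time after the 1am backup. The sketch below is illustrative only: it assumes the AWS command line tool is installed and configured with credentials, that "<<hostname>>" in the file name matches what `socket.gethostname()` returns on the server, and the bucket name and "infinite-backups" key prefix are hypothetical. Each upload gets a dated key so that earlier backups are not overwritten.

```python
import datetime
import socket
import subprocess
from typing import List, Optional

BACKUP_DIR = "/opt/db-home"  # location given in the text above


def backup_path(hostname: Optional[str] = None) -> str:
    """Path of the nightly backup file for this (or the given) host.

    Assumption: the <<hostname>> placeholder is the plain system hostname.
    """
    host = hostname or socket.gethostname()
    return f"{BACKUP_DIR}/db_backup_{host}_most_recent.tgz"


def s3_destination(bucket: str, hostname: str, day: datetime.date) -> str:
    """Dated S3 URL; the bucket and key prefix are hypothetical examples."""
    return (f"s3://{bucket}/infinite-backups/{hostname}/"
            f"db_backup_{day.isoformat()}.tgz")


def upload_command(local_path: str, destination: str) -> List[str]:
    """Command line for copying one file to S3 with the AWS CLI."""
    return ["aws", "s3", "cp", local_path, destination]


def upload_latest_backup(bucket: str) -> None:
    """Upload today's backup; raises CalledProcessError on failure."""
    host = socket.gethostname()
    dest = s3_destination(bucket, host, datetime.date.today())
    subprocess.run(upload_command(backup_path(host), dest), check=True)
```

Calling `upload_latest_backup("my-backup-bucket")` (bucket name hypothetical) once a day would keep a dated history of backups in S3.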
GUI utilities
The main GUI provides three ways of saving the data or workspace state (see screenshot below):
- "Copy workspace link to clipboard": This copies a (long!) URL to the clipboard that will return you to the current query, community set, and widget set when pasted into a browser.
- Note that this URL is too long for some applications to handle (e.g. Gmail, unfortunately) - a forthcoming release will use a link shortener.
- "Create PDF for current data view": This will open a new tab containing a PDF that contains screenshots of all the open widgets together with information about the query that was used.
- (Widgets can be programmed to write more detailed information into the PDF, though currently only the Doc Browser widget takes advantage of this.)
- As an alternative, the second section of this blog post describes generating per-widget screenshots. This has been very popular for creating "quickview" presentations.
- "Export JSON for current data": This saves a file to local disk containing the JSON returned from the query. The format is described here.
TODO widget exports
Alerting using RSS
The final option in the "Options" screenshot above ("Create RSS feed for current query") has been very popular with our users.
TODO
Further reading:
Importing other sources
TODO complex subject, lots of documentation (gui coming soon), this section just highlights a few of the most relevant possibilities to datasift
...