Information Security Analytics Terminology

Overview

Working with the Information Security Analytics platform requires understanding the terminology used throughout the system. The terminology follows industry standards where possible, but the specific way Information Security Analytics uses each key term determines how the concept appears to end users.

Definition of Key Terms

Functional Area	Term	Definition
Manager	Users	Users of the Information Security Analytics platform can be broken down into administrative users and additional users. Project creators can be administrators or additional users. The user experience per project is dictated by access to data groups and project items.
	User Groups	Group users together to ease administrative burden when assigning users to projects.
	Data Groups	Group sources together, in order to assign data sources to projects and ease administrative burden.
Sources	Data Sources	Data ingested into the platform is stored in a platform-specific JSON format to enable flexible transformation and enrichment. Data sources can be structured, unstructured, or semi-structured, and include the following types:
		Logstash Import Source: A lighter-weight IKANOW object format for storing logs, term/record volumes, or statistics. A very common use case for logs/records is in DevOps environments, where log files need to be filtered and analyzed. The record format makes it easy to filter log files, define column names, and determine GeoIP information, for example. This record data can then be analyzed along with other documents for log analysis and big data cyber analytics within one platform.
		RSS with NLP: You can use the Manager to connect to an RSS feed as an input source.
		Web pages with NLP: Extracts documents from lists of URLs.
		Yahoo Search API: You can use the Manager to connect to the Yahoo Search API, which provides a rich set of premium data APIs and tools that developers and entrepreneurs can use to build custom search engines and innovative experiences.
		Datasift from S3: Datasift is an aggregation service that streams and enriches tweets, posts, blogs, and news from a variety of social media and other Internet sources.
		Twitter Search API: The Twitter Search API is part of Twitter’s REST API. It allows queries against the indices of recent or popular Tweets and behaves similarly to, but not exactly like, the Search feature available in Twitter mobile or web clients.
		Lookup Table Builder: You can use the Manager to build a lookup table in the Information Security Analytics platform. Building a lookup table means indicating the JSON share where the lookup table is located and specifying the Key Field and Header Fields in your lookup table.
		Lookup Table Applier: Applying a lookup table means indicating the name of the lookup table to apply, the Data Group to apply it to, and the record type (e.g., Apache).
		Advanced Source Builder: You can use the Advanced Source Builder to edit the raw JSON of the Source Pipeline Elements directly in the browser.
	Source Builder	UI elements enabling non-power users to easily import data sources with baseline configuration parameters.
	Advanced Source Builder	Enables the platform document JSON format to be edited directly for advanced configuration. Enables integration of advanced JavaScript techniques.
	Publish	Once the sources have been optimally configured, they are published. Publishing is the last stage of source ingestion. Published sources can be monitored, and the data sources they comprise can be viewed using Dashboards. Once a source is published, the platform begins harvesting. You can view the status of the source to see whether it has been harvested successfully (possible status codes: "success", "in_progress", "error", "unapproved").
Collaboration	Project Workspace	An individual or shared area that enables a user to create their own collection of analytics and metrics to perform analysis of external/internal information. Users can share a workspace, edit it, save a copy, or save it as a workspace template to reuse its structure.
Collaboration	Share	Once a workspace has been created, it can be shared with others.
Search	Basic	Once data has been ingested and harvested, it can be queried to return useful results. An IKANOW query can return several different object formats:
		Documents: Proprietary JSON-formatted documents that are returned to the platform as the result of a query. A document is a representation of the source data after it has passed through the stages of the source pipeline: Ingest, Transform, Enrich, and Output. Documents contain their own fields as well as sub-objects such as entities, associations, metadata, and query enrichment. These objects are useful for visualization using Dashboards and score cards.
		Logs/Records: A lighter-weight object format for storing logs, term/record volumes, or statistics. A very common use case for logs/records is in DevOps environments, where log files need to be filtered and analyzed. The record format makes it easy to filter log files, define column names, and determine GeoIP information, for example. This record data can then be analyzed along with other documents for log analysis and big data cyber analytics within one platform.
		Events: Events are the real-world actions that lead to the creation of records. For example, in a DevOps environment an incoming email message is accepted or rejected, an outgoing message is delivered or rejected (as spam or as a bounce, for instance), the message is opened or a link in it is clicked, or the recipient asks to unsubscribe. Besides its kind, each event can carry metadata such as the message sender, recipient address, message ID, SMTP error codes, link URLs, and geographic information. Every event essentially comes down to a timestamp and a number of fields with their values.
		Search Results Filtering: Documents can be filtered using a diverse set of criteria. For example, using the Advanced Settings you can filter specific entities and associations (sub-objects of the document JSON type).
		Entities: Search results can be filtered by entity type so that only documents including those entity types are returned to Dashboards, score cards, and so on. For example, filter by person, company, product, or location.
		Tags: When sources are added to the platform, tags can be applied. These tags can then be used to limit a query to the subset of documents within a Project that carry them.
		Verb Categories: You can filter returned associations by verb category. For example, you can return only associations with the verb category "travel", which encompasses associations with verbs such as "flew" and "drove".
		Weightings: In scoring, weightings let you further alter query results. For example, you can weight the scoring algorithm towards returning more significant results rather than more relevant ones. The weighting is expressed as a ratio of significance to relevance.
	Advanced	Use the Advanced Options to set document analysis thresholds, scoring weights, geo decay, and other settings which impact the overall functionality of the platform and the widgets. Using advanced settings it is possible to exclude entire sets of entity types and association types from analysis.
	Advanced Query Builder	For more advanced searches you can use the Advanced Search interface. This will allow you to construct very complex search criteria, adding in additional search terms, associations, and geographic or date parameters. You can move or nest elements of the advanced query chain, and apply constraints around specific timelines and geographic locations.
	Shared Objects	1. Collections: share groups of documents, events, and logs. You can use Collections to help group and manage your documents. Collections (Buckets) can be used to group selected documents together, like a folder system. A queue is a way of segmenting documents by query term in an automated fashion. When creating a queue, the system takes the given query term as the basis of a periodic job that runs and groups the documents resulting from that query. 2. Shared Searches: Anybody with access to a specific workspace can share access to the collections/queues.
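
The queue behavior described in the Shared Objects entry (a query term driving a periodic job that groups matching documents) can be illustrated with a minimal Python sketch. This is not the platform's implementation or API: the Document and Queue classes, the run_once method, and the plain substring match are all hypothetical names and simplifications chosen for illustration, and the periodic scheduling is simulated by calling run_once twice.

```python
from dataclasses import dataclass, field


@dataclass
class Document:
    """Hypothetical stand-in for a platform document (not the real JSON schema)."""
    doc_id: str
    text: str


@dataclass
class Queue:
    """Hypothetical queue: collects documents that match a query term."""
    query_term: str
    doc_ids: set[str] = field(default_factory=set)

    def run_once(self, documents: list[Document]) -> list[str]:
        """Simulate one run of the periodic job.

        Returns the IDs added on this run, so repeated runs only
        report documents that were not already in the queue.
        """
        term = self.query_term.lower()
        added = []
        for doc in documents:
            # Assumption: a case-insensitive substring match stands in
            # for whatever query evaluation the platform really performs.
            if term in doc.text.lower() and doc.doc_id not in self.doc_ids:
                self.doc_ids.add(doc.doc_id)
                added.append(doc.doc_id)
        return added


docs = [
    Document("d1", "Phishing campaign targets finance staff"),
    Document("d2", "Quarterly patch notes"),
]
queue = Queue(query_term="phishing")
first_run = queue.run_once(docs)

# A new document arrives before the next scheduled run.
docs.append(Document("d3", "New phishing kit observed"))
second_run = queue.run_once(docs)
```

In this sketch a collection would be the manual counterpart: a set of document IDs chosen by a user, with no query term or recurring job attached.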