Exabeam Advanced Analytics Release Notes

What's New

Google Pub/Sub Log Source Support

If your Advanced Analytics deployment is running on Google Cloud Platform and aggregates all of your logs into Google Pub/Sub, you can now configure Google Cloud Pub/Sub as a log source for Advanced Analytics.

For more information on adding log sources in Advanced Analytics, please refer to the Advanced Analytics Admin Guide.

Support for Granular Rule Reprocessing

When adding new or managing existing Exabeam rules on the Exabeam Rules page, you can choose to reload individual rules or all rules. You can now also choose to apply rule changes and reprocess historic data. When you apply rule changes and reprocess historic data, reprocessing runs in parallel with active, live processing. It does not impede or stop any real-time analysis.

Window confirming that you wish to reprocess rules, with options to reprocess historic data from a certain date.

Auditing User Activities within Exabeam

Advanced Analytics now logs specific activities related to administrators and users of the product, including activities within the UI as well as configuration and server changes. This is primarily needed to monitor administrator and analyst activities from a compliance perspective. Advanced Analytics activity data is collected and sent to the syslog destination of your choosing.

For more information on accessing Advanced Analytics log activity, please refer to the Audit Actions within Exabeam section within the Advanced Analytics Admin Guide.

Changes to Exabeam Analytics Engine

We've made several changes to improve the usability of the Exabeam Analytics Engine.

You can now cancel in-progress reprocessing jobs. This is especially helpful if a particularly large reprocessing job slows the entire system, if you accidentally initiated a reprocessing job, or if you simply want to cancel the job for any other reason.

The window confirming that you want to cancel the reprocessing job, with the Cancel Job button highlighted in red circle.

Additionally, we've updated the Reprocessing Jobs table to provide more details and control of your complete, pending, in progress, canceled, or failed reprocessing jobs.

For more information on the Analytics Engine in Advanced Analytics, please refer to the Advanced Analytics Admin Guide.

Updated Exabeam Analytics Engine Reprocessing Table

We've improved the Exabeam Analytics page to provide better detail and control on reprocessing jobs. You can now view the status of jobs (for example, completed, in-progress, pending, and canceled), view specific changes and other details regarding a job, and cancel a pending or in-progress job.

For more information on reprocessing with the Exabeam Analytics Engine, please refer to the Restart and Reprocess section of the Advanced Analytics Admin Guide.

Model Sizing Improvements

Exabeam now manages the sizes of your models so that they can no longer cause Advanced Analytics to run out of memory. Categorical models have a maximum number of bins: the new limit is 10 million bins, although some models, such as those where the feature is "Country", have a lower limit. Several guardrails keep models from consuming excessive memory and affecting overall system health and performance: a maximum limit on bins, aging for models, and validation of the data that goes into models. If a model still consumes excessive memory, Advanced Analytics disables that model.

Disabled models are displayed on the System Optimization tab of the System Health page. There are two disabled models tables — the first contains a list of models that have been disabled for all entities (Global Models), while the second contains a list of models that have been disabled for specific entities (Model Instances) within Advanced Analytics.

The System Health page on the System Optimization tab.

You are also shown an indicator on the User or Asset Page when a model has been disabled for that profile.

In addition, model aging is now configurable and enabled by default, with a window of 16 weeks. For some models that contain more sensitive or rare data, e.g. models that track executives or privileged users, the cycle is 32 weeks. Model aging considers data samples taken from a certain number of weeks instead of all points since the beginning of time. This process enhances system performance by cleaning out unused or underutilized models.

For more information on the Model Sizing Improvements, please refer to Model Aging in the Configuring Advanced Analytics section and Disabled Models in the Health Status section of the Advanced Analytics Admin Guide.

Top Users and Assets Improvements

As you continue to use Advanced Analytics, you may find that certain high volume users or assets within your organization amass a large number of events of certain event types. When this happens, Exabeam now takes preventive measures to protect the performance of Advanced Analytics by disabling these event types for a specific user or device.

Exabeam helps manage the event type volume by identifying and blacklisting top users and assets. When a specific activity type, such as "web", exceeds 10 million events for a certain user or device, and a single event type within that activity type makes up 70% or more of those events, that event type is automatically disabled for that user or device. If no single event type reaches the 70% share of the total event count in that activity type, then that entity is disabled. These thresholds are configurable.
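The decision described above can be sketched in a few lines of Python. This is an illustration only, not Exabeam's implementation: the function name, the return values, and the event type names in the usage note are hypothetical, and the two constants simply mirror the default thresholds quoted in this section.

```python
# Default thresholds quoted in the release notes (both are configurable).
EVENT_COUNT_THRESHOLD = 10_000_000   # activity type "exceeds 10 million events"
DOMINANT_SHARE_THRESHOLD = 0.70      # one event type is "70% or more" of them


def evaluate_entity(event_counts):
    """Decide what to disable for one user or device within one activity type.

    event_counts maps an event type name to its event count.
    Returns a hypothetical (action, event_type) pair.
    """
    total = sum(event_counts.values())
    if total <= EVENT_COUNT_THRESHOLD:
        # Volume is below the limit, so nothing is disabled.
        return ("none", None)

    # Find the event type that contributes the most events.
    top_type, top_count = max(event_counts.items(), key=lambda kv: kv[1])
    if top_count / total >= DOMINANT_SHARE_THRESHOLD:
        # One event type dominates: disable only that event type.
        return ("disable_event_type", top_type)

    # No single event type dominates: the entity itself is disabled.
    return ("disable_entity", None)
```

For example, an entity with 9 million events of one hypothetical type and 3 million of another (75% dominated by the first) would have only the first event type disabled, while an even 6 million / 6 million split would disable the entity.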

Disabled event types are displayed on the System Optimization tab of the System Health page. You can see a list of all event types that have been disabled, along with the users and assets for which they have been disabled.

Disabled Event Types settings under the System Health page's System Optimization tab.

You are also shown an indicator on the User or Asset Page when an event type has been disabled for that profile. The affected User/Asset Risk Trend and Timeline would account for the disabled event type by displaying statistics only for the remaining events.

For more information on the Top Users and Assets Improvements, please refer to System Optimization in the Health Status section of the Advanced Analytics Admin Guide.

Parser Defensiveness

Advanced Analytics automatically identifies poor parser performance and disables such parsers in order to preserve the system health.

For each parser, we determine the average parse time over a five-minute period and compare it to a configurable threshold variable in lime.conf. We also divide the time taken by each parser by the total time taken by all parsers in the same five-minute period, and compare that share to a second configurable threshold variable in lime.conf. If a parser's average parse time exceeds the first threshold and its share of the overall parse time exceeds the second, the parser becomes a candidate for disabling. We perform the same check during a second five-minute period, and if both conditions still hold, we disable the parser.

A slow parser is disabled if its average parsing time is above the parsing time threshold and makes up 50% or more of the total parsing time of all parsers.
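The two-window check can be sketched as follows. This is a minimal illustration under stated assumptions, not the product's code: the function names and the input shape (a mapping from parser name to a list of per-event parse times in milliseconds) are hypothetical, and the average-time threshold is passed in because its default value is not given here.

```python
def is_slow(window, parser, avg_threshold_ms, share_threshold=0.5):
    """Check one five-minute window.

    window maps a parser name to the list of its parse times (ms).
    """
    own_times = window.get(parser, [])
    total_all = sum(sum(times) for times in window.values())
    if not own_times or total_all == 0:
        return False

    average = sum(own_times) / len(own_times)   # average parse time
    share = sum(own_times) / total_all          # share of all parsing time
    return average > avg_threshold_ms and share >= share_threshold


def should_disable(first_window, second_window, parser,
                   avg_threshold_ms, share_threshold=0.5):
    """A parser is disabled only if it is slow in two consecutive windows."""
    return (is_slow(first_window, parser, avg_threshold_ms, share_threshold)
            and is_slow(second_window, parser, avg_threshold_ms, share_threshold))
```

Requiring the condition to hold in two consecutive windows avoids disabling a parser because of a single short spike.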

Disabled parsers are displayed on the System Optimization tab of the System Health page. You can see a list of all parsers that have been disabled.

Disabled Parsers settings, under the System Health page's System Optimization tab.

You are also shown an indicator when Advanced Analytics determines that a parser is problematic and disables it.

For more information on parser defensiveness, please refer to System Optimization in the Health Status section of the Advanced Analytics Admin Guide.

System Load Redistribution

Exabeam can automatically identify overloaded worker nodes, and then take corrective action by evenly redistributing the event category load across the cluster.

Automatic system load redistribution is enabled by default, and you can enable or disable it on the System Optimization tab of the System Health page. When it is enabled, the system checks the load distribution once a day.

System Load Redistribution settings, under the System Health page's System Optimization tab, with System Rebalancing toggle switched to enabled.

You are also shown an indicator in the UI when a redistribution of load is needed, is taking place, or has completed.

For more information on System Load Redistribution, please refer to System Optimization in the Health Status section of the Advanced Analytics Admin Guide.

Hadoop Distributed File System (HDFS) Namenode Storage Redundancy

A safeguard has been introduced in the HDFS NameNode (master node) storage to prevent data loss in the case of data corruption. Redundancy is automatically set up for you when you install or upgrade Advanced Analytics on a deployment that includes at least three nodes.

With this feature enabled, the system can still move forward without data loss if the master NameNode fails. In such cases, you can use this redundancy to fix the state of Hadoop (such as installing a new SSD if there was an SSD failure) and successfully restart it.

For more information on HDFS redundancy in Advanced Analytics, please refer to the “HDFS Namenode Storage Redundancy” section within the Advanced Analytics Admin Guide.

Custom Configuration Validation

Any edits you make to your Exabeam custom configuration files are now validated before you are able to restart the analytics engine to apply them to your system. This will help to prevent Advanced Analytics system failures due to inadvertent errors introduced to the config files.

The system validates Human-Optimized Configuration Object Notation (HOCON) syntax, catching errors such as a missing quote or wrong capitalization ("SCOREMANAGER" instead of "ScoreManager"). The validation also checks dependencies, for example, extended rules in custom config files whose dependencies are missing from the default config files.

If found, errors are listed by file name during the analytics engine restart attempt.
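To make the two syntax examples above concrete, here is a deliberately naive Python sketch of a capitalization check and a missing-quote check. It is not the Advanced Analytics validator and is not a HOCON parser: both function names are hypothetical, and the quote check ignores escaped and triple-quoted strings.

```python
def find_case_mismatches(custom_keys, known_keys):
    """Return (found, expected) pairs for keys that differ from a
    known key only by letter case, e.g. SCOREMANAGER vs ScoreManager."""
    by_lower = {key.lower(): key for key in known_keys}
    mismatches = []
    for key in custom_keys:
        expected = by_lower.get(key.lower())
        if expected is not None and expected != key:
            mismatches.append((key, expected))
    return mismatches


def lines_with_unbalanced_quotes(text):
    """Return 1-based line numbers that contain an odd number of
    double quotes (naive: escapes and triple quotes are not handled)."""
    return [number
            for number, line in enumerate(text.splitlines(), start=1)
            if line.count('"') % 2 == 1]
```

A real validator would parse the file into a configuration tree first; the point here is only the kind of error each check reports.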

In addition to helping you troubleshoot your custom config edits, Advanced Analytics also saves the last known working config files. Every time the system successfully restarts, a backup is made and stored for you. Therefore, you can choose to rollback to the last backup if you run into configuration errors that you are unable to fix.

For more information on Custom Configuration Validation in Advanced Analytics, please refer to the “Restart the Analytics Engine” and “Custom Configuration Validation” sections within the Advanced Analytics Admin Guide.

Calico Implementation

The underlying native docker overlay network technology has been replaced with Calico in Advanced Analytics.

Compared to the native docker overlay technology, Calico brings the following benefits to all Exabeam deployments:

  • Simplicity – Clusters on Docker overlay depend on a Linux kernel technology known as VXLAN which is subject to continuous updates. Calico removes this Linux dependency and uses layer 3 routing, just like non-container networks. Routes are shared using BGP, which is the de facto routing protocol of the Internet (used everywhere in autonomous systems like ISPs for example).

  • Stability – Docker overlay makes use of tunneling over VXLAN to fool the nodes into thinking they are on the same network. Untunneling and translating headers using this approach adds significant latency and unreliability to overlay, especially at scale. Calico removes this by having the first three BGP peers advertise container routes to all other peers creating a redundant routing framework. When these other peers receive the route information, they will update their routing tables in real time for stability.

  • Performance – Use of docker overlay requires encapsulating and decapsulating packets over UDP. This adds overhead to the network stack. It can be reduced by hardware acceleration but it cannot be fully removed. This overhead is completely avoided by Calico as it does not employ any encapsulation. Also, since Calico does not use tunneling, the network performance is equivalent to a native network stack.

  • Security – Calico offers built-in support for network policies that can be controlled by simple metadata on the primary host. To implement the equivalent security in docker overlay (particularly for port restrictions from external services attempting to reach internal containers), complicated firewall rules must be created, maintained, and terraformed.

Adopting Calico resolves various overlay and network flapping issues with cluster containers that resulted in intermittent network connection breakdowns and UI restarts in Advanced Analytics deployments.

You must meet the following requirements before installing or upgrading to this release:

  • AWS deployments: All nodes MUST have src/dest (source/destination) checks turned off.

  • GCP deployments: The network must be open to IP protocol 4 (IP in IP) traffic within the cluster.

  • Nodes must allow traffic to and from their own security group.

  • Use a load balancer in front of your cluster and use TCP (not UDP) as the transmission protocol between the load balancer and the Advanced Analytics hosts. A customer-provided load balancer is required in front of Advanced Analytics to avoid downtime for Syslog ingestion during the upgrade.

If you have questions about the prerequisites, please create a support ticket at Exabeam Community to connect with a technical representative who can assist you.

Disaster Recovery Enhancements

Previously, Advanced Analytics disaster recovery only supported a two-cluster (active and standby) deployment scenario. Now, you can configure a multi-cluster disaster recovery deployment involving three or more clusters. In this scenario, you can configure a source (primary) cluster and one or more destination clusters.

The primary cluster is responsible for fetching the logs from SIEM or receiving the logs via Syslog. The secondary cluster(s) are responsible for replicating data from the primary cluster.

For more information on configuring and managing disaster recovery, please refer to the Disaster Recovery section of the Advanced Analytics Admin Guide.

Additional Settings Link

Previously, you could not conveniently access certain tabs of the Advanced Analytics Admin Operations page. Now, we've added an Additional Settings link to the Advanced Analytics settings page to quickly and conveniently access the Admin Operations page.

For more information on additional settings in Advanced Analytics, please refer to the Advanced Analytics Admin Guide.

Prevent Searches on Masked Fields

Previously, customers could search for and view non-masked fields, whether or not data masking was enabled for their deployment.

For example, users could successfully search for a non-masked username or account name in Threat Hunter. Additionally, users could copy the URL of a masked username (such as https://<aa_IP>/uba/#user/<obfuscated_name>/...) and simply change the URL to be a valid account name (such as, https://<aa_IP>/uba/#user/<valid_account_name>/...).

Now, users can no longer search for and view non-masked fields by any means if data masking is enabled for their deployment.

For more information on data masking in Advanced Analytics, please refer to the Advanced Analytics Admin Guide.

Reprocessing Job Notifications

You can now configure email and syslog notifications for certain reprocessing job status changes, including start, end, and failure.

Notifications by Product to select Job status changes, and Job failure to send notification.

For more information on configuring job notifications, please refer to the Restart and Reprocess section of the Advanced Analytics Admin Guide.

EDS Memory Enhancements

We have streamlined and optimized memory management in EDS by avoiding duplicates, which speeds up context lookup. This improves performance when you run searches, generate reports, and perform other tasks within the Advanced Analytics UI.

Exabeam Threat Intelligence Service Enhancements

We've added a new settings page to provide better control over Exabeam Threat Intelligence Service feeds in your Advanced Analytics deployment. Additionally, you can now easily assign or unassign threat intelligence feeds to/from individual or multiple context tables or create new context tables directly from the page.

Cloud Config in Settings to select Threat Intelligence Feeds.

For more information on configuring and managing Exabeam Threat Intelligence Service, please refer to the Exabeam Threat Intelligence Service Overview section of the Advanced Analytics Admin Guide.

Exabeam Cloud Telemetry Service

Telemetry data such as events, metrics, and environment data is collected from Data Lake and Advanced Analytics deployments and sent to Exabeam Cloud Platform to provide visibility into the overall system health, reduce system health false positives, and enable Exabeam to gain insight into common system issues, such as processing downtime (for example, processing delays and storage issues) and UI/application downtime.

The Exabeam Telemetry Service is enabled by default, following the installation of this version.

Note

If you do not wish to send any data to the Exabeam Cloud, please follow the opt-out instructions in the Disabling Telemetry Service section of the Advanced Analytics Admin Guide before installing this version. You can also opt out at any time in the future.

For more information, please refer to Advanced Analytics Admin Guide > Exabeam Cloud Telemetry Service Overview.

Entity Analytics UI Performance Optimization

We have optimized the following UI aspects of Entity Analytics so that they load significantly faster:

  • Notable Assets and asset-based watchlists on the Homepage

  • Security Alerts under the asset profile icon

  • Entity Profile

  • Entity Timeline

Numerical Clustering Improvements

We’ve sped up numerical model calculations, which should improve the performance of Advanced Analytics. Calculations involving the time of the week are the exception. The new algorithm for numerical histogram anomaly detection not only performs much faster, but it also has better accuracy than the old algorithm.

The old algorithm was computationally intensive, especially since it used clustering to find the abnormal area of the historical data distribution. The new algorithm uses a gamma distribution to first estimate where the majority of the data points lie, and then calculates how far the new event is from the boundary of the normal area. The further the new event is from the boundary, the riskier the event is.

Graph examples: Old Algorithm, O(n²), and New Algorithm, O(n).

As previously mentioned, the old algorithm used hierarchical clustering, O(n²), to cluster every point into bins. It then used sorting, O(n log n), to identify the normal bin boundary, b, as seen in the graph examples above. The new algorithm, by contrast, simply fits the historical points to a gamma distribution, O(n), and then calculates the normal area boundary, b, in O(1). The resulting abnormal area is identified as a in both graph examples.
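The general idea of the new approach can be sketched in pure Python. This is an assumption-laden illustration, not Exabeam's algorithm: the method-of-moments fit, the choice of the 0.99 quantile as the boundary b, and all function names are hypothetical, and the sketch assumes positive, non-constant historical values.

```python
import math


def fit_gamma(samples):
    """Method-of-moments gamma fit; a constant number of O(n) passes."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return mean * mean / var, var / mean  # (shape, scale)


def gamma_cdf(x, shape, scale):
    """Gamma CDF via the power series of the regularized lower
    incomplete gamma function."""
    if x <= 0:
        return 0.0
    z = x / scale
    term = total = 1.0 / shape
    k = 1
    while term > 1e-15 * total:
        term *= z / (shape + k)
        total += term
        k += 1
    return total * math.exp(shape * math.log(z) - z - math.lgamma(shape))


def normal_boundary(shape, scale, quantile=0.99):
    """Bisect for b with CDF(b) = quantile; the cost is independent of n."""
    lo, hi = 0.0, shape * scale  # start the bracket at the mean
    while gamma_cdf(hi, shape, scale) < quantile:
        hi *= 2.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if gamma_cdf(mid, shape, scale) < quantile:
            lo = mid
        else:
            hi = mid
    return hi


def distance_beyond_boundary(value, boundary):
    """Zero inside the normal area; grows as the event moves past b."""
    return max(0.0, value - boundary)
```

Fitting touches each historical point a fixed number of times, and once the two parameters are known, finding b and scoring a new event no longer depend on the size of the history.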

SAML Configuration Settings Link

Previously, the Configure SAML link was located under the Admin Operations tile of the settings page. Now, the Configure SAML link is located under the User Management tile of the settings page.

Configure SAML in User Management to set links and configure Identity provider.

For more information on configuring SAML, please refer to Advanced Analytics Admin Guide > Configuring SAML.

Added Additional Option for Watchlists Timeframe Filter

We've added Last 2 days to the timeframe filter for Notable Users, Notable Assets, Account Lockouts, and other Watchlists on the homepage.

You can also configure the default timeframe filter value in the application_default.conf file.

For more information on the configurations for the watchlists timeframe filters, please refer to the Watchlists Timeframe Filter section of the Advanced Analytics Admin Guide.

MongoDB Retention and Usage Improvements

You can now monitor and configure your Advanced Analytics MongoDB data retention and usage along with your HDFS storage. This information is visible in new panels on the Storage Usage tab within System Health. The MongoDB Data Retention and Usage panels include storage usage of events in the database.

Data on HDFS Data Retention, MongoDB retention, HDFS Usage, and MongoDB Usage on the System Health page's Storage Usage tab.

MongoDB data retention is enabled by default. You can disable/enable data retention and set the capacity used percentage threshold of MongoDB collections to help improve the performance and storage usage of Advanced Analytics. Exabeam does not recommend that you disable this feature.

In addition to the capacity used percentage, Advanced Analytics keeps six months of event data in MongoDB by default.

Note

Advanced Analytics maintains the previous default retention period for existing customers. Customized retention values are also retained.

For more information on the MongoDB Retention and Usage Improvements, please refer to Storage Usage and Retention in the System Health Page section of the Advanced Analytics Admin Guide or the User Guide.

Retention Limits for Triggered Rules and Sessions Collections

You can now monitor and configure the data retention and usage of your triggered rules and sessions collections within Advanced Analytics MongoDB. This information is visible in an updated panel on the Storage Usage tab within System Health. The MongoDB Data Retention and Usage panels include storage usage of triggered rules, sessions, and events in the database.

Data on HDFS Data Retention, MongoDB retention, HDFS Usage, and MongoDB Usage on the System Health page's Storage Usage tab.

MongoDB data retention is enabled by default. You can disable/enable data retention and set the capacity used percentage threshold of MongoDB collections to help improve the performance and storage usage of Advanced Analytics. The default capacity is 85%. Exabeam does not recommend that you disable this feature.

In addition to the capacity used percentage, Advanced Analytics keeps 365 days of triggered rules and containers data and 180 days of event data in MongoDB by default. This is also configurable through the UI.

Note

For existing customers, Advanced Analytics keeps the previous default or customized event retention period, or 1095 days, whichever is smaller.

For more information on the Retention Limits for Triggered Rules and Sessions Collections, please refer to “View Storage Usage and Retention Settings” in the “Health Status Page” section of the Advanced Analytics Admin Guide.

Improved Martini and Lime Coordination

We have improved how our Ingestion and Processing Engines coordinate. If Advanced Analytics detects that the Ingestion Engine needs to be restarted along with the Processing Engine, the system prompts you accordingly.

Performing a coordinated restart will help correctly apply all log feeds and ensure that both systems are running the same configurations. This bundled restart will also help prevent Analytics Engine and Log Ingestion Engine processing failures.

If a Log Ingestion Engine restart is required when you attempt to restart the Analytics Engine, a dialog box prompts you to also restart the Log Ingestion Engine. The dialog below appears if and only if Advanced Analytics detects that the Log Ingestion Engine needs to be restarted. You can decline the restart if you would like the Log Ingestion Engine to finish its current process, but this cancels the Analytics Engine restart procedure.

Message confirming whether you want to restart the Exabeam log ingestion engine.

For more information on restarting the Exabeam Analytics Engine, please refer to the “Restart and Reprocess” section of the Advanced Analytics Admin Guide.

Draft/Published Modes for Log Feeds

We have added new functionality to log feed creation. Now, when you create a new log feed and complete the workflow, you are asked whether you would like to publish the feed. Publishing the feed lets the Analytics Processing Engine know that the respective feed is ready for consumption.

A list of log feeds in the Log Feeds settings.

If you choose to not publish the feed then it will be left in draft mode and will not be picked up by the processing engine. You can always publish a feed that is in draft mode at a later time.

For more information on publishing draft log feeds, please refer to the “Log Feeds” section of the Advanced Analytics Admin Guide.

User Engagement Analytics Policy

Exabeam uses user engagement analytics to provide in-app walkthroughs and anonymously analyze user behavior, such as page views and clicks in the UI. This data informs user research and improves the overall user experience of the Exabeam Security Management Platform (SMP). Our user engagement analytics sends usage data from the web browser of the user to a cloud-based service called Pendo.

Our user engagement analytics receives three types of data from the web browser of the user:

  • Metadata – User and account information that is explicitly provided when a user logs in to the Exabeam SMP, such as:

    • User ID or user email

    • Account name

    • IP address

    • Browser name and version

  • Page Load Data – Information on pages as users navigate to various parts of the Exabeam SMP, such as root paths of URLs and page titles.

  • UI Interactions Data – Information on how users interact with the Exabeam SMP, such as:

    • Clicking the Search button

    • Clicking inside a text box

    • Tabbing into a text box

Keep in mind, Exabeam user engagement analytics does not:

  • Record the screen

  • Perform “screen scraping”

  • Capture anything entered into any input field

For more information on the user engagement analytics policy, including instructions to opt out of user engagement analytics, please refer to the Advanced Analytics Administration Guide > Configuring Advanced Analytics > User Engagement Analytics Policy.

Opt Out of User Engagement Analytics

Note

For customers with a Federal license, we disable user analytics by default.

To prevent Exabeam SMP from sending your data to our user analytics:

  1. Access the config file at

    /opt/exabeam/config/common/web/custom/application.conf
  2. Add the following code snippet to the file:

    webcommon {
        app.tracker {
          appTrackerEnabled = false
          apiKey = ""
        }
    }
  3. Run the following commands to restart Web Common and apply the changes:

    . /opt/exabeam/bin/shell-environment.bash
    web-common-restart

Host/IP and Host/Account Mapping Improvements

Advanced Analytics now processes host-to-IP and host-to-account mapping configuration changes in parallel. This optimization results in lower CPU usage while tying hosts to specific addresses or accounts.

Syslog Output Improvements

The incidents and alerts created by Advanced Analytics and sent out via syslog previously lacked parsed and enriched event details. These syslog notifications have now been enhanced to include additional event fields and reason rule templates.

The notifications include the useful event fields that have values associated with them. The set of fields is unique to each event type. If a field does not have a value, it is not included in the syslog output.

Example syslog output after enhancement:

2019-12-12T20.Internal.syslog.log:<86> 2019-12-12T20:18:07.973569+00:00 2019-12-12T20:18:07.972Z exabeam-analytics-master Exabeam timestamp="2019-12-12T20:18:04.396Z" id="user745-20140528185224" score="10" user="user745" src_ip="46.117.121.157" event_time="2014-05-28 18:52:24" event_type="vpn-login" host="10.2.0.111" is_session_first="true" rawlog_time="1401313944000" time="1401303144000" source="VPN" vendor="Juniper VPN" lockout_id="NA" session_id="user745-20140528185224" getvalue('isp', src_ip)="013 NetVision Ltd" getvalue('country_code', src_ip)="IL" session_order="1" account="user745" src_network_type="WAN" event_code="Agent login succeeded" rule_id="AE-UA-F-VPN" rule_name="First VPN connection for user" rule_description="First VPN connection for this user" rule_reason="First VPN connection for user745"

<86> 2019-10-11T23:05:04.002809+00:00 2019-10-11T23:05:03.998Z exabeam-analytics-master Exabeam timestamp="2019-10-11T23:04:30.556Z" domain="Unspecified" host="host36" score="5" rule_description="This rule is used to identify a user without a significant event history. A user without a history introduces some risk as they have not established a baseline for comparing activities" rule_id="NEW-USER-F" dest_ip="10.2.16.17" dest_host="host37" id="user38-20140504070000" event_time="2014-05-04 07:00:00" event_type="ntlm-logon" rule_name="User with no event history" user="user38"
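If you consume these notifications downstream, the key="value" pairs can be extracted with a small regular expression. The following Python sketch is an illustration only: the function name and the pattern are assumptions based on the example output above (including keys of the form getvalue('isp', src_ip)), not a documented format specification, and it assumes values never contain an embedded double quote.

```python
import re

# A key is a word, optionally followed by a parenthesized argument list
# such as getvalue('isp', src_ip); the value is everything up to the
# next double quote.
FIELD_RE = re.compile(r'(\w+(?:\([^)]*\))?)="([^"]*)"')


def parse_fields(message):
    """Return the key="value" pairs of one syslog message as a dict."""
    return dict(FIELD_RE.findall(message))


sample = (
    'exabeam-analytics-master Exabeam timestamp="2019-12-12T20:18:04.396Z" '
    'score="10" user="user745" '
    'getvalue(\'isp\', src_ip)="013 NetVision Ltd" rule_id="AE-UA-F-VPN"'
)
fields = parse_fields(sample)
```

Because fields without a value are omitted from the output, code that reads the resulting dictionary should use lookups that tolerate missing keys, such as fields.get("dest_host").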

For more information on syslog output improvements in Advanced Analytics, please refer to the “Set Up Incident Notification” section within the Advanced Analytics Admin Guide.