How do I know if Analytics Succeeded or Failed?

Question

How do I know if Analytics has succeeded or failed? 

Summary

This article provides a method to verify whether Analytics has succeeded. If Analytics has failed, it also outlines how to identify where the failure occurred and how to extract the relevant error, so that the error can be searched for in the Interset Knowledgebase for resolution steps or, alternatively, sent to Interset Support (support@interset.com).

The following nodes will need to be accessed (via web or SSH to the respective nodes):

  • AMBARI (web)
  • MASTER (ANALYTICS) (SSH) 

Steps

Check analytics.log

  1. SSH to the MASTER (ANALYTICS) node as the Interset user.
  2. Type the following to navigate to the Interset log directory (/opt/interset/log):
    • cd /opt/interset/log
  3. Look at analytics.log to get the latest log output from Analytics. Here are some useful example commands:
    • If Analytics is running, type the following command to print the progress to the console:
      • tail -f analytics.log
    • If Analytics has finished, type the following command to look for “Analytics finished”. The command also prints the 3 lines before and after each match:
      • grep -B 3 -A 3 finished analytics.log
  4. More context on the jobs is provided in the following section.
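
The check in step 3 can also be scripted. The following is a minimal sketch, not an Interset tool: the function name analytics_finished is hypothetical, and it assumes that a completed run writes a line containing "Analytics finished" to analytics.log, as described above.

```shell
# Hypothetical helper: reports whether the given log contains "Analytics finished".
# Assumption: a completed run logs a line containing that exact phrase.
analytics_finished() {
  # Default to the log location used in this article.
  local log_file="${1:-/opt/interset/log/analytics.log}"

  if [ ! -r "$log_file" ]; then
    echo "cannot read $log_file" >&2
    return 2
  fi

  if grep -q "Analytics finished" "$log_file"; then
    echo "finished"
  else
    # No match means the run is still in progress or has failed.
    echo "not finished"
    return 1
  fi
}
```

A result of "not finished" does not distinguish a run that is still in progress from one that failed, so it should be followed up with the ResourceManager UI checks below.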

Check YARN ResourceManagerUI

  1. Open up a web browser and navigate to the Ambari UI URL:
  2. Log in to the Ambari UI as the Ambari admin. The default credentials for the Ambari admin user are as follows:
    • Username: admin
    • Password: admin
  3. Once logged in, click YARN in the components list.
  4. In YARN, click Quick Links > ResourceManager UI.
  5. The ResourceManager UI lists all Spark jobs that have been submitted to the cluster in chronological order, along with their current state and final status.
  6. The jobs that constitute a complete Analytics run are listed below (in order and by category):
    • Aggregation
      • com.interset.analytics.aggregation.EntityMappingJob
      • com.interset.analytics.scoring.BotClassifierJob
      • com.interset.analytics.aggregation.AggregateJob
    • Training
      • com.interset.analytics.training.EntityRelationStatsJob
      • com.interset.analytics.training.ClusteringJob
      • com.interset.analytics.aggregation.AggregateStatsJob
      • com.interset.analytics.training.WorkingDaysJob
      • com.interset.analytics.training.WorkingHoursJob
      • com.interset.analytics.training.HumanMachineClassifier
      • com.interset.analytics.training.UpdateTrainingPeriodJob
    • Scoring
      • com.interset.analytics.scoring.DirectAnomaliesJob
      • com.interset.analytics.scoring.AnomalyStatsJob
      • com.interset.analytics.scoring.WorkingDaysJob
      • com.interset.analytics.scoring.WorkingHoursJob
      • com.interset.analytics.scoring.GenerateAnomaliesJob
      • com.interset.analytics.scoring.HumanMachineClassifier
      • com.interset.analytics.scoring.AnomalousEntitiesJob
      • com.interset.analytics.scoring.AggregateStoryJob
      • com.interset.analytics.scoring.EntityVulnerabilityJob
    • Indexing
      • com.interset.analytics.indexing.EntityRelationStatsIndexerJob
      • com.interset.analytics.indexing.EntityStatsIndexerJob
      • com.interset.analytics.indexing.AnomaliesIndexerJob
      • com.interset.analytics.indexing.WorkingHoursIndexerJob
      • com.interset.analytics.indexing.RiskyEntitiesIndexerJob
      • com.interset.analytics.indexing.ReportingTagsIndexerJob
      • com.interset.analytics.indexing.RiskScoresIndexerJob
      • com.interset.analytics.indexing.EntityRelationCountsIndexerJob
      • com.interset.analytics.indexing.RelationTagsIndexerJob
      • com.interset.analytics.indexing.IndexingAliasAndCleanupJob
    • Total: 29 Jobs 
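
To see which of the 29 jobs have not yet appeared for a run, the list above can be compared against the job names observed in the ResourceManager UI. The following is a rough sketch only: the function names expected_jobs and missing_jobs are hypothetical, and it assumes that every job listed above is expected in a complete run and that observed names contain the analytics.<package>.<JobClass> portion shown above.

```shell
# Hypothetical helper: the 29 jobs from this article, without the leading
# package prefix so that both name forms shown above are matched.
expected_jobs() {
  cat <<'EOF'
analytics.aggregation.EntityMappingJob
analytics.scoring.BotClassifierJob
analytics.aggregation.AggregateJob
analytics.training.EntityRelationStatsJob
analytics.training.ClusteringJob
analytics.aggregation.AggregateStatsJob
analytics.training.WorkingDaysJob
analytics.training.WorkingHoursJob
analytics.training.HumanMachineClassifier
analytics.training.UpdateTrainingPeriodJob
analytics.scoring.DirectAnomaliesJob
analytics.scoring.AnomalyStatsJob
analytics.scoring.WorkingDaysJob
analytics.scoring.WorkingHoursJob
analytics.scoring.GenerateAnomaliesJob
analytics.scoring.HumanMachineClassifier
analytics.scoring.AnomalousEntitiesJob
analytics.scoring.AggregateStoryJob
analytics.scoring.EntityVulnerabilityJob
analytics.indexing.EntityRelationStatsIndexerJob
analytics.indexing.EntityStatsIndexerJob
analytics.indexing.AnomaliesIndexerJob
analytics.indexing.WorkingHoursIndexerJob
analytics.indexing.RiskyEntitiesIndexerJob
analytics.indexing.ReportingTagsIndexerJob
analytics.indexing.RiskScoresIndexerJob
analytics.indexing.EntityRelationCountsIndexerJob
analytics.indexing.RelationTagsIndexerJob
analytics.indexing.IndexingAliasAndCleanupJob
EOF
}

# Reads observed job names (one per line) on stdin and prints every
# expected job that does not appear in any observed name.
missing_jobs() {
  local observed
  observed=$(cat)
  expected_jobs | while IFS= read -r job; do
    # -F treats the job name as a fixed string, not a regular expression.
    if ! printf '%s\n' "$observed" | grep -qF "$job"; then
      echo "$job"
    fi
  done
}
```

For example, job names copied from the Name column into a file named observed.txt (a hypothetical file name) could be checked with: missing_jobs < observed.txt. An empty result means all 29 job names were seen. The input should be limited to the latest run, because names from earlier runs would mask missing jobs.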

Check for Analytics failure

  1. If a specific job fails, the ResourceManager UI marks its State and FinalStatus as FAILED.
  2. In this case, in the ResourceManager UI, click FAILED under the Cluster pane. This filters the list to show only FAILED jobs.
  3. Note the FAILED job names under the Name column, as each failed job in the latest Analytics run should be investigated. Each job name is suffixed with batch or windowed and the tenant ID.
    • EXAMPLE: com.interset.analytics.scoring.EntityVulnerabilityJob-batch-<myTID>
  4. To explore the logs for a FAILED job, in the ResourceManager UI filtered by FAILED jobs, click the link in the ID column that corresponds to the FAILED job.
  5. Scroll down to the bottom of the page, where the job attempts (Attempt ID column) are listed along with their associated nodes. Each attempt has its logs linked under the Logs column.
  6. To investigate why the job failed, click the Logs link under the Logs column.
    • NOTE: If more than one attempt is listed, click the Logs link for the latest attempt.
      • EXAMPLE: appattempt_1516727049801_0114_000002
        • appattempt_<epoch_in_millisecs>_<job_submitted>_<attempt_#>
  7. The Logs page aggregates different log types. With a failing job, the best place to start is the stderr log.
  8. In the Logs page, scroll down and look for the following:
    • Log Type: stderr
  9. Three lines below, the page displays “Showing XXXX bytes of XXXX total. Click here for the full log.” Click the “here” link.
  10. Scroll down in the stderr log until you find a stack trace. Review the error and look up the stack trace in the Interset Knowledgebase.
  11. If no KB articles are found, contact Interset Support (support@interset.com).
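
Parts of the steps above can be sketched as shell helpers for working with attempt IDs and a saved stderr log. This is an illustrative sketch under stated assumptions, not an Interset utility: the function names are hypothetical, the attempt IDs are assumed to follow the appattempt_<timestamp>_<job>_<attempt> layout shown in step 6, and the stderr log is assumed to have been saved to a local file.

```shell
# Hypothetical helper: reads attempt IDs (one per line, all for the same job)
# on stdin and prints the one with the highest attempt number.
latest_attempt() {
  # Field 4 (split on "_") is the attempt number; sort it numerically.
  sort -t_ -k4,4n | tail -n 1
}

# Hypothetical helper: converts an attempt ID to the matching YARN application ID.
# appattempt_1516727049801_0114_000002 -> application_1516727049801_0114
app_id_from_attempt() {
  local attempt_id="$1"
  local rest="${attempt_id#appattempt_}"  # drop the "appattempt_" prefix
  echo "application_${rest%_*}"           # drop the trailing "_<attempt_#>"
}

# Hypothetical helper: prints the first line of a saved stderr log that
# mentions an exception or error, plus the lines that follow (20 lines total).
# Assumption: the stack trace starts on a line containing "Exception" or "Error".
first_stack_trace() {
  awk -v max=20 '
    /Exception|Error/ { found = 1 }
    found && n < max  { print; n++ }
  ' "$1"
}
```

If log aggregation is enabled on the cluster, the application ID can also be passed to the YARN command line, for example: yarn logs -applicationId "$(app_id_from_attempt appattempt_1516727049801_0114_000002)". The output of first_stack_trace is the text to search for in the Interset Knowledgebase or to include in a support request.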

Applies To

  • Interset 5.4.x or higher

 
