Upgrade to 7.11 for Elasticsearch and Kibana resulted in a broken NSM [solved]

Today 'yum update && yum upgrade' was issued, with upgrades available for Elasticsearch, Filebeat, Kibana and others.

Kibana server is not ready yet

The message below suggests deleting indices, but that is reportedly also the best way to destroy all dashboards etc.

"Another Kibana instance appears to be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_2 and restarting Kibana."

tail /var/log/kibana/kibana.log

{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["warning","plugins","encryptedSavedObjects","config"],"pid":17465,"message":"Generating a random key for xpack.encryptedSavedObjects.encryptionKey. To decrypt encrypted saved objects attributes after restart, please set xpack.encryptedSavedObjects.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["warning","plugins","fleet"],"pid":17465,"message":"Fleet APIs are disabled because the Encrypted Saved Objects plugin uses an ephemeral encryption key. Please set xpack.encryptedSavedObjects.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["warning","plugins","actions","actions"],"pid":17465,"message":"APIs are disabled because the Encrypted Saved Objects plugin uses an ephemeral encryption key. Please set xpack.encryptedSavedObjects.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["warning","plugins","alerts","plugins","alerting"],"pid":17465,"message":"APIs are disabled because the Encrypted Saved Objects plugin uses an ephemeral encryption key. Please set xpack.encryptedSavedObjects.encryptionKey in the kibana.yml or use the bin/kibana-encryption-keys command."}
{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["info","plugins","monitoring","monitoring"],"pid":17465,"message":"config sourced from: production cluster"}
{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["info","savedobjects-service"],"pid":17465,"message":"Waiting until all Elasticsearch nodes are compatible with Kibana before starting saved objects migrations..."}
{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["info","savedobjects-service"],"pid":17465,"message":"Starting saved objects migrations"}
{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["info","savedobjects-service"],"pid":17465,"message":"Creating index .kibana_2."}
{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["warning","savedobjects-service"],"pid":17465,"message":"Unable to connect to Elasticsearch. Error: resource_already_exists_exception"}
{"type":"log","@timestamp":"2021-02-13T14:36:40+00:00","tags":["warning","savedobjects-service"],"pid":17465,"message":"Another Kibana instance appears to be migrating the index. Waiting for that migration to complete. If no other Kibana instance is attempting migrations, you can get past this message by deleting index .kibana_2 and restarting Kibana."}

I think Kibana was trying to process saved object changes during the migration and got unhappy with one of them. To fix this, stop Kibana with sudo systemctl stop kibana, then run curl -XDELETE localhost:9200/.kibana_2 to delete the migration index. You may also have to delete a .kibana_task_manager index; the error log will let you know. From there, restart Kibana with sudo systemctl start kibana.
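Put together as a sequence (a sketch of the steps above; the exact index to delete, .kibana_2 here, should be taken from the message in kibana.log):

# stop Kibana so it releases the half-finished migration
sudo systemctl stop kibana
# check which .kibana* indices exist before deleting anything
curl -s 'localhost:9200/_cat/indices/.kibana*?v'
# delete the stuck migration target named in the log message
curl -XDELETE 'localhost:9200/.kibana_2'
# restart Kibana so it can retry the saved objects migration
sudo systemctl start kibana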

Hope this helps. :)


Thanks. Being unfamiliar with how ROCK NSM is developed, I wanted to make sure I don't lose all dashboards etc.

That works somewhat; Elasticsearch is available again, but I get many red Error panels, due to [esaggs] > Unexpected token < in JSON at position 0.

The list of indices (curl 'localhost:9200/_cat/indices?v' | grep kibana) shows:
.kibana_task_manager_1
.kibana_2
.kibana_1
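Before deleting any of these, it can help to check which of them the .kibana and .kibana_task_manager aliases actually point to, since the index without an alias is normally the abandoned migration target (a general diagnostic, not from the original post):

curl -s 'localhost:9200/_cat/aliases/.kibana*?v'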

According to multiple sources, this [esaggs] > Unexpected token < in JSON at position 0 is due to a bug in Elasticsearch 7.x.

Since I upgraded to 7.11 and am now experiencing this issue, I'm concerned this may affect many other users.

My main question, besides finding a fix for the now-broken Kibana/ROCK NSM, is simply what I can delete and what I cannot.

Deleting all three indices appears risky; if there is no risk, I can go ahead.
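One way to hedge that risk (not something done in this thread) is to copy each saved-objects index to a backup index with the reindex API before deleting anything, e.g.:

curl -s -XPOST 'localhost:9200/_reindex?pretty' -H 'Content-Type: application/json' -d'
{
  "source": { "index": ".kibana_1" },
  "dest":   { "index": "kibana_1_backup" }
}'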

Even after I courageously deleted .kibana_task_manager_1, the error remains.

The origin of the error state mentioned everywhere can be found by navigating Inspect > View: Requests > Response; all Error fields turn out to be generated because of:

{
  "message": "Unexpected token < in JSON at position 0",
  "code": "STREAM"
}

This is most likely the initial failing request:

{
  "aggs": {
    "2": {
      "terms": {
        "field": "event.dataset",
        "order": {
          "_count": "desc"
        },
        "size": 20
      }
    }
  },
  "size": 0,
  "fields": [
    {
      "field": "@timestamp",
      "format": "date_time"
    },
    {
      "field": "event.created",
      "format": "date_time"
    },
    {
      "field": "event.end",
      "format": "date_time"
    },
    {
      "field": "event.start",
      "format": "date_time"
    },
    {
      "field": "file.accessed",
      "format": "date_time"
    },
    {
      "field": "file.changed",
      "format": "date_time"
    },
    {
      "field": "file.created",
      "format": "date_time"
    },
    {
      "field": "file.ctime",
      "format": "date_time"
    },
    {
      "field": "file.modified",
      "format": "date_time"
    },
    {
      "field": "file.mtime",
      "format": "date_time"
    }
  ],
  "script_fields": {
    "pcap.query": {
      "script": {
        "source": "if (doc['event.category'].value == 'network') { \n if (doc['client.ip'].size() != 0 && doc['client.port'].size() != 0 && doc['server.ip'].size() != 0 && doc['server.port'].size() != 0) {\n String api_endpoint = '/app/docket/api/uri/';\n String host1 = doc['client.ip'].value;\n String host2 = doc['server.ip'].value;\n String port1 = doc['client.port'].value.toString();\n String port2 = doc['server.port'].value.toString();\n String timestamp = doc['@timestamp'].value.toString();\n timestamp = timestamp.substring(0,timestamp.indexOf('Z'));\n String begin = timestamp.substring(0,timestamp.indexOf('.')) + 'Z';\n \n return api_endpoint + \n 'host/' + host1 + '/port/' + port1 +\n '/host/' + host2 + '/port/' + port2 + \n '/after/' + begin + '/';\n }\n return '';\n}\nreturn '';",
        "lang": "painless"
      }
    }
  },
  "stored_fields": [
    "*"
  ],
  "_source": {
    "excludes": []
  },
  "query": {
    "bool": {
      "must": [],
      "filter": [
        {
          "match_all": {}
        },
        {
          "range": {
            "@timestamp": {
              "gte": "2021-02-15T16:49:46.698Z",
              "lte": "2021-02-15T21:49:46.698Z",
              "format": "strict_date_optional_time"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}

More today; still no sign of a fix or a proper approach to analysing this.

What I did find is that empty stores remain all over, since yesterday when everything went belly up.

curl -X GET "localhost:9200/_cat/recovery?pretty" | grep -v existing

.monitoring-kibana-7-mb-2021.02.16 0 74ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
.monitoring-kibana-7-mb-2021.02.15 0 318ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
ecs-suricata-network-2021.02.16 0 68ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
ecs-zeek-file-2021.02.16 0 102ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
ecs-zeek-miscellaneous-2021.02.16 0 180ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
.monitoring-es-7-mb-2021.02.16 0 107ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
ecs-zeek-observations-2021.02.16 0 263ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
ecs-zeek-diagnostic-2021.02.16 0 268ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
ecs-zeek-detection-2021.02.16 0 193ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
fsf-2021.02.16 0 93ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
ecs-zeek-network-2021.02.16 0 172ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
.monitoring-logstash-7-2021.02.16 0 26ms empty_store done n/a n/a 127.0.0.1 donion n/a n/a 0 0 0.0% 0 0 0 0.0% 0 0 0 100.0%
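To see whether the upgrade actually left anything unassigned, the cluster health and any non-green indices can be checked directly (a general check, not part of the original post):

curl -s 'localhost:9200/_cluster/health?pretty'
curl -s 'localhost:9200/_cat/indices?v&health=red'
curl -s 'localhost:9200/_cat/indices?v&health=yellow'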

Found a number of things.

  • Errors with kafka shutdown
  • Errors in logstash

After a very messy procedure of redeploying the configuration, the Elasticsearch stores are now no longer empty.

curl -X GET "localhost:9200/_cat/recovery?pretty" | grep -v existing

I now have a small success, in that a different error emerged:

Error: Unable to completely restore the URL, be sure to use the share functionality.
at throwUnableToRestoreUrlError (https://myhost/37897/bundles/plugin/kibanaUtils/kibanaUtils.plugin.js:1:131452)
at retrieveState (https://myhost/37897/bundles/plugin/kibanaUtils/kibanaUtils.plugin.js:1:131707)
at decodeState (https://myhost/37897/bundles/plugin/kibanaUtils/kibanaUtils.plugin.js:1:132744)
at https://myhost/37897/bundles/plugin/kibanaUtils/kibanaUtils.plugin.js:1:137702
at Array.forEach ()
at getStatesFromKbnUrl (https://myhost/37897/bundles/plugin/kibanaUtils/kibanaUtils.plugin.js:1:137669)
at getStateFromKbnUrl (https://myhost/37897/bundles/plugin/kibanaUtils/kibanaUtils.plugin.js:1:137803)
at Object.get (https://myhost/37897/bundles/plugin/kibanaUtils/kibanaUtils.plugin.js:1:144327)
at new dashboard_state_manager_DashboardStateManager (https://myhost/37897/bundles/plugin/dashboard/dashboard.chunk.1.js:1:166224)
at https://myhost/37897/bundles/plugin/dashboard/dashboard.chunk.1.js:1:181150

Logstash does have repeated errors which may be causing the previously mentioned errors:

[2021-02-17T10:04:30,518][WARN ][logstash.outputs.elasticsearch][main][69cac04d8840be61aebbf67aa2bed8a1be4c0b3c64964875fa4227f5ff006f76] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"ecs-suricata-network-2021.02.17", :routing=>nil, :_type=>"_doc"}, #LogStash::Event:0x390e9473], :response=>{"index"=>{"_index"=>"ecs-suricata-network-2021.02.17", "_type"=>"_doc", "_id"=>"AUVyr3cBMqyGnjK6xUu1", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [tls.client.ja3] of type [keyword] in document with id 'AUVyr3cBMqyGnjK6xUu1'. Preview of field's value: '{string=768,10-255, hash=79c9e26fe870347aca15a4b6b6aea6d0}'", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:282"}}}}}
[2021-02-17T10:04:30,520][WARN ][logstash.outputs.elasticsearch][main][69cac04d8840be61aebbf67aa2bed8a1be4c0b3c64964875fa4227f5ff006f76] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"ecs-suricata-network-2021.02.17", :routing=>nil, :_type=>"_doc"}, #LogStash::Event:0x75e9a86], :response=>{"index"=>{"_index"=>"ecs-suricata-network-2021.02.17", "_type"=>"_doc", "_id"=>"A0Vyr3cBMqyGnjK6xUu3", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [tls.client.ja3] of type [keyword] in document with id 'A0Vyr3cBMqyGnjK6xUu3'. Preview of field's value: '{string=768,10-255, hash=79c9e26fe870347aca15a4b6b6aea6d0}'", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:101"}}}}}
[2021-02-17T10:04:30,520][WARN ][logstash.outputs.elasticsearch][main][69cac04d8840be61aebbf67aa2bed8a1be4c0b3c64964875fa4227f5ff006f76] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"ecs-suricata-network-2021.02.17", :routing=>nil, :_type=>"_doc"}, #LogStash::Event:0xf25d110], :response=>{"index"=>{"_index"=>"ecs-suricata-network-2021.02.17", "_type"=>"_doc", "_id"=>"_kVyr3cBMqyGnjK6xUq1", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [tls.client.ja3] of type [keyword] in document with id '_kVyr3cBMqyGnjK6xUq1'. Preview of field's value: '{string=768,10-255, hash=79c9e26fe870347aca15a4b6b6aea6d0}'", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:101"}}}}}
[2021-02-17T10:04:30,520][WARN ][logstash.outputs.elasticsearch][main][69cac04d8840be61aebbf67aa2bed8a1be4c0b3c64964875fa4227f5ff006f76] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"ecs-suricata-network-2021.02.17", :routing=>nil, :_type=>"_doc"}, #LogStash::Event:0x50e851db], :response=>{"index"=>{"_index"=>"ecs-suricata-network-2021.02.17", "_type"=>"_doc", "_id"=>"BEVyr3cBMqyGnjK6xUu4", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [tls.client.ja3] of type [keyword] in document with id 'BEVyr3cBMqyGnjK6xUu4'. Preview of field's value: '{string=768,10-255, hash=79c9e26fe870347aca15a4b6b6aea6d0}'", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:101"}}}}}
[2021-02-17T10:04:30,521][WARN ][logstash.outputs.elasticsearch][main][69cac04d8840be61aebbf67aa2bed8a1be4c0b3c64964875fa4227f5ff006f76] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"ecs-suricata-network-2021.02.17", :routing=>nil, :_type=>"_doc"}, #LogStash::Event:0x295819b2], :response=>{"index"=>{"_index"=>"ecs-suricata-network-2021.02.17", "_type"=>"_doc", "_id"=>"BUVyr3cBMqyGnjK6xUu4", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [tls.client.ja3] of type [keyword] in document with id 'BUVyr3cBMqyGnjK6xUu4'. Preview of field's value: '{string=768,10-255, hash=79c9e26fe870347aca15a4b6b6aea6d0}'", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:282"}}}}}
[2021-02-17T10:05:28,166][WARN ][logstash.outputs.elasticsearch][main][69cac04d8840be61aebbf67aa2bed8a1be4c0b3c64964875fa4227f5ff006f76] Could not index event to Elasticsearch. {:status=>400, :action=>["index", {:_id=>nil, :_index=>"ecs-suricata-network-2021.02.17", :routing=>nil, :_type=>"_doc"}, #LogStash::Event:0x100cbfa4], :response=>{"index"=>{"_index"=>"ecs-suricata-network-2021.02.17", "_type"=>"_doc", "_id"=>"dkVzr3cBMqyGnjK6pnvl", "status"=>400, "error"=>{"type"=>"mapper_parsing_exception", "reason"=>"failed to parse field [tls.client.ja3] of type [keyword] in document with id 'dkVzr3cBMqyGnjK6pnvl'. Preview of field's value: '{string=771,49188-49192-61-49190-49194-107-106-49162-49172-53-49157-49167-57-56-49187-49191-60-49189-49193-103-64-49161-49171-47-49156-49166-51-50-49196-49195-49200-157-49198-49202-159-163-49199-156-49197-49201-158-162-255,10-11-13-23-0,23-24-25-9-10-11-12-13-14-22,0, hash=d2935c58fe676744fecc8614ee5356c7}'", "caused_by"=>{"type"=>"illegal_state_exception", "reason"=>"Can't get text on a START_OBJECT at 1:295"}}}}}
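The mapper_parsing_exception says the index maps tls.client.ja3 as a keyword while the incoming Suricata events carry ja3 as an object with string and hash members, so every such document is rejected. The current mapping for that field can be inspected like this (a diagnostic sketch; the index name is taken from the log above):

curl -s 'localhost:9200/ecs-suricata-network-2021.02.17/_mapping/field/tls.client.ja3?pretty'

Fixing it would mean either mapping only the hash member into tls.client.ja3 in the Logstash pipeline (ECS defines the field as the JA3 hash) or changing the template to map the field as an object; which of the two ROCK intends is not settled in this thread.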

After reading JavaScript Error Handling: Unexpected Token, I am now convinced that somewhere there is simply a SYNTAX ERROR.

The Technical Rundown

All JavaScript error objects are descendants of the Error object, or an inherited object therein.

The SyntaxError object is inherited from the Error object.
The Unexpected Token error is a specific type of SyntaxError object.

After installing ROCK with 7.11.1 there is a separate issue that I think is related to the one mentioned above. In Discover there is a red bar in the search result window:

JSON Parse error: Unrecognized token '<'

There are Zeek and Suricata indexes in Elasticsearch.

Yes, I think so too. I know it took a redeployment to have Elasticsearch index events again.

These do seem related indeed. I've yet to find an approach to analysing this; using the browser debugger is no fun since I don't know where to look.

If you also see 'Error' in the dashboard or elsewhere, click on 'Inspect', then select 'View: Requests', then select 'Response'.

Creating a dashboard shows [esaggs] > Unexpected token < in JSON at position 0, so I'm pretty sure this is something broken in Kibana.

I think this query simulates what causes the Error in Kibana, but here it shows no error at all:

curl -XGET "http://127.0.0.1:9200/_search" -H 'Content-Type: application/json' -d 'copied request from inside Inspect' | more

Enabled Kibana query logging by inserting the following into kibana.yml:

### query logging
elasticsearch.logQueries: true
logging.verbose: true
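With that in place, the error-level entries can be filtered out of the JSON log after a Kibana restart, for example (assuming jq is installed):

tail -f /var/log/kibana/kibana.log | jq -c 'select(.tags | index("error"))'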

Found:

{"type":"log","@timestamp":"2021-02-18T23:03:20+00:00","tags":["error","elasticsearch","data"],"pid":29521,"message":"409\nPUT /.kibana_task_manager/_create/task%3AActions-actions_telemetry?refresh=false\n{\"task\":{\"taskType\":\"actions_telemetry\",\"state\":\"{}\",\"params\":\"{}\",\"attempts\":0,\"scheduledAt\":\"2021-02-18T23:03:19.512Z\",\"startedAt\":null,\"retryAt\":null,\"runAt\":\"2021-02-18T23:03:19.512Z\",\"status\":\"idle\"},\"type\":\"task\",\"references\":[],\"migrationVersion\":{\"task\":\"7.6.0\"},\"updated_at\":\"2021-02-18T23:03:19.512Z\"} [version_conflict_engine_exception]: [task:Actions-actions_telemetry]: version conflict, document already exists (current version [13])"}
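That 409 is a version conflict in the task manager index (the document already exists), which suggests Kibana recreated its task manager store after the earlier deletion. To see which task-manager indices and aliases exist at this point (a diagnostic step, not from the original post):

curl -s 'localhost:9200/_cat/indices/.kibana_task_manager*?v'
curl -s 'localhost:9200/_cat/aliases/.kibana_task_manager*?v'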

Simulating requests shows they appear to be returned empty

Response

{
  "took" : 7,
  "timed_out" : false,
  "_shards" : {
    "total" : 1,
    "successful" : 1,
    "skipped" : 0,
    "failed" : 0
  },
  "hits" : {
    "total" : {
      "value" : 0,
      "relation" : "eq"
    },
    "max_score" : null,
    "hits" : [ ]
  },
  "aggregations" : {
    "table" : {
      "buckets" : [ ]
    }
  }
}

Query

 curl -X GET "localhost:9200/.kibana_2/_search?pretty" -H 'Content-Type: application/json' -d'{
  "size": 0,
  "aggs": {
    "table": {
      "composite": {
        "size": 10000,
        "sources": [
          {
            "stk1": {
              "terms": {
                "field": "client.ip"
              }
            }
          },
          {
            "stk2": {
              "terms": {
                "field": "server.ip"
              }
            }
          }
        ]
      }
    }
  },
  "query": {
    "bool": {
      "must": [
        {
          "range": {
            "@timestamp": {
              "gte": "2021-02-13T03:38:41.728Z",
              "lte": "2021-02-19T08:38:41.728Z",
              "format": "strict_date_optional_time"
            }
          }
        }
      ],
      "filter": [
        {
          "match_all": {}
        },
        {
          "match_phrase": {
            "event.dataset": {
              "query": "connection"
            }
          }
        }
      ],
      "should": [],
      "must_not": []
    }
  }
}'
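Note that this simulated query targets .kibana_2 (the saved objects index) rather than the event data, which by itself would explain the empty result. As a sanity check (not from the original post), the same time range can be counted against the data indices directly:

curl -s -XGET 'localhost:9200/ecs-*/_count?pretty' -H 'Content-Type: application/json' -d'
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "2021-02-13T03:38:41.728Z",
        "lte": "2021-02-19T08:38:41.728Z"
      }
    }
  }
}'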

When executing a redeployment I get:

fatal: [donion]: FAILED! => {"msg": "The conditional check 'result.json.acknowledged | bool' failed. The error was: error while evaluating conditional (result.json.acknowledged | bool): 'dict object' has no attribute 'json'"}

The Kibana 7.11 breaking changes mention:

Ingest Manager plugin renamed Fleet
Details: The ingestManager plugin has been renamed fleet.

with /fleet throwing a requirement for 'superuser' privileges.

My assumption is this upgrade only breaks if you don't have security configured yet, which I did not have since this is a closed environment.

The rather obvious solution to this was to downgrade Kibana to 7.10 (steps sketched below).
Next, deleting the indices .kibana_3 and .kibana_2, and eventually .kibana_1, was required.
However, this resulted in the destruction of all dashboards.
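Roughly, the downgrade and cleanup amounted to the following (a sketch of the steps above, assuming a 7.10.x Kibana package is still available in the configured yum repository; the version string is an example):

sudo systemctl stop kibana
# downgrade the Kibana package; 7.10.2 is an assumed example version
sudo yum downgrade kibana-7.10.2
# remove the saved objects indices left behind by the failed 7.11 migration
curl -XDELETE 'localhost:9200/.kibana_3'
curl -XDELETE 'localhost:9200/.kibana_2'
curl -XDELETE 'localhost:9200/.kibana_1'
sudo systemctl start kibana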

These can comfortably be restored by running the dashboard loader scripts as documented at GitHub - rocknsm/rock-dashboards: Dashboards and loader for ROCK NSM dashboards.

An issue requesting a script (or documented procedure) for doing so was filed here.