Stenographer processes fail

I recently installed RockNSM and everything is running except stenographer. The main stenographer service works, but the per-interface units (stenographer@ens4f0, etc.) for the 3 cards I want to monitor fail repeatedly. I have gone through a lot of diagnostics and cannot find a solution.

Can someone help me fix and understand this issue?

I have 5 network interfaces in total (a 4-port Intel i340-T4 card and a wireless card). I am only interested in monitoring 3 of them for now, but I configured all 5. Each one fails with the same “Invalid argument <- open ABORTABORTABORT” error message.

I ran strace on it, following another site's recommendation. I can rerun it if that would be useful.

Here are the error messages I am getting:
Jul 11 04:34:58 ruckus stenographer[19294]: 2020/07/11 04:34:58 Deleted stale output file “/data/stenographer/ens4f3/thread0/packets/.1594441924623273”
Jul 11 04:34:58 ruckus stenographer[19299]: 2020/07/11 04:34:58 Deleted stale output file “/data/stenographer/ens4f0/thread0/packets/.1594441924623273”
Jul 11 04:34:58 ruckus stenographer[19295]: 2020/07/11 04:34:58 Deleted stale output file “/data/stenographer/ens4f1/thread0/packets/.1594441925952102”
Jul 11 04:34:58 ruckus stenographer[19315]: 2020/07/11 04:34:58 Deleted stale output file “/data/stenographer/ens4f2/thread0/packets/.1594441926139261”
Jul 11 04:34:58 ruckus stenographer[19315]: 2020-07-11T04:34:58.651785Z T:d6b147 [stenotype.cc:549] Starting, page size is 4096
Jul 11 04:34:58 ruckus stenographer[19295]: 2020-07-11T04:34:58.651784Z T:8b1837 [stenotype.cc:549] Starting, page size is 4096
Jul 11 04:34:58 ruckus stenographer[19295]: 2020-07-11T04:34:58.651921Z T:8b1837 [stenotype.cc:576] Setting up AF_PACKET sockets for packet reading
Jul 11 04:34:58 ruckus stenographer[19315]: 2020-07-11T04:34:58.651921Z T:d6b147 [stenotype.cc:576] Setting up AF_PACKET sockets for packet reading
Jul 11 04:34:58 ruckus stenographer[19294]: 2020-07-11T04:34:58.651784Z T:e53747 [stenotype.cc:549] Starting, page size is 4096
Jul 11 04:34:58 ruckus stenographer[19294]: 2020-07-11T04:34:58.652043Z T:e53747 [stenotype.cc:576] Setting up AF_PACKET sockets for packet reading
Jul 11 04:34:58 ruckus stenographer[19299]: 2020-07-11T04:34:58.651784Z T:e90d37 [stenotype.cc:549] Starting, page size is 4096
Jul 11 04:34:58 ruckus stenographer[19299]: 2020-07-11T04:34:58.651924Z T:e90d37 [stenotype.cc:576] Setting up AF_PACKET sockets for packet reading
Jul 11 04:34:58 ruckus stenographer[19324]: 2020/07/11 04:34:58 Deleted stale output file “/data/stenographer/wlp9s0/thread0/packets/.1594441926127428”
Jul 11 04:34:58 ruckus stenographer[19324]: 2020-07-11T04:34:58.737495Z T:7aa287 [stenotype.cc:549] Starting, page size is 4096#0122020-07-11T04:34:58.740724Z T:7aa287 [stenotype.cc:576] Setting up AF_PACKET sockets for packet reading
Jul 11 04:35:33 ruckus stenographer[19315]: 2020-07-11T04:35:33.254671Z T:d6b147 [stenotype.cc:248] Dropping privileges
Jul 11 04:35:33 ruckus stenographer[19315]: 2020-07-11T04:35:33.319273Z T:548557 [stenotype.cc:450] Thread 0 starting to process packets#0122020-07-11T04:35:33.323823Z T:548557 [aio.cc:190] Opening packet file /tmp/stenographer139033336/PKT0/.1594442133319902: -1#0122020-07-11T04:35:33.323955Z T:548557 [stenotype.cc:462] CHECK(SUCCEEDED(check_success_error)) output.Rotate(file_dirname, micros, flag_preallocate_file_mb << 20): Invalid argument <- open#012ABORTABORTABORT
Jul 11 04:35:33 ruckus stenographer[19315]: /usr/bin/stenotype() [0x408258]
Jul 11 04:35:33 ruckus stenographer[19315]: #012/usr/bin/stenotype() [0x42fe07]#012/lib64/libstdc++.so.6(+0xb5070) [0x7f26d59f2070]#012/lib64/libpthread.so.0(+0x7ea5) [0x7f26d6092ea5]#012/lib64/libc.so.6(clone+0x6d) [0x7f26d51558dd]
Jul 11 04:35:34 ruckus stenographer[19315]: 2020/07/11 04:35:34 Stenotype stopped after 35.621040387s: stenotype wait failed: signal: aborted
Jul 11 04:35:34 ruckus stenographer[19315]: 2020/07/11 04:35:34 Stenotype ran for too little time, crashing to avoid stenotype crash loop
Jul 11 04:35:34 ruckus systemd: stenographer@ens4f2.service: main process exited, code=exited, status=1/FAILURE
Jul 11 04:35:35 ruckus systemd: Unit stenographer@ens4f2.service entered failed state.
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f2.service failed.
Jul 11 04:35:35 ruckus stenographer[19324]: 2020/07/11 04:35:35 Stenotype stopped after 36.942529225s: stenotype wait failed: signal: killed
Jul 11 04:35:35 ruckus stenographer[19324]: 2020/07/11 04:35:35 Stenotype ran for too little time, crashing to avoid stenotype crash loop
Jul 11 04:35:35 ruckus systemd: stenographer@wlp9s0.service: main process exited, code=exited, status=1/FAILURE
Jul 11 04:35:35 ruckus systemd: Unit stenographer@wlp9s0.service entered failed state.
Jul 11 04:35:35 ruckus systemd: stenographer@wlp9s0.service failed.
Jul 11 04:35:35 ruckus stenographer[19295]: 2020/07/11 04:35:35 Stenotype stopped after 37.143004302s: stenotype wait failed: signal: killed
Jul 11 04:35:35 ruckus stenographer[19295]: 2020/07/11 04:35:35 Stenotype ran for too little time, crashing to avoid stenotype crash loop
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f1.service: main process exited, code=exited, status=1/FAILURE
Jul 11 04:35:35 ruckus stenographer[19299]: 2020/07/11 04:35:35 Stenotype stopped after 37.148078228s: stenotype wait failed: signal: killed
Jul 11 04:35:35 ruckus stenographer[19299]: 2020/07/11 04:35:35 Stenotype ran for too little time, crashing to avoid stenotype crash loop
Jul 11 04:35:35 ruckus stenographer[19294]: 2020/07/11 04:35:35 Stenotype stopped after 37.147719281s: stenotype wait failed: signal: killed
Jul 11 04:35:35 ruckus stenographer[19294]: 2020/07/11 04:35:35 Stenotype ran for too little time, crashing to avoid stenotype crash loop
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f0.service: main process exited, code=exited, status=1/FAILURE
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f3.service: main process exited, code=exited, status=1/FAILURE
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f3.service: control process exited, code=exited status=1
Jul 11 04:35:35 ruckus systemd: Unit stenographer@ens4f3.service entered failed state.
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f3.service failed.
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f1.service: control process exited, code=exited status=1
Jul 11 04:35:35 ruckus systemd: Unit stenographer@ens4f1.service entered failed state.
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f1.service failed.
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f0.service: control process exited, code=exited status=1
Jul 11 04:35:35 ruckus systemd: Unit stenographer@ens4f0.service entered failed state.
Jul 11 04:35:35 ruckus systemd: stenographer@ens4f0.service failed.

Thank you.

The version of RockNSM is 2.5.0-2005.

Thanks.

This issue is resolved. The problem is with the underlying ZFS partition: ZFS on CentOS 7 is not creating /dev entries, so the ZFS issue is what needs fixing. ZFS is performing great for all other applications (Elastic, etc.).
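
For anyone who hits the same “Invalid argument <- open” abort, here is a rough probe that may help narrow down whether the storage directory is the culprit. It rests on two assumptions that are not confirmed anywhere in this thread: that stenotype opens its packet files with the O_DIRECT flag, and that some filesystems reject that flag with EINVAL. The function name `supports_o_direct` and the probe file prefix are made up for illustration. It is only a sketch, not a definitive diagnosis.

```python
import errno
import os
import tempfile


def supports_o_direct(directory):
    """Return True if a file in `directory` can be opened with O_DIRECT.

    Assumption: the failing open() in the log uses O_DIRECT. This probe
    only checks whether the filesystem accepts that flag at all.
    """
    o_direct = getattr(os, "O_DIRECT", None)
    if o_direct is None:
        # This platform does not define O_DIRECT (e.g. macOS).
        return False

    # Create a uniquely named probe file, then reopen it with O_DIRECT.
    fd, path = tempfile.mkstemp(dir=directory, prefix=".odirect_probe_")
    os.close(fd)
    try:
        fd = os.open(path, os.O_WRONLY | o_direct)
    except OSError as exc:
        if exc.errno == errno.EINVAL:
            # EINVAL is the errno behind "Invalid argument".
            return False
        raise
    else:
        os.close(fd)
        return True
    finally:
        # Always remove the probe file, whatever the outcome.
        os.unlink(path)


if __name__ == "__main__":
    import sys

    target = sys.argv[1] if len(sys.argv) > 1 else tempfile.gettempdir()
    print(target, "accepts O_DIRECT:", supports_o_direct(target))
```

Run it as the stenographer user against one of the packet directories from the log, for example `/data/stenographer/ens4f2/thread0/packets`. A `False` result would point at the filesystem rather than at the stenographer configuration.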

Thanks for following up, @water