Overview
An Exinda device may sometimes reboot unexpectedly. If the reboot occurs on any of the 8060, 10060, or 6060 series hardware models, it could be related to the serial speed settings of the device causing issues with the kernel. This article provides relevant information and steps to resolve this problem.
Root Cause
Sometimes, the default serial speed on the devices mentioned above causes the kernel to reach an unexpected unstable state. To protect against this, the Exinda device triggers bypass on its bridges and then reboots. Most of the time, this should not impact performance, since the bridges go into bypass mode and allow traffic to pass through without any interruption.
Resolution
- Enable kernel dump and disable the watchdog. A reboot is required for these changes to take effect:
en
conf t
debug kdump enable
no watchdog enable
WARNING: Although very unlikely, a kernel deadlock can occur in such a way that neither kdump nor the deadlock detector initiates. In this case, the Exinda device will be unresponsive, and you will have to power cycle it.
- To save the changes and reboot the device, execute the following commands:
write memory
reload
- Increase the Serial Speed on the device:
serial speed 115200
Note: This change requires a reboot to take effect, and it is only needed for models other than the 6062, 8062, and 10062.
reload
- Change the serial console logging level (for all appliances). For this step, you may have to contact Exinda Support, since an exclusive license is needed to access the shell. In the shell, execute the following commands (they take effect immediately):
remountrw
echo "3 4 1 7" > /proc/sys/kernel/printk
This change, however, will not be preserved after a reboot, so you need to add that same line to the /etc/rc.local file. To do so, open the file in vi:
vi /etc/rc.local
You will see the content of the file. Press "i" to enter Insert Mode and add the following line without deleting what is currently there:
echo "3 4 1 7" > /proc/sys/kernel/printk
Then press Esc and type :wq to save the file and exit.
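As an alternative to editing the file by hand, the same line can be appended from the shell. The following is a minimal sketch, not an Exinda-provided tool: the function name add_printk_line and the variable PRINTK_LINE are hypothetical, and it assumes that the target file ends with a newline and that lines appended at the end of /etc/rc.local are actually executed at boot (for example, that the file does not end with an earlier "exit 0"). It only appends the line if an identical line is not already present, so running it twice does not create duplicates.

```shell
#!/bin/sh
# Hypothetical helper (not part of the Exinda CLI): add the printk line
# to a startup script only when it is not already there.

# The exact line from the article, kept in single quotes so the inner
# double quotes and the redirection are stored literally, not executed.
PRINTK_LINE='echo "3 4 1 7" > /proc/sys/kernel/printk'

add_printk_line() {
    target="$1"
    # grep flags: -q quiet, -x match the whole line, -F fixed string (no regex)
    if grep -qxF "$PRINTK_LINE" "$target" 2>/dev/null; then
        return 0    # already present, nothing to do
    fi
    # Assumption: the file ends with a newline, so this starts a new line.
    printf '%s\n' "$PRINTK_LINE" >> "$target"
}
```

After running remountrw, the function would be called as add_printk_line /etc/rc.local. Trying it on a copy of the file first is a reasonable precaution.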
Additional Information
The watchdog feature shuts down the Exinda device to protect the network when specific processes exceed the capacity of the appliance. The most common causes of system watchdog reboots are:
- Kernel deadlock: A timing issue results in the lockup of one or more cores in the kernel. Unfortunately, the reboot wipes out any information that may explain what went wrong. If the system watchdog is disabled and kdump is enabled, kdump may detect the deadlock and log partial information when it happens. In other cases, the box may end up unresponsive and require power cycling.
- Too much output to the serial console: The CPU is blocked while it writes this information out the serial port. The serial speed can be changed from 9600 to 115200 (twelve times faster), which reduces the likelihood of this happening (the 6062, 8062, and 10062 models use 115200 by default for this reason). Another measure to address this is to decrease the default level of messages going to the serial port.
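
To put the two serial speeds in perspective, the sketch below estimates how long the serial port needs to send one console line at each speed. It is an illustration only: the function name tx_time_us is hypothetical, the 80-byte line length is an arbitrary example, and it assumes 8N1 framing (one start bit, eight data bits, one stop bit, so 10 bits on the wire per byte), which the article does not state.

```shell
#!/bin/sh
# Hypothetical helper: time in microseconds to send BYTES at BAUD,
# assuming 8N1 framing (10 bits on the wire per byte).
tx_time_us() {
    bytes="$1"
    baud="$2"
    # Integer arithmetic: bits to send, scaled to microseconds, divided by bits/s
    echo $(( bytes * 10 * 1000000 / baud ))
}

tx_time_us 80 9600      # prints 83333 (about 83 ms per 80-byte line)
tx_time_us 80 115200    # prints 6944  (about 7 ms per 80-byte line)
```

Under these assumptions, a burst of 100 such lines takes roughly 8.3 seconds at 9600 but about 0.7 seconds at 115200, which is the twelvefold difference mentioned above.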