Fix bug which hangs the test if warmboot has an issue #2994

Merged
merged 1 commit into from
Feb 17, 2021

vaibhavhd
Contributor

Description of PR

Summary: Fix a test bug that hangs test execution when warmboot has an issue (in either the shutdown or the boot-up path).

Fixes # (issue)

Type of change

  • Bug fix
  • Testbed and Framework(new/improvement)
  • Test case(new/improvement)

Approach

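Replace duthost.get_up_time() with duthost.get_now_time() in the loop that waits for the warmboot-finalizer service.
This change is needed because duthost.get_up_time() returns the same value on every call: the timestamp at which the DUT last came up (the output of uptime -s).
As a result, time_passed stays constant, never exceeds the wait limit, and the loop never exits if the service does not reach the expected state.

With get_now_time(), time_passed is updated on every iteration, and the loop exits when the timeout occurs.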
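
The difference between the two clocks can be illustrated with a minimal, self-contained sketch. It is not the code in tests/common/reboot.py: FakeDut, wait_for_finalizer, read_clock, and max_polls are hypothetical names made up for illustration, and only the method names get_up_time / get_now_time and the exception message are taken from this PR. The max_polls guard exists solely so that the demonstration itself cannot hang.

```python
from datetime import datetime, timedelta


class FakeDut:
    """Hypothetical stand-in for the DUT host object (not the sonic-mgmt class)."""

    def __init__(self):
        self.boot_time = datetime(2021, 2, 17, 15, 20, 56)
        self.now = self.boot_time + timedelta(seconds=55)

    def get_up_time(self):
        # Mimics `uptime -s`: the boot timestamp, identical on every call.
        return self.boot_time

    def get_now_time(self):
        # Mimics the device's current time; each call advances it by 1 s.
        self.now += timedelta(seconds=1)
        return self.now


def wait_for_finalizer(read_clock, get_state, wait=5, max_polls=100):
    """Poll until get_state() returns 'activating' or `wait` seconds elapse.

    Returns the number of polls on success and raises on timeout.
    `max_polls` is an assumption added so this demo always terminates.
    """
    start = read_clock()
    for polls in range(1, max_polls + 1):
        if get_state() == "activating":
            return polls
        time_passed = (read_clock() - start).total_seconds()
        if time_passed > wait:
            raise Exception('warmboot-finalizer never reached state "activating"')
    raise RuntimeError("poll budget exhausted: time_passed never grew")


def outcome(read_clock):
    """Run the wait loop against a service that stays 'inactive'."""
    try:
        wait_for_finalizer(read_clock, lambda: "inactive")
    except RuntimeError:
        return "would hang"
    except Exception:
        return "timed out"
    return "reached activating"


if __name__ == "__main__":
    print("get_up_time :", outcome(FakeDut().get_up_time))
    print("get_now_time:", outcome(FakeDut().get_now_time))
```

With the boot-time clock the elapsed time is always zero, so only the artificial poll budget stops the loop; with the advancing clock the timeout branch is reached and the exception is raised.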

What is the motivation for this PR?

17/02/2021 15:21:51 INFO reboot.py:reboot:156: ssh has started up
17/02/2021 15:21:51 INFO reboot.py:reboot:158: waiting for switch to initialize
17/02/2021 15:21:51 INFO reboot.py:reboot:161: waiting for warmboot-finalizer service to become activating
17/02/2021 15:21:51 DEBUG devices.py:_run:81: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/reboot.py::get_warmboot_finalizer_state#72: [str2-7050cx3-acs-01] AnsibleModule::command, args=["systemctl is-active warmboot-finalizer.service"], kwargs={"module_ignore_errors": true}
17/02/2021 15:21:56 DEBUG devices.py:_run:95: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/reboot.py::get_warmboot_finalizer_state#72: [str2-7050cx3-acs-01] AnsibleModule::command Result => {"stderr_lines": [], "cmd": ["systemctl", "is-active", "warmboot-finalizer.service"], "end": "2021-02-17 15:21:56.371065", "_ansible_no_log": false, "stdout": "inactive", "changed": true, "failed": true, "delta": "0:00:00.011023", "stderr": "", "rc": 3, "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": false, "strip_empty_ends": true, "_raw_params": "systemctl is-active warmboot-finalizer.service", "removes": null, "argv": null, "creates": null, "chdir": null, "stdin_add_newline": true, "stdin": null}}, "stdout_lines": ["inactive"], "start": "2021-02-17 15:21:56.360042", "msg": "non-zero return code"}
17/02/2021 15:21:56 DEBUG devices.py:_run:81: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/devices.py::get_up_time#669: [str2-7050cx3-acs-01] AnsibleModule::command, args=["uptime -s"], kwargs={}
17/02/2021 15:21:56 DEBUG devices.py:_run:95: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/devices.py::get_up_time#669: [str2-7050cx3-acs-01] AnsibleModule::command Result => {"stderr_lines": [], "cmd": ["uptime", "-s"], "end": "2021-02-17 15:21:57.968906", "_ansible_no_log": false, "stdout": "2021-02-17 15:20:56", "changed": true, "rc": 0, "start": "2021-02-17 15:21:57.959445", "stderr": "", "delta": "0:00:00.009461", "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": false, "strip_empty_ends": true, "_raw_params": "uptime -s", "removes": null, "argv": null, "creates": null, "chdir": null, "stdin_add_newline": true, "stdin": null}}, "stdout_lines": ["2021-02-17 15:20:56"], "failed": false}
17/02/2021 15:21:57 DEBUG devices.py:_run:81: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/reboot.py::get_warmboot_finalizer_state#72: [str2-7050cx3-acs-01] AnsibleModule::command, args=["systemctl is-active warmboot-finalizer.service"], kwargs={"module_ignore_errors": true}
17/02/2021 15:21:58 DEBUG devices.py:_run:95: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/reboot.py::get_warmboot_finalizer_state#72: [str2-7050cx3-acs-01] AnsibleModule::command Result => {"stderr_lines": [], "cmd": ["systemctl", "is-active", "warmboot-finalizer.service"], "end": "2021-02-17 15:21:59.476516", "_ansible_no_log": false, "stdout": "inactive", "changed": true, "failed": true, "delta": "0:00:00.008238", "stderr": "", "rc": 3, "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": false, "strip_empty_ends": true, "_raw_params": "systemctl is-active warmboot-finalizer.service", "removes": null, "argv": null, "creates": null, "chdir": null, "stdin_add_newline": true, "stdin": null}}, "stdout_lines": ["inactive"], "start": "2021-02-17 15:21:59.468278", "msg": "non-zero return code"}
17/02/2021 15:21:58 DEBUG devices.py:_run:81: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/devices.py::get_up_time#669: [str2-7050cx3-acs-01] AnsibleModule::command, args=["uptime -s"], kwargs={}
17/02/2021 15:21:58 DEBUG devices.py:_run:95: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/devices.py::get_up_time#669: [str2-7050cx3-acs-01] AnsibleModule::command Result => {"stderr_lines": [], "cmd": ["uptime", "-s"], "end": "2021-02-17 15:21:59.957667", "_ansible_no_log": false, "stdout": "2021-02-17 15:20:56", "changed": true, "rc": 0, "start": "2021-02-17 15:21:59.953171", "stderr": "", "delta": "0:00:00.004496", "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": false, "strip_empty_ends": true, "_raw_params": "uptime -s", "removes": null, "argv": null, "creates": null, "chdir": null, "stdin_add_newline": true, "stdin": null}}, "stdout_lines": ["2021-02-17 15:20:56"], "failed": false}
17/02/2021 15:21:59 DEBUG devices.py:_run:81: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/reboot.py::get_warmboot_finalizer_state#72: [str2-7050cx3-acs-01] AnsibleModule::command, args=["systemctl is-active warmboot-finalizer.service"], kwargs={"module_ignore_errors": true}
17/02/2021 15:22:00 DEBUG devices.py:_run:95: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/reboot.py::get_warmboot_finalizer_state#72: [str2-7050cx3-acs-01] AnsibleModule::command Result => {"stderr_lines": [], "cmd": ["systemctl", "is-active", "warmboot-finalizer.service"], "end": "2021-02-17 15:22:01.454868", "_ansible_no_log": false, "stdout": "inactive", "changed": true, "failed": true, "delta": "0:00:00.009706", "stderr": "", "rc": 3, "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": false, "strip_empty_ends": true, "_raw_params": "systemctl is-active warmboot-finalizer.service", "removes": null, "argv": null, "creates": null, "chdir": null, "stdin_add_newline": true, "stdin": null}}, "stdout_lines": ["inactive"], "start": "2021-02-17 15:22:01.445162", "msg": "non-zero return code"}
17/02/2021 15:22:00 DEBUG devices.py:_run:81: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/devices.py::get_up_time#669: [str2-7050cx3-acs-01] AnsibleModule::command, args=["uptime -s"], kwargs={}
17/02/2021 15:22:00 DEBUG devices.py:_run:95: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/devices.py::get_up_time#669: [str2-7050cx3-acs-01] AnsibleModule::command Result => {"stderr_lines": [], "cmd": ["uptime", "-s"], "end": "2021-02-17 15:22:01.933692", "_ansible_no_log": false, "stdout": "2021-02-17 15:20:56", "changed": true, "rc": 0, "start": "2021-02-17 15:22:01.928974", "stderr": "", "delta": "0:00:00.004718", "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": false, "strip_empty_ends": true, "_raw_params": "uptime -s", "removes": null, "argv": null, "creates": null, "chdir": null, "stdin_add_newline": true, "stdin": null}}, "stdout_lines": ["2021-02-17 15:20:56"], "failed": false}
17/02/2021 15:22:01 DEBUG devices.py:_run:81: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/reboot.py::get_warmboot_finalizer_state#72: [str2-7050cx3-acs-01] AnsibleModule::command, args=["systemctl is-active warmboot-finalizer.service"], kwargs={"module_ignore_errors": true}
17/02/2021 15:22:02 DEBUG devices.py:_run:95: /var/sonicbld/workspace/NewTests/TEMPLATE_PYTEST_T0_A7050CX3/tests/common/reboot.py::get_warmboot_finalizer_state#72: [str2-7050cx3-acs-01] AnsibleModule::command Result => {"stderr_lines": [], "cmd": ["systemctl", "is-active", "warmboot-finalizer.service"], "end": "2021-02-17 15:22:03.612933", "_ansible_no_log": false, "stdout": "inactive", "changed": true, "failed": true, "delta": "0:00:00.008183", "stderr": "", "rc": 3, "invocation": {"module_args": {"warn": true, "executable": null, "_uses_shell": false, "strip_empty_ends": true, "_raw_params": "systemctl is-active warmboot-finalizer.service", "removes": null, "argv": null, "creates": null, "chdir": null, "stdin_add_newline": true, "stdin": null}}, "stdout_lines": ["inactive"], "start": "2021-02-17 15:22:03.604750", "msg": "non-zero return code"}

How did you do it?

How did you verify/test it?

Tested on a DUT where the issue was seen:

====================================================================================================== test session starts ======================================================================================================
platform_tests/test_reboot.py::test_warm_reboot 
--------------------------------------------------------------------------------------------------------- live log call ---------------------------------------------------------------------------------------------------------
23:01:02 INFO test_reboot.py:reboot_and_check:58: Run warm reboot on DUT
23:01:04 INFO reboot.py:reboot:124: waiting for ssh to drop
23:01:04 INFO reboot.py:execute_reboot_command:108: rebooting with command "warm-reboot"
23:02:04 INFO reboot.py:reboot:145: waiting for ssh to startup
23:03:02 INFO reboot.py:reboot:156: ssh has started up
23:03:02 INFO reboot.py:reboot:158: waiting for switch to initialize
23:03:02 INFO reboot.py:reboot:161: waiting for warmboot-finalizer service to become activating
FAILED                                                                                                                                                                                                                    [100%]
------------------------------------------------------------------------------------------------------- live log teardown -------------------------------------------------------------------------------------------------------
23:03:08 INFO test_reboot.py:teardown_module:41: Tearing down: to make sure all the critical services, interfaces and transceivers are good
platform_tests/test_reboot.py::test_warm_reboot ERROR                                                                                                                                                                     [100%]

============================================================================================================ ERRORS =============================================================================================================
_____________________________________________________________________________________________ ERROR at teardown of test_warm_reboot _____________________________________________________________________________________________
                if time_passed > wait:
>                   raise Exception('warmboot-finalizer never reached state "activating"')
E                   Exception: warmboot-finalizer never reached state "activating"

common/reboot.py:167: Exception
============================================================================================== 1 failed, 1 error in 153.13 seconds ==============================================================================================

Any platform specific information?

Supported testbed topology if it's a new test case?

Documentation

@yxieca yxieca merged commit 4f88aa2 into sonic-net:master Feb 17, 2021
@vaibhavhd vaibhavhd deleted the test-warmboot-stuck-fix branch February 17, 2021 23:58