[oneimage] Fix race condition in systemd container services #421
Merged
Thanks for the fix. Is there a way to ensure that this fix is working?
taoyl-ms
added a commit
to taoyl-ms/files
that referenced
this pull request
Mar 22, 2017
What do you mean by "ensure the fix is working"? I have tested the fix and the time sequence is correct now:
stcheng
approved these changes
Mar 22, 2017
stcheng
pushed a commit
that referenced
this pull request
Mar 22, 2017
When Type=simple, systemd considers the service activated as soon as the process specified in ExecStart has been started. If a downstream service depends on state prepared by ExecStart, there is a race condition. For example, see issue #390. In this case, database.service calls database.sh, which calls docker run or docker start -a to start the database container. However, systemd considers database.service successfully started at the time database.sh begins, not after docker run finishes. Because database.service is considered started, bgp.service can be started. The redis database, which the bgp service depends on, might or might not have been started at this point. To fix this issue (while keeping the ability to monitor docker status with systemd), we split the ExecStart process into an ExecStartPre part and an ExecStart part. docker run is split into docker run -d followed by docker attach, while docker start -a is split into docker start followed by docker attach. In this way, downstream services are blocked until the container has successfully started.
yxieca
added a commit
to yxieca/sonic-buildimage
that referenced
this pull request
Feb 14, 2019
…ules PR#2538 cannot merge due to master branch status. It has been tested against 201811 branch. Submodule src/sonic-sairedis 21f4a49..d57222a: > Add more specific logic for ingress ACL and buffer profile (sonic-net#421) > Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (sonic-net#418) > Add support for vlan tagged frames in virtual switch (sonic-net#417) Submodule src/sonic-swss 1590030..584490c: > Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (sonic-net#786) > [vstest]: Potential fix for timing issue in warm_reboot's routing UT (sonic-net#788) Submodule src/sonic-swss-common 594f4e8..286ef34: > Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (sonic-net#260) Submodule src/sonic-utilities c6666e2..b44b462: > Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABL… (sonic-net#458) > [aclshow] output only counters per table/rule (sonic-net#442) Signed-off-by: Ying Xie <[email protected]> [PR 2538] Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE Signed-off-by: Jipan Yang <[email protected]>
yxieca
added a commit
that referenced
this pull request
Feb 14, 2019
PR#2538 cannot merge due to master branch status. It has been tested against 201811 branch. Submodule src/sonic-sairedis 21f4a49..d57222a: > Add more specific logic for ingress ACL and buffer profile (#421) > Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (#418) > Add support for vlan tagged frames in virtual switch (#417) Submodule src/sonic-swss 1590030..584490c: > Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (#786) > [vstest]: Potential fix for timing issue in warm_reboot's routing UT (#788) Submodule src/sonic-swss-common 594f4e8..286ef34: > Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (#260) Submodule src/sonic-utilities c6666e2..b44b462: > Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABL… (#458) > [aclshow] output only counters per table/rule (#442) Signed-off-by: Ying Xie <[email protected]> [PR 2538] Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE Signed-off-by: Jipan Yang <[email protected]>
lguohan
added a commit
to yxieca/sonic-buildimage
that referenced
this pull request
Feb 16, 2019
swss * a6d60f2 2019-02-15 | Create egress ACL table group during the PFCWD stats list installment (sonic-net#787) (HEAD, origin/master, origin/HEAD) [Wenda Ni] * 52de963 2019-02-15 | [fpmsyncd] Add VNET routes support (sonic-net#772) [Wei Bai] * d27f49e 2019-02-13 | Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (sonic-net#786) [Jipan Yang] * 6363985 2019-02-08 | [vstest]: Potential fix for timing issue in warm_reboot's routing UT (sonic-net#788) [Rodny Molina] * 6d5424d 2019-02-07 | VNet/Vxlan delete handling (sonic-net#766) [Prince Sunny] * d680ce2 2019-02-07 | [neighsyncd] increase neighbor syncd restore timeout to 110 seconds (sonic-net#745) [Ying Xie] * b78cc8d 2019-02-01 | support 8 lanes for a physical port (sonic-net#778) [lguohan] * 73b620c 2019-02-01 | Increase the watermark polling interval to 10s (sonic-net#777) [Wenda Ni] * a2b987b 2019-02-01 | [vstest]: fix test_speed.py (sonic-net#780) [lguohan] * cef4bd0 2019-02-01 | [vstest]: fix test_port_an_warm.py test (sonic-net#779) [lguohan] * 9f20eda 2019-02-01 | fix a unstable swss egress acl test (sonic-net#776) [Kebo Liu] * 316ae6c 2019-01-30 | portsorch ports init done flag should means buffer, autoneg, speed, m… (sonic-net#747) [Jipan Yang] * 4280036 2019-01-30 | [teammgrd] Fix inconsistent port admin status (sonic-net#755) [Jipan Yang] * cf12bdf 2019-01-30 | Remove AclTableGroup upon removal of port/lag/vlan (sonic-net#751) [Jipan Yang] * 5779c1a 2019-01-29 | [aclorch] Remove L4 port range support limitation on egress ACL table and add new SWSS virtual test. 
(sonic-net#741) [Kebo Liu] * 36e85eb 2019-01-29 | On a routing vlan, the neighbor entry in the /31 subnet is not added to hardware (sonic-net#771) [Kiran Kumar Kella] * 882ccc6 2019-01-24 | [vnetorch] Change logic for adding VNet interface (sonic-net#761) [Marian Pritsak] * f637557 2019-01-25 | [vrfmgrd] Fix VRF is not set to VRF_TABLE in APP_DB correctly (sonic-net#768) [yorke] * e84a6ab 2019-01-24 | use sai_stat_id_t for new SAI header file (sonic-net#769) [lguohan] sairedis * d685e35 2019-02-15 | Add support for fdb_event MOVE and check fdb event oids (sonic-net#420) (HEAD, origin/master, origin/HEAD) [Kamil Cudnik] * 2b91013 2019-02-15 | [vslib] add missing port attributes for virtual switch (sonic-net#419) [Stepan Blyshchak] * dcc8688 2019-02-14 | Add more specific logic for ingress ACL and buffer profile (sonic-net#421) [Kamil Cudnik] * c0b39ea 2019-02-12 | Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (sonic-net#418) [Jipan Yang] * ab35dfa 2019-02-11 | Add support for vlan tagged frames in virtual switch (sonic-net#417) [Kamil Cudnik] * 145ea44 2019-02-02 | [flex counter] handle router interface stats (sonic-net#410) [Mykola F] * c03d639 2019-02-02 | Add more information on failed map sizes (sonic-net#416) [Kamil Cudnik] * 29f1e3c 2019-01-31 | Update SAI pointer (sonic-net#414) [Marian Pritsak] * c0a948d 2019-01-30 | Add WRED specific comparison logic (sonic-net#413) [Kamil Cudnik] * 1b6a661 2019-01-24 | install SAI extension header files into /usr/include/sai (sonic-net#412) [lguohan] * 849525a 2019-01-24 | Initialize notification queue pointer before switch create (sonic-net#411) [Kamil Cudnik] * 02d92f1 2019-01-23 | Add log info for not matching SG/IPG/QUEUES (sonic-net#409) [Kamil Cudnik] * 8793562 2019-01-18 | Update SAI pointer to latest master (sonic-net#408) [Marian Pritsak] swss-common * ec04a5a 2019-02-14 | Add support for WarmStart::setDataCheckState() (sonic-net#242) [Jipan Yang] * 56bd73f 2019-02-13 | Force only 
supported commands on consumer table (sonic-net#261) [Kamil Cudnik] * 414de0f 2019-02-12 | Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (sonic-net#260) [Jipan Yang] * 88de725 2019-02-05 | [pyext] enable types in stdint.h (sonic-net#259) [Ying Xie] * f457ae8 2019-02-05 | Optimized ProducerStateTable set/del notification processing to avoid… (sonic-net#257) [Jipan Yang] * e5286fd 2019-01-30 | [rif counters] Rif counter schema update (sonic-net#256) [Mykola F] sonic-utilities * b44b462 2019-02-14 | Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABL… (sonic-net#458) (HEAD, origin/master, origin/HEAD) [Jipan Yang] * e856b8b 2019-02-11 | [aclshow] output only counters per table/rule (sonic-net#442) [Roman Kachur] Signed-off-by: Guohan Lu <[email protected]>
lguohan
pushed a commit
that referenced
this pull request
Feb 16, 2019
…g Broadcom SAI build (#2488) * [Broadcom SAI] upgrade Broadcom SAI to 3.3.4.3m-3 This is SAI 3.3.4.3m-3 compiled with SAI header file at commit ID 6ad3382217ec22f64cd268faefcbc2ff7caba4fd of SAI repo. Signed-off-by: Ying Xie <[email protected]> * change libsaithrift version to 0.9.4 Signed-off-by: Guohan Lu <[email protected]> * [submodule]: update swss, sairedis, swss-common, sonic-utilities swss * a6d60f2 2019-02-15 | Create egress ACL table group during the PFCWD stats list installment (#787) (HEAD, origin/master, origin/HEAD) [Wenda Ni] * 52de963 2019-02-15 | [fpmsyncd] Add VNET routes support (#772) [Wei Bai] * d27f49e 2019-02-13 | Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (#786) [Jipan Yang] * 6363985 2019-02-08 | [vstest]: Potential fix for timing issue in warm_reboot's routing UT (#788) [Rodny Molina] * 6d5424d 2019-02-07 | VNet/Vxlan delete handling (#766) [Prince Sunny] * d680ce2 2019-02-07 | [neighsyncd] increase neighbor syncd restore timeout to 110 seconds (#745) [Ying Xie] * b78cc8d 2019-02-01 | support 8 lanes for a physical port (#778) [lguohan] * 73b620c 2019-02-01 | Increase the watermark polling interval to 10s (#777) [Wenda Ni] * a2b987b 2019-02-01 | [vstest]: fix test_speed.py (#780) [lguohan] * cef4bd0 2019-02-01 | [vstest]: fix test_port_an_warm.py test (#779) [lguohan] * 9f20eda 2019-02-01 | fix a unstable swss egress acl test (#776) [Kebo Liu] * 316ae6c 2019-01-30 | portsorch ports init done flag should means buffer, autoneg, speed, m… (#747) [Jipan Yang] * 4280036 2019-01-30 | [teammgrd] Fix inconsistent port admin status (#755) [Jipan Yang] * cf12bdf 2019-01-30 | Remove AclTableGroup upon removal of port/lag/vlan (#751) [Jipan Yang] * 5779c1a 2019-01-29 | [aclorch] Remove L4 port range support limitation on egress ACL table and add new SWSS virtual test. 
(#741) [Kebo Liu] * 36e85eb 2019-01-29 | On a routing vlan, the neighbor entry in the /31 subnet is not added to hardware (#771) [Kiran Kumar Kella] * 882ccc6 2019-01-24 | [vnetorch] Change logic for adding VNet interface (#761) [Marian Pritsak] * f637557 2019-01-25 | [vrfmgrd] Fix VRF is not set to VRF_TABLE in APP_DB correctly (#768) [yorke] * e84a6ab 2019-01-24 | use sai_stat_id_t for new SAI header file (#769) [lguohan] sairedis * d685e35 2019-02-15 | Add support for fdb_event MOVE and check fdb event oids (#420) (HEAD, origin/master, origin/HEAD) [Kamil Cudnik] * 2b91013 2019-02-15 | [vslib] add missing port attributes for virtual switch (#419) [Stepan Blyshchak] * dcc8688 2019-02-14 | Add more specific logic for ingress ACL and buffer profile (#421) [Kamil Cudnik] * c0b39ea 2019-02-12 | Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (#418) [Jipan Yang] * ab35dfa 2019-02-11 | Add support for vlan tagged frames in virtual switch (#417) [Kamil Cudnik] * 145ea44 2019-02-02 | [flex counter] handle router interface stats (#410) [Mykola F] * c03d639 2019-02-02 | Add more information on failed map sizes (#416) [Kamil Cudnik] * 29f1e3c 2019-01-31 | Update SAI pointer (#414) [Marian Pritsak] * c0a948d 2019-01-30 | Add WRED specific comparison logic (#413) [Kamil Cudnik] * 1b6a661 2019-01-24 | install SAI extension header files into /usr/include/sai (#412) [lguohan] * 849525a 2019-01-24 | Initialize notification queue pointer before switch create (#411) [Kamil Cudnik] * 02d92f1 2019-01-23 | Add log info for not matching SG/IPG/QUEUES (#409) [Kamil Cudnik] * 8793562 2019-01-18 | Update SAI pointer to latest master (#408) [Marian Pritsak] swss-common * ec04a5a 2019-02-14 | Add support for WarmStart::setDataCheckState() (#242) [Jipan Yang] * 56bd73f 2019-02-13 | Force only supported commands on consumer table (#261) [Kamil Cudnik] * 414de0f 2019-02-12 | Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABLE_TABLE (#260) [Jipan 
Yang] * 88de725 2019-02-05 | [pyext] enable types in stdint.h (#259) [Ying Xie] * f457ae8 2019-02-05 | Optimized ProducerStateTable set/del notification processing to avoid… (#257) [Jipan Yang] * e5286fd 2019-01-30 | [rif counters] Rif counter schema update (#256) [Mykola F] sonic-utilities * b44b462 2019-02-14 | Move warm_restart enable/disable config to stateDB WARM_RESTART_ENABL… (#458) (HEAD, origin/master, origin/HEAD) [Jipan Yang] * e856b8b 2019-02-11 | [aclshow] output only counters per table/rule (#442) [Roman Kachur] Signed-off-by: Guohan Lu <[email protected]> * [mlnx] update mellanox sai Signed-off-by: Stepan Blyschak <[email protected]>
dmytroxshevchuk
pushed a commit
to dmytroxshevchuk/sonic-buildimage
that referenced
this pull request
Aug 31, 2020
…#421) * Add more specific logic for ingress ACL and buffer profile * Address comments
tahmed-dev
added a commit
to tahmed-dev/sonic-buildimage
that referenced
this pull request
Feb 18, 2021
Change in this update: b75aab7 [swss-common] Add LINKMGR CFG and MUX LINKMGR state table names (sonic-net#421) 4a77d1c [ci]: add vstest (sonic-net#459) 07258a6 [ci]: use build template (sonic-net#457) ddcae3e runRedisScript api to process integer returned by script run in the redis (sonic-net#447) 33d89c7 [systemlag] Schema defs for system lag (sonic-net#448) af01f37 spell check fixes (sonic-net#456) 7afd43d Update to make getNamespaces() API at par with the get_ns_list() swssdk-py API. (sonic-net#455) signed-off-by: Tamer Ahmed <[email protected]>
4 tasks
tahmed-dev
added a commit
that referenced
this pull request
Feb 18, 2021
Change in this update: b75aab7 [swss-common] Add LINKMGR CFG and MUX LINKMGR state table names (#421) 4a77d1c [ci]: add vstest (#459) 07258a6 [ci]: use build template (#457) ddcae3e runRedisScript api to process integer returned by script run in the redis (#447) 33d89c7 [systemlag] Schema defs for system lag (#448) af01f37 spell check fixes (#456) 7afd43d Update to make getNamespaces() API at par with the get_ns_list() swssdk-py API. (#455) signed-off-by: Tamer Ahmed <[email protected]>
daall
pushed a commit
that referenced
this pull request
Feb 25, 2021
Change in this update: b75aab7 [swss-common] Add LINKMGR CFG and MUX LINKMGR state table names (#421) 4a77d1c [ci]: add vstest (#459) 07258a6 [ci]: use build template (#457) ddcae3e runRedisScript api to process integer returned by script run in the redis (#447) 33d89c7 [systemlag] Schema defs for system lag (#448) af01f37 spell check fixes (#456) 7afd43d Update to make getNamespaces() API at par with the get_ns_list() swssdk-py API. (#455) signed-off-by: Tamer Ahmed <[email protected]>
carl-nokia
pushed a commit
to carl-nokia/sonic-buildimage
that referenced
this pull request
Aug 7, 2021
Change in this update: b75aab7 [swss-common] Add LINKMGR CFG and MUX LINKMGR state table names (sonic-net#421) 4a77d1c [ci]: add vstest (sonic-net#459) 07258a6 [ci]: use build template (sonic-net#457) ddcae3e runRedisScript api to process integer returned by script run in the redis (sonic-net#447) 33d89c7 [systemlag] Schema defs for system lag (sonic-net#448) af01f37 spell check fixes (sonic-net#456) 7afd43d Update to make getNamespaces() API at par with the get_ns_list() swssdk-py API. (sonic-net#455) signed-off-by: Tamer Ahmed <[email protected]>
DavidZagury
pushed a commit
to DavidZagury/sonic-buildimage
that referenced
this pull request
Dec 7, 2024
* Add kernel patch to fix 57766 staying down after reset This is the workaround for incorrected detected DMA overflow that may result in NIC staying down after reset. The fix is to limit address space that can be used. * Add description and fix bookworm build * Fix subject in the patch --------- Co-authored-by: Saikrishna Arcot <[email protected]>
When `Type=simple`, systemd considers the service activated as soon as the process specified in `ExecStart` has been started. If a downstream service depends on state prepared by `ExecStart`, there is a race condition.

For example, see issue #390. In this case, `database.service` calls `database.sh`, which calls `docker run` or `docker start -a` to start the database container. However, systemd considers `database.service` successfully started at the time `database.sh` begins, not after `docker run` finishes. Because `database.service` is considered started, `bgp.service` can be started. The redis database, which the bgp service depends on, might or might not have been started at this point.

To fix this issue (while keeping the ability to monitor docker status with systemd), we split the `ExecStart` process into an `ExecStartPre` part and an `ExecStart` part. `docker run` is split into `docker run -d` followed by `docker attach`, while `docker start -a` is split into `docker start` followed by `docker attach`. In this way, downstream services are blocked until the container has successfully started.
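
The split described above can be sketched as a pair of unit files. This is a minimal illustration, not the actual change in this PR: it calls `docker` directly rather than going through `database.sh`, and it assumes a container named `database` that already exists (so only the `docker start` case is shown), a docker binary at `/usr/bin/docker`, and hypothetical `Description=` strings.

```ini
# database.service (illustrative sketch; names and paths are assumptions)
[Unit]
Description=Database container (hypothetical example)
Requires=docker.service
After=docker.service

[Service]
Type=simple
# ExecStartPre must exit successfully before ExecStart is run, so the
# container is already started by the time systemd marks this unit active.
ExecStartPre=/usr/bin/docker start database
# The long-running attach becomes the main process, so systemd can still
# track the container: when the container exits, so does this process.
ExecStart=/usr/bin/docker attach database
ExecStop=/usr/bin/docker stop database

# bgp.service (illustrative sketch; only the ordering-related lines)
[Unit]
Description=BGP container (hypothetical example)
# After= delays this unit until database.service has finished starting,
# which now includes the ExecStartPre step above.
Requires=database.service
After=database.service
```

With this layout the ordering guarantee comes from systemd itself: a `Type=simple` unit is only considered started once its `ExecStartPre` commands have completed and the `ExecStart` process has been spawned. Note that this only guarantees the container has been started, not that the process inside it is already accepting connections.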