[Bug]: [benchmark][standalone] build index raise error: Connection reset by peer #14733

Closed
1 task done
wangting0128 opened this issue Jan 4, 2022 · 15 comments
Labels: kind/bug, priority/critical-urgent, test/benchmark, triage/accepted

@wangting0128
Contributor

Is there an existing issue for this?

  • I have searched the existing issues

Environment

- Milvus version: master-20211231-9baa6e8
- Deployment mode(standalone or cluster): standalone
- SDK version(e.g. pymilvus v2.0.0rc2): 2.0.0rc9.dev22
- OS(Ubuntu or CentOS): 
- CPU/Memory: 
- GPU: 
- Others:

Current Behavior

client pod: benchmark-no-clean-qn4c7-2274554322

client log:

[2021-12-31 10:26:55,340] [   DEBUG] - Milvus insert run in 1.0092s (milvus_benchmark.client:53)
[2021-12-31 10:26:55,342] [   DEBUG] - Row count: 85974952 in collection: <sift_100m_128_l2> (milvus_benchmark.client:421)
[2021-12-31 10:26:55,342] [   DEBUG] - 85974952 (milvus_benchmark.runners.base:89)
[2021-12-31 10:26:55,795] [   DEBUG] - Start id: 99950000, end id: 100000000 (milvus_benchmark.runners.base:76)
[2021-12-31 10:26:56,775] [   DEBUG] - Milvus insert run in 0.9778s (milvus_benchmark.client:53)
[2021-12-31 10:26:56,777] [   DEBUG] - Row count: 85974952 in collection: <sift_100m_128_l2> (milvus_benchmark.client:421)
[2021-12-31 10:26:56,777] [   DEBUG] - 85974952 (milvus_benchmark.runners.base:89)
[2021-12-31 10:26:56,778] [    INFO] - {'total_time': 3312.194, 'rps': 30191.4683, 'ni_time': 1.6561} (milvus_benchmark.runners.base:151)
[2021-12-31 10:26:56,839] [   DEBUG] - Start flush. (milvus_benchmark.runners.locust:428)
[2021-12-31 10:32:25,137] [   DEBUG] - Milvus flush run in 328.2954s (milvus_benchmark.client:53)
[2021-12-31 10:32:25,137] [   DEBUG] - Fulsh done, during time: 328.2978 (milvus_benchmark.runners.locust:431)
[2021-12-31 10:32:25,139] [   DEBUG] - Row count: 100000000 in collection: <sift_100m_128_l2> (milvus_benchmark.client:421)
[2021-12-31 10:32:25,139] [   DEBUG] - 100000000 (milvus_benchmark.runners.locust:432)
[2021-12-31 10:32:25,139] [   DEBUG] - Start build index for last file (milvus_benchmark.runners.locust:434)
[2021-12-31 10:32:25,139] [    INFO] - Building index start, collection_name: sift_100m_128_l2, index_type: IVF_SQ8, metric_type: L2 (milvus_benchmark.client:274)
[2021-12-31 10:32:25,140] [    INFO] - {'nlist': 2048} (milvus_benchmark.client:276)
[2021-12-31 10:32:25,140] [   DEBUG] - collection: sift_100m_128_l2 Index params: {'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 2048}} (milvus_benchmark.client:282)
[2021-12-31 10:48:35,818] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640947715.817063522","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 10:48:33.578502', 'RPC start': '2021-12-31 10:48:33.578528', 'RPC error': '2021-12-31 10:48:35.817933'} (pymilvus.client.grpc_handler:84)
[2021-12-31 10:48:35,818] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640947715.817063522","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 10:32:25.145586', 'RPC start': '2021-12-31 10:32:25.145593', 'RPC error': '2021-12-31 10:48:35.818639'} (pymilvus.client.grpc_handler:84)
[2021-12-31 10:48:35,818] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640947715.817063522","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 10:32:25.140683', 'RPC start': '2021-12-31 10:32:25.140688', 'RPC error': '2021-12-31 10:48:35.818902'} (pymilvus.client.grpc_handler:84)
[2021-12-31 11:14:51,851] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1640949291.850369878","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
>
	{'API start': '2021-12-31 11:14:50.105647', 'RPC start': '2021-12-31 11:14:50.105676', 'RPC error': '2021-12-31 11:14:51.851168'} (pymilvus.client.grpc_handler:84)
[2021-12-31 11:14:51,851] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1640949291.850369878","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
>
	{'API start': '2021-12-31 10:48:52.208164', 'RPC start': '2021-12-31 10:48:52.208171', 'RPC error': '2021-12-31 11:14:51.851743'} (pymilvus.client.grpc_handler:84)
[2021-12-31 11:14:51,852] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1640949291.850369878","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
>
	{'API start': '2021-12-31 10:48:52.200610', 'RPC start': '2021-12-31 10:48:52.200629', 'RPC error': '2021-12-31 11:14:51.852007'} (pymilvus.client.grpc_handler:84)
[2021-12-31 11:38:57,096] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640950737.095372718","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 11:38:56.033340', 'RPC start': '2021-12-31 11:38:56.033378', 'RPC error': '2021-12-31 11:38:57.096084'} (pymilvus.client.grpc_handler:84)
[2021-12-31 11:38:57,158] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640950737.095372718","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 11:15:13.827199', 'RPC start': '2021-12-31 11:15:13.827207', 'RPC error': '2021-12-31 11:38:57.158591'} (pymilvus.client.grpc_handler:84)
[2021-12-31 11:38:57,160] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640950737.095372718","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 11:15:13.819777', 'RPC start': '2021-12-31 11:15:13.819790', 'RPC error': '2021-12-31 11:38:57.160625'} (pymilvus.client.grpc_handler:84)
[2021-12-31 12:03:07,866] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640952187.865728114","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 12:03:06.197313', 'RPC start': '2021-12-31 12:03:06.197347', 'RPC error': '2021-12-31 12:03:07.866491'} (pymilvus.client.grpc_handler:84)
[2021-12-31 12:03:07,867] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640952187.865728114","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 11:39:05.835937', 'RPC start': '2021-12-31 11:39:05.835944', 'RPC error': '2021-12-31 12:03:07.867098'} (pymilvus.client.grpc_handler:84)
[2021-12-31 12:03:07,867] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640952187.865728114","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 11:39:05.827354', 'RPC start': '2021-12-31 11:39:05.827386', 'RPC error': '2021-12-31 12:03:07.867367'} (pymilvus.client.grpc_handler:84)
[2021-12-31 12:24:22,983] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640953462.982926721","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 12:24:21.625613', 'RPC start': '2021-12-31 12:24:21.625640', 'RPC error': '2021-12-31 12:24:22.983646'} (pymilvus.client.grpc_handler:84)
[2021-12-31 12:24:22,984] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640953462.982926721","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 12:03:13.513834', 'RPC start': '2021-12-31 12:03:13.513841', 'RPC error': '2021-12-31 12:24:22.984392'} (pymilvus.client.grpc_handler:84)
[2021-12-31 12:24:22,984] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640953462.982926721","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 12:03:13.506393', 'RPC start': '2021-12-31 12:03:13.506409', 'RPC error': '2021-12-31 12:24:22.984650'} (pymilvus.client.grpc_handler:84)
[2021-12-31 12:48:44,526] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640954924.525401123","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 12:48:43.057646', 'RPC start': '2021-12-31 12:48:43.057673', 'RPC error': '2021-12-31 12:48:44.526163'} (pymilvus.client.grpc_handler:84)
[2021-12-31 12:48:44,526] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640954924.525401123","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 12:24:27.813955', 'RPC start': '2021-12-31 12:24:27.813962', 'RPC error': '2021-12-31 12:48:44.526879'} (pymilvus.client.grpc_handler:84)
[2021-12-31 12:48:44,527] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640954924.525401123","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 12:24:27.805468', 'RPC start': '2021-12-31 12:24:27.805489', 'RPC error': '2021-12-31 12:48:44.527137'} (pymilvus.client.grpc_handler:84)
[2021-12-31 13:13:50,011] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1640956430.009793259","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
>
	{'API start': '2021-12-31 13:13:48.175582', 'RPC start': '2021-12-31 13:13:48.175608', 'RPC error': '2021-12-31 13:13:50.011065'} (pymilvus.client.grpc_handler:84)
[2021-12-31 13:13:50,025] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1640956430.009793259","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
>
	{'API start': '2021-12-31 12:48:52.123560', 'RPC start': '2021-12-31 12:48:52.123604', 'RPC error': '2021-12-31 13:13:50.024963'} (pymilvus.client.grpc_handler:84)
[2021-12-31 13:13:50,025] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1640956430.009793259","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
>
	{'API start': '2021-12-31 12:48:52.114714', 'RPC start': '2021-12-31 12:48:52.114731', 'RPC error': '2021-12-31 13:13:50.025677'} (pymilvus.client.grpc_handler:84)
[2021-12-31 13:39:10,882] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640957950.881493695","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 13:39:09.274784', 'RPC start': '2021-12-31 13:39:09.274811', 'RPC error': '2021-12-31 13:39:10.882245'} (pymilvus.client.grpc_handler:84)
[2021-12-31 13:39:10,883] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640957950.881493695","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 13:13:57.015854', 'RPC start': '2021-12-31 13:13:57.015862', 'RPC error': '2021-12-31 13:39:10.882977'} (pymilvus.client.grpc_handler:84)
[2021-12-31 13:39:10,901] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640957950.881493695","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 13:13:57.008024', 'RPC start': '2021-12-31 13:13:57.008056', 'RPC error': '2021-12-31 13:39:10.901482'} (pymilvus.client.grpc_handler:84)
[2021-12-31 14:06:05,852] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640959565.851767204","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 14:06:05.038502', 'RPC start': '2021-12-31 14:06:05.038531', 'RPC error': '2021-12-31 14:06:05.852507'} (pymilvus.client.grpc_handler:84)
[2021-12-31 14:06:05,870] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640959565.851767204","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 13:39:22.995043', 'RPC start': '2021-12-31 13:39:22.995062', 'RPC error': '2021-12-31 14:06:05.870401'} (pymilvus.client.grpc_handler:84)
[2021-12-31 14:06:05,870] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640959565.851767204","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 13:39:22.986301', 'RPC start': '2021-12-31 13:39:22.986317', 'RPC error': '2021-12-31 14:06:05.870725'} (pymilvus.client.grpc_handler:84)
[2021-12-31 14:31:56,387] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] get_index_state
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640961116.386566928","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 14:31:55.374654', 'RPC start': '2021-12-31 14:31:55.374682', 'RPC error': '2021-12-31 14:31:56.387611'} (pymilvus.client.grpc_handler:84)
[2021-12-31 14:31:56,388] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] wait_for_creating_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640961116.386566928","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 14:06:12.806427', 'RPC start': '2021-12-31 14:06:12.806435', 'RPC error': '2021-12-31 14:31:56.388781'} (pymilvus.client.grpc_handler:84)
[2021-12-31 14:31:56,434] [   ERROR] - 
Addr [benchmark-no-clean-qn4c7-1-milvus.qa-milvus.svc.cluster.local:19530] create_index
RPC error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640961116.386566928","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
	{'API start': '2021-12-31 14:06:12.798839', 'RPC start': '2021-12-31 14:06:12.798848', 'RPC error': '2021-12-31 14:31:56.434207'} (pymilvus.client.grpc_handler:84)
[2021-12-31 14:31:56,436] [   ERROR] - <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640961116.386566928","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
> (milvus_benchmark.main:118)
[2021-12-31 14:31:56,441] [   ERROR] - Traceback (most recent call last):
  File "main.py", line 87, in run_suite
    runner.prepare(**cases[0])
  File "/src/milvus_benchmark/runners/locust.py", line 436, in prepare
    self.milvus.create_index(index_field_name, case_param["index_type"], case_param["metric_type"], index_param=case_param["index_param"])
  File "/src/milvus_benchmark/client.py", line 49, in wrapper
    result = func(*args, **kwargs)
  File "/src/milvus_benchmark/client.py", line 283, in create_index
    res = self._milvus.create_index(tmp_collection_name, field_name, index_params, _async=_async)
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/stub.py", line 54, in handler
    raise e
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/stub.py", line 42, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/stub.py", line 810, in create_index
    return handler.create_index(collection_name, field_name, params, timeout, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 85, in handler
    raise e
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 67, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 771, in create_index
    field_name=field_name, timeout=timeout)
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 85, in handler
    raise e
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 67, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 819, in wait_for_creating_index
    state, fail_reason = self.get_index_state(collection_name, field_name, timeout)
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 85, in handler
    raise e
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 67, in handler
    return func(self, *args, **kwargs)
  File "/usr/local/lib/python3.6/site-packages/pymilvus/client/grpc_handler.py", line 808, in get_index_state
    response = rf.result()
  File "/usr/local/lib/python3.6/site-packages/grpc/_channel.py", line 744, in result
    raise self
grpc._channel._MultiThreadedRendezvous: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1640961116.386566928","description":"Error received from peer ipv4:10.96.20.244:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
>
 (milvus_benchmark.main:119)
[2021-12-31 14:31:56,442] [   DEBUG] - {'_version': '0.1', '_type': 'metric', 'run_id': 1640939608, 'mode': 'local', 'server': <milvus_benchmark.metrics.models.server.Server object at 0x7f20838b7f28>, 'hardware': <milvus_benchmark.metrics.models.hardware.Hardware object at 0x7f20838b7d68>, 'env': <milvus_benchmark.metrics.models.env.Env object at 0x7f20838b7cc0>, 'status': 'RUN_FAILED', 'err_message': '', 'collection': {'dimension': 128, 'metric_type': 'l2', 'dataset_name': 'sift_100m_128_l2', 'collection_size': 100000000, 'other_fields': None, 'ni_per': 50000, 'shards_num': None}, 'index': {'index_type': 'ivf_sq8', 'index_param': {'nlist': 2048}}, 'search': None, 'run_params': {'task': {'types': [{'type': 'query', 'weight': 8, 'params': {'top_k': 10, 'nq': 10, 'search_param': {'nprobe': 16}}}, {'type': 'load', 'weight': 1}, {'type': 'get', 'weight': 8, 'params': {'ids_length': 10}}, {'type': 'scene_test', 'weight': 2}], 'connection_num': 1, 'clients_num': 20, 'spawn_rate': 2, 'during_time': 302400}, 'connection_type': 'single'}, 'metrics': {'type': 'locust_random_performance', 'value': {}}, 'datetime': '2021-12-31 08:33:28.521756', 'type': 'metric'} (milvus_benchmark.metric.api:29)
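
The ERROR entries above arrive in bursts roughly 21 to 27 minutes apart, which fits a server that restarts repeatedly while the client keeps retrying create_index. A minimal sketch for measuring those gaps from a saved client log is below. The helper names, the 60-second burst window, and the trimmed sample are assumptions for illustration; only the timestamps come from the log above.

```python
import re
from datetime import datetime

# Matches the benchmark client's log prefix for ERROR lines,
# e.g. "[2021-12-31 10:48:35,818] [   ERROR] -"
LINE_RE = re.compile(r"^\[(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}),\d{3}\] \[\s*ERROR\]")


def failure_times(log_text, burst_window_s=60):
    """Return one timestamp per burst of ERROR lines.

    ERROR lines closer than burst_window_s to the previous burst are merged,
    because one dropped connection is reported by several nested RPC wrappers.
    """
    times = []
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue
        ts = datetime.strptime(match.group(1), "%Y-%m-%d %H:%M:%S")
        if not times or (ts - times[-1]).total_seconds() > burst_window_s:
            times.append(ts)
    return times


def gaps_in_minutes(times):
    """Minutes between consecutive failure bursts, rounded to one decimal."""
    return [round((b - a).total_seconds() / 60, 1) for a, b in zip(times, times[1:])]


# Trimmed sample: only the prefixes of a few ERROR lines from the log above.
sample = """\
[2021-12-31 10:48:35,818] [   ERROR] -
[2021-12-31 10:48:35,818] [   ERROR] -
[2021-12-31 11:14:51,851] [   ERROR] -
[2021-12-31 11:38:57,096] [   ERROR] -
[2021-12-31 11:38:57,158] [   ERROR] -
"""
print(gaps_in_minutes(failure_times(sample)))
```

Running the same functions over the full client log gives one gap per failed create_index attempt, which can then be compared with the restart times of the standalone pod.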

Expected Behavior

argo task: benchmark-no-clean-qn4c7

test yaml:
client-configmap:client-random-locust-100m-ddl-r8-w2
server-configmap:server-single-32c128m

server:

NAME                                                              READY   STATUS      RESTARTS   AGE     IP             NODE                      NOMINATED NODE   READINESS GATES
benchmark-no-clean-qn4c7-1-etcd-0                                 1/1     Running     0          3d17h   10.97.17.80    qa-node014.zilliz.local   <none>           <none>
benchmark-no-clean-qn4c7-1-milvus-standalone-794bd9b55f-p8ldk     1/1     Running     211        3d17h   10.97.20.129   qa-node018.zilliz.local   <none>           <none>
benchmark-no-clean-qn4c7-1-minio-777b849b79-nvx7g                 1/1     Running     0          3d17h   10.97.12.136   qa-node015.zilliz.local   <none>           <none>
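
The RESTARTS column above shows 211 restarts for the standalone pod. One way to see why the container was last terminated is to read `lastState.terminated.reason` from the pod status, for example from the output of `kubectl get pod <pod-name> -n <namespace> -o json`. The sketch below parses such a document. The sample JSON, including the container name and the termination reason, is hypothetical and is not taken from this cluster.

```python
import json


def last_termination_reasons(pod_json):
    """Map container name -> (restart count, reason of last termination or None)."""
    pod = json.loads(pod_json)
    result = {}
    for status in pod.get("status", {}).get("containerStatuses", []):
        terminated = status.get("lastState", {}).get("terminated") or {}
        result[status["name"]] = (status.get("restartCount", 0), terminated.get("reason"))
    return result


# Hypothetical, heavily trimmed pod document used only to exercise the parser.
sample = json.dumps({
    "status": {
        "containerStatuses": [
            {
                "name": "standalone",
                "restartCount": 211,
                "lastState": {"terminated": {"reason": "OOMKilled", "exitCode": 137}},
            }
        ]
    }
})
print(last_termination_reasons(sample))
```

A reason of `OOMKilled` would point at the memory limit, while `Error` with a non-zero exit code would point at a crash inside the process.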

Steps To Reproduce

1. Create a collection.
2. Build an IVF_SQ8 index.
3. Insert 100 million vectors.
4. Flush the collection.
5. Build the index again with the same parameters. <- raises the error
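
The steps above can be sketched with the pymilvus ORM API as follows. This is an approximation rather than the benchmark's own code: it assumes a recent pymilvus release (the 2.0.0rc9 build used here may expose a different interface), a hypothetical vector field name `float_vector`, and random vectors in place of the SIFT dataset. The collection name, index parameters, and batch size follow the client log and test YAML.

```python
import random


def batch_ranges(total, ni_per):
    """Yield (start_id, end_id) pairs covering [0, total) in chunks of ni_per."""
    for start in range(0, total, ni_per):
        yield start, min(start + ni_per, total)


# Index parameters as reported in the client log.
INDEX_PARAMS = {"index_type": "IVF_SQ8", "metric_type": "L2", "params": {"nlist": 2048}}


def reproduce(host, port="19530", name="sift_100m_128_l2",
              total=100_000_000, ni_per=50_000, dim=128):
    # Imported lazily so batch_ranges can be used without pymilvus installed.
    from pymilvus import Collection, CollectionSchema, DataType, FieldSchema, connections

    connections.connect(alias="default", host=host, port=port)
    schema = CollectionSchema([
        FieldSchema(name="id", dtype=DataType.INT64, is_primary=True, auto_id=False),
        # "float_vector" is an assumed field name, not taken from the issue.
        FieldSchema(name="float_vector", dtype=DataType.FLOAT_VECTOR, dim=dim),
    ])
    collection = Collection(name, schema)                    # step 1: create collection
    collection.create_index("float_vector", INDEX_PARAMS)    # step 2: build index
    for start, end in batch_ranges(total, ni_per):           # step 3: insert in batches
        ids = list(range(start, end))
        vectors = [[random.random() for _ in range(dim)] for _ in ids]
        collection.insert([ids, vectors])
    collection.flush()                                       # step 4: flush
    collection.create_index("float_vector", INDEX_PARAMS)    # step 5: build index again
```

Calling `reproduce("<milvus-host>")` against a test deployment runs all five steps. With `total=100_000_000` and `ni_per=50_000` the last batch covers ids 99950000 to 100000000, matching the final "Start id / end id" line in the client log.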

Anything else?

client-random-locust-100m-ddl-r8-w2:

locust_random_performance:
      collections:
        -
          collection_name: sift_100m_128_l2
          ni_per: 50000
          build_index: true
          index_type: ivf_sq8
          index_param:
            nlist: 2048
          task:
            types:
              -
                type: query
                weight: 8
                params:
                  top_k: 10
                  nq: 10
                  search_param:
                    nprobe: 16
              -
                type: load
                weight: 1
              -
                type: get
                weight: 8
                params:
                  ids_length: 10
              -
                type: scene_test
                weight: 2
            connection_num: 1
            clients_num: 20
            spawn_rate: 2
            during_time: 302400
@wangting0128 added the kind/bug, priority/critical-urgent, needs-triage, and test/benchmark labels on Jan 4, 2022
@yanliang567
Contributor

It looks like the standalone pod ran out of memory (OOM), but I don't know why, since it used to work with 128 GB of memory.
/assign @czs007

/unassign
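
For context on the memory question above, here is a rough size estimate of the vector payload alone. It assumes float32 components for the raw vectors and one byte per component for IVF_SQ8 codes. It does not account for ids, coarse centroids, write buffers, or any other memory the standalone process holds while building the index, so it is only a lower bound and not an explanation of the crash.

```python
def vector_bytes(num_vectors, dim, bytes_per_component):
    """Size in bytes of num_vectors vectors with dim components each."""
    return num_vectors * dim * bytes_per_component


NUM_VECTORS = 100_000_000  # collection_size from the test configuration
DIM = 128                  # dimension of sift_100m_128_l2
GIB = 1024 ** 3

raw = vector_bytes(NUM_VECTORS, DIM, 4)  # float32 components
sq8 = vector_bytes(NUM_VECTORS, DIM, 1)  # 8-bit scalar-quantized codes

print(f"raw float32 vectors: {raw / GIB:.1f} GiB")  # 47.7 GiB
print(f"SQ8 codes:           {sq8 / GIB:.1f} GiB")  # 11.9 GiB
```

Under these assumptions the raw vectors plus the quantized codes occupy about 60 GiB, so the rest of a 128 GB budget would have to cover everything else the process holds during the build.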

@sre-ci-robot assigned czs007 and unassigned yanliang567 on Jan 4, 2022
@yanliang567 added this to the 2.0.0-GA milestone on Jan 4, 2022
@yanliang567 added the triage/accepted label and removed the needs-triage label on Jan 4, 2022
@xiaocai2333
Contributor

/assign

@xiaocai2333
Contributor

MinIO crashed.

@xiaocai2333
Contributor

/unassign

@wangting0128
Contributor Author

argo task: benchmark-tag-jd2gd

test yaml:
client-configmap:client-random-locust-100m-ddl-r8-w2
server-configmap:server-single-32c128m

server:

NAME                                                         READY   STATUS      RESTARTS   AGE    IP             NODE                      NOMINATED NODE   READINESS GATES
benchmark-tag-jd2gd-1-etcd-0                                 1/1     Running     0          12h    10.97.17.218   qa-node014.zilliz.local   <none>           <none>
benchmark-tag-jd2gd-1-milvus-standalone-f78964988-h2l6w      1/1     Running     4          12h    10.97.20.23    qa-node018.zilliz.local   <none>           <none>
benchmark-tag-jd2gd-1-minio-544d765c7b-c6684                 1/1     Running     0          12h    10.97.12.2     qa-node015.zilliz.local   <none>           <none>

client pod: benchmark-tag-jd2gd-2832215959

client log:

[2022-01-13 15:56:05,079] [   DEBUG] - Milvus get run in 2.3422s (milvus_benchmark.client:53)
[2022-01-13 15:56:13,342] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1642089373.340958738","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
> (pymilvus.decorators:75)
[2022-01-13 15:56:13,343] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Connection reset by peer"
	debug_error_string = "{"created":"@1642089373.340958738","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Connection reset by peer","grpc_status":14}"
> (pymilvus.decorators:75)
[2022-01-13 15:56:52,802] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1642089412.801394354","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
> (pymilvus.decorators:75)
[2022-01-13 15:56:52,803] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1642089412.801394354","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
> (pymilvus.decorators:75)
[2022-01-13 15:56:52,805] [   DEBUG] - Ramping to {"MyUser": 4} (4 total users) (locust.runners:326)
[2022-01-13 15:56:52,805] [   DEBUG] - Spawning additional {"MyUser": 2} ({"MyUser": 2} already running)... (locust.runners:202)
[2022-01-13 15:56:52,807] [   DEBUG] - 4 users spawned (locust.runners:212)
[2022-01-13 15:56:52,807] [   DEBUG] - All users of class MyUser spawned (locust.runners:213)
[2022-01-13 15:56:52,808] [   DEBUG] - 0 users have been stopped, 4 still running (locust.runners:271)
[2022-01-13 15:56:52,811] [    INFO] -  Name                                                                              # reqs      # fails  |     Avg     Min     Max  Median  |   req/s failures/s (locust.stats_logger:725)
[2022-01-13 15:56:52,812] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:727)
[2022-01-13 15:56:52,814] [    INFO] -  grpc get                                                                               2     0(0.00%)  |    1764    1184    2344    1200  |    0.00    0.00 (locust.stats_logger:730)
[2022-01-13 15:56:52,814] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:731)
[2022-01-13 15:56:52,815] [    INFO] -  Aggregated                                                                             2     0(0.00%)  |    1764    1184    2344    1200  |    0.00    0.00 (locust.stats_logger:732)
[2022-01-13 15:56:52,816] [    INFO] -  (locust.stats_logger:733)
[2022-01-13 15:56:52,817] [   DEBUG] - [scene_test] Start scene test : scene_test_2619_675445 (milvus_benchmark.client:511)
[2022-01-13 15:57:20,631] [    INFO] - Create collection: <scene_test_2619_675445> successfully (milvus_benchmark.client:149)
[2022-01-13 15:57:20,632] [   DEBUG] - Milvus create_collection run in 27.8144s (milvus_benchmark.client:53)
[2022-01-13 15:58:20,632] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642089500.632006021","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 15:58:20,634] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642089500.632006021","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:02:20,635] [   DEBUG] - [scene_test] Start scene test : scene_test_8922_605798 (milvus_benchmark.client:511)
[2022-01-13 16:02:20,640] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642089740.634045710","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:02:20,647] [    INFO] - Create collection: <scene_test_8922_605798> successfully (milvus_benchmark.client:149)
[2022-01-13 16:02:20,649] [   DEBUG] - Milvus create_collection run in 0.0103s (milvus_benchmark.client:53)
[2022-01-13 16:02:20,651] [   DEBUG] - Ramping to {"MyUser": 6} (6 total users) (locust.runners:326)
[2022-01-13 16:02:20,653] [   DEBUG] - Spawning additional {"MyUser": 2} ({"MyUser": 4} already running)... (locust.runners:202)
[2022-01-13 16:02:20,654] [   DEBUG] - 6 users spawned (locust.runners:212)
[2022-01-13 16:02:20,655] [   DEBUG] - All users of class MyUser spawned (locust.runners:213)
[2022-01-13 16:02:20,656] [   DEBUG] - 0 users have been stopped, 6 still running (locust.runners:271)
[2022-01-13 16:02:20,658] [    INFO] -  Name                                                                              # reqs      # fails  |     Avg     Min     Max  Median  |   req/s failures/s (locust.stats_logger:725)
[2022-01-13 16:02:20,659] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:727)
[2022-01-13 16:02:20,661] [    INFO] -  grpc get                                                                               2     0(0.00%)  |    1764    1184    2344    1200  |    0.00    0.00 (locust.stats_logger:730)
[2022-01-13 16:02:20,662] [    INFO] -  grpc query                                                                             1   1(100.00%)  |  135556  135556  135556  135556  |    0.00    0.00 (locust.stats_logger:730)
[2022-01-13 16:02:20,663] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:731)
[2022-01-13 16:02:20,664] [    INFO] -  Aggregated                                                                             3    1(33.33%)  |   46361    1184  135556    2300  |    0.00    0.00 (locust.stats_logger:732)
[2022-01-13 16:02:20,665] [    INFO] -  (locust.stats_logger:733)
[2022-01-13 16:02:20,669] [   DEBUG] - [scene_test] Start scene test : scene_test_4663_808287 (milvus_benchmark.client:511)
[2022-01-13 16:02:20,674] [   DEBUG] - Milvus get_info run in 0.0074s (milvus_benchmark.client:53)
[2022-01-13 16:02:20,675] [   DEBUG] - [scene_test] Start insert : scene_test_2619_675445 (milvus_benchmark.client:518)
[2022-01-13 16:03:20,675] [   DEBUG] - Milvus load_collection run in 60.0071s (milvus_benchmark.client:53)
[2022-01-13 16:03:20,676] [    INFO] - Create collection: <scene_test_4663_808287> successfully (milvus_benchmark.client:149)
[2022-01-13 16:03:20,677] [   DEBUG] - Milvus create_collection run in 60.0069s (milvus_benchmark.client:53)
[2022-01-13 16:03:20,678] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642089800.675006682","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:03:20,679] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642089800.675006682","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:03:20,758] [   DEBUG] - Milvus insert run in 60.0822s (milvus_benchmark.client:53)
[2022-01-13 16:03:20,759] [   DEBUG] - [scene_test] Start flush : scene_test_2619_675445 (milvus_benchmark.client:520)
[2022-01-13 16:04:20,767] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642089860.767011674","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:04:20,768] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642089860.767011674","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:04:20,771] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642089860.769764642","description":"Deadline Exceeded","file":"src/core/ext/filters/deadline/deadline_filter.cc","file_line":81,"grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:04:20,771] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642089860.769764642","description":"Deadline Exceeded","file":"src/core/ext/filters/deadline/deadline_filter.cc","file_line":81,"grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:04:20,773] [   DEBUG] - Milvus flush run in 60.0135s (milvus_benchmark.client:53)
[2022-01-13 16:04:20,778] [   DEBUG] - Row count: 3000 in collection: <scene_test_2619_675445> (milvus_benchmark.client:421)
[2022-01-13 16:04:20,778] [   DEBUG] - [scene_test] Start create index : scene_test_2619_675445 (milvus_benchmark.client:525)
[2022-01-13 16:04:20,779] [    INFO] - Building index start, collection_name: scene_test_2619_675445, index_type: IVF_SQ8, metric_type: L2 (milvus_benchmark.client:274)
[2022-01-13 16:04:20,779] [    INFO] - {'nlist': 2048} (milvus_benchmark.client:276)
[2022-01-13 16:04:20,779] [   DEBUG] - collection: scene_test_2619_675445 Index params: {'index_type': 'IVF_SQ8', 'metric_type': 'L2', 'params': {'nlist': 2048}} (milvus_benchmark.client:282)
[2022-01-13 16:04:33,050] [   DEBUG] - Milvus get run in 12.276s (milvus_benchmark.client:53)
[2022-01-13 16:04:33,051] [   DEBUG] - Ramping to {"MyUser": 8} (8 total users) (locust.runners:326)
[2022-01-13 16:04:33,052] [   DEBUG] - Spawning additional {"MyUser": 2} ({"MyUser": 6} already running)... (locust.runners:202)
[2022-01-13 16:04:33,052] [   DEBUG] - 8 users spawned (locust.runners:212)
[2022-01-13 16:04:33,053] [   DEBUG] - All users of class MyUser spawned (locust.runners:213)
[2022-01-13 16:04:33,053] [   DEBUG] - 0 users have been stopped, 8 still running (locust.runners:271)
[2022-01-13 16:04:33,054] [    INFO] -  Name                                                                              # reqs      # fails  |     Avg     Min     Max  Median  |   req/s failures/s (locust.stats_logger:725)
[2022-01-13 16:04:33,054] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:727)
[2022-01-13 16:04:33,055] [    INFO] -  grpc get                                                                               3     0(0.00%)  |    5268    1184   12276    2300  |    0.00    0.00 (locust.stats_logger:730)
[2022-01-13 16:04:33,055] [    INFO] -  grpc load_collection                                                                   1     0(0.00%)  |   60008   60008   60008   60008  |    0.00    0.00 (locust.stats_logger:730)
[2022-01-13 16:04:33,056] [    INFO] -  grpc query                                                                             4   4(100.00%)  |  172833   60088  435597   60088  |    0.00    0.00 (locust.stats_logger:730)
[2022-01-13 16:04:33,056] [    INFO] - ---------------------------------------------------------------------------------------------------------------------------------------------------------------- (locust.stats_logger:731)
[2022-01-13 16:04:33,056] [    INFO] -  Aggregated                                                                             8    4(50.00%)  |   95893    1184  435597   60000  |    0.00    0.00 (locust.stats_logger:732)
[2022-01-13 16:04:33,057] [    INFO] -  (locust.stats_logger:733)
[2022-01-13 16:04:33,058] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1642089873.047474571","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
> (pymilvus.decorators:75)
[2022-01-13 16:04:33,058] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.UNAVAILABLE
	details = "Socket closed"
	debug_error_string = "{"created":"@1642089873.047474571","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Socket closed","grpc_status":14}"
> (pymilvus.decorators:75)
[2022-01-13 16:06:30,391] [   DEBUG] - [scene_test] Start scene test : scene_test_8837_571607 (milvus_benchmark.client:511)
[2022-01-13 16:06:30,397] [   DEBUG] - Milvus get_info run in 117.3457s (milvus_benchmark.client:53)
[2022-01-13 16:06:30,398] [   DEBUG] - [scene_test] Start insert : scene_test_8922_605798 (milvus_benchmark.client:518)
[2022-01-13 16:06:30,398] [   DEBUG] - Milvus get_info run in 117.3408s (milvus_benchmark.client:53)
[2022-01-13 16:06:30,398] [   DEBUG] - [scene_test] Start insert : scene_test_4663_808287 (milvus_benchmark.client:518)
[2022-01-13 16:11:30,401] [    INFO] - Create collection: <scene_test_8837_571607> successfully (milvus_benchmark.client:149)
[2022-01-13 16:11:30,403] [   DEBUG] - Milvus create_collection run in 300.0104s (milvus_benchmark.client:53)
[2022-01-13 16:11:30,405] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642090290.398277523","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:12:30,402] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642090350.402017813","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:12:30,404] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642090350.402017813","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:12:30,409] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642090350.405990065","description":"Deadline Exceeded","file":"src/core/ext/filters/deadline/deadline_filter.cc","file_line":81,"grpc_status":4}"
> (pymilvus.decorators:75)
[2022-01-13 16:17:30,488] [   DEBUG] - Milvus insert run in 660.0901s (milvus_benchmark.client:53)
[2022-01-13 16:17:30,490] [   DEBUG] - [scene_test] Start flush : scene_test_8922_605798 (milvus_benchmark.client:520)
[2022-01-13 16:17:30,491] [   DEBUG] - Milvus insert run in 660.0921s (milvus_benchmark.client:53)
[2022-01-13 16:17:30,492] [   DEBUG] - [scene_test] Start flush : scene_test_4663_808287 (milvus_benchmark.client:520)
[2022-01-13 16:17:30,492] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:
	status = StatusCode.DEADLINE_EXCEEDED
	details = "Deadline Exceeded"
	debug_error_string = "{"created":"@1642090650.487085723","description":"Error received from peer ipv4:10.96.250.111:19530","file":"src/core/lib/surface/call.cc","file_line":1068,"grpc_message":"Deadline Exceeded","grpc_status":4}"
> (pymilvus.decorators:75)
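
For triage, the failures in a client log like the one above can be tallied by gRPC status code. The following is a minimal sketch, assuming the log is available as plain text; `tally_status_codes` and the shortened `sample` are hypothetical and are not part of milvus_benchmark or pymilvus.

```python
import re
from collections import Counter

# Matches the "status = StatusCode.X" line printed for each failed RPC in the log.
STATUS_RE = re.compile(r"status = StatusCode\.(\w+)")


def tally_status_codes(log_text: str) -> Counter:
    """Count how often each gRPC status code appears in the log text."""
    return Counter(STATUS_RE.findall(log_text))


# Shortened excerpt in the same shape as the client log (hypothetical sample).
sample = (
    "[2022-01-13 15:56:13,342] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:\n"
    "\tstatus = StatusCode.UNAVAILABLE\n"
    '\tdetails = "Connection reset by peer"\n'
    "[2022-01-13 15:58:20,632] [   ERROR] - Error: <_MultiThreadedRendezvous of RPC that terminated with:\n"
    "\tstatus = StatusCode.DEADLINE_EXCEEDED\n"
    '\tdetails = "Deadline Exceeded"\n'
)

counts = tally_status_codes(sample)
for code, n in counts.most_common():
    print(code, n)
```

Many entries in the log above appear twice with the same `created` timestamp, so raw counts from the full log overstate the number of failed RPCs.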

@xiaocai2333
Contributor

/assign

@xiaocai2333
Contributor

QueryNode offline.

[2022/01/13 15:57:39.837 +00:00] [DEBUG] [session_util.go:379] ["watch services"] ["delete kv"="key:\"by-dev/meta/session/querynode-14\" create_revision:14802 mod_revision:14802 version:1 value:\"{\\\"ServerID\\\":14,\\\"ServerName\\\":\\\"querynode\\\",\\\"Address\\\":\\\"10.97.20.23:21123\\\",\\\"TriggerKill\\\":true}\" lease:313420547285115962 "]
[2022/01/13 15:57:39.837 +00:00] [DEBUG] [cluster.go:617] ["stopNode: queryNode offline"] [nodeID=14]
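
The session value in the "delete kv" entry is a JSON payload. The following is a minimal sketch of decoding it, assuming the log's backslash escaping has already been removed by hand; the variable names are illustrative only.

```python
import json

# Session payload from the "delete kv" entry above, with the log's escaping removed.
raw_value = (
    '{"ServerID":14,"ServerName":"querynode",'
    '"Address":"10.97.20.23:21123","TriggerKill":true}'
)

session = json.loads(raw_value)
host, port = session["Address"].rsplit(":", 1)

# 10.97.20.23 is the IP listed for the milvus-standalone pod in the pod table earlier in this thread.
STANDALONE_POD_IP = "10.97.20.23"

print(session["ServerName"], session["ServerID"], host == STANDALONE_POD_IP)
```

The host part of the address matches the standalone pod's IP, which is consistent with the deleted session belonging to a QueryNode inside that pod.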

@xiaofan-luan
Collaborator

@xiaocai2333 why does a QueryNode going offline affect getIndexState?

@xiaocai2333
Contributor

> @xiaocai2333 why does a QueryNode going offline affect getIndexState?

@xiaofan-luan The latest failure pasted by @wangting0128 is caused by a search timeout, not a GetIndexState timeout.

@xiaocai2333
Contributor

The real cause of the timeout is the same as #14077. The "queryNode offline" entry refers to the previous QueryNode, which went offline after the restart.

@xiaocai2333
Contributor

There is another problem here: the pod's address does not change after a restart, so QueryCoord can use the previous RootCoord's session to access the new RootCoord, even if the new RootCoord has not registered successfully.
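
To illustrate why an unchanged address hides a restart, here is a toy model. It is not Milvus's session code and not the change made in any linked PR; the `Session` class, both check functions, and the address and ID values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Session:
    server_id: int
    address: str


def reachable_by_address(cached: Session, live_address: str) -> bool:
    # Address-only check: a restarted process at the same address looks identical.
    return cached.address == live_address


def same_registered_instance(cached: Session, registered: Optional[Session]) -> bool:
    # Identity check: the peer must have a registered session equal to the cached one.
    return registered is not None and registered == cached


# Cached session of the process that was running before the restart (hypothetical values).
old_session = Session(server_id=1, address="10.0.0.1:1234")

# After the restart the address is unchanged and the new process has not registered yet.
live_address = "10.0.0.1:1234"
registered_session = None

print(reachable_by_address(old_session, live_address))
print(same_registered_instance(old_session, registered_session))
```

In this model the address-only check accepts the stale session, while the identity check rejects it until the new process registers.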

@xiaofan-luan
Collaborator

@xiaocai2333 is there any fix that needs to be done for this issue?

@xiaofan-luan xiaofan-luan modified the milestones: 2.0.0-GA, 2.0.1 Jan 25, 2022
@xiaocai2333
Contributor

The cause of the timeout is the same as #14077, and the other problem has been fixed by #15261.

@xiaocai2333
Contributor

Maybe this issue can be closed. @wangting0128

@xiaocai2333 xiaocai2333 removed their assignment Jan 25, 2022
@xiaocai2333 xiaocai2333 assigned wangting0128 and unassigned czs007 Jan 25, 2022
@wangting0128
Contributor Author

not reproduced
