evaluate_functional_correctness can't run #18
If you are on Windows, this problem can be solved by declaring a global variable for unsafe_execute just above the function. After that you will hit an AttributeError for unsafe_execute, which can be solved by installing the multiprocess library and replacing "multiprocessing" with multiprocess. Another error on Windows is pass@k = 0 instead of the expected 0.4999; this can be solved by removing the timeout in execution.py for the sample sanity check only. For testing on generated samples you still need a timeout, but you have to find an alternative to the setitimer() function, which causes the pass@k = 0 issue.
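The first fix above amounts to hoisting the nested worker function to module level so that spawn-based multiprocessing can pickle it. A minimal sketch, assuming a simplified signature (the real unsafe_execute in human_eval/execution.py takes the problem, completion, timeout, and a shared result list; the names below are illustrative only):

```python
import pickle
import multiprocessing


def unsafe_execute(result):
    # Module-level function: multiprocessing's spawn start method pickles
    # a reference to the Process target by qualified name, which only
    # works for functions importable at module scope.
    # (Hypothetical signature -- the real one in execution.py differs.)
    result.append("passed")


# A module-level function round-trips through pickle; the nested
# unsafe_execute defined inside check_correctness does not, which is
# exactly the AttributeError reported in this issue.
payload = pickle.dumps(unsafe_execute)
restored = pickle.loads(payload)
```

Because pickle serializes functions by qualified name, `restored` is the very same function object, so a spawned child process can look it up after importing the module.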
When I run on a Mac, I get the same error. Can anyone help? Thanks a lot.
Install multiprocess and replace the multiprocessing used in human_eval/execution.py with it; then it works.
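The swap described above can be done with a guarded import near the top of execution.py. This is a sketch, not the repo's actual code: multiprocess is a third-party fork of the stdlib module that serializes with dill, so it can pickle locally defined functions.

```python
try:
    # pip install multiprocess -- drop-in fork of multiprocessing that
    # uses dill for serialization, so nested functions can be pickled
    import multiprocess as multiprocessing
except ImportError:
    # fall back to the stdlib module (works where fork is the default)
    import multiprocessing

# The rest of the file can keep using multiprocessing.Process,
# multiprocessing.Manager, etc. unchanged, since the APIs match.
```

With this in place, `multiprocessing.Process(target=unsafe_execute)` works even when unsafe_execute is defined inside check_correctness.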
So sad!!! I also have the same problem. Did you solve it, my friend? Thanks.
After importing multiprocess and changing from multiprocessing to multiprocess, what worked for me was to use a Linux machine instead of a Windows machine.
On a Mac with Python 3.10, installing
For Windows users, here are more things that need to be done:
This works well for me on Windows 11. Thank you so much!
Thanks all! I have resolved the issue with the given advice.
I created a conda environment with Python 3.7 using the exact same command in the doc. Then I used OpenAI's text-davinci-002 to generate a samples.jsonl file with 3 results for each problem.
Calling evaluate_functional_correctness samples.jsonl, I got the error message below. I also tried to evaluate the example results with evaluate_functional_correctness data/example_samples.jsonl --problem_file=data/example_problem.jsonl and got the same error. I wonder how to fix it?
Error message:
Reading samples...
6it [00:00, 7427.93it/s]
Running test suites...
0%| | 0/6 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/opt/miniconda3/envs/codex/bin/evaluate_functional_correctness", line 33, in <module>
sys.exit(load_entry_point('human-eval', 'console_scripts', 'evaluate_functional_correctness')())
File "/opt/miniconda3/envs/codex/bin/evaluate_functional_correctness", line 25, in importlib_load_entry_point
return next(matches).load()
File "/opt/miniconda3/envs/codex/lib/python3.8/importlib/metadata.py", line 77, in load
module = import_module(match.group('module'))
File "/opt/miniconda3/envs/codex/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
File "/Users/boyuanchen/Desktop/human-eval/human_eval/evaluate_functional_correctness.py", line 30, in <module>
sys.exit(main())
File "/Users/boyuanchen/Desktop/human-eval/human_eval/evaluate_functional_correctness.py", line 27, in main
fire.Fire(entry_point)
File "/opt/miniconda3/envs/codex/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/opt/miniconda3/envs/codex/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/opt/miniconda3/envs/codex/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/Users/boyuanchen/Desktop/human-eval/human_eval/evaluate_functional_correctness.py", line 22, in entry_point
results = evaluate_functional_correctness(sample_file, k, n_workers, timeout, problem_file)
File "/Users/boyuanchen/Desktop/human-eval/human_eval/evaluation.py", line 75, in evaluate_functional_correctness
result = future.result()
File "/opt/miniconda3/envs/codex/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "/opt/miniconda3/envs/codex/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/opt/miniconda3/envs/codex/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/Users/boyuanchen/Desktop/human-eval/human_eval/execution.py", line 77, in check_correctness
p.start()
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'check_correctness.<locals>.unsafe_execute'
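The last frame shows the root cause: with the spawn start method (the default on Windows, and on macOS since Python 3.8), multiprocessing must pickle the Process target to send it to the child, and a function defined inside check_correctness cannot be pickled. A minimal reproduction of the failure, independent of human-eval:

```python
import pickle


def check_correctness():
    def unsafe_execute():
        # nested function, like the worker in human_eval/execution.py
        pass
    return unsafe_execute


fn = check_correctness()
try:
    pickle.dumps(fn)
    picklable = True
except AttributeError:
    # AttributeError: Can't pickle local object
    # 'check_correctness.<locals>.unsafe_execute'
    picklable = False
```

This is why the error never appears on Linux, where the default start method is fork and no pickling of the target is needed, and why the fixes in this thread either move the function to module level or switch to the dill-based multiprocess library.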