evaluate_functional_correctness can't run #18

BoyuanJackChen · 2022-11-28T16:26:21Z

I created a conda environment with Python 3.7 using the exact command from the docs. Then I used OpenAI's text-davinci-002 to generate a samples.jsonl file with 3 results for each problem.

When I call evaluate_functional_correctness samples.jsonl, I get the error message below. I also tried evaluating the example results with evaluate_functional_correctness data/example_samples.jsonl --problem_file=data/example_problem.jsonl and got the same error.

How can I fix this?

Error message:
Reading samples...
6it [00:00, 7427.93it/s]
Running test suites...
0%| | 0/6 [00:00<?, ?it/s]
Traceback (most recent call last):
File "/opt/miniconda3/envs/codex/bin/evaluate_functional_correctness", line 33, in <module>
sys.exit(load_entry_point('human-eval', 'console_scripts', 'evaluate_functional_correctness')())
File "/opt/miniconda3/envs/codex/bin/evaluate_functional_correctness", line 25, in importlib_load_entry_point
return next(matches).load()
File "/opt/miniconda3/envs/codex/lib/python3.8/importlib/metadata.py", line 77, in load
module = import_module(match.group('module'))
File "/opt/miniconda3/envs/codex/lib/python3.8/importlib/__init__.py", line 127, in import_module
return _bootstrap._gcd_import(name[level:], package, level)
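File "<frozen importlib._bootstrap>", line 1014, in _gcd_import
File "<frozen importlib._bootstrap>", line 991, in _find_and_load
File "<frozen importlib._bootstrap>", line 975, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 671, in _load_unlocked
File "<frozen importlib._bootstrap_external>", line 843, in exec_module
File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed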
File "/Users/boyuanchen/Desktop/human-eval/human_eval/evaluate_functional_correctness.py", line 30, in <module>
sys.exit(main())
File "/Users/boyuanchen/Desktop/human-eval/human_eval/evaluate_functional_correctness.py", line 27, in main
fire.Fire(entry_point)
File "/opt/miniconda3/envs/codex/lib/python3.8/site-packages/fire/core.py", line 141, in Fire
component_trace = _Fire(component, args, parsed_flag_args, context, name)
File "/opt/miniconda3/envs/codex/lib/python3.8/site-packages/fire/core.py", line 466, in _Fire
component, remaining_args = _CallAndUpdateTrace(
File "/opt/miniconda3/envs/codex/lib/python3.8/site-packages/fire/core.py", line 681, in _CallAndUpdateTrace
component = fn(*varargs, **kwargs)
File "/Users/boyuanchen/Desktop/human-eval/human_eval/evaluate_functional_correctness.py", line 22, in entry_point
results = evaluate_functional_correctness(sample_file, k, n_workers, timeout, problem_file)
File "/Users/boyuanchen/Desktop/human-eval/human_eval/evaluation.py", line 75, in evaluate_functional_correctness
result = future.result()
File "/opt/miniconda3/envs/codex/lib/python3.8/concurrent/futures/_base.py", line 437, in result
return self.__get_result()
File "/opt/miniconda3/envs/codex/lib/python3.8/concurrent/futures/_base.py", line 389, in __get_result
raise self._exception
File "/opt/miniconda3/envs/codex/lib/python3.8/concurrent/futures/thread.py", line 57, in run
result = self.fn(*self.args, **self.kwargs)
File "/Users/boyuanchen/Desktop/human-eval/human_eval/execution.py", line 77, in check_correctness
p.start()
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/process.py", line 121, in start
self._popen = self._Popen(self)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/context.py", line 224, in _Popen
return _default_context.get_context().Process._Popen(process_obj)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/context.py", line 284, in _Popen
return Popen(process_obj)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 32, in __init__
super().__init__(process_obj)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/popen_fork.py", line 19, in __init__
self._launch(process_obj)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/popen_spawn_posix.py", line 47, in _launch
reduction.dump(process_obj, fp)
File "/opt/miniconda3/envs/codex/lib/python3.8/multiprocessing/reduction.py", line 60, in dump
ForkingPickler(file, protocol).dump(obj)
AttributeError: Can't pickle local object 'check_correctness.<locals>.unsafe_execute'

ghost · 2022-12-13T12:54:35Z

If you are on Windows, this problem can be solved by declaring unsafe_execute as a global variable just above the function.

After this you will encounter an AttributeError for unsafe_execute, which can be solved by installing the multiprocess library and replacing "multiprocessing" with "multiprocess".

Another problem on Windows is a pass@k score of 0 instead of 0.4999. For the sample sanity check only, this can be solved by removing the timeout function in execution.py. For testing generated samples you still need a timeout, but you have to find an alternative to the setitimer() function, which is what causes the score of 0.
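
For anyone wondering where the original error comes from: the traceback goes through popen_spawn_posix.py, i.e. the "spawn" start method (the default on macOS since Python 3.8, and the only one on Windows), which pickles the process target before starting the child. The standard pickle module serializes functions by reference (module plus name), so it cannot handle a function defined inside another function. A minimal sketch of that behaviour, where can_pickle and make_nested_target are made-up names for illustration and not part of human-eval:

```python
import pickle


def can_pickle(obj) -> bool:
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (AttributeError, pickle.PicklingError):
        # Nested functions fail here; the exception type varies by Python version.
        return False


def make_nested_target():
    # Hypothetical stand-in for check_correctness, which defines
    # unsafe_execute inside its own body.
    def unsafe_execute():
        return "ran"
    return unsafe_execute


nested_ok = can_pickle(make_nested_target())  # nested function: not picklable
builtin_ok = can_pickle(len)                  # importable by name: picklable
print(nested_ok, builtin_ok)
```

Because pickling by reference only records the module and the name, the child process must also be able to find that name after importing the module, which may explain the follow-up AttributeError mentioned above.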

WelliJohn · 2023-03-16T07:26:47Z

When I run on Mac, I get the same error. Can anyone help? Thanks a lot.

jacob1017 · 2023-05-30T09:51:47Z

Reinstall multiprocess and replace the multiprocessing usages in human_eval/execution.py, then you are good to go.
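
A rough sketch of what that swap could look like, assuming the third-party multiprocess package (a fork of multiprocessing that serializes with dill) is installed. The alias mp and the helper describe_backend are only for illustration; the fallback branch keeps the snippet importable without the package, but it would not fix the pickling error.

```python
try:
    # Third-party drop-in replacement; it can serialize nested functions.
    import multiprocess as mp
except ImportError:
    # Standard-library fallback: nested process targets still fail under "spawn".
    import multiprocessing as mp


def describe_backend() -> str:
    """Report which module is actually providing Process and Manager."""
    return mp.__name__


print(describe_backend())
```

With this alias, the existing multiprocessing.Manager() and multiprocessing.Process(...) calls in execution.py would become mp.Manager() and mp.Process(...).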

tianzhaotju · 2023-06-07T06:13:35Z

When I run on Mac, I get the same error. Can anyone help? Thanks a lot.

So sad! I have the same problem too. Did you solve it, my friend? Thanks.

davide221 · 2023-06-27T08:46:41Z

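After switching the import from multiprocessing to multiprocess, what worked for me was the following.

Windows machine:

import contextlib
import threading

class TimeoutException(Exception):
    pass

@contextlib.contextmanager
def time_limit(seconds: float):
    # Note: the lambda runs in the timer thread, so the TimeoutException is
    # raised there and does not interrupt the body of the with-block. This
    # version mainly avoids signal.setitimer, which is unavailable on Windows.
    timer = threading.Timer(seconds, lambda: (_ for _ in ()).throw(TimeoutException("Timed out!")))
    timer.start()
    try:
        yield
    finally:
        # Stop the timer if the body finished before the limit.
        timer.cancel()

Linux machine:

import contextlib
import signal

@contextlib.contextmanager
def time_limit(seconds: float):
    # TimeoutException is the class defined above.
    def signal_handler(signum, frame):
        raise TimeoutException("Timed out!")
    # Install the handler before arming the timer so SIGALRM is always caught.
    signal.signal(signal.SIGALRM, signal_handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        # Disarm the timer when the body exits.
        signal.setitimer(signal.ITIMER_REAL, 0)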
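
The two variants could also be folded into one function that picks its behaviour at runtime. The following is only a sketch under stated assumptions, not the human-eval implementation: can_use_sigalrm, busy_wait and run_with_limit are made-up helper names, and on platforms without signal.setitimer (which is Unix-only) the body runs with no in-process limit, on the assumption that the caller also enforces a process-level timeout.

```python
import contextlib
import signal
import threading
import time


class TimeoutException(Exception):
    pass


def can_use_sigalrm() -> bool:
    # setitimer/SIGALRM are Unix-only, and signal handlers can only be
    # installed from the main thread.
    return (
        hasattr(signal, "setitimer")
        and hasattr(signal, "SIGALRM")
        and threading.current_thread() is threading.main_thread()
    )


@contextlib.contextmanager
def time_limit(seconds: float):
    if not can_use_sigalrm():
        # Assumption: an outer, process-level timeout handles runaway code.
        yield
        return

    def signal_handler(signum, frame):
        raise TimeoutException("Timed out!")

    previous = signal.signal(signal.SIGALRM, signal_handler)
    signal.setitimer(signal.ITIMER_REAL, seconds)
    try:
        yield
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)  # disarm the timer
        if previous is not None:
            signal.signal(signal.SIGALRM, previous)  # restore the old handler


def busy_wait(duration: float) -> None:
    # Hypothetical stand-in for a slow generated program.
    deadline = time.monotonic() + duration
    while time.monotonic() < deadline:
        pass


def run_with_limit(limit: float, duration: float) -> str:
    try:
        with time_limit(limit):
            busy_wait(duration)
        return "finished"
    except TimeoutException:
        return "timed out"
```

On Linux or macOS, run_with_limit(0.05, 2.0) should report a timeout, while on Windows the same call simply waits for busy_wait to finish.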

pnewhook · 2023-10-06T00:19:22Z

On a Mac with Python 3.10, installing multiprocess and replacing the imports and usages of multiprocessing worked for me.

RyanLoil · 2024-03-28T04:58:11Z

For Windows users, in addition to davide221's changes above, a few more things need to be done:

1. Add if __name__ == "__main__": before sys.exit(main()); otherwise you will get an error recommending that you use freeze_support().
2. Install multiprocess and replace the imports and usages of multiprocessing, or you will get a "Ran out of input" error. Ignore the IDE's "can't find **()" warning: the multiprocess package doesn't declare these names in __init__.py, but it still works.
3. Changing the time_limit function is necessary, as the original one doesn't work properly.
4. Defining the TimeoutException class is optional.

NA-Wen · 2024-05-20T07:46:19Z

Reinstall multiprocess and replace the multiprocessing usages in human_eval/execution.py, then you are good to go.

This works well for me on Windows 11. Thank you so much!

BoyuanJackChen · 2024-05-20T11:23:16Z

Thanks all! I have resolved the issue with the advice given here.

haesleinhuepf added a commit to haesleinhuepf/human-eval-bia that referenced this issue Mar 23, 2024

fix windows-related signal issue

8d03cfe

adapted from: openai/human-eval#18 (comment)

BoyuanJackChen closed this as completed May 20, 2024

nextdoorUncleLiu mentioned this issue Sep 2, 2024

Error running evaluate_functional_correctness samples.json #48

Open
