[BUG] Unexpected keyword argument 'override_bs' #504

Closed

mdiazmel opened this issue Jan 21, 2025 · 2 comments

Labels
bug Something isn't working

Comments

@mdiazmel (Contributor)

Describe the bug

I got this error when running lighteval vllm:

[rank0]: ╭─────────────────────────────── Traceback (most recent call last) ────────────────────────────────╮
[rank0]: │ /home/mdiazmel/code/lighteval/src/lighteval/main_vllm.py:146 in vllm                             │
[rank0]: │                                                                                                  │
[rank0]: │   143 │   │   model_config=model_config,                                                         │
[rank0]: │   144 │   )                                                                                      │
[rank0]: │   145 │                                                                                          │
[rank0]: │ ❱ 146 │   pipeline.evaluate()                                                                    │
[rank0]: │   147 │                                                                                          │
[rank0]: │   148 │   pipeline.show_results()                                                                │
[rank0]: │   149                                                                                            │
[rank0]: │                                                                                                  │
[rank0]: │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
[rank0]: │ │                 cache_dir = '/scratch'                                                       │ │
[rank0]: │ │              custom_tasks = None                                                             │ │
[rank0]: │ │ dataset_loading_processes = 1                                                                │ │
[rank0]: │ │                env_config = EnvConfig(cache_dir='/scratch', token=None)                      │ │
[rank0]: │ │        evaluation_tracker = <lighteval.logging.evaluation_tracker.EvaluationTracker object   │ │
[rank0]: │ │                             at 0x7f4f6e3b2ea0>                                               │ │
[rank0]: │ │                    job_id = 0                                                                │ │
[rank0]: │ │               max_samples = None                                                             │ │
[rank0]: │ │                model_args = 'pretrained=OpenLLM-France/Lucie-7B-Instruct,trust_remote_code=… │ │
[rank0]: │ │           model_args_dict = {                                                                │ │
[rank0]: │ │                             │   'pretrained': 'OpenLLM-France/Lucie-7B-Instruct',            │ │
[rank0]: │ │                             │   'trust_remote_code': 'False',                                │ │
[rank0]: │ │                             │   'dtype': 'float16',                                          │ │
[rank0]: │ │                             │   'max_model_length': '512'                                    │ │
[rank0]: │ │                             }                                                                │ │
[rank0]: │ │              model_config = VLLMModelConfig(                                                 │ │
[rank0]: │ │                             │   pretrained='OpenLLM-France/Lucie-7B-Instruct',               │ │
[rank0]: │ │                             │   gpu_memory_utilisation=0.9,                                  │ │
[rank0]: │ │                             │   revision='main',                                             │ │
[rank0]: │ │                             │   dtype='float16',                                             │ │
[rank0]: │ │                             │   tensor_parallel_size=1,                                      │ │
[rank0]: │ │                             │   pipeline_parallel_size=1,                                    │ │
[rank0]: │ │                             │   data_parallel_size=1,                                        │ │
[rank0]: │ │                             │   max_model_length='512',                                      │ │
[rank0]: │ │                             │   swap_space=4,                                                │ │
[rank0]: │ │                             │   seed=1234,                                                   │ │
[rank0]: │ │                             │   trust_remote_code='False',                                   │ │
[rank0]: │ │                             │   use_chat_template=False,                                     │ │
[rank0]: │ │                             │   add_special_tokens=True,                                     │ │
[rank0]: │ │                             │   multichoice_continuations_start_space=True,                  │ │
[rank0]: │ │                             │   pairwise_tokenization=False,                                 │ │
[rank0]: │ │                             │   generation_parameters=GenerationParameters(                  │ │
[rank0]: │ │                             │   │   early_stopping=None,                                     │ │
[rank0]: │ │                             │   │   repetition_penalty=None,                                 │ │
[rank0]: │ │                             │   │   frequency_penalty=None,                                  │ │
[rank0]: │ │                             │   │   length_penalty=None,                                     │ │
[rank0]: │ │                             │   │   presence_penalty=None,                                   │ │
[rank0]: │ │                             │   │   max_new_tokens=None,                                     │ │
[rank0]: │ │                             │   │   min_new_tokens=None,                                     │ │
[rank0]: │ │                             │   │   seed=None,                                               │ │
[rank0]: │ │                             │   │   stop_tokens=None,                                        │ │
[rank0]: │ │                             │   │   temperature=None,                                        │ │
[rank0]: │ │                             │   │   top_k=None,                                              │ │
[rank0]: │ │                             │   │   min_p=None,                                              │ │
[rank0]: │ │                             │   │   top_p=None,                                              │ │
[rank0]: │ │                             │   │   truncate_prompt=None                                     │ │
[rank0]: │ │                             │   ),                                                           │ │
[rank0]: │ │                             │   subfolder=None                                               │ │
[rank0]: │ │                             )                                                                │ │
[rank0]: │ │         num_fewshot_seeds = 1                                                                │ │
[rank0]: │ │                output_dir = 'results'                                                        │ │
[rank0]: │ │                  pipeline = <lighteval.pipeline.Pipeline object at 0x7f4c3da03da0>           │ │
[rank0]: │ │           pipeline_params = PipelineParameters(                                              │ │
[rank0]: │ │                             │   launcher_type=<ParallelismManager.VLLM: 5>,                  │ │
[rank0]: │ │                             │   env_config=EnvConfig(cache_dir='/scratch', token=None),      │ │
[rank0]: │ │                             │   job_id=0,                                                    │ │
[rank0]: │ │                             │   dataset_loading_processes=1,                                 │ │
[rank0]: │ │                             │   nanotron_checkpoint_path=None,                               │ │
[rank0]: │ │                             │   custom_tasks_directory=None,                                 │ │
[rank0]: │ │                             │   override_batch_size=-1,                                      │ │
[rank0]: │ │                             │   num_fewshot_seeds=1,                                         │ │
[rank0]: │ │                             │   max_samples=None,                                            │ │
[rank0]: │ │                             │   use_chat_template=False,                                     │ │
[rank0]: │ │                             │   system_prompt=None                                           │ │
[rank0]: │ │                             )                                                                │ │
[rank0]: │ │                public_run = False                                                            │ │
[rank0]: │ │               push_to_hub = False                                                            │ │
[rank0]: │ │       push_to_tensorboard = False                                                            │ │
[rank0]: │ │               results_org = None                                                             │ │
[rank0]: │ │              save_details = True                                                             │ │
[rank0]: │ │             system_prompt = None                                                             │ │
[rank0]: │ │                     tasks = 'lighteval|gpqa|0|1'                                             │ │
[rank0]: │ │                     TOKEN = None                                                             │ │
[rank0]: │ │         use_chat_template = False                                                            │ │
[rank0]: │ │                      yaml = <module 'yaml' from                                              │ │
[rank0]: │ │                             '/scratch/mdiazmel/miniconda3/envs/ligtheval/lib/python3.12/sit… │ │
[rank0]: │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
[rank0]: │                                                                                                  │
[rank0]: │ /home/mdiazmel/code/lighteval/src/lighteval/pipeline.py:248 in evaluate                          │
[rank0]: │                                                                                                  │
[rank0]: │   245 │   │   │   config=self.model_config,                                                      │
[rank0]: │   246 │   │   )                                                                                  │
[rank0]: │   247 │   │                                                                                      │
[rank0]: │ ❱ 248 │   │   sample_id_to_responses = self._run_model()                                         │
[rank0]: │   249 │   │   self._compute_metrics(sample_id_to_responses)                                      │
[rank0]: │   250 │   │                                                                                      │
[rank0]: │   251 │   │   if self.is_main_process():                                                         │
[rank0]: │                                                                                                  │
[rank0]: │ ╭─────────────────────────── locals ────────────────────────────╮                                │
[rank0]: │ │ self = <lighteval.pipeline.Pipeline object at 0x7f4c3da03da0> │                                │
[rank0]: │ ╰───────────────────────────────────────────────────────────────╯                                │
[rank0]: │                                                                                                  │
[rank0]: │ /home/mdiazmel/code/lighteval/src/lighteval/pipeline.py:273 in _run_model                        │
[rank0]: │                                                                                                  │
[rank0]: │   270 │   │   for request_type, requests in self.requests.items():                               │
[rank0]: │   271 │   │   │   logger.info(f"Running {request_type} requests")                                │
[rank0]: │   272 │   │   │   run_model = self.model.get_method_from_request_type(request_type=request_typ   │
[rank0]: │ ❱ 273 │   │   │   responses = run_model(requests, override_bs=self.pipeline_parameters.overrid   │
[rank0]: │   274 │   │   │                                                                                  │
[rank0]: │   275 │   │   │   # Storing the responses associated to the same samples together                │
[rank0]: │   276 │   │   │   for response, request in zip(responses, requests):                             │
[rank0]: │                                                                                                  │
[rank0]: │ ╭─────────────────────────────────────────── locals ───────────────────────────────────────────╮ │
[rank0]: │ │           request_type = <RequestType.LOGLIKELIHOOD_SINGLE_TOKEN: 2>                         │ │
[rank0]: │ │               requests = [                                                                   │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='0_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: Identify the fi'+227,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='1_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: "Consider the f'+287,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='2_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: Some plants lac'+568,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='3_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: Astronomers are'+405,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='4_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: Diamond and gra'+710,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='5_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: "Scientist aim '+876,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='6_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: There is a spin'+430,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='7_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: aniline is heat'+317,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='8_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: Astronomers are'+645,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   LoglikelihoodSingleTokenRequest(                                │ │
[rank0]: │ │                          │   │   task_name='lighteval|gpqa|0',                               │ │
[rank0]: │ │                          │   │   sample_index='9_0',                                         │ │
[rank0]: │ │                          │   │   request_index=0,                                            │ │
[rank0]: │ │                          │   │   context='Select the correct answer to the following         │ │
[rank0]: │ │                          questions.\n\nQuestion: If uncertainty '+268,                       │ │
[rank0]: │ │                          │   │   metric_categories=[                                         │ │
[rank0]: │ │                          │   │   │   <MetricCategory.MULTICHOICE_ONE_TOKEN: '10'>            │ │
[rank0]: │ │                          │   │   ],                                                          │ │
[rank0]: │ │                          │   │   choices=['A', 'B', 'C', 'D'],                               │ │
[rank0]: │ │                          │   │   tokenized_context=None,                                     │ │
[rank0]: │ │                          │   │   tokenized_continuation=None                                 │ │
[rank0]: │ │                          │   ),                                                              │ │
[rank0]: │ │                          │   ... +438                                                        │ │
[rank0]: │ │                          ]                                                                   │ │
[rank0]: │ │              run_model = <bound method VLLMModel.loglikelihood_single_token of               │ │
[rank0]: │ │                          <lighteval.models.vllm.vllm_model.VLLMModel object at               │ │
[rank0]: │ │                          0x7f4c5018b0e0>>                                                    │ │
[rank0]: │ │ sample_id_to_responses = defaultdict(<class 'list'>, {})                                     │ │
[rank0]: │ │                   self = <lighteval.pipeline.Pipeline object at 0x7f4c3da03da0>              │ │
[rank0]: │ ╰──────────────────────────────────────────────────────────────────────────────────────────────╯ │
[rank0]: ╰──────────────────────────────────────────────────────────────────────────────────────────────────╯
[rank0]: TypeError: VLLMModel.loglikelihood_single_token() got an unexpected keyword argument 'override_bs'
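
For context, the dispatch at pipeline.py:273 passes `override_bs` to whichever model method matches the request type, and `VLLMModel.loglikelihood_single_token` apparently does not accept that keyword. A minimal standalone illustration of this failure mode (not lighteval's actual code):

```python
# Minimal illustration of the failure mode seen in the traceback: a dispatcher
# forwards override_bs to every handler, but one handler's signature does not
# accept that keyword. Function names here are hypothetical stand-ins.
def loglikelihood(requests, override_bs=None):
    # Accepts the keyword, so the forwarded call succeeds.
    return [None for _ in requests]

def loglikelihood_single_token(requests):
    # Does not accept the keyword, so the same call raises TypeError.
    return [None for _ in requests]

for run_model in (loglikelihood, loglikelihood_single_token):
    try:
        run_model(["req"], override_bs=-1)
    except TypeError as err:
        print(err)  # ... got an unexpected keyword argument 'override_bs'
```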

To Reproduce

The command line instruction I used is:

lighteval vllm "pretrained=${model},trust_remote_code=False,dtype=float16,max_model_length=512" \
    "lighteval|gpqa|0|0"\
    --save-details

Expected behavior

The evaluation should run to completion; instead the job fails with the TypeError above.

Version info

lighteval is installed from commit 59624c (main branch, two days before this report), with vllm==0.6.6.post1.

Any clues?

@mdiazmel mdiazmel added the bug Something isn't working label Jan 21, 2025
@NathanHB (Member)

Hi! Yes, vLLM does not support this metric; you need to change it to Metrics.loglikelihood_acc.
For ease of use, you can create a custom task for GPQA, keeping everything the same except the metric. A sketch of what that file could look like follows below.
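
For reference, a custom-task file in lighteval is a Python module that exports a TASKS_TABLE of LightevalTaskConfig objects. Below is a minimal sketch of the metric swap; the prompt function (prompt.gpqa) and dataset coordinates (Idavidrein/gpqa, gpqa_main) are assumed to match the stock GPQA task, and field names may differ between lighteval versions:

```python
# custom_gpqa.py -- minimal sketch, not an official lighteval task definition.
# Assumes the stock GPQA prompt function and dataset coordinates; treat names
# and fields as illustrative, since they may vary across lighteval versions.
import lighteval.tasks.default_prompts as prompt
from lighteval.metrics.metrics import Metrics
from lighteval.tasks.lighteval_task import LightevalTaskConfig

gpqa_ll_acc = LightevalTaskConfig(
    name="gpqa_ll_acc",
    suite=["community"],
    prompt_function=prompt.gpqa,         # reuse the built-in GPQA prompt
    hf_repo="Idavidrein/gpqa",           # dataset used by the stock task (assumed)
    hf_subset="gpqa_main",
    hf_avail_splits=["train"],
    evaluation_splits=["train"],
    metric=[Metrics.loglikelihood_acc],  # vLLM-compatible replacement metric
)

TASKS_TABLE = [gpqa_ll_acc]
```

It can then be passed to the CLI with something like --custom-tasks custom_gpqa.py and the task string "community|gpqa_ll_acc|0|0".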

@mdiazmel (Contributor, Author)

Thx!
I'll close this issue!
