
I'm using Ray's ray.remote decorator to define an InferenceActor class, which has a method run_inference that takes a single parameter (a list of strings) for handling model inference tasks. However, when I execute run_inference for the first time, I encounter the following error:

Could not serialize the argument b'__RAY_DUMMY__' for a task or actor services.inference_actor.InferenceActor.run_inference

InferenceActor class:

import ray
from vllm import LLM, SamplingParams

ray.init(num_gpus=1)

@ray.remote(num_gpus=1)
class InferenceActor:
    def __init__(self, settings: AppSettings):  # AppSettings is our own config class (definition omitted)
        # Load the vLLM model onto the GPU assigned to this actor.
        self.model = LLM(
            model=settings.llm_settings.model_path,
            tokenizer=settings.llm_settings.tokenizer_path,
            gpu_memory_utilization=settings.llm_settings.gpu_mem_limit,
        )
        self.sampling_parameters = SamplingParams(
            top_p=settings.extraction_settings.top_p,
            temperature=settings.extraction_settings.temperature,
            max_tokens=settings.extraction_settings.max_new_tokens,
            stop=settings.extraction_settings.stop_sequence,
            include_stop_str_in_output=True,
        )

    def run_inference(self, prompts: list[str]):
        # Generate one completion per prompt and return only the text.
        results = self.model.generate(prompts, self.sampling_parameters)
        return [result.outputs[0].text for result in results]
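
For reference, this is roughly how I create the actor and invoke the method (a simplified sketch of my call site, not the exact service code; settings is the same AppSettings instance passed to the constructor above):

actor = InferenceActor.remote(settings)

prompts = ["What is the capital of France?", "Summarise this paragraph."]
# The error above is raised on this first call.
outputs = ray.get(actor.run_inference.remote(prompts))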

It seems to be related to serialization, but I'm not sure what's causing it or how to resolve it. Has anyone run into this problem before, or does anyone have suggestions on what might be going wrong?

I have tried serialising the prompts argument myself with several serialisation libraries (a sketch of one attempt follows the list):

  • cloudpickle
  • pickle
  • json
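
For example, the cloudpickle attempt looked roughly like this (an illustrative sketch, not my exact code; I also adjusted run_inference to unpickle the bytes before generating):

import cloudpickle

# Pass pre-pickled bytes instead of the raw list of strings;
# run_inference then calls cloudpickle.loads(payload) internally.
payload = cloudpickle.dumps(prompts)
outputs = ray.get(actor.run_inference.remote(payload))

The same __RAY_DUMMY__ error is raised regardless of which library I use.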

Any insights would be greatly appreciated!

Thanks!
