RayTaskRunner accepts the following optional parameters:
| Parameter | Description |
| --- | --- |
| address | Address of a currently running Ray instance, starting with the ray:// URI scheme. |
| init_kwargs | Additional kwargs to use when calling ray.init. |
Note that Ray Client uses the ray:// URI to indicate the address of a Ray instance. If you don't provide the address of a Ray instance, Prefect creates a temporary instance automatically.
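As a minimal sketch of the two configurations described above, the snippet below collects the optional parameters into keyword arguments for RayTaskRunner. The helper names `runner_kwargs` and `make_runner` are hypothetical and exist only for illustration, and `num_cpus` is used as an example of a `ray.init` keyword. The address is the same placeholder used elsewhere in this guide and must be replaced with a real value.

```python
from typing import Any, Dict, Optional


def runner_kwargs(
    address: Optional[str] = None,
    init_kwargs: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Collect only the RayTaskRunner parameters that were actually supplied."""
    kwargs: Dict[str, Any] = {}
    if address is not None:
        kwargs["address"] = address
    if init_kwargs:
        # Copy so later changes to the caller's dict do not leak in.
        kwargs["init_kwargs"] = dict(init_kwargs)
    return kwargs


def make_runner(**kwargs: Any):
    """Build a RayTaskRunner (hypothetical helper; requires prefect-ray)."""
    # Imported lazily so runner_kwargs stays usable without prefect-ray installed.
    from prefect_ray.task_runners import RayTaskRunner

    return RayTaskRunner(**kwargs)


# No address: Prefect creates a temporary Ray instance automatically.
# num_cpus is a ray.init keyword, shown here purely as an example.
local_kwargs = runner_kwargs(init_kwargs={"num_cpus": 2})

# With an address: connect to a running instance through Ray Client.
remote_kwargs = runner_kwargs(address="ray://<head_node_ip_address>:10001")
```

Passing either dictionary to `make_runner(**local_kwargs)` or `make_runner(**remote_kwargs)` would then yield a runner suitable for the `task_runner` argument of `@flow`, assuming prefect-ray is installed.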
Ray environment limitations
Ray support for non-x86/64 architectures such as ARM/M1 processors is limited when installing from pip alone, and Ray will be skipped during installation of Prefect on those platforms. It is possible to manually install the blocking component with conda. See the Ray documentation for instructions.
When using the RayTaskRunner with a remote Ray cluster, you may run into issues that are not seen when using a local Ray instance. To resolve these issues, we recommend taking the following steps when working with a remote Ray cluster:
By default, Prefect will not persist any data to the filesystem of the remote Ray worker. However, if you want to take advantage of Prefect's caching ability, you will need to configure remote result storage to persist results across task runs.
Here's an example of a flow that uses caching and remote result storage:
    from typing import List

    from prefect import flow, get_run_logger, task
    from prefect.filesystems import S3
    from prefect.tasks import task_input_hash
    from prefect_ray.task_runners import RayTaskRunner


    # The result of this task will be cached in the configured result storage
    @task(cache_key_fn=task_input_hash)
    def say_hello(name: str) -> str:
        logger = get_run_logger()
        # This log statement will print only on the first run. Subsequent runs will be cached.
        logger.info(f"hello {name}!")
        return name


    @flow(
        task_runner=RayTaskRunner(
            address="ray://<instance_public_ip_address>:10001",
        ),
        # Using an S3 block that has already been created via the Prefect UI
        result_storage="s3/my-result-storage",
    )
    def greetings(names: List[str]) -> None:
        for name in names:
            say_hello.submit(name)


    if __name__ == "__main__":
        greetings(["arthur", "trillian", "ford", "marvin"])
If you get an error stating that the module 'prefect' cannot be found, ensure prefect is installed on the remote cluster, with:
    pip install prefect
If you get an error with a message similar to "File system created with scheme 's3' could not be created", ensure the required Python modules are installed on both local and remote machines. The required prerequisite modules can be found in the Prefect documentation. For example, if using S3 for the remote storage:
    pip install s3fs
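One way to confirm that the prerequisite modules are present is to run a small check on both the local machine and the remote cluster. The sketch below uses only the Python standard library. The function name `missing_modules` is hypothetical, and the list of required modules assumes S3 result storage as in the example above.

```python
from importlib.util import find_spec
from typing import Iterable, List


def missing_modules(names: Iterable[str]) -> List[str]:
    """Return the top-level module names that cannot be imported here."""
    # find_spec returns None when a top-level module is not installed.
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    # Assumed prerequisites for a flow that stores results in S3.
    required = ["prefect", "s3fs"]
    missing = missing_modules(required)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are importable.")
```

The check only covers top-level module names and does not compare versions, so a version mismatch between the local and remote environments would not be detected.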
If you are seeing timeout or other connection errors, double-check the address provided to the RayTaskRunner. The address should look similar to address='ray://<head_node_ip_address>:10001'.
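When debugging connection errors, it can help to validate the address format and test basic reachability before running a flow. The following is a rough sketch using only the standard library. The function names `parse_ray_address` and `can_connect` are hypothetical. The TCP check only shows that something is listening on the port, not that it is a working Ray Client server.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def parse_ray_address(address: str) -> Tuple[str, int]:
    """Split a ray://host:port address, raising ValueError if it is malformed."""
    parts = urlsplit(address)
    if parts.scheme != "ray":
        raise ValueError(f"expected an address starting with ray://, got {address!r}")
    if not parts.hostname:
        raise ValueError(f"no host found in {address!r}")
    if parts.port is None:
        raise ValueError(f"no port found in {address!r}")
    return parts.hostname, parts.port


def can_connect(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `parse_ray_address("ray://10.0.0.5:10001")` returns `("10.0.0.5", 10001)`, and the resulting pair can be passed to `can_connect` to see whether the head node port is reachable from the machine running the flow. The IP address here is an illustrative value, not one taken from a real cluster.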