
I'm working on Azure Databricks. My PySpark project currently lives on DBFS, and I configured a spark-submit job to execute my PySpark code (a .py file). However, according to the Databricks documentation, spark-submit jobs can only run on new automated clusters (probably by design).

Is there a way to run my PySpark code on an existing interactive cluster?

I also tried running the spark-submit command from a notebook in a %sh cell, to no avail.
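For reference, a spark-submit job like this can be created through the Jobs API roughly as follows (the workspace URL, token, cluster settings, and script path below are placeholders, not my actual configuration):

    import requests

    # Placeholders -- substitute your own workspace URL, PAT token, and script path
    DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"
    TOKEN = "<personal-access-token>"

    job_spec = {
        "name": "pyspark-spark-submit-job",
        # spark-submit tasks only accept a new automated cluster definition
        "new_cluster": {
            "spark_version": "7.3.x-scala2.12",
            "node_type_id": "Standard_DS3_v2",
            "num_workers": 2,
        },
        "spark_submit_task": {
            "parameters": ["dbfs:/path/to/my_script.py"]
        },
    }

    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.0/jobs/create",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json=job_spec,
    )
    resp.raise_for_status()
    print(resp.json())  # returns the new job_id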

1 Answer


By default, when you create a job, the cluster type is selected as "New Automated cluster".

You can configure the cluster type to choose between an automated cluster or an existing interactive cluster.

Steps to configure a job:

Select the job => click on the cluster => click the Edit button => select "Existing Interactive Cluster" and choose the cluster you want the job to run on.
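If you want to make the same change programmatically, you can reset the job settings through the Jobs API to point at an existing cluster. A rough sketch, assuming a Python-file task (the host, token, cluster id, job id, and file path are placeholders):

    import requests

    DATABRICKS_HOST = "https://<your-workspace>.azuredatabricks.net"  # placeholder
    TOKEN = "<personal-access-token>"                                  # placeholder

    # Overwrite the job's settings so it runs on an existing interactive cluster
    new_settings = {
        "name": "pyspark-job-on-interactive-cluster",
        "existing_cluster_id": "<cluster-id from the Clusters page>",  # placeholder
        "spark_python_task": {
            "python_file": "dbfs:/path/to/my_script.py"                # placeholder
        },
    }

    resp = requests.post(
        f"{DATABRICKS_HOST}/api/2.0/jobs/reset",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={"job_id": 123, "new_settings": new_settings},  # 123 is a placeholder job id
    )
    resp.raise_for_status()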



3 Comments

Hi. That only works if the job type is a notebook task. It doesn't work if it's a spark-submit job. Please change the job to execute a spark-submit task to see the difference.
Hi, you can configure and execute PySpark code with spark-submit using jobs.
Hi, thanks for the reply. Yes, we currently have a spark-submit job scheduled to run PySpark code, but it seems spark-submit jobs cannot be run on existing interactive clusters. Is there a way to run spark-submit on an existing cluster?
