I am trying out PySpark 3.2.1 with Oracle 11g. It fails with the following error:
Py4JJavaError: An error occurred while calling o44.load.
: java.lang.ClassNotFoundException: oracle.jdbc.OracleDriver
at java.net.URLClassLoader$1.run(Unknown Source)
at java.net.URLClassLoader$1.run(Unknown Source)
at java.security.AccessController.doPrivileged(Native Method)
My code:
from pyspark.sql import SparkSession
spark = SparkSession.builder.appName("PySpark_Oracle_Connection").getOrCreate()
driver = 'oracle.jdbc.OracleDriver'
url = 'jdbc:oracle:thin:@hostname:port/dbTEST'
user = 'myname'
password = 'mypswd'
table = 'mytable'
SPARK_CLASS_PATH = "C:\Oracle_Client\jdbc\lib\ojdbc8.jar"
df = spark.read.format('jdbc')\
    .option('driver', driver)\
    .option('url', url)\
    .option('dbtable', table)\
    .option('user', user)\
    .option('password', password)\
    .load()
I'd appreciate some quick help, please. I have gone through previous posts, but it still doesn't work.
Are you passing these parameters to the spark-submit command: --driver-class-path C:\Oracle_Client\jdbc\lib\ojdbc8.jar --jars C:\Oracle_Client\jdbc\lib\ojdbc8.jar? Have a look at this, there is info for Python apps as well: spark.apache.org/docs/latest/submitting-applications.html
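If the script is run directly with python rather than through spark-submit, the same JAR can probably be supplied through Spark configuration instead. Note that assigning the path to a plain Python variable such as SPARK_CLASS_PATH does not, by itself, put the JAR on the JVM classpath. Below is a minimal sketch, assuming the JAR path from the question; the helper names jar_conf and build_session are made up for illustration, and the settings only take effect if no Spark session is already running in the process.

```python
# Sketch only: jar_conf and build_session are hypothetical helper names.
# The JAR path is the one from the question; adjust it to your installation.
JAR_PATH = r"C:\Oracle_Client\jdbc\lib\ojdbc8.jar"  # raw string keeps the backslashes literal


def jar_conf(jar_path):
    """Return Spark settings that expose a JDBC driver JAR to Spark."""
    return {
        # Ships the JAR with the application (same idea as --jars).
        "spark.jars": jar_path,
        # Adds the JAR to the driver classpath (same idea as --driver-class-path).
        "spark.driver.extraClassPath": jar_path,
    }


def build_session(app_name, jar_path):
    """Create a SparkSession with the JDBC driver JAR configured."""
    # Imported here so the configuration helper can be used without PySpark.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName(app_name)
    for key, value in jar_conf(jar_path).items():
        builder = builder.config(key, value)
    # Must run before any other session exists, otherwise getOrCreate()
    # returns the existing session and the classpath settings are ignored.
    return builder.getOrCreate()


# Usage (requires PySpark and the Oracle JAR at JAR_PATH):
# spark = build_session("PySpark_Oracle_Connection", JAR_PATH)
```

With a session built this way, the spark.read.format('jdbc') call from the question should be able to load oracle.jdbc.OracleDriver, provided the JAR really exists at that path.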