I have a PostgreSQL database running on an EC2 instance. I am trying to write to it from PySpark on a cluster, but the write fails.

The PostgreSQL server has a database named my_db, which contains a table named events.

My PySpark code is:

df.write.format("jdbc") \
    .option("url", "jdbc:postgresql://ec2-xxxxx.compute-1.amazonaws.com:543x/my_db") \
    .option("dbtable", "events") \
    .option("user", "xxx") \
    .option("password", "xxx") \
    .option("driver", "org.postgresql.Driver") \
    .mode("append") \
    .save()

When I run it, I receive this error:

py4j.protocol.Py4JJavaError: An error occurred while calling o69.save. : org.postgresql.util.PSQLException: ERROR: relation "events" already exists

It seems that Spark tries to create a new table when I run spark-submit, even though the table already exists and I am using append mode. How can I solve this error?
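
A minimal sketch for narrowing this down, under assumptions that are not confirmed for this setup. As far as I understand Spark's JDBC writer, in append mode it first probes whether the target table exists and falls back to CREATE TABLE if that probe fails for any reason; with PostgreSQL the probe is, to my knowledge, a query of the form SELECT 1 FROM <table> LIMIT 1. If the Spark user cannot read events, or events lives in a schema that is not on that user's search path, the probe would fail even though the table exists. The helper names below (build_jdbc_options, existence_probe, append_events) and the schema name "public" are hypothetical, introduced only for illustration.

```python
def build_jdbc_options(url, table, user, password, schema=None):
    """Collect the JDBC writer options, schema-qualifying the table if a schema is given."""
    dbtable = f"{schema}.{table}" if schema else table
    return {
        "url": url,
        "dbtable": dbtable,
        "user": user,
        "password": password,
        "driver": "org.postgresql.Driver",
    }


def existence_probe(dbtable):
    """Return a query of the kind Spark is assumed to run to check that the table exists."""
    return f"SELECT 1 FROM {dbtable} LIMIT 1"


def append_events(df, options):
    """Append a Spark DataFrame to the target table using the collected options."""
    df.write.format("jdbc").options(**options).mode("append").save()


options = build_jdbc_options(
    url="jdbc:postgresql://ec2-xxxxx.compute-1.amazonaws.com:543x/my_db",
    table="events",
    user="xxx",
    password="xxx",
    schema="public",  # assumption: the schema that actually contains events
)

# Run this query in psql as the same user Spark connects with.
# If it fails there, the existence check would be expected to fail in Spark too.
print(existence_probe(options["dbtable"]))
```

If the probe query succeeds in psql for that user, calling append_events(df, options) with the schema-qualified name is the next thing I would try; if it fails, the error psql reports (permission denied, relation does not exist) should point at the actual cause.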
