I have a Spark Scala program that loads a jar I wrote in Java. A static method in that jar tries to read a serialized object from a file (the class involved is Pattern), but it throws a java.lang.ClassNotFoundException.
The program works when I run it locally, but it fails on the cluster workers. This is especially odd because I instantiate a Pattern object right before reading from the file, and that works without problems.
I am sure that the Pattern objects I wrote to the file are of the same class as the Pattern objects I am trying to read.
I've checked the jar on the slave machine and the Pattern class is there.
Does anyone have any idea what the problem might be? I can add more detail if needed.
This is the Pattern class:
    import java.io.Serializable;
    import java.util.List;
    // Imports for StringUtils, Token and Relation are not shown here.

    public class Pattern implements Serializable {
        private static final long serialVersionUID = 588249593084959064L;

        public static enum RelationPatternType {NONE, LEFT, RIGHT, BOTH};

        RelationPatternType type;
        String entity;
        String pattern;
        List<Token> tokens;
        Relation relation = null;

        public Pattern(RelationPatternType type, String entity, List<Token> tokens, Relation relation) {
            this.type = type;
            this.entity = entity;
            this.tokens = tokens;
            this.relation = relation;
            // Build the pattern string only when tokens are available.
            if (this.tokens != null)
                this.pattern = StringUtils.join(" ", this.tokens.toString());
        }
    }
I am reading the file from S3 the following way:
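    // `credentials` is created earlier (not shown).
    AmazonS3 s3Client = new AmazonS3Client(credentials);
    S3Object confidentPatternsObject = s3Client.getObject(new GetObjectRequest("xxx", "confidentPatterns"));
    Map<Pattern, Tuple2<Integer, Integer>> confidentPatterns;
    // try-with-resources closes the object stream and the underlying S3 stream.
    try (ObjectInputStream ois = new ObjectInputStream(confidentPatternsObject.getObjectContent())) {
        // This readObject() call is where the ClassNotFoundException is thrown.
        confidentPatterns = (Map<Pattern, Tuple2<Integer, Integer>>) ois.readObject();
    }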
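Since the failure happens inside readObject, one experiment that might help narrow it down is to control which class loader the stream uses. By default, ObjectInputStream resolves classes through a loader it picks from the calling stack, and that may not be the loader that holds the jar on an executor. The following is only a sketch under that assumption, not a confirmed fix: LoaderAwareObjectInputStream and roundTrip are hypothetical names, and the loader to pass in (for example the thread context class loader, or Pattern.class.getClassLoader()) is a guess that would need to be tested on the cluster.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.ObjectStreamClass;

/** Hypothetical helper: an ObjectInputStream that resolves classes through a given loader. */
public class LoaderAwareObjectInputStream extends ObjectInputStream {
    private final ClassLoader loader;

    public LoaderAwareObjectInputStream(InputStream in, ClassLoader loader) throws IOException {
        super(in);
        this.loader = loader;
    }

    @Override
    protected Class<?> resolveClass(ObjectStreamClass desc)
            throws IOException, ClassNotFoundException {
        try {
            // Look the class up in the supplied loader, without initializing it.
            return Class.forName(desc.getName(), false, loader);
        } catch (ClassNotFoundException e) {
            // Fall back to the default lookup (this also handles primitive types).
            return super.resolveClass(desc);
        }
    }

    /** Serializes a value in memory and reads it back using the given loader. */
    public static Object roundTrip(Object value, ClassLoader loader)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in = new LoaderAwareObjectInputStream(
                new ByteArrayInputStream(bytes.toByteArray()), loader)) {
            return in.readObject();
        }
    }
}
```

In the S3 snippet above, this would mean wrapping the object content in `new LoaderAwareObjectInputStream(stream, someLoader)` instead of a plain ObjectInputStream. If the exception disappears with one loader and not with another, that would at least indicate which loader can actually see Pattern on the workers.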
Later edit: I checked the classpath at runtime and the path to the jar was not on it. I added it for the executors, but I still have the same problem. I don't think the classpath was the cause anyway, since the Pattern class is in the same jar as the code that calls readObject.
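For anyone who wants to reproduce the runtime check, a diagnostic along these lines could be run on both the driver and an executor. It is a minimal sketch: ClassOriginCheck, describe and isVisible are hypothetical names, and it only reports what the JVM sees rather than fixing anything. It prints the java.class.path property, shows which loader defined a class and where it came from, and tests whether a given loader can find a class by name.

```java
import java.net.URL;
import java.security.CodeSource;

/** Hypothetical diagnostic helper; the names are illustrative only. */
public class ClassOriginCheck {

    /** Describes which loader defined the class and where it was loaded from. */
    public static String describe(Class<?> cls) {
        ClassLoader loader = cls.getClassLoader(); // null means the bootstrap loader
        CodeSource source = cls.getProtectionDomain().getCodeSource();
        URL location = (source == null) ? null : source.getLocation();
        return cls.getName() + " loader=" + loader + " location=" + location;
    }

    /** Reports whether the given loader can find a class by its fully qualified name. */
    public static boolean isVisible(String className, ClassLoader loader) {
        try {
            Class.forName(className, false, loader);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // The classpath the JVM was started with; jars added later by a
        // framework through its own class loaders may not appear here.
        System.out.println("java.class.path=" + System.getProperty("java.class.path"));
        System.out.println(describe(ClassOriginCheck.class));
        ClassLoader context = Thread.currentThread().getContextClassLoader();
        System.out.println("context loader=" + context);
        System.out.println("visible from context loader: "
                + isVisible(ClassOriginCheck.class.getName(), context));
    }
}
```

Calling `describe(Pattern.class)` and `isVisible` with the fully qualified name of Pattern from inside the static method, just before readObject, would show whether the class that instantiates fine and the class the stream fails to find are being looked up through the same loader.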