In the spark program, jackson is used to do json serialization and deserialization of scala objects. There are java.lang.NoClassDefFoundError and java.lang.AbstractMethodError errors at runtime. After searching the Internet, it is found that the version conflicts between jackson/guava and spark.
1. In idea, by adjusting the order of dependencies in modules, the problem is solved quickly:
Edit->File Structure->Modules->Dependencies
2. After publishing to the Spark cluster, for some reason, maven was not used to manage the project at the beginning, so we still want to adjust the dependency order on the third-party library to solve the problem:
spark-submit --class noc.train.Train --master yarn-client --num-executors 10 --executor-cores 2 --driver-memory 10g --executor-memory 12g --conf spark.yarn.executor.memoryOverhead=2048 \ --conf spark.executor.extraClassPath=./guava-15.0.jar:./jackson-annotations-2.4.4.jar:./jackson-core-2.4.4.jar:./jackson-databind-2.4.4.jar:./jackson-module-scala_2.10-2.4.4.jar \ --conf spark.driver.extraClassPath=/home/ck/lib/guava-15.0.jar:/home/ck/lib/jackson-annotations-2.4.4.jar:/home/ck/lib/jackson-core-2.4.4.jar:/home/ck/lib/jackson-databind-2.4.4.jar:/home/ck/lib/jackson-module-scala_2.10-2.4.4.jar \ --jars /home/ck/lib/guava-15.0.jar,/home/ck/lib/jackson-annotations-2.4.4.jar,/home/ck/lib/jackson-core-2.4.4.jar,/home/ck/lib/jackson-databind-2.4.4.jar,/home/ck/lib/jackson-module-scala_2.10-2.4.4.jar \ /home/ck/noc_scala.jar
But the mistake remains:
18/09/12 09:46:47 INFO scheduler.TaskSetManager: Lost task 198.0 in stage 7.0 (TID 897) on executor host-9-136: java.lang.NoClassDefFoundError (Could not initialize class noc.grid.Grid$) [duplicate 1] 18/09/12 09:46:47 WARN scheduler.TaskSetManager: Lost task 58.0 in stage 7.0 (TID 890, host-9-136): java.lang.AbstractMethodError: noce.grid.Grid$$anon$1.com$fasterxml$jackson$module$scala$experimental$ScalaObjectMapper$_setter_$com$fasterxml$jackson$module$scala$experimental$ScalaObjectMapper$$typeCache_$eq(Lorg/spark-project/guava/cache/LoadingCache;)V at com.fasterxml.jackson.module.scala.experimental.ScalaObjectMapper$class.$init$(ScalaObjectMapper.scala:50)
I had to turn to maven
3. Use maven's shade plug-in:
First, add the dependency and use the default compile scope
<dependency> <groupId>com.fasterxml.jackson.core</groupId> <artifactId>jackson-core</artifactId> <version>2.4.4</version> </dependency> <dependency> <groupId>com.fasterxml.jackson.core</groupId> <artifactId>jackson-databind</artifactId> <version>2.4.4</version> </dependency> <dependency> <groupId>com.fasterxml.jackson.core</groupId> <artifactId>jackson-annotations</artifactId> <version>2.4.4</version> </dependency> <dependency> <groupId>com.fasterxml.jackson.module</groupId> <artifactId>jackson-module-scala_2.10</artifactId> <version>2.4.4</version> </dependency>
Then add Maven shade plugin
<plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <version>3.1.0</version> <executions> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> <configuration> <relocations> <relocation> <pattern>com.fasterxml.jackson</pattern> <shadedPattern>noc.com.fasterxml.jackson</shadedPattern> </relocation> <relocation> <pattern>com.google.guava</pattern> <shadedPattern>noc.com.google.guava</shadedPattern> </relocation> </relocations> </configuration> </execution> </executions>
Done!