Spark debugging
Changing the logging level for Spark driver logging:
by passing spark.driver.extraJavaOptions
programmatically (note: in client mode the driver JVM has already started by the time application code runs, so driver options set this way only take effect in cluster mode):
import org.apache.spark.SparkConf

val sparkConf = new SparkConf().setAppName(jobName)
  .set("spark.driver.extraJavaOptions",
    "-Dlog4jspark.root.logger=DEBUG,TimeRollingHourly")
from spark-submit command line:
spark-submit [other args] --conf "spark.driver.extraJavaOptions=-Dlog4j.configuration=log4j.properties"
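The TimeRollingHourly appender referenced above must be defined in the log4j.properties file the JVM loads. A minimal sketch of what such a file might look like, assuming a log4j 1.x DailyRollingFileAppender and a hypothetical log path (adjust both to your cluster):

log4jspark.root.logger=INFO,TimeRollingHourly
log4j.rootLogger=${log4jspark.root.logger}
log4j.appender.TimeRollingHourly=org.apache.log4j.DailyRollingFileAppender
# the hour field in the date pattern makes the appender roll the file hourly
log4j.appender.TimeRollingHourly.DatePattern='.'yyyy-MM-dd-HH
log4j.appender.TimeRollingHourly.File=/var/log/spark/spark.log
log4j.appender.TimeRollingHourly.layout=org.apache.log4j.PatternLayout
log4j.appender.TimeRollingHourly.layout.ConversionPattern=%d{ISO8601} %p %c: %m%n

Because log4j.rootLogger is read from ${log4jspark.root.logger}, passing -Dlog4jspark.root.logger=DEBUG,TimeRollingHourly overrides the default level at launch.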
For standalone mode, the following will work:
import org.apache.log4j.{Level, Logger}
import org.apache.spark.SparkContext

// Use the prefix of the namespace (package name) whose logging level you want to change.
Logger.getLogger("org").setLevel(Level.OFF)
Logger.getLogger("com.microsoft").setLevel(Level.OFF)

// sparkConf as defined above; set the levels before the context is created.
SparkContext.getOrCreate(sparkConf)
Changing the logging level for Spark executor logging:
programmatically:
val sparkConf = new SparkConf().setAppName(jobName)
  .set("spark.executor.extraJavaOptions",
    "-Dlog4jspark.root.logger=DEBUG,TimeRollingHourly")
from spark-submit command line:
spark-submit [other args] --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties"
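NOTE: -Dlog4j.configuration=log4j.properties only resolves on an executor if the file is actually present there (the executor's working directory is on its classpath). A common way to ship it, sketched here with a hypothetical local path, is the --files flag:

spark-submit [other args] \
  --files /local/path/log4j.properties \
  --conf "spark.executor.extraJavaOptions=-Dlog4j.configuration=log4j.properties"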
NOTE: other driver/executor-specific JVM options can be passed in a similar fashion.
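For example, to capture a heap dump when the driver or an executor runs out of memory (the same flag used in the GC example below):

spark-submit [other args] \
  --conf "spark.driver.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError" \
  --conf "spark.executor.extraJavaOptions=-XX:+HeapDumpOnOutOfMemoryError"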
Spark Cassandra connector
Enabling Verbose Tracing for Spark Cassandra Connector
Pass the following extra command-line arguments to spark-submit to get verbose logging for the Spark Cassandra Connector.
For driver:
--driver-java-options "-Dlog4j.configuration=file:/usr/hdp/current/spark-client/conf/log4j.properties -Dlog4jspark.root.logger=INFO,TimeRollingHourly"
For executor:
--conf "spark.executor.extraJavaOptions=-Dlog4jspark.root.logger=INFO,TimeRollingHourly -Dlog4j.configuration=file:/usr/hdp/current/spark-client/conf/log4j.properties -XX:-UseParallelGC -XX:+UseG1GC -XX:+PrintGCDetails -XX:+PrintGCTimeStamps -XX:+HeapDumpOnOutOfMemoryError"
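Alternatively, verbosity can be raised for just the connector from application code, using the same Logger approach as above; a minimal sketch, assuming the connector's loggers live under the com.datastax.spark.connector package:

import org.apache.log4j.{Level, Logger}

// Raise the level only for the connector's namespace, leaving the
// root logger configuration untouched.
Logger.getLogger("com.datastax.spark.connector").setLevel(Level.DEBUG)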