1. Install Jupyter
2. Install sparkmagic
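After steps 1 and 2, a quick check can confirm that both packages are importable from the Python environment the notebook will use. This is only a sketch: the import names `notebook` and `sparkmagic` are assumptions about how the packages were installed (a JupyterLab-only setup, for example, may not provide `notebook`), and `missing_packages` is a hypothetical helper, not part of either project.

```python
from importlib.util import find_spec


def missing_packages(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


# Assumed import names for the packages installed in steps 1 and 2.
REQUIRED = ["notebook", "sparkmagic"]

missing = missing_packages(REQUIRED)
if missing:
    print("Not importable:", ", ".join(missing))
else:
    print("All required packages are importable.")
```

If anything is reported as not importable, the install was probably done into a different environment than the one running this check.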
3. Set the timeout (in ~/.sparkmagic/config.json)
"livy_server_heartbeat_timeout_seconds": 0,
4. Set cluster mode (in ~/.sparkmagic/config.json)
"session_configs": {
  "driverMemory": "2G",
  "executorCores": 4,
  "proxyUser": "bernhard",
  "conf": {
    "spark.master": "yarn-cluster",
    "spark.jars.packages": "com.databricks:spark-csv_2.10:1.5.0"
  }
}
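The settings from steps 3 and 4 can also be merged into the config file with a small script instead of editing it by hand. The following is a minimal sketch, assuming the file is plain JSON; `deep_merge` and `update_config` are hypothetical helpers written for this note, and the pre-existing `executorMemory` entry in the demo is made up purely to show that existing keys survive the merge. The demo writes to a temporary file so nothing real is overwritten.

```python
import json
import tempfile
from pathlib import Path

# Values copied from steps 3 and 4 of these notes.
OVERRIDES = {
    "livy_server_heartbeat_timeout_seconds": 0,
    "session_configs": {
        "driverMemory": "2G",
        "executorCores": 4,
        "proxyUser": "bernhard",
        "conf": {
            "spark.master": "yarn-cluster",
            "spark.jars.packages": "com.databricks:spark-csv_2.10:1.5.0",
        },
    },
}


def deep_merge(base, overrides):
    """Return a copy of `base` with `overrides` merged in, recursing into nested dicts."""
    merged = dict(base)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def update_config(path, overrides):
    """Merge `overrides` into the JSON file at `path`, creating the file if it is missing."""
    path = Path(path)
    current = json.loads(path.read_text()) if path.exists() else {}
    merged = deep_merge(current, overrides)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(merged, indent=2))
    return merged


# Demo against a throwaway file; the initial content is a made-up example.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / "config.json"
    demo_path.write_text(json.dumps({"session_configs": {"executorMemory": "1G"}}))
    result = update_config(demo_path, OVERRIDES)
    print(json.dumps(result, indent=2))
```

To apply the same merge to the real file, one would call `update_config(Path.home() / ".sparkmagic" / "config.json", OVERRIDES)`, ideally after backing up the existing file.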
5. Configure HDP (Livy server setting)
livy.server.csrf_protection.enabled ==> false
6. Start the notebook and configure the Spark session
%load_ext sparkmagic.magics
%manage_spark
References
https://github.com/jupyter-incubator/sparkmagic
https://github.com/bernhard-42/Sparkmagic-on-HDP