# Apache Spark plugin

# Introduction

**Ver**. 03-09-2021

Plugin to grab metrics from all executors of all running or finished apps on your Spark server.

**Type**: Server plug-in
# Compatibility matrix

| | |
|---|---|
| **Systems where it has been tested** | CentOS 7, Fedora |
| **Systems where it should work** | Any Linux system |

# Prerequisites

Required:

- A Spark server
- Spark history server enabled
- Pandora FMS Data Server enabled
- Pandora FMS Plugin Server enabled

# Configuration

The plugin uses several Spark REST API endpoints. To reach them from the plugin, the following ports must be open and not blocked by the firewall:

```
firewall-cmd --permanent --zone=public --add-port=6066/tcp
firewall-cmd --permanent --zone=public --add-port=7077/tcp
firewall-cmd --permanent --zone=public --add-port=8080-8081/tcp
firewall-cmd --permanent --zone=public --add-port=4040/tcp
firewall-cmd --permanent --zone=public --add-port=18080/tcp
firewall-cmd --reload
```

**6066**: REST URL (cluster mode).

**7077**: Master server.

**8080**: Web UI.

**4040**: For running applications.

**18080**: For the history server.

To use the history server, we have to enable spark.eventLog.enabled, spark.eventLog.dir and spark.history.fs.logDirectory in spark-defaults.conf. You can find a conf template in /conf.

[![1.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/1.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/1.png)

We will create the file in that path with:

```
vi spark-defaults.conf
```

We will leave it as in the following example; you can choose the path where you want to save the events.

[![2.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/2.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/2.png)

Now we can activate the history server from /sbin, the same path where the master, workers, etc. are started.

[![3.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/3.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/3.png)

We will start it with:

```
./start-history-server.sh
```

[![4.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/4.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/4.png)

If we open the log file named in the output, we will see that the server has started correctly, along with its URL.

[![5.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/5.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/5.png)

If we access the URL we will see the history server.

[![6.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/6.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/6.png)

**Note:** For the plugin to work, the master server must be active and there must be applications that are running or that have already finished, because the metrics are taken from the applications, specifically from their executors.

# Plugin general parameters

```
./pandora_spark -i <ip-with-port> [ -g GROUP ] [ --data_dir DATA_DIR ]
```

If the execution is correct we will see a 1.

[![7.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/7.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/7.png)

To get data from running applications, enter the IP with port 4040; to get data from finished applications, enter port 18080.

# Plugin specific parameters

The plugin has the following parameters:

| **Parameter** | **Description** |
|---|---|
| -i <ip-with-port>, --ip <ip-with-port> | IP with port, mandatory. |
| -g GROUP, --group GROUP | Pandora FMS target group (optional) |
| --data\_dir DATA\_DIR | Pandora FMS data directory. By default it is /var/spool/pandora/data\_in/ (optional) |

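As an illustration of how the `ip-with-port` value described above relates to the Spark REST API, the following minimal sketch builds the two monitoring URLs that Spark documents for listing applications and their executors. It is not the plugin's code: the helper names, the sample address and the sample application id are hypothetical, and it is only an assumption that the plugin queries these particular endpoints.

```python
from urllib.parse import urlunsplit

# Base path of Spark's monitoring REST API.
API_PREFIX = "/api/v1"


def applications_url(ip_with_port: str) -> str:
    """Return the URL that lists the applications known to this endpoint."""
    return urlunsplit(("http", ip_with_port, f"{API_PREFIX}/applications", "", ""))


def executors_url(ip_with_port: str, app_id: str) -> str:
    """Return the URL that lists the executors of one application."""
    path = f"{API_PREFIX}/applications/{app_id}/executors"
    return urlunsplit(("http", ip_with_port, path, "", ""))


if __name__ == "__main__":
    # Hypothetical address: port 4040 for running applications,
    # port 18080 for finished ones served by the history server.
    print(applications_url("192.168.1.10:18080"))
    print(executors_url("192.168.1.10:18080", "app-20210903120000-0001"))
```

Opening the first URL in a browser or with `curl` is a quick way to confirm that the chosen port is reachable before running the plugin.
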
Help example:

[![8.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/8.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/8.png)

# Configuration in Pandora

**Installation from the console**

To register the plugin from the console, go to the "register plugin" section.

![register_plugin.png](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/scaled-1680-/register-plugin.png)

Click on select file.

[![register_plugin2.png](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/scaled-1680-/register-plugin2.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/register-plugin2.png)

Select the .pspz2 file containing the plugin.

[![register1_spark.png](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/scaled-1680-/register1-spark.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/register1-spark.png)

Once uploaded, a message will be displayed indicating that it has been successfully uploaded.

[![register2_spark.png](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/scaled-1680-/register2-spark.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/register2-spark.png)

Once the plugin is registered, we will see it in the plugins section.

[![serversingles.png](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/scaled-1680-/serversingles.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/serversingles.png)

In the parameters section, all the parameters contained in the plugin will be displayed, although only the ip parameter is mandatory.

[![register3_spark.png](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/scaled-1680-/register3-spark.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/register3-spark.png)

Below we can assign the required value to each macro.

[![register4_plugin.png](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/scaled-1680-/register4-plugin.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2022-04/register4-plugin.png)

The recommended place to manage server plugins in Pandora is "/usr/share/pandora\_server/util/plugin", so we will send the plugin to that path and then move to the folder where we have put it.

Remember: the dependencies that the Requests module needs must be installed on your system, as explained in the configuration section.

We move from home with:

```
cd /usr/share/pandora_server/util/plugin/
```

We run the plugin to check that it works:

```
./pandora_spark -i <ip-with-port> [ -g GROUP ] [ --data_dir DATA_DIR ]
```

[![7.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/7.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/7.png)

**As a server plugin**

Go to servers > plugins:

[![image-1629974405286.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-08/scaled-1680-/image-1629974405286.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-08/image-1629974405286.png)

Click on add:

[![image-1629974430627.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-08/scaled-1680-/image-1629974430627.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-08/image-1629974430627.png)

We enter a name and description of our choice:

[![9.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/9.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/9.png)

As the command we enter the path to the plugin, and as parameters the ones we used when executing the plugin. The "\_field\_" fields are macros defined below.

[![10.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/10.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/10.png)

For each macro, we enter the description we prefer and, as its value, the data of our IP.

[![11.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/11.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/11.png)

Once this is done, if we execute the plugin from the terminal, the agent with the modules will have been created.

# Modules generated

An agent will be created for each application on our server, with data from all its executors. The name of each module is composed of the id of the executor plus its function.

**Executor modules**

| **Module name** |
|---|
| id |
| hostPort |
| rddBlocks |
| memoryUsed |
| diskUsed |
| activeTasks |
| failedTasks |
| completedTasks |
| totalTasks |
| totalDuration |
| totalInputBytes |
| totalShuffleRead |
| totalShuffleWrite |
| maxMemory |

[![sparkmodulos.png](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/scaled-1680-/sparkmodulos.png)](https://pandorafms.com/guides/public/uploads/images/gallery/2021-09/sparkmodulos.png)
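
The naming rule described above (executor id plus function) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the space used as separator and the sample record are hypothetical, and the plugin's real module names may be formatted differently.

```python
# Executor fields that the table above lists as modules.
EXECUTOR_FIELDS = [
    "id", "hostPort", "rddBlocks", "memoryUsed", "diskUsed",
    "activeTasks", "failedTasks", "completedTasks", "totalTasks",
    "totalDuration", "totalInputBytes", "totalShuffleRead",
    "totalShuffleWrite", "maxMemory",
]


def executor_modules(executor: dict) -> dict:
    """Map one executor record to {module name: value}.

    The module name is the executor id followed by the field name.
    The space separator is an assumption made for this sketch.
    """
    executor_id = executor["id"]
    return {
        f"{executor_id} {field}": executor[field]
        for field in EXECUTOR_FIELDS
        if field in executor  # skip fields missing from the record
    }


if __name__ == "__main__":
    # Hypothetical executor record, shaped like an entry returned by
    # the executors endpoint of the Spark REST API.
    sample = {"id": "driver", "hostPort": "10.0.0.5:40123", "memoryUsed": 1024}
    for name, value in executor_modules(sample).items():
        print(name, "=", value)
```
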