Commit 99c48bf

Update to the README
1 parent bb6440b commit 99c48bf

1 file changed

Lines changed: 22 additions & 23 deletions

README.md

@@ -2,22 +2,20 @@
22
![SparkPlugins CI](https://github.com/cerndb/SparkPlugins/workflows/SparkPlugins%20CI/badge.svg?branch=master&event=push)
33
[![Maven Central](https://maven-badges.herokuapp.com/maven-central/ch.cern.sparkmeasure/spark-plugins_2.12/badge.svg)](https://maven-badges.herokuapp.com/maven-central/ch.cern.sparkmeasure/spark-plugins_2.12)
44

5-
Use Spark Plugins to extend Apache Spark with custom metrics and executors' startup actions.
6-
5+
**Spark Plugins** are an Apache Spark feature for extending Spark with custom metrics and actions.
6+
This repository provides ready-to-use examples for deploying Spark plugins across various use cases.
77
### Key Features
88

9-
- **Spark Plugins** are a mechanism to extend Apache Spark with custom code for metrics and actions.
10-
- This repository provides examples of plugins that you can deploy to extend Spark with custom metrics and actions.
11-
- **Extending Spark instrumentation** with custom metrics
12-
- **Running custom actions** when the executors start up, typically useful for integrating with
13-
external systems, such as monitoring systems.
14-
- This repo provides code and examples of plugins applied to measuring Spark on cluster resources (YARN, K8S, Standalone),
15-
including measuring Spark I/O from cloud Filesystems, OS metrics, custom application metrics, and integrations with external systems like Pyroscope.
16-
- The code in this repo is for Spark 3.x. For Spark 2.x, see instead [Executor Plugins for Spark 2.4](https://github.com/cerndb/SparkExecutorPlugins2.4)
9+
* **Custom Metrics:** Extend Spark's instrumentation with user-defined metrics.
10+
* **Executor Actions:** Trigger custom actions upon executor startup, useful for integrations (e.g., monitoring systems).
11+
* **Resource Monitoring:** Measure Spark’s usage of cluster resources (YARN, K8S, Standalone).
12+
* **I/O Metrics:** Monitor I/O performance from cloud filesystems, OS metrics, and custom application metrics.
13+
* **External Integrations:** Connect with external systems like Pyroscope for performance insights.
1714

1815
### Contents
1916
- [Getting started](#getting-started---your-first-spark-plugins)
2017
- [Demo and basic plugins](#demo-and-basic-plugins)
18+
- [Implementation notes](#implementation-notes)
2119
- [Plugin for integrating Pyroscope with Spark](#plugin-for-integrating-with-pyroscope)
2220
- [Plugin for OS metrics instrumentation with Cgroups for Spark on Kubernetes](#os-metrics-instrumentation-with-cgroups-for-spark-on-kubernetes)
2321
- [Plugin to collect I/O storage statistics for HDFS and Hadoop-compatible filesystems](#plugins-to-collect-io-storage-statistics-for-hdfs-and-hadoop-compatible-filesystems)
@@ -33,19 +31,6 @@ Use Spark Plugins to extend Apache Spark with custom metrics and executors' star
3331

3432
Author and contact: Luca.Canali@cern.ch
3533

36-
### Implementation Notes:
37-
- Spark plugins implement the `org.apache.spark.api.Plugin` interface, they can be written in Scala or Java
38-
and can be used to run custom code at the startup of Spark executors and driver.
39-
- Plugins basic configuration: `--conf spark.plugins=<list of plugin classes>`
40-
- Plugin JARs need to be made available to Spark executors
41-
- you can distribute the plugin code to the executors using `--jars` and `--packages`
42-
- for K8S you can also consider making the jars available directly in the container image
43-
- Most of the Plugins described in this repo are intended to extend the Spark Metrics System
44-
- See the details on the Spark metrics system at [Spark Monitoring documentation](https://spark.apache.org/docs/latest/monitoring.html#metrics).
45-
- You can find the metrics generated by the plugins in the Spark metrics system stream under the
46-
namespace `namespace=plugin.<Plugin Class Name>`
47-
- See also: [SPARK-29397](https://issues.apache.org/jira/browse/SPARK-29397), [SPARK-28091](https://issues.apache.org/jira/browse/SPARK-28091), [SPARK-32119](https://issues.apache.org/jira/browse/SPARK-32119).
48-
4934
---
5035
## Getting Started - Your First Spark Plugins
5136
- Deploy the code of the Spark plugins described here using from maven central
@@ -76,6 +61,20 @@ Author and contact: Luca.Canali@cern.ch
7661
```
7762
- You can see if the plugin has run by checking that the file `/tmp/plugin.txt` has been
7863
created on the executor machines.
64+
---
65+
### Implementation Notes:
66+
- Spark plugins implement the `org.apache.spark.api.Plugin` interface, they can be written in Scala or Java
67+
and can be used to run custom code at the startup of Spark executors and driver.
68+
- Plugins basic configuration: `--conf spark.plugins=<list of plugin classes>`
69+
- Plugin JARs need to be made available to Spark executors
70+
- you can distribute the plugin code to the executors using `--jars` and `--packages`
71+
- for K8S you can also consider making the jars available directly in the container image
72+
- Most of the Plugins described in this repo are intended to extend the Spark Metrics System
73+
- See the details on the Spark metrics system at [Spark Monitoring documentation](https://spark.apache.org/docs/latest/monitoring.html#metrics).
74+
- You can find the metrics generated by the plugins in the Spark metrics system stream under the
75+
namespace `namespace=plugin.<Plugin Class Name>`
76+
- See also: [SPARK-29397](https://issues.apache.org/jira/browse/SPARK-29397), [SPARK-28091](https://issues.apache.org/jira/browse/SPARK-28091), [SPARK-32119](https://issues.apache.org/jira/browse/SPARK-32119).
77+
7978
---
8079
## Plugins in this Repository
8180
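
Note on the relocated Implementation Notes: they describe enabling plugins with `--conf spark.plugins=<list of plugin classes>`, shipping the code with `--jars` or `--packages`, and finding the resulting metrics under `plugin.<Plugin Class Name>`. In Spark 3.x the plugin interface is `org.apache.spark.api.plugin.SparkPlugin`. The following is a minimal sketch of how such a `spark-submit` command line could be assembled. The helper names, the class `com.example.MyPlugin`, and the application file `app.py` are hypothetical and not part of this repository; the Maven group and artifact are taken from the badge in the README, and the version is left as a placeholder.

```python
import shlex


def build_spark_submit(plugin_classes, packages=(), jars=(), app="app.py"):
    """Assemble a spark-submit argument list that enables the given plugins.

    Hypothetical helper for illustration only; it does not launch Spark.
    """
    if not plugin_classes:
        raise ValueError("at least one plugin class is required")
    args = ["spark-submit"]
    if packages:
        # --packages takes comma-separated Maven coordinates
        args += ["--packages", ",".join(packages)]
    if jars:
        # --jars takes a comma-separated list of JAR paths
        args += ["--jars", ",".join(jars)]
    # spark.plugins takes a comma-separated list of plugin class names
    args += ["--conf", "spark.plugins=" + ",".join(plugin_classes)]
    args.append(app)
    return args


def metrics_namespace(plugin_class):
    """Namespace prefix as described in the notes: plugin.<Plugin Class Name>."""
    return "plugin." + plugin_class


if __name__ == "__main__":
    cmd = build_spark_submit(
        ["com.example.MyPlugin"],  # hypothetical plugin class
        packages=["ch.cern.sparkmeasure:spark-plugins_2.12:<version>"],  # version elided
    )
    print(shlex.join(cmd))
    print(metrics_namespace("com.example.MyPlugin"))
```

Building the command as a list rather than a single string keeps each option and its value as separate arguments, so it can be passed directly to a process launcher without extra quoting. On Kubernetes, the notes suggest baking the JARs into the container image instead, in which case `packages` and `jars` would be left empty.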
