I/O Metrics Missing #7

@Dzeri96

I managed to get some of the execution metrics to show up in Grafana, but more than half are missing. I'm using Spark 3.5.2 on Kubernetes, and these are the relevant parts of the config:

# spark-dashboard
spark.metrics.conf.*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
spark.metrics.conf.*.sink.graphite.host=spark-dashboard.spark-dashboard.svc.cluster.local
spark.metrics.conf.*.sink.graphite.port=2003
spark.metrics.conf.*.sink.graphite.period=10
spark.metrics.conf.*.sink.graphite.unit=seconds
spark.metrics.conf.*.sink.graphite.prefix=lucatest
# Enable JVM metrics collection
spark.metrics.conf.*.source.jvm.class=org.apache.spark.metrics.source.JvmSource

spark.metrics.staticSources.enabled                 true
spark.metrics.appStatusSource.enabled               true
spark.executor.processTreeMetrics.enabled           true

spark.jars.packages=ch.cern.sparkmeasure:spark-measure_2.12:0.27,ch.cern.sparkmeasure:spark-plugins_2.12:0.4
spark.plugins=ch.cern.HDFSMetrics,ch.cern.CgroupMetrics,ch.cern.CloudFSMetrics
spark.cernSparkPlugin.cloudFsName                   s3a

The driver is running on the machine I'm executing spark-submit from.

As you can see in the screenshot, some data is being reported incorrectly, while other data is simply missing. The "extended" dashboard is almost completely empty.

Image

The workload I'm running reads the TPC-DS store_sales table at scale factor 1000 and saves it using Iceberg. In the Spark dashboard I can see the data being read and written, including the shuffle stage.
As far as I remember, sparkMeasure gives correct numbers at the end of the job's run.
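
One way to narrow down which metrics actually leave the driver and executors is to look at the raw lines sent to the Graphite endpoint on port 2003. The following is a minimal sketch, not part of the original report. It assumes the sink sends the Graphite plaintext format (`<path> <value> <timestamp>`) and that metric paths follow the layout `<prefix>.<namespace>.<instance>.<metric>`, where the instance is `driver` or an executor ID. That layout is an assumption, and the helper names `parse_graphite_line` and `metrics_by_instance` are hypothetical.

```python
from collections import defaultdict


def parse_graphite_line(line):
    """Parse one Graphite plaintext line: '<path> <value> <timestamp>'.

    Returns (path, value, timestamp), or None if the line is malformed.
    """
    parts = line.split()
    if len(parts) != 3:
        return None
    path, value, timestamp = parts
    try:
        return path, float(value), int(float(timestamp))
    except ValueError:
        return None


def metrics_by_instance(lines, prefix="lucatest"):
    """Group metric names by reporting instance.

    Assumes (not confirmed by the issue) that paths look like
    '<prefix>.<namespace>.<instance>.<metric>'.
    """
    seen = defaultdict(set)
    for line in lines:
        parsed = parse_graphite_line(line)
        if parsed is None:
            continue  # skip malformed lines
        fields = parsed[0].split(".", 3)
        if len(fields) < 4 or fields[0] != prefix:
            continue  # skip paths that do not match the assumed layout
        _, _namespace, instance, metric = fields
        seen[instance].add(metric)
    return dict(seen)
```

Feeding this a capture of the traffic to port 2003 would show whether the missing series are never emitted by a given executor or are emitted and then lost further downstream.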
