Big Data Cluster Monitoring Configuration Guide (4): Spark Monitoring with JMX

Mango | 2 years ago | Technical Articles | 2343


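The graphite_exporter approach
Graphite collects the metrics and Grafana builds the dashboards, so the first step is to configure Spark to report its metrics in Graphite format.
Prometheus provides an exporter, graphite_exporter, that converts Graphite metrics and exposes them for Prometheus to scrape; this is the approach used in this article.
First, download graphite_exporter from https://prometheus.io/download/: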
wget https://github.com/prometheus/graphite_exporter/releases/download/v0.13.1/graphite_exporter-0.13.1.linux-amd64.tar.gz
Extract the archive and rename the directory to graphite_exporter:
[root@hd1 exporters]# tar -xvf graphite_exporter-0.13.1.linux-amd64.tar.gz
graphite_exporter-0.13.1.linux-amd64/
graphite_exporter-0.13.1.linux-amd64/LICENSE
graphite_exporter-0.13.1.linux-amd64/NOTICE
graphite_exporter-0.13.1.linux-amd64/graphite_exporter
graphite_exporter-0.13.1.linux-amd64/getool
[root@hd1 exporters]# mv graphite_exporter-0.13.1.linux-amd64 graphite_exporter
Change into the graphite_exporter directory and create the graphite_exporter_mapping file:
vim graphite_exporter_mapping
Add the following content:
mappings:
- match: '*.*.executor.filesystem.*.*'
  name: spark_app_filesystem_usage
  labels:
    application: $1
    executor_id: $2
    fs_type: $3
    qty: $4
- match: '*.*.jvm.*.*'
  name: spark_app_jvm_memory_usage
  labels:
    application: $1
    executor_id: $2
    mem_type: $3
    qty: $4
- match: '*.*.executor.jvmGCTime.count'
  name: spark_app_jvm_gcTime_count
  labels:
    application: $1
    executor_id: $2
- match: '*.*.jvm.pools.*.*'
  name: spark_app_jvm_memory_pools
  labels:
    application: $1
    executor_id: $2
    mem_type: $3
    qty: $4
- match: '*.*.executor.threadpool.*'
  name: spark_app_executor_tasks
  labels:
    application: $1
    executor_id: $2
    qty: $3
- match: '*.*.BlockManager.*.*'
  name: spark_app_block_manager
  labels:
    application: $1
    executor_id: $2
    type: $3
    qty: $4
- match: '*.*.DAGScheduler.*.*'
  name: spark_app_dag_scheduler
  labels:
    application: $1
    executor_id: $2
    type: $3
    qty: $4
- match: '*.*.CodeGenerator.*.*'
  name: spark_app_code_generator
  labels:
    application: $1
    executor_id: $2
    type: $3
    qty: $4
- match: '*.*.HiveExternalCatalog.*.*'
  name: spark_app_hive_external_catalog
  labels:
    application: $1
    executor_id: $2
    type: $3
    qty: $4
- match: '*.*.*.StreamingMetrics.*.*'
  name: spark_app_streaming_metrics
  labels:
    application: $1
    executor_id: $2
    app_name: $3
    type: $4
    qty: $5
- match: '*.*.executor.filesystem.*.*'
  name: filesystem_usage
  labels:
    application: $1
    executor_id: $2
    fs_type: $3
    qty: $4
- match: '*.*.executor.threadpool.*'
  name: executor_tasks
  labels:
    application: $1
    executor_id: $2
    qty: $3
- match: '*.*.executor.jvmGCTime.count'
  name: jvm_gcTime_count
  labels:
    application: $1
    executor_id: $2
- match: '*.*.executor.*.*'
  name: executor_info
  labels:
    application: $1
    executor_id: $2
    type: $3
    qty: $4
- match: '*.*.jvm.*.*'
  name: jvm_memory_usage
  labels:
    application: $1
    executor_id: $2
    mem_type: $3
    qty: $4
- match: '*.*.jvm.pools.*.*'
  name: jvm_memory_pools
  labels:
    application: $1
    executor_id: $2
    mem_type: $3
    qty: $4
- match: '*.*.BlockManager.*.*'
  name: block_manager
  labels:
    application: $1
    executor_id: $2
    type: $3
    qty: $4
- match: '*.driver.DAGScheduler.*.*'
  name: DAG_scheduler
  labels:
    application: $1
    type: $2
    qty: $3
- match: '*.driver.*.*.*.*'
  name: task_info
  labels:
    application: $1
    task: $2
    type1: $3
    type2: $4
    qty: $5
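To make the mapping rules easier to read, the following is a minimal sketch of how a glob pattern such as '*.*.jvm.*.*' can be thought of as splitting a Graphite metric name into the captures $1..$n. It only illustrates the idea and is not the exporter's actual matcher; the function name match_glob and the sample metric name are hypothetical.

```shell
# Sketch only: mimic glob matching on dot-separated metric names.
# match_glob PATTERN NAME
# Prints one captured component per line and returns 0 on a match;
# returns 1 if a literal component or the component count differs.
# (POSIX sh has no local variables, so these names are global.)
match_glob() {
  pattern=$1 name=$2 captures=""
  while :; do
    # Peel off the first dot-separated component of each string.
    p_head=${pattern%%.*}
    n_head=${name%%.*}
    if [ "$p_head" = "*" ]; then
      # A '*' captures exactly one component.
      captures="$captures$n_head
"
    elif [ "$p_head" != "$n_head" ]; then
      return 1
    fi
    # Both strings must run out of components at the same time.
    case $pattern in *.*) p_more=1 ;; *) p_more=0 ;; esac
    case $name in *.*) n_more=1 ;; *) n_more=0 ;; esac
    [ "$p_more" = "$n_more" ] || return 1
    [ "$p_more" = 1 ] || break
    pattern=${pattern#*.}
    name=${name#*.}
  done
  printf '%s' "$captures"
}

# Hypothetical metric name in the shape <application>.<executor>.<source>...
match_glob '*.*.jvm.*.*' 'application_1700000000000_0001.driver.jvm.heap.used'
```

With the rule above, the four printed components would become the application, executor_id, mem_type and qty labels. If the exporter applies the first rule that matches, as its documentation describes, then a later rule that repeats an earlier match pattern would not take effect, which is worth checking against the list above.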
Start graphite_exporter (once it starts successfully, stop the process and set it up as a service):
./graphite_exporter --graphite.mapping-config=graphite_exporter_mapping
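While the exporter is running in the foreground, it can be smoke-tested by hand. The sketch below builds a line in the Graphite plaintext format (metric path, value, Unix timestamp); the helper name graphite_line and the test metric are hypothetical, and the commented commands assume that nc and curl are installed and that the exporter listens on hd1 with the ports used in this article.

```shell
# graphite_line NAME VALUE [TIMESTAMP]
# Prints one Graphite plaintext line: "<metric path> <value> <timestamp>".
# The timestamp defaults to the current time.
graphite_line() {
  printf '%s %s %s\n' "$1" "$2" "${3:-$(date +%s)}"
}

# Show what would be sent (fixed timestamp so the line is reproducible).
graphite_line test_app.driver.jvm.heap.used 42 1700000000

# To send a fresh sample to a running exporter and look for the result
# (assumptions: nc and curl are available, exporter is on hd1):
#   graphite_line test_app.driver.jvm.heap.used 42 | nc -w 1 hd1 9109
#   curl -s http://hd1:9108/metrics | grep spark_app_jvm_memory_usage
```

If the mapping file is loaded, the test sample should show up under the metric name from the matching rule rather than under its raw Graphite path.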
Configure it as a systemd service:
vim /etc/systemd/system/graphite_exporter.service
[Unit]
Description=graphite_exporter
Documentation=https://prometheus.io/
After=network.target
[Service]
Type=simple
User=root
ExecStart=/opt/dtstack/exporters/graphite_exporter/graphite_exporter --graphite.mapping-config=/opt/dtstack/exporters/graphite_exporter/graphite_exporter_mapping
Restart=on-failure
[Install]
WantedBy=multi-user.target
Start the graphite_exporter service and enable it at boot:
systemctl daemon-reload
systemctl start graphite_exporter
systemctl status graphite_exporter
systemctl enable graphite_exporter 

Configure Prometheus
vim /opt/dtstack/prometheus-2.33.3/prometheus.yml
Add the following scrape job under scrape_configs:
  - job_name: 'graphite_exporter'
    static_configs:
    - targets:
      - 'hd1:9108'
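Configuration copied from a web page can pick up typographic quotes, which YAML does not treat as string delimiters. The following is a small sketch of a check that can be run before restarting Prometheus; the helper name has_typographic_quotes is hypothetical, and the promtool path in the comment assumes promtool sits next to the Prometheus binary.

```shell
# has_typographic_quotes FILE
# Prints the offending lines (with line numbers) and returns 0 when FILE
# contains curly single or double quotes; returns non-zero otherwise.
has_typographic_quotes() {
  grep -n -e '‘' -e '’' -e '“' -e '”' "$1"
}

# Demonstration on a throwaway file with one badly quoted target.
tmp=$(mktemp)
printf "      - ‘hd1:9108'\n" > "$tmp"
has_typographic_quotes "$tmp"
rm -f "$tmp"

# On the real file (paths follow this article; promtool location is assumed):
#   has_typographic_quotes /opt/dtstack/prometheus-2.33.3/prometheus.yml
#   /opt/dtstack/prometheus-2.33.3/promtool check config /opt/dtstack/prometheus-2.33.3/prometheus.yml
```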




Restart Prometheus
Run the following on the Prometheus server:
systemctl restart prometheus
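Configure Spark to Send Graphite Metrics
Spark ships with a built-in Graphite sink, so only metrics.properties needs to be configured.
Go to the conf directory under the Spark installation directory and locate metrics.properties: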
cd /opt/spark/conf/
vim metrics.properties
Add the following content (graphite_exporter receives Graphite data on port 9109):
*.source.jvm.class=org.apache.spark.metrics.source.JvmSource
master.source.jvm.class=org.apache.spark.metrics.source.JvmSource
worker.source.jvm.class=org.apache.spark.metrics.source.JvmSource
driver.source.jvm.class=org.apache.spark.metrics.source.JvmSource
executor.source.jvm.class=org.apache.spark.metrics.source.JvmSource
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.protocol=tcp
# Replace hd1 with the hostname of the machine running graphite_exporter
*.sink.graphite.host=hd1
*.sink.graphite.port=9109
*.sink.graphite.period=1
*.sink.graphite.unit=seconds 
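Before submitting a job, it can help to confirm which endpoint the sink settings actually point at. The sketch below reads the host and port back out of a properties file; the function name sink_endpoint is hypothetical, and the reachability check in the comment assumes an nc variant that supports -z.

```shell
# sink_endpoint FILE
# Prints "<host>:<port>" taken from the *.sink.graphite.host and
# *.sink.graphite.port keys; returns non-zero if either key is missing.
sink_endpoint() {
  awk -F= '
    $1 == "*.sink.graphite.host" { host = $2; sub(/[ \t\r]+$/, "", host) }
    $1 == "*.sink.graphite.port" { port = $2; sub(/[ \t\r]+$/, "", port) }
    END {
      if (host != "" && port != "") print host ":" port
      else exit 1
    }
  ' "$1"
}

# Demonstration on a throwaway copy of the sink settings.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
*.sink.graphite.class=org.apache.spark.metrics.sink.GraphiteSink
*.sink.graphite.host=hd1
*.sink.graphite.port=9109
EOF
sink_endpoint "$tmp"
rm -f "$tmp"

# On the real file, followed by an optional reachability check (assumed nc flags):
#   sink_endpoint /opt/spark/conf/metrics.properties
#   nc -z -w 2 hd1 9109 && echo "graphite_exporter is reachable"
```

The printed value should match the address where graphite_exporter accepts Graphite data.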

Start the Spark Application
When launching a Spark application, add the --files /opt/spark/conf/metrics.properties option:
./bin/spark-submit --class org.apache.spark.examples.SparkPi --master yarn --deploy-mode cluster --files /opt/spark/conf/metrics.properties --executor-cores 1 --queue default examples/jars/spark-examples_2.12-3.3.1.jar 10
For spark-shell:
./bin/spark-shell --files /opt/spark/conf/metrics.properties
Check whether the metrics have been collected by opening the graphite_exporter endpoint that Prometheus scrapes:
http://hd1:9108/metrics 
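The same check can be done from a terminal. The sketch below lists the distinct metric names found in Prometheus text-format output read from standard input; the function name metric_names and the sample lines are hypothetical, and the commented command assumes curl is installed.

```shell
# metric_names
# Reads Prometheus text exposition format on stdin and prints each distinct
# metric name once, skipping comment lines (# HELP / # TYPE) and blank lines.
metric_names() {
  awk '!/^#/ && NF { name = $0; sub(/[{ ].*/, "", name); print name }' | sort -u
}

# Demonstration on made-up sample lines.
metric_names <<'EOF'
# TYPE spark_app_jvm_memory_usage gauge
spark_app_jvm_memory_usage{application="app1",executor_id="driver",mem_type="heap",qty="used"} 1.2e+08
spark_app_jvm_memory_usage{application="app1",executor_id="1",mem_type="heap",qty="used"} 9.1e+07
spark_app_executor_tasks{application="app1",executor_id="1",qty="activeTasks"} 2
EOF

# Against the running exporter (assumes curl is available):
#   curl -s http://hd1:9108/metrics | metric_names | grep '^spark_app_'
```

An empty result after a job has started suggests that Spark is not reaching port 9109 or that no mapping rule matched.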

