This repository was archived by the owner on Jan 29, 2025. It is now read-only.
Missing Metrics #580
Open
Description
Hey guys,
Wondering if someone could assist with an issue I'm having with BigGraphite [BG]. It currently receives a large number of metrics, but appears to drop a noticeable proportion of them at random. This became apparent when looking at metrics from Apache Spark, which show frequent one-minute gaps every hour.
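To put a number on the gaps, a minimal sketch along these lines could list the minutes that have no datapoint. It assumes the datapoint timestamps for one metric have already been exported as Unix epoch seconds and that the expected resolution is 60 seconds. The function name `find_missing_steps` and the sample timestamps are hypothetical and not part of BigGraphite.

```python
from typing import Iterable, List


def find_missing_steps(timestamps: Iterable[int], step: int = 60) -> List[int]:
    """Return the start times of step-sized buckets that contain no datapoint.

    Timestamps are aligned down to a multiple of `step`, then every bucket
    between the first and last observed bucket is checked.
    """
    buckets = sorted({ts - ts % step for ts in timestamps})
    if not buckets:
        return []
    present = set(buckets)
    return [
        t
        for t in range(buckets[0], buckets[-1] + step, step)
        if t not in present
    ]


# Hypothetical sample: a five-minute window with the third minute missing.
sample = [1600000020, 1600000080, 1600000200, 1600000260]
print(find_missing_steps(sample))
```

Running this over an hour of exported timestamps for a Spark metric and for a metric from another source would show whether the loss is specific to Spark or spread across everything.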
Infrastructure Setup:
- Within EKS (1.20)
- internal AWS NLB
- Traffic Flow: NLB -> Carbon Container -> {elasticsearch + cassandra}
- Carbon: Running inside an upstream Alpine container
- ps output from inside the container:
1 root 0:00 {entrypoint} /bin/sh /entrypoint
49 root 0:00 runsvdir -P /etc/service
51 root 0:00 runsv bg-carbon
52 root 0:03 runsv brubeck
53 root 0:00 runsv carbon
54 root 0:00 runsv carbon-aggregator
55 root 0:03 runsv carbon-relay
56 root 0:03 runsv collectd
57 root 0:00 runsv cron
58 root 0:00 runsv go-carbon
59 root 0:00 runsv graphite
60 root 0:00 runsv nginx
61 root 0:03 runsv redis
62 root 0:00 runsv statsd
63 root 0:00 tee -a /var/log/carbon.log
65 root 0:00 tee -a /var/log/carbon-relay.log
68 root 0:00 tee -a /var/log/statsd.log
69 root 0:01 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
70 root 0:09 {node} statsd /opt/statsd/config/tcp.js
71 root 0:00 nginx: master process /usr/sbin/nginx -c /etc/nginx/nginx.conf
76 root 0:00 /usr/sbin/crond -f
79 nginx 0:00 nginx: worker process
80 nginx 0:00 nginx: worker process
81 nginx 0:00 nginx: worker process
82 nginx 0:00 nginx: worker process
85 root 0:35 tee -a /var/log/bg-carbon.log
86 root 45:27 /opt/graphite/bin/python3 /opt/graphite/bin/bg-carbon-cache start --nodaemon --debug
88 root 0:00 tee -a /var/log/carbon-aggregator.log
156 root 0:41 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
157 root 0:49 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
158 root 0:46 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
159 root 0:47 {gunicorn} /opt/graphite/bin/python3 /opt/graphite/bin/gunicorn wsgi --pythonpath=/opt/graphite/webapp/graphite --preload --threads=1 --worker-class=sync --workers=4 --limit-request-line=0 --max-requests=1000 --timeout=65 --bind=0.0
I can see traffic coming in on the interface (tcpdump/tcpflow), and bg-carbon.log contains entries referencing 'cache query', but there are almost no datapoint log entries for the Spark metrics.
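One way to narrow this down would be to send a single known datapoint and then look for it in bg-carbon.log. The sketch below uses Carbon's plaintext line format (`path value timestamp`, newline-terminated). The host, the port (2003 is Carbon's usual plaintext default, but this container may be configured differently), the metric name, and both helper functions are assumptions for illustration only.

```python
import socket
import time
from typing import Optional


def format_plaintext(path: str, value: float, timestamp: Optional[int] = None) -> bytes:
    """Build one line of the Carbon plaintext protocol: 'path value timestamp\\n'."""
    if not path or any(c.isspace() for c in path):
        raise ValueError("metric path must be non-empty and contain no whitespace")
    ts = int(time.time()) if timestamp is None else int(timestamp)
    return f"{path} {value} {ts}\n".encode("ascii")


def send_probe(host: str, port: int, path: str, value: float) -> None:
    """Open a TCP connection and send a single datapoint (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=5) as sock:
        sock.sendall(format_plaintext(path, value))


# Example line only; nothing is sent unless send_probe() is called explicitly,
# e.g. send_probe("carbon.example.internal", 2003, "debug.probe.spark", 1)
print(format_plaintext("debug.probe.spark", 1, 1600000000))
```

If the probe line shows up in tcpdump but never produces a datapoint entry in the log, the loss is likely happening inside the container rather than at the NLB.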
Any assistance in troubleshooting would be greatly appreciated!
joffrey92