Mflib – Prometheus instrumentize error
Tagged: MFLib, Prometheus
- This topic has 4 replies, 2 voices, and was last updated 5 months, 1 week ago by Charles Carpenter.
May 31, 2024 at 4:26 pm #7035
Hi,
I am having some issues getting MFLib and Prometheus set up. I have been following the setup instructions from the notebook found here: https://github.com/fabric-testbed/jupyter-examples/blob/1c9097e416ac1fdb370489b5b4fd6e0e0bf76dbe/fabric_examples/mflib/KNIT6/KNIT6_start_here.ipynb
The problem occurs in the
#instrumetize_results = mf.instrumentize()
step, where I get an error during the Prometheus setup. The error is shown below as Error 1.
In the same process I attempt to access Grafana locally by tunneling through mf.grafana_tunnel. With this I can reach the node over ssh, but I then get spammed with the error
channel 3: open failed: connect failed: Connection refused
Error 1:
Instrumentizing slice "MyMonitoredSliceTwo"
Setting up Prometheus...
{'grafana_admin_pass': 'nedcyxBt', 'success': False, 'msg': 'Prometheus playbook install failed..', 'play_recap': 'PLAY RECAP *********************************************************************\nmeas-node : ok=14 changed=7 unreachable=0 failed=1 skipped=0 rescued=0 ignored=0 \n\nPlaybook run took 0 days, 0 hours, 0 minutes, 5 seconds\nFriday 31 May 2024 19:28:44 +0000 (0:00:00.430) 0:00:05.913 ************ \n=============================================================================== \nGathering Facts ---- 1.45s\n../prometheus/ansible/roles/fabric_experiment : passlib ---- 1.31s\n../prometheus/ansible/roles/fabric_experiment : Generate an OpenSSL private key. ---- 0.71s\n../prometheus/ansible/roles/fabric_experiment : Generate a Self Signed OpenSSL certificate. ---- 0.50s\n../prometheus/ansible/roles/fabric_experiment : Create network IPV4 only in docker for related monitoring containers. ---- 0.43s\n../prometheus/ansible/roles/fabric_experiment : Generate an OpenSSL CSR. ---- 0.38s\n../prometheus/ansible/roles/fabric_experiment : Create Fabric Prometheus user as fab-prom ---- 0.32s\n../prometheus/ansible/roles/fabric_experiment : Get Fabric Prometheus user ids ---- 0.25s\n../prometheus/ansible/roles/fabric_experiment : Create Directories for certs ---- 0.25s\n../prometheus/ansible/roles/fabric_experiment : Create Directories for keys ---- 0.16s\n../prometheus/ansible/roles/fabric_experiment : Self-signed certs. ---- 0.03s\n../prometheus/ansible/roles/fabric_experiment : Setup the docker network to be used by the monitoring containers. ---- 0.03s\n../prometheus/ansible/roles/fabric_experiment : Setup user for promtheus monitoring system ---- 0.03s\n../prometheus/ansible/roles/fabric_experiment : debug ---- 0.02s\n../prometheus/ansible/roles/fabric_experiment : set_fact ---- 0.02s\nFriday 31 May 2024 19:28:44 +0000 (0:00:00.431) 0:00:05.913 ************ \n=============================================================================== \n../prometheus/ansible/roles/fabric_experiment ---- 4.45s\ngather_facts ---- 1.45s\n~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ \ntotal ---- 5.90s\n'}
Setting up Prometheus done.
Setting up grafana_manager & dashboards...
Setting up grafana_manager & dashboards done.
Setting up elk...
Setting up elk done.
Instrumentize Process Complete.

Extra info: I initially had this as part of a more complex project, and Prometheus worked for me at first. This was last week. Then in the last few days I started getting this error, so in an effort to debug I made a simpler version that just follows the guide from the GitHub repo. The error persists. I don’t know whether my issues are site related, since I don’t remember which sites I was using initially, or whether something else is causing it.
Sites used: PRIN, BRIST, EDC
Slice name: MyMonitoredSliceTwo
Slice ID: 10ea54e7-5bc1-49c9-a62f-e48121ab7093
Fablib version: 1.6.5
MFLib version: 1.0.40
The code I am running in my notebook can be found here: https://github.com/BjoernSag/mflib-prometheus-test
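For anyone hitting a similar failure: the dict printed in Error 1 above carries a 'success' flag and an Ansible 'play_recap' string, and the recap's host line shows where the playbook failed. Below is a minimal sketch of how such a dict could be checked in a notebook cell. The helper names summarize_play_recap and failed_hosts are hypothetical and not part of MFLib, and the sketch assumes you already have a dict of the shape printed above; how that dict is nested inside the value returned by mf.instrumentize() is not shown in this thread.

```python
import re

# Matches an Ansible PLAY RECAP host line such as
# "meas-node : ok=14 changed=7 unreachable=0 failed=1 ..."
HOST_LINE = re.compile(r"^\s*(?P<host>\S+)\s*:\s*(?P<counts>(?:\w+=\d+\s*)+)$")


def summarize_play_recap(recap):
    """Return {host: {counter: value}} parsed from a PLAY RECAP string."""
    summary = {}
    for line in recap.splitlines():
        match = HOST_LINE.match(line)
        if not match:
            continue  # skip banner, timing, and per-task lines
        pairs = (pair.split("=") for pair in match.group("counts").split())
        summary[match.group("host")] = {key: int(value) for key, value in pairs}
    return summary


def failed_hosts(result):
    """List hosts whose recap reports failed or unreachable tasks (hypothetical helper)."""
    recap = summarize_play_recap(result.get("play_recap", ""))
    return [
        host
        for host, counts in recap.items()
        if counts.get("failed", 0) or counts.get("unreachable", 0)
    ]


# Abbreviated copy of the dict printed in Error 1 (recap shortened for the example).
example_result = {
    "success": False,
    "msg": "Prometheus playbook install failed..",
    "play_recap": (
        "PLAY RECAP ****\n"
        "meas-node : ok=14 changed=7 unreachable=0 failed=1 "
        "skipped=0 rescued=0 ignored=0 \n\n"
        "Playbook run took 0 days, 0 hours, 0 minutes, 5 seconds\n"
    ),
}

if not example_result["success"]:
    print(example_result["msg"], failed_hosts(example_result))
```

The recap only says which host had a failed task, not which task failed, so the full playbook log on the measurement node is still needed for the root cause.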
June 4, 2024 at 9:43 am #7040
Thanks for bringing this to our attention.
In the above result, setting up Prometheus returned ‘success’: False, ‘msg’: ‘Prometheus playbook install failed’. This means that Grafana, which is part of the Prometheus install, was most likely not installed. The ssh tunnel cannot connect to Grafana’s port since there is no Grafana running, hence the “channel 3: open failed: connect failed: Connection refused” error.
The error was caused by an Ansible-related update which broke the installation process. We have found the problem and will be pushing out a fix today.
June 4, 2024 at 6:03 pm #7042
The MeasurementFramework has been updated to fix the docker conflict.
The Prometheus system is now working. There is an error in a script due to a mirror problem, but this does not affect the Prometheus install.
The ELK install has a fatal mirror problem that remains to be fixed. I will post here when that is completed.
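As the June 4 reply explains, the "Connection refused" message appears because nothing is listening on Grafana's port on the measurement node. A minimal sketch of a check for that condition, using only the Python standard library, is below. It is meant to be run on the measurement node itself: on the local side of an ssh tunnel the ssh client accepts the connection even when the remote service is down, so a local check would be misleading. The port number 3000 is an assumption (it is Grafana's upstream default); the port used by the MFLib deployment is not stated in this thread.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers ConnectionRefusedError, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    GRAFANA_PORT = 3000  # assumption: Grafana's upstream default, may differ here
    if port_is_open("127.0.0.1", GRAFANA_PORT):
        print("Something is listening on the Grafana port.")
    else:
        print("Nothing is listening; the tunnel would report 'Connection refused'.")
```

A True result only shows that some process accepts TCP connections on that port, not that Grafana itself is healthy.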
June 5, 2024 at 12:16 pm #7054
I just tested the Prometheus system and it works fine. Thank you so much for the quick fix!
June 11, 2024 at 11:22 pm #7091
The ELK mirror problem with CentOS/Rocky 8 has been fixed.