Management IP Invalid: None when running Python code in Jupyter
- This topic has 23 replies, 5 voices, and was last updated 2 years, 1 month ago by Mami Hayashida.
July 11, 2022 at 1:51 pm #2259
I’m currently trying to work through translating a few of the Jupyter notebooks to pure Python files, but when I call slice.submit() I get this exception:
node.execute: Management IP Invalid: None
The exception occurs just after “Running post boot config” but before the slice.submit() function call returns.
The submitted slice is actually created, but does not have any nodes.
Code to reproduce (after setting environment variables):
from fabrictestbed_extensions.fablib.fablib import fablib
from ipaddress import ip_address, IPv4Address, IPv6Address, IPv4Network, IPv6Network
import json
import traceback

slice_name = "MySlice"
node1_name = "Node1"
node2_name = "Node2"
site = fablib.get_random_site()
print(f"Site: {site}")

print("Beginning creation of new slice...")
try:
    # Create a slice
    slice = fablib.new_slice(name=slice_name)
    node1 = slice.add_node(name=node1_name, site=site)
    node2 = slice.add_node(name=node2_name, site=site)

    print("Slice defined. Submitting...")
    try:
        slice.submit()

        print("Successfully submitted slice. Performing operation...")
        try:
            print(f"{slice}")
            for node in slice.get_nodes():
                print(f"{node}")
                stdout, stderr = node.execute('echo Hello, FABRIC from node `hostname -s`')
                print(stdout)
        except Exception as e:
            print(f"Something went wrong while running slice. Exception: {e}")
        finally:
            slice.wait_ssh(progress=True)
            slice.delete()
    except Exception as e:
        print(f"Failed to obtain slice. Exception: {e}")
except Exception as e:
    print(f"Problem during slice creation. Exception: {e}")
July 11, 2022 at 2:39 pm #2260
We are working on better error messages, but for now ‘Management IP Invalid: None’ is a bit of a generic failure message. It means that the VM didn’t get a management IP assigned to it. In practice, this is the result of an uncaught VM failure, often related to errors in assigning IPs but sometimes other things.
It is difficult to say what is causing this specific error, but we seem to see it occasionally when a site is having issues starting VMs. You might try resubmitting the slice on a different site. In your case you are using a random site, so it may be as easy as retrying the same request. It would also be useful if you let us know which site you are seeing this on when it happens.
Paul
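The suggestion to resubmit on a different site can be scripted. What follows is a minimal sketch, not a fablib feature: `first_working_site` and `fabric_try_site` are hypothetical names, and the FABRIC-specific helper reuses the fablib calls from the script above plus `node.get_management_ip()`, which is assumed here to return None when no management IP was assigned.

```python
def first_working_site(sites, try_site):
    """Call try_site(site) for each site in order; stop at the first success.

    Returns (site, result, failures), where failures maps each site that
    raised to its error message, so failing sites can be reported.
    """
    failures = {}
    for site in sites:
        try:
            result = try_site(site)
        except Exception as e:  # the script in this thread also catches plain Exception
            failures[site] = str(e)
            continue
        return site, result, failures
    raise RuntimeError(f"Slice creation failed on every site: {failures}")


def fabric_try_site(site, slice_name="MySlice"):
    """Hypothetical helper: submit a one-node slice on `site`."""
    # Imported here so the generic helper above works without fablib installed.
    from fabrictestbed_extensions.fablib.fablib import fablib

    slice = fablib.new_slice(name=slice_name)
    slice.add_node(name="Node1", site=site)
    slice.submit()
    for node in slice.get_nodes():
        # Assumption: get_management_ip() returns None if no IP was assigned.
        if node.get_management_ip() is None:
            raise RuntimeError(f"No management IP on {site}")
    return slice
```

A call such as `site, slice, failures = first_working_site(["TACC", "UTAH", "STAR"], fabric_try_site)` would then return the first site that worked, and `failures` would hold the sites worth reporting. Since a failed submit can still leave a slice behind, each failed attempt may need its own cleanup.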
July 11, 2022 at 4:24 pm #2262
I just went through the list of sites, and was able to reproduce the issue with every site.
July 11, 2022 at 5:18 pm #2263
I’m not sure what the problem is. When I try the code you posted it works. I think this means it has something to do with your configuration. Are you able to run the “Hello, FABRIC” notebook? That one is, basically, a test that confirms the configuration is correct.
July 12, 2022 at 9:53 am #2264
The notebook runs just fine. The only notebooks that have failed have been ones that require project tags I don’t have.
July 12, 2022 at 12:50 pm #2268
Which tags do you need? Which project?
Paul
July 13, 2022 at 9:03 am #2272
I’m in ULTIMA. I don’t think we need any more tags at the moment, but will let you know as the need arises. However, all of the Networking examples after “Create a Local Ethernet (Layer 2)” require the Slice.Multisite tag to run.
Regardless, I’m fairly sure that permissions tags aren’t the issue here.
July 14, 2022 at 8:33 am #2280
Are you still having issues running your notebook?
July 14, 2022 at 8:42 am #2281
I am not having any issues running the notebook, only with running .py scripts.
July 14, 2022 at 2:15 pm #2285
The Jupyter notebooks are just Python; Jupyter simply lets you run the code one cell at a time. Can you cut/paste the code from the cells of the “Hello, FABRIC” notebook into a .py script and run it? As long as your env vars and Python libraries are set up correctly, it should work.
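A .py script started from a terminal only sees the environment of that shell, so one quick check is whether the variables are actually set there. A minimal sketch follows; `missing_env_vars` is a hypothetical helper, and any variable names passed to it are up to your own fablib configuration.

```python
import os


def missing_env_vars(names, env=None):
    """Return the names that are unset or empty in `env` (default: os.environ)."""
    env = os.environ if env is None else env
    return [name for name in names if not env.get(name)]
```

For example, `missing_env_vars(["FABRIC_TOKEN_LOCATION", "FABRIC_BASTION_HOST"])` returns whichever of those two names are not set in the current shell. The two names are used here for illustration only and are an assumption about the configuration; substitute the ones your setup relies on.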
July 14, 2022 at 3:11 pm #2286
Right, that’s what I did.
First I made sure the “Hello, FABRIC” notebook ran correctly.
Then I made a python script with all of the code cells copy/pasted directly back-to-back.
When I ran that script from the terminal, this was the output:
Name  CPUs  Cores    RAM (G)    Disk (G)       Basic (100 Gbps NIC)  ConnectX-6 (100 Gbps x2 NIC)  ConnectX-5 (25 Gbps x2 NIC)  P4510 (NVMe 1TB)  Tesla T4 (GPU)  RTX6000 (GPU)
----  ----  -------  ---------  -------------  --------------------  ----------------------------  ---------------------------  ----------------  --------------  -------------
MICH  6     188/192  1522/1536  60580/60600    381/381               0/2                           2/2                          10/10             2/2             3/3
UTAH  10    316/320  2544/2560  116380/116400  634/635               2/2                           4/4                          16/16             4/4             5/5
TACC  10    220/320  2256/2560  115390/116400  630/635               2/2                           4/4                          16/16             4/4             5/6
WASH  6     188/192  1520/1536  60580/60600    379/381               2/2                           2/2                          10/10             2/2             3/3
NCSA  6     192/192  1536/1536  60600/60600    381/381               2/2                           2/2                          10/10             2/2             3/3
DALL  6     192/192  1536/1536  60600/60600    381/381               2/2                           2/2                          10/10             2/2             3/3
MAX   10    254/320  2332/2560  115920/116400  594/635               0/2                           2/4                          16/16             4/4             6/6
MASS  4     118/128  984/1024   55690/55800    253/254               1/2                           0/0                          6/6               0/0             3/3
SALT  6     192/192  1536/1536  60600/60600    381/381               2/2                           2/2                          10/10             2/2             3/3
STAR  12    366/384  3000/3072  121090/121200  760/762               2/2                           6/6                          20/20             6/6             6/6
Running post boot config … Exception: node.execute: Management IP Invalid: None
-----------  ------------------------------------
Slice Name   MySlice
Slice ID     b73f5090-e56a-474f-997a-16f6f7681952
Slice State  Configuring
Lease End    2022-07-15 20:03:36 +0000
-----------  ------------------------------------
-----------------  ----------------------------------------------------------------------------------------------
ID
Name               Node1
Cores
RAM
Disk
Image              default_rocky_8
Image Type         qcow2
Host
Site               TACC
Management IP
Reservation State
Error Message
SSH Command        ssh -i /home/fabric/.ssh/id_rsa -J xweintra_0000014567@bastion-1.fabric-testbed.net rocky@None
-----------------  ----------------------------------------------------------------------------------------------
Exception: node.execute: Management IP Invalid: None
Exception: Failed to delete slice: Status.FAILURE, (500)
Reason: INTERNAL SERVER ERROR
HTTP response headers: HTTPHeaderDict({'Server': 'nginx/1.21.6', 'Date': 'Thu, 14 Jul 2022 20:03:39 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '100', 'Connection': 'keep-alive', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Headers': 'DNT, User-Agent, X-Requested-With, If-Modified-Since, Cache-Control, Content-Type, Range', 'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE, OPTIONS', 'Access-Control-Allow-Origin': '*', 'Access-Control-Expose-Headers': 'Content-Length, Content-Range, X-Error', 'X-Error': 'Unable to delete Slice# b73f5090-e56a-474f-997a-16f6f7681952 that is not yet stable, try again later'})
HTTP response body: Unable to delete Slice# b73f5090-e56a-474f-997a-16f6f7681952 that is not yet stable, try again later
The errors after the “Running post boot config…” line occur because the submit() call throws an exception before it finishes, so the later calls are trying to act on a slice that is not stable yet.
The slice does eventually reach StableOK state, but it has no nodes.
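Because the delete is rejected with “not yet stable, try again later”, cleanup has to wait and try again. A minimal sketch of that pattern follows; `retry` is a hypothetical helper, not part of fablib, and the attempt count and delay are arbitrary choices.

```python
import time


def retry(action, attempts=10, delay=30.0, sleep=time.sleep):
    """Call action() until it succeeds or `attempts` calls have failed.

    Returns action()'s result; re-raises the last exception on failure.
    `sleep` is injectable so the helper can be tested without waiting.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as e:
            if attempt == attempts:
                raise
            print(f"Attempt {attempt} failed ({e}); retrying in {delay}s")
            sleep(delay)
```

In the script above, `retry(slice.delete)` could stand in for the bare `slice.delete()` in the `finally` block, so that the slice is removed once it has settled instead of failing on the first attempt.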
July 15, 2022 at 8:29 am #2295
Can you send me the Python file you are using so I can try to recreate this issue?
July 15, 2022 at 9:29 am #2299
I don’t think I have permissions to upload files.
- This reply was modified 2 years, 4 months ago by Xander Maddox Weintraut. Reason: Not allowed to upload .py files apparently. You'll have to resave this as a .py before you can run it
- This reply was modified 2 years, 4 months ago by Xander Maddox Weintraut. Reason: Can't upload files
July 15, 2022 at 9:59 am #2310
I think I fixed it so you can attach .py and .txt files. Can you try again?
July 15, 2022 at 10:00 am #2311
Let’s try this