1. Management IP Invalid: None when running Python code in Jupyter


Viewing 15 posts - 1 through 15 (of 24 total)
  • #2259

    I’m currently trying to work through translating a few of the Jupyter notebooks to pure Python files, but when I call slice.submit() I get this exception:

    node.execute: Management IP Invalid: None

    The exception occurs just after “Running post boot config” but before the slice.submit() function call returns.

    The submitted slice is actually created, but does not have any nodes.

    Code to reproduce (after setting environment variables):

    from fabrictestbed_extensions.fablib.fablib import fablib
    from ipaddress import ip_address, IPv4Address, IPv6Address, IPv4Network, IPv6Network
    import json
    import traceback

    slice_name = "MySlice"
    node1_name = "Node1"
    node2_name = "Node2"
    site = fablib.get_random_site()
    print(f"Site: {site}")

    print("Beginning creation of new slice…")
    try:
        # Create a slice
        slice = fablib.new_slice(name=slice_name)
        node1 = slice.add_node(name=node1_name, site=site)
        node2 = slice.add_node(name=node2_name, site=site)

        print("Slice defined. Submitting…")
        try:
            slice.submit()

            print("Successfully submitted slice. Performing operation…")
            try:
                print(f"{slice}")
                for node in slice.get_nodes():
                    print(f"{node}")
                    stdout, stderr = node.execute('echo Hello, FABRIC from node `hostname -s`')
                    print(stdout)
            except Exception as e:
                print(f"Something went wrong while running slice. Exception: {e}")
            finally:
                slice.wait_ssh(progress=True)
                slice.delete()
        except Exception as e:
            print(f"Failed to obtain slice. Exception: {e}")
    except Exception as e:
        print(f"Problem during slice creation. Exception: {e}")

    #2260
    Paul Ruth
    Keymaster

      We are working on better error messages, but for now ‘Management IP Invalid: None’ is a bit of a generic failure message. It means that the VM didn’t get a Management IP assigned to it. In practice, this is the result of an uncaught VM failure, often related to errors in assigning IPs but sometimes other things.

      It is difficult to say what is causing this specific error, but we seem to see this occasionally when a site is having issues starting VMs. You might try to resubmit the slice on a different site. In your case you are using a random site, so it may be as easy as retrying the same request. It would also be useful if you let us know which site you are seeing this on when it happens.

      Paul
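The retry-on-another-site suggestion above can be sketched as a small helper. This is only an illustration, not part of fablib: `submit_with_retries`, `pick_site`, and `build_and_submit` are hypothetical names, and the fablib calls in the trailing comment are the ones already used in the original post.

```python
import time


def submit_with_retries(pick_site, build_and_submit, max_attempts=3, delay_s=0.0):
    """Try build_and_submit(site) on up to max_attempts freshly picked sites.

    Returns (site, result) from the first attempt that succeeds.
    Raises RuntimeError listing every (site, error) pair if all attempts fail.
    """
    failures = []
    for _ in range(max_attempts):
        site = pick_site()
        try:
            return site, build_and_submit(site)
        except Exception as exc:  # broad catch: the thread only shows a generic message
            failures.append((site, str(exc)))
            print(f"Submit failed at {site}: {exc}")
            time.sleep(delay_s)
    raise RuntimeError(f"All {max_attempts} attempts failed: {failures}")


# Possible wiring with the calls from the original post (not executed here):
#
#   def build_and_submit(site):
#       slice = fablib.new_slice(name="MySlice")
#       slice.add_node(name="Node1", site=site)
#       slice.submit()
#       return slice
#
#   site, slice = submit_with_retries(fablib.get_random_site, build_and_submit)
```

Recording the failing site with each error also gives you the per-site information requested above. Since a failed submit can still leave a slice behind, a real script would probably need to delete it or pick a new slice name before retrying.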

       

      #2262

      I just went through the list of sites, and was able to reproduce the issue with every site.

      #2263
      Paul Ruth
      Keymaster

        I’m not sure what the problem is. When I try the code you posted it works.   I think this means it has something to do with your configuration.  Are you able to run the “Hello, FABRIC” notebook?   That one is, basically, a test that confirms the configuration is correct.

        #2264

        The notebook runs just fine. The only notebooks that have failed have been ones that require project tags I don’t have.

        #2268
        Paul Ruth
        Keymaster

          Which tags do you need? Which project?

          Paul

          #2272

          I’m in ULTIMA. I don’t think we need any more tags at the moment, but will let you know as the need arises. However, all of the Networking examples after “Create a Local Ethernet (Layer 2)” require the Slice.Multisite tag to run.

          Regardless, I’m fairly sure that permissions tags aren’t the issue here.

          #2280
          Paul Ruth
          Keymaster

            Are you still having issues running your notebook?

            #2281

            I am not having any issues running the notebook, only with running .py scripts.

            #2285
            Paul Ruth
            Keymaster

              Jupyter notebooks are just Python; the notebook simply lets you run the code one cell at a time. Can you cut/paste the code from the cells of the “Hello, FABRIC” notebook into a .py script and run it? As long as your env vars and Python libraries are set up correctly, it should work.
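One way to rule out the environment when moving from a notebook to a script is to check the variables before importing fablib. The sketch below is only an illustration: `missing_env_vars` is a hypothetical helper, and the names in `REQUIRED` are assumed examples that should be replaced with whatever your notebook's configuration cell actually sets.

```python
import os


def missing_env_vars(required, environ=None):
    """Return the names in `required` that are unset or empty in `environ`."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Assumed example names; take the real list from your notebook's config cell.
REQUIRED = [
    "FABRIC_TOKEN_LOCATION",
    "FABRIC_BASTION_USERNAME",
    "FABRIC_BASTION_KEY_LOCATION",
    "FABRIC_SLICE_PRIVATE_KEY_FILE",
]

if __name__ == "__main__":
    missing = missing_env_vars(REQUIRED)
    if missing:
        print("Missing environment variables:", ", ".join(missing))
    else:
        print("All listed environment variables are set.")
```

Running this in both the notebook and the terminal makes it easy to spot a variable that only the Jupyter environment defines.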

              #2286

              Right, that’s what I did.

              First I made sure the “Hello, FABRIC” notebook ran correctly.

              Then I made a python script with all of the code cells copy/pasted directly back-to-back.

              When I ran that script from the terminal, this was the output:

              Name  CPUs  Cores    RAM (G)    Disk (G)       Basic (100 Gbps NIC)  ConnectX-6 (100 Gbps x2 NIC)  ConnectX-5 (25 Gbps x2 NIC)  P4510 (NVMe 1TB)  Tesla T4 (GPU)  RTX6000 (GPU)
              ----  ----  -------  ---------  -------------  --------------------  ----------------------------  ---------------------------  ----------------  --------------  -------------
              MICH  6     188/192  1522/1536  60580/60600    381/381               0/2                           2/2                          10/10             2/2             3/3
              UTAH  10    316/320  2544/2560  116380/116400  634/635               2/2                           4/4                          16/16             4/4             5/5
              TACC  10    220/320  2256/2560  115390/116400  630/635               2/2                           4/4                          16/16             4/4             5/6
              WASH  6     188/192  1520/1536  60580/60600    379/381               2/2                           2/2                          10/10             2/2             3/3
              NCSA  6     192/192  1536/1536  60600/60600    381/381               2/2                           2/2                          10/10             2/2             3/3
              DALL  6     192/192  1536/1536  60600/60600    381/381               2/2                           2/2                          10/10             2/2             3/3
              MAX   10    254/320  2332/2560  115920/116400  594/635               0/2                           2/4                          16/16             4/4             6/6
              MASS  4     118/128  984/1024   55690/55800    253/254               1/2                           0/0                          6/6               0/0             3/3
              SALT  6     192/192  1536/1536  60600/60600    381/381               2/2                           2/2                          10/10             2/2             3/3
              STAR  12    366/384  3000/3072  121090/121200  760/762               2/2                           6/6                          20/20             6/6             6/6
              Running post boot config … Exception: node.execute: Management IP Invalid: None
              -----------  ------------------------------------
              Slice Name   MySlice
              Slice ID     b73f5090-e56a-474f-997a-16f6f7681952
              Slice State  Configuring
              Lease End    2022-07-15 20:03:36 +0000
              -----------  ------------------------------------
              -----------------  ----------------------------------------------------------------------------------------------
              ID
              Name               Node1
              Cores
              RAM
              Disk
              Image              default_rocky_8
              Image Type         qcow2
              Host
              Site               TACC
              Management IP
              Reservation State
              Error Message
              SSH Command        ssh -i /home/fabric/.ssh/id_rsa -J xweintra_0000014567@bastion-1.fabric-testbed.net rocky@None
              -----------------  ----------------------------------------------------------------------------------------------
              Exception: node.execute: Management IP Invalid: None
              Exception: Failed to delete slice: Status.FAILURE, (500)
              Reason: INTERNAL SERVER ERROR
              HTTP response headers: HTTPHeaderDict({'Server': 'nginx/1.21.6', 'Date': 'Thu, 14 Jul 2022 20:03:39 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '100', 'Connection': 'keep-alive', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Headers': 'DNT, User-Agent, X-Requested-With, If-Modified-Since, Cache-Control, Content-Type, Range', 'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE, OPTIONS', 'Access-Control-Allow-Origin': '*', 'Access-Control-Expose-Headers': 'Content-Length, Content-Range, X-Error', 'X-Error': 'Unable to delete Slice# b73f5090-e56a-474f-997a-16f6f7681952 that is not yet stable, try again later'})
              HTTP response body: Unable to delete Slice# b73f5090-e56a-474f-997a-16f6f7681952 that is not yet stable, try again later

              The errors after the “Running post boot config…” line occur because the submit() call throws an exception before it finishes, so the later calls act on a slice that is not yet stable.

              The slice does eventually reach StableOK state, but it has no nodes.
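The two failure modes described above (nodes with no management IP, and a delete that is rejected while the slice is not yet stable) could be guarded against along these lines. This is a hedged sketch: both helper names are hypothetical, and `get_management_ip()` and `get_name()` are assumed to be the fablib node accessors, so check them against your installed version.

```python
import time


def nodes_without_management_ip(nodes):
    """Return the names of nodes that report no management IP."""
    missing = []
    for node in nodes:
        ip = node.get_management_ip()  # assumed fablib Node accessor
        if ip is None or str(ip) in ("", "None"):
            missing.append(node.get_name())  # assumed fablib Node accessor
    return missing


def call_with_retries(action, attempts=5, delay_s=30.0):
    """Call action() until it stops raising, at most `attempts` times."""
    last_error = None
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except Exception as exc:  # e.g. "not yet stable, try again later"
            last_error = exc
            print(f"Attempt {attempt} failed: {exc}")
            if attempt < attempts:
                time.sleep(delay_s)
    raise last_error


# Possible use with the objects from the original post (not executed here):
#
#   nodes = slice.get_nodes()
#   if not nodes or nodes_without_management_ip(nodes):
#       print("Slice has no usable nodes; skipping node.execute()")
#   call_with_retries(slice.delete)
```

Checking for an empty node list as well matters here, because the slice in this thread ends up stable but with no nodes at all.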

              #2295
              Paul Ruth
              Keymaster

                Can you send me the python file you are using so I can try to recreate this issue?

                #2299

                I don’t think I have permissions to upload files.

                • This reply was modified 1 year, 9 months ago by Xander Maddox Weintraut. Reason: Not allowed to upload .py files apparently. You'll have to resave this as a .py before you can run it
                • This reply was modified 1 year, 9 months ago by Xander Maddox Weintraut. Reason: Can't upload files
                #2310
                Paul Ruth
                Keymaster

                  I think I fixed it so you can attach .py and .txt files. Can you try again?

                   

                  #2311

                  Let’s try this
