1. Dealing with loss of management connectivity/NetworkManager

    Ilya Baldin

      Dear Experimenters,

      We recently discovered a problem in how we handle dataplane interface configuration from inside the VMs. To make the configuration persistent and consistent between the Portal and Jupyter Notebooks, we use NetworkManager (NM). Some images, such as Rocky Linux, ship with NM built in; for others, such as Ubuntu and Debian derivatives, we add it.

      For legacy reasons, fablib currently turns NetworkManager off as part of the boot process. This creates a problem for some image types (particularly Rocky, it appears) because NM also tracks the DHCP lease of the management interface. With NM off, the lease is not renewed, so a VM sliver that persists for more than 24 hours usually loses management connectivity.

      We will remedy this problem in the next fablib release. For the moment, the best solution is to add the following script to your post-boot sequence in Jupyter:

      
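      # Start NetworkManager on every node so it keeps tracking the management DHCP lease
      for node in slice.get_nodes():
          node.network_manager_start()

      # Take each dataplane interface out of NetworkManager's control
      for iface in slice.get_interfaces():
          iface.get_node().execute(f'sudo nmcli device set {iface.get_device_name()} managed no')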
      

      This keeps NM running but removes all dataplane interfaces from its control.
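
To confirm that the workaround took effect, one option is to look at the terse device listing from `nmcli -t -f DEVICE,STATE device status`, which prints one `device:state` pair per line. The sketch below is not part of fablib: `unmanaged_devices` is a hypothetical helper, and the sample text is made-up output, not a capture from a real FABRIC VM. It assumes device names do not contain escaped colons.

```python
def unmanaged_devices(nmcli_output: str) -> set:
    """Return the device names that NetworkManager reports as 'unmanaged'.

    Expects the text printed by `nmcli -t -f DEVICE,STATE device status`,
    i.e. one `device:state` pair per line.
    """
    devices = set()
    for line in nmcli_output.splitlines():
        line = line.strip()
        if not line:
            continue
        # Split on the last colon: the state is always the final field.
        device, _, state = line.rpartition(":")
        if state == "unmanaged":
            devices.add(device)
    return devices


# Hypothetical listing: eth0 is the management interface, eth1/eth2 are dataplane.
sample = "eth0:connected\neth1:unmanaged\neth2:unmanaged\nlo:unmanaged\n"
print(sorted(unmanaged_devices(sample)))  # ['eth1', 'eth2', 'lo']
```

In a notebook, the listing could be fetched with `node.execute('nmcli -t -f DEVICE,STATE device status')` and the captured standard output passed to the helper (assuming `execute` returns the command's stdout as its first value). Every name returned by `iface.get_device_name()` for that node should then appear in the result, while the management interface should not.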
