modifying device properties

  • #2343
    Arash SARABI
    Participant

      I was wondering if there is a way to disable the Ethernet CRC check. I tried "ethtool -K ens8 rx-all on" but received the following error:
      Could not change any device features

      Any help would be appreciated.

      #2344
      Paul Ruth
      Keymaster

        I think that might be a side effect of NIC_Basic NICs being SR-IOV VFs, which have limited low-level configuration available. You will have a bit more control if you use FABRIC's dedicated NICs.

        I tried running that ethtool command on a NIC_ConnectX_5 and it worked. I think the lower-level control you need requires a NIC_ConnectX_5 or NIC_ConnectX_6.

        Let me know if that works for you.

        Paul
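
A minimal sketch of how the ethtool command from this thread could be issued from a notebook and checked for the failure message. It assumes a node object whose `execute(command)` method returns a `(stdout, stderr)` pair, as fablib nodes appear to do. The helper names `rx_all_command` and `enable_rx_all` are hypothetical, and the `sudo` prefix is an assumption about the login user's privileges.

```python
FAILURE_TEXT = "Could not change any device features"  # message quoted in the first post


def rx_all_command(device: str) -> str:
    """Build the ethtool command that asks the NIC to accept all frames, including bad-FCS ones."""
    # Assumption: the login user needs sudo to change device features.
    return f"sudo ethtool -K {device} rx-all on"


def enable_rx_all(node, device: str) -> bool:
    """Run the command on `node` and report whether the feature change was accepted.

    `node` is assumed to expose execute(command) -> (stdout, stderr).
    """
    stdout, stderr = node.execute(rx_all_command(device))
    combined = (stdout or "") + (stderr or "")
    return FAILURE_TEXT not in combined
```

On a NIC_Basic interface this would be expected to return False, matching the error reported above; on a dedicated NIC it should return True if the driver accepts the change.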

        #2345
        Arash SARABI
        Participant

          Thank you Paul,

          I tried replacing NIC_Basic with NIC_ConnectX_5, but it returns the following error:

          /opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/slice.py in submit(self, wait, wait_timeout, wait_interval, progress, wait_jupyter)
          1207 ssh_key=self.get_slice_public_key())
          1208 if return_status != Status.OK:
          -> 1209 raise Exception("Failed to submit slice: {}, {}".format(return_status, slice_reservations))
          1210
          1211 logging.debug(f'slice_reservations: {slice_reservations}')

          Exception: Failed to submit slice: Status.FAILURE, (500)
          Reason: INTERNAL SERVER ERROR
          HTTP response headers: HTTPHeaderDict({'Server': 'nginx/1.21.6', 'Date': 'Sun, 17 Jul 2022 20:04:56 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '130', 'Connection': 'keep-alive', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Headers': 'DNT, User-Agent, X-Requested-With, If-Modified-Since, Cache-Control, Content-Type, Range', 'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE, OPTIONS', 'Access-Control-Allow-Origin': '*', 'Access-Control-Expose-Headers': 'Content-Length, Content-Range, X-Error', 'X-Error': 'PDP Authorization check failed – Policy Violation: Your project is lacking Component.SmartNIC tag to provision a VM with SmartNIC.'})
          HTTP response body: PDP Authorization check failed – Policy Violation: Your project is lacking Component.SmartNIC tag to provision a VM with SmartNIC.

          #2346
          Arash SARABI
          Participant

            Here is my code to submit slice:

            slice = fablib.new_slice(name=SLICENAME)

            nodeclient = slice.add_node(name="client", site=SITE, cores=1, ram=16, disk=900, image='default_ubuntu_20')
            nodeserver = slice.add_node(name="server", site=SITE, cores=1, ram=16, disk=900, image='default_ubuntu_20')
            nodeRouter = slice.add_node(name="router", site=SITE, image='default_ubuntu_20')

            ifaceclient = nodeclient.add_component(model="NIC_ConnectX_5", name="if_client").get_interfaces()[0]
            ifaceserver = nodeserver.add_component(model="NIC_ConnectX_5", name="if_server").get_interfaces()[0]
            ifaceRouterC = nodeRouter.add_component(model="NIC_ConnectX_5", name="if_router_c").get_interfaces()[0]
            ifaceRouterS = nodeRouter.add_component(model="NIC_ConnectX_5", name="if_router_s").get_interfaces()[0]

            netC = slice.add_l2network(name='net_c', type='L2Bridge', interfaces=[ifaceclient, ifaceRouterC])
            netS = slice.add_l2network(name='net_s', type='L2Bridge', interfaces=[ifaceserver, ifaceRouterS])

            slice.submit()

            #2349
            Paul Ruth
            Keymaster

              Your project did not have the permissions required to use SmartNICs. I just added them. Please try again.

              #2350
              Arash SARABI
              Participant

                I restarted the kernel, and now I am facing the following problem:

                ID                                    Name    Site    Host                          Cores    RAM    Disk  Image              Management IP    State    Error
                ------------------------------------  ------  ------  --------------------------  -------  -----  ------  -----------------  ---------------  -------  -------------------------------------------------------------------------------------------
                96275a0f-f2f7-4928-b20a-f34fc6b2a1e1  client  TACC    tacc-w5.fabric-testbed.net        4     16     500  default_ubuntu_20                   Closed   TicketReviewPolicy: Closing reservation due to failure in slice
                5517d25c-2b19-4815-8c0c-136395169d48  server  TACC    tacc-w5.fabric-testbed.net        4     16     500  default_ubuntu_20                   Closed   TicketReviewPolicy: Closing reservation due to failure in slice
                46b000bf-ff65-4f7c-9cd5-8bef2f140a42  router  TACC                                                        default_ubuntu_20                   Closed   Insufficient resources : Component of type: ConnectX-6 not available in graph node: 8QQBZC3

                Exception: node.execute: Management IP Invalid: None

                 

                #2351
                Arash SARABI
                Participant

                  It appears that I should check which site has more resources available.

                  #2352
                  Paul Ruth
                  Keymaster

                    Yeah, that slice request requires four ConnectX-5s, so you will need to be aware of their availability.

                    I have a few observations that may help:

                    – All three of your nodes are being sent to the same site. You might try putting them on different sites. From your perspective it will work about the same; the main difference is that the latency between nodes will be greater.

                    – If you only need low-level configuration for some interfaces, you could mix and match NIC_ConnectX_5 with NIC_Basic. For example, maybe your router needs a NIC_ConnectX_5 but the other nodes can use a NIC_Basic (or the other way around).

                    – Your router node is asking for two ConnectX-5s. Each ConnectX-5 has two ports. If you really only need two ports, you can use one ConnectX-5 for your router. The code for that will look something like:

                    connectx5_interfaces = nodeRouter.add_component(model="NIC_ConnectX_5", name="cx5_nic").get_interfaces()
                    ifaceRouterC = connectx5_interfaces[0]
                    ifaceRouterS = connectx5_interfaces[1]

                    or you can shorten it to:
                    [ifaceRouterC, ifaceRouterS] = nodeRouter.add_component(model="NIC_ConnectX_5", name="cx5_nic").get_interfaces()
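
To make the card arithmetic in this reply concrete, here is a small sketch that counts how many ConnectX-5 cards a request consumes when every port gets its own card versus when ports on the same node share a card. The two-ports-per-card figure comes from the reply; the function names and the dictionary layout are hypothetical, chosen only for illustration.

```python
import math

PORTS_PER_CONNECTX5 = 2  # per the reply: each ConnectX-5 has two ports


def cards_one_per_port(ports_per_node: dict) -> int:
    """Cards consumed when every port is requested as a separate NIC component."""
    return sum(ports_per_node.values())


def cards_shared(ports_per_node: dict) -> int:
    """Cards consumed when ports on the same node share a card where possible."""
    return sum(math.ceil(ports / PORTS_PER_CONNECTX5) for ports in ports_per_node.values())


# Ports needed per node in the slice from this thread.
request = {"client": 1, "server": 1, "router": 2}

print(cards_one_per_port(request))   # 4 cards, as in the original request
print(cards_shared(request))         # 3 cards if the router uses both ports of one card
print(cards_shared({"router": 2}))   # 1 card if only the router needs a ConnectX-5
```

The last line corresponds to the mix-and-match suggestion: nodes that can live with NIC_Basic are simply left out of the count.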

                     

                    #2356
                    Arash SARABI
                    Participant

                      Thank you for all your help,

                      When I use your code, it returns an error:

                      ---> 12 [ifaceRouterC,ifaceRouterS] = nodeRouter.add_component(model="NIC_ConnectX_5", name="cx5_nic").get_intefaces()
                      13
                      14 netC = slice.add_l2network(name='net_c', type='L2Bridge', interfaces=[ifaceclient, ifaceRouterC])
                      
                      AttributeError: 'Component' object has no attribute 'get_intefaces'

                      When I use my code with a different site that has available resources (STAR), the slice reaches the Active state, but after some time it returns the following error:

                      
                      ID                                    Name    Site    Host                          Cores    RAM    Disk  Image              Management IP                           State    Error
                      ------------------------------------  ------  ------  --------------------------  -------  -----  ------  -----------------  --------------------------------------  -------  -------
                      8da49cc5-568d-4262-93fa-15cecbf45017  client  STAR    star-w4.fabric-testbed.net        4     16     500  default_ubuntu_20  2001:400:a100:3030:f816:3eff:fe1d:28a2  Active
                      9e8cbc19-1e42-4244-a8a6-58f4310f58b3  server  STAR    star-w4.fabric-testbed.net        4     16     500  default_ubuntu_20  2001:400:a100:3030:f816:3eff:fe7e:474d  Active
                      6f9725ff-f4e9-4c45-90f1-06425e8fa97c  router  STAR    star-w5.fabric-testbed.net        2      8      10  default_ubuntu_20  2001:400:a100:3030:f816:3eff:fe8a:4cc1  Active
                      
                      Time to stable 168 seconds
                      Running post_boot_config ... 
                      
                      ---------------------------------------------------------------------------
                      AttributeError                            Traceback (most recent call last)
                      /tmp/ipykernel_551/3709344673.py in <module>
                           16 
                           17 
                      ---> 18 slice.submit()
                      
                      /opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/slice.py in submit(self, wait, wait_timeout, wait_interval, progress, wait_jupyter)
                         1218 
                         1219         if progress and wait_jupyter == 'text' and fablib.isJupyterNotebook():
                      -> 1220             self.wait_jupyter(timeout=wait_timeout, interval=wait_interval)
                         1221             return self.slice_id
                         1222 
                      
                      /opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/slice.py in wait_jupyter(self, timeout, interval)
                         1162 
                         1163         print("Running post_boot_config ... ", end="")
                      -> 1164         self.post_boot_config()
                         1165         print(f"Time to post boot config {time.time() - start:.0f} seconds")
                         1166 
                      
                      /opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/slice.py in post_boot_config(self)
                         1107 
                         1108         for iface_thread in iface_threads:
                      -> 1109             iface_thread.result()
                         1110 
                         1111 
                      
                      /opt/conda/lib/python3.9/concurrent/futures/_base.py in result(self, timeout)
                          431                 raise CancelledError()
                          432             elif self._state == FINISHED:
                      --> 433                 return self.__get_result()
                          434 
                          435             self._condition.wait(timeout)
                      
                      /opt/conda/lib/python3.9/concurrent/futures/_base.py in __get_result(self)
                          387     def __get_result(self):
                          388         if self._exception:
                      --> 389             raise self._exception
                          390         else:
                          391             return self._result
                      
                      /opt/conda/lib/python3.9/concurrent/futures/thread.py in run(self)
                           50 
                           51         try:
                      ---> 52             result = self.fn(*self.args, **self.kwargs)
                           53         except BaseException as exc:
                           54             self.future.set_exception(exc)
                      
                      /opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/interface.py in ip_link_toggle(self)
                          284 
                          285         """
                      --> 286         self.get_node().ip_link_down(None, self)
                          287         self.get_node().ip_link_up(None, self)
                          288 
                      
                      /opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/node.py in ip_link_down(self, subnet, interface)
                         1238         """
                         1239 
                      -> 1240         if interface.get_network().get_layer() == NSLayer.L3:
                         1241             if interface.get_network().get_type() == ServiceType.FABNetv6:
                         1242                 ip_command = "sudo ip -6"
                      
                      AttributeError: 'NoneType' object has no attribute 'get_layer'
                      #2357
                      Paul Ruth
                      Keymaster

                        I think the forum markup messed up the quotes in the code snippet I sent. Paste it in and re-type the quotes. It will work.

                        Also, that error is an unnecessary exception thrown by the currently deployed version of fablib when you have an interface that is not attached to a network. In your case, it is caused by the second port of the ConnectX-5s on your nodes. You can safely ignore the error and it will work fine. Basically, fablib gets confused when it tries to configure an interface that you are not using. This error will be suppressed in the next version of fablib.
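
The traceback above fails on `interface.get_network().get_layer()` because `get_network()` returns `None` for a port with no network attached. As a rough sketch of the kind of guard one could use in one's own scripts when looping over interfaces, the helper below skips unattached ports. It assumes only that each interface object exposes `get_network()`, as the traceback shows; `attached_interfaces` is a hypothetical name, not part of fablib.

```python
def attached_interfaces(interfaces):
    """Return only the interfaces that are attached to a network.

    Each item is assumed to expose get_network(), which returns None
    when the port is not part of any network (e.g. an unused second port).
    """
    return [iface for iface in interfaces if iface.get_network() is not None]
```

Running per-interface configuration only over the filtered list avoids calling methods on `None` for the unused second port of a two-port card.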

                        #2358
                        Arash SARABI
                        Participant

                          Thank you very much. After ignoring the error, everything appears to work fine.
                          Regarding the code you provided, I had already fixed the quotes, so I think the error may be caused by something else.

                          #2359
                          Paul Ruth
                          Keymaster

                            Ah… there is a typo… “get_intefaces()” is missing an ‘r’.

                            It should be:
                            [ifaceRouterC,ifaceRouterS] = nodeRouter.add_component(model="NIC_ConnectX_5", name="cx5_nic").get_interfaces()
