modifying device properties
- This topic has 11 replies, 2 voices, and was last updated 2 years, 4 months ago by Paul Ruth.
July 17, 2022 at 12:03 pm #2343
I was wondering if there is a way to disable the Ethernet CRC check. I tried "ethtool -K ens8 rx-all on" but received the following error:
Could not change any device features
Any help would be appreciated.
July 17, 2022 at 1:29 pm #2344
I think that might be a side effect of NIC_Basic interfaces being SR-IOV VFs with limited low-level configuration available. You will have a bit more control if you use FABRIC's dedicated NICs.
I tried running that ethtool command on a NIC_ConnectX_5 and it worked. I think the lower-level control you need requires NIC_ConnectX_5 or NIC_ConnectX_6 NICs.
Let me know if that works for you.
Paul
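One way to see in advance whether a NIC will accept the change is to look at the feature list printed by "ethtool -k <interface>": features the driver will not let you change are normally marked "[fixed]". The sketch below parses that kind of output. The sample text is illustrative only and was not captured from a FABRIC node, and the helper name parse_ethtool_features is made up for this example.

```python
def parse_ethtool_features(output):
    """Parse `ethtool -k <iface>` text into {feature: (enabled, fixed)}."""
    features = {}
    for line in output.splitlines():
        name, sep, value = line.partition(":")
        tokens = value.split()
        # Skip the "Features for <iface>:" header and any line without an on/off state.
        if not sep or not tokens or tokens[0] not in ("on", "off"):
            continue
        features[name.strip()] = (tokens[0] == "on", "[fixed]" in tokens)
    return features


# Illustrative text only; not captured from a FABRIC node.
SAMPLE_OUTPUT = """\
Features for ens8:
rx-checksumming: on
rx-fcs: off [fixed]
rx-all: off [fixed]
"""

features = parse_ethtool_features(SAMPLE_OUTPUT)
enabled, fixed = features["rx-all"]
if fixed:
    print("rx-all is fixed by the driver and cannot be changed on this interface")
```

On a real node, the text would come from running "ethtool -k ens8" there. If rx-all is listed without "[fixed]", the "ethtool -K ens8 rx-all on" command from the original question has a chance of succeeding.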
July 17, 2022 at 3:06 pm #2345
Thank you Paul,
When I replaced NIC_Basic with NIC_ConnectX_5, the slice submission returned the following error:
/opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/slice.py in submit(self, wait, wait_timeout, wait_interval, progress, wait_jupyter)
1207 ssh_key=self.get_slice_public_key())
1208 if return_status != Status.OK:
-> 1209 raise Exception("Failed to submit slice: {}, {}".format(return_status, slice_reservations))
1210
1211 logging.debug(f'slice_reservations: {slice_reservations}')

Exception: Failed to submit slice: Status.FAILURE, (500)
Reason: INTERNAL SERVER ERROR
HTTP response headers: HTTPHeaderDict({'Server': 'nginx/1.21.6', 'Date': 'Sun, 17 Jul 2022 20:04:56 GMT', 'Content-Type': 'text/html; charset=utf-8', 'Content-Length': '130', 'Connection': 'keep-alive', 'Access-Control-Allow-Credentials': 'true', 'Access-Control-Allow-Headers': 'DNT, User-Agent, X-Requested-With, If-Modified-Since, Cache-Control, Content-Type, Range', 'Access-Control-Allow-Methods': 'GET, POST, PUT, DELETE, OPTIONS', 'Access-Control-Allow-Origin': '*', 'Access-Control-Expose-Headers': 'Content-Length, Content-Range, X-Error', 'X-Error': 'PDP Authorization check failed - Policy Violation: Your project is lacking Component.SmartNIC tag to provision a VM with SmartNIC.'})
HTTP response body: PDP Authorization check failed - Policy Violation: Your project is lacking Component.SmartNIC tag to provision a VM with SmartNIC.
July 17, 2022 at 3:08 pm #2346
Here is my code to submit the slice:
slice = fablib.new_slice(name=SLICENAME)
nodeclient = slice.add_node(name="client", site=SITE, cores=1, ram=16, disk=900, image='default_ubuntu_20')
nodeserver = slice.add_node(name="server", site=SITE, cores=1, ram=16, disk=900, image='default_ubuntu_20')
nodeRouter = slice.add_node(name="router", site=SITE, image='default_ubuntu_20')

ifaceclient = nodeclient.add_component(model="NIC_ConnectX_5", name="if_client").get_interfaces()[0]
ifaceserver = nodeserver.add_component(model="NIC_ConnectX_5", name="if_server").get_interfaces()[0]
ifaceRouterC = nodeRouter.add_component(model="NIC_ConnectX_5", name="if_router_c").get_interfaces()[0]
ifaceRouterS = nodeRouter.add_component(model="NIC_ConnectX_5", name="if_router_s").get_interfaces()[0]

netC = slice.add_l2network(name='net_c', type='L2Bridge', interfaces=[ifaceclient, ifaceRouterC])
netS = slice.add_l2network(name='net_s', type='L2Bridge', interfaces=[ifaceserver, ifaceRouterS])

slice.submit()
July 17, 2022 at 3:31 pm #2349
Your project did not have the permissions required to use SmartNICs. I just added the permissions. Please try again.
July 17, 2022 at 3:40 pm #2350
I restarted the kernel, and now I am facing the following problem:
ID                                    Name    Site  Host                        Cores  RAM  Disk  Image              Management IP  State   Error
------------------------------------  ------  ----  --------------------------  -----  ---  ----  -----------------  -------------  ------  -----
96275a0f-f2f7-4928-b20a-f34fc6b2a1e1  client  TACC  tacc-w5.fabric-testbed.net  4      16   500   default_ubuntu_20                 Closed  TicketReviewPolicy: Closing reservation due to failure in slice
5517d25c-2b19-4815-8c0c-136395169d48  server  TACC  tacc-w5.fabric-testbed.net  4      16   500   default_ubuntu_20                 Closed  TicketReviewPolicy: Closing reservation due to failure in slice
46b000bf-ff65-4f7c-9cd5-8bef2f140a42  router  TACC                                                default_ubuntu_20                 Closed  Insufficient resources : Component of type: ConnectX-6 not available in graph node: 8QQBZC3

Exception: node.execute: Management IP Invalid: None
July 17, 2022 at 3:46 pm #2351
It appears that I should check which sites have more resources available.
July 17, 2022 at 3:56 pm #2352
Yeah, that slice request requires four ConnectX-5s, so you will need to be aware of their availability.
I have a few observations that may help:
– All 3 of your nodes are being sent to the same site. You might try putting them on different sites. From your perspective it will work about the same. The main difference will be that the latency between nodes is greater.
– If you only need low level configuration for some interfaces, you could mix-and-match NIC_ConnectX_5 with NIC_Basic. For example, maybe your router needs a ConnectX_5 but the nodes can use a NIC_Basic (or the other way around).
– Your router node is asking for two ConnectX-5s. Each ConnectX-5 has two ports. If you really only need two ports, you can use one ConnectX-5 for your router. The code for that will look something like:
connectx5_interfaces = nodeRouter.add_component(model="NIC_ConnectX_5", name="cx5_nic").get_interfaces()
ifaceRouterC = connectx5_interfaces[0]
ifaceRouterS = connectx5_interfaces[1]
or you can shorten it to:
[ifaceRouterC, ifaceRouterS] = nodeRouter.add_component(model="NIC_ConnectX_5", name="cx5_nic").get_interfaces()
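The arithmetic behind these suggestions can be sketched in a few lines. The port counts come from the slice request earlier in the thread (one port each for client and server, two for the router), and the two-ports-per-card figure is the one given in this reply. The helper name nics_needed is made up for illustration and is not part of fablib.

```python
import math

PORTS_PER_CONNECTX5 = 2  # per the reply above: each ConnectX-5 has two ports


def nics_needed(ports_per_node, ports_per_nic=PORTS_PER_CONNECTX5):
    """Count dedicated NICs needed when each node packs its ports onto as few cards as possible."""
    return sum(math.ceil(ports / ports_per_nic) for ports in ports_per_node.values())


# Ports used by each node in the slice request from this thread.
requested_ports = {"client": 1, "server": 1, "router": 2}

# Original request: one card per port.
one_card_per_port = sum(requested_ports.values())

# Router shares one card for both ports; client and server keep a ConnectX-5 each.
shared_router_card = nics_needed(requested_ports)

# Mix and match: only the router uses a ConnectX-5, the other nodes use NIC_Basic.
router_only = nics_needed({"router": 2})

print(one_card_per_port, shared_router_card, router_only)
```

Under these assumptions the request drops from four ConnectX-5s to three, or to one if only the router needs low-level control.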
July 17, 2022 at 4:07 pm #2356
Thank you for all your help,
When I use your code, it returns an error:
---> 12 [ifaceRouterC,ifaceRouterS] = nodeRouter.add_component(model="NIC_ConnectX_5", name="cx5_nic").get_intefaces()
     13
     14 netC = slice.add_l2network(name='net_c', type='L2Bridge', interfaces=[ifaceclient, ifaceRouterC])

AttributeError: 'Component' object has no attribute 'get_intefaces'

When I use my own code with a different site that has available resources (STAR), the slice reaches the Active state, but after some time it returns the following error:
ID                                    Name    Site  Host                        Cores  RAM  Disk  Image              Management IP                           State   Error
------------------------------------  ------  ----  --------------------------  -----  ---  ----  -----------------  --------------------------------------  ------  -----
8da49cc5-568d-4262-93fa-15cecbf45017  client  STAR  star-w4.fabric-testbed.net  4      16   500   default_ubuntu_20  2001:400:a100:3030:f816:3eff:fe1d:28a2  Active
9e8cbc19-1e42-4244-a8a6-58f4310f58b3  server  STAR  star-w4.fabric-testbed.net  4      16   500   default_ubuntu_20  2001:400:a100:3030:f816:3eff:fe7e:474d  Active
6f9725ff-f4e9-4c45-90f1-06425e8fa97c  router  STAR  star-w5.fabric-testbed.net  2      8    10    default_ubuntu_20  2001:400:a100:3030:f816:3eff:fe8a:4cc1  Active

Time to stable 168 seconds
Running post_boot_config ...
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
/tmp/ipykernel_551/3709344673.py in <module>
     16
     17
---> 18 slice.submit()

/opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/slice.py in submit(self, wait, wait_timeout, wait_interval, progress, wait_jupyter)
   1218
   1219 if progress and wait_jupyter == 'text' and fablib.isJupyterNotebook():
-> 1220 self.wait_jupyter(timeout=wait_timeout, interval=wait_interval)
   1221 return self.slice_id
   1222

/opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/slice.py in wait_jupyter(self, timeout, interval)
   1162
   1163 print("Running post_boot_config ... ", end="")
-> 1164 self.post_boot_config()
   1165 print(f"Time to post boot config {time.time() - start:.0f} seconds")
   1166

/opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/slice.py in post_boot_config(self)
   1107
   1108 for iface_thread in iface_threads:
-> 1109 iface_thread.result()
   1110
   1111

/opt/conda/lib/python3.9/concurrent/futures/_base.py in result(self, timeout)
    431 raise CancelledError()
    432 elif self._state == FINISHED:
--> 433 return self.__get_result()
    434
    435 self._condition.wait(timeout)

/opt/conda/lib/python3.9/concurrent/futures/_base.py in __get_result(self)
    387 def __get_result(self):
    388 if self._exception:
--> 389 raise self._exception
    390 else:
    391 return self._result

/opt/conda/lib/python3.9/concurrent/futures/thread.py in run(self)
     50
     51 try:
---> 52 result = self.fn(*self.args, **self.kwargs)
     53 except BaseException as exc:
     54 self.future.set_exception(exc)

/opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/interface.py in ip_link_toggle(self)
    284
    285 """
--> 286 self.get_node().ip_link_down(None, self)
    287 self.get_node().ip_link_up(None, self)
    288

/opt/conda/lib/python3.9/site-packages/fabrictestbed_extensions/fablib/node.py in ip_link_down(self, subnet, interface)
   1238 """
   1239
-> 1240 if interface.get_network().get_layer() == NSLayer.L3:
   1241 if interface.get_network().get_type() == ServiceType.FABNetv6:
   1242 ip_command = "sudo ip -6"

AttributeError: 'NoneType' object has no attribute 'get_layer'
July 17, 2022 at 4:28 pm #2357
I think the forum markup messed up the quotes in the code snippet I sent. Paste it in and re-type the quotes. It will work.
Also, that error is an unnecessary exception thrown by the currently deployed version of fablib when an interface is not attached to a network. In your case, it is caused by the unused second port of the ConnectX-5s on your nodes. You can safely ignore the error and it will work fine. Basically, fablib gets confused when it tries to configure an interface that you are not using. This error will be suppressed in the next version of fablib.
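The failure mode described here can be illustrated with stand-in classes: an interface whose get_network() returns None has no get_layer() to call, so a guard that skips unattached interfaces avoids the AttributeError. This is only a sketch of the idea. StubInterface, StubNetwork and attached_interfaces are hypothetical names invented for the example, and the actual fix in fablib may look different.

```python
class StubNetwork:
    """Hypothetical stand-in for a network an interface is attached to."""

    def __init__(self, name):
        self.name = name


class StubInterface:
    """Hypothetical stand-in for an interface; network is None when the port is unused."""

    def __init__(self, name, network=None):
        self.name = name
        self.network = network

    def get_name(self):
        return self.name

    def get_network(self):
        return self.network


def attached_interfaces(interfaces):
    """Keep only interfaces that are attached to a network and therefore safe to configure."""
    return [iface for iface in interfaces if iface.get_network() is not None]


# A two-port card where only the first port is attached to a network.
ports = [
    StubInterface("cx5_nic-p1", StubNetwork("net_c")),
    StubInterface("cx5_nic-p2"),
]

configured = [iface.get_name() for iface in attached_interfaces(ports)]
print(configured)
```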
July 17, 2022 at 4:39 pm #2358
Thank you very much. After ignoring the error, everything appears to work fine.
Regarding the code you provided, I had already fixed the quotes, so I think the error may be caused by something else.
July 17, 2022 at 4:53 pm #2359
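A note on the AttributeError quoted earlier in the thread: the traceback shows the call as get_intefaces(), while the snippet it was copied from uses get_interfaces(), so the remaining failure may simply be a missing "r" rather than a quoting problem. A quick way to catch this kind of slip is to compare the failing name against the names the object actually has. The sketch below does that with Python's difflib. StubComponent is a made-up stand-in, not fablib's Component class, and closest_attribute is a hypothetical helper.

```python
import difflib


class StubComponent:
    """Hypothetical stand-in exposing a couple of getter methods."""

    def get_interfaces(self):
        return ["port0", "port1"]

    def get_name(self):
        return "cx5_nic"


def closest_attribute(obj, wrong_name):
    """Return the public attribute name most similar to wrong_name, or None if nothing is close."""
    candidates = [name for name in dir(obj) if not name.startswith("_")]
    matches = difflib.get_close_matches(wrong_name, candidates, n=1)
    return matches[0] if matches else None


suggestion = closest_attribute(StubComponent(), "get_intefaces")
print(suggestion)
```

With the real object, the same check could be run as closest_attribute(component, "get_intefaces") in the notebook cell that raised the error.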