issue with creating chameleon server using the notebook chameleon_facility_port
- This topic has 14 replies, 2 voices, and was last updated 9 months, 3 weeks ago by Sanjana Das.
February 20, 2024 at 3:16 pm #6592
I wanted to run an experiment spanning Chameleon and FABRIC, and I was just running the chameleon_facility_port_fabnetv4 notebook, but I am getting an error while trying to create the Chameleon server. The error is as follows:
---------------------------------------------------------------------------
ResourceFailure                           Traceback (most recent call last)
Cell In[6], line 19
     17 for server in servers:
     18     print(f'Waiting for server: {server.name}')
---> 19     chi.server.wait_for_active(server.id)
     20 print('Done!')

File /opt/conda/lib/python3.10/site-packages/chi/server.py:514, in wait_for_active(server_id, timeout)
    512 compute = connection().compute
    513 server = compute.get_server(server_id)
--> 514 return compute.wait_for_server(server, wait=timeout)

File /opt/conda/lib/python3.10/site-packages/openstack/compute/v2/_proxy.py:2510, in Proxy.wait_for_server(self, server, status, failures, interval, wait, callback)
   2482 """Wait for a server to be in a particular status.
   2483
   2484 :param server: The :class:`~openstack.compute.v2.server.Server` to wait
   (...)
   2507     ``status`` attribute.
   2508 """
   2509 failures = ['ERROR'] if failures is None else failures
-> 2510 return resource.wait_for_status(
   2511     self,
   2512     server,
   2513     status,
   2514     failures,
   2515     interval,
   2516     wait,
   2517     callback=callback,
   2518 )

File /opt/conda/lib/python3.10/site-packages/openstack/resource.py:2409, in wait_for_status(session, resource, status, failures, interval, wait, attribute, callback)
   2407     return resource
   2408 elif normalized_status in failures:
-> 2409     raise exceptions.ResourceFailure(
   2410         "{name} transitioned to failure state {status}".format(
   2411             name=name, status=new_status
   2412         )
   2413     )
   2415 LOG.debug(
   2416     'Still waiting for resource %s to reach state %s, '
   2417     'current state is %s',
   (...)
   2420     new_status,
   2421 )
   2423 if callback:

ResourceFailure: Server:343a728a-3f4f-4edf-a8c5-9e8e9d7f9b9c transitioned to failure state ERROR
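For readers unfamiliar with the traceback above: the wait helper polls the server status and gives up as soon as the status is one of the failure states (by default only ERROR), so waiting longer does not help once the server has failed. The following is a simplified, standalone sketch of that polling pattern. It is not the openstacksdk implementation; the `ResourceFailure` class, the `wait_for_status` signature, and the status sequence here are stand-ins made up for illustration.

```python
import time


class ResourceFailure(Exception):
    """Stand-in for openstack.exceptions.ResourceFailure (illustration only)."""


def wait_for_status(get_status, target="ACTIVE", failures=("ERROR",),
                    interval=0.0, max_polls=10):
    """Poll get_status() until it returns target; fail fast on a failure state."""
    for _ in range(max_polls):
        status = get_status().upper()
        if status == target:
            return status
        if status in failures:
            # A failure state is terminal: stop polling and report it.
            raise ResourceFailure(f"transitioned to failure state {status}")
        time.sleep(interval)
    raise TimeoutError(f"timed out waiting for {target}")


# Hypothetical status sequence: the server builds for a while, then fails.
statuses = iter(["BUILD", "BUILD", "ERROR"])
try:
    wait_for_status(lambda: next(statuses))
except ResourceFailure as exc:
    message = str(exc)
    print(message)
```

In practice this means the cause has to be looked up on the Chameleon side (for example, in the instance details), since the exception itself only reports the final state.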
February 21, 2024 at 4:58 pm #6594
I am still having this issue, so I manually created a server in Chameleon, got its IP, and went on to run the notebook, but at the final step the FABRIC node was not able to ping the Chameleon server.
Moreover, I have another question. From the fabric portal, I know that we can create a chameleon facility port and connect it to a fabric slice but can we replicate this exact experiment (creating a node(s) in chameleon and also in fabric and connecting them) using the fabric portal?
February 22, 2024 at 2:28 pm #6597
@Sanjana, the Chameleon team would be better equipped to help you with the failure observed while creating the node on Chameleon. The Jupyter notebook referred to in your post uses the Chameleon Python API.
FABRIC portal doesn’t provide support to provision resources on Chameleon. You would have to use Chameleon Portal to use their Graphical Interface.
Thanks,
Komal
February 23, 2024 at 8:49 am #6601
Just realized you also had a problem with network reachability between the Chameleon and FABRIC nodes. Could you please share your slice ID for FABRIC?
Also, please check that the interface and routes are set up correctly on the Chameleon node.
February 23, 2024 at 7:35 pm #6605
Hello. There are two slices being created in that notebook, so I will send both of them! The first one is “tacc_stitch” with the ID 2ff552d5-2965-4c1f-bac5-7b4221869cc5 and the second one is “MyFabricNodes” with the ID 5f9a95b6-b933-4582-a650-5ac28af8ef9e. In the Chameleon facility port l3fabnetv4 notebook, the node from “MyFabricNodes” is pinging the Chameleon server with the IP 10.130.164.13.
I think the interfaces and routes are correctly set up since I did not change anything in the notebook. Moreover, I was successfully able to create the server from the notebook itself so that issue was also resolved.
February 27, 2024 at 11:36 pm #6619
February 28, 2024 at 8:45 am #6622
Hi Sanjana,
I am able to reproduce this issue on MASS, but I was able to get this to work on other sites like SEAT and PSC. Could you please use a different site like SEAT or PSC while we investigate this issue? I will keep you updated with the findings for MASS.
Thank you for sharing your observations and helping us make the testbed better.
Thanks,
Komal
February 28, 2024 at 1:19 pm #6630
Hi,
I am still facing the same issue on the other sites as well. Do we need to explicitly add routes from the nodes of those sites so that they can reach the fabnetv4 subnet created at TACC?
February 28, 2024 at 2:18 pm #6631
MASS is working as well. We checked your FABRIC nodes: the FABNet services seem to be connected properly and we can ping the gateway. The FABRIC VMs in your slice can ping each other too.
Not sure how your Chameleon server is set up.
You should see routes and an interface setup similar to the below on your Chameleon node:
cc@kthare10-fabric-stitch-server-1:~$ ip route list
default via 10.130.163.2 dev eno1np0 proto dhcp src 10.130.163.10 metric 100
10.128.0.0/10 via 10.130.163.1 dev eno1np0 proto dhcp src 10.130.163.10 metric 100
10.130.163.0/24 dev eno1np0 proto kernel scope link src 10.130.163.10
169.254.169.254 via 10.130.163.3 dev eno1np0 proto dhcp src 10.130.163.10 metric 100
cc@kthare10-fabric-stitch-server-1:~$ ifconfig
eno1np0: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 10.130.163.10  netmask 255.255.255.0  broadcast 10.130.163.255
        inet6 fe80::be97:e1ff:fec4:8e0  prefixlen 64  scopeid 0x20<link>
        ether bc:97:e1:c4:08:e0  txqueuelen 1000  (Ethernet)
        RX packets 4937  bytes 1058216 (1.0 MB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 4804  bytes 410390 (410.3 KB)
        TX errors 0  dropped 0  overruns 0  carrier 0  collisions 0
P.S: I did execute the cell indicated as “(Optionally) Add a Router and Attach it to the Subnet”.
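One way to compare a Chameleon node against the routes listed in this reply is to paste its own `ip route list` output into a small script and check which next hop would be used for a given destination. The sketch below uses only the Python standard library and the route lines quoted above; the helper names `parse_routes` and `next_hop_for` are made up for illustration, and the destination address 10.132.1.10 is a hypothetical FABRIC-side address, not one taken from this thread.

```python
import ipaddress

# Route lines copied from the `ip route list` output above.
ROUTES = """\
default via 10.130.163.2 dev eno1np0 proto dhcp src 10.130.163.10 metric 100
10.128.0.0/10 via 10.130.163.1 dev eno1np0 proto dhcp src 10.130.163.10 metric 100
10.130.163.0/24 dev eno1np0 proto kernel scope link src 10.130.163.10
169.254.169.254 via 10.130.163.3 dev eno1np0 proto dhcp src 10.130.163.10 metric 100
"""


def parse_routes(text):
    """Return (network, next_hop) pairs; next_hop is None for on-link routes."""
    routes = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        dest = "0.0.0.0/0" if fields[0] == "default" else fields[0]
        network = ipaddress.ip_network(dest, strict=False)
        next_hop = fields[fields.index("via") + 1] if "via" in fields else None
        routes.append((network, next_hop))
    return routes


def next_hop_for(routes, address):
    """Longest-prefix match: the (network, next_hop) used to reach address."""
    ip = ipaddress.ip_address(address)
    matches = [route for route in routes if ip in route[0]]
    return max(matches, key=lambda route: route[0].prefixlen) if matches else None


routes = parse_routes(ROUTES)
# Hypothetical FABRIC-side address inside 10.128.0.0/10.
print(next_hop_for(routes, "10.132.1.10"))
```

If the 10.128.0.0/10 entry is missing from a node's table, the same lookup falls through to the default route instead, which would be consistent with the node being unable to reach FABRIC addresses.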
- This reply was modified 9 months, 4 weeks ago by Komal Thareja.
February 28, 2024 at 5:26 pm #6637
Hi,
Thanks for that! Would it be possible for you to share the code snippet showing how you set up the server? Below is how I did it, and I can see the server up and running in the Chameleon GUI. I can also see that it was successfully assigned an IP address from the fabnetv4 pool. Also, I pinged the FABRIC gateway (10.30.162.1) from a node in PSC and that worked, but the ping failed for the Chameleon gateway (10.130.162.2).
Create the lease

# Imports needed by this excerpt
import json
from datetime import datetime, timedelta
from dateutil import tz

import chi.lease
import chi.network

BLAZAR_TIME_FORMAT = '%Y-%m-%d %H:%M'

# Set start/end date for lease
# Start one minute into future to avoid Blazar thinking lease is in past
# due to rounding to closest minute.
start_date = (datetime.now(tz=tz.tzutc()) + timedelta(minutes=1)).strftime(BLAZAR_TIME_FORMAT)
end_date = (datetime.now(tz=tz.tzutc()) + timedelta(days=1)).strftime(BLAZAR_TIME_FORMAT)

# Build list of reservations (one network reservation and one host reservation)
reservation_list = [
    {
        "resource_type": "network",
        "network_name": chameleon_network_name,
        "network_properties": "",
        "resource_properties": json.dumps(
            ["==", "$stitch_provider", 'fabric']
        ),
    },
    {
        'resource_type': 'physical:host',
        'resource_properties': '["==", "$node_type", "compute_skylake"]',
        'hypervisor_properties': '',
        'min': 1,
        'max': 1,
    },
]

chameleon_lease = chi.lease.create_lease(chameleon_lease_name,
                                         reservations=reservation_list,
                                         start_date=start_date,
                                         end_date=end_date)

# Print the lease info
chameleon_network_reservation_id = [reservation for reservation in chameleon_lease['reservations'] if reservation['resource_type'] == 'network'][0]['id']
print(f"chameleon_network_reservation_id: {chameleon_network_reservation_id}")
chameleon_server_reservation_id = [reservation for reservation in chameleon_lease['reservations'] if reservation['resource_type'] == 'physical:host'][0]['id']
print(f"chameleon_node_reservation_id: {chameleon_server_reservation_id}")

Configure chameleon network and routes

chameleon_subnet = chi.network.create_subnet(chameleon_subnet_name, chameleon_network_id,
                                             cidr=str(subnet),
                                             allocation_pool_start=chameleon_allocation_pool_start,
                                             allocation_pool_end=chameleon_allocation_pool_end,
                                             gateway_ip=chameleon_gateway_ip)

# Add a host route so servers on this subnet reach FABNetv4 via the FABRIC gateway
chi.neutron().update_subnet(subnet=chameleon_subnet['id'],
                            body={
                                "subnet": {
                                    "host_routes": [
                                        {
                                            "destination": f"{fablib.FABNETV4_SUBNET}",
                                            "nexthop": f"{fabric_gateway_ip}"
                                        }
                                    ]
                                }
                            })

print(f"subnet name : {chameleon_subnet['name']}")
print(f"subnet : {chameleon_subnet['cidr']}")
print(f"gateway_ip : {chameleon_subnet['gateway_ip']}")

for starting server

import chi.server

servers = []
for i in range(chameleon_server_count):
    server_name = f"{chameleon_server_name}_{i+1}"

    # Create the server
    servers.append(chi.server.create_server(server_name,
                                            reservation_id=chameleon_server_reservation_id,
                                            network_name=chameleon_network_name,
                                            image_name=chameleon_image_name,
                                            key_name=chameleon_key_name))

# Wait until the server is active
for server in servers:
    print(f'Waiting for server: {server.name}')
    chi.server.wait_for_active(server.id)
print('Done!')
February 28, 2024 at 8:48 pm #6638
I just ran this notebook: https://github.com/fabric-testbed/jupyter-examples/blob/main/fabric_examples/complex_recipes/Chameleon_Facility_Port/Chameleon_Facility_Port_fabnetv4.ipynb
No additional steps needed.
Thanks,
Komal
February 29, 2024 at 10:44 am #6642
Hi,
What did you use for the Chameleon server reservation ID? The notebook has no steps to reserve a server, so that is where I tried reserving a server using the chi API and got the reservation ID from it.
February 29, 2024 at 4:31 pm #6646
Please create a lease to reserve a host on Chameleon via Project -> Reservations -> Leases -> Create Lease.
Once the lease is created, click on it. You will see a Reservation section; copy the ID from there.
This is the ID you need to use in the notebook. Hope this helps.
If you create the server on Chameleon manually, please set the IP address and the routes on the server as below:
ip addr add 10.130.162.2/24 dev eth1
Add route:
route add -net 10.130.162.0/24 dev eth1
Change the IP and interface as per your FabNet subnet.
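As a small aid for adapting the two commands above to a different subnet, the sketch below builds the same command strings from a server IP, a subnet, and an interface name, and refuses an IP that does not belong to the subnet. It only formats text and does not run anything on the server; the function name `manual_setup_commands` is made up for illustration.

```python
import ipaddress


def manual_setup_commands(server_ip, subnet, interface):
    """Build the address and route commands for a manually created server."""
    network = ipaddress.ip_network(subnet)
    address = ipaddress.ip_address(server_ip)
    if address not in network:
        raise ValueError(f"{server_ip} is not inside {subnet}")
    return [
        f"ip addr add {address}/{network.prefixlen} dev {interface}",
        f"route add -net {network} dev {interface}",
    ]


# Values taken from the commands in this reply.
for command in manual_setup_commands("10.130.162.2", "10.130.162.0/24", "eth1"):
    print(command)
```

The printed lines can then be run with root privileges on the Chameleon server.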
February 29, 2024 at 5:05 pm #6651
Attaching the screenshot for Chameleon Lease
March 4, 2024 at 10:51 am #6672
Thank you so much for all your help! Really appreciate it.