Rasman Mubtasim Swargo

Forum Replies Created

Viewing 11 posts - 1 through 11 (of 11 total)
  • Yes, that was the case. My bad. Do I need to delete the slice and reserve again, or is there a way to continue with the existing slice?

     

    Thank you for your assistance.

    Hi Rasman,

    Which JH container are you using?

    Best,

    Komal

    Hi Komal,

    I am using the default FABRIC JupyterHub container ((default) FABRIC Examples v1.9.0, FABlib v1.9.3: released: 07/28/2025, stable: Summer 2025) from the FABRIC portal, not a custom local environment.

    If you need additional details, I can also share those.

    Best,
    Rasman

    Hello, the FABRIC MASS dataplane link was having some problems; however, it should be working now. Please try your service again and let us know if everything is working. Thanks, Tom

    Hello,

    Thanks for the update.

    I removed the old slice and created a new one to retry from scratch, but I am still seeing the same issue. The new slice is:

    • Slice ID: f00b699f-7332-4cd5-b4fd-f64b0da3afe3
    • Slice name: iPerf3-tuned-nic-x6-64gb

    In this new slice, the L2 network was created successfully and shows Active, and both nodes are also Active. The interfaces are attached to net1, but the IP Address field still remains None for both sides.

    Current setup:

    • Node-FIU at FIU
    • Node-KANS at KANS
    • Network: net1
    • Subnet: 192.168.1.0/24

    From the topology view:

    • Node-FIU-nic1-p1 is attached to net1, mode auto, IP Address None
    • Node-KANS-nic1-p1 is attached to net1, mode auto, IP Address None

    So the issue still seems to be that the network is created, but IPs are not being assigned to the interfaces.

    Please let me know whether there is an additional step needed for IP assignment on L2 networks, or if this indicates another dataplane/control-plane issue.

    Thank you.

     

    Code:

    # Create slice
    slice = fablib.new_slice(name=slice_name)

    net1 = slice.add_l2network(name=network_name, subnet=subnet)

    for s in sites:
        # One node per site
        node1 = slice.add_node(name=f"Node-{s}", cores=cores, ram=ram, disk=disk, site=s, image=image)

        iface1 = node1.add_component(model=model_name, name=nic_name).get_interfaces()[0]
        node1.add_component(model='NVME_P4510', name='nvme1')
        iface1.set_mode('auto')
        net1.add_interface(iface1)
        net1.set_bandwidth(60)

        node1.add_post_boot_upload_directory('node_tools', '.')
        node1.add_post_boot_execute('sudo node_tools/host_tune.sh')
        # node1.add_post_boot_execute('node_tools/enable_docker.sh {{ _self_.image }} ')
        # node1.add_post_boot_execute('docker pull fabrictestbed/slice-vm-ubuntu20-network-tools:0.0.1 ')

    # Submit slice request
    slice.submit();
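
    For reference, the addresses one would expect to see on the two interfaces can be enumerated from the subnet with the standard library. This is only a sketch: the subnet and interface names come from the post above, `plan_addresses` is a hypothetical helper, and nothing here calls FABlib or claims how its `auto` mode chooses addresses.

```python
from ipaddress import IPv4Network


def plan_addresses(subnet, names):
    """Pair each interface name with the next usable host address in the subnet."""
    # hosts() skips the network and broadcast addresses of the subnet.
    hosts = iter(subnet.hosts())
    return {name: next(hosts) for name in names}


subnet = IPv4Network("192.168.1.0/24")
plan = plan_addresses(subnet, ["Node-FIU-nic1-p1", "Node-KANS-nic1-p1"])

for name, addr in plan.items():
    # Print in the address/prefix form that `ip addr` would show.
    print(f"{name}: {addr}/{subnet.prefixlen}")
```

    Comparing this plan with what `ip addr` reports on each node would show whether an interface simply never received its address.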

    in reply to: Performance Drop on ConnectX-6 After Release 1.9 #9047

    It worked after manually doing the steps you described. Thanks.

    in reply to: Performance Drop on ConnectX-6 After Release 1.9 #9046

    I have booked a slice ( d6065a22-c893-425f-b12f-3bc0fe4d2481 ) with nodes at NEWY and CERN, which are listed as connected at 320 Gbps. This time it did not get stuck and everything went smoothly, but I am still getting only around 3 Gbps.

    Could you please have a look?
    I saw that there is another 8 Gbps line listed for (NewY, CERN). Can you guide me on how to pick sites so that I can get the fastest network speed?

    ('NEWY', 'CERN') link:local-port+cern-data-sw:FourHundredGigE0/0/0/26.3733:remote-port+newy-data-sw:FourHundredGigE0/0/0/60.3733 320 N/A L2
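
    To make the question concrete, here is a minimal sketch of choosing the highest-capacity entry when a site pair is listed more than once. The 320 and 8 Gbps figures are the ones mentioned above; the tuple layout and the `best_link` helper are assumptions for illustration and not FABlib's data format.

```python
def best_link(links, pair):
    """Return the highest-capacity entry for an unordered site pair, or None."""
    wanted = frozenset(pair)
    # Treat ("NEWY", "CERN") and ("CERN", "NEWY") as the same pair.
    candidates = [link for link in links if frozenset(link[0]) == wanted]
    return max(candidates, key=lambda link: link[1], default=None)


# Hypothetical listing: (site pair, capacity in Gbps).
links = [
    (("NEWY", "CERN"), 320),
    (("NEWY", "CERN"), 8),
]

print(best_link(links, ("CERN", "NEWY")))
```

    Listed capacity is only an upper bound on what a single flow can reach, so the measured rate may still be far lower, as in the 3 Gbps result reported here.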

     

    in reply to: Performance Drop on ConnectX-6 After Release 1.9 #9041

    The snippet you provided fails at the 'make' command on both nodes:

    ubuntu@Node-GATECH:~/iperf-3.18$ make
    Making all in src
    make[1]: Entering directory '/home/ubuntu/iperf-3.18/src'
    make all-am
    make[2]: Entering directory '/home/ubuntu/iperf-3.18/src'
    CC iperf3-main.o
    main.c:212:1: fatal error: opening dependency file .deps/iperf3-main.Tpo: Permission denied
    212 | }
    | ^
    compilation terminated.
    make[2]: *** [Makefile:974: iperf3-main.o] Error 1
    make[2]: Leaving directory '/home/ubuntu/iperf-3.18/src'
    make[1]: *** [Makefile:733: all] Error 2
    make[1]: Leaving directory '/home/ubuntu/iperf-3.18/src'
    make: *** [Makefile:404: all-recursive] Error 1
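
    One possible cause of the "Permission denied" on `.deps` (an assumption, not confirmed in this thread) is that part of the source tree was created by a step run as root, so the `ubuntu` user cannot write there. A small sketch to list paths under the build directory that are not owned by the current user; `paths_not_owned_by_user` is a hypothetical helper, and the directory name is taken from the output above.

```python
import os


def paths_not_owned_by_user(root):
    """List paths at or below root whose owner differs from the current user."""
    uid = os.getuid()
    foreign = []
    if os.lstat(root).st_uid != uid:
        foreign.append(root)
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            # lstat so that symlinks are checked themselves, not their targets.
            if os.lstat(path).st_uid != uid:
                foreign.append(path)
    return foreign


build_dir = os.path.expanduser("~/iperf-3.18")
if os.path.isdir(build_dir):
    for path in paths_not_owned_by_user(build_dir):
        print(path)
```

    If this prints entries such as `src/.deps`, the tree was at least partly created by another user, which would be consistent with the error above.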
    in reply to: Performance Drop on ConnectX-6 After Release 1.9 #9039

    Slice ID: 25c5b6c2-f0f8-4cc9-b4e1-cad570231aca

    One thing I forgot to mention is that the execution often gets stuck in the slice submission cell. The post-boot config of one node usually finishes, but the other one hangs. It gets stuck at this point, where FIU's node never delivers the 'Done!' message:

    Time to StableOK 246 seconds
    Running post_boot_config ... 
    Running post boot config threads ...
    Post boot config Node-GATECH, Done! (16 sec)
    

    Here’s the code:

    sites = ['GATECH', 'FIU']
    print(f"Sites: {sites}")
    
    node1_name = 'Node1'
    node2_name = 'Node2'
    cores=8
    ram=64
    disk=1000
    image='default_ubuntu_20'
    
    slice_name = 'iPerf3-tuned-nic-x6-64gb-1tb-GF-2'
    nic_name = 'nic1'
    model_name = 'NIC_ConnectX_6'
    network_name='net1'
    from ipaddress import ip_address, IPv4Address, IPv6Address, IPv4Network, IPv6Network
    
    subnet = IPv4Network("192.168.1.0/24")
    available_ips = list(subnet)[1:]
    
    #Create Slice
    slice = fablib.new_slice(name=slice_name)
    net1 = slice.add_l2network(name=network_name, subnet=subnet)
    
    for s in sites:
        # One node per site
        node1 = slice.add_node(name=f"Node-{s}", cores=cores, ram=ram, disk=disk, site=s, image=image)

        iface1 = node1.add_component(model=model_name, name=nic_name).get_interfaces()[0]
        node1.add_component(model='NVME_P4510', name='nvme1')
        iface1.set_mode('auto')
        net1.add_interface(iface1)
        net1.set_bandwidth(50)

        node1.add_post_boot_upload_directory('node_tools','.')
        node1.add_post_boot_execute('sudo node_tools/host_tune.sh')
        # node1.add_post_boot_execute('node_tools/enable_docker.sh {{ _self_.image }} ')
        # node1.add_post_boot_execute('docker pull fabrictestbed/slice-vm-ubuntu20-network-tools:0.0.1 ')
    
    #Submit Slice Request
    slice.submit();
    
    
    

    I have to stop the execution and move to the next cell. I’ll report here what I get after running the esnet iperf3. Let me know if you need anything to investigate this issue.

    It’s working now.

    Thanks,

    Swargo

    Hi Rasman,

    I was able to run iperf3 optimized notebook without issues. I am unable to access your notebook. It says Page Not Found.

    Could you please share your slice ID?

    Thanks,

    Komal

    I have tried again, this time without any modifications to the original notebook, but I still got the same error.

    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    Source:  Node-MASS to Dest: Node-TACC
    iperf3: error - unable to connect to server: No route to host
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx
    Source:  Node-TACC to Dest: Node-MASS
    iperf3: error - unable to connect to server: No route to host
    xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx

     

    Slice ID: 919b590f-87aa-41e8-bb22-58acf4c79d4c
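
    When the notebook tests many site pairs, output like the above can be summarised per direction. This is a minimal sketch that assumes the "Source: ... to Dest: ..." and "iperf3: error - ..." line formats shown in this post; `failed_pairs` is a hypothetical helper and not part of the notebook.

```python
import re

SOURCE_RE = re.compile(r"\s*Source:\s+(\S+)\s+to\s+Dest:\s+(\S+)")
ERROR_RE = re.compile(r"\s*iperf3: error - (.+)")


def failed_pairs(output):
    """Return (source, dest, reason) for each test whose iperf3 run reported an error."""
    failures = []
    current = None
    for line in output.splitlines():
        source = SOURCE_RE.match(line)
        if source:
            current = (source.group(1), source.group(2))
            continue
        error = ERROR_RE.match(line)
        if error and current:
            failures.append((current[0], current[1], error.group(1).strip()))
            current = None
    return failures


sample = """\
Source:  Node-MASS to Dest: Node-TACC
iperf3: error - unable to connect to server: No route to host
Source:  Node-TACC to Dest: Node-MASS
iperf3: error - unable to connect to server: No route to host
"""

for src, dst, reason in failed_pairs(sample):
    print(f"{src} -> {dst}: {reason}")
```

    Failures in both directions, as here, point at the path between the two sites rather than at one node's iperf3 server.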

    Hi Rasman,

    I was able to run iperf3 optimized notebook without issues. I am unable to access your notebook. It says Page Not Found.

    Could you please share your slice ID?

    Thanks,

    Komal

    Hi Komal,
    Here is the slice ID: f05dedc0-468f-406e-a566-8041d507ad60

    I am trying to transfer some files from MASS to TACC but it is failing to find any route.

     

    Here is the notebook that I used: https://github.com/swargo98/LLM-based-Data-Movement-Optimizer/blob/main/iperf3_optimized_w_error.ipynb

    Sorry, I could not upload the notebook as the file type is not supported.

    Here is the github link of the notebook: https://github.com/swargo98/LLM-based-Data-Movement-Optimizer/blob/main/iperf3_optimized_w_error.ipynb
