Bekmukhamed Tursunbayev

Forum Replies Created

Viewing 2 posts - 1 through 2 (of 2 total)
  • in reply to: Cannot allocate GPU + ConnectX-6 on same node #9723

    Thanks for the suggestion.

    I checked cern-w2 on the portal and confirmed it has both A30 and ConnectX-6 available. I also verified through the fablib API:
    cern-w2.fabric-testbed.net:
    a30_available: 1
    nic_connectx_6_available: 1

    I tried allocating with host="cern-w2.fabric-testbed.net" and also without specifying a host (letting FABRIC choose). Both attempts fail:
    With host specified: "Component of type: ConnectX-6 not available in graph node: 1B5F6R3"
    Without host: "Component of type: A30 not available in graph node: 2B5F6R3"

    The graph node IDs in the errors (1B5F6R3, 2B5F6R3) change between attempts, which makes me think that either the allocation engine is not placing the VM on cern-w2, or its internal resource graph is out of sync with what the API reports.
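
    To compare attempts more systematically, the component type and graph node ID can be pulled out of each error message. The sketch below assumes the errors always follow the wording of the two messages quoted above; the regular expression and the helper parse_allocation_error are my own and not part of any FABRIC tooling.

```python
import re
from typing import Optional, Tuple

# Pattern inferred from the two error messages in the post (an assumption:
# other FABRIC errors may be worded differently).
ERROR_RE = re.compile(
    r"Component of type: (?P<component>[\w-]+) "
    r"not available in graph node: (?P<node>[\w-]+)"
)


def parse_allocation_error(message: str) -> Optional[Tuple[str, str]]:
    """Return (component, graph_node_id), or None if the message does not match."""
    match = ERROR_RE.search(message)
    if match is None:
        return None
    return match.group("component"), match.group("node")


# The two messages reported in the post.
errors = [
    "Component of type: ConnectX-6 not available in graph node: 1B5F6R3",
    "Component of type: A30 not available in graph node: 2B5F6R3",
]

for message in errors:
    print(parse_allocation_error(message))
```

    Logging these pairs per attempt, together with whether a host was specified, makes it easier to show the operators which graph nodes the allocator actually tried.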

    I also tried lease_in_hours=6 with a 24-hour window and got the same result.

    Has anyone seen this kind of mismatch between API availability and actual allocation? Any suggestions on how to work around this?

    in reply to: Cannot allocate GPU + ConnectX-6 on same node #9720

    Thank you for your response!

    I tried CERN (A30 + CX6) but got “Component of type: A30 not available in graph node: 2B5F6R3”. The portal shows A30 available at CERN. Could the A30 and free CX6 be on different workers? Is there a way to target a specific worker that has both?

    Also, CERN resources are almost always fully allocated. Is there a way to reserve or schedule resources in advance? Or is there a waitlist I can join?
