1. Issue with Node-to-Node Communication in Fabric Experiment

  • #9556

    Dear FABRIC Support Team,

    I hope you are doing well.

    I am currently running an experiment on FABRIC with two compute nodes in the same slice. The nodes have the following private IP addresses:

    Node2: 11.30.6.178
    Node3: 11.30.6.178

    Although both nodes appear to be in the same subnet, they are unable to communicate directly with each other. For example:

    • ping 11.30.6.178 from Node3 does not receive any response. (The IPs shown here are sample values.)

    • ssh 11.30.6.178 from Node3 results in a connection timeout.

    • Similarly, Node2 cannot ping or SSH into Node3.

    However, both nodes are accessible individually via the bastion host, and ARP resolution between the nodes appears to succeed. This suggests that Layer-2 connectivity exists but direct node-to-node communication may be blocked by network or security policies.

    My experiment requires direct communication between the nodes to run a distributed computing framework (a Ray cluster). Could you please advise whether inter-node traffic within the slice needs to be explicitly enabled, or whether any configuration is required to allow communication between these nodes?

    Thank you for your assistance.

    Best regards,
    Sree Bhargavi Balija

    #9558
    Komal Thareja
    Participant

      Hi Sree,

      Could you please share your slice ID so we can look at it? In addition, the examples available via jupyter-examples-*/start_here.ipynb may be useful.

       

      Thanks,

      Komal

      #9565
      Komal Thareja
      Participant

        Hi Sree,

        VMs cannot communicate with each other over the private IPs assigned to interfaces connected to the management network. The interfaces with addresses in the 10.* range belong to this management network. Inter-VM communication should instead occur over the data plane network, which in your case is the L2Bridge network.

        I reviewed your slice and noticed that you have three VMs and two L2Bridge networks configured. However, the IP addresses on the VM interfaces are not set up correctly. Each network must use a different subnet, and the corresponding VM interfaces should be assigned IP addresses from those respective subnets.

        Please refer to the following example notebook, which demonstrates how to correctly configure the network:
        jupyter-examples-*/fabric_examples/fablib_api/create_l2network_basic/create_l2network_basic_auto.ipynb

        Make sure to use separate subnets for each network and assign the appropriate IPs to the VM interfaces so that communication works properly.

        Best,
        Komal
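
The subnet rule described above (one distinct subnet per network, with each VM interface addressed from its own network's subnet) can be sketched with Python's standard `ipaddress` module. This is only an illustrative planning aid, not FABlib code and not the method used in the referenced notebook. The network names, the node-to-network mapping, and the 192.168.0.0/16 base range are all assumptions made for the example; substitute the values from your own slice.

```python
import ipaddress


def plan_addresses(networks, base="192.168.0.0/16", prefix=24):
    """Give each network its own subnet and each attached node one host IP.

    networks: dict mapping a network name to the list of node names on it
              (hypothetical names, chosen only for this example).
    Returns a dict: network name -> (subnet, {node name: IP address}).
    """
    # Carve the base range into non-overlapping subnets of the given size.
    pool = ipaddress.ip_network(base).subnets(new_prefix=prefix)
    plan = {}
    for net_name, nodes in networks.items():
        subnet = next(pool)      # a different subnet for every network
        hosts = subnet.hosts()   # usable host addresses in that subnet
        plan[net_name] = (subnet, {node: next(hosts) for node in nodes})
    return plan


def check_plan(plan):
    """Raise ValueError if subnets overlap or an IP lies outside its subnet."""
    subnets = [subnet for subnet, _ in plan.values()]
    for i, first in enumerate(subnets):
        for second in subnets[i + 1:]:
            if first.overlaps(second):
                raise ValueError(f"{first} overlaps {second}")
    for net_name, (subnet, addresses) in plan.items():
        for node, ip in addresses.items():
            if ip not in subnet:
                raise ValueError(f"{node}: {ip} is not in {subnet} ({net_name})")


# Assumed topology: three VMs and two L2 networks. Which VM sits on which
# network is a guess for illustration; the thread does not state it.
topology = {
    "net1": ["Node1", "Node2"],
    "net2": ["Node2", "Node3"],
}

plan = plan_addresses(topology)
check_plan(plan)

for net_name, (subnet, addresses) in plan.items():
    for node, ip in addresses.items():
        print(f"{net_name} {subnet}: {node} -> {ip}")
```

With a plan like this, a node attached to both networks gets one address per interface, each from a different subnet, and two nodes can only reach each other over a network they share.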
