pin_cpu & poa(operation="cpupin")

This topic has 1 reply, 2 voices, and was last updated 22 hours, 2 minutes ago by Komal Thareja.

October 31, 2025 at 9:58 am #9126
yoursunny
Participant
I’m poking around the CPU pinning feature and noticed some problems with the pin_cpu() and poa(operation="cpupin") APIs.

pin_cpu(cpu_range_to_pin=) syntax

In the Node.pin_cpu function, the cpu_range_to_pin= parameter is described as:

cpu_range_to_pin: range of the cpus to pin; example: 0-1 or 0

However, passing cpu_range_to_pin="0" raises ValueError: not enough values to unpack (expected 2, got 1) at this line:
start, end = map(int, cpu_range_to_pin.split("-"))
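
One way the parsing could accept both documented forms is sketched below. parse_cpu_range is a hypothetical helper written for illustration, not the fablib implementation; it assumes a bare number such as "0" should be treated as a one-CPU range.

```python
from typing import Tuple


def parse_cpu_range(cpu_range_to_pin: str) -> Tuple[int, int]:
    """Parse "0-1" or "0" into an inclusive (start, end) pair.

    Hypothetical helper for illustration; not part of fablib.
    """
    parts = cpu_range_to_pin.strip().split("-")
    if len(parts) == 1:
        # Single CPU, e.g. "0": start and end are the same core.
        start = end = int(parts[0])
    elif len(parts) == 2:
        # Range, e.g. "0-1". int() raises ValueError on empty or non-numeric parts.
        start, end = int(parts[0]), int(parts[1])
    else:
        raise ValueError(f"invalid CPU range: {cpu_range_to_pin!r}")
    if end < start:
        raise ValueError(f"range end before start: {cpu_range_to_pin!r}")
    return start, end
```

With this sketch, both parse_cpu_range("0") and parse_cpu_range("0-1") return a pair instead of failing on the unpack.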

pinned CPUs still in other VM’s affinity list

I created two slices each having a node on the same worker host.
After successfully pinning two physical CPUs to VCPUs on node1, I checked the output of node2.get_cpu_info():
{
  "atla-w2.fabric-testbed.net": {
    "pinned_cpus": ["117", "53"]
  },
  "instance-0000187f": [
    {
      "CPU": "116",
      "CPU Affinity": "0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127",
      "CPU time": "4.6s",
      "State": "running",
      "VCPU": "0"
    }
  ]
}
Notably, the other node’s CPU affinity list still includes the pinned CPUs.
I’m hoping that the pinned CPUs can be reserved exclusively for node1 and removed from the CPU affinity list of all other nodes.
This would allow CPU-intensive workloads on node1 to be executed more accurately.
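
The overlap can be checked mechanically. The sketch below assumes the get_cpu_info() output has been parsed into a Python dict with the same shape as the sample above (a host entry holding "pinned_cpus" and an instance entry holding a list of VCPU rows), and that "CPU Affinity" is always a comma-separated list. affinity_overlap is a hypothetical name, not a fablib function.

```python
from typing import Dict, Set


def affinity_overlap(cpu_info: dict) -> Dict[str, Set[str]]:
    """Return, per VCPU, the host-pinned CPUs still present in its affinity list.

    Hypothetical helper; assumes the dict shape shown in the sample output.
    """
    pinned: Set[str] = set()
    vcpu_rows = []
    for value in cpu_info.values():
        if isinstance(value, dict):
            # Host entry: collect the CPUs reported as pinned on this worker.
            pinned.update(value.get("pinned_cpus", []))
        elif isinstance(value, list):
            # Instance entry: one row per VCPU.
            vcpu_rows.extend(value)

    overlap: Dict[str, Set[str]] = {}
    for row in vcpu_rows:
        affinity = set(row["CPU Affinity"].split(","))
        shared = affinity & pinned
        if shared:
            overlap[row["VCPU"]] = shared
    return overlap
```

An empty result would mean no VCPU of this node can be scheduled on a CPU that is pinned elsewhere on the host.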

poa_cpupin with a CPU range

Normally, the pin_cpu function would send a POA command like this:
node.poa(operation="cpupin", vcpu_cpu_map=[ {"vcpu":"2","cpu":"15"}, {"vcpu":"3","cpu":"79"}, ])
I attempted to send a variation of this command:
node.poa(operation="cpupin", vcpu_cpu_map=[ {"vcpu":"2","cpu":"14-15"}, {"vcpu":"3","cpu":"78-79"}, ])
The latter command would return "SentToAuthority" instead of "Success".
Subsequently, further POA commands including node.get_cpu_info() would return 500 errors.

When I adjust a QEMU process with the taskset command on my own server, I can set the affinity of a VCPU to multiple physical CPUs.
If FABRIC cannot support that, the server side should have rejected the poa_cpupin command, instead of letting the node fall into an error state.
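
Until the server rejects such requests, a client-side guard could refuse anything other than a single CPU number per entry. This is only a sketch under the assumption that single integers are the only supported form; validate_vcpu_cpu_map is a hypothetical helper and not part of fablib.

```python
import re

_SINGLE_INT = re.compile(r"[0-9]+")


def validate_vcpu_cpu_map(vcpu_cpu_map):
    """Raise ValueError unless every "vcpu" and "cpu" value is a single integer.

    Hypothetical client-side check; it does not contact the server.
    """
    for entry in vcpu_cpu_map:
        for key in ("vcpu", "cpu"):
            value = str(entry.get(key, ""))
            if not _SINGLE_INT.fullmatch(value):
                raise ValueError(f"{key}={value!r} is not a single CPU number")
    return vcpu_cpu_map


# Intended use (node is an existing fablib node, not defined here):
# node.poa(operation="cpupin", vcpu_cpu_map=validate_vcpu_cpu_map(mapping))
```

This only protects callers who opt in; a POA sent without the check would still leave the node in the error state described above.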

poa_cpupin with in-use CPUs

I created two slices each having a node on the same worker host.
Then, I sent POA commands pinning the CPUs of these two nodes to the same physical CPUs:
node1.poa(operation="cpupin", vcpu_cpu_map=[{"vcpu":"2","cpu":"15"}])
node2.poa(operation="cpupin", vcpu_cpu_map=[{"vcpu":"2","cpu":"15"}])
The first POA completed successfully, while the second POA returned "SentToAuthority".
Subsequently, further POA commands on the second node would return 500 errors.

This error could happen even if the user is only calling the high-level API node.pin_cpu().
When pin_cpu() calls run concurrently against two separate nodes, which could belong to different users, they could pick the same physical CPU cores and send conflicting POAs.

Again, I’d suggest that the server side reject a poa_cpupin command that causes a conflict, instead of letting the second node fall into an error state.
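
As a stopgap, a caller could compare the requested CPUs against the host's "pinned_cpus" list before sending the POA. The sketch below assumes get_cpu_info() output parsed into a dict, with "pinned_cpus" covering pins made by any VM on the worker, as the earlier node2 output suggests. find_pin_conflicts is a hypothetical helper. It only narrows the race window: another client can still pin the same core between the check and the POA, so a server-side rejection is still needed.

```python
from typing import List


def find_pin_conflicts(cpu_info: dict, vcpu_cpu_map: list) -> List[str]:
    """Return requested physical CPUs that the host already reports as pinned.

    Hypothetical pre-flight check; not atomic with the POA that follows it.
    """
    already_pinned = set()
    for value in cpu_info.values():
        if isinstance(value, dict):
            # Host entry: CPUs pinned by any VM on this worker.
            already_pinned.update(str(c) for c in value.get("pinned_cpus", []))
    requested = {str(entry["cpu"]) for entry in vcpu_cpu_map}
    return sorted(requested & already_pinned, key=int)
```

If the returned list is non-empty, the caller would pick different cores rather than send the conflicting request.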
October 31, 2025 at 11:10 am #9131
Komal Thareja
Participant
Thank you, @yoursunny, for sharing these observations and the detailed steps to reproduce them. This appears to be a bug. I’ll work on addressing it and will update you once the patch is deployed.

Best,

Komal
