Add RFC-0055 Out-of-Tree Platform Build and Distribution by afrittoli · Pull Request #97 · pytorch/rfcs

afrittoli · 2026-06-22T16:23:21Z

No description provided.

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>

groenenboomj · 2026-06-22T17:23:07Z

Red Hat is also very interested in supporting a nightly runner signal.

albanD · 2026-06-22T17:49:36Z

Thanks for sending this RFC. I expect we'll finish the current CRCR for testing, focus on onboarding projects there, and make sure we get benefits from that before we build more pieces there.
But happy to take a closer look at this one after that!

afrittoli · 2026-06-23T16:03:37Z

> Thanks for sending this RFC. I expect we'll finish the current CRCR for testing, focus on onboarding projects there, and make sure we get benefits from that before we build more pieces there. But happy to take a closer look at this one after that!

Thanks @albanD - feedback would be welcome - I believe there's plenty of design and prototyping work that I can look into in parallel to the current work on CRCR.

malfet · 2026-06-24T22:16:57Z

+
+Each platform operates in an isolated lane:
+
+- **Credential isolation**: Each platform has a dedicated IAM role that can only write to that platform's storage prefix. OIDC trust policies scope the role to the specific vendor repo. A compromised vendor repo cannot access another platform's storage or the main PyTorch artifact space.


This implies AWS S3 infra. It also implies centralized management of the IAM roles, I guess by the RelEng team (which is very Meta-heavy atm).

afrittoli · 2026-06-26

The RFC describes credential isolation within the existing AWS-based storage infrastructure (S3, IAM), which is what PyTorch uses today.

Management and payment of the S3 bucket are both handled by Meta today. Changing that is a conversation beyond the scope of the RFC, but the implications need to be considered:

- Costs: adding more platforms on the same S3 infra would add to the storage costs. Do you think this would be an issue?
- Management: the idea is to let vendors add and remove binaries themselves, to avoid extra burden on the RelEng team. The only thing required from RelEng would be provisioning (and deprovisioning) the roles that grant vendors access.
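
To make the per-platform isolation concrete, here is a minimal sketch of the two policy documents such a role might carry. The bucket name, account ID, vendor repo, and prefix are hypothetical placeholders, and the helper functions are illustrative only; the actual provisioning would presumably live in Terraform rather than Python.

```python
import json

# Hypothetical values, for illustration only.
BUCKET = "example-pytorch-artifacts"
OIDC_HOST = "token.actions.githubusercontent.com"


def trust_policy(account_id: str, vendor_repo: str) -> dict:
    """Trust policy that lets only workflows from vendor_repo assume the role."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            "Principal": {
                "Federated": f"arn:aws:iam::{account_id}:oidc-provider/{OIDC_HOST}"
            },
            "Action": "sts:AssumeRoleWithWebIdentity",
            "Condition": {
                "StringEquals": {f"{OIDC_HOST}:aud": "sts.amazonaws.com"},
                # The OIDC subject claim starts with "repo:<owner>/<name>:".
                "StringLike": {f"{OIDC_HOST}:sub": f"repo:{vendor_repo}:*"},
            },
        }],
    }


def write_policy(platform_prefix: str) -> dict:
    """Permission policy limited to objects under one platform's prefix."""
    return {
        "Version": "2012-10-17",
        "Statement": [{
            "Effect": "Allow",
            # Assumption: vendors may both add and remove their own binaries.
            "Action": ["s3:PutObject", "s3:DeleteObject"],
            "Resource": f"arn:aws:s3:::{BUCKET}/{platform_prefix}/*",
        }],
    }


if __name__ == "__main__":
    print(json.dumps(trust_policy("123456789012", "example-vendor/pytorch-backend"), indent=2))
    print(json.dumps(write_policy("example-platform"), indent=2))
```

With this shape, onboarding a platform would amount to one trust policy and one prefix-scoped permission policy per vendor, which is the part that would need to be provisioned centrally.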

malfet · 2026-06-24T22:17:45Z

+Each platform operates in an isolated lane:
+
+- **Credential isolation**: Each platform has a dedicated IAM role that can only write to that platform's storage prefix. OIDC trust policies scope the role to the specific vendor repo. A compromised vendor repo cannot access another platform's storage or the main PyTorch artifact space.
+- **Upload workflow isolation**: Uploads go through the official `_binary_upload.yml` workflow, which enforces naming conventions before writing to S3. Once [Stage 3](#implementation-plan) is complete, this workflow also generates provenance attestations. If the `job_workflow_ref` dual-gate can be confirmed (see [Credentials and Publishing Access](#credentials-and-publishing-access)), vendors cannot bypass this workflow even with valid OIDC credentials.


I don't think `_binary_upload.yml` can really be used to enforce anything. It must be done at the IAM level, which is hard and implies a lot of heavy lifting from the RelEng team.

afrittoli · 2026-06-26

The RFC is designed to reduce that heavy lifting as much as possible. I think it should be possible to reduce it to adding or removing a line in Terraform to provision or de-provision a role.

The idea is to use OIDC between GitHub Actions and AWS. The role must be associated with the vendor workflow on the vendor repo, and it would grant access to the S3 bucket only in the vendor-specific namespace.
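
As a sketch of the "dual-gate" logic, the check below looks at two claims that GitHub includes in its OIDC tokens: `repository` (the calling repo) and `job_workflow_ref` (the reusable workflow that the job runs). The vendor repo name and the full workflow path are hypothetical, and this only models the decision in plain Python; whether the same condition can be enforced in an AWS trust policy is the open question the RFC calls out.

```python
# Hypothetical identifiers, for illustration only.
VENDOR_REPO = "example-vendor/pytorch-backend"
UPLOAD_WORKFLOW = "example-org/example-repo/.github/workflows/_binary_upload.yml"


def upload_allowed(claims: dict, vendor_repo: str, upload_workflow: str) -> bool:
    """Return True only if both gates pass.

    Gate 1: the token was issued for a run in the expected vendor repo.
    Gate 2: the job is running the expected reusable upload workflow.
    """
    repo_ok = claims.get("repository") == vendor_repo
    # job_workflow_ref has the form "<owner>/<repo>/<path>@<ref>".
    workflow_path = claims.get("job_workflow_ref", "").split("@", 1)[0]
    workflow_ok = workflow_path == upload_workflow
    return repo_ok and workflow_ok


if __name__ == "__main__":
    good = {
        "repository": VENDOR_REPO,
        "job_workflow_ref": UPLOAD_WORKFLOW + "@refs/heads/main",
    }
    # Same repo, but the job runs a vendor-defined workflow instead.
    bypass = {
        "repository": VENDOR_REPO,
        "job_workflow_ref": VENDOR_REPO + "/.github/workflows/custom.yml@refs/heads/main",
    }
    print(upload_allowed(good, VENDOR_REPO, UPLOAD_WORKFLOW))
    print(upload_allowed(bypass, VENDOR_REPO, UPLOAD_WORKFLOW))
```

Without the second gate, any workflow in the vendor repo could obtain the role, which is why the workflow file alone cannot enforce anything.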

malfet · 2026-06-24T22:18:51Z

+Platform vendors are responsible for security vulnerabilities in their platform-specific code. When a vulnerability affects packages hosted at `download.pytorch.org`, the following process applies:
+
+1. Vendor discloses the vulnerability to the PyTorch security team at security@pytorch.org (or equivalent) within 7 days of discovery.
+2. PyTorch infra can yank (remove from the CDN index without deleting) the affected artifacts while a fix is prepared.


Why without deleting? What if the affected artifact distributes malicious content?

afrittoli · 2026-06-26

Leaving the artifact in place and only removing it from the index could help with forensics, such as post-mortem analysis. That only works if removing the artifact from the index prevents it from being installed by end users and used by CI/CD pipelines. If it does not, the artifact should be removed completely.
I'll include both options here and clarify the intent.
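
The difference between the two options can be illustrated with a toy in-memory model. This is not how `download.pytorch.org` is implemented; the class and method names are hypothetical and only show why yanking alone leaves direct downloads possible.

```python
class ArtifactStore:
    """Toy model: stored objects plus an index that lists a subset of them."""

    def __init__(self):
        self.objects = {}   # key -> bytes, everything the storage can serve
        self.index = set()  # keys advertised in the package index

    def publish(self, key: str, data: bytes) -> None:
        self.objects[key] = data
        self.index.add(key)

    def yank(self, key: str) -> None:
        """Remove from the index only; the object stays in storage."""
        self.index.discard(key)

    def delete(self, key: str) -> None:
        """Remove from both the index and storage."""
        self.index.discard(key)
        self.objects.pop(key, None)

    def resolve(self, key: str):
        """Index-driven install: only sees artifacts listed in the index."""
        return self.objects[key] if key in self.index else None

    def fetch_direct(self, key: str):
        """Direct download by known URL, bypassing the index."""
        return self.objects.get(key)


if __name__ == "__main__":
    store = ArtifactStore()
    store.publish("example-platform/pkg-1.0.whl", b"payload")
    store.yank("example-platform/pkg-1.0.whl")
    print(store.resolve("example-platform/pkg-1.0.whl"))       # index lookup fails
    print(store.fetch_direct("example-platform/pkg-1.0.whl"))  # still downloadable
```

In this model, a pipeline that pins a direct URL is unaffected by a yank, which is the case where full deletion would be needed.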

Add RFC-0051 Out-of-Tree Platform Build and Distribution

a427b49

Signed-off-by: Andrea Frittoli <andrea.frittoli@uk.ibm.com>

meta-cla Bot added the cla signed label Jun 22, 2026

afrittoli changed the title ~~Add RFC-0051 Out-of-Tree Platform Build and Distribution~~ Add RFC-0055 Out-of-Tree Platform Build and Distribution Jun 22, 2026

malfet reviewed Jun 24, 2026

afrittoli wants to merge 1 commit into pytorch:master from afrittoli:rfc0051

afrittoli replied to the review comments Jun 26, 2026
