Your GPU dashboard says 70% utilization. On paper, the cluster is busy. In practice, a large chunk of that time is spent with your $40,000 accelerators sitting idle, waiting on a file that lives three network hops away on a NAS box. The compute queue is empty, and the pipeline is fine. The problem is that data is just somewhere else.

This is the awkward truth underneath most stalled AI projects. The constraint in modern AI infrastructure stopped being storage capacity years ago. Now, it's more about data placement and access. What matters is where files live and how they get to GPUs, along with how much copying happens in between. In that sense, AI infrastructure has become less of a storage capacity problem and more of an operational data problem.

The Hammerspace Data Platform takes that as its starting point. It sits between your compute and the storage you already own, from NAS to object stores and even the NVMe drives bolted into your GPU servers. It makes all of that data addressable through a single global namespace. Instead of moving data to wherever the GPUs are, the architecture makes the compute aware of where the data already lives.

As a result, rather than treating each storage system as its own operational silo, Hammerspace separates the data layer from the underlying infrastructure, allowing heterogeneous storage, sites, and clouds to operate as part of the same coordinated data environment. Applications and AI pipelines access that data through standard protocols such as NFS, SMB, and S3, without proprietary clients or application rewrites.

Fragmentation is the bottleneck, not bandwidth

Data fragmentation is a big problem for enterprises embarking on an AI journey. Training sets are scattered across departments, sites and clouds. "The data is in disparate groups and disparate orgs and disparate silos within a company," says Jonathan Flynn, director of applied systems at Hammerspace.
"Having the data in a curated data set for you just to go train is rare. It has to be collected. It has to be moved around from system to system, and then the curation needs to happen in order to actually do the training on it." The fragmentation often leaves pipelines copying and staging files between systems that were never designed to talk to each other. None of this shows up on a storage IOPS chart, but it will visibly affect training velocity. According to Gartner, 57% of organizations believe that their data isn't AI ready. Alarmingly, two thirds of executives believe that no one in their organization understands all of the data they've collected and how to access it. That seems hard to swallow, until you recall that Facebook's engineers have admitted the same thing. You can't orchestrate what you can't see. Mike Bloom, who covers AR architecture at Hammerspace, says the default vendor response makes the problem worse. "They'll go to a vendor that will promise them that if they sweep the floor and throw out all of their legacy storage arrays, their brand will solve the problem," he says, adding that's like throwing the baby out with the bath water. "Those data sets that are all over the place? They're not sitting in a corner. They're sitting on legacy storage arrays." The NVMe you already paid for There is also a less obvious idle resource in most AI environments: the NVMe inside the GPU servers themselves. A modern HGX or DGX box ships with eight to sixteen NVMe drives, each hanging off four lanes of PCIe. Almost every orchestration layer treats that capacity as local scratch space, used by one server and invisible to the rest of the cluster. Hammerspace calls this "stranded" capacity, and it is now meaningful. It amounts to hundreds of terabytes per server, with two-petabyte GPU servers on the roadmap. Pull all of it into a shared namespace and you have a new layer that Hammerspace calls Tier 0. 
It uses storage you already paid for, attached to a network you already deployed.

Flynn argues this layer is structurally faster than anything sold as a separate appliance. "Tier one is typically oriented around storage capacity. A 2U box, 24 NVMe, or 40 NVMe with some of the Dell systems in there," he calculates. "That's 96 lanes or 192 lanes of PCI Express, with maybe one or two 400 gigabit NICs, which gives you 16 or 32 lanes. So the oversubscription just in the one box is massive."

His more provocative claim is that it is also the cheapest tier in the rack. The compute and the network are already there. The drives (at least in the case of customers buying GPU servers) are already in the bill of materials. Compared with racking and stacking a dedicated all-flash array, adding metadata servers and a few data movers to existing GPU nodes barely registers as a procurement event.

Assimilating what you already own

Ripping and replacing infrastructure takes time most teams don't have. The Hammerspace approach is assimilation, which the company describes as a metadata-only operation: scan the existing NAS, ingest the directory tree into the global namespace, and redirect mounts. The bytes never move.

Hammerspace says that fast deployment is a key benefit of this approach. Data access is restored almost immediately, even while assimilation continues in the background. Underneath this, the source-of-truth NetApp, Qumulo or VAST array keeps serving the bytes, while Hammerspace presents a unified view on top.

That has practical consequences. If something tagged as a training input changes from being a tier-two archive file to a hot input, a policy (Hammerspace calls this an "objective") can trigger an instance copy onto Tier 0 without users having to do anything. "Nobody's running a copy. Nobody's running an rsync command," Flynn says. "It's all orchestrated based in the file system."
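
As a rough illustration of what such an objective does, the Python sketch below models a policy check for a single file. It is not Hammerspace's API or objective syntax: FileRecord, apply_training_objective, the training-input tag and the tier labels are all hypothetical names, chosen only to mirror the behaviour described in this article, in which a tagged file gains a Tier 0 copy while a job needs it and gives that copy up once the job is done.

```python
from dataclasses import dataclass, field


@dataclass
class FileRecord:
    """Hypothetical metadata record; not a Hammerspace data structure."""
    path: str
    tags: set[str]
    tiers: set[str] = field(default_factory=set)  # tiers that currently hold a copy


def apply_training_objective(record: FileRecord, job_active: bool) -> list[str]:
    """Return the placement actions a policy engine might queue for one file."""
    actions = []
    # Assumption: a file belongs on Tier 0 only while it is a tagged
    # training input AND a job is actively consuming it.
    wants_tier0 = "training-input" in record.tags and job_active
    if wants_tier0 and "tier0" not in record.tiers:
        record.tiers.add("tier0")
        actions.append(f"copy {record.path} -> tier0")
    elif not wants_tier0 and "tier0" in record.tiers and len(record.tiers) > 1:
        # Drop the Tier 0 copy only if another tier still holds the data.
        record.tiers.discard("tier0")
        actions.append(f"vacate {record.path} from tier0")
    return actions
```

The point of the sketch is that the decision is driven entirely by metadata (tags and job state), so no user ever issues a copy command; the real system's rules and triggers may well differ.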
That same orchestration layer can also support retrieval-augmented generation (RAG), inference, and agentic AI workflows, where distributed enterprise data needs to be continuously curated, governed, and made accessible without relying on large-scale data copying.

Once the training job finishes, that Tier 0 copy is automatically vacated. The clean-up matters because the alternative (letting a hot tier fill up) creates a quality-of-service problem for everything else trying to land there. "Other architectures that have a hot tier and a cold tier often have an issue where the hot tier becomes congested and that endangers the quality of service for the pipeline," Bloom says.

Rather than requiring organizations to rebuild infrastructure around AI, the Hammerspace approach is designed to operationalize the storage, cloud, and compute environments enterprises already have in place.

Standards-based, with some asterisks

Hammerspace's positioning leans heavily on the word "standard". The Samsung-Hammerspace submission that landed inside the top 10 of the IO500 10-Node Production benchmark in November 2025 used standard Linux, the upstream NFSv4.2 client, standard NVMe SSDs and IP-over-InfiniBand. There was no proprietary client, and no custom kernel modules. The company submitted its own results to MLPerf Storage v2.0 showing linear scaling out to 420.8 GB/s across 140 GPUs on five nodes with GPU utilization above 96%.

That kind of performance is not achievable with traditional NFS architectures, which struggle with the parallel access patterns common in large-scale AI environments. Hammerspace instead relies on parallel NFS (pNFS). Rather than having a single server handle both file metadata and data transfer, pNFS has the metadata server give the client a layout map, which the client then uses to transfer data from multiple servers in parallel. pNFS was standardized in 2010 as part of NFSv4.1 in RFC 5661. Hammerspace was also instrumental in extending pNFS in NFSv4.2 in 2018, introducing the Flex Files extension.
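
To make the layout idea concrete, here is a toy Python model of the read path. It is not the NFS wire protocol or any real client: the in-memory DATA_SERVERS table and the get_layout, read_segment and parallel_read functions are invented for illustration, and the three-way split of one file is an assumption. What it shows is the division of labour: one metadata lookup returns a map, and the client then fetches the pieces from several servers at once.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical in-memory "data servers": server name -> {file path: segment held there}.
DATA_SERVERS = {
    "ds1": {"/train/shard0": b"AAAA"},
    "ds2": {"/train/shard0": b"BBBB"},
    "ds3": {"/train/shard0": b"CC"},
}


def get_layout(path: str) -> list[tuple[int, str]]:
    """Stand-in for the metadata server: return (segment index, data server) pairs."""
    return [(0, "ds1"), (1, "ds2"), (2, "ds3")]


def read_segment(path: str, server: str) -> bytes:
    """Stand-in for a direct client-to-data-server read."""
    return DATA_SERVERS[server][path]


def parallel_read(path: str) -> bytes:
    """Fetch every segment concurrently, then reassemble them in index order."""
    layout = get_layout(path)  # a single metadata round trip
    with ThreadPoolExecutor(max_workers=len(layout)) as pool:
        futures = {idx: pool.submit(read_segment, path, server) for idx, server in layout}
    # Leaving the "with" block waits for all reads to finish.
    return b"".join(futures[idx].result() for idx in sorted(futures))
```

In this toy setup, parallel_read("/train/shard0") returns b"AAAABBBBCC". In real pNFS the layout is a protocol-defined structure rather than a Python list; the Flex Files layout type, specified in RFC 8435, is one such format.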
The Flex Files extension is what lets pNFS work with real-world heterogeneous storage across cloud tiers, legacy files, and multi-site deployments.

The larger implication is that open, standards-based infrastructure is no longer inherently at odds with AI-scale performance, challenging the assumption that enterprises must adopt proprietary storage stacks to support large-scale AI workloads. "With the performance improvements that we contribute into the upstream, we're actually seeing a decades-old file system transmute into a parallel access system that can rival WEKA, Lustre, and GPFS," Flynn says.

Multi-site and sovereign by default

Once a single global namespace spans on-prem arrays, cloud object stores and the NVMe inside GPU boxes, the next questions are jurisdictional. Where can a given file legally live? Who is allowed to copy it?

The platform handles this through the same objectives mechanism used for performance tiering. Tag a dataset as EU-only and the orchestration layer will exclude it from North American volumes. Tag it as HIPAA-bound and write-once-read-many rules apply. Because those policies operate at the data layer rather than within individual storage silos, governance persists even as data moves across clouds, sites, and performance tiers. That is becoming increasingly important as AI pipelines, inference workflows, and agentic systems operate across distributed infrastructure rather than within a single environment.

That matters more in 2026 than it did two years ago, since such operational flexibility also changes the economics of AI infrastructure expansion. The SSD supply situation has tightened. NAND and DRAM prices climbed through 2024 and into 2025, driven by AI build-out and hyperscaler hoarding. Buying your way out of a data-movement problem by adding another all-flash array is harder when the flash is harder to get. A control plane that understands workload, location and policy together is now a valuable procurement workaround.
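
The tag-driven placement described in this section can be pictured as a filter over candidate volumes. The Python sketch below is a simplified assumption, not Hammerspace's objective syntax: the volume names, the region codes and the eu-only tag are hypothetical, and a real deployment would carry far richer rules (the write-once-read-many handling mentioned above is not modelled).

```python
# Hypothetical volume inventory: volume name -> region code.
VOLUMES = {
    "fra-nvme-tier0": "eu",
    "dub-object": "eu",
    "nyc-nas": "na",
    "ore-object": "na",
}

# Hypothetical residency tags: tag -> regions a tagged dataset may occupy.
TAG_ALLOWED_REGIONS = {
    "eu-only": {"eu"},
}


def eligible_volumes(tags: set[str]) -> list[str]:
    """Return the volumes permitted by every residency tag on a dataset."""
    allowed = None  # None means "no residency restriction seen so far"
    for tag in tags:
        regions = TAG_ALLOWED_REGIONS.get(tag)
        if regions is None:
            continue  # tags without a residency rule do not restrict placement
        allowed = regions if allowed is None else allowed & regions
    return sorted(
        name for name, region in VOLUMES.items()
        if allowed is None or region in allowed
    )
```

With this inventory, eligible_volumes({"eu-only"}) returns ["dub-object", "fra-nvme-tier0"], while an untagged dataset may land on any of the four volumes. Because the check runs against the dataset's tags rather than against any one array, the same rule applies wherever the data is later moved.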

Real-world usage

The most useful data point about whether any of this matters at scale is Meta. The company runs two 24,576-GPU clusters used to train Llama 3 and deploys Hammerspace specifically to enable live job debugging and real-time code propagation across the training pipelines. If a company with effectively unlimited engineering resources still hits a data-movement ceiling at that scale, the enterprises running a fraction of the workload are almost certainly hitting it too, and the standard answer of "buy more GPU" does not address a problem one layer below the compute plane.

Flynn put the underlying joke about NFS politely. "The joke I always heard was, NFS is not for speed." That used to be true. The newer claim, that an open, standards-based file system can sit underneath an AI factory and feed it, casts the venerable file protocol in a new light.

Customers will likely want to see an independent benchmark of this system's performance against the likes of VAST, WekaIO and NetApp in heterogeneous customer environments, using test systems not designed by the vendor. Nevertheless, it looks promising. In the meantime, the data placement architecture conversation is certainly the right one to be having.

Sponsored by Hammerspace.