
Cumulus VFX - a Disney and Netflix client - scales with Pure Storage

Derek du Preez, May 29, 2024
Summary:
Australia-based Cumulus VFX was hitting capacity limits as its client list grew, so it turned to Pure Storage to support its scaling requirements.

(Image: red theatre seats in a movie theatre, by Bruno from Pixabay)

What started as a one-person studio in the back of a garage a decade ago is now a growing provider of visual effects services to clients that include Disney and Netflix. Based in New South Wales, Australia, Cumulus VFX has worked on projects that include the Oscar-nominated ‘Elvis’, as well as ‘Lockwood & Co’ and ‘Man vs Bee’. However, as the company has scaled in recent years, the growth has placed increased pressure on its technology infrastructure and data needs.

VFX studios are often working with extremely heavy data demands and have unique latency and compliance requirements, given the nature of their work. Rendering images and video, which has to be done in highly reliable and secure environments, often to strict deadlines, means that data platforms being used need to be proficient in handling heavy workloads at speed. 

A few years ago, when it began winning high-profile projects, Cumulus found that its technical capacity was reaching its limits. Will Gammon, CEO at Cumulus VFX, has always been aware that a studio can deliver for global clients if it is smart about its technology:

Being a startup, we've always been very careful about how we buy tech, how we spend our money. We're not supported by any funding channels or anything like that. As we added more and more people to our servers, they got slower and slower. So there was a diminishing return on our technology. 

Cumulus punches above its weight in terms of our size and how we compete on the global market. We do things differently to most studios: we use software where other people turn their nose up, we embrace it because of its efficiencies in GPU technology, its power efficiency, its ability to turn around renders faster. The multiplier effects these have on power consumption and the environment and cost of being a startup…that's kind of our secret power.

Cumulus’ CTO, Nicky Ladas, recognized that with the company growing from one operator ten years ago to the 50 operators currently on staff (a number expected to double by the end of this year), the studio’s technology requirements were going to change and its systems would need upgrading. Four years ago he started vetting vendors to replace Cumulus’ storage provider, because of capacity constraints:

There's three components that hammer the storage system - there's the virtual workload of our virtual infrastructure; there's the render workload of the render farm; and then the operator workload. Any one of those components could max out on our legacy storage solution. 

Cumulus found that if the operators were hitting the system hard, all of the disks it had in operation would reach 100% utilization and create what is called a ‘wait state’ - a delay experienced by a computer processor when accessing external memory that is slow to respond. Compounded by the render load on top, this meant operators were experiencing latency that climbed from a few milliseconds to hundreds of seconds. In addition, there was constant random I/O access from the virtualization layer, which caused further delays:

Those three things compounded brought [everything to a] grinding halt. It was adequate for when we were in the scope of a boutique - a dozen operators and two dozen render nodes, that was fine.

However, when we started ramping up to a significant team, the render farm blew out accordingly. So if you’ve got 50 operators, you've probably got 150 render nodes, and then your virtualization requirements increase. All of those things just literally brought the storage systems to a grinding halt.

Eliminating performance constraints

Ladas had previous experience with enterprise and all-flash solutions and knew that an ‘all flash paradigm’ would solve Cumulus’ problems. The company went to the market for a solution and considered three to five enterprise providers. Ladas had prior exposure to all of them, aside from Pure Storage:

We got them to present their solutions based on our capacity requirements and we had some certain qualifying criteria: performance latency was number one; the second was rack real estate; and the third was power footprint. We've got limited space on prem in the data center, so we needed to solve those three problems. And at that point, Pure was a clear winner. None of the other vendors could satisfy all three criteria like Pure did for us.

Cumulus VFX opted for Pure Storage’s FlashBlade solution to support its new integrated data storage platform, which now runs its entire production environment. For Ladas, the ‘proof was in the pudding’ when he saw how the migration was handled. Pure provided Cumulus with an engineer to assist, and the studio set aside a week to migrate all of its workloads, which included a couple of hundred terabytes for the core dataset and 20 terabytes for the virtualization layer. The speed of migration surpassed Cumulus’ expectations:

We scheduled about a week for the entire migration and it happened within the first 12 hours. We managed to move all of that data across, Pure just slurped it all and within the first business day we had migrated all of our data off the legacy systems. And the second day we went live with the Pure. It was quite impressive. 
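To put that migration speed in perspective, the sustained throughput it implies can be estimated with a short calculation. This is a rough sketch rather than a measured figure: it assumes "a couple of hundred terabytes" means exactly 200 TB, uses decimal units (1 TB = 10^12 bytes), and treats the transfer as running at a constant rate. The function name is purely illustrative.

```python
# Rough estimate of the sustained throughput needed to move a dataset
# within a given time window. All inputs are assumptions for illustration.

BYTES_PER_TB = 10**12  # decimal terabyte
BYTES_PER_GB = 10**9   # decimal gigabyte


def required_throughput_gb_s(data_tb: float, hours: float) -> float:
    """Average GB/s needed to move `data_tb` terabytes in `hours` hours."""
    seconds = hours * 3600
    return data_tb * BYTES_PER_TB / seconds / BYTES_PER_GB


core_dataset_tb = 200    # assumed value for "a couple of hundred terabytes"
virtualization_tb = 20   # figure given in the article
total_tb = core_dataset_tb + virtualization_tb

planned = required_throughput_gb_s(total_tb, hours=7 * 24)  # one-week plan
actual = required_throughput_gb_s(total_tb, hours=12)       # reported 12 hours

print(f"Planned (one week): {planned:.2f} GB/s")
print(f"Actual (12 hours):  {actual:.2f} GB/s")
print(f"Speed-up over plan: {actual / planned:.0f}x")
```

Under these assumptions, finishing in 12 hours instead of a week means the data moved about 14 times faster than the schedule required.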

And the results since then have shown that with an enterprise storage solution, Cumulus no longer has to worry about downtime or latency issues: 

There's a couple of things…we had real technical issues and problems to deal with. The moment we went live with Pure, those problems across those three different verticals disappeared completely. They are no longer performance limiters for us - we got an instant bandwidth increase, we got an instant latency drop, and we have not been able to hit the capacity. 

We benefited immediately, from day dot. Renders came out faster. Operator downtime was minimized from ten, twenty, thirty minutes a day, to zero. And that has been pretty much par for the course since then. Storage downtime was a weekly occurrence prior to that.

In addition, Pure Storage has supported Cumulus’ sustainability goals by reducing the amount of power required to carry out its operations, which in turn brings cost benefits:

We’re limited with real estate and power - and power is at a premium these days. We pay out of the nose for our power per kilowatt. So, having something that was economical was a requirement. 

You compare three or four chassis of 24 spinning disks to what Pure does with four power modules, it was a significant cost saving. The initial unit was only four units, which we then upgraded to a six unit case. 

The equivalent solutions, from the alternatives at the time, were two full racks at 14 kilowatts on power consumption. The current unit runs six units off four kilowatts. So we are talking about a difference of 10 kilowatts, which is significant. 

Then you add cooling, which is another 30% that you need to put into the data center and you're looking at almost 20 kilowatts versus six or seven kilowatts of power, with cooling included. It was better than twice as good.
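The power comparison in these quotes can be worked through with a few lines of arithmetic. The sketch below assumes the 30% cooling overhead applies as a simple multiplier on the equipment load for both options; the function name is illustrative, and the totals Ladas quotes are his own rounded estimates, so they need not match this calculation exactly.

```python
# Back-of-the-envelope power comparison using the figures quoted above.
# Assumption: cooling adds a flat 30% on top of the equipment's power draw.

def total_power_kw(equipment_kw: float, cooling_overhead: float = 0.30) -> float:
    """Equipment power plus proportional cooling overhead, in kilowatts."""
    return equipment_kw * (1 + cooling_overhead)


alternative_kw = 14.0  # two full racks, per the quote
pure_kw = 4.0          # six-unit chassis, per the quote

alternative_total = total_power_kw(alternative_kw)
pure_total = total_power_kw(pure_kw)

print(f"Alternative with cooling: {alternative_total:.1f} kW")
print(f"Pure with cooling:        {pure_total:.1f} kW")
print(f"Saving before cooling:    {alternative_kw - pure_kw:.1f} kW")
print(f"Saving with cooling:      {alternative_total - pure_total:.1f} kW")
print(f"Ratio:                    {alternative_total / pure_total:.1f}x")
```

Under this assumption the gap widens from 10 kilowatts to roughly 13 kilowatts once cooling is included, which is consistent with the "better than twice as good" assessment.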

A sales advantage

Finally, the new Pure system also has some business development advantages for Cumulus. CEO Will Gammon notes that when pitching to clients, mentioning that the company has a ‘Pure Storage studio’ has seen a positive reaction from prospects who understand the technical demands: 

It gives us a bit of credibility, that we’ve invested in our tech and taken the leap early. 

And secondly, as Ladas notes, the Pure system also helps Cumulus promote its compliance advantage: 

The fact that I now have immutable snapshots, and I can give our Disney and Netflix clients peace of mind that their data is immutable and safe…Pure were the first to market for that immutable enterprise snapshot piece as well. That's been a huge selling point for us. 

 

 
