Essential features Avere brought to Framestore include the ability to identify hot files, improve visibility into storage load, identify “rogue renders” that could wreak havoc on the system, and accelerate random and sequential read and write access to the system’s backend storage, or core filers. In a nutshell, the Avere FXT 4500 Edge filers served the render farm, freeing the storage system to serve the rest of the facility.
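The hot-file idea mentioned above can be pictured with a small sketch. This is not Avere’s implementation – the FXT filers expose this as built-in telemetry – but a hypothetical illustration of the principle: tally read accesses per path and surface the most frequently hit files. The log format and field names are assumptions.

```python
# Hypothetical sketch of "hot file" identification from an access log.
# Log format is assumed: "<timestamp> <operation> <path>".
from collections import Counter

def hot_files(access_log_lines, top_n=5):
    """Count read accesses per path and return the most frequently hit files."""
    counts = Counter()
    for line in access_log_lines:
        _, op, path = line.split(maxsplit=2)
        if op == "READ":
            counts[path] += 1
    return counts.most_common(top_n)

# Toy log: a render farm hammering one texture while artists touch other files.
log = [
    "t1 READ /renders/frame_0001.exr",
    "t2 READ /renders/frame_0001.exr",
    "t3 WRITE /renders/frame_0002.exr",
    "t4 READ /assets/ship/hull.tex",
]
print(hot_files(log, top_n=1))  # [('/renders/frame_0001.exr', 2)]
```

In production the same ranking, computed continuously by the Edge filer, tells operators which files the farm is hammering and therefore which data the cache is absorbing.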
Meeting the Gravity challenge
Two years ago Framestore was settling into the final stages of a multi-year journey to complete its most ambitious undertaking to date – the delivery of Alfonso Cuarón’s space epic Gravity. The Framestore team has invested considerable development effort in tools that provide metrics and analysis of technical resources, and production staff use the data derived from these tools to manage complex schedules and deliverables. It was clear from the outset that Gravity’s resource demands would be enormous, eventually scaling to over 15,000 processor cores at peak rendering. As rendering hit full stride, the render nodes created the classic production bottleneck, consuming all available storage bandwidth and causing slowdowns on the artist workstation side.
Framestore’s technology team wanted to partition the storage so that Gravity-related rendering would pull data from storage pools separate from those used by the rest of the facility. That included the other films going through Framestore during this period (Sherlock Holmes: A Game of Shadows, Wrath of the Titans, Lincoln, War Horse) and the work of the Commercials division, whose customers include such familiar brands as Nestlé, P&G, Pepsi and Volkswagen. To support the projected load, Framestore undertook a project to completely rebuild its storage and networking infrastructure around a primary storage capacity of 1.2 petabytes of tiered HDS (formerly BlueArc) storage sitting behind six high-performance Mercury 110 heads. The facility’s core networking was upgraded to 10Gb, and a new low-latency Arista core switch was deployed to accommodate the huge increase in network traffic. Initial results on this configuration were very favourable, but as the reality of the rendering requirement became evident it also became clear that Framestore needed new methods to meet the computational needs of the render farm.
Avere – how one thing can change everything
Avere have been part of the conversation from the early days, and as production and core counts continued to ramp up it became clear that an Avere evaluation would be essential. Framestore CTO Steve MacPherson explained that several competing SSD-based solutions were considered – everything from building their own system based on CacheFS software to adding SSDs to the newly installed core filers.
“Ultimately, Avere won the argument with its ability to deliver the type of metrics, analysis and identification of hot files essential to production efficiency. Simplicity is a key Framestore engineering design goal. Once we established the correct configuration for the Avere system, we were able to just step back and let it do its thing.”
Reliable metrics are vital in the planning and early growth stages, according to MacPherson. Given the unpredictable nature of the rendering work, visibility on system load guides decision making. The more accurate the telemetry extracted from the system, the more accurate the planning around the use of resources.
“We are also very pleased with the support and technical knowledge of the Avere team at all levels – from the installation engineers to the CEO, we have been able to discuss our challenges in a way that focuses on an informed solution. The Avere Edge filers were absolutely critical to the delivery of Gravity – the most computationally demanding film Framestore has ever done.”
Avere Edge filers save the day
Even the best-laid plans can be thrown into disarray by external events. This happened to Framestore in 2012, when the power grid serving one of the studio’s locations in London’s Soho district suffered a brownout. At the time, post-production processes were running all day, every day.
“Usually we clear the farm over the weekend, but we were running 24x7 and our render queues were full to capacity. We had to restart entire job sequences. Fortunately, everything was cached in the Avere FXT 4500s. Once restarted, all jobs were back up to speed very quickly; the Avere’s performance went through the roof, and the Edge filers did an amazing job of shielding the BlueArc from the impact of the entire farm simultaneously asking for data,” explained MacPherson.
Rising to meet new markets
Framestore’s success is driving growth, with the company expanding into new facilities in Los Angeles and Montréal.
“The challenge we face is how to remain an integrated organization even though we are geographically split,” said MacPherson. “It’s a tough business environment we operate in and the foundation for financial success is increasingly based around intelligent processes and efficient use of resources.
“Managing capacity in Framestore’s five London sites is tricky enough, but we now support multiple sites in multiple cities across multiple time zones. Managing Framestore’s technical infrastructure is a bit like a game of three-dimensional chess. It is essential we demonstrate an ability to distribute this computational load across a global landscape, using available resources in an opportunistic manner. Our challenge is how to shield our creative production community from the details of how we make this happen.” MacPherson is looking at Avere’s FlashMove and FlashMirror as technologies to help address this storage game of three-dimensional chess.
FlashMove is integrated with Avere’s native real-time tiering and allows data to be transparently and non-disruptively moved between NAS systems. FlashMirror keeps replicated data closely in sync by sending updates directly and in parallel to both the primary and secondary NAS core filers. In addition, FlashMirror offloads the replication-processing load from the storage and supports clustering to scale data replication performance to any level required.
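The parallel-update behaviour described above can be sketched in miniature. This is an illustrative model, not Avere’s implementation: two dictionaries stand in for the primary and secondary core filers, and a write is dispatched to both in parallel and acknowledged only once both copies have landed – the property that keeps the replicas closely in sync.

```python
# Illustrative sketch (not Avere's code) of parallel mirrored writes:
# an update is sent to primary and secondary targets simultaneously and
# acknowledged only when both complete.
from concurrent.futures import ThreadPoolExecutor

class MirroredStore:
    def __init__(self, primary, secondary):
        self.primary = primary      # dict standing in for the primary filer
        self.secondary = secondary  # dict standing in for the secondary filer
        self.pool = ThreadPoolExecutor(max_workers=2)

    def write(self, key, value):
        # Dispatch the same update to both targets in parallel.
        futures = [
            self.pool.submit(self.primary.__setitem__, key, value),
            self.pool.submit(self.secondary.__setitem__, key, value),
        ]
        # Acknowledge the write only after both copies are in place.
        for f in futures:
            f.result()

primary, secondary = {}, {}
store = MirroredStore(primary, secondary)
store.write("/shows/shot_010.exr", b"frame data")
print(primary == secondary)  # True
```

Because the replication fan-out happens in a layer in front of the filers, the storage itself does no replication work – which is the offloading benefit the paragraph above describes.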
MacPherson said, “Our goal is for data transport processes to be transparent to the users. With the rise in multi-site working practices and the economic concerns around sharing computational resources it’s essential that the data follow working practices rather than rely on an arbitrary shuffling of data from one location to another. The WAN network connections themselves are a resource that must be managed and the Avere approach is a natural solution to managing this workflow.”
Summarizing his experience of implementing the Avere FXT 4500 Edge filers, MacPherson said, “For us, it’s all about production efficiency and risk management. In the technical realm this becomes successfully balancing the available resources. Without Avere, the risk we carry is the impact of rendering on our core storage and the effect this has on artist interactivity. By the time the film gets to the end stage of production where time is most critical, we are managing huge renders that push storage to its limits. At the same time, we have artists working from that same storage.
“On Gravity, Framestore had an unprecedented level of CG imagery being created and a huge number of people working on this material simultaneously. Despite this pressure on our infrastructure and the creative demands, the delivery atmosphere was really relaxed. It took a lot of tuning, but in the end we were able to focus our efforts on getting the right jobs through at the right times rather than just trying to keep everything from sinking! The difference is enormous, in terms of our ability to respond to client requests and to ‘be there’ for the studio; Avere played a key role in moving our infrastructure and our creative abilities forward.”