AI is both powerful and power-hungry. It’s already driving the next technological revolution, but AI applications consume far more power than regular compute tasks. In fact, it’s double jeopardy for IT teams: GPUs are both power-intensive to run and expensive to buy. Even in the cloud, a basic GPU instance often costs twice as much as a comparable compute instance.
This means more pressure and hard choices for IT teams. Experimenting with AI is often seen as a way of staying competitive, but cutting costs in other areas can leave the business exposed. A recent report from Crayon found that a staggering 94% of IT leaders are struggling to optimise their cloud costs – and AI lives in the cloud.
Increasingly, however, we are seeing companies look to more sustainable solutions, from both a hardware and a software perspective, as a way of staying agile on cost.
Efficient and Effective
Sustainability may not quite sit on a par with cost and security, but teams are increasingly realising that sustainable technology is often efficient technology. This is particularly important for AI, because on average, a GPU dedicated to AI consumes around five times more electricity than a regular GPU used for gaming or rendering, for example.
This makes it key to understand how and why power efficiency works across the entire technology stack. All of the components interact with each other, making it a tangled system to unravel.
However, we can start with the bigger picture. At an infrastructure level, organisations should think about how quickly they actually need AI workloads completed. If you’re a startup trying to bring a product to market, you often need to iterate quickly, so you probably do need fast GPUs like Nvidia H100s. But if you’re a large enterprise doing model training over a weekend, then although an H100 might finish by lunchtime on Saturday, a V100 finishing by 8am on Sunday would still be around 40% cheaper, depending on the provider.
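To make that trade-off concrete, here’s a minimal sketch in Python, using purely illustrative hourly rates and run times – real prices and durations vary widely by provider, region and workload:

```python
# Purely illustrative hourly rates and run times -- not real price quotes.
H100_RATE = 4.00   # $/hour (hypothetical)
V100_RATE = 0.85   # $/hour (hypothetical)

def training_cost(hours: float, rate: float) -> float:
    """Cost of a single training run at a flat hourly rate."""
    return hours * rate

# The same weekend job: the H100 is done by Saturday lunchtime,
# the V100 by Sunday morning -- both inside the weekend window.
h100_cost = training_cost(hours=14, rate=H100_RATE)
v100_cost = training_cost(hours=40, rate=V100_RATE)

print(f"H100: ${h100_cost:.2f}")                   # $56.00
print(f"V100: ${v100_cost:.2f}")                   # $34.00
print(f"Saving: {1 - v100_cost / h100_cost:.0%}")  # ~39%
```

If the deadline is Monday morning, the slower card meets it at a fraction of the cost.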
Of course, this is an oversimplification; when specifying a GPU, technology teams need to think about the model they’re using, how many parameters it has, the degree of precision required, the number of concurrent users and the usage patterns, for example.
And there are additional considerations to bear in mind. Model training isn’t user-facing, so it can be done almost anywhere, opening up a broader range of – potentially cheaper – providers. Inference, on the other hand, is user-facing, so it’s much more sensitive to latency and therefore to location. At the same time, using lower-spec kit is greener: an H100 has a ‘cradle to gate’ (i.e. manufacturing) carbon footprint of approximately 150kgCO2e, compared to an L4’s 50kgCO2e. For reference, an average CPU has a footprint in the region of 5-25kgCO2e. In short, being smart about AI infrastructure can cut costs and reduce carbon footprint.
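One rough way to reason about those embodied-carbon numbers is to amortise the manufacturing footprint over the card’s useful life. The sketch below uses the figures quoted above, plus assumed values for lifetime and utilisation:

```python
# Cradle-to-gate (manufacturing) footprints quoted above, in kgCO2e.
EMBODIED_KG = {"H100": 150.0, "L4": 50.0}

HOURS_PER_YEAR = 365 * 24  # 8,760

def embodied_g_per_hour(card: str, lifetime_years: float = 4.0,
                        utilisation: float = 0.5) -> float:
    """Manufacturing footprint amortised per useful hour of compute.

    lifetime_years and utilisation are assumptions for illustration;
    real figures depend on refresh cycles and how busy the card is.
    """
    useful_hours = lifetime_years * HOURS_PER_YEAR * utilisation
    return EMBODIED_KG[card] * 1000 / useful_hours  # grams CO2e per hour

for card in EMBODIED_KG:
    print(f"{card}: ~{embodied_g_per_hour(card):.1f} gCO2e per useful hour")
# H100: ~8.6, L4: ~2.9 under these assumptions
```

Under these assumptions, the lower-spec card carries roughly a third of the manufacturing burden per hour of useful work.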
At the same time, energy consumption depends on the characteristics of the application itself. Tasks that lean heavily on the CPU and GPU are generally more energy-intensive than those that mainly use RAM or SSD storage. And in general, running applications is where we can make the most impact – largely because most semiconductors are made in similar ways, in the same factories, so their manufacturing footprints are broadly similar. Interestingly, the manufacturing impact is the inverse of the running patterns: making RAM and SSDs usually consumes more power, and has a higher carbon impact, than making CPUs.
It’s also important to look at a country’s energy mix when deciding where to locate AI tasks, because some grids are more dependent on fossil fuels than others. However much green energy your provider buys, the actual power for a task comes from the grid, and therefore from a mix of sources – which makes the national energy mix one of the most important things to look at.
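The arithmetic here is simple: operational emissions are energy consumed multiplied by the grid’s carbon intensity. A minimal sketch, with hypothetical intensity figures (live per-country data is published by sources such as Ember and Electricity Maps):

```python
# Hypothetical grid carbon intensities in gCO2e per kWh -- real values
# vary by country and by hour of the day.
GRID_INTENSITY = {"fossil-heavy": 650.0, "mixed": 300.0, "low-carbon": 50.0}

def job_emissions_kg(power_kw: float, hours: float, grid: str) -> float:
    """Operational emissions for a job: energy (kWh) x grid intensity."""
    energy_kwh = power_kw * hours
    return energy_kwh * GRID_INTENSITY[grid] / 1000  # kgCO2e

# A hypothetical 0.7 kW GPU running a 40-hour training job:
for grid in GRID_INTENSITY:
    print(f"{grid}: {job_emissions_kg(0.7, 40, grid):.1f} kgCO2e")
# fossil-heavy: 18.2, mixed: 8.4, low-carbon: 1.4
```

Because training is location-flexible, moving the same job to a lower-carbon grid can cut its operational footprint by an order of magnitude.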
It probably goes without saying that redundancy is an important consideration. Having two sets of infrastructure running at the same time means twice the bill and twice the carbon footprint. Many backup and business continuity infrastructure providers offer a ‘passive mode’ which can reduce power consumption, but it does increase the time taken to get back up and running in the event of a failure. This makes it key to understand the profile of the application you’re running, which means talking to business stakeholders and adapting accordingly.
There’s also a sliding scale of costs for storage. High-performance storage such as NVMe SSDs, for example, is expensive and carbon-intensive compared to conventional spinning-disk hard drives. Old-fashioned tape storage is very cheap and sustainable and has a long lifespan, but is very slow to retrieve data from!
Our final infrastructure consideration – resource scaling – is where we start blurring the line between the hardware and the software. Correctly sizing a virtual machine in the cloud is a fine art, mostly because it’s relatively easy to scale CPU allocation, but harder to re-allocate memory – and although memory may not consume as much power as CPU, no-one wants to pay for memory they’re not using.
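As a sketch of what right-sizing looks like in practice, the check below compares provisioned resources against observed peaks from monitoring, with a hypothetical headroom margin – the names and numbers are invented for illustration, not a real provider’s API:

```python
# A hypothetical right-sizing check: compare what's provisioned against
# observed peak usage from monitoring, keeping a safety margin.
HEADROOM = 0.25  # keep 25% above the observed peak (illustrative)

provisioned = {"vcpus": 16, "ram_gb": 64}
observed_peak = {"vcpus": 5.2, "ram_gb": 41.0}  # hypothetical monitoring data

for resource, allocated in provisioned.items():
    needed = observed_peak[resource] * (1 + HEADROOM)
    if needed < allocated:
        print(f"{resource}: allocated {allocated}, "
              f"peak {observed_peak[resource]} -> ~{needed:.1f} would do")
# vcpus: 16 allocated, ~6.5 would do; ram_gb: 64 allocated, ~51.2 would do
```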
Into the Code
It might surprise you to learn that even an application’s code has an impact on its power consumption, but today we have organisations like the Green Software Foundation which exist to promote more power-efficient software practices. One of the long-standing problems with code efficiency and power consumption is that developers don’t code from scratch anymore – and indeed, it rarely makes sense to. Libraries and APIs make development faster, but if the building blocks are inefficient to start with, your application will perpetuate those problems.
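A toy example of how a code-level choice changes the work done, and therefore the energy used: building a large string by repeated concatenation can do quadratic work in Python, while a single join() does one linear pass. The function names here are invented for illustration:

```python
def build_report_slow(lines: list[str]) -> str:
    # Each += can copy everything accumulated so far, so total work
    # can grow quadratically with the number of lines.
    out = ""
    for line in lines:
        out += line + "\n"
    return out

def build_report_fast(lines: list[str]) -> str:
    # One pass over the input: same output, far fewer CPU cycles.
    return "".join(line + "\n" for line in lines)
```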
Furthermore, not all programming languages are equally efficient. Languages that run closer to the metal (like Rust and C) tend to be higher-performance and more powerful, but this invokes the Spider-Man rule: with great power comes great responsibility. The scope for exploits and mistakes in a powerful low-level language is far greater than in higher-level languages like Perl and PHP, so teams must always balance cost, power and risk.
Lean and Green
The IT industry is always under pressure to do more with less, performing the careful balancing act of making sure technology can do what it needs to while staying secure, agile and cost-effective. AI is a great tool for enhancing organisational efficiency and productivity, but it can also be a spanner in the works, inflating costs – sometimes for uncertain payoff.
This makes it more important than ever to understand every link in the chain. Although there are often simple, broad-reaching initiatives to reduce costs, it’s equally important to think from the grass roots upwards and trim costs without compromising on functionality. This is where good environmental practice and cost control intersect, helping businesses get the most from their systems while staying efficient and sustainable. Today, sustainability is not just ethical – it’s the path to efficiency and effectiveness in the AI era for all businesses.