Preparing the infrastructure foundations - how to get ready for successful AI
Summary: As pressure rises for companies to implement generative AI, there's a risk of buying technology that jars with existing IT infrastructure. Patrick Smith looks beyond the AI hype with four key elements for success.
Generative AI is making headlines globally, and organizations face increasing pressure to have a strategy for their AI deployments. As the hype grows, the key question facing buyers is: how do we make AI work for us? Data is the key, but to take full advantage of the opportunity, laying the correct infrastructure foundations is essential. And many companies are not prepared.
Internal pressure
A survey by Equinix found that 42% of IT leaders believe their existing IT infrastructure is not fully prepared for the demands of AI. A recent Pure Storage survey found that 90% of respondents had been pressured into buying technology which their infrastructure didn’t fully support. There’s a risk that business pressure is overshadowing strategic decision making.
To get ready for this next wave of AI innovation, organizations should strongly consider four elements of a successful AI strategy: performance, flexibility, reliability and efficiency.
1. Performance
AI is all about repeatedly feeding large volumes of data into GPUs. The faster that happens, the sooner and better the results. AI resources such as GPUs and data scientists are expensive and in high demand, so keeping them waiting for access to data is the last thing organizations want to do.
But just as important as feeding the GPUs is accelerating the whole data preparation and curation workflow - collecting and processing the data in the first place. Performance also has to account for the wide variety of data types - images, audio, text - which are accessed differently depending on the industry they come from. Infrastructure must deliver high performance across all of these data types and access patterns to make use of the data and extract value from AI.
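As a rough, hypothetical illustration of what keeping GPUs fed means in practice, the sketch below uses a PyTorch-style data loader with background workers and pinned-memory prefetch, so batches are staged while the accelerator is busy. The dataset, model and every parameter here are assumptions chosen for illustration, not anything described above.

```python
# Hypothetical sketch only: overlap data loading with GPU compute so the
# accelerator is never left waiting on storage. The dataset, model and
# parameters below are illustrative placeholders.
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset


def main():
    device = "cuda" if torch.cuda.is_available() else "cpu"

    # Stand-in dataset: 10,000 random "samples" with 512 features each.
    features = torch.randn(10_000, 512)
    labels = torch.randint(0, 10, (10_000,))
    dataset = TensorDataset(features, labels)

    # Background workers keep preparing batches on the CPU while the GPU
    # trains; pinned memory plus non_blocking copies overlap the transfers.
    loader = DataLoader(
        dataset,
        batch_size=256,
        shuffle=True,
        num_workers=4,
        pin_memory=True,
        prefetch_factor=2,
    )

    model = nn.Linear(512, 10).to(device)
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    for x, y in loader:
        # Asynchronous host-to-device copy: the next batch can be staged
        # while the current one is being processed.
        x = x.to(device, non_blocking=True)
        y = y.to(device, non_blocking=True)
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()


if __name__ == "__main__":
    main()
```

The specific framework doesn't matter; the point is the principle above: the storage and data pipeline have to be fast enough that expensive GPUs, and the people using them, are never idle waiting on data.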
2. Flexibility
We are living through an accelerated pace of change: tools, data sets, algorithms and techniques are all evolving constantly. Timelines have compressed dramatically - significant change is no longer a quarterly or monthly occurrence but a weekly, if not daily, one. Realistically, no one knows what's ahead with certainty. With the arrival of generative AI, people have had to reprioritize incredibly quickly: very large companies are saying they need massive AI infrastructure in place to keep up, while medium to large organizations are repurposing existing AI infrastructure into systems suitable for generative AI.
In AI, data patterns shift and objectives change. Storage that is highly optimized for narrow use cases (such as by data set location, file size, or access type) can be difficult to repurpose for changing conditions, for example when new input types are added to a model. This highlights the need for technology that enables a company to adapt and evolve. Rigidity means organizations will be left behind.
3. Reliability
As mentioned, organizations handle a wide variety of data types. Collecting that data - then managing, curating, indexing, labelling and preparing it for analysis - takes time. At the next stage, as data scales to production levels, organizations need infrastructure that is reliable, delivers no downtime, and can handle the massive data pipelines that come with AI.
Data privacy and governance are also at play here. Reliable infrastructure has to ensure that data is stored securely and in the region it is meant to be in. Some AI use cases have serious real-world consequences if these systems go offline: downtime causes delays and impacts results downstream, whether that is a patient diagnosis, the resolution of a customer complaint, or activity in financial markets.
It is really important to look for a technology vendor with a track record of availability, backed by SLAs, that mitigates risk and manages recovery from unplanned downtime such as a ransomware attack.
4. Efficiency
Everyone knows that data volumes will grow. Many analyst estimates didn't account for generative AI, so they're likely very conservative. The more data that's generated, the more storage will be needed. And as AI becomes more advanced and ubiquitous, organizations will need more GPUs and more servers, increasing their demands on the energy supply.
To deal with this, organizations should look for the most energy-efficient infrastructure they can get: all-flash solutions with no spinning disks reduce power consumption and data center footprint. Pure Storage has customers saving up to 85% in energy costs by running its storage in their data centers. With energy scarcity already influencing where data centers are built, and energy costs rising, organizations need to think hard about how they will meet AI's storage requirements from an energy consumption point of view. Some Pure Storage customers say the savings in energy consumption alone have paid for the technology, so these are significant numbers.
Data quality
With these four elements incorporated into an AI strategy, it's also important to consider the impact data quality has on AI success. The key to good results is connecting AI models and AI-powered applications to data. It's hard to predict exactly what this link will look like or what the future of AI applications will be, but some first principles around data quality hold:
- Lots of data is important;
- Data can’t be cold; it needs to be readily available;
- It needs to be easily accessible, across silos and applications.
Get your infrastructure ready for AI
With these foundational points in mind, customers should be better prepared to meet their AI goals with an AI-ready infrastructure such as Pure Storage’s AIRI. Alongside helping customers in self-driving cars, financial services and healthcare, one of Pure’s best-known AI implementations is at Meta, which built its Research SuperCluster (RSC), the largest AI supercomputer in the world. Meta understood the importance of performance, flexibility, reliability and efficiency when it turned to Pure Storage to help build the RSC. Pure is supporting Meta’s next-generation AI research, trained on petabytes of data, on infrastructure with very low power usage.
AI is already having important real-world impacts on healthcare, life sciences, customer service and a whole host of other applications. Performance, reliability, flexibility and efficiency are the four elements that should be the starting point for any organization that wants to implement AI with a long-term view on success.
It’s vitally important for the IT team to be embedded in the wider business to ensure needs are met on both sides. With this four-point strategy adopted as a business-wide approach, there shouldn’t be surprises between what the business needs and what the technology can deliver.
The secret sauce of successful AI isn’t hype; it’s infrastructure that can cope with the demands AI brings - technology that allows businesses to maintain a competitive advantage through successful AI initiatives.