A new way to optimize and prioritize AI projects for the GPU shortage

September 08, 2023

Generative AI, enabled by large language models (LLMs) like GPT-4, has sent shockwaves through the tech world. ChatGPT’s meteoric rise has prompted the global tech industry to reassess and prioritize gen AI, reshaping product strategies in real time.

Integration of LLMs has given product developers an easy way to incorporate AI-powered features into their products. But it’s not all smooth sailing. A glaring challenge looms large for product leaders: the GPU shortage and spiraling costs.

Rise of LLMs and GPU shortage

The growing number of AI startups and services has led to high demand for high-end GPUs such as A100s and H100s, overwhelming Nvidia and its manufacturing partner TSMC, both of which are struggling to meet demand. Online forums like Reddit are abuzz with frustrations over GPU availability, echoing the sentiment across the tech community. It has become so dire that both AWS and Azure have had no choice but to implement quota systems.

This bottleneck doesn’t just squeeze startups; it’s a stumbling block for tech giants like OpenAI. At a recent off-the-record meeting in London, OpenAI’s CEO Sam Altman candidly acknowledged that the computer chip shortage is stymieing ChatGPT’s growth. Altman reportedly lamented that the lack of computing power has resulted in subpar API availability and has kept OpenAI from rolling out larger “context windows” for ChatGPT.

Prioritizing AI features

On the one hand, product leaders find themselves caught in a relentless push to innovate, facing expectations to deliver cutting-edge features that leverage the power of gen AI. On the other hand, they grapple with the harsh realities of GPU capacity constraints. It’s a complex juggling act, where ruthless prioritization becomes not just a strategic decision but a necessity.

Given that GPU availability is poised to remain a challenge for the foreseeable future, product leaders must think strategically about GPU allocation. Traditionally, product leaders have leaned on prioritization techniques like the Customer Value/Need vs. Effort Matrix. This method, however logical in a world where computational resources were abundant, now demands a bit of reevaluation.

In our current paradigm, where compute is the constraint and not software talent, product leaders must redefine how they prioritize various products or features, bringing GPU limitations to the forefront of strategic decision-making.

Planning around capacity constraints may seem unusual for the tech industry, but it’s a commonplace strategy in other industries. The underlying concept is straightforward: The most valuable factor is the time spent on the constrained resource, and the objective is to maximize the value per unit of time spent on that constraint.

Tech success metrics

As a former consultant, I’ve successfully applied this framework across various industries. I believe that tech product leaders can use a similar approach to prioritize products or features while GPU constraints exist. When applying this framework, the most straightforward measure of value is profitability.

However, in tech, profitability might not always be the appropriate metric, particularly when venturing into a new market or product. Thus, I’ve adapted the framework to align with the success metrics often used in tech, outlining a simple four-step process:

1. Contribution

First, identify your North Star metric. This is the contribution of each product or feature, something that encapsulates the essence of its value. Some concrete examples might include:

An increase in revenue and profit
Gains in market share
Growth in the number of daily/monthly active users

2. Number of GPUs required

Gauge the number of GPUs needed for each product or feature. Focus on key factors including:

Number of queries per user per day
Number of daily active users
Complexity of the query (how many tokens each query consumes)
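
As a rough illustration, the three factors above can be combined into a back-of-the-envelope GPU estimate. This is a minimal sketch, not a sizing method from the article: the function name, the `tokens_per_gpu_per_second` throughput figure and the `peak_to_average` multiplier are all assumptions that you would replace with measurements from your own model and serving stack.

```python
import math

SECONDS_PER_DAY = 86_400


def estimate_gpus(daily_active_users, queries_per_user_per_day,
                  tokens_per_query, tokens_per_gpu_per_second,
                  peak_to_average=1.0):
    """Rough GPU count needed to serve one product or feature.

    tokens_per_gpu_per_second and peak_to_average are hypothetical
    inputs (assumptions, not values given in the article).
    """
    if tokens_per_gpu_per_second <= 0:
        raise ValueError("tokens_per_gpu_per_second must be positive")
    # Total tokens the feature consumes per day.
    tokens_per_day = (daily_active_users
                      * queries_per_user_per_day
                      * tokens_per_query)
    # Average load, scaled up to cover peak traffic.
    tokens_per_second = tokens_per_day / SECONDS_PER_DAY * peak_to_average
    # Round up: a fractional GPU still means one more GPU.
    return math.ceil(tokens_per_second / tokens_per_gpu_per_second)


# Illustrative inputs only: 1M daily users, 10 queries each,
# 500 tokens per query, 1,000 tokens/s per GPU.
gpus_needed = estimate_gpus(1_000_000, 10, 500, 1_000)
```

An estimate like this is only as good as the throughput assumption, so treat the result as an order-of-magnitude input to the next step and not as a capacity plan.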

3. Calculate contribution per GPU

Break it down to the specifics. How much does each GPU contribute to the overall goal? Dividing each product’s contribution by the number of GPUs it requires gives you a clear picture of where your GPUs are best allocated.

4. Prioritize products based on contribution per GPU

Now, it’s time to make the tough decisions. Rank your products by their Contribution per GPU, and then line them up accordingly. Focus on the products with the highest Contribution per GPU first, ensuring that your limited resources are channeled into the areas where they’ll make the most impact.
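
Steps 3 and 4 amount to a division and a sort. The sketch below is one possible way to express them; the `Candidate` class, its field names and the example features are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    """A product or feature competing for GPUs (hypothetical structure)."""
    name: str
    contribution: float  # North Star metric, e.g. revenue or market share
    gpus_required: int

    @property
    def contribution_per_gpu(self):
        # Step 3: contribution divided by GPUs required.
        return self.contribution / self.gpus_required


def rank_by_contribution_per_gpu(candidates):
    """Step 4: highest contribution per GPU first."""
    return sorted(candidates,
                  key=lambda c: c.contribution_per_gpu,
                  reverse=True)


# Made-up candidates, for illustration only.
candidates = [
    Candidate("Chat assistant", contribution=40.0, gpus_required=400),
    Candidate("Smart search", contribution=12.0, gpus_required=60),
    Candidate("Auto-summaries", contribution=9.0, gpus_required=30),
]
ranking = [c.name for c in rank_by_contribution_per_gpu(candidates)]
```

Here the feature with the largest absolute contribution ranks last, because it delivers the least per GPU consumed.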

With GPU constraints no longer a blind spot but a quantifiable factor in the decision-making process, your company can more strategically navigate the GPU shortage. To bring this framework to life, let’s visualize a scenario where you, as a product leader, are grappling with the challenge of prioritizing among four different products:

Although Product A has the highest revenue potential, it doesn’t yield the highest contribution per GPU. Surprisingly, Product D, with the least revenue potential, offers the most substantial return per GPU. By prioritizing based on this metric, you could maximize total potential revenue.

Let’s say you have a total of 1,000 GPUs at your disposal. A straightforward choice might have you opting for Product A, generating a revenue potential of $100 million. However, by applying the prioritization strategy described above, you could achieve $155 million in revenue:
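
The per-product figures behind this comparison do not appear in the text, so the sketch below uses hypothetical numbers. They were chosen only to match the facts stated here: Product A alone yields $100 million on 1,000 GPUs, Product D has the lowest revenue but the highest contribution per GPU, and the prioritized mix totals $155 million. The allocation rule itself (fund products in descending contribution-per-GPU order while GPUs remain) is an assumed reading of the strategy.

```python
def allocate(products, gpu_budget):
    """Fund products in descending revenue-per-GPU order within the budget.

    Returns (funded product names, total revenue in $M, GPUs left over).
    """
    ranked = sorted(products,
                    key=lambda p: p["revenue_musd"] / p["gpus"],
                    reverse=True)
    funded, total, remaining = [], 0, gpu_budget
    for product in ranked:
        # Fund a product only if it fits entirely in the remaining GPUs.
        if product["gpus"] <= remaining:
            funded.append(product["name"])
            total += product["revenue_musd"]
            remaining -= product["gpus"]
    return funded, total, remaining


# HYPOTHETICAL inputs, not the article's original data.
PRODUCTS = [
    {"name": "A", "revenue_musd": 100, "gpus": 1000},
    {"name": "B", "revenue_musd": 75, "gpus": 500},
    {"name": "C", "revenue_musd": 50, "gpus": 300},
    {"name": "D", "revenue_musd": 30, "gpus": 150},
]

funded, total_revenue, gpus_left = allocate(PRODUCTS, gpu_budget=1000)
```

With these inputs the rule funds D, C and B for a combined $155 million, versus $100 million for Product A alone. Note that a greedy ranking like this is a heuristic: when products cannot be partially funded, it is not guaranteed to find the best possible combination.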

The same method can be applied to other contribution metrics, such as market share gain:

Similarly, opting for Product A would have led to a market share gain of 5%. However, applying the prioritization strategy described above, you could achieve 7.75% in market share gain:

Benefits and limitations

This alternative prioritization framework introduces a more nuanced and strategic approach. By zeroing in on Contribution per GPU, you’re aligning resources where they’ll make the most substantial difference, whether in terms of revenue, market share or any other defining metric.

But the advantages don’t stop there. This method also fosters a greater sense of clarity and objectivity across product teams. In my experience, including my early days leading digital transformation at a healthcare company and later while working with various McKinsey clients, this approach has been a game-changer in scenarios where capacity constraints are a critical factor. It has enabled us to prioritize initiatives in a more data-driven and rational manner, sidelining the traditional politics where decisions might otherwise fall to the loudest voice in the room.

However, no one-size-fits-all solution exists, and it’s worth acknowledging the potential limitations of this method. For instance, this approach may not always capture the strategic importance of certain investments. Thus, while exceptions to the framework can and should be made, they should be carefully considered exceptions rather than the norm. This maintains the integrity of the process and ensures that any deviations are made with a broader strategic context in mind.

Conclusion

Product leaders are facing an unprecedented situation with the GPU shortage, so they need to find new ways of managing resources. In the words of the great strategist Sun Tzu, “In the midst of chaos, there is also opportunity.”

The GPU shortage is indeed a challenge, but with the right approach, it can also be a catalyst for differentiation and success. The proposed framework, built around Contribution per GPU, offers a strategic way to prioritize. By zeroing in on this metric, companies can maximize their return on investment, aligning resources where they’ll make the most impact and focusing on what matters most to their long-term success.

Prerak Garg is senior director of cloud and AI business strategy at Microsoft and a former McKinsey and Company engagement manager.
