Innovators of Big Compute Assemble Again – This Time, Virtually
We just concluded the first Big Compute microSUMMIT of 2021, kicking off an ongoing virtual event series focused on the state of the industry. While circumstances prevented our in-person Big Compute conference (BC 2020), it is exciting to see the continued growth in the community of scientists, engineers, and technologists interested in the state of the Cloud HPC industry. This inaugural virtual event offered a little something for everyone from IT to HPC and R&D professionals.
Led by a panel of seasoned HPC experts from Microsoft, Rescale, and Hyperion Research, the microSUMMIT covered topics ranging from the latest hardware and software to how organizations and practitioners are reacting to the rapid changes cloud is bringing to digital R&D. Azure HPC champion Mark Eastman moderated a deep dive into the recent research findings from Rescale – presented by Chief Product Officer Edward Hsu – and Hyperion – presented by CEO Earl Joseph.
Following a “tipping point year in 2019,” Earl shared that 2020 was a year of continued growth – not only despite, but due in large part to, COVID-19-related complications. Distributed work accelerated the adoption of HPC hardware in the cloud for its flexibility, while at the same time many new specialized HPC architectures from cloud providers continued to become available.
Edward cited the 2021 Big Compute State of Cloud HPC Report, which shows that the number of hardware choices in the cloud is approximately doubling every two years, and that organizations are showing an appetite for using the latest options available (see the figure below and view the full report here).
“The world is moving toward a multi-architecture operating model,” explains Edward Hsu, CPO of Rescale. The challenge, as Earl points out, is that “many sites don’t know what the right answer is. They can’t make the decision to buy the right hardware if they’re not sure what the answer is or what they’ll need next month or next year.”
Unsurprisingly, these cloud veterans had answers to some of these tough questions and shared best practices and data-driven models to solve key cloud HPC challenges.
New Hardware and Software Converge at a Fever Pitch
The industry landscape is heating up, and as you would expect from a cloud provider, platform provider, and research analyst, there was no shortage of vendors and solutions to discuss.
Gone are the days of homogeneous, static hardware; the days of optionality and increasingly complex application portfolios have arrived.
As discussed in the microSUMMIT, never before has HPC seen such an explosion of new entrants and disruptive technologies, such as GPU accelerators, ARM/RISC-V, and specialized cloud instances like the new AMD-based HBv3 available from Azure. Hyperion Research points out that, as a result, companies are able to run larger jobs even faster, as evidenced by the UK Met Office’s $1.5 billion investment in Azure public cloud for its weather codes, a use case known to be particularly challenging to run.
With cloud providers like Amazon Web Services, Microsoft Azure, Google Cloud Platform, and Oracle Cloud continuously onboarding newer and more specialized architectures, it’s no surprise to see growth in the number of organizations using hardware from multiple cloud providers. Rescale analytics showed that multi-cloud posture for HPC grew from 35% in 2018 to 50% in 2020 (note figure below). From this it would appear that IT and HPC managers are moving beyond cloud value propositions like disaster recovery and business continuity, listening to the dynamic needs of their end users, and taking advantage of cloud to unlock new strategic capabilities.
Of course, it wouldn’t be an HPC event without getting into the hottest applications and use cases being deployed by practitioners across industries. Hyperion Research’s latest research on cloud HPC usage uncovers major growth in sectors like biosciences, computer-aided engineering (CAE), chemical engineering, and electronic design automation (EDA).
“What about AI/ML?” you ask.
AI and ML/DL use cases are undoubtedly cropping up in almost every industry and being woven into the fabric of many organizations’ R&D processes. HPC has a unique relationship with AI in that it can be used to generate data to train models, create surrogate models, and explore the design space.
Despite Big Strides, It’s Still Early Days
There is good news for anyone considering a career path in managing HPC infrastructure or utilizing cloud to run scale-out applications. The industry is facing a significant talent shortage with many advancements yet to be discovered.
As the panel discussed, while that may mean solid job security, talent shortages also mean organizations are challenged in migrating their operating models to the cloud and managing them there. Challenges range from ensuring security and compliance to reducing creeping compute costs and maximizing performance across complex, enterprise-wide software portfolios.
One major topic that is prone to controversy is the economics of cloud in an HPC context. The panel unpacked this healthy cloud skepticism, and Earl pointed out that many organizations struggle with cloud cost transparency and, as a result, cost overruns. While that may not be shocking, this time around Rescale introduced its Performance and Maturity indices (RPI and RMI), the secret ingredient in Rescale’s platform intelligence. These tools help practitioners target the ideal cost-performance profile. Using this data-driven approach, companies can now select the latest and best hardware for their variety of software workloads, which in many cases can reduce “full-stack” costs.
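To make the cost-performance idea concrete, here is a minimal Python sketch of picking hardware per workload. It is not how RPI or RMI are computed, which the panel did not detail; the instance names, runtimes, prices, and the `cheapest_within_deadline` helper are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Benchmark:
    """One hypothetical measurement: a workload run on one instance type."""
    workload: str
    instance: str
    runtime_hours: float   # wall-clock time for one job (made-up number)
    price_per_hour: float  # hourly price in USD (made-up number)

    @property
    def cost_per_job(self) -> float:
        # Total spend for one job: time on the machine times its hourly price.
        return self.runtime_hours * self.price_per_hour


def cheapest_within_deadline(benchmarks, workload, max_hours):
    """Return the lowest-cost benchmark for `workload` that finishes
    within `max_hours`, or None if no instance type qualifies."""
    candidates = [
        b for b in benchmarks
        if b.workload == workload and b.runtime_hours <= max_hours
    ]
    return min(candidates, key=lambda b: b.cost_per_job, default=None)


# Illustrative data only: three instance types running the same workload.
BENCHMARKS = [
    Benchmark("cfd", "general-purpose", 10.0, 2.0),
    Benchmark("cfd", "hpc-optimized", 4.0, 4.0),
    Benchmark("cfd", "gpu", 2.0, 12.0),
]

best = cheapest_within_deadline(BENCHMARKS, "cfd", max_hours=5.0)
print(best.instance, best.cost_per_job)
```

With these made-up numbers, the instance with the lowest hourly price is not the cheapest per job, which is one way newer hardware can lower overall cost despite a higher list price.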
Predicting the Future of the Industry
What does the future hold for technologists and R&D innovators?
According to data provided by Rescale and Hyperion Research, growth in cloud HPC is expected to outpace that of on-premises computing. In fact, Earl shared that he expects it to be 250% faster, not accounting for any additional acceleration caused by drivers like ease-of-use improvements and potential performance advantages (see figure below).
Despite all the exciting change and growth, it may still be early days in the digital transformation for R&D organizations. Ed’s closing thoughts shed some light on where he thinks the industry is headed:
“Now everything is connected: SaaS-based, all the tools you need as a developer are readily available. We are very near a world where that’s going to be true for science and engineering. Every resource you need is on-demand and is given to you in-context to what you’re trying to accomplish. So I think we’ll see breakthrough improvements in time-to-market of new physical products, a lot more collaboration and connectedness between researchers and people manufacturing products. Rescale’s goal is to help these HPC and IT organizations to harness the resources they need (HW, SW, and other tools) to commercialize innovation breakthroughs faster.” – Edward Hsu, CPO, Rescale
So where will the next breakthroughs be in computing and R&D? Lucky for our panelists, it appears they sit at the intersection of many different industries and cutting-edge solutions that enable a wide range of use cases.
We look forward to checking back in with them at future events.
Want the Full Scoop?
If you missed the live event, you may have missed your chance at an Oculus Quest 2 trivia prize, but you can still watch the session recording on demand here and keep tabs on future Big Compute events and announcements by signing up for the Big Compute newsletter.
Rescale, Microsoft Azure, and Hyperion Research are all part of the Big Compute community – a global group of thought leaders who embrace the freedom to think big in this era of virtually unlimited compute. To learn more about Big Compute, including the fast-growing Big Compute podcast (ranked in the Top 100 for Science and Technology), please visit bigcompute.org.
Whether you are an engineer looking to build better products, a scientist looking to uncover your next breakthrough, or a technology leader looking to improve the performance and efficiency of your teams, we recommend checking out the full Big Compute State of Cloud HPC Report.