From OnDemand to AweSim: Merging HPC with the Cloud

Executive Director, OSC
Ohio Supercomputer Center
Tuesday, October 15, 2013 - 10:30am (updated Thursday, October 31, 2013 - 12:44pm)
Screenshots of OSC OnDemand windows.

Ed.—This post is the third in a series on manufacturing in recognition of Ohio Manufacturing Month (October) and National Manufacturing Day (Oct. 4). For others in this series, follow the manufacturing tag.

The Ohio Supercomputer Center (OSC) launched OSC OnDemand in January, and we've given presentations about it at various meetings and conferences, including the XSEDE conference in July. Through OnDemand, users can run high-performance computing (HPC) and visualization jobs on Glenn and Oakley, our production clusters. To some of our users, I suspect OnDemand looks like a web site and nothing more. I think it is substantially more. I think it is Sputnik.

The Sputnik 1 satellite was launched in 1957. It proved that the available technologies of the time, cleverly combined, could put a man-made device into orbit. While a remarkable achievement on its own, Sputnik more importantly served as an example of what could be done. It started the Space Race and spurred the creation of technology that put a man on the moon. Similarly, OnDemand on its own is a useful service and a technical accomplishment. I am proud of the team that delivered it. But I also see it as a first example of what could be done to merge HPC with the cloud. This is the goal of our new initiative, AweSim.

To see what I mean, let’s compare HPC with desktop computers. When I started programming in the 1980s, I had to go to a computer center on campus to type my programs on a VT100 terminal and submit them to the machine. Every fifteen minutes or so, the machine would process an entire batch of these submissions and I would get a printout with my answers. (Personal note: I’d work late on Friday evenings to chat with the cute girl who distributed the printouts. As of this past July, we have been married twenty-five years.) While the process I followed in that computer center was not technically supercomputing, it was exactly the HPC workflow: remote access to a shared computer, storage on the server and batch execution.

In graduate school, I got a desktop machine. I found the desktop to be a very different user experience, since processing and storage were local (for example, reading and writing to the hard drive) and interactive (processing began immediately). However, when I connected to remote supercomputer systems, I was back to the HPC user experience. These two experiences coexist to this day.

So, what do I mean by the cloud user experience? As an example, let’s look at how I use email. I read and write email using my work laptop, my home laptop and my iPhone. I see the same emails on all three devices along with the same responses and contacts. If I save a draft on one device, it is available for editing on another. I see the cloud user experience as defined by interactive, customized apps that provide a single, seamless user experience (same data and actions) across multiple mobile devices.

This experience did not just happen. It is built from a set of technologies. The web provides a standard mechanism for computers to exchange content. Mobile devices like notebooks, smartphones and tablets run apps. Wireless networks like 4G and Wi-Fi connect devices from nearly anywhere. Cloud vendors provide remote computing and storage accessible from any device. These providers emerged when dot-com giants like Amazon and Google realized that their internal compute resources (used for taking customer orders or performing web searches) could be commoditized and leased on the open market. 

HPC is remote and batch. The desktop is local and interactive. The cloud is remote and interactive. Why not just make HPC interactive? After all, no one enjoys waiting. The issue is the hours (or even days) it takes to run an HPC job or transfer HPC-scale data sets. (And unfortunately, this isn’t getting any faster until someone changes the speed of light.) But people find ways to manage things that take longer than they want, like making chili in the crockpot. For example, some engineers scale their problems so that they complete overnight. They start a batch before leaving and have answers in the morning. One set of HPC tools is used to manage the batch work. Another set of desktop tools is used for the interactive work that comes before and after a run (often called preprocessing and post-processing). Unfortunately, engineers have to manage their work across these two environments.

AweSim is a new initiative spearheaded by OSC. We’re bringing the cloud user experience to modeling and simulation. We are designing interactive, customized apps that provide a single, seamless user experience across HPC batch and interactive cloud solutions. Engineers will use our apps to fire off batches of HPC jobs that generate huge data sets. They will use the same apps to interrogate this data with analytics to answer any number of design questions. Using the lessons learned building OnDemand, we are preparing for our next launch.