How much do we need?

The campus has begun a major effort to calculate how much power, cooling, storage, and network speed its high-level research requires. Got room for 500 million virtual trees?

Advanced research often requires sophisticated displays of data, a capability that these days calls for more--much more--than a hearty dose of PowerPoint.

It can require software and equipment sturdy enough to support virtual "caves," or high-definition display walls taller than a basketball player, or digital recreations of ancient cultural sites in three dimensions.

Researchers and faculty increasingly use almost magical images to understand and explore huge data sets, in areas ranging from physics to sociology. With the right computing power and networks, they can send the images to distant colleagues to work together, or to supercomputers for analysis.

But to do all that, researchers need computing power, cooling, and data storage--plus a network big enough to move enormous files quickly--in quantities that would have seemed impossibly huge a decade ago. Which has raised a key question: How much high-tech firepower does UC Davis need to do the work its researchers want to do? How much will it need in a few years?

The campus has begun to find out, through discussions, a cyber-infrastructure conference on campus this spring, a survey, and other focused efforts. Getting answers isn't easy, but it is important.

At stake, said Information and Educational Technology (IET) Vice Provost Pete Siegel, is the future of the campus as a top research university. Without the right framework, the really advanced research can't happen here.

Like printing half a billion trees

An IET assessment of faculty needs in late 2005 helps define the demand for more capacity, although the figures are so large they're hard to comprehend.

Take just one component among many: hosted data storage (data stored off campus, in this case). The requirement at UC Davis in late 2005 was estimated at 51 terabytes. One terabyte equals 1,000 gigabytes, or 50,000 trees made into paper and printed. So 51 terabytes equals about 2.5 million trees turned into printed pages. (The comparison comes from information posted at UC Berkeley and credited to scientist Roy Williams of the California Institute of Technology.)

In 2010, the assessment estimated, the demand for hosted data storage from UC Davis will swell to a nearly unimaginable 10 petabytes. Each petabyte is 1,000 terabytes. Added up, that's equal to 500 million trees turned to print.
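For readers who want to check the arithmetic, here is a quick sketch in Python (chosen purely for illustration; the article itself contains no code), using only the figures quoted above: the two storage estimates and the 50,000-trees-per-terabyte comparison.

```python
# A quick check of the storage comparisons above, using the article's own
# conversion: one terabyte of data is roughly 50,000 trees turned into
# printed pages (per the Roy Williams figure cited at UC Berkeley).
TREES_PER_TERABYTE = 50_000

def terabytes_to_trees(terabytes):
    """Convert a storage estimate in terabytes to the 'printed trees' comparison."""
    return terabytes * TREES_PER_TERABYTE

# Late-2005 estimate: 51 terabytes of hosted storage.
print(f"{terabytes_to_trees(51):,} trees")        # 2,550,000 -- about 2.5 million

# 2010 estimate: 10 petabytes, i.e. 10,000 terabytes.
print(f"{terabytes_to_trees(10_000):,} trees")    # 500,000,000 -- half a billion
```

Put another way, the jump from 51 terabytes to 10,000 terabytes is roughly a 200-fold increase in five years.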

The report, though, only goes so far. The response rate was 59 percent, and the survey didn't reach everyone it should have. That became clear when Doug Hartline, the IET director of technology, planning and development who ran the survey, started getting responses from people on campus whom neither he nor the people he consulted had known to ask. The unexpected responses came from researchers who learned about the study from their peers.

On a decentralized campus, the existence of research largely unknown outside its host department is not surprising. But it creates another hurdle tech planners must overcome: identifying and gathering requirements from researchers who mostly work on their own.

Mark Redican, director of the campus Network Operations Center (NOC), hears about some needs indirectly. A faculty question or requirement might reach Siegel's office, which then asks the NOC about it.

"So part of our challenge is to get out there and beat the bushes," he said. "We've got to go to them."

Conferences this spring

On April 5 and 6, the campus scheduled a cyber-infrastructure workshop in the Genome and Biomedical Sciences Facility. The goal: Consult numerous campus faculty to assess what researchers require, and to analyze where the resources available to the campus fall short. The results, plus an initial map for developing cyber-infrastructure at UC Davis, will be shared with the campus this spring. (Read about what happened at the workshop here.)

Siegel and Barry M. Klein, vice chancellor for the Office of Research, sponsored the event.

"Cyber-infrastructure is an increasingly important issue for the campus, and for higher education generally," said Babette Schmitt, director for strategic planning and communications in IET. "We're bringing together campus researchers, campus technical specialists, and representatives from national and regional cyber-infrastructure organizations so they can exchange ideas, discuss challenges, and explore opportunities for future collaboration."

The dialogue will help IET and the Office of Research understand the key campus challenges in cyber-infrastructure, as well as how to address them.

Based on these talks, Schmitt said, "we will start strategizing about ways in which we can improve support for the demands of current and future research computing."

While many faculty can get what they need from the campus information technology infrastructure, timely upgrades and expansions are necessary, said Bernd Hamann, a professor in computer science and associate vice chancellor for research.

The campus has untapped potential for interdisciplinary work, he said, such as engaging in major next-generation complex computational simulations.

Most of what researchers and graduate students need lies in the areas of computing power, space for computers, data archives and networking, Hamann said. The vast digital archives used by astrophysicists, cosmologists and high-energy physicists, for instance, require the ability to store, mine and explore these data sets in a distributed computing environment, where scientists work together all over the world. That requires advanced networking capacity.

In engineering, researchers looking into next-generation, simulation-based material design need space for clusters of computers, plus access to the clusters. In social sciences and the humanities, researchers increasingly rely on major digital archives in areas including history, economics, and cultural heritage. Their work often calls for remote access to digital data, too.

Building an advanced infrastructure, Hamann added, will allow the campus to attract the next generation of faculty and researchers who rely on this infrastructure in their work.

Some of the added IT capacity can arrive in a few years. Some must arrive by summer.

Too much heat ahead

Expanding demand has given Morna Mellor, director of the Data Center and Client Services in IET, an urgent task this year--either juice up the cooling power of the campus Data Center, or send some of its computing load off campus.

The center added more electrical power and air conditioning in 2006, blocking in windows and adding high-volume cooling ducts. But its cooling capacity is still too small for the heat its machines now produce, so the building cannot continue as is through another scorching Davis summer.

Short term, the campus might outsource some of its data-storage needs if it can't augment the Data Center in time. Longer term, it's looking at building a new data center, perhaps in four to six years.

Another key question is how to organize the infrastructure. Broadly speaking, the usual choice is between centralizing resources and distributing them throughout the campus, to allow researchers and faculty easy access to servers near their labs and offices (often called co-location).

Each option has its arguments.

There's a logical hierarchy for structuring the work, Hartline said: department, campus, regional and national levels. Some tasks should be done at each level. But by working together on common IT needs, researchers could devote more of their grants to their work, not to buying equipment to support the work.

The probable result, Schmitt said, is better research.

Making room for virtual orchestras

Here's a hypothetical example. Under the decentralized system the campus has generally followed, researchers in various fields, working separately, might acquire six servers for their projects. But centralizing and coordinating new infrastructure might let them meet their needs with four.

A more centralized structure would also help the "have-nots" by giving them access when the researchers aren't using the servers. Transmission demand for research varies. It might spike for 20 minutes at 700 megabytes per second, Redican said, then go away for a couple of weeks.

"With research," he said, "you have to build the network to handle your peaks."

The have-nots include faculty in the arts who want access to high-capacity networks and storage. Artists have used high-performance computing to conduct virtual orchestras, choreograph dancers working simultaneously on stages thousands of miles apart, or compare paintings against images stored in archives around the globe.

Centralization also maximizes the efficiency of cooling systems and other support equipment. The IET survey lists more than 40 high-performance computer applications that are driving the demand for cluster computing. They range from protein folding and modeling functional brain activation images to simulating optical networks and simulating human exposure to contaminants through groundwater. As a group, it's very advanced work.

S.J. Ben Yoo, a professor in Electrical and Computer Engineering and UC Davis director of the Center for Information Technology Research in the Interest of Society (CITRIS), conducts applied research in ultrahigh-performance networking and computing involving healthcare, the environment, and other areas.

His work on peta-scale optical routers and realizing a "data-center-on-a-chip" could produce tremendously useful changes--such as allowing the immediate transmission and routing of very high-density 3-D images between data centers, medical facilities, and patients, perhaps even from ambulances and desktops. That way patient care, even in remote locations, could start within minutes.

"The prospects for realizing a world-class UC Davis cyber-infrastructure are extremely bright," he said, crediting "strong collaborations between the faculty and IET under new leadership and initiatives."

Yoo prefers cyber-infrastructure resources to be co-located and distributed on "a seamlessly networked platform," but sees value in centralizing core services. "Either can work," he said. "The key is the reachability of support."

The cost of a major building

Siegel, who joined UC Davis last August, doesn't see just one plan emerging from this year's discussions. The campus needs to get a handle on some core pieces of infrastructure and services, and that will lead to specific initiatives, he said. But cyber-infrastructure is important enough that planning must continue all the time.

The consensus of outstanding faculty, Siegel said, will be a major influence in shaping campus priorities. Some of the work at UC Davis might be done by IET, and some might be done by other parts of the campus with support from IET.

Speaking generally, Siegel added, major research campuses consider it likely that they will spend as much on their cyber-infrastructure upgrades as they would on a major new building.

Interdisciplinary and multi-institutional partnerships are likely to grow as an area of federal research. That's good news for the campus financially. "Effective and agile institutions," Siegel said, "are likely to be the most successful at getting stable or increasing federal funds."

The campus, he said, "has a well-deserved national reputation for collaboration and interdisciplinary projects that actually work. Cyber-infrastructure can play an enormous role in amplifying that work."

"Research excellence is a priority," Siegel said. "This is crucial."