Beginning his career as a biologist exploring evolution and ecology, and later phylogenetics at the Smithsonian Institution, Dr. James Wilgenbusch was drawn to the statistical side of these fields. “I was intrigued by the computational component of phylogenetics and how phylogeny inferences are computationally intensive,” says Wilgenbusch. “As I was joining this field, I discovered there were several interesting statistical issues that had yet to be solved, as well as a lack of resources to address these very large computational problems. I decided to roll up my sleeves and figure out how to conduct the type of analyses that would develop a better picture of how things worked. This included exploring the history of the evolution of organisms, which we were inferring from DNA sequence information extracted from the organisms.”
As computing capabilities began to grow in the late eighties and early nineties, Wilgenbusch says it was an exciting time of discovery as they explored the tools and the compute environments needed to advance this work. “You could make real impacts in terms of developing large-scale systems that would address some of the emerging challenges that we were facing across a variety of disciplines, not just phylogeny. I believe this experience brought me to my current career as the Director of Research Computing, in the Research and Innovation Office at the University of Minnesota (U of M), Twin Cities. At that time, I had the opportunity to work in high-performance computing, supercomputing, and research computing, putting me at the nexus of people who were trying to solve problems from just about every conceivable discipline, from the humanities, to medicine, to astrophysics. I truly enjoy engaging with multiple disciplines and over the last twenty years, my role has grown to serve the entire research and scholarly community.”
Steering Computational Research
Dr. Wilgenbusch’s extensive experience has helped inform his approach to his current role as Director of Research Computing and his involvement in promoting and developing research cyberinfrastructure. “In the past, we frequently had to make a case for computation as the third pillar of scientific research in combination with theory and experimental approaches,” says Wilgenbusch. “Now we have the fourth paradigm, which is focused on data intensive research, which includes areas like artificial intelligence (AI) and machine learning (ML). Data has certainly been an enabling factor that has transformed many disciplines and put more demands on research computing. Having experience at these different levels of research enablement has allowed me to better steer the strategic direction of research projects, including project requirements, expectations, and timelines, and ensuring that faculty and researchers are supported at all levels and stages of their effort, which has helped to boost the success of each project.”
Heading up four major computational and data-intensive research areas under the Research and Innovation Office, Wilgenbusch leads University-wide initiatives, collaborates with collegiate senior administrators, and works with faculty to expand research computing and new programs. In Spring 2023, a large group of thought leaders from federal and state agencies, the private sector, and U of M convened to develop a strategic plan called Research 2030, which outlines the University’s vision and priorities for increasing its impact locally, nationally, and globally and for expanding the recognition and competitiveness of its research.
“We continue to advance this plan and feel we have a lot to offer around equity and access,” says Wilgenbusch. “U of M has a long history in biomedicine and biomedical research, so that will be among our top priorities. We also have fantastic research happening in environmental science, agriculture, and forestry. Being in the upper Midwest, there is a tremendous amount of research being done to determine ways to minimize our global footprint, specifically in terms of CO2 production and preserving our water resources. We are exploring new ways to manage our resources that help minimize the impact on our environment. UMN-based programs and institutes like AI-CLIMATE and the Institute on the Environment are good examples of these efforts.”
Enabling Research Computing
Developing world-class research does not happen overnight, and Wilgenbusch says the higher education community needs to help shape the public’s perception of research. “We must work with public entities to help them better understand the research life cycle, how it’s done, why it’s done, the potential impact, and what societal benefits could be provided. The work that we do in research computing has much to do with how we scale up problems. While we continue to explore and develop tools that scale up with large data sets, we must also address the variety of big data that can be used to solve problems. We often discuss big data in terms of its volume, velocity, value, variety, and veracity. The variety of data needed to address real-world problems can be a real challenge, in no small part because of issues related to interoperability.”
“We’re trying to significantly reduce the time to results by introducing key metadata at the beginning, when data is uploaded into the analysis platform,” continues Wilgenbusch. “We aim to enter metadata more seamlessly, using automation where possible while respecting a variety of different ontologies and data vocabularies. The University also understands we must be good stewards of data and create frameworks that respect the personal privacy and intellectual property concerns linked to some data sets. I think we’ve made some significant strides in protecting research data, and as a result, we hope to demonstrate the value of this approach to the public.”
Creating a framework that enables the entire gamut of research computing resources at U of M has allowed the University to scale up staffing and expand the variety of research domains that are being supported across the institution. “The University’s commitment to our mission of supporting computation and data intensive research is the bedrock on which we build trust among our researchers,” shares Wilgenbusch. “By investing in our research enterprise, we develop good governance and are able to plan more effectively because our faculty understands the landscape of computing and data resources that will be available to them.”
“The framework for supporting the research computing deployed by Dr. Wilgenbusch at the University of Minnesota is a model to be replicated,” says Dr. Forough Ghahramani, Assistant Vice President for Research, Innovation, and Sponsored Programs for Edge. “As a research and education (R&E) network, we are in a unique position to share this best practice with the community.”
Forming Collaborative Partnerships
Wilgenbusch hopes the framework created at U of M can be generalized to apply to other academic institutions and help those organizations expand the reach of their own research and innovation. “This model is similar to what has been broadly implemented around cluster computing,” explains Wilgenbusch. “For example, there is a range of expectations from open resources that are available on a first come, first served basis, to completely dedicated resources that are available at the time and for the duration that they’re needed. We also see that staff efforts can be modeled in a similar manner and we are applying a similar framework to the human element of computing and data to set clear expectations. For instance, staff are available to address general questions, but there is also a need for dedicated staff who can engage with groups on a regular and dedicated basis, which could constrain them from being more generally available.”
“We’ve structured the model in such a way that we’re available for 85 to 90 percent of the people using our systems without any chargeback,” continues Wilgenbusch. “For those individuals whose needs exceed what we can support sustainably, then some shared financial contribution will need to be established. I believe this general framework could be customized according to each institution, based on their appetite for shared contributions to the research, computing, and the data resources that they maintain and their own individual journey of building a successful research computing environment.”
In developing this framework, Wilgenbusch says one of the most important lessons is valuing partnership and avoiding trying to control every aspect. “Forming partnerships is essential because you cannot do it alone. I look to these partners for their unique perspectives and expertise and depend on them for support to advance the research mission. I want to bake this ethos into everything that we do within our university community.”
Advancing Research on a Grander Scale
To answer questions promptly and respond to researchers, U of M uses a tiered model of support that ranges from general needs to highly dedicated needs. “We have general consultations through a basic ticketing system that we answer in a timely fashion,” explains Wilgenbusch. “This is part of our core service that is available at no cost to our university researchers. As we move along the continuum toward dedicated, we have what we call collaborative engagements. These are typically projects where we may dedicate some internal resources and funding to a limited number of collaborative efforts, especially when these efforts line up with the greater good of the community and result in solutions that are more generally applicable to other groups. At the other end of the spectrum are our partnership engagements, which involve significant resources and require direct chargeback, so that we can sustain our work. We have over two dozen staff who are engaged in these types of dedicated partnership agreements where research computing staff may be working at a rate of 100 percent for a particular partnership. The terms of these partnerships are described in a memorandum of understanding, and we have templated this memorandum so it can easily be used in other contexts.”
Dedicated to helping advance research on a national scale, Wilgenbusch plays a variety of roles in organizations such as the National Science Foundation’s Big Data Regional Innovation Hubs, the Coalition for Academic Scientific Computation (CASC), and the Campus Research Computing Consortium (CaRCC). “I hope I’m bringing value to the broader community through my experience,” says Wilgenbusch. “Engaging with others in the education community allows us to gain insights into what is working on our campuses, regional organizations, and beyond. This bidirectional exchange of information is hugely valuable and allows us to effectively execute research that has a global impact. Regional research and education networks play an important role in enabling these collaborative partnerships and providing essential infrastructure that institutions need to advance their research initiatives. There is an increasing need for team science where facilities are aligned with the compute infrastructure and the regional and national networks. This allows us to do a better job creating an environment for teams to effectively work and collaborate and fill that missing middle where data intensive research happens.”
Leadership at U of M is focused on making a broader impact within the state and the region and helping to address the equity and access issues that many institutions face. “To continue to fulfill its land grant mission, U of M sets diversity and inclusion as a strategic priority and aims to help close opportunity gaps and empower more individuals to pursue their dreams of higher education,” says Wilgenbusch. “We must also support equity within the research space and give researchers the tools they need to be successful, whether they’re at a national institute or within a smaller, more specific area of research. Many federal funding agencies are now asking investigators to address how they’re broadening participation in their research, and we are developing a suite of programs that help form these working relationships between smaller institutions and larger research universities.”
“I’m looking forward to playing a role in creating these internal and external pathways for groundbreaking research and clearing the barriers between the silos that can often form within an organization,” continues Wilgenbusch. “It’s a privilege to be directly involved in the many layers that affect research and shape research computing at the University. I’m excited to further grow the partnership framework and help great minds come together to advance statewide, regional, and national research.”