One of the most valuable impacts that technology can have is in the area of healthcare and scientific research. Aside from the cost savings and process improvements that technologies like cloud and data analytics can deliver, they can also be used to achieve goals that were previously unattainable.
Technology is playing a vital role in helping scientists to find cures and new treatments for diseases such as cancer, Ebola and Zika.
CBR’s James Nunns spoke to Jean-Christophe Ducom, high-performance computing manager, IT services at The Scripps Research Institute.
JN: What is your role at Scripps Research Institute and what does Scripps do?
JCD: “I am HPC manager, Information Technology Services at The Scripps Research Institute (TSRI). TSRI is one of the world’s largest independent, not-for-profit organisations focusing on biomedical research.
“We have approximately 2,700 employees spread over campuses in La Jolla, Calif., and Jupiter, Fla., and a roster of renowned scientists (including two Nobel Laureates) who collaborate on ground-breaking discoveries.”
JN: What research areas does Scripps focus on?
JCD: “We have 12 academic departments including Cancer Biology, Immunology and Microbial Science, Metabolism and Aging, Neuroscience and so on, but these departments have ‘soft boundaries’. This means that, rather than defining circumscribed areas of research, our faculty members have the freedom to follow their intellectual curiosity where it leads them.
“In real-world terms, we’re helping to lay the foundation for new and innovative ways to treat cancer, rheumatoid arthritis, haemophilia and other diseases, and we have also been at the forefront of combating infectious and deadly viruses, such as HIV, Ebola and Zika.”
JN: What technology, tools, and processes does Scripps use in its research?
JCD: “Technology-driven research has always been a hallmark of the institute, so we provide research groups with access to the latest technology for both instrumentation (for example microscopy and sequencing) and analysis of the data this produces.
“One of the latest technologies whose use we are pioneering is cryo-electron microscopy (Cryo-EM). The significance of Cryo-EM is that it enables scientists to look more closely at the inner workings of organelles (tiny structures that perform specific functions within a cell) and to study the structure of medically important proteins.
“Critically, it delivers atomic-level, high-resolution 3D molecule models with unprecedented speed and accuracy and provides a more complete description of molecular movements than previously possible.
“To support our use of Cryo-EM we’ve made other significant changes. We’ve recently deployed state-of-the-art transmission electron microscopes. We’ve developed new software to automate capture and analysis of massive amounts of data generated by the latest microscopes. And we’ve also developed a new extensive and sophisticated processing pipeline to streamline image analysis, so that scientists can move quickly from raw data to 3D structures.”
JN: What are the technology challenges that research organisations like Scripps face?
JCD: “One of the greatest technology challenges faced by research organisations is in dealing with the growing amounts of data produced as instrumentation technology advances: firstly to simply manage this data and, secondly, to most effectively harness it, so that research can be achieved at the highest possible level.
“Cryo-EM has quickly become the biggest producer of data at TSRI, yielding four times more output than our genomics workloads. We’re collecting about 30TB of data each week, and obviously this major surge in data acquisition created an urgent need to extend our existing storage infrastructure.
“On several occasions, for example, TSRI scientists were forced to archive existing data as a temporary fix, so that they had sufficient space to continue running the microscopes.”
JN: How are you dealing with data growth?
JCD: “For more than a decade, we have refined our high-performance computing (HPC) and storage infrastructure as needed to handle growing amounts of data. Five years ago, we incorporated high-performance parallel file storage from DataDirect Networks (the DDN SFA10K® GRIDScaler®) to support our HPC environment, and we recently expanded it by 700TB.
“And when it’s time to take a fresh look at our needs we do so. So when it became clear that Cryo-EM needed its own storage to keep up with ever-increasing data demands, we turned our attention to finding a departmental solution that could be dedicated to Cryo-EM research.”
JN: What are the storage requirements around active research data and archiving? Where do you store the data?
JCD: “The requirements for active research data are that it must be easily ingested, processed, stored, archived and shared. Storage needs to be easily and cost-effectively expanded to meet escalating requirements.
“When it comes to archiving, the main issue is that we need to equip researchers with a simple and expedient way to archive older data to free up space, yet keep it readily available for later re-analysis when required – what we call an ‘active archive’.
“With our Cryo-EM storage needs, we achieved this with a combination of DDN’s SFA7700X GRIDScaler® appliance and WOS® object storage appliance with 2PB of capacity.”
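To put the figures quoted above in perspective, here is a rough back-of-the-envelope sketch in Python. It is only an illustration, not a description of TSRI's actual capacity planning: it assumes decimal units (1PB = 1,000TB), a constant ingest of 30TB per week, that the full 2PB is available to Cryo-EM data, and that nothing is deleted or compressed. The function name is hypothetical.

```python
# Figures quoted in the interview
WEEKLY_INGEST_TB = 30   # "about 30TB of data each week"
CAPACITY_PB = 2         # "2PB of capacity"

# Assumption (not from the interview): decimal storage units
TB_PER_PB = 1000


def weeks_until_full(capacity_tb: float, weekly_ingest_tb: float) -> float:
    """Weeks of ingest the capacity can absorb at a constant rate."""
    return capacity_tb / weekly_ingest_tb


weeks = weeks_until_full(CAPACITY_PB * TB_PER_PB, WEEKLY_INGEST_TB)
print(f"{weeks:.1f} weeks (~{weeks / 52:.1f} years)")
```

Under these assumptions, the quoted capacity would absorb roughly 15 months of continuous ingest, which helps explain why easy, cost-effective expansion is listed as a requirement.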
JN: What are the benefits of using both a parallel file system and object storage?
JCD: “The benefit of using both systems is that the parallel file system offers ample capacity to store up to six months of active research data for immediate accessibility, while older project data can be moved automatically to the object storage system, WOS.
“We achieve significant cost savings by moving older project data from primary storage to the less expensive object storage platform while, crucially, maintaining accessibility for collaboration. And because data movement is transparent, users don’t need to know where files are stored, which is key.”
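The age-based tiering described in this answer can be sketched in a few lines of Python. This is a minimal illustration of the idea only, not DDN's policy engine or TSRI's configuration: the six-month window is approximated as 182 days, age is measured from last modification, and the tier names, file paths and `choose_tier` function are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

# Assumption: "up to six months of active research data" ~ 182 days
ACTIVE_WINDOW = timedelta(days=182)


@dataclass
class ProjectFile:
    path: str                 # hypothetical example path
    last_modified: datetime


def choose_tier(f: ProjectFile, now: datetime) -> str:
    """Return 'parallel_fs' for recent data, 'object_store' for older data."""
    age = now - f.last_modified
    return "parallel_fs" if age <= ACTIVE_WINDOW else "object_store"


now = datetime(2017, 1, 1)
files = [
    ProjectFile("/cryoem/projA/run1.mrc", datetime(2016, 11, 15)),
    ProjectFile("/cryoem/projB/run7.mrc", datetime(2016, 3, 1)),
]

# Map each file to the tier it would live on under this policy
placement = {f.path: choose_tier(f, now) for f in files}
print(placement)
```

In a transparent setup of the kind described, users would keep using the same path regardless of which tier the policy selects; only the placement decision is modelled here.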
JN: How has this approach helped scientists and researchers at Scripps?
JCD: “It’s helped a great deal. Because data movement is automated and transparent, approximately 50 scientists across six research groups have instant access to robust storage to fuel Cryo-EM research and collaboration. Our scientists based in California can share research with their counterparts in Florida and also with thousands of scientists around the world.”
JN: What tools and processes were you using prior to technology advancements? Can you quantify the impact new technology has had?
JCD: “New technology has had a huge impact. Because Cryo-EM delivers atomic-level, high-resolution 3D molecule models with unprecedented speed and accuracy, structures that once took years or even decades to fully understand now often can be elucidated in weeks.
“This breakthrough allows us to design drugs or vaccines to combat a great swath of diseases.
“One of our recent studies, for example, used Cryo-EM technology to capture the structure of the HIV protein responsible for recognition and infection of host cells. The resulting images included a more complete depiction of the protein structure than ever seen before. Study findings also included a detailed map of a vulnerable site at the base of this protein, along with a binding site of an antibody that can neutralise HIV. This gives researchers a better idea of the most important factors to consider in the development of an HIV vaccine.
“To give you another example: in years past, using X-ray crystallography, it could take a year or more before researchers could look at the protein structure of diseases such as Ebola and the Zika virus in order to develop antibodies. With Cryo-EM and the ability to collect and analyse data really fast, we can turn that discovery process around in a matter of weeks, which was previously unheard of.
“In a nutshell, Cryo-EM is a complete game-changer in the world of scientific research. By effectively harnessing data that holds the secret to life-saving discoveries, we can accelerate time to discovery while ensuring that scientists have immediate access to decades of vital research.”
JN: Do you bring in data from other sources in order to help with your research? Particularly thinking around Ebola and Zika, using data provided by other institutes in order to identify patterns etc.
JCD: “Yes, collaboration is key. Scientists at The Scripps Research Institute (TSRI) are involved in a number of scientific consortia. Each consortium brings together the knowledge, resources and expertise of a number of research institutes and organisations to pursue a specific line of scientific inquiry. By collaborating with other institutes, each member of a consortium is better able to leverage its resources to advance scientific knowledge.”