Our Assets: Regional Hardware and Networking Resources

Children’s Mercy Hospitals

The Core of Genetic Research, directed by Dr. Shui Qing Ye, is equipped with a compute cluster (Advanced Computer Clustering, Inc., Kansas City, KS). One head node and six compute nodes are installed in a dedicated rack with room for several more nodes. The cluster provides 96 cores (6 x dual 8-core Intel Xeon E5-2670 “Sandy Bridge” 2.6GHz processors) with 384GB of DDR3 RAM and 48TB of SATA hard drives, plus a Quantum SuperLoader3 2U 16-slot tape library for backups. It is located within a dedicated data center with environmental controls, conditioned power, and hospital emergency back-up power. Deployed on the cluster are the latest versions of the GATK, CASAVA, Bowtie, TopHat, Cufflinks, R with CummeRbund, Python with NumPy and SciPy, BreakDancer, Plink, and Haploview software applications.

Center for Pediatric Genomic Medicine, directed by Neil Miller, has five full-time bioinformatics/IT staff. The Center’s compute resources are located within a dedicated data center with environmental controls, 15 tons of air conditioning, conditioned power, hospital emergency back-up power, and 45 kVA UPS capability. The compute resources comprise:

  • a 608-core Linux compute cluster with 6TB of DDR3 RAM and 20TB SATA hard drives (20 x 12-core Intel Xeon X5670 nodes, 8 x 16-core Intel Xeon E5-2650 nodes, and 12 x 20-core Intel Xeon E5-2660 nodes);
  • redundant head nodes (12-core Intel Xeon X5670 with 48GB RAM and a 500GB SATA drive);
  • a pipeline server (12-core Intel Xeon X5670 with 96GB RAM and a 1TB SATA hard drive);
  • an Isilon X400 storage system with 810TB usable capacity;
  • an SGI InfiniteStorage Gateway disaster recovery and backup appliance with 160TB usable capacity;
  • a Spectra Logic T950 tape library with 2.4PB uncompressed usable capacity;
  • redundant web servers (12-core Intel Xeon X5670 with 48GB RAM and a 500GB SATA drive); and
  • a database server (12-core Intel Xeon X5670 with 96GB RAM and 16TB SATA drives), on which the LIMS, GATK, GSNAP, CASAVA, SSAGA, CMH variant warehouse, RUNES, and VIKING software systems are deployed.

The data center is adjacent to the room housing the DNA sequencers, which also features environmental controls to maintain an ambient temperature of 65 degrees Fahrenheit, conditioned power, hospital emergency back-up power, and substantial UPS capability.

Kansas State University

Beocat/Institute for Computational Research in Engineering and Science
Daniel Andresen, dan@ksu.edu, 785-532-7914
http://www.beocat.cis.ksu.edu

One of the largest academic research clusters in Kansas, Beocat consists of approximately 2PB of storage and ~3100 processor cores on machines ranging from dual-processor Xeons with 128GB RAM to six 80-core Xeons with 1TB RAM, connected by 40Gbps QDR InfiniBand. Beocat acts as the central computing resource for multiple departments across campus, and heavy users can “buy in” by adding computational or personnel resources to the cluster. Any researcher in the state of Kansas can access Beocat for free.

KSU Bioinformatics Center
Sue Brown, Sjbrown@ksu.edu, 785-532-3935
http://bioinformatics.k-state.edu

Provides training and support for genomics, transcriptomics, and NGS approaches to address biological questions. As the US-certified service center for BioNano Genomics, we produce genome physical maps based on imaging of ultra-long DNA molecules.

Saint Luke’s Health System

The Biostatistics Core within the Cardiovascular Research Center at Saint Luke’s Mid America Heart Institute (MAHI) currently includes 12 statisticians (3 with doctorates, 9 with master’s degrees) and has extensive experience in the analysis of clinical and health services data. The statisticians are highly versed in issues pertaining to observational data, including selection bias, missing data, propensity and instrumental variable models, latent class and variable analyses, Bayesian analyses, multivariable and hierarchical modeling, and multiple imputation. A thorough assessment of potential biases is essential in every study and often comprises the majority of the analytic work. The team stays abreast of current methodological research and has a strong track record of applying new techniques to “real world” problems. The statisticians have access to and use state-of-the-art statistical packages, including SAS, SPSS, WinBUGS, R, S-Plus, M-Plus, PASS, Stata, and TreeAge. Finally, the statisticians have a strong commitment to working with trainees to enable them to refine their data management and analytic skills under close supervision, a critical step toward independence.
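
To illustrate one of the named techniques concretely, below is a minimal propensity-score weighting sketch in Python. The data file, column names, and covariates are all hypothetical; the Core’s actual analyses are far more extensive and are typically performed in packages such as SAS or R.

    # Minimal propensity-score weighting sketch. "cohort.csv" and all column
    # names are hypothetical; "treated" is assumed to be coded 0/1 and the
    # covariates numeric. Real analyses add overlap and balance diagnostics,
    # missing-data handling, and sensitivity checks.
    import pandas as pd
    from sklearn.linear_model import LogisticRegression

    df = pd.read_csv("cohort.csv")        # hypothetical observational dataset
    covariates = ["age", "bmi"]           # hypothetical measured confounders

    # Model the probability of treatment given the measured confounders.
    model = LogisticRegression(max_iter=1000).fit(df[covariates], df["treated"])
    ps = model.predict_proba(df[covariates])[:, 1]

    # Inverse-probability-of-treatment weights: 1/ps if treated, 1/(1-ps) if not.
    df["weight"] = df["treated"] / ps + (1 - df["treated"]) / (1 - ps)

    # Weighted mean outcome in each treatment group.
    for grp, sub in df.groupby("treated"):
        print(grp, (sub["outcome"] * sub["weight"]).sum() / sub["weight"].sum())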

University of Kansas

KU Gene Mapping: Genotype Mapping and Structural Genomics

  • Genome Sequencing Core lab in Haworth
  • KINBRE Bioinformatics facility in Haworth

KU RNAi: The Natural History Museum DNA sequencing lab, the COBRE genomics sequencing lab, and the Genomics Center have capabilities in this area.

KU Transcriptomics: Gene expression

  • Genome Sequencing Core lab in Haworth
  • KINBRE Bioinformatics facility in Haworth

KU Epigenomics

  • Genome Sequencing Core lab in Haworth
  • KINBRE Bioinformatics facility in Haworth

KU Biostatistics: The CRMDA (Center for Research Methods and Data Analysis) quantitative psychology facility/core fulfills this function.

KU Applied Bioinformatic Core: The molecular graphics core, the CRMDA, and Luke Huan’s laboratory together cover approximately 75% of this function.

KU Large Dataset Capabilities: We have large dataset storage and handling capabilities in the Center for Research Computing and with the Research File Storage system.

University of Kansas Medical Center

Bioinformatics Computing Resources (KUMC)

Hardware

Linux x86_64 Cluster

  • 3 nodes, each with 8 cores and 48GB memory

Linux x86_64 servers

  • 12-core with 72GB memory
  • 24-core with 128GB memory
  • 12-core with 144GB memory

Windows Workstations

40TB Data Storage

Software

We use a number of open-source software packages, such as R, for data processing and analysis. Our proprietary software licenses are as follows.

Matlab

  • Bioinformatics Toolbox
  • Computer Vision System Toolbox
  • Curve Fitting Toolbox
  • DSP System Toolbox
  • Fixed-Point Toolbox
  • Image Processing Toolbox
  • Neural Network Toolbox
  • Optimization Toolbox
  • Parallel Computing Toolbox
  • Signal Processing Toolbox
  • SimBiology
  • Statistics Toolbox
  • Wavelet Toolbox

Partek Genomics Suite

Acumenta Literature Lab™

Ingenuity IPA

IBM SPSS

CLC Genomics Workbench

Bioinformatics Services

We provide a variety of services in bioinformatics and computational biology. Some of these services are listed below. Please feel free to contact the Bioinformatics Core for more information on these services. The Bioinformatics Core will also be happy to discuss with you the feasibility of supporting custom applications specific to your research.

High-throughput sequencing

  • RNA-Seq: provides unbiased, deep coverage and base-level resolution of the whole transcriptome, with low background signal and no upper limit of quantification (see the sketch after this list).
  • ChIP-Seq: combines chromatin immunoprecipitation with high-throughput sequencing to provide unbiased whole-genome mapping of the binding sites of DNA-associated proteins.
  • Whole Genome Sequencing: determines the complete DNA sequence of an organism’s genome.
  • De novo Sequencing: provides the primary genome sequence of an organism without the aid of a reference.
  • Metagenomic Sequencing: sequences and identifies the genomes of whole microbial communities.
  • Methyl-Seq: analyzes methylation patterns on a genome-wide scale.
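
As a small illustration of the kind of downstream computation RNA-Seq data requires (a sketch only, not the Core’s production pipeline; the count file and sample names are hypothetical):

    # Minimal RNA-Seq normalization sketch. "gene_counts.csv" and the sample
    # names "treated"/"control" are hypothetical; production analyses use
    # dedicated statistical packages rather than raw fold changes.
    import numpy as np
    import pandas as pd

    counts = pd.read_csv("gene_counts.csv", index_col=0)  # rows: genes, columns: samples

    # Counts per million: scale each library to a common sequencing depth.
    cpm = counts / counts.sum(axis=0) * 1e6

    # log2 fold change between two samples, with a pseudocount of 1 so that
    # unexpressed genes do not produce log2(0).
    log2fc = np.log2((cpm["treated"] + 1) / (cpm["control"] + 1))
    print(log2fc.sort_values(ascending=False).head())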

Microarray analysis

  • Affymetrix 3′ Expression Arrays: target the 3′ end of genes.
  • Affymetrix Exon Arrays: provide expression levels for every known exon in the genome.
  • Affymetrix miRNA Arrays: provide measurements of small non-coding RNA transcripts involved in gene regulation.
  • Affymetrix Genome-Wide Human SNP Array: copy number analysis.
  • Affymetrix GeneChip Tiling Arrays: gene regulation analysis.

Biological Functional and Pathway Analysis: we have software from Ingenuity Systems (IPA) that can analyze your expression data to ascertain the top biological functions and pathways associated with them.
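
IPA itself is proprietary, but the core statistical idea behind such functional enrichment is an over-representation test. Below is a minimal sketch with entirely hypothetical gene counts:

    # Hypergeometric over-representation test sketch (hypothetical numbers);
    # IPA applies its own curated knowledge base and statistics.
    from scipy.stats import hypergeom

    genome_size = 20000   # genes in the background set
    pathway_size = 150    # genes annotated to the pathway
    list_size = 300       # genes in the experimental list
    hits = 12             # pathway genes found in the experimental list

    # P(X >= hits) when drawing list_size genes without replacement.
    p_value = hypergeom.sf(hits - 1, genome_size, pathway_size, list_size)
    print(f"enrichment p-value: {p_value:.3g}")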

Biological Literature Survey: we have software from Acumenta (Literature Lab) that helps perform data mining tasks on experimentally derived gene lists.

miRNA target prediction: we use in-house software and open-source tools such as TargetScan and miRanda to detect genomic targets of miRNAs.
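
The sketch below illustrates the seed-matching principle such tools build on: scanning 3′ UTRs for the DNA reverse complement of the miRNA seed (positions 2-8). The UTR sequence is hypothetical, and real predictors add conservation, site-context, and thermodynamic scoring:

    # Minimal miRNA seed-match sketch; the UTR is hypothetical, and this is
    # far simpler than the scoring used by TargetScan or miRanda.

    def seed_site(mirna):
        """Return the DNA reverse complement of miRNA seed positions 2-8."""
        comp = {"A": "T", "U": "A", "G": "C", "C": "G"}
        return "".join(comp[b] for b in reversed(mirna[1:8]))

    mirna = "UGAGGUAGUAGGUUGUAUAGUU"      # let-7a
    utrs = {"geneX": "ATCGCTACCTCAGGTT"}  # hypothetical 3' UTR fragments

    site = seed_site(mirna)               # "CTACCTC" for let-7a
    for gene, utr in utrs.items():
        if site in utr:
            print(f"{gene}: candidate let-7a seed match {site}")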

Transcription Factor Binding Site Prediction: we use in-house software and open-source tools such as MEME, Homer, and PGMotifScan to identify protein-DNA interaction sites.
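
At bottom, such tools score candidate sites against a position weight matrix (PWM). Below is a minimal sketch with a toy log-odds matrix and a hypothetical sequence; MEME and Homer use much richer motif models:

    # Minimal PWM scan sketch; the matrix values and sequence are toy examples.
    import numpy as np

    # Log-odds scores for a toy 4-position motif; columns are A, C, G, T.
    pwm = np.array([[ 1.2, -0.8, -0.5, -1.0],
                    [-1.0,  1.5, -0.7, -0.9],
                    [-0.6, -0.9,  1.4, -1.1],
                    [ 1.1, -1.2, -0.8, -0.4]])
    index = {"A": 0, "C": 1, "G": 2, "T": 3}

    def score(window):
        """Sum the per-position log-odds scores for one k-mer window."""
        return sum(pwm[i, index[b]] for i, b in enumerate(window))

    seq = "TTACGATACGAGT"                 # hypothetical promoter fragment
    k = pwm.shape[0]
    best = max(range(len(seq) - k + 1), key=lambda i: score(seq[i:i + k]))
    print("best site:", seq[best:best + k], "score:", round(score(seq[best:best + k]), 2))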

University of Missouri

Informatics Institute: MUII’s mission has three components: education, research, and outreach. To have a global impact in advancing computational research in biology and medicine, stakeholders across the University of Missouri System believe it is critical to the mission of the university and the development of the community to offer a doctoral program in informatics, build an international-level research program, and provide service to scientific communities with informatics needs. For information on the Core Faculty, see http://muii.missouri.edu/people.php. MU faculty and students have developed a wide array of informatics tools and databases; please see the listing at http://muii.missouri.edu/resources.php

Informatics Research Core Facility: The mission of the IRCF is to facilitate research and education through the development of computational resources. Computational resource development ranges from software creation to database design to hardware configuration consulting. IRCF staff members are skilled in these activities and are available for a range of services. The IRCF is committed to serving the informatics needs of the research community. Learn more at http://ircf.RNet.missouri.edu:8000/.

Bioinformatics Consortium: The University of Missouri Bioinformatics Consortium (UMBC) maintains a high-performance computing, networking, and storage infrastructure to support research across the UM-System. The UMBC provides centralized, high-capacity data storage and analytical tools that can be used over high-speed Internet2 connections in support of research. The UMBC coordinates the provision of high-performance computational systems to analyze massive data sets, very large storage devices to house major data collections, high-speed networking services to facilitate location-independent access and collaboration among investigators, software applications supporting system-wide computational research, and technical support staff. Scalable, high-capacity storage is available to house many widely used and specialized genomic databases, along with analysis tools and packages to support the analysis needs of the research community. These systems, and various specialized laboratory equipment (e.g., high-throughput sequencing systems and an electron microscope), are connected via the 10G research network that is accessible to all four University campuses as well as the national Internet2 network and other international networks widely used by the higher-education research community. For more information, see: http://umbc.rnet.missouri.edu/

MU is acquiring additional HPC resources. A significant upgrade to the ‘general purpose’ HPC cluster has been purchased and will be installed in the weeks ahead. In addition, a cluster to meet the high-memory requirements of MU’s genomics researchers will be purchased soon, and funds have been provided to purchase additional disk storage for research computing.

MU’s current HPC system has 190 nodes that include Dell 1850/1950 dual-processor/dual-core nodes with 2.8 GHz Intel Xeon EM64T processors and 640 GB RAM. Additional HPC infrastructure access is available through the University of Missouri Bioinformatics Consortium (UMBC) system, which consists of 8 IBM 3850M2 quad-core compute nodes (64 processors), with 24GB RAM on 7 nodes and 48GB RAM on 1 node, and two IBM DS3400 storage nodes with 24TB data capacity connected through a GPFS parallel file system on a QDR InfiniBand low-latency interconnect. In late 2013, MU purchased over 500 TB of usable disk storage from EMC/Isilon and HP/3PAR, thereby bringing total research computing disk storage to more than a petabyte of usable storage. This storage environment supports current-generation QDR InfiniBand (up to 80 Gbps bandwidth), two IBM 3850X5 compute nodes each with 500GB RAM, and an IBM 3850M3 node containing 2 Tesla M2070 GPU boards, which enables MU researchers to begin testing home-built algorithms and applications using GPU technology. As a result of the upgrades made to the research CI over the past two years, the systems are positioned so further upgrades can be made without major processing disruptions. Learn more at http://umbc.RNet.missouri.edu/resources/

Storage resources (connected to the computing resources via high-speed data links):

  • 110 TB EMC storage array managed by the IBRIX distributed file system.
  • 948 TB Isilon storage cluster.
  • 24 TB IBM DS3400 storage array.
  • 12 TB SGI TP9500 InfiniteStorage storage array.
  • 112 TB 3PAR storage array.

Networking: MU’s research network infrastructure and connectivity enable MU researchers to leverage and participate in various national-level advanced cyberinfrastructure efforts such as the InCommon Federated Identity Management Service, XSEDE for access to HPC/Big Data resources and expertise, and the GENI Future Internet Testbed. In fact, MU researchers are leading several GENI experiment efforts and are developing gigabit applications using national-level testbeds involving the health care and advanced manufacturing communities, as part of the US Ignite initiative supported by NSF and The White House Office of Science and Technology Policy. MOREnet also connects RNet today at 10 Gbps network speeds (a 100 Gbps connectivity option is possible in the future with MOREnet’s current fiber and optical infrastructure) to international R&E network peering points such as StarLight in Chicago, and thus MU researchers have the ability to create and participate in international R&E testbeds with researchers worldwide.

Internet2: Internet2 is an advanced networking consortium led by the research and education community, spanning US and international institutions that are leaders in research, academia, industry, and government. The Internet2 community is developing breakthrough network technologies that support the most exacting applications of today and spark the most essential innovations of tomorrow. MU has belonged to Internet2 since its inception in 2000. Learn more at http://www.internet2.edu

Internet2 Innovation Campus: MU is one of a select few universities designated as an Internet2 Innovation Campus. These campuses form the building blocks of a nationwide Research & Education (R&E) innovation platform, helping to create an environment for innovation at leading research universities. Leveraging the Internet2 Network and enabling services like InCommon federated identity management, Internet2 recently began offering a portfolio of services and discounts to Internet2 members, including cloud services and video services. Learn more at http://www.internet2.edu/network/ or http://internet2.edu/pubs/IS-enabling-innovation-campus.pdf

RNet: MU was among the first US research universities to create a separate research network (RNet) in addition to the traditional campus enterprise network. Serving MU’s research community since 1999, RNet is administered separately from, but is interconnected with, MU’s high-speed campus network. RNet has an autonomous set of virtual local area networks (VLANs) that co-reside within the University’s internet address space but run on a separate high-speed routing and switching infrastructure. There is no charge for using RNet. Major research labs and scientific instruments on the MU campus are connected to RNet with 1 – 10 gigabit per second interfaces, which can be configured with researcher-friendly firewall policies. IPv6 routing capabilities are supported within RNet through a dual-stack setup, and a separate IPv6 address space is reserved for researcher flows. RNet is also connected to “TigerNet”, which provides high-speed Ethernet and wireless connectivity to most on-campus offices, classrooms, conference rooms, computing sites, residence halls, and some fraternities and sororities. See also: http://doit.missouri.edu/research/RNet.html

RNet Science DMZ (Demilitarized Zone) Protection Environment: A Science DMZ environment is being developed (through 2013-14 NSF CC-NIE funds) with minimal firewall restrictions to enable “friction-free” RNet data flows internally within the campus and externally to other regional and national network locations. Current Science DMZ capabilities include: 100 Gbps connectivity to the Internet2 Innovation Platform, perfSONAR multi-domain measurement points to troubleshoot network bottlenecks, Bro-based intrusion-detection monitoring of RNet flows, Data Transfer Nodes with RoCE/iWARP capabilities for fast data transfers over wide-area networks, and an OpenFlow switch infrastructure. Several software-defined networking research collaborations with leading industry vendors such as Cisco, Brocade, and NEC, as well as with remote campuses such as The Ohio State University and Clemson University, are aiding the maturation of OpenFlow support for domain science researcher use cases on the MU campus. Shibboleth-based authentication and authorization services also enable secure access to Science DMZ resources.

RNet External Network Connectivity: RNet enables access to high-performance computing (HPC) resources throughout the four-campus University of Missouri System through a core fiber-optic network and 10 Gigabit (Gb) optics operated by the Missouri Research and Education Network (MOREnet). This MOREnet connectivity enables MU researchers to connect, via high-speed networks and collaboration services such as videoconferencing, with Missouri’s 900-node research and education network serving higher education, K-12 education, telehealth sites, public libraries, state government, and their affiliates. Connectivity to Internet2 from RNet is available directly if needed, and through MOREnet and the Great Plains Network (GPN) consortium. The direct connection to Internet2 is being upgraded to 100 Gbps connectivity, with built-in redundancy provided by MOREnet for route protection to avert network disruption due to faults. The following figure depicts the 100 Gbps network connectivity. See also: http://doit.missouri.edu/research/RNet.html

Missouri Research and Education Network: MOREnet provides high-speed Internet access to the state of Missouri’s K-12 schools, colleges, universities, public libraries, state government, and other affiliates. MOREnet also provides access to online reference resources, technical expertise, security education, videoconferencing, and more. Starting in the mid-1980s, MOREnet designed, and the state’s telecommunications providers constructed, an advanced, high-speed, high-bandwidth network throughout the state, laying the groundwork for Internet availability to thousands of rural Missourians. MOREnet was one of the first five state educational networks to receive designation as a Sponsored Educational Group Participant (SEGP) for Internet2. Learn more at www.more.net/

Great Plains Network: Administratively housed at the University of Missouri, the Great Plains Network (GPN) was founded in 1997 to address the needs of the research and education community resulting from increasingly overwhelming use of the public Internet. GPN members include over 20 leading universities in eight states. GPN was the first regional connector to Internet2, and GPN continues to lead in support of research collaboration, education and advanced networking for member institutions. MU and the GPN have been instrumental in developing and proving scalability of Shibboleth as a security model for fine-grained authorization needs across multiple institutions and in large high-speed networks. Learn more at www.greatplains.net.