This page describes a cloud-aligned research project whose purpose is to advance scientific understanding of the hydrology of the mountain ranges of High Mountain Asia. The project lead for this NASA-sponsored High Mountain Asia Team (HiMAT) is Dr. Anthony Arendt, a UW Research Scientist.
Objective and Approach
- Recent activity
  - Juneau meeting with Chinese Academy of Sciences: Completed with presentation decks recovered
- Work in progress
  - Accumulating a comprehensive overview of the NASA projects and their interrelationships
  - Selected contributor datasets placed in cloud storage
  - Jupyter Hub installed on the cloud
  - Example scripts running on Jupyter Hub
  - Idea: Progress towards comparative processes
- High-Level Objectives
  - Work to characterize the degree of certainty in HiMAT program results / outputs: “How sure are you?”
HiMAT data infrastructure
A core feature of the HiMAT project is the construction and use of data sharing tools to foster efficient collaboration, reproducible research and enhanced stakeholder engagement. Our cloud-based data infrastructure aims to address several of the challenges that often limit effective collaboration in such projects:
- Cross-team collaboration requires sharing preliminary data products that are not yet fully validated and may not be ready to share with the public. Existing data infrastructures primarily store completed datasets on public-facing servers. This leaves a considerable gap in the provision of privately accessible, cloud-based computational tools for this kind of research.
- The datasets HiMAT will generate, such as high-resolution satellite imagery, are particularly voluminous. Many existing data centers are not set up to handle data of this size, and even where they are, it is unreasonable to download datasets this large to local machines. We therefore need methods to co-locate our processing and analysis with the place where the data are stored.
- The development of collaborative tools calls for some degree of customization in our computational infrastructure if we are to integrate our products and provide decision support to the region. Investigators therefore need full access to both frontend and backend computational components, without having to submit requests to third-party agencies.
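The second challenge above, co-locating analysis with the data, can be illustrated with a minimal Python sketch. It is not HiMAT code: the in-memory "tile", the function names `make_fake_tile` and `mean_in_chunks`, and the float32 layout are all assumptions made for illustration. The point is only that a job running next to the storage can stream through a large dataset in small chunks and hand back a tiny summary, so the full dataset never has to be downloaded to a local machine.

```python
import io
import struct


def make_fake_tile(n_values: int) -> io.BytesIO:
    """Build an in-memory stand-in (hypothetical) for a large raster in cloud storage.

    Values are little-endian float32, cycling through 0..99.
    """
    buf = io.BytesIO()
    for i in range(n_values):
        buf.write(struct.pack("<f", float(i % 100)))
    buf.seek(0)
    return buf


def mean_in_chunks(stream, chunk_values: int = 1024) -> float:
    """Stream float32 values in fixed-size chunks and return only their mean.

    Only one chunk is held in memory at a time; the caller receives a single number.
    """
    total = 0.0
    count = 0
    chunk_bytes = chunk_values * 4  # 4 bytes per float32
    while True:
        raw = stream.read(chunk_bytes)
        if not raw:
            break
        n = len(raw) // 4
        values = struct.unpack(f"<{n}f", raw[: n * 4])
        total += sum(values)
        count += n
    if count == 0:
        raise ValueError("stream contained no values")
    return total / count


# Usage: summarize the stand-in tile without ever materializing it as one array.
tile = make_fake_tile(10_000)
result = mean_in_chunks(tile)
print(result)
```

In a real deployment the stream would be a file or object handle opened on a machine in the same cloud region as the data, for example from a notebook on the Jupyter Hub mentioned above; that wiring is deliberately left out of this sketch.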
To address these challenges we are designing a multi-tiered approach to data handling: