Hi
Hopefully @StewartBernard_550c does not mind me bastardising his email about our meeting this afternoon.
There are large external (Phakisa Oceans, Oceans and Coastal Information Management System - OCIMS) and internal projects underway to develop significant marine/aquatic earth observation (& modelling) capabilities. There is also significant new satellite capability to exploit, notably the European Sentinel series (Sentinels 3, 2, 1 in order of priority). We are looking at how best to partner with CHPC on the capabilities needed for this. These capabilities are:
Acquisition (±20GB per day typically, growing as more Sentinels are launched or our capabilities mature). Data will mostly be acquired at L1 & L2, ideally through a combination of EUMETCAST-Terrestrial (operational push service, see attached) and downloads from FTP etc. There will be occasional demand for large data downloads e.g. initially or reprocess...
Processing. Routine near-real-time processing using Python and Python APIs to e.g. SNAP (ESA), SeaDAS (NASA) etc. plus our own algorithms in Python, plus more infrequent (every 3 - 24 months) research/reprocessing of larger chunks of data e.g. 10 - 100TB.
Storage. Storage of raw data, intermediate products & final products, typically organised hierarchically with metadata in a database. Most of the raw & intermediate products are only processed intermittently after the initial processing, while a core front-end serving data set of ±2TB is queried constantly in real time.
Serving. The primary product servers for OCIMS are in Pretoria, so most data just passes through, but there is some need for basic data serving here. The data sets will be freely available (logistics aside).
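To put the acquisition figure in context, here is a rough back-of-envelope sketch in Python. It assumes a flat 20GB per day (the real rate is expected to grow) and decimal units (1TB = 1000GB); the function names are made up for illustration only.

```python
# Back-of-envelope storage arithmetic for routine acquisition.
# Assumptions (not from the meeting): flat daily rate, 1 TB = 1000 GB.
GB_PER_TB = 1000


def yearly_volume_tb(daily_gb, days=365):
    """Volume accumulated over `days` at a constant daily rate, in TB."""
    return daily_gb * days / GB_PER_TB


def days_to_fill(capacity_tb, daily_gb):
    """Days until `capacity_tb` is full at a constant daily rate."""
    return capacity_tb * GB_PER_TB / daily_gb


if __name__ == "__main__":
    print(f"Per year at 20GB/day: {yearly_volume_tb(20):.1f} TB")
    print(f"Days to fill 100TB:   {days_to_fill(100, 20):.0f}")
```

At this rate routine acquisition is only a few TB per year, so the bulk of any storage allocation would be driven by the occasional large downloads and the 10 - 100TB reprocessing runs rather than the daily feed.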
Summary
SAGrid, with Sean as contact, has capability that can be used reasonably soon i.e. < 1 month, while the existing ACCESS server can continue to be used for immediate acquisition & processing of small volumes. SAGrid capability includes preferential bandwidth, storage of potentially several hundred TB, and an Infiniband link to the SAGrid compute cluster, with web access to data. Very excellent. Interaction with DIRISA around longer term capability is to continue, but this is not likely to be available within a year.
Actions:
@StewartBernard_550c - set up a connection between EUMETCAST & @SeanMurray_59b6/Andy to test EUMETCAST-Terrestrial on the SAGrid site once ready. Docs attached - the first test requires giving EUMETCAST a login to a basic machine within our network. Our thinking was that it is better to test on SAGrid (rather than ACCESS) as the operational system will be located there.
@StewartBernard_550c/@SeanMurray_59b6 - contact @brucellino as SAGrid lead to help formalise the EO capability and get it recognised.
@SeanMurray_59b6/@StewartBernard_550c - @SeanMurray_59b6 to provide docs for SAGrid workflows to get code approved - GitHub for CODE-RADE etc.
A login node for external users, either ldap01.chpc.ac.za or one of the other Sun boxes, dedicated.
Some training on submitting jobs.
Questions
- I don't suppose they need their own VO?
- Is there one in Europe already for EUMETCAST?
Storage
- Need to sort out the storage; currently a Sun box is connected to the 107TB of storage (EMC).
- The above storage is going to be used for a stratum (1/2/3?) of CVMFS. Lustre, 104TB, must come back; status unknown at the moment, it's like running around a bush.
SAGrid's third of the 1.2PB to be ordered can be used as well, but I have no clue on the timelines there.
The rest can come as and when we need it.
Hopefully that gets comments started.
Sean