Student research opportunities

R package development and refinement

Project Code: CECS_656

This project is available at the following levels:
Summer Scholar, Masters

Keywords:

Software implementation, Statistical computation or data mining

Supervisor:

Dr Warren Jin

Outline:

The project offers several software development options, including RAutoClass (a clustering package), ARFSA, or HydrologicalMetrics. The description below uses the last of these as an example.

Hydrological flow metrics are measures of specific characteristics of a hydrograph. They are widely used in the environmetrics community for tasks such as flow regime characterisation, flood event description, floodplain/wetland clustering, and environmental flow determination. In the course of its project work, the Environmetrics group in CCI has developed or implemented many flow metrics, including many of the 120 hydrological metrics in the River Analysis Package developed by eWater CRC and some related to bankfull flow and independent flood events.
This project will systematise these flow metrics and, ideally, develop some new ones, e.g. to characterise flow regime changes or improve flood frequency analysis. Depending on time, it may also investigate different options for handling missing data.
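
As a rough illustration of what the missing-data options could look like, the sketch below compares two common choices on a made-up daily flow series: dropping missing days, and filling gaps by linear interpolation with base R's approx(). The series and variable names are hypothetical and are not taken from the existing code base.

```r
# Hypothetical daily flow series with two missing days (illustrative values only).
flow <- c(5, 6, NA, 8, NA, 4)
days <- seq_along(flow)

# Option 1: ignore missing days when computing a metric.
mean_dropped <- mean(flow, na.rm = TRUE)

# Option 2: fill gaps by linear interpolation between observed neighbours.
# approx() returns NA outside the observed range by default (rule = 1),
# so leading or trailing gaps would remain missing.
observed <- !is.na(flow)
filled <- approx(days[observed], flow[observed], xout = days)$y
mean_filled <- mean(filled)
```

The two options can give different values for the same metric, which is one reason the choice of method would need to be investigated rather than assumed.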

These flow metrics will be implemented or refined in the R statistical language, well documented, and broadly tested. They will be available for re-use in other CCI projects in the future. We are keen to investigate the possibility of wrapping them up as an R package for wider distribution and, ideally, subsequent publication.

The project will develop, implement, and test a series of hydrological flow metrics in R. These metrics are intended to capture widely used characteristics such as central tendency and dispersion in the magnitude, frequency, duration and timing of flow events, changes of flow regime, and independent flood events. The R source code will be well documented, tested and possibly distributed to the public as an R package.
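
To make these characteristics concrete, here is a minimal sketch in base R of how a few such metrics might be computed from a daily flow series. The function name flow_metrics, its arguments, and the example data are assumptions made for illustration; they do not reflect the group's existing implementation or the River Analysis Package definitions.

```r
# Hypothetical helper: a few summary metrics for a daily flow series.
# `flow` is a numeric vector of daily discharge; `threshold` defines a high-flow spell.
flow_metrics <- function(flow, threshold) {
  stopifnot(is.numeric(flow), is.numeric(threshold), length(threshold) == 1)

  # Days with missing flow are treated as not exceeding the threshold,
  # so a gap inside a spell splits it into two spells (a simplifying assumption).
  above <- !is.na(flow) & flow > threshold
  runs <- rle(above)                          # consecutive runs of TRUE / FALSE
  spell_lengths <- runs$lengths[runs$values]  # duration in days of each high-flow spell

  list(
    mean_flow     = mean(flow, na.rm = TRUE),                            # central tendency
    cv_flow       = sd(flow, na.rm = TRUE) / mean(flow, na.rm = TRUE),   # dispersion
    n_spells      = length(spell_lengths),                               # frequency
    mean_duration = if (length(spell_lengths) > 0) mean(spell_lengths) else NA_real_,
    day_of_max    = which.max(flow)                                      # timing (index of peak)
  )
}

# Illustrative data only.
flow <- c(2, 3, 10, 12, 4, 3, 15, 2, 2, 11, 13, 14)
m <- flow_metrics(flow, threshold = 9)
```

Returning a named list keeps each metric addressable by name (for example m$n_spells), which would make it straightforward to bind results from many gauges into a data frame later.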

Goals of this project


  • Implement and test a suite of methods for statistical analyses of real-world data

  • Possibly extend an R package

  • Document the R source code, possibly distribute it to the public as an R package, and publish the work in a conference or journal

Requirements/Prerequisites


  • Familiarity with a scripting language, ideally R

  • Basic knowledge of related areas such as time series, spatio-temporal modelling, or clustering

  • Interest in solving real-world problems

Student Gain

  • Software package development experience

  • Stronger R and C++ programming skills, which will be
    valuable for future statistical data analysis or data mining

  • Exposure to state-of-the-art time series analysis, clustering, or spatio-temporal techniques and their applications

  • Real-world problem solving

Background Literature


  • R free software for Statistical Computing

  • eWater CRC, River analysis package

  • More information can be found on the supervisor's ANU home page or CSIRO staff page (open to CSIRO staff only)

  • CSIRO (www.csiro.au), Australia’s national science agency, is one of the largest and most diverse research agencies in the world. It operates large multi-disciplinary research teams. By doing a project with CSIRO you will have access to world-class facilities and be able to work alongside CSIRO scientists while enjoying generous personal development and learning opportunities.

Links

CCI: CSIRO Computational Informatics

Updated: 4 September 2015