src.toolbox.steps.custom.write_report

Writes reports on the current data passed through the pipeline.

Classes

WriteDataReport

Writes a report summarizing the generic plots and statistics of the data.

Functions

current_info(→ dict)

Returns information about the current operator at the time the report is generated.

write_conf_py(→ None)

Write a minimal Sphinx conf.py suitable for PDF builds.

run_sphinx(→ None)

Build a PDF from a Sphinx source directory using the latexpdf builder.

build_qc_dict(→ dict)

Return a dictionary of all QC variable names and their corresponding QC attributes.

flatten_qc_dict(→ list)

Flatten QC dictionary into list of table rows.

run_info_page(→ None)

Writes a page dedicated to pipeline run information.

add_log(→ None)

Add and format the logfile as a table.

qc_section(→ None)

Wrapper for the QC section.

img_rst(doc, fname[, fields])

Inserts image information into the .rst file using a directive.

basic_geo(doc, data, g_extent, ext, outdir)

inset_geo(doc, data[, outdir, g_extent, scale, ext])

Creates a geographic figure of two plots, a main map and an inset, for additional positional awareness.

qc_hist(doc, data, outdir, var[, xlims, hislim, bins, ext])

Create quick quality control histogram figure.

make_plots(→ None)

Wrapper for plotting glider QC variables quickly.

Module Contents

src.toolbox.steps.custom.write_report.current_info() → dict

Returns information about the current operator at the time the report is generated.

src.toolbox.steps.custom.write_report.write_conf_py(source_dir, project='Pipeline Report', author='Unknown', master_doc='index', subtitle=None) → None

Write a minimal Sphinx conf.py suitable for PDF builds.

The generated file is intended to be passed to Sphinx.

Parameters:
  • source_dir (str or Path) – Directory containing the .rst file(s), where this will be saved.

  • project (str) – Project title.

  • author (str) – Author name.

  • master_doc (str) – Root rst file (without .rst).
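
As a rough illustration of what such a file might contain, the sketch below writes a small conf.py from the same arguments. The function name write_minimal_conf, the target name report.tex and the particular settings are assumptions made for this example; they are not taken from the module's source, and the subtitle argument is not handled.

```python
from pathlib import Path


def write_minimal_conf(source_dir, project="Pipeline Report",
                       author="Unknown", master_doc="index"):
    """Write a small conf.py into source_dir and return its path.

    Hypothetical stand-in for write_conf_py; the settings written here
    are assumptions, not the module's actual output.
    """
    source_dir = Path(source_dir)
    source_dir.mkdir(parents=True, exist_ok=True)

    lines = [
        f"project = {project!r}",
        f"author = {author!r}",
        # Name of the root document, without the .rst suffix.
        f"master_doc = {master_doc!r}",
        "extensions = []",
        # One LaTeX document: (start file, target name, title, author, theme).
        "latex_documents = [(master_doc, 'report.tex', project, author, 'manual')]",
    ]
    conf_path = source_dir / "conf.py"
    conf_path.write_text("\n".join(lines) + "\n", encoding="utf-8")
    return conf_path
```

Using repr-style formatting (`!r`) keeps titles that contain quotes valid as Python string literals in the generated file.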

src.toolbox.steps.custom.write_report.run_sphinx(source_dir, build_dir=None) → None

Build a PDF from a Sphinx source directory using the latexpdf builder.

This step requires the Sphinx binaries to be installed and usable on the current workstation. It also requires a conf.py to be located in the source directory.

Parameters:
  • source_dir (str or Path) – Directory containing the .rst and conf.py files.

  • build_dir (str or Path) – Directory where Sphinx output can be placed. Defaults to source_dir/_build.
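
One way such a build could be launched is sketched below. It assumes Sphinx is importable by the running interpreter and that a LaTeX toolchain is installed; the helper names latexpdf_command and build_pdf are hypothetical and do not come from the module.

```python
import subprocess
import sys
from pathlib import Path


def latexpdf_command(source_dir, build_dir=None):
    """Return the argument list for a Sphinx latexpdf build."""
    source_dir = Path(source_dir)
    # Mirror the documented default of source_dir/_build.
    build_dir = Path(build_dir) if build_dir is not None else source_dir / "_build"
    # "python -m sphinx" is equivalent to the sphinx-build executable;
    # "-M latexpdf" runs the LaTeX builder and then compiles the PDF.
    return [sys.executable, "-m", "sphinx", "-M", "latexpdf",
            str(source_dir), str(build_dir)]


def build_pdf(source_dir, build_dir=None):
    """Run the build; raises CalledProcessError if Sphinx or LaTeX fails."""
    if not (Path(source_dir) / "conf.py").is_file():
        raise FileNotFoundError(f"No conf.py found in {source_dir}")
    subprocess.run(latexpdf_command(source_dir, build_dir), check=True)
```

Separating the command construction from the call makes the default build directory easy to inspect without starting a build.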

src.toolbox.steps.custom.write_report.build_qc_dict(data: xarray.Dataset) → dict

Return a dictionary of all QC variable names and their corresponding QC attributes.

Can be expanded in the future if additional attributes related to testing are added. Tests are identified by the _flag_cts suffix in the variable's test parameters.

Parameters:

data (xarray.Dataset) – The top-level data containing all the relevant QC variables.

Returns:

  • qc_dict (dict) – Nested dictionaries of QC variables with test names and results.

    Structure:

        {
            "VAR_QC": {
                "qc_name": {
                    "params": {…},
                    "flag_counts": {…},
                    "stats": {…},
                },
                "qc_name_2": {
                },
            },
        }

  • TODO (Move to utils? Does it belong here?)

src.toolbox.steps.custom.write_report.flatten_qc_dict(qc_dict: dict) → list

Flatten QC dictionary into list of table rows.

Intended for use in report metrics (RstCloth).

Parameters:

qc_dict (dict) – Dictionary of QC results.

Returns:

rows

A list of rows suitable for tabular display. Each row is a list:

[qc_var, qc_name, flag, formatted_count]

  • qc_var : str, the QC variable name

  • qc_name : str, the name of the QC test

  • flag : str, QC flag value

  • formatted_count : str, count formatted with thousands separator

Return type:

list of list
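
The row layout above can be sketched as follows. This is a minimal illustration assuming each test entry carries a "flag_counts" mapping of flag to integer count; flatten_rows is a hypothetical name and the real function may handle more cases.

```python
def flatten_rows(qc_dict):
    """Return [qc_var, qc_name, flag, formatted_count] rows."""
    rows = []
    for qc_var, tests in qc_dict.items():
        for qc_name, result in tests.items():
            # Tests without recorded counts contribute no rows.
            for flag, count in result.get("flag_counts", {}).items():
                # "{:,}" inserts a comma as the thousands separator.
                rows.append([qc_var, qc_name, str(flag), f"{count:,}"])
    return rows


# Invented example input, not real pipeline output.
example = {"TEMP_QC": {"range_test": {"flag_counts": {1: 1204331, 4: 87}}}}
rows = flatten_rows(example)
# rows[0] is ["TEMP_QC", "range_test", "1", "1,204,331"]
```

Formatting the count as a string at this stage means the table writer only has to place text into cells.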

src.toolbox.steps.custom.write_report.run_info_page(rs, params_dict: dict, glatters: dict) → None

Writes a page dedicated to pipeline run information.

Parameters:
  • rs (RstCloth) – Active RstCloth stream to which the page is written.

  • params_dict (dict) – Dictionary of global pipeline parameters.

  • glatters (dict) – Dictionary describing the glider and mission. OG1 includes “platform_vocabulary” for consistency.

src.toolbox.steps.custom.write_report.add_log(logfile, rs, ncols=4) → None

Add and format the logfile as a table.

Note: Requires that a designated log_file be initialized in the global pipeline configuration parameters.

src.toolbox.steps.custom.write_report.qc_section(doc, data: xarray.Dataset) → None

Wrapper for the QC section.

Parameters:
  • doc (RstCloth object) – The active RstCloth stream to be written to

  • data (xarray.core.dataset.Dataset) – The entire dataset, including attributes

src.toolbox.steps.custom.write_report.img_rst(doc, fname: str, fields: list = None)

Inserts image information into the .rst file using a directive.

See the rst directives reference for image options (https://docutils.sourceforge.io/docs/ref/rst/directives.html#images). See RstCloth for information about the directive call (https://rstcloth.readthedocs.io/en/latest/rstcloth.html).

Parameters:
  • doc (RstCloth object) – The active RstCloth stream to be written to

  • fname (str) – The path or filename

  • fields (list of tuple) – Image parameters to be written below the directive

Example

    img_rst(doc, "../examples/data/OG1/testing/fig.png", fields=[("height", "100px"), ("width", "100px")])

would write out

    .. image:: fig.*
       :height: 100px
       :width: 100px
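
For readers unfamiliar with the directive syntax, the sketch below builds the same kind of text with plain string formatting. It deliberately does not use RstCloth, and image_directive is a hypothetical helper, so it only illustrates the output format rather than how img_rst is implemented.

```python
def image_directive(fname, fields=None):
    """Return reStructuredText for an image directive with options."""
    lines = [f".. image:: {fname}"]
    for name, value in fields or []:
        # Options are indented under the directive as ":name: value".
        lines.append(f"   :{name}: {value}")
    # A trailing blank line ends the directive block.
    lines.append("")
    return "\n".join(lines)


text = image_directive("fig.png",
                       fields=[("height", "100px"), ("width", "100px")])
print(text)
```

Passing the options as a list of (name, value) tuples preserves their order in the written file.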

src.toolbox.steps.custom.write_report.basic_geo(doc, data, g_extent, ext, outdir)

src.toolbox.steps.custom.write_report.inset_geo(doc, data, outdir: str = './', g_extent: list = [7, 25, 54, 65], scale: str = '110m', ext: str = '.png')

Creates a geographic figure of two plots, a main map and an inset, for additional positional awareness.

Unlike basic_geo(), this function will create an inset to make it clearer where the glider is operating.

If the chart looks coarse, consider a higher resolution for the scale argument (for example “50m” or “10m” instead of “110m”).

Parameters:
  • doc (RstCloth object) – The active RstCloth stream to be written to

  • data (xarray.core.dataset.Dataset) – The entire dataset, including attributes

  • outdir (str) – The path to return figures to. Defaults to current directory.

  • g_extent (list) – Geographic extent for cartopy geographic plot ([lon1, lon2, lat1, lat2]). Defaults to Baltic Sea.

  • scale (str) – Resolution for cartopy to use when adding elements (“10m”, “50m”, “110m”)

  • ext (str) – Image filetype extension (.png, .svg, etc.)

src.toolbox.steps.custom.write_report.qc_hist(doc, data: xarray.Dataset, outdir: str, var: str, xlims: list = [-0.6, 9.6], hislim=range(10), bins=None, ext='.png')

Create quick quality control histogram figure.

  • Left axis: Quick plot of the QC variable’s parent

  • Right axis: Bins of each flag type, labeled with the number of points

Parameters:
  • doc (RstCloth object) – The active RstCloth stream to be written to

  • data (xarray.core.dataset.Dataset) – The entire dataset, including attributes

  • var (str) – The QC variable as listed in data

  • ext (str) – Image filetype extension (.png, .svg, etc.)

  • hislim (array-like) – All potential flags of the selected schema (default Argo = 0 to 9, 10 total)

  • bins (array-like) – The sequence of bin edges for collection, matching the dimension of hislim

  • xlims (list) – Histogram axis bounds. Defaults to Argo (10 flags) with 0.1 padding on each side
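
The relationship between hislim, the bin edges and xlims can be illustrated with the standard library alone. The sketch below is one reading of the defaults (integer flags 0 to 9, each centred in a unit-wide bin), not the function's actual implementation; flag_counts and unit_bin_edges are hypothetical helpers.

```python
from collections import Counter


def flag_counts(flags, hislim=range(10)):
    """Count occurrences of every flag in hislim, including zeros."""
    seen = Counter(int(f) for f in flags)
    return {flag: seen.get(flag, 0) for flag in hislim}


def unit_bin_edges(hislim=range(10)):
    """Edges at half-integers so each integer flag sits mid-bin.

    Assumes hislim holds consecutive integers.
    """
    flags = list(hislim)
    return [flags[0] - 0.5] + [f + 0.5 for f in flags]


counts = flag_counts([1, 1, 4, 9, 1])
edges = unit_bin_edges()
# Padding the outer edges by 0.1 gives the default xlims of [-0.6, 9.6].
xlims = [edges[0] - 0.1, edges[-1] + 0.1]
```

Reporting zero counts for unused flags keeps the histogram axis the same across variables, which makes reports easier to compare.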

src.toolbox.steps.custom.write_report.make_plots(doc, data: xarray.Dataset, outdir: str, extent: list = [7, 25, 54, 65]) → None

Wrapper for plotting glider QC variables quickly.

Each variable can contain millions of points, so the plots are limited to the approaches that xarray can draw quickly at that volume.

Parameters:
  • doc (RstCloth object) – The active RstCloth stream to be written to

  • data (xarray.core.dataset.Dataset) – The entire dataset, including attributes

  • outdir (str) – The path to return figures to

  • ext (str) – Image filetype extension (.png, .svg, etc.)

  • extent (list) – Geographic extent for cartopy geographic plot ([lon1, lon2, lat1, lat2]). Defaults to Baltic Sea.

  • TODO (Define long-term storage for this. Is diagnostics the right place?)

class src.toolbox.steps.custom.write_report.WriteDataReport

Bases: toolbox.steps.base_step.BaseStep

Writes a report summarizing the generic plots and statistics of the data.

Base template:

  • Title page (automatically handled by Sphinx)

  • Quality control summary

  • Basic plots

  • Run metadata and pipeline parameters

  • Logfile

Parameters:
  • title (str) – Name of the report (on title page and filename)

  • output_path (str) – Directory to write the report to (must end with a “/”)

  • build (bool) – Whether to run Sphinx to build the PDF after writing the .rst and conf.py files

step_name = 'Write Data Report'

run() → xarray.DataArray