Average
The average command merges multiple data pages (scans) into averaged spectra based on metadata grouping. This is a destructive operation — all original pages are replaced with averaged results.
(Note that the original data file remains unchanged; only the in-memory dataset is modified.)
Basic Usage
average [options]
All options are optional; without --groupby, all pages are averaged into a single spectrum.
Command Options
| Option | Description |
|---|---|
--groupby <key1> [<key2> ...] | Metadata keys to group by before averaging (default: none — average all pages together). |
--minsize <n> | Minimum number of scans required to form a group (default: 1). Groups smaller than this are discarded. |
--maxsize <n> | Maximum number of scans per group (default: unlimited). Larger groups are split into subgroups. |
--domain <domain> | Domain in which to perform averaging (default: reciprocal). Options: reciprocal, fourier. |
--axis <axis> | Axis along which to average (default: auto-selected for domain). |
Grouping Logic
- Group by metadata: Pages sharing identical values for all
--groupbykeys are grouped together - Split large groups: If a group exceeds
--maxsize, it is divided into subgroups of size ≤--maxsize - Discard small groups: Final groups smaller than
--minsizeare discarded, except:- If splitting creates a last subgroup smaller than
--minsize, it is merged with the previous subgroup (even if this exceeds--maxsize)
- If splitting creates a last subgroup smaller than
Averaging Method
For each group:
- All data rows with matching indices are averaged across pages
- Metadata is merged:
- Numerical metadata values are averaged
- Non-numerical metadata uses the first value in the group
- Special metadata variables are added (see below)
Generated Metadata
The averaged page includes new metadata variables:
| Variable | Description |
|---|---|
.navg | Number of scans averaged |
.g | Group ID (sequential numbering) |
.sg | Subgroup ID (if group was split due to --maxsize) |
.f | Generated page name |
.fn | Same as .f |
.f1, .f2, … | Parts of the page name split by _ |
.g1, .g2, … | Grouping key values |
Page Naming
Averaged pages are named by concatenating groupby values with _:
- If groupby values exist:
value1_value2_value3 - If no groupby specified:
spectrum_<groupid>
Examples
# Average all scans into a single spectrum
average
# Group by sample name, average each sample separately
average --groupby sample
# Group by temperature and composition
average --groupby temperature composition
# Require at least 3 scans per group, max 10 scans per average
average --groupby sample --minsize 3 --maxsize 10
# Average in Fourier domain
average --domain fourier --groupby phase
Behavior
The command:
- Groups all pages based on
--groupbykeys - Splits groups exceeding
--maxsize - Averages data within each group
- Replaces all pages with the averaged results
Destructive operation: All original pages are replaced with averaged groups. The original data file remains unchanged; only the in-memory dataset is modified. The workspace will contain only the averaged spectra after execution.
Use Cases
- Improve SNR: Merge replicate scans to reduce random noise
- Sample series: Average multiple measurements per sample, grouped by sample ID
- Temperature/pressure series: Group scans by experimental condition
- Quality control: Discard unreliable single scans using
--minsize
Tips and Best Practices
-
Align first: Use
cut,interpolate, orrebinto ensure all scans have identical axes before averaging -
Group carefully: Choose metadata keys that uniquely identify experimental conditions:
# Good: temperature is controlled variable average --groupby temperature # Better: multiple grouping keys for complex experiments average --groupby sample_id temperature pressure - Check metadata: Use consistent metadata naming across scans:
temp,T,temperatureare treated as different keys- Verify metadata with the file inspection commands before averaging
- Control group size:
--maxsizelimits memory usage and prevents over-averaging of drifting data--minsizeensures statistical validity (typically 2–3 minimum)
- Preserve originals: Save original data before averaging if you may need individual scans later
Output Structure
After averaging:
- Original page count may be reduced significantly
- Each new page represents one averaged group
- Metadata header includes list of original filenames
- Column structure matches the original (same axes and data columns)
See also: