hyperspy.io_plugins.nexus module

Nexus file reading, writing and inspection.

hyperspy.io_plugins.nexus._byte_to_string(value)

Decode a byte string.

Parameters:

value (byte str) –

Returns:

decoded version of input value

Return type:

str
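As a minimal sketch of this kind of decoding: `decode_value` is a hypothetical name, not part of the module, and the UTF-8 encoding is an assumption, since the encoding used by the helper is not stated here.

```python
def decode_value(value, encoding="utf-8"):
    """Return ``value`` as ``str``, decoding it first if it is a byte string."""
    if isinstance(value, bytes):
        # The encoding is an assumption of this sketch.
        return value.decode(encoding)
    return str(value)


print(decode_value(b"NXdata"))
```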

hyperspy.io_plugins.nexus._extract_hdf_dataset(group, dataset, lazy=False)

Import data from hdf path.

Parameters:
  • group (hdf group) – group from which to load the dataset

  • dataset (str) – path to the dataset within the group

  • lazy (bool, default: False) – If True, open the dataset lazily; if False, read it into memory

Returns:

A signal dictionary which can be used to instantiate a signal.

Return type:

dict

hyperspy.io_plugins.nexus._find_data(group, search_keys=None, hardlinks_only=False, absolute_path=None)

Read from a nexus or hdf file and return a list of the dataset entries.

The method iterates through the group attributes and returns a list of the NXdata entries, together with any hdf datasets of size >=2 that are not already part of an NXdata block. This is a convenience method to inspect a file and see which datasets are present, rather than loading every dataset in the file as a signal. h5py.visit and visititems do not visit soft links or external links, so a recursive search is implemented instead. See https://github.com/h5py/h5py/issues/671

Parameters:
  • group (hdf group or File) –

  • search_keys (string, list of strings or None, default: None) – Only return items whose path contains one of the strings, e.g. search_keys = [“instrument”, “Fe”] will return hdf entries with instrument or Fe in their hdf path.

  • hardlinks_only (bool , default : False) – Option to ignore links (soft or External) within the file.

  • absolute_path (string, list of strings or None, default: None) – Return items with the exact specified absolute path

Returns:

nx_dataset_list is a list of all NXdata paths; hdf_dataset_list is a list of all hdf datasets not linked to an NXdata set.

Return type:

nx_dataset_list, hdf_dataset_list
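The recursive search described above can be sketched on a plain nested dict. This is only an illustration under stated assumptions: `find_dataset_paths` is a hypothetical name, dicts stand in for hdf groups, lists stand in for datasets, and link handling and NXdata detection are left out.

```python
def find_dataset_paths(tree, root=""):
    """Recursively collect paths of list-valued leaves with two or more items.

    ``tree`` is a nested dict standing in for an hdf group; lists stand in
    for datasets. Real files also need link handling, which is omitted here.
    """
    paths = []
    for key, value in tree.items():
        path = f"{root}/{key}"
        if isinstance(value, dict):
            # Descend into sub-groups, carrying the path built so far.
            paths.extend(find_dataset_paths(value, path))
        elif isinstance(value, list) and len(value) >= 2:
            paths.append(path)
    return paths


example = {
    "entry": {
        "instrument": {"energy": [1.0, 2.0, 3.0], "gain": [5]},
        "data": {"Fe": [[1, 2], [3, 4]]},
    }
}
print(find_dataset_paths(example))
```

The single-element `gain` entry is skipped because it falls below the size threshold.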

hyperspy.io_plugins.nexus._find_search_keys_in_dict(tree, search_keys=None)

Search through a dict for search keys.

This is a convenience method to inspect a file for a value rather than loading the file as a signal

Parameters:
  • tree (h5py File object) –

  • search_keys (string or list of strings) – Only return items whose path contains one of the strings, e.g. search_keys = [“instrument”, “Fe”] will return hdf entries with instrument or Fe in their hdf path.

Returns:

When search_keys is specified, only full paths containing one or more of the search_keys are returned.

Return type:

dict
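A minimal sketch of path-based filtering on a nested dict. The name `filter_tree` is hypothetical, and the flat `{path: value}` layout of the result is a choice of this sketch, not necessarily the structure the module returns.

```python
def filter_tree(tree, search_keys=None, root=""):
    """Flatten a nested dict to {path: value}, keeping matching paths only."""
    if isinstance(search_keys, str):
        search_keys = [search_keys]
    found = {}
    for key, value in tree.items():
        path = f"{root}/{key}"
        if isinstance(value, dict):
            # Recurse into nested dictionaries, extending the path.
            found.update(filter_tree(value, search_keys, path))
        elif search_keys is None or any(k in path for k in search_keys):
            found[path] = value
    return found


meta = {"entry": {"instrument": {"name": "I14"}, "sample": {"Fe": 0.3, "Mn": 0.1}}}
print(filter_tree(meta, ["instrument", "Fe"]))
```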

hyperspy.io_plugins.nexus._fix_exclusion_keys(key)

Exclude hyperspy specific keys.

Signal and DictionaryBrowser break if a key is a dict method, e.g. {“keys”: 2.0}.

This method prepends fix_ to the key, so the information is still present while working around this issue.

Parameters:

key (str) –

Return type:

str
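The renaming can be sketched as follows. `fix_key` is a hypothetical name, and the rule used to decide which keys count as excluded (any attribute name of `dict`) is an assumption of this sketch; the module may use a narrower rule.

```python
def fix_key(key):
    """Prefix ``key`` with ``fix_`` when it collides with a ``dict`` attribute."""
    # Assumption: any name defined on ``dict`` (keys, items, values, ...)
    # is treated as excluded.
    if hasattr(dict, key):
        return "fix_" + key
    return key


print(fix_key("keys"), fix_key("energy"))
```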

hyperspy.io_plugins.nexus._get_nav_list(data, dataentry)

Get a list with information about each axis of the dataset.

Parameters:
  • data (hdf dataset) – the dataset to be loaded.

  • dataentry (hdf group) – the group with corresponding attributes.

Returns:

nav_list – contains information about each axis.

Return type:

list

Return the link target path.

If a hdf group is a soft link or has a target attribute this method will return the target path. If no link is found return None.

Returns:

Soft link path if it exists, otherwise None

Return type:

str

hyperspy.io_plugins.nexus._is_int(s)

Check that s is an integer.

Parameters:

s (python object to test) –

Returns:

True or False

Return type:

bool
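One common way to write such a check is to attempt the conversion and catch the failure. This is a sketch only: `is_int` is a hypothetical name, and the module's own test may differ.

```python
def is_int(s):
    """Return True if ``s`` can be converted with ``int()``, else False."""
    try:
        int(s)
    except (TypeError, ValueError):
        return False
    return True


print(is_int("3"), is_int("3.5"), is_int(None))
```

Note that `int()` also accepts floats by truncating them, so this particular sketch treats a float object as convertible.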

hyperspy.io_plugins.nexus._is_linear_axis(data)

Check if the data is linearly incrementing.

Parameters:

data (dask or numpy array) –

Returns:

True or False

Return type:

bool
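A linearity check of this kind can be sketched in plain Python by comparing every step with the first one. The name `is_linear`, the tolerances, and the handling of sequences shorter than two points are assumptions of this sketch; the module operates on dask or numpy arrays.

```python
import math


def is_linear(values, rel_tol=1e-9):
    """Return True if consecutive values differ by a constant step."""
    if len(values) < 2:
        # Assumption: a single point cannot define a linear axis.
        return False
    step = values[1] - values[0]
    return all(
        math.isclose(b - a, step, rel_tol=rel_tol, abs_tol=1e-12)
        for a, b in zip(values, values[1:])
    )


print(is_linear([0.0, 0.5, 1.0, 1.5]), is_linear([0, 1, 4, 9]))
```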

hyperspy.io_plugins.nexus._is_numeric_data(data)

Check that data contains numeric data.

Parameters:

data (dask or numpy array) –

Returns:

True or False

Return type:

bool

hyperspy.io_plugins.nexus._load_metadata(group, lazy=False, skip_array_metadata=False)

Search through a hdf group and return the group structure.

h5py.visit or visititems does not visit soft links or external links so an implementation of a recursive search is required. See https://github.com/h5py/h5py/issues/671

Parameters:
  • group (hdf group) – location to load the metadata from

  • lazy (bool , default : False) – Option for lazy loading

  • skip_array_metadata (bool, default : False) – whether to skip loading array metadata

Returns:

dictionary of group contents

Return type:

dict

hyperspy.io_plugins.nexus._nexus_dataset_to_signal(group, nexus_dataset_path, lazy=False)

Load an NXdata set as a hyperspy signal.

Parameters:
  • group (hdf group containing the NXdata) –

  • nexus_dataset_path (str) – Path to the NXdata set in the group

  • lazy (bool, default : False) – lazy loading of data

Returns:

A signal dictionary which can be used to instantiate a signal.

Return type:

dict

hyperspy.io_plugins.nexus._parse_from_file(value, lazy=False)

To convert values from the hdf file to compatible formats.

When reading string arrays, the values are converted to or kept as byte strings (some io_plugins only support byte-string arrays, so this ensures compatibility across io_plugins). Arrays of length 1 are returned as the single value stored. Large datasets are returned as dask arrays if lazy=True.

Parameters:
  • value (input read from hdf file (array,list,tuple,string,int,float)) –

  • lazy (bool {default: False}) – The lazy flag is only applied to values of size >=2

Returns:

parsed value.

Return type:

str, int, float, ndarray or dask Array

hyperspy.io_plugins.nexus._parse_to_file(value)

Convert to a suitable format for writing to HDF5.

For example unicode values are not compatible with hdf5 so conversion to byte strings is required.

Parameters:

value (input object to write to the hdf file) –

Return type:

parsed value

hyperspy.io_plugins.nexus._text_split(s, sep)

Split a string based on a list of separators.

Parameters:
  • s (str) –

  • sep (str - separator or list of separators e.g. '.' or ['_','/']) –

Returns:

String sections split based on the separators

Return type:

list
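Splitting on several separators at once can be sketched with a regular expression. `split_on` is a hypothetical name, and the use of `re.split` is a choice of this sketch rather than a statement about the module's implementation.

```python
import re


def split_on(s, sep):
    """Split ``s`` on a single separator or on any of a list of separators."""
    if isinstance(sep, str):
        sep = [sep]
    # Escape each separator so characters such as '.' are taken literally.
    pattern = "|".join(re.escape(x) for x in sep)
    return re.split(pattern, s)


print(split_on("a.b.c", "."))
print(split_on("entry_1/data_Fe", ["_", "/"]))
```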

hyperspy.io_plugins.nexus._write_nexus_attr(dictionary, group, skip_keys=None)

Recursively iterate through dictionary and write “attrs” dictionaries.

This step is called after the groups and datasets have been created

Parameters:
  • dictionary (dict) – Input dictionary to be written to the hdf group

  • group (hdf group) – location to store the attrs sections of the dictionary

hyperspy.io_plugins.nexus._write_nexus_groups(dictionary, group, skip_keys=None, **kwds)

Recursively iterate through dictionary and write groups to nexus.

Parameters:
  • dictionary (dict) – dictionary contents to store to hdf group

  • group (hdf group) – location to store dictionary

  • skip_keys (str or list of str) – the key(s) to skip when writing into the group

  • **kwds (additional keywords) – additional keywords to pass to h5py.create_dataset method
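The effect of skip_keys during a recursive walk can be sketched on plain dicts. `copy_without` is a hypothetical name, no hdf file is written, and applying the skip at every nesting depth is an assumption of this sketch.

```python
def copy_without(dictionary, skip_keys=None):
    """Recursively copy a nested dict, omitting any key in ``skip_keys``."""
    if skip_keys is None:
        skip_keys = []
    elif isinstance(skip_keys, str):
        skip_keys = [skip_keys]
    result = {}
    for key, value in dictionary.items():
        if key in skip_keys:
            # Skipped keys are dropped together with everything below them.
            continue
        if isinstance(value, dict):
            result[key] = copy_without(value, skip_keys)
        else:
            result[key] = value
    return result


print(copy_without({"a": {"raw": [1], "x": 1}, "raw": 2}, "raw"))
```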

hyperspy.io_plugins.nexus._write_signal(signal, nxgroup, signal_name, **kwds)

Store the signal data as an NXdata dataset.

Parameters:
  • signal (Hyperspy signal) –

  • nxgroup (HDF group) – Entry at which to save signal data

  • signal_name (str) – Name under which to store the signal entry in the file

hyperspy.io_plugins.nexus.file_reader(filename, lazy=False, dataset_key=None, dataset_path=None, metadata_key=None, skip_array_metadata=False, nxdata_only=False, hardlinks_only=False, use_default=False, **kwds)

Read NXdata class or hdf datasets from a file and return signal(s).

Note

Loading all datasets can result in a large number of signals. Please review your datasets and use dataset_key to target the datasets of interest. “keys” is a special keyword and is prepended with “fix_” in the metadata structure to avoid any issues.

Datasets are all arrays with size >=2 (arrays, lists).

Parameters:
  • filename (str) – Input filename

  • dataset_key (None, str, list of strings, default : None) – If None all datasets are returned. If a string or list of strings is provided only items whose path contain the string(s) are returned. For example dataset_key = [“instrument”, “Fe”] will return data entries with instrument or Fe in their hdf path.

  • dataset_path (None, str, list of strings, default : None) – If None, no absolute path is searched. If a string or list of strings is provided, items with the absolute paths specified will be returned. For example, dataset_path = [‘/data/spectrum/Mn’] returns the exact dataset with this path. The result is not filtered by dataset_key, i.e. with dataset_key = [‘Fe’] it still returns the specific dataset at ‘/data/spectrum/Mn’. The result is empty if no dataset matches the absolute path provided.

  • metadata_key (None, str, list of strings, default : None) – Only return items from the original metadata whose path contains one of the strings, e.g. metadata_key = [“instrument”, “Fe”] will return all metadata entries with “instrument” or “Fe” in their hdf path.

  • skip_array_metadata (bool, default : False) – Whether to skip loading metadata with an array entry. This is useful as metadata may contain large arrays that are redundant with the data.

  • nxdata_only (bool, default : False) – If True, only NXdata will be converted into a signal. If False, NXdata and any hdf datasets will be loaded as signals.

  • hardlinks_only (bool, default : False) – If True any links (soft or External) will be ignored when loading.

  • use_default (bool, default : False) – If True and a default NXdata is defined in the file load this as a signal. This will ignore the other keyword options. If True and no default is defined the file will be loaded according to the keyword options.

Returns:

signal dictionary or list of signal dictionaries

Return type:

dict or list of dict

See also

  • list_datasets_in_file()

  • read_metadata_from_file()
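How dataset_key and dataset_path combine can be sketched on a plain list of paths. This follows the parameter descriptions above; `select_paths` is a hypothetical name and no file is read.

```python
def select_paths(all_paths, dataset_key=None, dataset_path=None):
    """Keep paths that contain a key or exactly equal an absolute path."""
    if isinstance(dataset_key, str):
        dataset_key = [dataset_key]
    if isinstance(dataset_path, str):
        dataset_path = [dataset_path]
    if dataset_key is None and dataset_path is None:
        # No filter given: every dataset is returned.
        return list(all_paths)
    selected = []
    for path in all_paths:
        exact = dataset_path is not None and path in dataset_path
        partial = dataset_key is not None and any(k in path for k in dataset_key)
        if exact or partial:
            selected.append(path)
    return selected


paths = ["/data/spectrum/Fe", "/data/spectrum/Mn", "/data/image/Cu"]
print(select_paths(paths, dataset_key=["Fe"], dataset_path=["/data/spectrum/Mn"]))
```

The Mn dataset is kept even though it does not match dataset_key, because an exact dataset_path match is not filtered by the key.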

hyperspy.io_plugins.nexus.file_writer(filename, signals, save_original_metadata=True, skip_metadata_key=None, use_default=False, *args, **kwds)

Write the signal and metadata as a nexus file.

This will save the signal in NXdata format in the file. As the form of the metadata can vary and is not validated it will be stored as an NXcollection (an unvalidated collection)

Parameters:
  • filename (str) – Path of the file to write

  • signals (signal or list of signals) – Signal(s) to be written

  • save_original_metadata (bool , default : True) – Option to save hyperspy.original_metadata with the signal. A loaded Nexus file may hold a large amount of metadata, which you may wish to omit on saving.

  • skip_metadata_key (str or list of str, default : None) – The key(s) to skip when saving the original metadata. This is useful when some metadata keys are to be ignored.

  • use_default (bool , default : False) – Option to define the default dataset in the file. If set to True the signal or first signal in the list of signals will be defined as the default (following Nexus v3 data rules).

See also

  • file_reader()

  • list_datasets_in_file()

  • read_metadata_from_file()

hyperspy.io_plugins.nexus.list_datasets_in_file(filename, dataset_key=None, hardlinks_only=False, verbose=True)

Read from a nexus or hdf file and return a list of the dataset paths.

This method is used to inspect the contents of a Nexus file. It iterates through the group attributes and returns a list of the NXdata entries, together with any hdf datasets of size >=2 that are not already part of an NXdata block. This is a convenience method to list the datasets present in a file rather than loading all of them as signals.

Parameters:
  • filename (str) – path of the file to read

  • dataset_key (str, list of strings or None , default: None) – If a str or list of strings is provided only return items whose path contain the strings. For example, dataset_key = [“instrument”, “Fe”] will only return hdf entries with “instrument” or “Fe” somewhere in their hdf path.

  • hardlinks_only (bool, default : False) – If true any links (soft or External) will be ignored when loading.

  • verbose (boolean, default : True) – Prints the results to screen

Returns:

list of paths to datasets

Return type:

list

See also

  • file_reader()

  • file_writer()

  • read_metadata_from_file()

hyperspy.io_plugins.nexus.read_metadata_from_file(filename, metadata_key=None, lazy=False, verbose=False, skip_array_metadata=False)

Read the metadata from a nexus or hdf file.

This method iterates through the file and returns a dictionary of the entries. This is a convenience method to inspect a file for a value rather than loading the file as a signal.

Parameters:
  • filename (str) – path of the file to read

  • metadata_key (None, str or list of strings, default : None) – None will return all datasets found, including linked data. Providing a string or list of strings will only return items which contain the string(s). For example, metadata_key = [“instrument”, “Fe”] will return hdf entries with “instrument” or “Fe” in their hdf path.

  • verbose (bool, default : False) – Pretty Print the results to screen

  • skip_array_metadata (bool, default : False) – Whether to skip loading array metadata. This is useful as many large arrays may be present in the metadata, and they are redundant with the dataset itself.

Returns:

Metadata dictionary.

Return type:

dict

See also

  • file_reader()

  • file_writer()

  • list_datasets_in_file()