MRCZ format
Note
To read this format, the optional dependencies blosc and mrcz are required.
The mrcz format is an extension of the CCP-EM MRC2014 file format.
It uses the blosc meta-compression library to bitshuffle and compress files in
a blocked, multi-threaded environment. The supported data types are float32,
int8, uint16, int16 and complex64.
It supports arbitrary metadata, which is serialized into JSON.
MRCZ also supports asynchronous reads and writes.
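As a rough illustration of the constraints above, the following sketch checks a dtype name against the documented list and verifies that a metadata dictionary survives a JSON round trip. The helper names are hypothetical and are not part of mrcz or RosettaSciIO; only the Python standard library is used, and the JSON check reflects general JSON behaviour rather than the library's own serialization code.

```python
import json

# Data types listed as supported in the description above.
SUPPORTED_DTYPES = {"float32", "int8", "uint16", "int16", "complex64"}


def is_supported_dtype(dtype_name):
    """Return True if the dtype name is in the documented list (hypothetical helper)."""
    return str(dtype_name) in SUPPORTED_DTYPES


def is_json_serializable(metadata):
    """Return True if the metadata can be serialized to JSON and read back unchanged."""
    try:
        return json.loads(json.dumps(metadata)) == metadata
    except (TypeError, ValueError):
        # Raised for objects JSON cannot represent, such as sets.
        return False
```

With a NumPy array, `is_supported_dtype(array.dtype)` would work the same way, since the dtype is converted to its string name first.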
Support for this format is not enabled by default. To enable it, install the mrcz library; blosc must also be installed to use compression.
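Before calling the reader or writer, it can help to check which of the optional dependencies are importable. The sketch below is a minimal example, assuming the import names are `mrcz` and `blosc` (matching the package names given above); `mrcz_support_status` is a hypothetical helper, not a RosettaSciIO function.

```python
from importlib.util import find_spec


def mrcz_support_status():
    """Report which optional dependencies can be imported (hypothetical helper)."""
    # Assumption: the import names match the package names "mrcz" and "blosc".
    return {
        "mrcz": find_spec("mrcz") is not None,
        "blosc": find_spec("blosc") is not None,
    }
```

If either entry is `False`, installing the corresponding package should enable the missing functionality.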
API functions
- rsciio.mrcz.file_reader(filename, lazy=False, mmap_mode='c', endianess='<', **kwds)
File reader for the MRCZ format for tomographic data.
- Parameters:
  - filename : str, pathlib.Path
    Filename of the file to read or corresponding pathlib.Path.
  - lazy : bool, default=False
    Whether to open the file lazily or not. The file will stay open until closed in compute() or closed manually. get_file_handle() can be used to access the file handler and close it manually.
  - mmap_mode : {None, "r+", "r", "w+", "c"}, default=None
    Argument passed to numpy.memmap. A memory-mapped array is stored on disk, and not directly loaded into memory. However, it can be accessed and sliced like any ndarray. Lazy loading does not support in-place writing (i.e. lazy loading and the "r+" mode are incompatible). The MRCZ reader currently only supports C-ordering memory-maps. If None (default), the value is "r" when lazy=True, otherwise it is "c".
  - endianess : str, default="<"
    "<" or ">", depending on how the bits are written to the file.
  - **kwds : dict, optional
    The keyword arguments are passed to mrcz.readMRC().
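The default rule for mmap_mode described above can be sketched as a small function. This is an illustration of the documented behaviour only; `resolve_mmap_mode` is a hypothetical name and the code is not taken from RosettaSciIO.

```python
def resolve_mmap_mode(mmap_mode=None, lazy=False):
    """Mirror the documented default rule for mmap_mode (hypothetical helper)."""
    if mmap_mode is None:
        # Documented rule: "r" for lazy loading, "c" (copy-on-write) otherwise.
        mmap_mode = "r" if lazy else "c"
    if lazy and mmap_mode == "r+":
        # Lazy loading does not support in-place writing.
        raise ValueError("lazy=True is incompatible with mmap_mode='r+'")
    return mmap_mode
```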
- Returns:
  list of dict
  List of dictionaries containing the following fields:
  - ‘data’ – multidimensional numpy.ndarray or dask.array.Array
  - ‘axes’ – list of dictionaries describing the axes containing the fields ‘name’, ‘units’, ‘index_in_array’, and either ‘size’, ‘offset’, and ‘scale’ or a numpy array ‘axis’ containing the full axes vector
  - ‘metadata’ – dictionary containing the parsed metadata
  - ‘original_metadata’ – dictionary containing the full metadata tree from the input file
  When the file contains several datasets, each dataset will be loaded as a separate dictionary.
Examples
>>> from rsciio.mrcz import file_reader
>>> new_signal = file_reader('file.mrcz')
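To show how the returned dictionaries might be consumed, the sketch below computes the coordinates along an axis from the ‘size’, ‘offset’ and ‘scale’ fields, or from the ‘axis’ vector when that is present. The `example` dictionary is a hand-written stand-in for one entry of the returned list (no file is read), `axis_values` is a hypothetical helper, and the `offset + scale * index` relation is assumed to be the usual convention for uniform axes.

```python
def axis_values(axis):
    """Return the coordinates along one axis dictionary as a list."""
    if "axis" in axis:
        # Non-uniform axis: the full axes vector is stored directly.
        return list(axis["axis"])
    # Uniform axis (assumed convention): offset + scale * index.
    return [axis["offset"] + axis["scale"] * i for i in range(axis["size"])]


# Hand-written stand-in for one entry of the list returned by file_reader.
example = {
    "data": None,  # would be a numpy.ndarray or dask.array.Array
    "axes": [
        {"name": "x", "units": "nm", "index_in_array": 0,
         "size": 4, "offset": 0.0, "scale": 0.5},
    ],
    "metadata": {},
    "original_metadata": {},
}

x_coordinates = axis_values(example["axes"][0])
```

With a real file, the same helper could be applied to each entry of `file_reader('file.mrcz')[0]['axes']`.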
- rsciio.mrcz.file_writer(filename, signal, endianess='<', do_async=False, compressor=None, clevel=1, n_threads=None)
Write signal to MRCZ format.
- Parameters:
  - filename : str, pathlib.Path
    Filename of the file to write to or corresponding pathlib.Path.
  - signal : dict
    Dictionary containing the signal object. Should contain the following fields:
    - ‘data’ – multidimensional numpy array
    - ‘axes’ – list of dictionaries describing the axes containing the fields ‘name’, ‘units’, ‘index_in_array’, and either ‘size’, ‘offset’, and ‘scale’ or a numpy array ‘axis’ containing the full axes vector
    - ‘metadata’ – dictionary containing the metadata tree
  - endianess : str, default="<"
    "<" or ">", depending on how the bits are written to the file.
  - do_async : bool, default=False
    Currently supported within RosettaSciIO for writing only, this will save the file in a background thread and return immediately. Warning: there is no method currently implemented within RosettaSciIO to tell if an asynchronous write has finished.
  - compressor : {None, "zlib", "zstd", "lz4"}, default=None
    The compression codec.
  - clevel : int, default=1
    The compression level, an int from 1 to 9.
  - n_threads : int
    The number of threads to use for blosc compression. Defaults to the maximum number of virtual cores (including Intel Hyperthreading) on your system, which is recommended for best performance. If do_async=True, you may wish to leave one thread free for the Python GIL.
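The layout of the signal dictionary described above can be assembled as follows. This is a minimal sketch: `build_axes` and `build_signal` are hypothetical helpers, the axis names, units and scale are placeholder values, and a three-dimensional array is assumed. The call to file_writer is shown only as a comment because it requires numpy, rsciio and mrcz to be installed.

```python
def build_axes(shape, names=("z", "y", "x"), units="nm", scale=1.0, offset=0.0):
    """Build one uniform axis dictionary per array dimension (hypothetical helper)."""
    return [
        {
            "name": names[i],
            "units": units,
            "index_in_array": i,
            "size": size,
            "offset": offset,
            "scale": scale,
        }
        for i, size in enumerate(shape)
    ]


def build_signal(data, shape, metadata=None):
    """Assemble the dictionary layout described for the signal argument."""
    return {
        "data": data,
        "axes": build_axes(shape),
        "metadata": metadata if metadata is not None else {},
    }


# Usage sketch (requires numpy, rsciio and mrcz; not executed here):
# import numpy as np
# from rsciio.mrcz import file_writer
# data = np.zeros((4, 64, 64), dtype=np.float32)
# file_writer("file.mrcz", build_signal(data, data.shape), compressor="zstd")
```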
Notes
The recommended compression codec is zstd (zStandard) with clevel=1 for general use. If speed is critical, use lz4 (LZ4) with clevel=9. Integer data compresses more readily than floating-point data, and in general the histogram of values in the data reflects how compressible it is.
To save files that are compatible with other programs that can use MRC, such as GMS, IMOD, Relion, MotionCorr, etc., save with compressor=None and the extension .mrc. JSON metadata will not be recognized by other MRC-supporting software but should not cause crashes.
Examples
>>> from rsciio.mrcz import file_writer
>>> file_writer('file.mrcz', signal, do_async=True, compressor='zstd', clevel=1)
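The recommendations in the notes can be collected into a small helper that returns keyword arguments for file_writer. This is a sketch only: `suggest_writer_kwargs` and the preset names are hypothetical, and the thread count simply follows the suggestion to leave one thread free when writing asynchronously.

```python
import os


def suggest_writer_kwargs(goal="general", do_async=False):
    """Map the recommendations in the notes to file_writer keyword arguments."""
    presets = {
        "general": {"compressor": "zstd", "clevel": 1},
        "speed": {"compressor": "lz4", "clevel": 9},
        # Uncompressed output, intended to be saved with a .mrc extension.
        "compatibility": {"compressor": None},
    }
    kwargs = dict(presets[goal])
    kwargs["do_async"] = do_async
    if do_async:
        # Leave one thread free, as suggested for asynchronous writes.
        kwargs["n_threads"] = max(1, (os.cpu_count() or 1) - 1)
    return kwargs
```

The result could then be passed on as `file_writer('file.mrcz', signal, **suggest_writer_kwargs('general'))`.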