Source code for mojito.reader

r"""
Read files
==========

File contents
-------------

The Mojito L1 files contain the following quantities:

- Metadata regarding the L01 pipeline that produced it, such as the pipeline
  name and version (see :class:`MojitoL1File`)
- Time-delay interferometry (TDI) observables, namely single-link :math:`\eta`,
  Michelson XYZ and quasi-orthogonal AET variables (see :class:`TDI`)
- Estimates of the light travel times (LTTs) and their derivatives, used to
  build the response function (see :class:`LTT`)
- Estimates of the spacecraft orbits, including the spacecraft positions and
  velocities in the BCRS, used to build the response function (see
  :class:`Orbits`)
- Estimates of the overall instrumental noise in single-link :math:`\eta`,
  Michelson XYZ and quasi-orthogonal AET variables, as complex time- and
  frequency-dependent covariance matrices (see :class:`NoiseEstimates`)

Note that not all Mojito L1 files contain all the above quantities. Typically,

- A signal brick only contains the TDI, LTT, and orbits quantities,
- A noise brick contains all of the above quantities.

Files obtained by combining multiple bricks contain the quantities from all the
original bricks.

Reading a single file
---------------------

Use :class:`MojitoL1File` to read data from a Mojito L1 file.

Then, access the different quantities using the corresponding attributes.

.. code-block:: python

    from mojito import MojitoL1File

    with MojitoL1File("path/to/file.h5") as f:

        # TDI observables
        x2 = f.tdis.x2[:]  # TDI X2 observable in Hz
        y2 = f.tdis.y2[:]  # TDI Y2 observable in Hz
        z2 = f.tdis.z2[:]  # TDI Z2 observable in Hz

        # Complete set of TDI observables in Doppler units
        xyz = f.tdis.xyz_doppler  # TDI XYZ in Doppler units
        aet = f.tdis.aet_doppler  # TDI AET in Doppler units

Reading incomplete files
------------------------

Note that the reader does not perform any validation of the file contents. As a
consequence, it can be used to read incomplete Mojito L1 files, e.g., files that
only contain TDI observables without orbits or noise estimates.

To check whether a given quantity is present in the file, use the corresponding
:attr:`is_complete` attribute. For example, to check whether the file contains
the complete set of TDI observables, use ``f.tdis.is_complete``.

You can check the presence of all quantities using the
:attr:`MojitoL1File.is_complete` attribute, which is True if and only if all
groups and quantities are present in the file.
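For instance, a completeness check might look like the following sketch (the
file path is a placeholder):

.. code-block:: python

    from mojito import MojitoL1File

    with MojitoL1File("path/to/file.h5") as f:
        if not f.tdis.is_complete:
            print("Some TDI datasets or groups are missing")
        if f.is_complete:
            print("All expected groups and quantities are present")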

Reading multiple files
----------------------

The reader can also be used to read multiple Mojito L1 files at once, by passing
a list of file paths to :class:`MojitoL1File`. In this case, the reader will
look for the requested quantity in all files, and combine them appropriately if
they are present in multiple files. This is useful to read files obtained by
combining multiple bricks.

The combination operation depends on the quantity: for example, TDI
observables are summed across files, while quality flags are combined with a
logical OR operation.

.. code-block:: python

    from mojito import MojitoL1File

    with MojitoL1File(["file1.h5", "file2.h5"]) as f:

        # First 1000 samples of TDI X2 observable in Hz, summed across files
        x2 = f.tdis.x2[:1000]

        # First 1000 samples of TDI XYZ observables in Doppler units, summed
        # across files and lazily normalized by the laser frequency
        xyz = f.tdis.xyz_doppler[:1000]

        # First 1000 samples of quality flags for TDI XYZ observables
        xyz_flags = f.tdis.xyz_flags[:1000]

        # First 1000 samples of TDI time grid (must be consistent across files)
        time = f.tdis.time_sampling.time[:1000]

All operations are performed lazily, i.e., only the requested slice of data is
read from disk and combined when the corresponding attribute is accessed. See
:mod:`mojito.lazy` for more details on the lazy dataset classes used for this.
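To illustrate the lazy-combination semantics, here is a simplified stand-in
(not mojito's actual :class:`~mojito.lazy.LazySumDataset` implementation):
nothing is combined until a slice is requested, and only that slice of each
underlying dataset is read and summed.

```python
import numpy as np


class LazySum:
    """Minimal sketch of a lazy element-wise sum over array-like datasets."""

    def __init__(self, datasets):
        self.datasets = datasets

    def __getitem__(self, key):
        # Only the requested slice of each dataset is read and combined
        result = self.datasets[0][key]
        for dataset in self.datasets[1:]:
            result = result + dataset[key]
        return result


signal = np.arange(5.0)  # e.g., TDI X2 from a signal brick
noise = np.full(5, 0.1)  # e.g., TDI X2 from a noise brick
combined = LazySum([signal, noise])
print(combined[:3])  # sums only the first three samples of each dataset
```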

.. warning::

    Note that *some* consistency checks are available, but they are performed
    when accessing specific properties or by calling
    :meth:`MojitoL1File.check_consistent`, not at file open time.
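For example, a combined file can be validated explicitly before use (the file
paths are placeholders):

.. code-block:: python

    from mojito import MojitoL1File

    with MojitoL1File(["file1.h5", "file2.h5"]) as f:
        # Raises ValueError if, e.g., time samplings differ across files
        f.tdis.check_consistent()
        # Quick check: compare time samplings only, without reading LTT data
        f.ltts.check_consistent(quick=True)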

Reference
---------

.. autoclass:: MojitoL1File
    :members:

.. autoclass:: TDI
    :members:

.. autoclass:: LTT
    :members:

.. autoclass:: Orbits
    :members:

.. autoclass:: NoiseEstimates
    :members:

"""

import logging
from functools import cached_property
from pathlib import Path
from types import TracebackType
from typing import Literal

from h5py import File, Group

from .lazy import (
    DatasetLike,
    LazyBooleanOrDataset,
    LazyScaledDataset,
    LazyStackedDataset,
    LazySumDataset,
)
from .sampling import LogUniformFrequencySampling, UniformTimeSampling
from .utils import _get_attrs, _get_datasets, _get_groups, assert_datasets_almost_equal

logger = logging.getLogger(__name__)


class TDI:
    """Provides access to TDI observables stored in Mojito L1 files.

    No consistency checks are performed across groups. Use the
    :meth:`check_consistent` method to check for consistency.

    Parameters
    ----------
    groups
        List of HDF5 groups containing TDI observables datasets.
    laser_frequency
        Approximate laser frequency used in the simulation [Hz].

    Attributes
    ----------
    groups
        List of HDF5 groups containing TDI observables datasets.
    DATASETS
        List of dataset names expected in each group.
    GROUPS
        List of group names expected in each group.

    Raises
    ------
    ValueError
        If there are no groups.
    """

    DATASETS = [
        "eta_12",
        "eta_23",
        "eta_31",
        "eta_13",
        "eta_32",
        "eta_21",
        "A2",
        "E2",
        "T2",
        "X2",
        "Y2",
        "Z2",
        "eta_flags",
        "tdi_flags",
    ]

    GROUPS = ["sampling"]

    def __init__(self, groups: list[Group], laser_frequency: float) -> None:
        self.groups = groups
        self._laser_frequency = laser_frequency

        # Check that there's at least one group
        if not self.groups:
            raise ValueError("At least one group is required")

    @cached_property
    def time_sampling(self) -> UniformTimeSampling:
        """Uniform time sampling of TDI observables."""
        sampling_groups = _get_groups("sampling", self.groups, require="one")
        return UniformTimeSampling.from_h5_group(sampling_groups[0])

    @cached_property
    def eta_12(self) -> DatasetLike:
        r"""TDI :math:`\eta_{12}` observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI :math:`\eta_{12}` observable.
        """
        datasets = _get_datasets("eta_12", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def eta_23(self) -> DatasetLike:
        r"""TDI :math:`\eta_{23}` observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI :math:`\eta_{23}` observable.
        """
        datasets = _get_datasets("eta_23", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def eta_31(self) -> DatasetLike:
        r"""TDI :math:`\eta_{31}` observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI :math:`\eta_{31}` observable.
        """
        datasets = _get_datasets("eta_31", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def eta_13(self) -> DatasetLike:
        r"""TDI :math:`\eta_{13}` observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI :math:`\eta_{13}` observable.
        """
        datasets = _get_datasets("eta_13", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def eta_32(self) -> DatasetLike:
        r"""TDI :math:`\eta_{32}` observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI :math:`\eta_{32}` observable.
        """
        datasets = _get_datasets("eta_32", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def eta_21(self) -> DatasetLike:
        r"""TDI :math:`\eta_{21}` observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI :math:`\eta_{21}` observable.
        """
        datasets = _get_datasets("eta_21", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def a2(self) -> DatasetLike:
        """TDI A2 observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI A2 observable.
        """
        datasets = _get_datasets("A2", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def e2(self) -> DatasetLike:
        """TDI E2 observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI E2 observable.
        """
        datasets = _get_datasets("E2", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def t2(self) -> DatasetLike:
        """TDI T2 observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI T2 observable.
        """
        datasets = _get_datasets("T2", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def x2(self) -> DatasetLike:
        """TDI X2 observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI X2 observable.
        """
        datasets = _get_datasets("X2", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def y2(self) -> DatasetLike:
        """TDI Y2 observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI Y2 observable.
        """
        datasets = _get_datasets("Y2", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def z2(self) -> DatasetLike:
        """TDI Z2 observable [Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            TDI Z2 observable.
        """
        datasets = _get_datasets("Z2", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def eta_flags(self) -> DatasetLike:
        r"""Quality flags applicable for all :math:`\eta` observables.

        Flags are either 0 (data can be safely used) or 1 (data should not be
        used, i.e., gap).

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Flags for :math:`\eta` observables.
        """
        datasets = _get_datasets("eta_flags", self.groups, require="one")
        return LazyBooleanOrDataset(datasets)

    @cached_property
    def xyz_flags(self) -> DatasetLike:
        """Quality flags applicable for all XYZ observables.

        Flags are either 0 (data can be safely used) or 1 (data should not be
        used, i.e., gap).

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Flags for XYZ observables.
        """
        datasets = _get_datasets("tdi_flags", self.groups, require="one")
        return LazyBooleanOrDataset(datasets)

    @cached_property
    def aet_flags(self) -> DatasetLike:
        """Quality flags applicable for all AET observables.

        Flags are either 0 (data can be safely used) or 1 (data should not be
        used, i.e., gap).

        .. note::

            In the current implementation, the same quality flags are used for
            both XYZ and AET observables, see the :attr:`xyz_flags` attribute.

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Flags for AET observables.
        """
        datasets = _get_datasets("tdi_flags", self.groups, require="one")
        return LazyBooleanOrDataset(datasets)

    @cached_property
    def eta_doppler(self) -> DatasetLike:
        r"""TDI :math:`\eta` observables in Doppler units [dimensionless].

        The single-link :math:`\eta` observables are ordered according to
        :data:`lisaconstants.indexing.MOSAS`.

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 6)`
            TDI :math:`\eta` observables in Doppler units.
        """
        stacked_eta = LazyStackedDataset(
            [
                self.eta_12,
                self.eta_23,
                self.eta_31,
                self.eta_13,
                self.eta_32,
                self.eta_21,
            ]
        )
        return LazyScaledDataset(stacked_eta, 1.0 / self._laser_frequency)

    @cached_property
    def xyz_doppler(self) -> DatasetLike:
        """TDI XYZ observables in Doppler units [dimensionless].

        XYZ are obtained by stacking and normalizing X2, Y2, Z2 by the
        approximate laser frequency. This is a good approximation of the
        Doppler observables.

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3)`
            TDI XYZ observables in Doppler units.
        """
        stacked_xyz = LazyStackedDataset([self.x2, self.y2, self.z2])
        return LazyScaledDataset(stacked_xyz, 1.0 / self._laser_frequency)

    @cached_property
    def aet_doppler(self) -> DatasetLike:
        """TDI AET observables in Doppler units [dimensionless].

        AET are obtained by stacking and normalizing A2, E2, T2 by the
        approximate laser frequency. This is a good approximation of the
        Doppler observables.

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3)`
            TDI AET observables in Doppler units.
        """
        stacked_aet = LazyStackedDataset([self.a2, self.e2, self.t2])
        return LazyScaledDataset(stacked_aet, 1.0 / self._laser_frequency)

    @property
    def is_complete(self) -> bool:
        """Check if all groups and datasets are present in each group."""
        return all(
            all(name in group for name in self.DATASETS) for group in self.groups
        ) and all(
            group_name in group for group in self.groups for group_name in self.GROUPS
        )

    def check_consistent(self) -> None:
        """Check if time samplings are equal across groups.

        Raises
        ------
        ValueError
            If time samplings are inconsistent across groups.
        """
        # Check that all groups have consistent time sampling
        sampling_groups = _get_groups("sampling", self.groups)
        samplings = [UniformTimeSampling.from_h5_group(g) for g in sampling_groups]
        sampling_set = set(samplings)
        if len(sampling_set) > 1:
            raise ValueError("Inconsistent time samplings across files")
class LTT:
    """Provides access to light travel times stored in Mojito L1 files.

    No consistency checks are performed across groups. In particular, we do
    not check that LTT estimates are identical across groups (we only return
    one). Use the :meth:`check_consistent` method to check for consistency.

    Parameters
    ----------
    groups
        List of HDF5 groups containing the LTT observables datasets.

    Attributes
    ----------
    groups
        List of HDF5 groups containing the LTT observables datasets.
    DATASETS
        List of dataset names expected in each group.
    GROUPS
        List of group names expected in each group.

    Raises
    ------
    ValueError
        If there are no groups.
    """

    DATASETS = [
        "ltt_12",
        "ltt_23",
        "ltt_31",
        "ltt_13",
        "ltt_32",
        "ltt_21",
        "ltt_derivative_12",
        "ltt_derivative_23",
        "ltt_derivative_31",
        "ltt_derivative_13",
        "ltt_derivative_32",
        "ltt_derivative_21",
    ]

    GROUPS = ["sampling"]

    def __init__(self, groups: list[Group]) -> None:
        self.groups = groups

        # Check that there's at least one group
        if not self.groups:
            raise ValueError("At least one group is required")

    @cached_property
    def time_sampling(self) -> UniformTimeSampling:
        """Uniform time sampling of LTT observables."""
        sampling_groups = _get_groups("sampling", self.groups, require="one")
        return UniformTimeSampling.from_h5_group(sampling_groups[0])

    @cached_property
    def ltt_12(self) -> DatasetLike:
        """Improved estimate of LTT 12 [s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT 12.
        """
        datasets = _get_datasets("ltt_12", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_23(self) -> DatasetLike:
        """Improved estimate of LTT 23 [s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT 23.
        """
        datasets = _get_datasets("ltt_23", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_31(self) -> DatasetLike:
        """Improved estimate of LTT 31 [s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT 31.
        """
        datasets = _get_datasets("ltt_31", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_13(self) -> DatasetLike:
        """Improved estimate of LTT 13 [s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT 13.
        """
        datasets = _get_datasets("ltt_13", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_32(self) -> DatasetLike:
        """Improved estimate of LTT 32 [s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT 32.
        """
        datasets = _get_datasets("ltt_32", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_21(self) -> DatasetLike:
        """Improved estimate of LTT 21 [s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT 21.
        """
        datasets = _get_datasets("ltt_21", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_derivative_12(self) -> DatasetLike:
        """Improved estimate of LTT derivative 12 [s/s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT derivative 12.
        """
        datasets = _get_datasets("ltt_derivative_12", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_derivative_23(self) -> DatasetLike:
        """Improved estimate of LTT derivative 23 [s/s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT derivative 23.
        """
        datasets = _get_datasets("ltt_derivative_23", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_derivative_31(self) -> DatasetLike:
        """Improved estimate of LTT derivative 31 [s/s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT derivative 31.
        """
        datasets = _get_datasets("ltt_derivative_31", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_derivative_13(self) -> DatasetLike:
        """Improved estimate of LTT derivative 13 [s/s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT derivative 13.
        """
        datasets = _get_datasets("ltt_derivative_13", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_derivative_32(self) -> DatasetLike:
        """Improved estimate of LTT derivative 32 [s/s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT derivative 32.
        """
        datasets = _get_datasets("ltt_derivative_32", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltt_derivative_21(self) -> DatasetLike:
        """Improved estimate of LTT derivative 21 [s/s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size,)`
            Improved estimate of LTT derivative 21.
        """
        datasets = _get_datasets("ltt_derivative_21", self.groups, require="one")
        return datasets[0]

    @cached_property
    def ltts(self) -> DatasetLike:
        """Improved estimates of all LTTs [s].

        Links are ordered according to :data:`lisaconstants.indexing.LINKS`.

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 6)`
            Improved estimates of all LTTs.
        """
        return LazyStackedDataset(
            [
                self.ltt_12,
                self.ltt_23,
                self.ltt_31,
                self.ltt_13,
                self.ltt_32,
                self.ltt_21,
            ]
        )

    @cached_property
    def ltt_derivatives(self) -> DatasetLike:
        """Improved estimates of all LTT derivatives [s/s].

        Links are ordered according to :data:`lisaconstants.indexing.LINKS`.

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 6)`
            Improved estimates of all LTT derivatives.
        """
        return LazyStackedDataset(
            [
                self.ltt_derivative_12,
                self.ltt_derivative_23,
                self.ltt_derivative_31,
                self.ltt_derivative_13,
                self.ltt_derivative_32,
                self.ltt_derivative_21,
            ]
        )

    @property
    def is_complete(self) -> bool:
        """Check if all groups and datasets are present in each group."""
        return all(
            all(name in group for name in self.DATASETS) for group in self.groups
        ) and all(
            group_name in group for group in self.groups for group_name in self.GROUPS
        )

    def check_consistent(self, *, quick: bool = False, chunk: int = 100_000) -> None:
        """Check if time samplings and LTT estimates are equal across groups.

        This can be an expensive operation, since LTT estimates need to be
        read from disk. If you only want to check that time samplings are
        consistent across groups, set ``quick=True`` to only check time
        samplings.

        To limit memory usage when checking LTT estimates, you can set the
        ``chunk`` size to read and compare the LTT estimates in smaller
        chunks.

        Parameters
        ----------
        quick
            Whether to only check time samplings for consistency, without
            checking that LTT estimates are identical across groups. This is
            a much faster check, since it does not require reading LTT
            estimates from disk, but it is less strict.
        chunk
            Chunk size to use when checking LTT estimates for consistency.
            This limits memory usage when checking large datasets.

        Raises
        ------
        ValueError
            If time samplings are inconsistent across groups, or if LTT
            estimates are inconsistent across groups when ``quick=False``.
        """
        # Check that all groups have consistent time sampling
        sampling_groups = _get_groups("sampling", self.groups)
        samplings = [UniformTimeSampling.from_h5_group(g) for g in sampling_groups]
        sampling_set = set(samplings)
        if len(sampling_set) > 1:
            raise ValueError("Inconsistent time samplings across files")

        # Do not go further if only a quick check is requested
        if quick:
            return

        # Check that all groups have identical LTT estimates (if not quick)
        for name in self.DATASETS:
            datasets = _get_datasets(name, self.groups, require="one")
            assert_datasets_almost_equal(datasets, chunk=chunk)
class Orbits:
    """Provides access to spacecraft orbits stored in Mojito L1 files.

    No consistency checks are performed across groups. In particular, we do
    not check that orbit estimates are identical across groups (we only
    return one). Use the :meth:`check_consistent` method to check for
    consistency.

    Parameters
    ----------
    groups
        List of HDF5 groups containing the orbits datasets.

    Attributes
    ----------
    groups
        List of HDF5 groups containing the orbits datasets.
    DATASETS
        List of dataset names expected in each group.
    GROUPS
        List of group names expected in each group.

    Raises
    ------
    ValueError
        If there are no groups.
    """

    DATASETS = [
        "sc_position_1",
        "sc_position_2",
        "sc_position_3",
        "sc_velocity_1",
        "sc_velocity_2",
        "sc_velocity_3",
    ]

    GROUPS = ["sampling"]

    def __init__(self, groups: list[Group]) -> None:
        self.groups = groups

        # Check that there's at least one group
        if not self.groups:
            raise ValueError("At least one group is required")

    @cached_property
    def time_sampling(self) -> UniformTimeSampling:
        """Uniform time sampling of the orbits."""
        sampling_groups = _get_groups("sampling", self.groups, require="one")
        return UniformTimeSampling.from_h5_group(sampling_groups[0])

    @cached_property
    def position_1(self) -> DatasetLike:
        """Spacecraft 1 position in BCRS [m].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3)`
            Spacecraft 1 position.
        """
        datasets = _get_datasets("sc_position_1", self.groups, require="one")
        return datasets[0]

    @cached_property
    def position_2(self) -> DatasetLike:
        """Spacecraft 2 position in BCRS [m].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3)`
            Spacecraft 2 position.
        """
        datasets = _get_datasets("sc_position_2", self.groups, require="one")
        return datasets[0]

    @cached_property
    def position_3(self) -> DatasetLike:
        """Spacecraft 3 position in BCRS [m].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3)`
            Spacecraft 3 position.
        """
        datasets = _get_datasets("sc_position_3", self.groups, require="one")
        return datasets[0]

    @cached_property
    def velocity_1(self) -> DatasetLike:
        """Spacecraft 1 velocity in BCRS [m/s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3)`
            Spacecraft 1 velocity.
        """
        datasets = _get_datasets("sc_velocity_1", self.groups, require="one")
        return datasets[0]

    @cached_property
    def velocity_2(self) -> DatasetLike:
        """Spacecraft 2 velocity in BCRS [m/s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3)`
            Spacecraft 2 velocity.
        """
        datasets = _get_datasets("sc_velocity_2", self.groups, require="one")
        return datasets[0]

    @cached_property
    def velocity_3(self) -> DatasetLike:
        """Spacecraft 3 velocity in BCRS [m/s].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3)`
            Spacecraft 3 velocity.
        """
        datasets = _get_datasets("sc_velocity_3", self.groups, require="one")
        return datasets[0]

    @cached_property
    def positions(self) -> DatasetLike:
        """Positions of all spacecraft [m].

        Spacecraft are ordered as :data:`lisaconstants.indexing.SPACECRAFT`.

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3, 3)`
            Spacecraft positions. The second axis indexes the spacecraft, and
            the third axis indexes the Cartesian components.
        """
        return LazyStackedDataset(
            [
                self.position_1,
                self.position_2,
                self.position_3,
            ],
            axis=1,
        )

    @cached_property
    def velocities(self) -> DatasetLike:
        """Velocities of all spacecraft [m/s].

        Spacecraft are ordered as :data:`lisaconstants.indexing.SPACECRAFT`.

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, 3, 3)`
            Spacecraft velocities. The second axis indexes the spacecraft,
            and the third axis indexes the Cartesian components.
        """
        return LazyStackedDataset(
            [
                self.velocity_1,
                self.velocity_2,
                self.velocity_3,
            ],
            axis=1,
        )

    @property
    def is_complete(self) -> bool:
        """Check if all groups and datasets are present in each group."""
        return all(
            all(name in group for name in self.DATASETS) for group in self.groups
        ) and all(
            group_name in group for group in self.groups for group_name in self.GROUPS
        )

    def check_consistent(self, *, quick: bool = False, chunk: int = 100_000) -> None:
        """Check if time samplings and orbit estimates are equal across groups.

        This can be an expensive operation, since orbit estimates need to be
        read from disk. If you only want to check that time samplings are
        consistent across groups, set ``quick=True`` to only check time
        samplings.

        To limit memory usage when checking orbit estimates, you can set the
        ``chunk`` size to read and compare the orbit estimates in smaller
        chunks.

        Parameters
        ----------
        quick
            Whether to only check time samplings for consistency, without
            checking that orbit estimates are identical across groups. This
            is a much faster check, since it does not require reading orbit
            estimates from disk, but it is less strict.
        chunk
            Chunk size to use when checking orbit estimates for consistency.
            This limits memory usage when checking large datasets.

        Raises
        ------
        ValueError
            If time samplings are inconsistent across groups, or if orbit
            estimates are inconsistent across groups when ``quick=False``.
        """
        # Check that all groups have consistent time sampling
        sampling_groups = _get_groups("sampling", self.groups)
        samplings = [UniformTimeSampling.from_h5_group(g) for g in sampling_groups]
        sampling_set = set(samplings)
        if len(sampling_set) > 1:
            raise ValueError("Inconsistent time samplings across files")

        # Do not go further if only a quick check is requested
        if quick:
            return

        # Check that all groups have identical orbit estimates (if not quick)
        for name in self.DATASETS:
            datasets = _get_datasets(name, self.groups, require="one")
            assert_datasets_almost_equal(datasets, chunk=chunk)
class NoiseEstimates:
    """Provides access to noise estimates stored in Mojito L1 files.

    No consistency checks are performed across groups. Use the
    :meth:`check_consistent` method to check for consistency.

    Parameters
    ----------
    groups
        List of HDF5 groups containing the noise estimates datasets.

    Attributes
    ----------
    groups
        List of HDF5 groups containing the noise estimates datasets.
    DATASETS
        List of dataset names expected in each group.
    GROUPS
        List of group names expected in each group.

    Raises
    ------
    ValueError
        If there are no groups.
    """

    DATASETS = [
        "XYZ",
        "AET",
        "eta",
    ]

    GROUPS = [
        "sampling",
        "log_frequency_sampling",
    ]

    def __init__(self, groups: list[Group]) -> None:
        self.groups = groups

        # Check that there's at least one group
        if not self.groups:
            raise ValueError("At least one group is required")

    @cached_property
    def time_sampling(self) -> UniformTimeSampling:
        """Uniform time sampling of the noise estimates."""
        sampling_groups = _get_groups("sampling", self.groups, require="one")
        return UniformTimeSampling.from_h5_group(sampling_groups[0])

    @cached_property
    def freq_sampling(self) -> LogUniformFrequencySampling:
        """Log-uniform frequency sampling of the noise estimates."""
        sampling_groups = _get_groups(
            "log_frequency_sampling", self.groups, require="one"
        )
        return LogUniformFrequencySampling.from_h5_group(sampling_groups[0])

    @cached_property
    def xyz(self) -> DatasetLike:
        """Noise covariance estimate for TDI XYZ [Hz^2/Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, freq_sampling.size, 3, 3)`
            Noise covariance estimate for TDI XYZ.
        """
        datasets = _get_datasets("XYZ", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def aet(self) -> DatasetLike:
        """Noise covariance estimate for TDI AET [Hz^2/Hz].

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, freq_sampling.size, 3, 3)`
            Noise covariance estimate for TDI AET.
        """
        datasets = _get_datasets("AET", self.groups, require="one")
        return LazySumDataset(datasets)

    @cached_property
    def eta(self) -> DatasetLike:
        r"""Noise covariance estimate for single-link :math:`\eta` [Hz^2/Hz].

        The single-link :math:`\eta` observables are ordered according to
        :data:`lisaconstants.indexing.MOSAS`.

        Returns
        -------
        `DatasetLike of shape (time_sampling.size, freq_sampling.size, 6, 6)`
            Noise covariance estimate for single-link :math:`\eta`.
        """
        datasets = _get_datasets("eta", self.groups, require="one")
        return LazySumDataset(datasets)

    @property
    def is_complete(self) -> bool:
        """Check if all groups and datasets are present in each group."""
        return all(
            all(name in group for name in self.DATASETS) for group in self.groups
        ) and all(
            group_name in group for group in self.groups for group_name in self.GROUPS
        )

    def check_consistent(self) -> None:
        """Check if time and frequency samplings are equal across groups."""
        # Check that all groups have consistent time sampling
        sampling_groups = _get_groups("sampling", self.groups)
        samplings = [UniformTimeSampling.from_h5_group(g) for g in sampling_groups]
        sampling_set = set(samplings)
        if len(sampling_set) > 1:
            raise ValueError("Inconsistent time samplings across files")

        # Check that all groups have consistent frequency sampling
        freq_sampling_groups = _get_groups("log_frequency_sampling", self.groups)
        freq_samplings = [
            LogUniformFrequencySampling.from_h5_group(g) for g in freq_sampling_groups
        ]
        freq_sampling_set = set(freq_samplings)
        if len(freq_sampling_set) > 1:
            raise ValueError("Inconsistent frequency samplings across files")
FileOpenMode = Literal["r", "r+", "a", "w", "w-"]
[docs] class MojitoL1File: """Provides access to Mojito L1 files. Can open a single Mojito L1 file or combine multiple files lazily, aggregating datasets via the correct mathematical operations. >>> with MojitoL1File(["file1.h5", "file2.h5"]) as f: ... etas = f.tdis.eta_doppler[:] ... times = f.tdis.time_sampling.t() Multiple files can only be accessed in read-only mode, as data is aggregated. However, single files can be opened in write mode (using the ``mode`` parameter) to create or modify Mojito L1 files. Check :mod:`writer` for more details on how to create Mojito L1 files. .. warning:: No consistency checks are performed across files, so users must ensure that combined files are compatible (e.g., time samplings must be identical, orbits and LTTs must be consistent, laser frequencies must be identical, etc.). All quantities are accessed lazily. As a consequence, it is possible to read incomplete files (e.g., files missing some groups, attributes or datasets). Errors will only be raised when trying to access missing quantities. Use the :attr:`is_complete` property of each data class (e.g., :attr:`TDI.is_complete`) to check if all expected datasets are present in each group. The global :attr:`MojitoL1File.is_complete` property checks if all expected datasets are present in all expected groups. Note that most attributes are cached for efficiency. They will become invalid after closing the file, so you must not access them then. Parameters ---------- paths Path or list of paths to Mojito L1 files. If a list is provided, files are combined lazily. mode File open mode (default "r" for read-only). Must be "r" for multi-file input, as multi-file aggregation is only supported in read-only mode. Attributes ---------- files The underlying HDF5 file objects. Raises ------ ValueError If no paths are provided or if multi-file input is used with a mode other than "r". 
""" def __init__( self, paths: str | Path | list[str | Path], mode: FileOpenMode = "r", ) -> None: # Normalize paths to list if isinstance(paths, (str, Path)): path_list = [str(paths)] else: path_list = [str(p) for p in paths] # Check that at least one path is provided if not path_list: raise ValueError("At least one path must be provided") # Multi-file always uses read-only mode if len(path_list) > 1 and mode != "r": raise ValueError("Multi-file input must use read-only mode") # Use a try-except block to ensure that all opened files are closed if # an error occurs during initialization (e.g., a file cannot be opened) self.files: list[File] = [] try: # Open all files for path in path_list: f = File(path, mode=mode) self.files.append(f) except Exception: for f in self.files: f.close() raise def __enter__(self) -> "MojitoL1File": return self def __exit__( self, exc_type: type[BaseException] | None, exc_val: BaseException | None, exc_tb: TracebackType | None, ) -> None: """Close the files when exiting the context manager.""" self.close()
[docs]
    def close(self) -> None:
        """Close all open files."""
        for f in self.files:
            f.close()
    @property
    def is_combined(self) -> bool:
        """Check if this MojitoL1File combines multiple files."""
        return len(self.files) > 1
[docs]
    @cached_property
    def pipeline_names(self) -> list[str]:
        """Name of the pipeline for each file."""
        pipeline_names = _get_attrs("pipeline_name", self.files, cast=str)
        return pipeline_names
[docs]
    @cached_property
    def lolipops_versions(self) -> list[str]:
        """Versions of lolipops used to generate the files."""
        versions = _get_attrs("lolipops_version", self.files, cast=str)
        return versions
[docs]
    @cached_property
    def lolipops_version(self) -> str:
        """Version of lolipops used to generate the files."""
        self._check_lolipops_versions_consistent()
        return self.lolipops_versions[0]
[docs]
    @cached_property
    def laser_frequencies(self) -> list[float]:
        """Laser frequencies used in the simulations [Hz]."""
        laser_frequencies = _get_attrs("laser_frequency", self.files, cast=float)
        return laser_frequencies
[docs]
    @cached_property
    def laser_frequency(self) -> float:
        """Laser frequency used in the simulation [Hz]."""
        self._check_laser_frequencies_consistent()
        return self.laser_frequencies[0]
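The attribute accessors above all delegate to the module-private ``_get_attrs`` helper, whose definition is not shown here. Assuming it reads a named attribute from each file and applies a cast, its behavior can be approximated with plain dictionaries standing in for HDF5 files (``get_attrs_sketch`` is a hypothetical stand-in, not the real helper, which would read from ``h5py.File.attrs``):

```python
from typing import Callable, TypeVar

T = TypeVar("T")


def get_attrs_sketch(
    name: str,
    files: list[dict],
    cast: Callable[[object], T],
) -> list[T]:
    """Read attribute ``name`` from each file-like mapping and cast it.

    A missing attribute raises KeyError, mirroring the lazy-access behavior
    described in the class docstring: errors surface only on access.
    """
    return [cast(f[name]) for f in files]
```

This also shows why the plural accessors (``laser_frequencies``) return one value per file: each underlying file carries its own copy of the attribute, and consistency is checked separately.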
[docs]
    @cached_property
    def tdis(self) -> TDI:
        r"""Time-delay interferometry observables and quality flags.

        We provide the second-generation Michelson combinations XYZ, the
        quasi-orthogonal second-generation channels AET, and the six
        single-link :math:`\eta` observables. We also provide quality flags
        for the :math:`\eta`, XYZ, and AET observables.
        """
        tdi_groups = _get_groups("tdis", self.files, require="one")
        return TDI(tdi_groups, self.laser_frequency)
[docs]
    @cached_property
    def ltts(self) -> LTT:
        r"""Light travel time estimates (and derivatives).

        We provide improved estimates of the six one-way light travel times
        between the three spacecraft, including relativistic corrections, as
        well as their time derivatives.
        """
        ltt_groups = _get_groups("ltts", self.files, require="one")
        return LTT(ltt_groups)
[docs]
    @cached_property
    def orbits(self) -> Orbits:
        r"""Spacecraft orbits.

        We provide the positions and velocities of the three spacecraft in
        the Barycentric Celestial Reference System (BCRS).
        """
        orbits_groups = _get_groups("orbits", self.files, require="one")
        return Orbits(orbits_groups)
[docs]
    @cached_property
    def noise_estimates(self) -> NoiseEstimates:
        r"""Estimated noise covariance matrices.

        We provide estimates for TDI XYZ, TDI AET, and single-link
        :math:`\eta`. Assuming local stationarity, the noise covariance
        matrix depends on time and frequency only. Noise estimates are
        stored as diagonal covariance matrices.

        We provide both strain noise (i.e., noise at the test masses) and
        readout noise (i.e., electronics noise added to test mass
        measurements).
        """
        noise_estimates_groups = _get_groups(
            "noise_estimates", self.files, require="one"
        )
        return NoiseEstimates(noise_estimates_groups)
    @property
    def is_complete(self) -> bool:
        """Check that the HDF5 file is a complete L1 file."""
        try:
            return (
                self.tdis.is_complete
                and self.ltts.is_complete
                and self.orbits.is_complete
                and self.noise_estimates.is_complete
            )
        except KeyError:
            return False
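The ``is_complete`` property illustrates a consequence of lazy access: a missing group only surfaces as a ``KeyError`` when it is first touched, and the property converts that error into ``False`` rather than letting it propagate. The control flow can be reproduced with a toy container (all names here are illustrative, not part of the mojito API):

```python
class GroupSketch:
    """Toy group carrying only a completeness flag."""

    def __init__(self, complete: bool) -> None:
        self.is_complete = complete


class FileSketch:
    """Toy file whose groups may be missing entirely."""

    EXPECTED = ("tdis", "ltts", "orbits", "noise_estimates")

    def __init__(self, groups: dict[str, GroupSketch]) -> None:
        self._groups = groups

    def group(self, name: str) -> GroupSketch:
        return self._groups[name]  # raises KeyError when the group is absent

    @property
    def is_complete(self) -> bool:
        try:
            # A missing group raises KeyError on access; an incomplete group
            # reports False itself. Both cases yield an incomplete file.
            return all(self.group(n).is_complete for n in self.EXPECTED)
        except KeyError:
            return False
```

This mirrors the distinction in the class docstring between a signal brick (some groups absent) and a noise brick (all groups present).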
[docs]
    def check_consistent(self) -> None:
        """Check that all files are consistent.

        We check that all groups are consistent, and that laser frequencies
        and versions are identical across files.
        """
        self.tdis.check_consistent()
        self.ltts.check_consistent()
        self.orbits.check_consistent()
        self.noise_estimates.check_consistent()
        self._check_laser_frequencies_consistent()
        self._check_lolipops_versions_consistent()
    def _check_laser_frequencies_consistent(self) -> None:
        """Check that laser frequencies are identical across files.

        Raises
        ------
        AssertionError
            If laser frequencies are not identical across files.
        """
        laser_freq_set = set(self.laser_frequencies)
        if len(laser_freq_set) > 1:
            raise AssertionError("Inconsistent laser frequencies across files")

    def _check_lolipops_versions_consistent(self) -> None:
        """Check that lolipops versions are identical across files.

        Raises
        ------
        AssertionError
            If lolipops versions are not identical across files.
        """
        version_set = set(self.lolipops_versions)
        if len(version_set) > 1:
            raise AssertionError("Inconsistent lolipops versions across files")
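Both private checks rely on the same set-cardinality idiom: a list of per-file values is consistent exactly when it collapses to a single element under ``set()``. A generic version of that idiom (a sketch for illustration, not part of the module):

```python
from typing import Sequence


def check_identical(values: Sequence[object], what: str) -> None:
    """Raise AssertionError unless all values are identical across files.

    Mirrors the idiom used by the private consistency checks: build a set
    from the per-file values and reject any cardinality above one. An empty
    sequence passes trivially, which is safe here because the constructor
    guarantees at least one file.
    """
    if len(set(values)) > 1:
        raise AssertionError(f"Inconsistent {what} across files")
```

Note that this compares values for exact equality, which is appropriate for version strings; for floating-point attributes such as laser frequencies it assumes the files store bit-identical values.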