pyprobe.result#

A module for the Result class.

Functions

combine_results(results[, concat_method])

Combine multiple Result objects into a single Result object.

Classes

Result(*, lf, info[, column_definitions])

A class for holding any data in PyProBE.

class Result(*, lf, info, column_definitions=<factory>)#

Bases: BaseModel

A class for holding any data in PyProBE.

A Result object is the base type for every data object in PyProBE. This class includes all of the main methods for returning and describing any data in PyProBE.

Key attributes for returning data:

data: The data as a Polars DataFrame.
get(): Get a column from the data as a NumPy array.

Key attributes for describing the data:

info: A dictionary containing information about the cell.
column_definitions: A dictionary of column definitions.
print_definitions(): Print the column definitions.
columns: A list of column names.

Parameters:

lf (LazyFrame)
info (dict[str, Any | None])
column_definitions (dict[str, str])

info: dict[str, Any | None]#: Dictionary containing information about the cell.

column_definitions: dict[str, str]#: A dictionary containing the definitions of the columns in the data.

collect()#

Collect the lazy dataframe into a polars DataFrame.

Use this method to resolve the lazy computations in the Result object. This can improve performance if you are reading a large amount of data from disk and will access the data multiple times.

Returns:: The collected dataframe.
Return type:: pl.DataFrame

property columns: list[str]#

The columns in the data.

Returns:: The columns in the data.
Return type:: List[str]

property quantities: list[str]#

The quantities of the data, with unit information removed.

Returns:: The quantities of the data.
Return type:: List[str]
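
As a rough illustration of "unit information removed", the sketch below strips a trailing bracketed unit from column names. It assumes the "Quantity [unit]" naming seen in examples such as "Time [s]" and "Current [A]". The helper names strip_unit and unique_quantities are hypothetical, and the de-duplication step is an assumption rather than documented behaviour.

```python
import re

# Assumed naming convention: "Quantity [unit]", e.g. "Current [A]".
_UNIT_SUFFIX = re.compile(r"\s*\[[^\]]*\]\s*$")


def strip_unit(column_name: str) -> str:
    """Return the column name without a trailing "[unit]" suffix."""
    return _UNIT_SUFFIX.sub("", column_name)


def unique_quantities(columns: list[str]) -> list[str]:
    """Strip units and drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(strip_unit(name) for name in columns))


columns = ["Time [s]", "Current [A]", "Current [mA]", "Step"]
print(unique_quantities(columns))  # ['Time', 'Current', 'Step']
```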

property df: DataFrame#

Return the data as a Polars DataFrame.

Returns:: The data as a Polars DataFrame.
Return type:: pl.DataFrame

check_columns(columns)#

Check whether a column exists in the data.

If a requested quantity exists in the data with a different unit, the units are converted.

Parameters:: columns (List[str]) – The columns to check.
Raises:: ValueError – If a column does not exist in the data.
Return type:: None

property data: DataFrame#

Return the data as a polars DataFrame.

Returns:: The data as a polars DataFrame.
Return type:: pl.DataFrame
Raises:: ValueError – If no data exists for this filter.

plot()#

Plot the data using the pandas plot method.

Call this method on a Result object in the same way you would call the pandas plot method on a DataFrame. For example:

result.plot(x="Time [s]", y="Current [A]")

Refer to the pandas documentation for detailed information and examples.

Parameters:: data (Series | DataFrame)
Return type:: None

hvplot(custom_plots={}, **metadata)#

HvPlot is a library for creating fast and interactive plots. This method requires the hvplot library to be installed as an optional dependency. You can install it with PyProBE by running pip install 'PyProBE-Data[hvplot]', or install it separately with pip install hvplot.

The default backend is bokeh, which can be changed by setting the backend with hvplot.extension('matplotlib') or hvplot.extension('plotly').

Example usage:

result.hvplot(x="Time [s]", y="Current [A]", kind="scatter")

This method is not compatible with the inline syntax for hvplot: result.hvplot.scatter(...).

See the hvplot documentation for information and examples.

get(*column_names)#

Return one or more columns of the data as separate 1D numpy arrays.

Parameters:

column_names (str) – The column name(s) to return.

Returns:

The column(s) as numpy array(s).

Return type:

Union[NDArray[np.float64], Tuple[NDArray[np.float64], …]]

Raises:

ValueError – If no column names are provided.
ValueError – If a column name is not in the data.
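
The return type above suggests that a single column name yields one array, while several names yield a tuple that can be unpacked. The plain-Python sketch below mimics that contract with lists standing in for NumPy arrays. The function get_columns is a hypothetical stand-in, not PyProBE's implementation, and it leaves out any unit conversion.

```python
from typing import Union

Column = list[float]


def get_columns(
    data: dict[str, Column], *column_names: str
) -> Union[Column, tuple[Column, ...]]:
    """Return one column, or a tuple of columns, mirroring the documented errors."""
    if not column_names:
        raise ValueError("At least one column name must be provided.")
    missing = [name for name in column_names if name not in data]
    if missing:
        raise ValueError(f"Column(s) not in data: {missing}")
    selected = tuple(data[name] for name in column_names)
    # A single name returns the column itself rather than a 1-tuple (assumed).
    return selected[0] if len(selected) == 1 else selected


data = {"Time [s]": [0.0, 1.0], "Current [A]": [0.5, 0.4]}
time = get_columns(data, "Time [s]")
time, current = get_columns(data, "Time [s]", "Current [A]")
```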

get_only(column_name)#

Deprecated since version 1.2.0: The get_only method is deprecated. Use the get method instead.

Parameters:: column_name (str)
Return type:: ndarray[tuple[Any, …], dtype[float64]]

define_column(column_name, definition)#

Define a new column when it is added to the dataframe.

Parameters:

column_name (str) – The name of the column.
definition (str) – The definition of the quantity stored in the column.

Return type:

None

print_definitions()#

Print the definitions of the columns stored in this result object.

Return type:: None

clean_copy(dataframe=None, column_definitions=None)#

Create a copy of the result object with info dictionary but without data.

Parameters:

dataframe (Optional[Union[pl.DataFrame, pl.LazyFrame]]) – The data to include in the new Result object.
column_definitions (Optional[dict[str, str]]) – The definitions of the columns in the new result object.

Returns:

A new result object with the specified data.

Return type:

Result

load_external_file(filepath)#

Load an external file into a LazyFrame.

Supported file types are CSV, Parquet, and Excel. For maximum performance, consider using Parquet files. If you have an Excel file, consider converting it to CSV before loading.

Parameters:: filepath (str) – The path to the external file.
Return type:: LazyFrame

add_data(new_data, date_column_name, datetime_format=None, importing_columns=None, existing_data_timezone=None, new_data_timezone=None, align_on=None, join_strategy='keep_existing', fill_strategy='interpolate')#

Add new data columns to the result object.

The data must be time series data with a date column. The new data is joined to the base dataframe on the date column. Choose which dates to keep with the join strategy, and how to fill missing values with the fill strategy.
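
To make the join and fill strategies concrete, here is a plain-Python sketch of their semantics on a small time series. It is an illustration only, not how PyProBE implements add_data. The helpers joined_dates and fill_missing are hypothetical, and leaving gaps without a usable neighbour as None is an assumption.

```python
from datetime import datetime, timedelta


def joined_dates(existing, new, join_strategy="keep_existing"):
    """Return the sorted dates kept by each documented join strategy."""
    if join_strategy == "keep_existing":
        return sorted(existing)
    if join_strategy == "keep_new":
        return sorted(new)
    if join_strategy == "keep_both":
        return sorted(set(existing) | set(new))
    raise ValueError(f"Unknown join strategy: {join_strategy!r}")


def fill_missing(dates, values, fill_strategy="interpolate"):
    """Fill None entries in values; gaps with no usable neighbour stay None."""
    if fill_strategy not in ("interpolate", "forward_fill", "backward_fill", None):
        raise ValueError(f"Unknown fill strategy: {fill_strategy!r}")
    filled = list(values)
    if fill_strategy is None:
        return filled
    known = [i for i, v in enumerate(values) if v is not None]
    for i, value in enumerate(values):
        if value is not None:
            continue
        before = max((k for k in known if k < i), default=None)
        after = min((k for k in known if k > i), default=None)
        if fill_strategy == "forward_fill" and before is not None:
            filled[i] = values[before]
        elif fill_strategy == "backward_fill" and after is not None:
            filled[i] = values[after]
        elif fill_strategy == "interpolate" and None not in (before, after):
            # Linear interpolation weighted by elapsed time between neighbours.
            span = (dates[after] - dates[before]).total_seconds()
            elapsed = (dates[i] - dates[before]).total_seconds()
            filled[i] = values[before] + (values[after] - values[before]) * elapsed / span
    return filled


start = datetime(2024, 1, 1)
existing_dates = [start + timedelta(seconds=s) for s in (0, 10, 20)]
new_temperature = {
    start: 20.0,
    start + timedelta(seconds=20): 22.0,
    start + timedelta(seconds=30): 23.0,
}

dates = joined_dates(existing_dates, new_temperature, "keep_existing")
values = [new_temperature.get(d) for d in dates]   # [20.0, None, 22.0]
print(fill_missing(dates, values, "interpolate"))   # [20.0, 21.0, 22.0]
```

With "keep_both" the same inputs produce four dates (0, 10, 20 and 30 seconds), and the existing columns would then have a gap at 30 seconds for the fill strategy to handle.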

Parameters:

new_data (DataFrame | LazyFrame | str) – The new data to add to the result object. Can be a DataFrame, LazyFrame, or a path to a file (CSV, Parquet, Excel).
date_column_name (str) – The name of the column in the new data containing the date.
datetime_format (str | None) – The format string for parsing the date column if it is a string. Defaults to None.
importing_columns (list[str] | dict[str, str] | None) – The columns to import from the external file. If a list, the columns will be imported as is. If a dict, the keys are the columns in the data you want to import and the values are the columns you want to rename them to. If None, all columns will be imported. Defaults to None.
existing_data_timezone (str | None) – The timezone of the existing data. If None, the timezone is inferred from the local machine. Defaults to None.
new_data_timezone (str | None) – The timezone of the new data. If None, and the new data is naive, it is assumed to be in the same timezone as the existing data. Defaults to None.
align_on (tuple[str, str] | None) – A tuple of column names to use for aligning the new data with the existing data. The first element is the column name in the existing data, and the second element is the column name in the new data. The new data will be shifted in time to maximize the cross-correlation between the two columns. Defaults to None.
join_strategy (Literal['keep_existing', 'keep_new', 'keep_both']) – The strategy for which dates to keep in the result. Defaults to “keep_existing”.
    - “keep_existing”: Keep only dates from existing data.
    - “keep_new”: Keep only dates from new data.
    - “keep_both”: Keep all dates from both datasets.
fill_strategy (Literal['interpolate', 'forward_fill', 'backward_fill'] | None) – The strategy for filling missing values in the merged dataset columns after applying the join strategy (this may affect both existing and new columns). Defaults to “interpolate”.
    - “interpolate”: Interpolate missing values by date.
    - “forward_fill”: Forward fill missing values.
    - “backward_fill”: Backward fill missing values.
    - None: Don’t fill missing values.

Raises:

ValueError – If the base dataframe has no date column.
ValueError – If an invalid timezone string is provided.

Return type:

None
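
The align_on option shifts the new data in time to maximize cross-correlation. The sketch below shows the underlying idea on two equally sampled signals using a brute-force lag search. It is a simplified illustration, not PyProBE's implementation: best_lag, max_lag and sample_period_s are hypothetical names, and real data would first need resampling onto a common time grid.

```python
def best_lag(reference, signal, max_lag):
    """Lag (in samples) of `signal` relative to `reference` that maximises
    their cross-correlation, searched over [-max_lag, max_lag]."""
    ref_mean = sum(reference) / len(reference)
    sig_mean = sum(signal) / len(signal)
    ref = [r - ref_mean for r in reference]
    sig = [s - sig_mean for s in signal]

    def correlation(lag):
        # Sum of products over the samples where both signals overlap.
        return sum(
            r * sig[i + lag]
            for i, r in enumerate(ref)
            if 0 <= i + lag < len(sig)
        )

    return max(range(-max_lag, max_lag + 1), key=correlation)


existing_current = [0, 0, 1, 3, 1, 0, 0, 0, 0, 0]
new_current = [0, 0, 0, 0, 1, 3, 1, 0, 0, 0]  # same pulse, two samples late
sample_period_s = 1.0

lag = best_lag(existing_current, new_current, max_lag=4)
time_shift_s = -lag * sample_period_s  # shift to apply to the new data's timestamps
print(lag, time_shift_s)  # 2 -2.0
```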

add_new_data_columns(new_data, date_column_name)#

Deprecated since version 2.3.1: Use add_data instead.

Parameters:

new_data (DataFrame | LazyFrame)
date_column_name (str)

Return type:

None

join(other, on, how='inner', coalesce=True)#

Join two Result objects on a column. A wrapper around the polars join method.

This will extend the data in the Result object horizontally. The column definitions of the two Result objects are combined; if there are any conflicts, the column definitions of the calling Result object take precedence.

Parameters:

other (Result) – The other Result object to join with.
on (Union[str, List[str]]) – The column(s) to join on.
how (str) – The type of join to perform. Default is ‘inner’.
coalesce (bool) – Whether to coalesce the columns. Default is True.

Return type:

None

extend(other, concat_method='diagonal')#

Extend the data in this Result object with the data in another Result object.

This method concatenates the data in the two Result objects, with the data of the calling Result object placed above the data of the other Result object. The column definitions of the two Result objects are combined; if there are any conflicts, the column definitions of the calling Result object take precedence.

Parameters:

other (Result | List[Result]) – The other Result object(s) to extend with.
concat_method (str) – The method to use for concatenation. Default is ‘diagonal’. See the polars.concat method documentation for more information.

Return type:

None
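
The precedence rule for column definitions, shared by join and extend, can be pictured as a dictionary merge in which the calling object's entries are applied last. This is a minimal sketch of that rule only. The function merge_definitions is hypothetical and the definitions shown are invented for illustration.

```python
def merge_definitions(calling: dict[str, str], other: dict[str, str]) -> dict[str, str]:
    """Combine definitions; on a key conflict the calling object's entry wins."""
    return {**other, **calling}


calling_definitions = {"Current": "Measured cell current."}
other_definitions = {
    "Current": "Current from the external logger.",
    "Temperature": "Surface temperature.",
}

merged = merge_definitions(calling_definitions, other_definitions)
print(merged["Current"])      # Measured cell current.
print(merged["Temperature"])  # Surface temperature.
```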

classmethod build(data_list, info)#

Build a Result object from a list of dataframes.

Parameters:

data_list (List[List[pl.LazyFrame | pl.DataFrame | dict]]) – The data to include in the new result object. The first index indicates the cycle and the second index indicates the step.
info (dict[str, Optional[str | int | float]]) – A dict containing test info.

Returns:

A new result object with the specified data.

Return type:

Result

export_to_mat(filename)#

Export the data to a .mat file.

This method will export the data and info dictionary to a .mat file. The variables in the .mat file will be named ‘data’ and ‘info’. Column names and dictionary keys will have any non-alphanumeric characters replaced with an underscore, to comply with MATLAB variable naming rules.

Parameters:: filename (str) – The name of the file to export to.
Return type:: None
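
The renaming rule described above can be approximated with a regular expression. In this sketch matlab_safe is a hypothetical helper, and the exact character class PyProBE uses (for example, how it treats existing underscores) is an assumption.

```python
import re


def matlab_safe(name: str) -> str:
    """Replace every character that is not an ASCII letter or digit with "_"."""
    return re.sub(r"[^0-9a-zA-Z]", "_", name)


for column in ["Time [s]", "Current [A]", "Step"]:
    print(column, "->", matlab_safe(column))
# Time [s] -> Time__s_
# Current [A] -> Current__A_
# Step -> Step
```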

from_polars_io(info={}, column_definitions={}, **kwargs)#

Create a new Result object with data from a Polars IO function.

Refer to the Polars documentation for a list of available IO functions.

Parameters:

polars_io_func (Callable[..., pl.DataFrame | pl.LazyFrame]) – The Polars IO function to use to create the data.
info (dict[str, Any | None]) – The info dictionary for the new Result object. Empty by default.
column_definitions (dict[str, str]) – The column definitions for the new Result object. Empty by default.
**kwargs – The keyword arguments to pass to the Polars IO function.

Returns:

A new Result object with the specified data and info.

Return type:

Result

Example

From a saved .csv file:

result = Result.from_polars_io(
    pl.scan_csv,
    info={"test": "test"},
    column_definitions={},
    source="data.csv",
)

From a pandas DataFrame:

result = Result.from_polars_io(
    pl.from_pandas,
    info={"test": "test"},
    column_definitions={},
    data=pd.DataFrame({"a": [1, 2, 3]}),
)

From a numpy array:

result = Result.from_polars_io(
    pl.from_numpy,
    info={"test": "test"},
    column_definitions={},
    data=np.array([[1, 2, 3], [4, 5, 6]]),
    schema=["a", "b"],
)

model_config = {'arbitrary_types_allowed': True}#: Configuration for the model; should be a dictionary conforming to pydantic.config.ConfigDict.

property live_dataframe: LazyFrame#: Deprecated since version 2.4.0: The live_dataframe property is deprecated. Use the lf property instead.

property base_dataframe: LazyFrame#: Deprecated since version 2.4.0: The base_dataframe property is deprecated. Use the lf property instead.

combine_results(results, concat_method='diagonal')#

Combine multiple Result objects into a single Result object.

This method should be used to combine multiple Result objects that have different entries in their info dictionaries. The info dictionaries of the Result objects will be integrated into the dataframe of the new Result object.

Parameters:

results (List[Result]) – The Result objects to combine.
concat_method (str) – The method to use for concatenation. Default is ‘diagonal’. See the polars.concat method documentation for more information.

Returns:

A new result object with the combined data.

Return type:

Result
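
One way to picture this is with rows held as plain dictionaries: each result's info entries are copied onto its rows as constant-valued columns, and the rows are then stacked "diagonally", keeping the union of all columns and filling the gaps with None. The sketch below is an illustration under those assumptions, not PyProBE's implementation. The function combine and the sample data are hypothetical.

```python
def combine(results):
    """Stack (rows, info) pairs, adding each info entry as a column on its rows."""
    tagged_rows = []
    columns = {}  # dict used as an ordered set of column names
    for rows, info in results:
        for row in rows:
            tagged = {**row, **info}
            tagged_rows.append(tagged)
            columns.update(dict.fromkeys(tagged))
    # Diagonal-style stacking: every row gets every column, missing values are None.
    return [{name: row.get(name) for name in columns} for row in tagged_rows]


cell_1 = ([{"Time [s]": 0.0, "Current [A]": 1.0}], {"Name": "Cell 1"})
cell_2 = ([{"Time [s]": 0.0, "Voltage [V]": 3.7}], {"Name": "Cell 2"})

for row in combine([cell_1, cell_2]):
    print(row)
# {'Time [s]': 0.0, 'Current [A]': 1.0, 'Name': 'Cell 1', 'Voltage [V]': None}
# {'Time [s]': 0.0, 'Current [A]': None, 'Name': 'Cell 2', 'Voltage [V]': 3.7}
```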