LLEPE

Parameters

class llepe.LLEPE(exp_data, phases_xml_filename, phase_names, aq_solvent_name, extractant_name, diluant_name, complex_names, extracted_species_ion_names, extracted_species_list=None, aq_solvent_rho=None, extractant_rho=None, diluant_rho=None, opt_dict=None, objective_function='Log-MSE', optimizer='scipy_minimize', temp_xml_file_path=None, dependant_params_dict=None, custom_objects_dict=None)

Liquid-Liquid Extraction Parameter estimator

Note

The order in which the extracted species (ES) appear in the csv file must be the same order as they appear in the xml, complex_names and extracted_species_ion_names.

For example, say that in exp_data ES_1 is Nd and ES_2 is Pr, and

aq_solvent_name = 'H2O(L)'
extractant_name = '(HA)2(org)'
diluant_name = 'dodecane'

Then:

The exp_data column ordering must be (names do not matter):

[h_i, h_eq, z_i, z_eq, Nd_aq_i, Nd_aq_eq, Nd_d_eq, Pr_aq_i, Pr_aq_eq, Pr_d_eq]

The aqueous speciesArray must be "H2O(L) H+ OH- Cl- Nd+++ Pr+++"

The organic speciesArray must be "(HA)2(org) dodecane Nd(H(A)2)3(org) Pr(H(A)2)3(org)"

complex_names = ['Nd(H(A)2)3(org)', 'Pr(H(A)2)3(org)']
extracted_species_ion_names = ['Nd+++', 'Pr+++']

Parameters

exp_data --

(str or pd.DataFrame) csv file name or DataFrame with experimental data

In the .csv file, the rows are different experiments and columns are the measured quantities.

The ordering of the columns needs to be:

[h_i, h_eq, z_i, z_eq, {ES_1}_aq_i, {ES_1}_aq_eq, {ES_1}_d_eq, {ES_2}_aq_i, {ES_2}_aq_eq, {ES_2}_d_eq,... {ES_N}_aq_i, {ES_N}_aq_eq, {ES_N}_d_eq]

Naming does not matter, just the order.

Where {ES_1}-{ES_N} are the extracted species names of interest e.g. Nd, Pr, La, etc.

Below is an explanation of the columns.

Index	Column	Meaning
0	h_i	Initial concentration of H+ ions (mol/L)
1	h_eq	Equilibrium concentration of H+ ions (mol/L)
2	z_i	Initial concentration of extractant (mol/L)
3	z_eq	Equilibrium concentration of extractant (mol/L)
4	{ES}_aq_i	Initial concentration of ES ions (mol/L)
5	{ES}_aq_eq	Equilibrium concentration of ES ions in aqueous phase (mol/L)
6	{ES}_d_eq	Equilibrium distribution ratio of ES: amount in organic phase divided by amount in aqueous phase
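
The required column ordering can also be generated programmatically. The following is a minimal sketch; the helper name expected_columns is hypothetical and is not part of LLEPE. It builds the column list for a given set of extracted species, which can then be compared against a csv header or used to reorder a DataFrame before passing it as exp_data.

```python
def expected_columns(extracted_species):
    """Return the exp_data column order for the given extracted species.

    Illustrative helper only; it is not an LLEPE function.
    """
    # The four acid/extractant columns always come first.
    columns = ["h_i", "h_eq", "z_i", "z_eq"]
    # Each extracted species then contributes three consecutive columns.
    for species in extracted_species:
        columns += [f"{species}_aq_i", f"{species}_aq_eq", f"{species}_d_eq"]
    return columns


print(expected_columns(["Nd", "Pr"]))
```

Because only the order matters and not the names, this list is mainly useful as a checklist for the positions of the measured quantities.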

phases_xml_filename --
(str) xml file with parameters for equilibrium calc

We recommend copying and modifying the xml files located in data/xmls or in Cantera's "data" folder.

speciesArray fields need specific ordering.

In aqueous phase: aq_solvent_name, H+, OH-, Cl-, ES_1, ES_2, ..., ES_N

(ES_1-ES_N) represent ES ion names e.g. Nd+++, Pr+++

In organic phase: extractant_name, diluant_name, ES_1, ES_2, ..., ES_N

(ES_1-ES_N) represent ES complex names e.g. Nd(H(A)2)3(org), Pr(H(A)2)3(org)
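
The two speciesArray orderings can be assembled from the constructor arguments. This is a small sketch under the ordering rules stated above; species_arrays is a hypothetical helper, not an LLEPE function, and the resulting strings still have to be placed in the xml file by hand.

```python
def species_arrays(aq_solvent_name, extractant_name, diluant_name,
                   extracted_species_ion_names, complex_names):
    """Build the aqueous and organic speciesArray strings in the required order."""
    # Aqueous phase: solvent, then H+, OH-, Cl-, then the ES ions in order.
    aqueous = [aq_solvent_name, "H+", "OH-", "Cl-", *extracted_species_ion_names]
    # Organic phase: extractant, diluant, then the ES complexes in the same order.
    organic = [extractant_name, diluant_name, *complex_names]
    return " ".join(aqueous), " ".join(organic)


aq, org = species_arrays(
    "H2O(L)", "(HA)2(org)", "dodecane",
    ["Nd+++", "Pr+++"],
    ["Nd(H(A)2)3(org)", "Pr(H(A)2)3(org)"],
)
print(aq)
print(org)
```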
phase_names --
(list) names of phases in xml file

Found in the xml file under <phase ... id={phase_name}>
aq_solvent_name -- (str) name of aqueous solvent in xml file
extractant_name -- (str) name of extractant in xml file
diluant_name -- (str) name of diluant in xml file
complex_names -- (list) names of complexes in xml file.
extracted_species_ion_names -- (list) names of extracted species ions in xml file
extracted_species_list --
(list) names of extracted species elements.

If None, extracted_species_list will be extracted_species_ion_names without '+' e.g. 'Nd+++'->'Nd'
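
The documented default can be mirrored in one line. This sketch only reproduces the behaviour described above (removing every '+' from each ion name); the function name is hypothetical and the actual LLEPE implementation may differ in detail.

```python
def default_extracted_species_list(extracted_species_ion_names):
    """Strip '+' characters from ion names, e.g. 'Nd+++' -> 'Nd'."""
    return [name.replace("+", "") for name in extracted_species_ion_names]


print(default_extracted_species_list(["Nd+++", "Pr+++"]))
```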
aq_solvent_rho --
(float) density of solvent (g/L)

If None, the molar volume/molecular weight from the xml file is used
extractant_rho --
(float) density of extractant (g/L)

If None, the molar volume/molecular weight from the xml file is used
diluant_rho --
(float) density of diluant (g/L)

If None, the molar volume/molecular weight from the xml file is used

opt_dict --

(dict) dictionary containing info about which species parameters are updated to fit model to experimental data

It should have the format shown below. The dictionary keys under each user-defined parameter name must be named exactly as shown ('upper_element_name', 'upper_attrib_name', etc.). The 'attrib_name' and 'attrib_value' entries can be None. {} denotes a field for the user to fill in.

opt_dict = {"{user_defined_name_for_parameter_1}":
                {'upper_element_name': {param_upper_element},
                'upper_attrib_name': {param_upper_attrib_name},
                'upper_attrib_value': {param_upper_attrib_value},
                'lower_element_name': {param_lower_element},
                'lower_attrib_name': {param_lower_attrib_name},
                'lower_attrib_value': {param_lower_attrib_value},
                'input_format': {str format to input input_value},
                'input_value': {guess_value}},
            "{user_defined_name_for_parameter_2}":
                            ...
            ...
            }

See example files for more examples.
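
As a concrete illustration of the template above, the sketch below fills in a single parameter. The parameter name and the element and attribute values ('species', 'name', 'h0', '{0}') are assumptions about a typical xml layout and are not confirmed by this page, so check them against your own xml file. REQUIRED_KEYS and missing_keys are hypothetical helpers that only guard against misspelled keys.

```python
# Keys that every opt_dict entry must contain, as listed in the template.
REQUIRED_KEYS = {
    "upper_element_name", "upper_attrib_name", "upper_attrib_value",
    "lower_element_name", "lower_attrib_name", "lower_attrib_value",
    "input_format", "input_value",
}

# Assumed example: fit the 'h0' value of one organic complex.
opt_dict = {
    "Nd(H(A)2)3(org)_h0": {
        "upper_element_name": "species",          # assumed xml element
        "upper_attrib_name": "name",              # assumed xml attribute
        "upper_attrib_value": "Nd(H(A)2)3(org)",
        "lower_element_name": "h0",               # assumed xml element
        "lower_attrib_name": None,                # may be None per the docs
        "lower_attrib_value": None,               # may be None per the docs
        "input_format": "{0}",                    # assumed format string
        "input_value": -4.7e6,                    # initial guess
    }
}


def missing_keys(candidate):
    """Map each parameter name to the required keys it lacks (if any)."""
    report = {}
    for name, entry in candidate.items():
        missing = sorted(REQUIRED_KEYS - set(entry))
        if missing:
            report[name] = missing
    return report


print(missing_keys(opt_dict))
```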

objective_function --

(function or str) function to compute objective

By default, the objective function is the log mean squared error of the distribution ratio:

np.sum((np.log10(d_pred)-np.log10(d_meas))**2)

Function needs to take inputs:

objective_function(predicted_dict, measured_df, kwargs)

kwargs is optional

Function needs to return: (float) value computed by objective function

Below is the guide for referencing predicted values

To access	Use
hydrogen ion conc in aq	predicted_dict['h_eq']
extractant conc in org	predicted_dict['z_eq']
ES ion eq conc in aq	predicted_dict['{ES}_aq_eq']
ES complex eq conc in org	predicted_dict['{ES}_org_eq']
ES distribution ratio	predicted_dict['{ES}_d_eq']

Replace "{ES}" with extracted species element e.g. Nd, La, etc.

For measured values, use the same names, but replace predicted_dict with measured_df
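
A custom objective following the interface above might look like the sketch below. It computes a mean absolute log error on one species' distribution ratio. The function name, the 'species' entry in kwargs, and the convention that kwargs arrives as a single optional dict are all assumptions for illustration. The demo uses plain dicts of lists in place of the real predicted_dict and measured_df; a pandas DataFrame column can be indexed and iterated the same way.

```python
import math


def mean_abs_log_error(predicted_dict, measured_df, kwargs=None):
    """Mean absolute difference of log10 distribution ratios for one species."""
    kwargs = kwargs or {}
    species = kwargs.get("species", "Nd")  # hypothetical option
    key = f"{species}_d_eq"
    predicted = list(predicted_dict[key])
    measured = list(measured_df[key])
    errors = [abs(math.log10(p) - math.log10(m))
              for p, m in zip(predicted, measured)]
    return sum(errors) / len(errors)


# Stand-in data: the first experiment is off by one decade, the second is exact.
predicted = {"Nd_d_eq": [1.0, 10.0]}
measured = {"Nd_d_eq": [10.0, 10.0]}
print(mean_abs_log_error(predicted, measured))
```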

optimizer --
(function or str) function to perform optimization
Note

The optimized variables are not the species parameters themselves. Each optimized variable is a multiplier that is applied to the initial guess to obtain the species parameter.

For example, say
```
opt_dict = {'Nd(H(A)2)3(org)': {'h0': -4.7e6}}
```
If the bounds on h0 need to be [-4.7e7,-4.7e5], then divide the bounds by the guess and get
```
"bounds": [(1e-1, 1e1)]
```
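
The division described above can be written as a small helper. This is a sketch; scaled_bounds is a hypothetical name and not part of LLEPE. Note that a negative guess reverses the order of the two ratios, so the helper sorts them before returning.

```python
def scaled_bounds(guess, lower, upper):
    """Convert physical parameter bounds into multiplier bounds.

    The guess must be non-zero, since the bounds are divided by it.
    """
    ratios = (lower / guess, upper / guess)
    # With a negative guess the ratios come out in reverse order.
    return (min(ratios), max(ratios))


# h0 guess of -4.7e6 with physical bounds [-4.7e7, -4.7e5]
print(scaled_bounds(-4.7e6, -4.7e7, -4.7e5))
```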
By default, the optimizer is scipy.optimize.minimize with
```
default_kwargs= {"method": 'SLSQP',
                 "bounds": [(1e-1, 1e1)] * len(x_guess),
                 "constraints": (),
                 "options": {'disp': True,
                             'maxiter': 1000,
                             'ftol': 1e-6}}
```
Function needs to take inputs: optimizer(objective_function, x_guess, kwargs)

kwargs is optional

Function needs to return: ((np.ndarray, float)) Optimized parameters, objective_function value
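
To illustrate the optimizer interface, here is a dependency-free sketch of a custom optimizer that does a coordinate-wise grid search over the multipliers. The function name and the 'bounds', 'n_points' and 'sweeps' entries in kwargs are hypothetical, and a single (lower, upper) pair is assumed to apply to every multiplier. It returns a plain list to avoid importing NumPy; wrap the result in np.asarray to match the documented np.ndarray return type.

```python
def grid_search_optimizer(objective, x_guess, kwargs=None):
    """Coordinate-wise grid search; returns (best multipliers, best objective)."""
    kwargs = kwargs or {}
    lower, upper = kwargs.get("bounds", (0.1, 10.0))  # assumed shared bounds
    n_points = kwargs.get("n_points", 100)
    sweeps = kwargs.get("sweeps", 3)

    x = [float(value) for value in x_guess]
    best = objective(x)
    step = (upper - lower) / (n_points - 1)

    for _ in range(sweeps):
        for i in range(len(x)):
            # Scan coordinate i over the grid while holding the others fixed.
            for k in range(n_points):
                trial = list(x)
                trial[i] = lower + k * step
                value = objective(trial)
                if value < best:
                    x, best = trial, value
    return x, best


# Stand-in objective with its minimum at multipliers (2, 5).
def demo_objective(x):
    return (x[0] - 2.0) ** 2 + (x[1] - 5.0) ** 2


x_opt, f_opt = grid_search_optimizer(demo_objective, [1.0, 1.0])
print(x_opt, f_opt)
```

A grid search is far less efficient than the default SLSQP method; it is shown only because it makes the input and output contract easy to see.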
temp_xml_file_path --
(str) path to temporary xml file.

This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.

Default is the local temp folder.
dependant_params_dict --
(dict) dictionary containing information about parameters dependent on opt_dict. It has a structure similar to opt_dict, except that instead of input values it has three other fields: 'function', 'kwargs', and 'independent_params'.

'function' is a function of the form

function(independent_param__value_list, custom_objects_dict, **kwargs)

'kwargs' are the extra arguments to pass to function

'independent_params' is a list of parameter names in opt_dict that the dependent_param is a function of.

'custom_objects_dict' is for accessing the estimator's internal custom_objects_dict and must be included in the arguments, even if the custom_objects_dict is not set and is None.

See example code for usage.
custom_objects_dict -- (dict) dictionary containing custom objects, in the format {<object_name_string>: <object>, ...}
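
For dependant_params_dict, a 'function' with the documented signature could look like the sketch below. The function name scaled_sum, the 'scale' keyword, and the parameter names 'param_a' and 'param_b' are hypothetical. The entry shown contains only the three fields named above; the xml-location keys that it shares with opt_dict are left out here.

```python
def scaled_sum(independent_param__value_list, custom_objects_dict, **kwargs):
    """Dependent parameter computed as scale * sum(independent values).

    custom_objects_dict must be accepted even when it is None.
    """
    scale = kwargs.get("scale", 1.0)  # hypothetical keyword argument
    return scale * sum(independent_param__value_list)


dependant_entry = {
    # The keys locating the parameter in the xml file (as in opt_dict) go here.
    "function": scaled_sum,
    "kwargs": {"scale": 2.0},
    "independent_params": ["param_a", "param_b"],  # names from opt_dict
}

# How such a function would be evaluated for two independent values:
value = dependant_entry["function"]([1.0, 2.0], None, **dependant_entry["kwargs"])
print(value)
```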

fit(objective_function=None, optimizer=None, objective_kwargs=None, optimizer_kwargs=None) → tuple

Fits the model to the experimental data by minimizing the objective function with the optimizer. Returns a tuple of a dictionary with the opt_dict structure and the objective function value.

Parameters

objective_function -- (function) function to compute objective. If None, the last set objective or the default function is used.
optimizer -- (function) function to perform optimization. If None, the last set optimizer or the default is used.
optimizer_kwargs -- (dict) optional arguments for optimizer
objective_kwargs -- (dict) optional arguments for objective function

Returns tuple

(opt_dict (dict), opt_value (float)): the optimized opt_dict, which has the same structure as opt_dict, and the resulting objective function value
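
Since the returned dictionary has the same structure as opt_dict, the fitted numbers can be collected with a short helper. This sketch assumes that each fitted value is stored under the 'input_value' key, which the description implies but does not state; fitted_values is a hypothetical name and the data below is a stand-in for a real fit() result.

```python
def fitted_values(optimized_opt_dict):
    """Map each parameter name to its 'input_value' entry."""
    return {name: entry["input_value"]
            for name, entry in optimized_opt_dict.items()}


# Stand-in for the tuple returned by fit(); the numbers are made up.
optimized_opt_dict = {"Nd(H(A)2)3(org)_h0": {"input_value": -4.5e6}}
opt_value = 0.12

print(fitted_values(optimized_opt_dict), opt_value)
```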

get_aq_solvent_name() → str

Returns aq_solvent_name

Returns: aq_solvent_name: (str) name of aqueous solvent in xml file

get_aq_solvent_rho() → str

Returns aqueous solvent density (g/L)

Returns: aq_solvent_rho: (float) density of aqueous solvent

get_complex_names() → list

Returns list of complex names

Returns: complex_names: (list) names of complexes in xml file.

get_custom_objects_dict()

Returns the custom_objects_dict

Returns: custom_objects_dict: (dict) dictionary containing information about custom objects from user

get_dependant_params_dict()

Returns the dependant_params_dict

Returns: dependant_params_dict: (dict) dictionary containing information about parameters dependant on opt_dict

get_diluant_name() → str

Returns diluant name

Returns: diluant_name: (str) name of diluant in xml file

get_diluant_rho() → str

Returns diluant density (g/L)

Returns: diluant_rho: (float) density of diluant

get_exp_df() → pandas.core.frame.DataFrame

Returns the experimental DataFrame

Returns: (pd.DataFrame) Experimental data

get_extractant_name() → str

Returns extractant name

Returns: extractant_name: (str) name of extractant in xml file

get_extractant_rho() → str

Returns extractant density (g/L)

Returns: extractant_rho: (float) density of extractant

get_extracted_species_ion_names() → list

Returns list of extracted species ion names

Returns: extracted_species_ion_names: (list) names of extracted species ions in xml file

get_extracted_species_list() → list

Returns list of extracted species names

Returns: extracted_species_list: (list) names of extracted species in xml file

get_in_moles() → pandas.core.frame.DataFrame

Returns the in_moles DataFrame which contains the initial mole fractions of each species for each experiment

Returns: in_moles: (pd.DataFrame) DataFrame with initial mole fractions

get_objective_function()

Returns objective function

Returns: objective_function: (func) Objective function to quantify error between model and experimental data

get_opt_dict() → dict

Returns the dictionary containing optimization information

Returns: (dict) dictionary containing info about which species parameters are updated to fit model to experimental data

get_optimizer()

Returns optimizer function

Returns: optimizer: (func) Optimizer function to minimize objective function

get_phases() → list

Returns the list of Cantera solutions

Returns: (list) list of Cantera solutions/phases

get_predicted_dict()

Returns the dictionary of species concentrations predicted by the current xml parameters for the current in_moles

Returns: predicted_dict: (dict) dictionary of species concentrations

get_temp_xml_file_path()

Returns path to temporary xml file.

This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.

Returns: temp_xml_file_path: (str) path to temporary xml file.

log_mean_squared_error(predicted_dict, meas_df)

Default objective function for LLEPE

Returns the log mean squared error of predicted distribution ratios (d=n_org/n_aq) to measured d.

np.sum((np.log10(d_pred)-np.log10(d_meas))**2)

Parameters

predicted_dict -- (dict) contains predicted data
meas_df -- (pd.DataFrame) contains experimental data

Returns

(float) log mean squared error between predicted and measured
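
The formula above can be checked by hand with a pure-Python equivalent. This sketch uses math.log10 on plain lists instead of NumPy arrays; log_squared_error is a hypothetical name, and the inputs stand in for the predicted and measured distribution ratios.

```python
import math


def log_squared_error(d_pred, d_meas):
    """Sum of squared differences between log10 of predicted and measured d."""
    return sum((math.log10(p) - math.log10(m)) ** 2
               for p, m in zip(d_pred, d_meas))


# The first ratio is off by one decade and the second is exact.
print(log_squared_error([10.0, 100.0], [1.0, 100.0]))
```

Working in log space means that a prediction that is too high by a factor of 10 is penalized the same as one that is too low by a factor of 10.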

parity_plot(compared_value=None, c_data=None, c_label=None, plot_title=None, save_path=None, print_r_squared=False, data_labels=None, legend=True)

Parity plot between measured and predicted compared_value. The default compared value is {ES_1}_aq_eq.

Parameters

compared_value -- (str) Quantity to compare predicted and experimental data. Can be any column containing "eq" in exp_df e.g. h_eq, z_eq, {ES}_d_eq, etc.
plot_title --
(str or boolean)

If None (default): the plot title is generated from compared_value. For example, if compared_value is h_eq, the title is "H^+ eq conc". The generated titles are easiest to learn by trying them.

If str: the plot title is the plot_title string.

If False: no plot title.
c_data -- (list or np.ndarray) data for color axis
c_label -- (str) label for color axis
save_path -- (str) save path for parity plot
print_r_squared -- (boolean) Whether to display the r-squared value on the plot. The value is shown to 2 decimal places.
data_labels -- labels for the data such as paper's name where experiment is pulled from.
legend -- whether to display legend for data_labels. Has no effect if data_labels is None

Returns

fig, ax: the figure and axes objects

static plot_3d_data(x_data, y_data, z_data, c_data=None, x_label=None, y_label=None, z_label=None, c_label=None)

This is for plotting 3D scatter plots. We suggest using matplotlib's ax.scatter to make 3D plots.

Parameters

x_data -- (list) list of data for x axis
y_data -- (list) list of data for y axis
z_data -- (list) list of data for z axis
c_data -- (list) list of data for color axis
x_label -- (str) label for x axis
y_label -- (str) label for y axis
z_label -- (str) label for z axis
c_label -- (str) label for color axis

Returns

r_squared(compared_value=None)

r-squared value comparing measured and predicted compared value

The closer to 1, the better the model's predictions.

Parameters: compared_value -- (str) Quantity to compare predicted and experimental data. Can be any column containing "eq" in exp_df e.g. h_eq, z_eq, {ES}_d_eq, etc. Default is {ES}_aq_eq.
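
For intuition about the value, the sketch below computes the standard coefficient of determination, 1 - SS_res/SS_tot, on plain lists. This page does not say which definition LLEPE uses internally, so treat this as an assumption; coefficient_of_determination is a hypothetical name.

```python
def coefficient_of_determination(measured, predicted):
    """Standard R^2 = 1 - SS_res / SS_tot.

    Raises ZeroDivisionError if all measured values are identical.
    """
    mean_measured = sum(measured) / len(measured)
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


# Perfect predictions give 1; predicting the mean everywhere gives 0.
print(coefficient_of_determination([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
print(coefficient_of_determination([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```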

static scipy_minimize(objective, x_guess, optimizer_kwargs=None)

The default optimizer for LLEPE

Uses scipy.optimize.minimize

By default, options are

default_kwargs= {"method": 'SLSQP',
                "bounds": [(1e-1, 1e1)]*len(x_guess),
                "constraints": (),
                "options": {'disp': True,
                            'maxiter': 1000,
                            'ftol': 1e-6}}

Parameters

objective -- (func) the objective function
x_guess -- (np.ndarray) the initial guess (always 1, because the optimized variables are multipliers on the initial parameter guesses)
optimizer_kwargs -- (dict) dictionary of options for minimize

Returns

((np.ndarray, float)) Optimized parameters, objective_function value

set_aq_solvent_name(aq_solvent_name)

Change aq_solvent_name to input aq_solvent_name

Parameters: aq_solvent_name -- (str) name of aqueous solvent in xml file

set_aq_solvent_rho(aq_solvent_rho)

Changes aqueous solvent density (g/L) to input aq_solvent_rho

Parameters: aq_solvent_rho -- (float) density of aqueous solvent

set_complex_names(complex_names)

Change complex names list to input complex_names

Parameters: complex_names -- (list) names of complexes in xml file.

set_custom_objects_dict(custom_objects_dict)

Sets the custom_objects_dict

Parameters: custom_objects_dict -- (dict) dictionary containing information about custom objects from user

set_dependant_params_dict(dependant_params_dict)

Sets the dependant_params_dict

Parameters: dependant_params_dict -- (dict) dictionary containing information about parameters dependant on opt_dict

set_diluant_name(diluant_name)

Change diluant_name to input diluant_name

Parameters: diluant_name -- (str) name of diluant in xml file

set_diluant_rho(diluant_rho)

Changes diluant density (g/L) to input diluant_rho

Parameters: diluant_rho -- (float) density of diluant

set_exp_df(exp_data)

Changes the experimental DataFrame to the input exp_data and renames the columns to the internal LLEPE names:

h_i, h_eq, z_i, z_eq, {ES}_aq_i, {ES}_aq_eq, {ES}_d_eq

See class docstring on "exp_data" for further explanations.

Parameters: exp_data -- (str or pd.DataFrame) file name/path or DataFrame for experimental data csv

set_extractant_name(extractant_name)

Change extractant_name to input extractant_name

Parameters: extractant_name -- (str) name of extractant in xml file

set_extractant_rho(extractant_rho)

Changes extractant density (g/L) to input extractant_rho

Parameters: extractant_rho -- (float) density of extractant

set_extracted_species_ion_names(extracted_species_ion_names)

Change list of extracted species ion names to input: extracted_species_ion_names

Parameters: extracted_species_ion_names -- (list) names of extracted species ions in xml file

set_extracted_species_list(extracted_species_list)

Change list of extracted species names to input: extracted_species_list

Parameters: extracted_species_list -- (list) names of extracted species in xml file

set_in_moles(feed_vol)

Initializes the mole fractions using the input feed_vol

This function is called at initialization

Sets in_moles to a pd.DataFrame containing initial mole fractions

Columns for species and rows for different experiments

This function also calls update_predicted_dict

Parameters: feed_vol -- (float) feed volume of mixture (L)

set_objective_function(objective_function)

Change objective function to input objective_function.

See class docstring on "objective_function" for instructions

Parameters: objective_function -- (func) Objective function to quantify error between model and experimental data

set_opt_dict(opt_dict)

Change the dictionary to input opt_dict.

opt_dict specifies species parameters to be updated to fit model to data

See class docstring on "opt_dict" for more information.

Parameters: opt_dict -- (dict) dictionary containing info about which species parameters are updated to fit model to experimental data

set_optimizer(optimizer)

Change optimizer function to input optimizer.

See class docstring on "optimizer" for instructions

Parameters: optimizer -- (func) Optimizer function to minimize objective function

set_phases(phases_xml_filename, phase_names)

Change list of Cantera solutions by inputting new xml file name and phase names

Also runs set_in_moles to set feed volume to 1 L

Parameters

phases_xml_filename -- (str) xml file with parameters for equilibrium calc
phase_names -- (list) names of phases in xml file

set_temp_xml_file_path(temp_xml_file_path)

Changes temporary xml file path to input temp_xml_file_path.

This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.

Parameters: temp_xml_file_path -- (str) path to temporary xml file.

update_custom_objects_dict(info_dict)

Updates internal custom_objects_dict with info_dict

Parameters: info_dict -- Requires an identical structure to opt_dict. Ignores items with keys containing "custom_object_name".
Returns: None

update_predicted_dict(phases_xml_filename=None, phase_names=None)

Computes the equilibrium concentrations predicted by the parameters in phases_xml_filename, given the initial mole fractions set by set_in_moles()

Parameters

phases_xml_filename -- (str) xml file with parameters for equilibrium calc. If None, the current phases_xml_filename is used.
phase_names -- (list) names of phases in xml file. If None, the current phase_names is used.

update_xml(info_dict, phases_xml_filename=None, dependant_params_dict=None)

Updates xml file with info_dict

Parameters

info_dict -- (dict) Requires an identical structure to opt_dict. Ignores items with keys containing "custom_object_name".
phases_xml_filename -- (str) xml filename if editing another xml. If None, the current xml will be modified and the internal Cantera phases will be refreshed to the new values.
dependant_params_dict -- (dict) dictionary containing information about parameters dependant on info_dict