LLEPE
Parameters
-
class llepe.LLEPE(exp_data, phases_xml_filename, phase_names, aq_solvent_name, extractant_name, diluant_name, complex_names, extracted_species_ion_names, extracted_species_list=None, aq_solvent_rho=None, extractant_rho=None, diluant_rho=None, opt_dict=None, objective_function='Log-MSE', optimizer='scipy_minimize', temp_xml_file_path=None, dependant_params_dict=None, custom_objects_dict=None)
Liquid-Liquid Extraction Parameter estimator
Note
The order in which the extracted species (ES) appear in the csv file must be the same order as they appear in the xml, complex_names and extracted_species_ion_names.
For example, say that in exp_data ES_1 is Nd and ES_2 is Pr, and
aq_solvent_name = 'H2O(L)', extractant_name = '(HA)2(org)', diluant_name = 'dodecane'
Then:
The exp_data column ordering must be (names do not matter):
[h_i, h_eq, z_i, z_eq, Nd_aq_i, Nd_aq_eq, Nd_d_eq, Pr_aq_i, Pr_aq_eq, Pr_d_eq]
The aqueous speciesArray must be "H2O(L) H+ OH- Cl- Nd+++ Pr+++"
The organic speciesArray must be "(HA)2(org) dodecane Nd(H(A)2)3(org) Pr(H(A)2)3(org)"
complex_names = ['Nd(H(A)2)3(org)', 'Pr(H(A)2)3(org)']
extracted_species_ion_names = ['Nd+++', 'Pr+++']
- Parameters
exp_data --
(str or pd.DataFrame) csv file name or DataFrame with experimental data
In the .csv file, the rows are different experiments and columns are the measured quantities.
The ordering of the columns needs to be:
[h_i, h_eq, z_i, z_eq, {ES_1}_aq_i, {ES_1}_aq_eq, {ES_1}_d_eq, {ES_2}_aq_i, {ES_2}_aq_eq, {ES_2}_d_eq,... {ES_N}_aq_i, {ES_N}_aq_eq, {ES_N}_d_eq]
Naming does not matter, just the order.
Where {ES_1}-{ES_N} are the extracted species names of interest e.g. Nd, Pr, La, etc.
Below is an explanation of the columns.
Index  Column       Meaning
0      h_i          Initial concentration of H+ ions (mol/L)
1      h_eq         Equilibrium concentration of H+ ions (mol/L)
2      z_i          Initial concentration of extractant (mol/L)
3      z_eq         Equilibrium concentration of extractant (mol/L)
4      {ES}_aq_i    Initial concentration of ES ions (mol/L)
5      {ES}_aq_eq   Equilibrium concentration of ES ions in aqueous phase (mol/L)
6      {ES}_d_eq    Equilibrium ratio between amount of ES atoms in organic to aqueous
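The column ordering above can be generated programmatically, which makes it easier to check a csv header before passing it in. The following is a minimal sketch; the helper name expected_columns is hypothetical and not part of llepe.

```python
def expected_columns(species):
    """Build the column order described above for a list of extracted species.

    `species` holds element names such as "Nd" or "Pr", in the same order
    used in the xml file. This helper is illustrative, not part of llepe.
    """
    columns = ["h_i", "h_eq", "z_i", "z_eq"]
    for es in species:
        # Each extracted species contributes three consecutive columns.
        columns += [f"{es}_aq_i", f"{es}_aq_eq", f"{es}_d_eq"]
    return columns
```

Since naming does not matter, only the position of each column has to match this list; a data set with N extracted species should therefore have 4 + 3N columns.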
phases_xml_filename --
(str) xml file with parameters for equilibrium calc
We recommend copying and modifying the xml files located in data/xmls or in Cantera's "data" folder.
speciesArray fields need specific ordering.
In aqueous phase: aq_solvent_name, H+, OH-, Cl-, ES_1, ES_2, ..., ES_N
(ES_1-ES_N) represent ES ion names e.g. Nd+++, Pr+++
In organic phase : extractant_name, diluant_name, ES_1, ES_2, ..., ES_N
(ES_1-ES_N) represent ES complex names e.g. Nd(H(A)2)3(org), Pr(H(A)2)3(org)
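The required speciesArray ordering can be assembled from the constructor arguments. This is a minimal sketch; species_arrays is a hypothetical helper, and it only builds the strings, it does not write them into the xml file.

```python
def species_arrays(aq_solvent_name, extractant_name, diluant_name,
                   ion_names, complex_names):
    """Return (aqueous, organic) speciesArray strings in the required order.

    Illustrative helper, not part of llepe. The fixed aqueous species
    H+, OH- and Cl- follow the ordering rule stated above.
    """
    aqueous = " ".join([aq_solvent_name, "H+", "OH-", "Cl-", *ion_names])
    organic = " ".join([extractant_name, diluant_name, *complex_names])
    return aqueous, organic
```

With the Nd/Pr example from the class note, this reproduces the two speciesArray strings given there.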
phase_names --
(list) names of phases in xml file
Found in the xml file under <phase ... id={phase_name}>
aq_solvent_name -- (str) name of aqueous solvent in xml file
extractant_name -- (str) name of extractant in xml file
diluant_name -- (str) name of diluant in xml file
complex_names -- (list) names of complexes in xml file.
extracted_species_ion_names -- (list) names of extracted species ions in xml file
extracted_species_list --
(list) names of extracted species elements.
If None, extracted_species_list will be extracted_species_ion_names without '+', e.g. 'Nd+++' -> 'Nd'
aq_solvent_rho --
(float) density of solvent (g/L)
If None, molar volume/molecular weight is used from xml
extractant_rho --
(float) density of extractant (g/L)
If None, molar volume/molecular weight is used from xml
diluant_rho --
(float) density of diluant (g/L)
If None, molar volume/molecular weight is used from xml
opt_dict --
(dict) dictionary containing info about which species parameters are updated to fit model to experimental data
Should have the format as below. Dictionary keys under user defined parameter name must be named as shown below ('upper_element_name', 'upper_attrib_name', etc.). 'attrib_name's and 'attrib_value's can be None. {} denotes areas for user to fill in.
opt_dict = {"{user_defined_name_for_parameter_1}": {
                'upper_element_name': {param_upper_element},
                'upper_attrib_name': {param_upper_attrib_name},
                'upper_attrib_value': {param_upper_attrib_value},
                'lower_element_name': {param_lower_element},
                'lower_attrib_name': {param_lower_attrib_name},
                'lower_attrib_value': {param_lower_attrib_value},
                'input_format': {str format to input input_value},
                'input_value': {guess_value}},
            "{user_defined_name_for_parameter_2}": ...
            ...
            }
See example files for more examples.
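As an illustration of the format, the sketch below fills in one entry and checks that every required key is present. The element and attribute values ('species', 'name', 'h0', '{0}') are assumptions about a typical xml layout and must be adapted to the actual file; missing_keys is a hypothetical helper, not part of llepe.

```python
REQUIRED_KEYS = (
    "upper_element_name", "upper_attrib_name", "upper_attrib_value",
    "lower_element_name", "lower_attrib_name", "lower_attrib_value",
    "input_format", "input_value",
)


def missing_keys(opt_dict):
    """Map each parameter name to the required keys its entry lacks."""
    report = {}
    for name, entry in opt_dict.items():
        missing = [key for key in REQUIRED_KEYS if key not in entry]
        if missing:
            report[name] = missing
    return report


# Hypothetical entry: fit the h0 value of the Nd complex species.
opt_dict = {
    "Nd(H(A)2)3(org)_h0": {
        "upper_element_name": "species",          # assumed xml element
        "upper_attrib_name": "name",              # assumed attribute
        "upper_attrib_value": "Nd(H(A)2)3(org)",
        "lower_element_name": "h0",               # assumed child element
        "lower_attrib_name": None,                # may be None
        "lower_attrib_value": None,               # may be None
        "input_format": "{0}",                    # assumed format string
        "input_value": -4.7e6,                    # initial guess
    }
}
```

An empty result from missing_keys(opt_dict) means every entry carries all eight keys.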
objective_function --
(function or str) function to compute objective
By default, the objective function is the log mean squared error of the distribution ratio:
np.sum((np.log10(d_pred)-np.log10(d_meas))**2)
Function needs to take inputs:
objective_function(predicted_dict, measured_df, kwargs)
kwargs is optional
Function needs to return: (float) value computed by objective function
Below is the guide for referencing predicted values
To access                   Use
hydrogen ion conc in aq     predicted_dict['h_eq']
extractant conc in org      predicted_dict['z_eq']
ES ion eq conc in aq        predicted_dict['{ES}_aq_eq']
ES complex eq conc in org   predicted_dict['{ES}_org_eq']
ES distribution ratio       predicted_dict['{ES}_d_eq']
Replace "{ES}" with extracted species element e.g. Nd, La, etc.
For measured values, use the same names, but replace predicted_dict with measured_df.
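A custom objective function following the signature above might look like the sketch below. It sums the squared log10 differences of the distribution ratio over several species. The function name and the 'species' entry in kwargs are hypothetical, and treating kwargs as an optional third positional dictionary is an assumption based on the signature shown above.

```python
import math


def log_squared_error(predicted_dict, measured_df, kwargs=None):
    """Sum of squared log10 differences of the distribution ratios.

    kwargs["species"] (hypothetical) lists the element names to include,
    e.g. ["Nd", "Pr"]; it defaults to ["Nd"] for this illustration.
    """
    kwargs = kwargs or {}
    species = kwargs.get("species", ["Nd"])
    total = 0.0
    for es in species:
        key = f"{es}_d_eq"
        # Both containers are indexed by the same '{ES}_d_eq' name.
        for d_pred, d_meas in zip(predicted_dict[key], measured_df[key]):
            total += (math.log10(d_pred) - math.log10(d_meas)) ** 2
    return total
```

Because the function only indexes by column name and iterates, it works with a pandas DataFrame for measured_df as well as with a plain dictionary of lists.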
optimizer --
(function or str) function to perform optimization
Note
The optimized variables are not the species parameters themselves; they are multipliers that are applied to the initial guess to obtain the species parameters.
For example, say
opt_dict = {'Nd(H(A)2)3(org)': {'h0': -4.7e6}}
If the bounds on h0 need to be [-4.7e7,-4.7e5], then divide the bounds by the guess and get
"bounds": [(1e-1, 1e1)]
By default, the optimizer is scipy.optimize.minimize with
default_kwargs= {"method": 'SLSQP', "bounds": [(1e-1, 1e1)] * len(x_guess), "constraints": (), "options": {'disp': True, 'maxiter': 1000, 'ftol': 1e-6}}
Function needs to take inputs:
optimizer(objective_function, x_guess, kwargs)
kwargs is optional
Function needs to return: ((np.ndarray, float)) Optimized parameters, objective_function value
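To show the shape of a custom optimizer, here is a minimal sketch of a coordinate pattern search that follows the signature above. It is not the library's optimizer. The kwargs entries 'bounds', 'step' and 'tol' are hypothetical, and the function returns a plain list rather than an np.ndarray to stay dependency-free; wrapping the result with np.asarray would match the documented return type.

```python
def pattern_search(objective_function, x_guess, kwargs=None):
    """Minimize objective_function by probing each coordinate in turn.

    Hypothetical kwargs: 'bounds' is one (low, high) pair applied to all
    variables, 'step' is the initial probe size, 'tol' the final one.
    """
    kwargs = kwargs or {}
    low, high = kwargs.get("bounds", (1e-1, 1e1))
    step = kwargs.get("step", 0.5)
    tol = kwargs.get("tol", 1e-6)

    x = [float(v) for v in x_guess]
    best = objective_function(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                # Clamp the probe so it never leaves the bounds.
                trial[i] = min(max(trial[i] + delta, low), high)
                value = objective_function(trial)
                if value < best:
                    x, best, improved = trial, value, True
        if not improved:
            step /= 2  # refine the probe size once no move helps
    return x, best
```

The default bounds mirror the multiplier bounds of the default optimizer, so a start at 1 for every variable stays inside them.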
temp_xml_file_path --
(str) path to temporary xml file.
This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.
Default is the local temp folder.
dependant_params_dict --
(dict) dictionary containing information about parameters dependent on opt_dict. Has a similar structure to opt_dict except instead of input values, it has 3 other fields: 'function', 'kwargs', and 'independent_params'.
'function' is a function of the form
function(independent_param__value_list, custom_objects_dict, **kwargs)
'kwargs' are the extra arguments to pass to function
'independent_params' is a list of parameter names in opt_dict that the dependent_param is a function of.
'custom_objects_dict' is for accessing the estimator's internal custom_objects_dict and must be included in the arguments, even if the custom_objects_dict is not set and is None.
See example code for usage.
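A function of the required form could look like the sketch below, where the dependent parameter is a scaled sum of the independent ones. The function name, the 'scale' keyword and the parameter names in the dictionary fragment are hypothetical; the fragment shows only the three fields named above and omits the keys that locate the parameter in the xml file.

```python
def scaled_sum(independent_param_value_list, custom_objects_dict, **kwargs):
    """Return scale * sum of the independent parameter values.

    custom_objects_dict must be accepted even when it is None; this
    illustration does not use it.
    """
    scale = kwargs.get("scale", 1.0)
    return scale * sum(independent_param_value_list)


# Fragment of one dependant_params_dict entry (location keys omitted).
dependent_entry_fields = {
    "function": scaled_sum,
    "kwargs": {"scale": 2.0},
    # Names must match user-defined parameter names in opt_dict.
    "independent_params": ["param_1", "param_2"],
}
```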
custom_objects_dict -- (dict) dictionary containing custom objects. Format: {<object_name_string>: <object>,...}
-
fit(objective_function=None, optimizer=None, objective_kwargs=None, optimizer_kwargs=None) → tuple
Fits experimental to modeled data by minimizing objective function with optimizer. Returns a tuple of the optimized dictionary, which has the opt_dict structure, and the optimized objective value.
- Parameters
objective_function -- (function) function to compute objective. If None, the last set objective function or the default is used
optimizer -- (function) function to perform optimization. If None, the last set optimizer or the default is used
optimizer_kwargs -- (dict) optional arguments for optimizer
objective_kwargs -- (dict) optional arguments for objective function
- Returns tuple
(opt_dict (dict), opt_value (float)) optimized opt_dict: Has identical structure as opt_dict
-
get_aq_solvent_name() → str
Returns aq_solvent_name
- Returns
aq_solvent_name: (str) name of aqueous solvent in xml file
-
get_aq_solvent_rho() → str
Returns aqueous solvent density (g/L)
- Returns
aq_solvent_rho: (float) density of aqueous solvent
-
get_complex_names() → list
Returns list of complex names
- Returns
complex_names: (list) names of complexes in xml file.
-
get_custom_objects_dict()
Returns the custom_objects_dict
- Returns
custom_objects_dict: (dict) dictionary containing information about custom objects from user
-
get_dependant_params_dict()
Returns the dependant_params_dict
- Returns
dependant_params_dict: (dict) dictionary containing information about parameters dependent on opt_dict
-
get_diluant_name() → str
Returns diluant name
- Returns
diluant_name: (str) name of diluant in xml file
-
get_diluant_rho() → str
Returns diluant density (g/L)
- Returns
diluant_rho: (float) density of diluant
-
get_exp_df() → pandas.core.frame.DataFrame
Returns the experimental DataFrame
- Returns
(pd.DataFrame) Experimental data
-
get_extractant_name() → str
Returns extractant name
- Returns
extractant_name: (str) name of extractant in xml file
-
get_extractant_rho() → str
Returns extractant density (g/L)
- Returns
extractant_rho: (float) density of extractant
-
get_extracted_species_ion_names() → list
Returns list of extracted species ion names
- Returns
extracted_species_ion_names: (list) names of extracted species ions in xml file
-
get_extracted_species_list() → list
Returns list of extracted species names
- Returns
extracted_species_list: (list) names of extracted species in xml file
-
get_in_moles() → pandas.core.frame.DataFrame
Returns the in_moles DataFrame, which contains the initial mole fractions of each species for each experiment
- Returns
in_moles: (pd.DataFrame) DataFrame with initial mole fractions
-
get_objective_function()
Returns objective function
- Returns
objective_function: (func) Objective function to quantify error between model and experimental data
-
get_opt_dict() → dict
Returns the dictionary containing optimization information
- Returns
(dict) dictionary containing info about which species parameters are updated to fit model to experimental data
-
get_optimizer()
Returns optimizer function
- Returns
optimizer: (func) Optimizer function to minimize objective function
-
get_phases() → list
Returns the list of Cantera solutions
- Returns
(list) list of Cantera solutions/phases
-
get_predicted_dict()
Returns the dictionary of species concentrations that the xml parameters predict for the current in_moles
- Returns
predicted_dict: (dict) dictionary of species concentrations
-
get_temp_xml_file_path()
Returns path to temporary xml file.
This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.
- Returns
temp_xml_file_path: (str) path to temporary xml file.
-
log_mean_squared_error(predicted_dict, meas_df)
Default objective function for LLEPE
Returns the log mean squared error of predicted distribution ratios (d=n_org/n_aq) to measured d.
np.sum((np.log10(d_pred)-np.log10(d_meas))**2)
- Parameters
predicted_dict -- (dict) contains predicted data
meas_df -- (pd.DataFrame) contains experimental data
- Returns
(float) log mean squared error between predicted and measured
-
parity_plot(compared_value=None, c_data=None, c_label=None, plot_title=None, save_path=None, print_r_squared=False, data_labels=None, legend=True)
Parity plot between measured and predicted compared_value. Default compared value is {ES_1}_aq_eq
- Parameters
compared_value -- (str) Quantity to compare predicted and experimental data. Can be any column containing "eq" in exp_df e.g. h_eq, z_eq, {ES}_d_eq, etc.
plot_title --
(str or boolean)
If None (default): Plot title will be generated from compared_value. For example, if compared_value is h_eq, plot_title is "H^+ eq conc". We recommend exploring the generated titles.
If str: Plot title will be plot_title string
If False: No plot title
c_data -- (list or np.ndarray) data for color axis
c_label -- (str) label for color axis
save_path -- (str) save path for parity plot
print_r_squared -- (boolean) Whether to show the r-squared value on the plot. The value is shown to 2 decimal places
data_labels -- labels for the data such as paper's name where experiment is pulled from.
legend -- whether to display legend for data_labels. Has no effect if data_labels is None
- Returns
fig, ax: the figure and axes objects
-
static plot_3d_data(x_data, y_data, z_data, c_data=None, x_label=None, y_label=None, z_label=None, c_label=None)
This is for plotting 3D scatter plots. We suggest using matplotlib's ax.scatter to make 3D plots.
- Parameters
x_data -- (list) list of data for x axis
y_data -- (list) list of data for y axis
z_data -- (list) list of data for z axis
c_data -- (list) list of data for color axis
x_label -- (str) label for x axis
y_label -- (str) label for y axis
z_label -- (str) label for z axis
c_label -- (str) label for color axis
- Returns
-
r_squared(compared_value=None)
r-squared value comparing measured and predicted compared value
The closer to 1, the better the model's predictions.
- Parameters
compared_value -- (str) Quantity to compare predicted and experimental data. Can be any column containing "eq" in exp_df e.g. h_eq, z_eq, {ES}_d_eq, etc. Default is {ES}_aq_eq
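For reference, the usual coefficient of determination can be computed as in the sketch below. This is the standard textbook definition, assumed here for illustration; the page does not state the exact formula the method uses, and the function name is hypothetical.

```python
def coefficient_of_determination(measured, predicted):
    """Return 1 - SS_res / SS_tot for paired measured and predicted values."""
    mean = sum(measured) / len(measured)
    # Residual sum of squares: how far predictions are from measurements.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares: spread of the measurements around their mean.
    ss_tot = sum((m - mean) ** 2 for m in measured)
    return 1 - ss_res / ss_tot
```

A perfect prediction gives exactly 1, and larger residuals push the value down, which is why values closer to 1 indicate a better fit.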
-
static scipy_minimize(objective, x_guess, optimizer_kwargs=None)
The default optimizer for LLEPE
Uses scipy.optimize.minimize
By default, options are
default_kwargs= {"method": 'SLSQP', "bounds": [(1e-1, 1e1)]*len(x_guess), "constraints": (), "options": {'disp': True, 'maxiter': 1000, 'ftol': 1e-6}}
- Parameters
objective -- (func) the objective function
x_guess -- (np.ndarray) the initial guess (always 1)
optimizer_kwargs -- (dict) dictionary of options for minimize
- Returns
((np.ndarray, float)) Optimized parameters, objective_function value
-
set_aq_solvent_name(aq_solvent_name)
Change aq_solvent_name to input aq_solvent_name
- Parameters
aq_solvent_name -- (str) name of aqueous solvent in xml file
-
set_aq_solvent_rho(aq_solvent_rho)
Changes aqueous solvent density (g/L) to input aq_solvent_rho
- Parameters
aq_solvent_rho -- (float) density of aqueous solvent
-
set_complex_names(complex_names)
Change complex names list to input complex_names
- Parameters
complex_names -- (list) names of complexes in xml file.
-
set_custom_objects_dict(custom_objects_dict)
Sets the custom_objects_dict
- Parameters
custom_objects_dict -- (dict) dictionary containing information about custom objects from user
-
set_dependant_params_dict(dependant_params_dict)
Sets the dependant_params_dict
- Parameters
dependant_params_dict -- (dict) dictionary containing information about parameters dependent on opt_dict
-
set_diluant_name(diluant_name)
Change diluant_name to input diluant_name
- Parameters
diluant_name -- (str) name of diluant in xml file
-
set_diluant_rho(diluant_rho)
Changes diluant density (g/L) to input diluant_rho
- Parameters
diluant_rho -- (float) density of diluant
-
set_exp_df(exp_data)
Changes the experimental DataFrame to the input exp_data and renames columns to internal LLEPE names
h_i, h_eq, z_i, z_eq, {ES}_aq_i, {ES}_aq_eq, {ES}_d_eq
See class docstring on "exp_data" for further explanations.
- Parameters
exp_data -- (str or pd.DataFrame) file name/path or DataFrame for experimental data csv
-
set_extractant_name(extractant_name)
Change extractant_name to input extractant_name
- Parameters
extractant_name -- (str) name of extractant in xml file
-
set_extractant_rho(extractant_rho)
Changes extractant density (g/L) to input extractant_rho
- Parameters
extractant_rho -- (float) density of extractant
-
set_extracted_species_ion_names(extracted_species_ion_names)
Change list of extracted species ion names to input extracted_species_ion_names
- Parameters
extracted_species_ion_names -- (list) names of extracted species ions in xml file
-
set_extracted_species_list(extracted_species_list)
Change list of extracted species names to input extracted_species_list
- Parameters
extracted_species_list -- (list) names of extracted species in xml file
-
set_in_moles(feed_vol)
Initializes the mole fractions for the input feed_vol
This function is called at initialization
Sets in_moles to a pd.DataFrame containing initial mole fractions
Columns for species and rows for different experiments
This function also calls update_predicted_dict
- Parameters
feed_vol -- (float) feed volume of mixture (L)
-
set_objective_function(objective_function)
Change objective function to input objective_function.
See class docstring on "objective_function" for instructions
- Parameters
objective_function -- (func) Objective function to quantify error between model and experimental data
-
set_opt_dict(opt_dict)
Change the dictionary to input opt_dict.
opt_dict specifies species parameters to be updated to fit model to data
See class docstring on "opt_dict" for more information.
- Parameters
opt_dict -- (dict) dictionary containing info about which species parameters are updated to fit model to experimental data
-
set_optimizer(optimizer)
Change optimizer function to input optimizer.
See class docstring on "optimizer" for instructions
- Parameters
optimizer -- (func) Optimizer function to minimize objective function
-
set_phases(phases_xml_filename, phase_names)
Change list of Cantera solutions by inputting new xml file name and phase names.
Also runs set_in_moles to set feed volume to 1 L
- Parameters
phases_xml_filename -- (str) xml file with parameters for equilibrium calc
phase_names -- (list) names of phases in xml file
-
set_temp_xml_file_path(temp_xml_file_path)
Changes temporary xml file path to input temp_xml_file_path.
This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.
- Parameters
temp_xml_file_path -- (str) path to temporary xml file.
-
update_custom_objects_dict(info_dict)
Updates internal custom_objects_dict with info_dict
- Parameters
info_dict -- Requires an identical structure to opt_dict. Ignores items with keys containing "custom_object_name"
- Returns
None
-
update_predicted_dict(phases_xml_filename=None, phase_names=None)
Computes the equilibrium concentrations that the parameters in phases_xml_filename predict, given the initial mole fractions in in_moles
- Parameters
phases_xml_filename -- (str) xml file with parameters for equilibrium calc. If None, the current phases_xml_filename is used.
phase_names -- (list) names of phases in xml file. If None, the current phase_names is used.
-
update_xml(info_dict, phases_xml_filename=None, dependant_params_dict=None)
Updates xml file with info_dict
- Parameters
info_dict -- (dict) Requires an identical structure to opt_dict. Ignores items with keys containing "custom_object_name"
phases_xml_filename -- (str) xml filename if editing other xml. If None, the current xml will be modified and the internal Cantera phases will be refreshed to the new values.
dependant_params_dict -- (dict) dictionary containing information about parameters dependent on info_dict