REEPS
Parameters
- class reeps.REEPS(exp_csv_filename, phases_xml_filename, phase_names, aq_solvent_name, extractant_name, diluant_name, complex_names, rare_earth_ion_names, re_species_list=None, aq_solvent_rho=None, extractant_rho=None, diluant_rho=None, opt_dict=None, objective_function='Log-MSE', optimizer='SLSQP', temp_xml_file_path=None)
Rare earth elements (REE or RE). Takes in experimental data and returns parameters for GEM.
Note
The order in which the REEs appear in the csv file must match the order in which they appear in the xml, complex_names and rare_earth_ion_names.
For example, say that in the exp_csv_filename csv RE_1 is Nd and RE_2 is Pr, and
aq_solvent_name = 'H2O(L)'
extractant_name = '(HA)2(org)'
diluant_name = 'dodecane'
Then:
The csv's column ordering must be:
[h_i, h_eq, z_i, z_eq, Nd_aq_i, Nd_aq_eq, Nd_d_eq, Pr_aq_i, Pr_aq_eq, Pr_d_eq]
The aqueous speciesArray must be "H2O(L) H+ OH- Cl- Nd+++ Pr+++"
The organic speciesArray must be "(HA)2(org) dodecane Nd(H(A)2)3(org) Pr(H(A)2)3(org)"
complex_names = ['Nd(H(A)2)3(org)', 'Pr(H(A)2)3(org)']
rare_earth_ion_names = ['Nd+++', 'Pr+++']
- Parameters
exp_csv_filename --
(str) csv file name with experimental data
In the .csv file, rows are different experiments and columns are the measured quantities.
The columns must be ordered as:
[h_i, h_eq, z_i, z_eq, {RE_1}_aq_i, {RE_1}_aq_eq, {RE_1}_d_eq, {RE_2}_aq_i, {RE_2}_aq_eq, {RE_2}_d_eq,... {RE_N}_aq_i, {RE_N}_aq_eq, {RE_N}_d_eq]
where {RE_1}-{RE_N} are the rare earth element names of interest, e.g. Nd, Pr, La.
Column names do not matter, only their order.
The columns are explained below.
Index  Column       Meaning
0      h_i          Initial concentration of H+ ions (mol/L)
1      h_eq         Equilibrium concentration of H+ ions (mol/L)
2      z_i          Initial concentration of extractant (mol/L)
3      z_eq         Equilibrium concentration of extractant (mol/L)
4      {RE}_aq_i    Initial concentration of RE ions (mol/L)
5      {RE}_aq_eq   Equilibrium concentration of RE ions in aqueous phase (mol/L)
6      {RE}_d_eq    Equilibrium ratio between amount of RE atoms in organic to aqueous
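The column ordering described above can be generated programmatically. The following is a minimal sketch; expected_columns is a hypothetical helper written for illustration and is not part of REEPS.

```python
def expected_columns(re_names):
    """Return the csv column order described above for the given RE names.

    Hypothetical helper for illustration; not part of the REEPS API.
    """
    # The first four columns are shared by all rare earth elements.
    columns = ["h_i", "h_eq", "z_i", "z_eq"]
    # Each rare earth element then contributes three consecutive columns.
    for re_name in re_names:
        columns += [f"{re_name}_aq_i", f"{re_name}_aq_eq", f"{re_name}_d_eq"]
    return columns


columns = expected_columns(["Nd", "Pr"])
print(columns)
```

Because REEPS only relies on column order, a list like this may be useful for checking a csv header before passing the file name to the class.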
phases_xml_filename --
(str) xml file with parameters for equilibrium calc
It is recommended to copy and modify one of the xmls located in data/xmls or in Cantera's "data" folder
speciesArray fields need specific ordering.
In aqueous phase: aq_solvent_name, H+, OH-, Cl-, RE_1, RE_2, ..., RE_N
For aqueous phase, RE_1-RE_N represent RE ion names, e.g. Nd+++, Pr+++
In organic phase: extractant_name, diluant_name, RE_1, RE_2, ..., RE_N
For organic phase, RE_1-RE_N represent RE complex names, e.g. Nd(H(A)2)3(org), Pr(H(A)2)3(org)
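As a sketch of the ordering rules above, the two speciesArray strings can be assembled from the same names that are passed to the class. species_arrays is a hypothetical helper for illustration only; REEPS reads these strings from the xml and does not build them.

```python
def species_arrays(aq_solvent_name, extractant_name, diluant_name,
                   rare_earth_ion_names, complex_names):
    """Build the aqueous and organic speciesArray strings in the required order.

    Hypothetical helper for illustration; not part of the REEPS API.
    """
    # Aqueous phase: solvent, H+, OH-, Cl-, then the RE ions in order.
    aqueous = [aq_solvent_name, "H+", "OH-", "Cl-", *rare_earth_ion_names]
    # Organic phase: extractant, diluant, then the RE complexes in order.
    organic = [extractant_name, diluant_name, *complex_names]
    return " ".join(aqueous), " ".join(organic)


aqueous_array, organic_array = species_arrays(
    "H2O(L)", "(HA)2(org)", "dodecane",
    ["Nd+++", "Pr+++"],
    ["Nd(H(A)2)3(org)", "Pr(H(A)2)3(org)"],
)
print(aqueous_array)
print(organic_array)
```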
phase_names --
(list) names of phases in xml file
Found in the xml file under <phase ... id={phase_name}>
aq_solvent_name -- (str) name of aqueous solvent in xml file
extractant_name -- (str) name of extractant in xml file
diluant_name -- (str) name of diluant in xml file
complex_names --
(list) names of complexes in xml file.
Ensure the ordering is correct
rare_earth_ion_names --
(list) names of rare earth ions in xml file
Ensure the ordering is correct
re_species_list --
(list) names of rare earth elements.
- If None, re_species_list will be rare_earth_ion_names without '+', e.g. 'Nd+++' -> 'Nd'
Ensure the ordering is correct
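The default described for re_species_list amounts to stripping the '+' characters from each ion name. A minimal sketch of that rule, assuming the ion names contain no other '+' characters:

```python
rare_earth_ion_names = ["Nd+++", "Pr+++"]

# Remove the charge signs while keeping the original ordering.
re_species_list = [name.replace("+", "") for name in rare_earth_ion_names]
print(re_species_list)
```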
aq_solvent_rho --
(float) density of solvent (g/L)
If None, molar volume/molecular weight is used from xml
extractant_rho --
(float) density of extractant (g/L)
If None, molar volume/molecular weight is used from xml
diluant_rho --
(float) density of diluant (g/L)
If None, molar volume/molecular weight is used from xml
opt_dict --
(dict) dictionary containing info about which species parameters are updated to fit model to experimental data
It should have the format below
opt_dict = {"species1": {"parameter1": "guess1", "parameter2": "guess2", ... "parameterN": "guessN"},
            "species2": {"parameter1": "guess1", "parameter2": "guess2", ... "parameterN": "guessN"},
            ...
            "speciesN": {"parameter1": "guess1", "parameter2": "guess2", ... "parameterN": "guessN"}}
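A concrete opt_dict following this format might look like the sketch below. The Nd guess for 'h0' is the value used in this class docstring's optimizer note; the Pr guess is a placeholder assumed purely for illustration.

```python
complex_names = ["Nd(H(A)2)3(org)", "Pr(H(A)2)3(org)"]

# Initial guesses for the 'h0' parameter of each complex.
# The Pr value is a hypothetical placeholder, not a recommended guess.
h0_guesses = {
    "Nd(H(A)2)3(org)": -4.7e6,
    "Pr(H(A)2)3(org)": -4.7e6,
}

# One inner dictionary per species, mapping parameter name to initial guess.
opt_dict = {name: {"h0": h0_guesses[name]} for name in complex_names}
print(opt_dict)
```

Building the dictionary from complex_names keeps the species in the same order as the rest of the inputs.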
objective_function --
(function or str) function to compute objective
By default, the objective function is the log mean squared error of the distribution ratio
np.sum((np.log10(d_pred)-np.log10(d_meas))**2)
Function needs to take inputs:
objective_function(predicted_dict, measured_df, kwargs)
kwargs is optional
Function needs to return: (float) value computed by objective function
Below is the guide for referencing predicted values
To access                    Use
hydrogen ion conc in aq      predicted_dict['h_eq']
extractant conc in org       predicted_dict['z_eq']
RE ion eq conc in aq         predicted_dict['{RE}_aq_eq']
RE complex eq conc in org    predicted_dict['{RE}_org_eq']
RE distribution ratio        predicted_dict['{RE}_d_eq']
Replace "{RE}" with the rare earth element, e.g. Nd, La.
For measured values, use the same names, but replace predicted_dict with measured_df
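A custom objective function with the required inputs could be sketched as follows. It reproduces the default formula for a chosen set of elements using only the standard library. The "re_names" key in kwargs is a hypothetical option introduced for this example, and the values under each key are assumed to be iterable (a list, array or DataFrame column).

```python
import math


def log_squared_error(predicted_dict, measured_df, kwargs=None):
    """Sum of squared log10 differences of distribution ratios.

    "re_names" is a hypothetical kwargs entry used only in this sketch.
    """
    kwargs = kwargs or {}
    re_names = kwargs.get("re_names", ["Nd"])
    total = 0.0
    for re_name in re_names:
        key = f"{re_name}_d_eq"
        # The same key is used for predicted and measured values.
        for d_pred, d_meas in zip(predicted_dict[key], measured_df[key]):
            total += (math.log10(d_pred) - math.log10(d_meas)) ** 2
    return total


# Plain dictionaries stand in for the real predicted_dict and measured_df.
predicted = {"Nd_d_eq": [1.0, 10.0]}
measured = {"Nd_d_eq": [10.0, 10.0]}
error = log_squared_error(predicted, measured, {"re_names": ["Nd"]})
print(error)
```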
optimizer --
(function or str) function to perform optimization
Note
The optimized variables are not the species parameters themselves. Each one is multiplied by its initial guess to become the species parameter.
For example, say
opt_dict = {'Nd(H(A)2)3(org)': {'h0': -4.7e6}}
If the bounds on h0 need to be [-4.7e7, -4.7e5], then divide the bounds by the guess and get
"bounds": [(1e-1, 1e1)]
fit() returns a structure identical to opt_dict with correctly scaled values. However, when bounds and constraints are used, note that the optimized x's are multiplied by the initial guess before being written to the xml.
By default, the optimizer is scipy.optimize.minimize with
default_kwargs= {"method": 'SLSQP', "bounds": [(1e-1, 1e1)] * len(x_guess), "constraints": (), "options": {'disp': True, 'maxiter': 1000, 'ftol': 1e-6}}
Function needs to take inputs:
optimizer(objective_function, x_guess, kwargs)
kwargs is optional
Function needs to return: (np.ndarray) Optimized parameters
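A custom optimizer with the required inputs could be sketched as below. This is a simple bounded coordinate search written with the standard library only, not the SLSQP method REEPS uses by default. It assumes the objective passed in is callable on a sequence of scaled variables, and the "bounds", "step" and "tol" keys in kwargs are hypothetical. It returns a list, so for use with REEPS the result would likely need wrapping in an np.ndarray.

```python
def coordinate_search_optimizer(objective_function, x_guess, kwargs=None):
    """Minimize objective_function by a bounded coordinate search.

    Illustrative sketch only; the kwargs keys are hypothetical.
    """
    kwargs = kwargs or {}
    lower, upper = kwargs.get("bounds", (1e-1, 1e1))
    step = kwargs.get("step", 0.5)
    tol = kwargs.get("tol", 1e-6)

    x = list(x_guess)
    best = objective_function(x)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                # Keep every trial point inside the bounds.
                trial[i] = min(upper, max(lower, x[i] + delta))
                value = objective_function(trial)
                if value < best:
                    x, best, improved = trial, value, True
        if not improved:
            # No move helped at this step size, so refine the search.
            step /= 2
    return x


def example_objective(x):
    # Toy objective with its minimum at (2.0, 0.5).
    return (x[0] - 2.0) ** 2 + (x[1] - 0.5) ** 2


best_x = coordinate_search_optimizer(example_objective, [1.0, 1.0])
print(best_x)
```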
temp_xml_file_path --
(str) path to temporary xml file.
This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.
Default is the local temp folder.
- fit(objective_function=None, optimizer=None, objective_kwargs=None, optimizer_kwargs=None) → dict
Fits the model to experimental data by minimizing the objective function with the optimizer. Returns a dictionary with the opt_dict structure
- Parameters
objective_function -- (function) function to compute objective
optimizer -- (function) function to perform optimization
optimizer_kwargs -- (dict) arguments for optimizer
objective_kwargs -- (dict) arguments for objective function
- Returns
opt_dict: (dict) optimized opt_dict. Has the same structure as the input opt_dict
- get_aq_solvent_name() → str
Returns aq_solvent_name
- Returns
aq_solvent_name: (str) name of aqueous solvent in xml file
- get_aq_solvent_rho() → str
Returns aqueous solvent density (g/L)
- Returns
aq_solvent_rho: (float) density of aqueous solvent
- get_complex_names() → list
Returns list of complex names
- Returns
complex_names: (list) names of complexes in xml file.
- get_diluant_name() → str
Returns diluant name
- Returns
diluant_name: (str) name of diluant in xml file
- get_diluant_rho() → str
Returns diluant density (g/L)
- Returns
diluant_rho: (float) density of diluant
- get_exp_df() → pandas.core.frame.DataFrame
Returns the experimental DataFrame
- Returns
(pd.DataFrame) Experimental data
- get_extractant_name() → str
Returns extractant name
- Returns
extractant_name: (str) name of extractant in xml file
- get_extractant_rho() → str
Returns extractant density (g/L)
- Returns
extractant_rho: (float) density of extractant
- get_in_moles() → pandas.core.frame.DataFrame
Returns the in_moles DataFrame, which contains the initial mole fractions of each species for each experiment
- Returns
in_moles: (pd.DataFrame) DataFrame with initial mole fractions
- get_objective_function()
Returns objective function
- Returns
objective_function: (func) Objective function to quantify error between model and experimental data
- get_opt_dict() → dict
Returns the dictionary containing optimization information
- Returns
(dict) dictionary containing info about which species parameters are updated to fit model to experimental data
- get_optimizer()
Returns optimizer function
- Returns
optimizer: (func) Optimizer function to minimize objective function
- get_phases() → list
Returns the list of Cantera solutions
- Returns
(list) list of Cantera solutions/phases
- get_predicted_dict()
Returns the dictionary of species concentrations that the xml parameters predict for the current in_moles
- Returns
predicted_dict: (dict) dictionary of species concentrations
- get_rare_earth_ion_names() → list
Returns list of rare earth ion names
- Returns
rare_earth_ion_names: (list) names of rare earth ions in xml file
- get_re_species_list() → list
Returns list of rare earth element names
- Returns
re_species_list: (list) names of rare earth elements in xml file
- get_temp_xml_file_path()
Returns path to temporary xml file.
This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.
- Returns
temp_xml_file_path: (str) path to temporary xml file.
- log_mean_squared_error(predicted_dict, meas_df)
Default objective function for REEPS
Returns the log mean squared error of predicted distribution ratios (d=n_org/n_aq) to measured d.
np.sum((np.log10(d_pred)-np.log10(d_meas))**2)
- Parameters
predicted_dict -- (dict) contains predicted data
meas_df -- (pd.DataFrame) contains experimental data
- Returns
(float) log mean squared error between predicted and measured
- parity_plot(compared_value=None, save_path=None, print_r_squared=False)
Parity plot between measured and predicted compared_value. Default compared value is {RE_1}_aq_eq
- Parameters
compared_value -- (str) Quantity to compare predicted and experimental data. Can be any column containing "eq" in exp_df, e.g. h_eq, z_eq, {RE}_d_eq.
save_path -- (str) save path for parity plot
print_r_squared -- (boolean) Whether to show the r-squared value on the plot. The value is shown to 2 decimal places
- r_squared(compared_value=None)
r-squared value comparing measured and predicted compared_value
The closer to 1, the better the model's predictions.
- Parameters
compared_value -- (str) Quantity to compare predicted and experimental data. Can be any column containing "eq" in exp_df, e.g. h_eq, z_eq, {RE}_d_eq.
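For orientation, an r-squared value between measured and predicted data can be computed as in the sketch below. It assumes the standard coefficient of determination, 1 - SS_res/SS_tot. The documentation does not state which formula REEPS uses, so this may differ from the library's own calculation. coefficient_of_determination is a hypothetical name.

```python
def coefficient_of_determination(measured, predicted):
    """Standard r-squared, 1 - SS_res / SS_tot.

    Assumed formula for illustration; not taken from the REEPS source.
    """
    mean_measured = sum(measured) / len(measured)
    # Residual sum of squares between measured and predicted values.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares of the measured values about their mean.
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


perfect = coefficient_of_determination([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
partial = coefficient_of_determination([1.0, 2.0, 3.0], [1.0, 2.0, 4.0])
print(perfect, partial)
```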
- set_aq_solvent_name(aq_solvent_name)
Change aq_solvent_name to input aq_solvent_name
- Parameters
aq_solvent_name -- (str) name of aqueous solvent in xml file
- set_aq_solvent_rho(aq_solvent_rho)
Changes aqueous solvent density (g/L) to input aq_solvent_rho
- Parameters
aq_solvent_rho -- (float) density of aqueous solvent
- set_complex_names(complex_names)
Change complex names list to input complex_names
- Parameters
complex_names -- (list) names of complexes in xml file.
- set_diluant_name(diluant_name)
Change diluant_name to input diluant_name
- Parameters
diluant_name -- (str) name of diluant in xml file
- set_diluant_rho(diluant_rho)
Changes diluant density (g/L) to input diluant_rho
- Parameters
diluant_rho -- (float) density of diluant
- set_exp_df(exp_csv_filename)
Changes the experimental DataFrame to the data in exp_csv_filename and renames columns to internal REEPS names
h_i, h_eq, z_i, z_eq, {RE}_aq_i, {RE}_aq_eq, {RE}_d_eq
See class docstring on "exp_csv_filename" for further explanations.
- Parameters
exp_csv_filename -- (str) file name/path for experimental data csv
- set_extractant_name(extractant_name)
Change extractant_name to input extractant_name
- Parameters
extractant_name -- (str) name of extractant in xml file
- set_extractant_rho(extractant_rho)
Changes extractant density (g/L) to input extractant_rho
- Parameters
extractant_rho -- (float) density of extractant
- set_in_moles(feed_vol)
Function that initializes mole fractions to input feed_vol
This function is called at initialization.
Sets in_moles to a pd.DataFrame containing initial mole fractions, with columns for species and rows for different experiments.
This function also calls update_predicted_dict
- Parameters
feed_vol -- (float) feed volume of mixture (g/L)
- set_objective_function(objective_function)
Change objective function to input objective_function.
See class docstring on "objective_function" for instructions
- Parameters
objective_function -- (func) Objective function to quantify error between model and experimental data
- set_opt_dict(opt_dict)
Change the dictionary to input opt_dict.
opt_dict specifies species parameters to be updated to fit model to data
See class docstring on "opt_dict" for more information.
- Parameters
opt_dict -- (dict) dictionary containing info about which species parameters are updated to fit model to experimental data
- set_optimizer(optimizer)
Change optimizer function to input optimizer.
See class docstring on "optimizer" for instructions
- Parameters
optimizer -- (func) Optimizer function to minimize objective function
- set_phases(phases_xml_filename, phase_names)
Change list of Cantera solutions by inputting new xml file name and phase names
Also runs set_in_moles to set initial molality to 1 g/L
- Parameters
phases_xml_filename -- (str) xml file with parameters for equilibrium calc
phase_names -- (list) names of phases in xml file
- set_rare_earth_ion_names(rare_earth_ion_names)
Change list of rare earth ion names to input rare_earth_ion_names
- Parameters
rare_earth_ion_names -- (list) names of rare earth ions in xml file
- set_re_species_list(re_species_list)
Change list of rare earth element names to input re_species_list
- Parameters
re_species_list -- (list) names of rare earth elements in xml file
- set_temp_xml_file_path(temp_xml_file_path)
Changes temporary xml file path to input temp_xml_file_path.
This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.
- Parameters
temp_xml_file_path -- (str) path to temporary xml file.
- static slsqp_optimizer(objective, x_guess)
The default optimizer for REEPS
Uses scipy.optimize.minimize with options
default_kwargs= {"method": 'SLSQP', "bounds": [(1e-1, 1e1)] * len(x_guess), "constraints": (), "options": {'disp': True, 'maxiter': 1000, 'ftol': 1e-6}}
- Parameters
objective -- (func) the objective function
x_guess -- (np.ndarray) the initial guess (always 1)
- Returns
(np.ndarray) Optimized parameters
- update_predicted_dict(phases_xml_filename=None, phase_names=None)
Computes the equilibrium concentrations that the parameters in phases_xml_filename predict, given the initial mole fractions set by set_in_moles()
- Parameters
phases_xml_filename -- (str) xml file with parameters for equilibrium calc. If None, the current phases_xml_filename is used.
phase_names -- (list) names of phases in xml file. If None, the current phase_names is used.
- update_xml(info_dict, phases_xml_filename=None)
Updates xml file with info_dict
- Parameters
info_dict -- (dict) info in {species_names:{thermo_prop:val}}. Requires an identical structure to opt_dict
phases_xml_filename -- (str) xml filename if editing another xml. If None, the current xml will be modified and the internal Cantera phases will be refreshed to the new values.