REEPS

Parameters

class reeps.REEPS(exp_csv_filename, phases_xml_filename, phase_names, aq_solvent_name, extractant_name, diluant_name, complex_names, rare_earth_ion_names, re_species_list=None, aq_solvent_rho=None, extractant_rho=None, diluant_rho=None, opt_dict=None, objective_function='Log-MSE', optimizer='SLSQP', temp_xml_file_path=None)

Takes in experimental data for rare earth elements (REE or RE) and returns parameters for GEM.

Note

The order in which the REEs appear in the csv file must be the same order as they appear in the xml, complex_names and rare_earth_ion_names.

For example, say in exp_csv_filename's csv, RE_1 is Nd, RE_2 is Pr, and

aq_solvent_name = 'H2O(L)'
extractant_name = '(HA)2(org)'
diluant_name = 'dodecane'

Then:

The csv's column ordering must be:

[h_i, h_eq, z_i, z_eq, Nd_aq_i, Nd_aq_eq, Nd_d_eq, Pr_aq_i, Pr_aq_eq, Pr_d_eq]

The aqueous speciesArray must be "H2O(L) H+ OH- Cl- Nd+++ Pr+++"

The organic speciesArray must be "(HA)2(org) dodecane Nd(H(A)2)3(org) Pr(H(A)2)3(org)"

complex_names = ['Nd(H(A)2)3(org)', 'Pr(H(A)2)3(org)']
rare_earth_ion_names = ['Nd+++', 'Pr+++']
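As a sketch, the Nd/Pr example above could be collected into constructor arguments as shown below. The csv/xml file names and the phase names are hypothetical placeholders, not values shipped with the package. Everything else is taken from the example.

```python
# Keyword arguments for reeps.REEPS matching the Nd/Pr example above.
# NOTE: the file names and phase names are hypothetical placeholders.
reeps_kwargs = {
    "exp_csv_filename": "nd_pr_experiments.csv",  # hypothetical file name
    "phases_xml_filename": "nd_pr_phases.xml",    # hypothetical file name
    "phase_names": ["aqueous", "organic"],        # hypothetical; must match <phase id=...>
    "aq_solvent_name": "H2O(L)",
    "extractant_name": "(HA)2(org)",
    "diluant_name": "dodecane",
    "complex_names": ["Nd(H(A)2)3(org)", "Pr(H(A)2)3(org)"],
    "rare_earth_ion_names": ["Nd+++", "Pr+++"],
}

# Both name lists must follow the csv ordering (Nd first, then Pr).
assert len(reeps_kwargs["complex_names"]) == len(reeps_kwargs["rare_earth_ion_names"])

# With the package installed and the files present, one would then call:
#     from reeps import REEPS
#     estimator = REEPS(**reeps_kwargs)
```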

Parameters

exp_csv_filename --

(str) csv file name with experimental data

In the .csv file, the rows are different experiments and columns are the measured quantities.

The ordering of the columns needs to be:

[h_i, h_eq, z_i, z_eq, {RE_1}_aq_i, {RE_1}_aq_eq, {RE_1}_d_eq, {RE_2}_aq_i, {RE_2}_aq_eq, {RE_2}_d_eq,... {RE_N}_aq_i, {RE_N}_aq_eq, {RE_N}_d_eq]

Column names do not matter, only the order.

Here {RE_1} to {RE_N} are the rare earth element names of interest, e.g. Nd, Pr, La.

Below is an explanation of the columns.

Index	Column	Meaning
0	h_i	Initial Concentration of H+ ions (mol/L)
1	h_eq	Equilibrium concentration of H+ ions (mol/L)
2	z_i	Initial concentration of extractant (mol/L)
3	z_eq	Equilibrium concentration of extractant (mol/L)
4	{RE}_aq_i	Initial concentration of RE ions (mol/L)
5	{RE}_aq_eq	Equilibrium concentration of RE ions in aqueous phase (mol/L)
6	{RE}_d_eq	Equilibrium ratio of the amount of RE atoms in the organic phase to that in the aqueous phase
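As a small sketch of the ordering rule above, the helper below (hypothetical, not part of REEPS) builds the column list for any set of rare earth names. The column names themselves are only illustrative, since REEPS relies on the order.

```python
def reeps_column_order(re_names):
    """Return the column order described above for the given rare earth names."""
    columns = ["h_i", "h_eq", "z_i", "z_eq"]
    for re_name in re_names:
        # Each rare earth contributes three consecutive columns.
        columns += [f"{re_name}_aq_i", f"{re_name}_aq_eq", f"{re_name}_d_eq"]
    return columns


print(reeps_column_order(["Nd", "Pr"]))
```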

phases_xml_filename --
(str) xml file with parameters for equilibrium calc

It is recommended to copy and modify the xmls located in data/xmls or in Cantera's "data" folder.

speciesArray fields need specific ordering.

In aqueous phase: aq_solvent_name, H+, OH-, Cl-, RE_1, RE_2, ..., RE_N

For aqueous phase, RE_1-RE_N represent RE ion names i.e. Nd+++, Pr+++

In organic phase : extractant_name, diluant_name, RE_1, RE_2, ..., RE_N

For organic phase, RE_1-RE_N represent RE complex names i.e. Nd(H(A)2)3(org), Pr(H(A)2)3(org)
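The two ordering rules above can be sketched as a helper that assembles the speciesArray strings from the constructor arguments. The function name is hypothetical and the helper is not part of REEPS. It only illustrates the expected ordering.

```python
def species_arrays(aq_solvent_name, extractant_name, diluant_name,
                   rare_earth_ion_names, complex_names):
    """Build the aqueous and organic speciesArray strings in the required order."""
    # Aqueous phase: solvent, H+, OH-, Cl-, then the RE ions.
    aqueous = [aq_solvent_name, "H+", "OH-", "Cl-", *rare_earth_ion_names]
    # Organic phase: extractant, diluant, then the RE complexes.
    organic = [extractant_name, diluant_name, *complex_names]
    return " ".join(aqueous), " ".join(organic)


aq, org = species_arrays("H2O(L)", "(HA)2(org)", "dodecane",
                         ["Nd+++", "Pr+++"],
                         ["Nd(H(A)2)3(org)", "Pr(H(A)2)3(org)"])
print(aq)
print(org)
```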
phase_names --
(list) names of phases in xml file

Found in the xml file under <phase ... id={phase_name}>
aq_solvent_name -- (str) name of aqueous solvent in xml file
extractant_name -- (str) name of extractant in xml file
diluant_name -- (str) name of diluant in xml file
complex_names --
(list) names of complexes in xml file.

Ensure the ordering is correct
rare_earth_ion_names --
(list) names of rare earth ions in xml file

Ensure the ordering is correct
re_species_list --
(list) names of rare earth elements.

If None, re_species_list will be rare_earth_ion_names without '+'
i.e. 'Nd+++'->'Nd'

Ensure the ordering is correct
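The default described above amounts to stripping the charge signs from each ion name. A minimal sketch follows. The helper name is hypothetical, and how REEPS implements this internally is an assumption.

```python
def default_re_species_list(rare_earth_ion_names):
    """Drop every '+' from each ion name, keeping the original order."""
    return [name.replace("+", "") for name in rare_earth_ion_names]


print(default_re_species_list(["Nd+++", "Pr+++"]))
```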
aq_solvent_rho --
(float) density of solvent (g/L)

If None, molar volume/molecular weight is used from xml
extractant_rho --
(float) density of extractant (g/L)

If None, molar volume/molecular weight is used from xml
diluant_rho --
(float) density of diluant (g/L)

If None, molar volume/molecular weight is used from xml

opt_dict --

(dict) dictionary containing info about which species parameters are updated to fit model to experimental data

Should have the format as below

opt_dict = {"species1":
                       {"parameter1": "guess1",
                        "parameter2": "guess2",
                        ...
                        "parameterN": "guessN"},
            "species2":
                       {"parameter1": "guess1",
                        "parameter2": "guess2",
                        ...
                        "parameterN": "guessN"},
            ...
            "speciesN":
                       {"parameter1": "guess1",
                        "parameter2": "guess2",
                        ...
                        "parameterN": "guessN"}
           }
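A concrete instance of this format might look like the sketch below. The Nd guess for h0 is the value used later in this docstring. The Pr guess is a hypothetical number chosen only for illustration.

```python
# One entry per species, each mapping parameter name -> initial guess.
opt_dict = {
    "Nd(H(A)2)3(org)": {"h0": -4.7e6},
    "Pr(H(A)2)3(org)": {"h0": -4.8e6},  # hypothetical guess
}

# Flatten to (species, parameter) pairs, e.g. to see how many variables are fitted.
fitted = [(species, parameter)
          for species, parameters in opt_dict.items()
          for parameter in parameters]
print(fitted)
```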

objective_function --

(function or str) function to compute objective

By default, the objective function is log mean squared error of distribution ratio

np.sum((np.log10(d_pred)-np.log10(d_meas))**2)

Function needs to take inputs:

objective_function(predicted_dict, measured_df, kwargs)

kwargs is optional

Function needs to return: (float) value computed by objective function

Below is the guide for referencing predicted values

To access	Use
hydrogen ion conc in aq	predicted_dict['h_eq']
extractant conc in org	predicted_dict['z_eq']
RE ion eq conc in aq	predicted_dict['{RE}_aq_eq']
RE complex eq conc in org	predicted_dict['{RE}_org_eq']
RE distribution ratio	predicted_dict['{RE}_d_eq']

Replace "{RE}" with rare earth element i.e. Nd, La, etc.

For measured values, use the same names, but replace predicted_dict with measured_df
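A custom objective following the interface above could look like this sketch. The function name and the "re_names" option are hypothetical. Passing kwargs as a single optional dict is also an assumption about how REEPS forwards it. The loop uses only key lookup and iteration, so it works with a dict of sequences as well as a DataFrame.

```python
import math


def log_sse_aq(predicted_dict, measured_df, kwargs=None):
    """Sum of squared log10 errors on aqueous equilibrium concentrations."""
    kwargs = kwargs or {}
    re_names = kwargs.get("re_names", ["Nd"])  # hypothetical option
    total = 0.0
    for re_name in re_names:
        key = f"{re_name}_aq_eq"
        # Same key for predicted and measured values, as described above.
        for pred, meas in zip(predicted_dict[key], measured_df[key]):
            total += (math.log10(pred) - math.log10(meas)) ** 2
    return total


# Illustrative numbers only: the first prediction is one decade off.
predicted = {"Nd_aq_eq": [0.1, 0.01]}
measured = {"Nd_aq_eq": [0.01, 0.01]}
print(log_sse_aq(predicted, measured))
```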

optimizer --
(function or str) function to perform optimization
Note

The optimized variables are not the species parameters themselves. Each optimized variable is multiplied by its initial guess to become the species parameter.

For example, say
```
opt_dict = {'Nd(H(A)2)3(org)': {'h0': -4.7e6}}
```
If the bounds on h0 need to be [-4.7e7,-4.7e5], then divide the bounds by the guess and get
```
"bounds": [(1e-1, 1e1)]
```
fit() returns a structure identical to opt_dict with correctly scaled values. When supplying bounds or constraints, however, note that the optimized x's are multiplied by the initial guess before being written to the xml.
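The scaling described in this note can be sketched with two small helpers. Both names are hypothetical and not part of REEPS. Sorting matters because dividing by a negative guess reverses the order of the bounds.

```python
def scale_bounds(physical_bounds, guess):
    """Convert bounds on a parameter into bounds on its multiplier x = parameter / guess."""
    scaled = sorted(bound / guess for bound in physical_bounds)
    return (scaled[0], scaled[1])


def unscale(x, guess):
    """Recover the species parameter from an optimized multiplier."""
    return x * guess


lower, upper = scale_bounds((-4.7e7, -4.7e5), guess=-4.7e6)
print(lower, upper)
```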
By default, the optimizer is scipy.optimize.minimize with
```
default_kwargs= {"method": 'SLSQP',
                 "bounds": [(1e-1, 1e1)] * len(x_guess),
                 "constraints": (),
                 "options": {'disp': True,
                             'maxiter': 1000,
                             'ftol': 1e-6}}
```
Function needs to take inputs: optimizer(objective_function, x_guess, kwargs)

kwargs is optional

Function needs to return: (np.ndarray) Optimized parameters
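A custom optimizer following this interface might be sketched as below, here swapping SLSQP for SciPy's L-BFGS-B method. The function name is hypothetical, and treating kwargs as an optional dict of scipy.optimize.minimize arguments is an assumption.

```python
import numpy as np
from scipy.optimize import minimize


def lbfgsb_optimizer(objective_function, x_guess, kwargs=None):
    """Hypothetical custom optimizer: L-BFGS-B on the multipliers."""
    settings = {"method": "L-BFGS-B",
                "bounds": [(1e-1, 1e1)] * len(x_guess)}
    settings.update(kwargs or {})  # caller-supplied settings override the defaults
    result = minimize(objective_function, x_guess, **settings)
    return np.asarray(result.x)


# Toy objective with its minimum at x = 2 for every variable.
x_opt = lbfgsb_optimizer(lambda x: float(np.sum((x - 2.0) ** 2)), np.ones(2))
print(x_opt)
```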
temp_xml_file_path --
(str) path to temporary xml file.

This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.

Default is the local temp folder.

fit(objective_function=None, optimizer=None, objective_kwargs=None, optimizer_kwargs=None) → dict

Fits the model to experimental data by minimizing the objective function with the optimizer. Returns a dictionary with the opt_dict structure.

Parameters

objective_function -- (function) function to compute objective
optimizer -- (function) function to perform optimization
optimizer_kwargs -- (dict) arguments for optimizer
objective_kwargs -- (dict) arguments for objective function

Returns opt_dict

(dict) optimized opt_dict. Has identical structure as opt_dict

get_aq_solvent_name() → str

Returns aq_solvent_name

Returns: aq_solvent_name: (str) name of aqueous solvent in xml file

get_aq_solvent_rho() → str

Returns aqueous solvent density (g/L)

Returns: aq_solvent_rho: (float) density of aqueous solvent

get_complex_names() → list

Returns list of complex names

Returns: complex_names: (list) names of complexes in xml file.

get_diluant_name() → str

Returns diluant name

Returns: diluant_name: (str) name of diluant in xml file

get_diluant_rho() → str

Returns diluant density (g/L)

Returns: diluant_rho: (float) density of diluant

get_exp_df() → pandas.core.frame.DataFrame

Returns the experimental DataFrame

Returns: (pd.DataFrame) Experimental data

get_extractant_name() → str

Returns extractant name

Returns: extractant_name: (str) name of extractant in xml file

get_extractant_rho() → str

Returns extractant density (g/L)

Returns: extractant_rho: (float) density of extractant

get_in_moles() → pandas.core.frame.DataFrame

Returns the in_moles DataFrame which contains the initial mole fractions of each species for each experiment

Returns: in_moles: (pd.DataFrame) DataFrame with initial mole fractions

get_objective_function()

Returns objective function

Returns: objective_function: (func) Objective function to quantify error between model and experimental data

get_opt_dict() → dict

Returns the dictionary containing optimization information

Returns: (dict) dictionary containing info about which species parameters are updated to fit model to experimental data

get_optimizer()

Returns optimizer

Returns: optimizer: (func) Optimizer function to minimize objective function

get_phases() → list

Returns the list of Cantera solutions

Returns: (list) list of Cantera solutions/phases

get_predicted_dict()

Returns the dictionary of species concentrations predicted by the current xml parameters for the current in_moles

Returns: predicted_dict: (dict) dictionary of species concentrations

get_rare_earth_ion_names() → list

Returns list of rare earth ion names

Returns: rare_earth_ion_names: (list) names of rare earth ions in xml file

get_re_species_list() → list

Returns list of rare earth element names

Returns: re_species_list: (list) names of rare earth elements in xml file

get_temp_xml_file_path()

Returns path to temporary xml file.

This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.

Returns: temp_xml_file_path: (str) path to temporary xml file.

log_mean_squared_error(predicted_dict, meas_df)

Default objective function for REEPS

Returns the log mean squared error of predicted distribution ratios (d=n_org/n_aq) to measured d.

np.sum((np.log10(d_pred)-np.log10(d_meas))**2)

Parameters

predicted_dict -- (dict) contains predicted data
meas_df -- (pd.DataFrame) contains experimental data

Returns

(float) log mean squared error between predicted and measured
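As a worked instance of the formula above, the sketch below evaluates it on made-up distribution ratios (the numbers are illustrative, not experimental data). A prediction that is one decade off contributes 1 to the sum, and an exact prediction contributes 0.

```python
import math

# Hypothetical distribution ratios: first prediction is 10x too high, second is exact.
d_pred = [10.0, 100.0]
d_meas = [1.0, 100.0]

# Same expression as np.sum((np.log10(d_pred)-np.log10(d_meas))**2), written with math.
log_error = sum((math.log10(p) - math.log10(m)) ** 2
                for p, m in zip(d_pred, d_meas))
print(log_error)
```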

parity_plot(compared_value=None, save_path=None, print_r_squared=False)

Parity plot between measured and predicted compared_value. Default compared value is {RE_1}_aq_eq

Parameters

compared_value -- (str) Quantity to compare predicted and experimental data. Can be any column containing "eq" in exp_df i.e. h_eq, z_eq, {RE}_d_eq, etc.
save_path -- (str) save path for parity plot
print_r_squared -- (boolean) Whether to show the r-squared value on the plot. The value is printed to 2 decimal places.

r_squared(compared_value=None)

r-squared value comparing measured and predicted compared value

The closer to 1, the better the model's predictions.

Parameters: compared_value -- (str) Quantity to compare predicted and experimental data. Can be any column containing "eq" in exp_df i.e. h_eq, z_eq, {RE}_d_eq, etc.
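For reference, the usual coefficient of determination can be sketched as follows. Whether REEPS computes r-squared with exactly this formula is an assumption. The helper below is a standalone illustration, not the method itself.

```python
def coefficient_of_determination(measured, predicted):
    """Standard r-squared: 1 - SS_res / SS_tot."""
    mean_measured = sum(measured) / len(measured)
    # Residual sum of squares between measurements and predictions.
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    # Total sum of squares of the measurements around their mean.
    ss_tot = sum((m - mean_measured) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot


print(coefficient_of_determination([1.0, 2.0, 3.0], [1.0, 2.0, 3.0]))
```

A perfect prediction gives 1, while always predicting the mean of the measurements gives 0.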

set_aq_solvent_name(aq_solvent_name)

Change aq_solvent_name to input aq_solvent_name

Parameters: aq_solvent_name -- (str) name of aqueous solvent in xml file

set_aq_solvent_rho(aq_solvent_rho)

Changes aqueous solvent density (g/L) to input aq_solvent_rho

Parameters: aq_solvent_rho -- (float) density of aqueous solvent

set_complex_names(complex_names)

Change complex names list to input complex_names

Parameters: complex_names -- (list) names of complexes in xml file.

set_diluant_name(diluant_name)

Change diluant_name to input diluant_name

Parameters: diluant_name -- (str) name of diluant in xml file

set_diluant_rho(diluant_rho)

Changes diluant density (g/L) to input diluant_rho

Parameters: diluant_rho -- (float) density of diluant

set_exp_df(exp_csv_filename)

Changes the experimental DataFrame to input exp_csv_filename data and renames columns to internal REEPS names

h_i, h_eq, z_i, z_eq, {RE}_aq_i, {RE}_aq_eq, {RE}_d_eq

See class docstring on "exp_csv_filename" for further explanations.

Parameters: exp_csv_filename -- (str) file name/path for experimental data csv

set_extractant_name(extractant_name)

Change extractant_name to input extractant_name

Parameters: extractant_name -- (str) name of extractant in xml file

set_extractant_rho(extractant_rho)

Changes extractant density (g/L) to input extractant_rho

Parameters: extractant_rho -- (float) density of extractant

set_in_moles(feed_vol)

Function that initializes mole fractions to input feed_vol

This function is called at initialization

Sets in_moles to a pd.DataFrame containing initial mole fractions

Columns for species and rows for different experiments

This function also calls update_predicted_dict

Parameters: feed_vol -- (float) feed volume of mixture (g/L)

set_objective_function(objective_function)

Change objective function to input objective_function.

See class docstring on "objective_function" for instructions

Parameters: objective_function -- (func) Objective function to quantify error between model and experimental data

set_opt_dict(opt_dict)

Change the dictionary to input opt_dict.

opt_dict specifies species parameters to be updated to fit model to data

See class docstring on "opt_dict" for more information.

Parameters: opt_dict -- (dict) dictionary containing info about which species parameters are updated to fit model to experimental data

set_optimizer(optimizer)

Change optimizer function to input optimizer.

See class docstring on "optimizer" for instructions

Parameters: optimizer -- (func) Optimizer function to minimize objective function

set_phases(phases_xml_filename, phase_names)

Change list of Cantera solutions by inputting new xml file name and phase names

Also runs set_in_moles to set initial molality to 1 g/L

Parameters

phases_xml_filename -- (str) xml file with parameters for equilibrium calc
phase_names -- (list) names of phases in xml file

set_rare_earth_ion_names(rare_earth_ion_names)

Change list of rare earth ion names to input: rare_earth_ion_names

Parameters: rare_earth_ion_names -- (list) names of rare earth ions in xml file

set_re_species_list(re_species_list)

Change list of rare earth element names to input: re_species_list

Parameters: re_species_list -- (list) names of rare earth elements in xml file

set_temp_xml_file_path(temp_xml_file_path)

Changes temporary xml file path to input temp_xml_file_path.

This xml file is a duplicate of phases_xml_filename and is modified during the optimization process to avoid changing the original xml file.

Parameters: temp_xml_file_path -- (str) path to temporary xml file.

static slsqp_optimizer(objective, x_guess)

The default optimizer for REEPS

Uses scipy.optimize.minimize with options

default_kwargs= {"method": 'SLSQP',
                "bounds": [(1e-1, 1e1)] * len(x_guess),
                "constraints": (),
                "options": {'disp': True,
                            'maxiter': 1000,
                            'ftol': 1e-6}}

Parameters

objective -- (func) the objective function
x_guess -- (np.ndarray) the initial guess (always 1)

Returns

(np.ndarray) Optimized parameters

update_predicted_dict(phases_xml_filename=None, phase_names=None)

Computes the equilibrium concentrations predicted by the parameters in phases_xml_filename, given the initial mole fractions set by set_in_moles()

Parameters

phases_xml_filename -- (str) xml file with parameters for equilibrium calc. If None, the current phases_xml_filename is used.
phase_names -- (list) names of phases in xml file. If None, the current phase_names is used.

update_xml(info_dict, phases_xml_filename=None)

Updates xml file with info_dict

Parameters

info_dict -- (dict) info in {species_names:{thermo_prop:val}}. Requires an identical structure to opt_dict.
phases_xml_filename -- (str) xml filename if editing another xml. If None, the current xml will be modified and the internal Cantera phases will be refreshed to the new values.