{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# LLEPE Tutorial - Getting started\n", "## Introduction\n", "In this notebook, you will learn how to use LLEPE to fit thermodynamic parameters to experimental data and explore how well the parameters fit.\n", "## Installation\n", "Create a conda environment with the following command. The environment name in this example is \"thermo_env\".
\n", "```$ conda create --name thermo_env python=3.7```
\n", "Then run the following line to activate the environment:" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "```$ conda activate thermo_env```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In your terminal, run
\n", "```$ git clone https://xgitlab.cels.anl.gov/summer-2020/parameter-estimation.git```
\n", "Navigate into the folder with
\n", "```$ cd parameter-estimation```
\n", "And run
\n", "```$ pip install -e .```
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Import and instantiate LLEPE" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, you will need to import the package and instantiate LLEPE with a few parameters." ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "from llepe import LLEPE\n", "opt_dict = {'Nd(H(A)2)3(org)_h0': {'upper_element_name': 'species',\n", "                                   'upper_attrib_name': 'name',\n", "                                   'upper_attrib_value': 'Nd(H(A)2)3(org)',\n", "                                   'lower_element_name': 'h0',\n", "                                   'lower_attrib_name': None,\n", "                                   'lower_attrib_value': None,\n", "                                   'input_format': '{0}',\n", "                                   'input_value': -4.7e6}}\n", "llepe_parameters = 
{'exp_csv_filename': '../../data/csvs/Nd_exp_data.csv',\n", "                    'phases_xml_filename': '../../data/xmls/twophase.xml',\n", "                    'opt_dict': opt_dict,\n", "                    'phase_names': ['HCl_electrolyte', 'PC88A_liquid'],\n", "                    'aq_solvent_name': 'H2O(L)',\n", "                    'extractant_name': '(HA)2(org)',\n", "                    'diluant_name': 'dodecane',\n", "                    'complex_names': ['Nd(H(A)2)3(org)'],\n", "                    'extracted_species_ion_names': ['Nd+++'],\n", "                    'aq_solvent_rho': 1000.0,\n", "                    'extractant_rho': 960.0,\n", "                    'diluant_rho': 750.0}\n", "searcher = LLEPE(**llepe_parameters)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Parameters explanation " ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### exp_csv_filename" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "exp_csv_filename is the file name of the CSV file containing the experimental data.
\n", "Let us get the pandas dataframe created by LLEPE." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
<div>\n", "<table border=\"1\" class=\"dataframe\">\n", "<thead>\n",
"<tr style=\"text-align: right;\"><th></th><th>h_i</th><th>h_eq</th><th>z_i</th><th>z_eq</th><th>Nd_aq_i</th><th>Nd_aq_eq</th><th>Nd_d_eq</th></tr>\n",
"</thead>\n", "<tbody>\n",
"<tr><th>0</th><td>0.01</td><td>0.088304</td><td>1</td><td>0.921696</td><td>0.050001</td><td>0.0239</td><td>1.0921</td></tr>\n",
"<tr><th>1</th><td>0.01</td><td>0.105094</td><td>1</td><td>0.904906</td><td>0.099998</td><td>0.0683</td><td>0.4641</td></tr>\n",
"<tr><th>2</th><td>0.01</td><td>0.109017</td><td>1</td><td>0.900983</td><td>0.150006</td><td>0.1170</td><td>0.2821</td></tr>\n",
"<tr><th>3</th><td>0.01</td><td>0.106012</td><td>1</td><td>0.903988</td><td>0.200004</td><td>0.1680</td><td>0.1905</td></tr>\n",
"<tr><th>4</th><td>0.01</td><td>0.118934</td><td>1</td><td>0.891066</td><td>0.300011</td><td>0.2637</td><td>0.1377</td></tr>\n",
"</tbody>\n", "</table>\n", "</div>" ], "text/plain": [ "    h_i      h_eq  z_i      z_eq   Nd_aq_i  Nd_aq_eq  Nd_d_eq\n", "0  0.01  0.088304    1  0.921696  0.050001    0.0239   1.0921\n", "1  0.01  0.105094    1  0.904906  0.099998    0.0683   0.4641\n", "2  0.01  0.109017    1  0.900983  0.150006    0.1170   0.2821\n", "3  0.01  0.106012    1  0.903988  0.200004    0.1680   0.1905\n", "4  0.01  0.118934    1  0.891066  0.300011    0.2637   0.1377" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "searcher.get_exp_df()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Each row is one experiment, and each column is a measured quantity.
\n", "LLEPE reads these columns by position, so your experimental file must use exactly this column order. Column names do not matter.
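
As a rough sketch of preparing a file in this order, the snippet below assembles rows from raw measurements using only the standard library. The relation D = (Nd_aq_i - Nd_aq_eq) / Nd_aq_eq is an assumption inferred from the rows displayed above (it presumes equal phase volumes), and the helper name build_row is hypothetical, not part of LLEPE.

```python
import csv
import io

# Column order expected by position (the names are only labels).
COLUMNS = ['h_i', 'h_eq', 'z_i', 'z_eq', 'Nd_aq_i', 'Nd_aq_eq', 'Nd_d_eq']


def build_row(h_i, h_eq, z_i, z_eq, nd_aq_i, nd_aq_eq):
    '''Hypothetical helper: return one experiment in the required order.

    Assumes equal aqueous and organic volumes, so the distribution
    ratio is (initial - equilibrium) / equilibrium aqueous concentration.
    '''
    nd_d_eq = (nd_aq_i - nd_aq_eq) / nd_aq_eq
    return [h_i, h_eq, z_i, z_eq, nd_aq_i, nd_aq_eq, round(nd_d_eq, 4)]


rows = [
    build_row(0.01, 0.088304, 1, 0.921696, 0.050001, 0.0239),
    build_row(0.01, 0.105094, 1, 0.904906, 0.099998, 0.0683),
]

# Write to an in-memory buffer; swap in open('my_data.csv', 'w', newline='')
# to create a real file.
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(COLUMNS)
writer.writerows(rows)
print(buffer.getvalue())
```
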
\n", "Below is a table explaining the meaning of the column headers and the needed column order.
\n", "If you have more than one rare earth element, append each additional element's columns to the end in the same order (aq_i, aq_eq, d_eq)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "| Order | Column | Meaning |\n", "|-------|------------|--------------------------------------------------------------------|\n", "| 0 | h_i | Initial concentration of H+ ions (mol/L) |\n", "| 1 | h_eq | Equilibrium concentration of H+ ions (mol/L) |\n", "| 2 | z_i | Initial concentration of extractant (mol/L) |\n", "| 3 | z_eq | Equilibrium concentration of extractant (mol/L) |\n", "| 4 | \\{RE\\}\\_aq_i | Initial concentration of RE ions in aqueous phase (mol/L) |\n", "| 5 | \\{RE\\}\\_aq_eq | Equilibrium concentration of RE ions in aqueous phase (mol/L) |\n", "| 6 | \\{RE\\}\\_d_eq | Equilibrium distribution ratio: amount of RE in organic phase divided by amount in aqueous phase |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### phases_xml_filename" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is the XML file containing the information to be loaded into Cantera, the thermodynamic modeling package.
\n", "Please see parameter-estimation/data/xmls for file examples.
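
To get a feel for the layout before loading a file into Cantera, here is a minimal sketch that lists phase ids and species with the standard library. The XML fragment is a simplified, hypothetical stand-in (the element names phase and speciesArray are assumed to follow Cantera's legacy XML layout); it is not copied from the repository.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical fragment for illustration only; real files in
# data/xmls contain many more elements and attributes.
XML_TEXT = '''
<ctml>
  <phase id='HCl_electrolyte'>
    <speciesArray>H2O(L) H+ OH- Cl- Nd+++</speciesArray>
  </phase>
  <phase id='PC88A_liquid'>
    <speciesArray>(HA)2(org) dodecane Nd(H(A)2)3(org)</speciesArray>
  </phase>
</ctml>
'''

root = ET.fromstring(XML_TEXT)
phase_species = {}
for phase in root.iter('phase'):
    # The id attribute is the value that phase_names refers to.
    names = phase.findtext('speciesArray', default='').split()
    phase_species[phase.get('id')] = names

for phase_id, names in phase_species.items():
    print(phase_id, names)
```
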
\n", "We can explore what has been loaded." ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[, ]\n" ] } ], "source": [ "print(searcher.get_phases())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "It is a list of two Cantera solutions so we will dig in a little further and see what species these solutions contain." ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "scrolled": true }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "HCl_electrolyte\n", "['H2O(L)', 'H+', 'OH-', 'Cl-', 'Nd+++']\n", "PC88A_liquid\n", "['(HA)2(org)', 'dodecane', 'Nd(H(A)2)3(org)']\n" ] } ], "source": [ "for phase in searcher.get_phases():\n", " print(phase.name)\n", " print(phase.species_names)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can explore Cantera solutions further by visiting https://cantera.org/ and seeing Cantera's documentation." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### opt_dict" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a dictionary that contains the information about what species and what thermodynamic properties are to be modified.
\n", "The number after the thermodynamic property is the initial guess for the optimizer.
\n", "In this example, we chose to optimize the standard enthalpy (h0) of the neodymium-PC88A complex ('Nd(H(A)2)3(org)') and give it an initial guess of -4.7e6. Thus,
\n", "```python\n", "opt_dict = {'Nd(H(A)2)3(org)': {'h0': -4.7e6}}\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Say we also wanted to modify the extractant ('(HA)2(org)') and, for both species, fit the standard enthalpy (h0) and the molar volume (molarVolume). The dictionary would then be\n", "```python\n", "opt_dict = {'Nd(H(A)2)3(org)': {'h0': -4.7e6, 'molarVolume': 1.01},\n", "            '(HA)2(org)': {'h0': -4.7e6, 'molarVolume': 1.01}}\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### phase_names" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This is a list of the phase names in the XML file; each name can be found in the id field of a phase element." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Names and rhos" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "| Parameter | Meaning | Example value |\n", "|---------------------|----------------------------------------------|-------------------|\n", "| aq_solvent_name | Name of solvent in aqueous phase | 'H2O(L)' |\n", "| extractant_name | Name of extractant in organic phase | '(HA)2(org)' |\n", "| diluant_name | Name of diluent in organic phase | 'dodecane' |\n", "| complex_names | Names of rare earth complexes in organic phase | ['Nd(H(A)2)3(org)'] |\n", "| extracted_species_ion_names | Names of rare earth ions in aqueous phase | ['Nd+++'] |\n", "| aq_solvent_rho, extractant_rho, diluant_rho | Density of the corresponding species (g/L) | 1000.0 for 'H2O(L)' |" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The parameters containing \"rho\" can be left as None, in which case density is calculated from molecular weight and molar volume.
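
The fallback can be pictured as density = molecular weight / molar volume. The sketch below assumes molecular weight in g/mol and molar volume in L/mol, which may not match the units stored in your XML; the function name and the approximate numbers for water are illustrative and are not taken from LLEPE.

```python
def density_g_per_l(molecular_weight, molar_volume):
    '''Density in g/L from molecular weight (g/mol) and molar volume (L/mol).

    The units are an assumption of this sketch; convert first if your
    source uses something else (for example cm^3/mol or m^3/kmol).
    '''
    if molar_volume <= 0:
        raise ValueError('molar volume must be positive')
    return molecular_weight / molar_volume


# Water: roughly 18.015 g/mol and roughly 0.01807 L/mol near room temperature.
rho_water = density_g_per_l(18.015, 0.01807)
print(round(rho_water, 1))

# A molar volume that is 10 percent too large lowers the density by about
# 9 percent, so an error in molar volume carries straight into the density.
rho_off = density_g_per_l(18.015, 0.01807 * 1.10)
print(round(rho_off / rho_water, 3))
```
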
However, molar volume values may be inaccurate and throw off the calculations, so it is recommended to find density values and pass them in place of the defaults." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Fitting thermodynamic properties to data" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that the thermodynamic properties have been set, we need to set up the optimizer.
The default optimizer is scipy.optimize.minimize with the arguments below. The optimizer does not vary the property directly; it varies a multiplier applied to the initial guess.
If $x$ is the variable controlled by the minimizer, the value entering the objective function is $x\\times\\mathrm{Guess\\,value}$. In our case, the values tested are $(-4.7\\times 10^6)x$. This matters mainly for bounds and constraints, which apply to $x$ rather than to the property itself." ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [], "source": [ "minimizer_kwargs = {\"method\": 'SLSQP',\n", "                    \"bounds\": [(1e-1, 1e1)],\n", "                    \"constraints\": (),\n", "                    \"options\": {'disp': True,\n", "                                'maxiter': 1000,\n", "                                'ftol': 1e-6}}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "With the minimizer arguments defined, we can perform our fit.
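
Because the bounds apply to the multiplier rather than to the property, it can be useful to translate them back before fitting. A minimal sketch, using hypothetical helper names that are not part of LLEPE, and assuming the bounds (1e-1, 1e1) and the initial guess of -4.7e6 used in this tutorial:

```python
def scaled_value(x, guess):
    '''Value handed to the objective function for multiplier x.'''
    return x * guess


def bounds_on_property(x_bounds, guess):
    '''Translate bounds on the multiplier into bounds on the property.'''
    low, high = (scaled_value(b, guess) for b in x_bounds)
    # A negative guess flips the order, so sort before returning.
    return min(low, high), max(low, high)


guess = -4.7e6          # initial guess for h0 from opt_dict
x_bounds = (1e-1, 1e1)  # bounds from minimizer_kwargs

# x = 1 reproduces the initial guess.
print(scaled_value(1.0, guess))
# Range of h0 that the optimizer is allowed to explore.
print(bounds_on_property(x_bounds, guess))
```

With a negative guess, the lower bound on the multiplier maps to the upper end of the property range, which is easy to overlook when tightening the bounds.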
\n", "The fit minimizes the log mean squared error between the predicted and experimental distribution ratios (D)." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Optimization terminated successfully. (Exit mode 0)\n", " Current function value: 0.025193288852542232\n", " Iterations: 4\n", " Function evaluations: 16\n", " Gradient evaluations: 4\n", "{'Nd(H(A)2)3(org)': {'h0': -4704699.156668724}}\n" ] } ], "source": [ "est_enthalpy = searcher.fit()\n", "print(est_enthalpy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can see that the fit function returns a dictionary with the same structure as opt_dict." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Updating the xml" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that we have our new values, let us write them to our original XML file to replace the old values." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "searcher.update_xml(est_enthalpy)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Visualization and analysis" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also see how well the updated XML data fits the experimental data with a parity plot." ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "image/png": "\n", "text/plain": [ "
" ] }, "metadata": {}, "output_type": "display_data" } ], "source": [ "searcher.parity_plot()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can also find what the r-squared value is. The closer to 1, the better the prediction model." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0.9970803631106648\n" ] } ], "source": [ "print(searcher.r_squared())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Yay! Good job! That is an amazing fit." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 4 }