Commit 09e01d64 authored by Samuel Simko

f

parent 606770d4
For the RNN, we found that using an LSTM cell instead of an RNN cell greatly increased performance.
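For illustration only, a minimal sketch of this cell swap could look as follows; we assume a PyTorch implementation here, and the vocabulary size, hidden size, and single-output head are placeholder choices rather than the ones used in the report.
\begin{verbatim}
import torch.nn as nn

class EnergyRNN(nn.Module):
    # Sequence regressor over encoded SMILES tokens (illustrative sizes).
    def __init__(self, vocab_size=64, hidden_size=128, use_lstm=True):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden_size)
        # Swapping nn.RNN for nn.LSTM is the change discussed in the text.
        rnn_cls = nn.LSTM if use_lstm else nn.RNN
        self.rnn = rnn_cls(hidden_size, hidden_size, batch_first=True)
        self.head = nn.Linear(hidden_size, 1)

    def forward(self, tokens):
        x = self.embed(tokens)            # (batch, seq_len, hidden)
        out, _ = self.rnn(x)              # same call works for RNN and LSTM
        return self.head(out[:, -1])      # predict energy from last hidden state
\end{verbatim}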
\begin{figure}
\centering
\begin{tabular}{|c|c|}
\hline
Algorithm & Testing loss \\
\hline
Linear Regression (Baseline) & 2.64e-06 \\
\hline
SVR & 4.78e-05 \\
\hline
MLP & 7.09e-04 \\
\hline
RNN & 6.39e-04 \\
\hline
\end{tabular}
\caption{Testing losses for each algorithm used}
\label{fig:tabloss}
\end{figure}
In Figure \ref{fig:tabloss}, we report the testing loss achieved by each algorithm.
For each algorithm, the hyperparameters were those selected by the Optuna optimization
during cross-validation.
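As an illustration of this procedure, the sketch below shows how Optuna can minimize the mean KFold validation loss; the SVR search space, the five folds, and the names X and y are placeholder assumptions rather than the exact settings used in the report.
\begin{verbatim}
import numpy as np
import optuna
from sklearn.metrics import mean_squared_error
from sklearn.model_selection import KFold
from sklearn.svm import SVR

def objective(trial, X, y):
    # Hypothetical search space; the ranges actually used may differ.
    params = {
        "C": trial.suggest_float("C", 1e-2, 1e2, log=True),
        "epsilon": trial.suggest_float("epsilon", 1e-4, 1e-1, log=True),
    }
    losses = []
    for train_idx, val_idx in KFold(n_splits=5, shuffle=True,
                                    random_state=0).split(X):
        model = SVR(**params).fit(X[train_idx], y[train_idx])
        losses.append(mean_squared_error(y[val_idx],
                                         model.predict(X[val_idx])))
    return float(np.mean(losses))          # mean validation MSE over folds

# X, y: encoded SMILES features and energies (not shown in this excerpt).
# study = optuna.create_study(direction="minimize")
# study.optimize(lambda t: objective(t, X, y), n_trials=100)
# best_params = study.best_params
\end{verbatim}
The parameters of the best trial would then be used to refit the model before the comparison on the testing dataset.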
We see in Figure \ref{fig:tabloss} that Linear Regression performed the best out of all our models.
We will perform a hypothesis test in order to determine whether the other models perform the same as the
baseline.
We denote by $H_0$ the null hypothesis (the model and Linear Regression are equal in performance),
and by $H_1$ the alternative hypothesis.
We will do a paired sample t-test. In order to do so, we make the following assumption:
\begin{itemize}
    \item The differences between the paired losses of the two models are approximately normally
    distributed. This seems to be the case if we plot the histogram (Figure \ref{hist});
\end{itemize}
We use a level of significance of $\alpha = 0.05$. The paired t-test will tell us whether the mean losses
of the two models are the same.
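For illustration, the sketch below carries out this paired t-test with scipy.stats.ttest\_rel, assuming the per-sample test losses of the baseline and of the compared model are available as arrays; the histogram of the paired differences corresponds to the normality check mentioned above.
\begin{verbatim}
import numpy as np
import matplotlib.pyplot as plt
from scipy.stats import ttest_rel

def compare_to_baseline(baseline_errors, model_errors, alpha=0.05):
    # Paired t-test on per-sample test losses of two models.
    diffs = np.asarray(model_errors) - np.asarray(baseline_errors)

    # Normality check on the paired differences (cf. the histogram figure).
    plt.hist(diffs, bins=30)
    plt.xlabel("loss difference (model - baseline)")
    plt.show()

    t_stat, p_value = ttest_rel(model_errors, baseline_errors)
    if p_value < alpha:
        print(f"p = {p_value:.3g} < {alpha}: reject H0 (performances differ)")
    else:
        print(f"p = {p_value:.3g} >= {alpha}: cannot reject H0")
    return t_stat, p_value
\end{verbatim}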
For the SVR test predictions, we get a p-value of 2.178e-05 for the Energy\_ attribute,
and a p-value of 0.1766 for the Energy\_DG attribute. The first p-value is below the level
of significance, so we can reject the null hypothesis for that attribute; the second is above it, so we cannot.
For the lasso test predictions, we get a p-value of 0.040. As the p-value is below the level
of significance, we can reject the null hypothesis.
For the MLP, we get a p-value of 0.13 for both dependent variables. As both p-values are above the level
of significance, we cannot reject the null hypothesis.
For the Recurrent Neural Network, we get a p-value of 0.857 for the Energy\_ attribute,
and a p-value of 0.598 for the Energy\_DG attribute. As both are above the level of significance, we cannot reject the null hypothesis.
\ldots
As we did not find evidence pointing to the hypothesis that the relationship between the features and the label is more than linear,
we apply Occam's razor and conclude that the simplest model is to be preferred.
In more complex databases, we would use linear algorithms such as Linear Regression or Support Vector Machines
We tuned the hyperparameters for each of the algorithms using KFold cross-validation and Optuna to perform an automated search.
The baseline algorithm, a linear regression, performs extremely well with the right SMILES encoding.
We compared the best models for each algorithm on the testing dataset. We found that linear regression
is competitive with the other methods we used.
\ldots
We use Occam's razor to determine that a simple linear regression with an encoding based on the number of occurrences
of the different symbols of the SMILES string is to be preferred for practical applications
of molecular energy prediction.
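As an illustrative sketch of this preferred pipeline (the exact SMILES symbol tokenization is not reproduced in this excerpt), character occurrences can be counted with scikit-learn's CountVectorizer and fed to an ordinary least-squares model; the molecules and target values below are purely hypothetical.
\begin{verbatim}
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline

# Character-level counts stand in for the symbol-count encoding; multi-character
# SMILES tokens such as "Cl" or "Br" would need a custom analyzer.
encoder = CountVectorizer(analyzer="char", lowercase=False)
model = make_pipeline(encoder, LinearRegression())

smiles = ["CCO", "c1ccccc1", "CC(=O)O"]   # hypothetical molecules
energies = [-0.12, -0.85, -0.43]          # hypothetical target values
model.fit(smiles, energies)
print(model.predict(["CCN"]))
\end{verbatim}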
\section{Acknowledgments}