Predicting the Binding Affinity between Medicine and Estrogen Receptor Beta
Recent studies show that the probability of Taiwanese women developing breast cancer has risen dramatically over the past 30 years, and breast cancer patients in Taiwan are becoming both more numerous and younger. Worse, patients who take anticancer medicine suffer serious side effects, and some even lose the ability to reproduce. We hope to develop a system that helps doctors and researchers develop new medicines for treating breast cancer. These medicines act on tumors by binding to receptors on the cancer cells. We collected MACCS data (converted from SMILES) and used it as the dataset for training the machine learning program. Because the training data was insufficient, we used an ensemble method to build our machine learning model. Among the three basic ensemble techniques (max voting, averaging, and weighted averaging), we selected max voting to perform the prediction in this research. We created two separate datasets, positive and negative, which were then combined as training data for the program. Because we did not know the best ratio of positive to negative samples in the training data, we compared 40 different ratios and evaluated the results. By comparing the accuracy of the models, we found that the machine learning program achieves its highest precision when the ratio of positive to negative data is 1:3000. After creating the final model through voting among the 1000 generated models, we evaluated it using AUC, precision, and recall. The ultimate goal of this research is to help doctors and researchers shorten the process of developing and testing new medicines.
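The max voting step and the precision and recall metrics mentioned in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names `max_vote` and `precision_recall`, the label convention (1 for a positive sample, 0 for a negative one), and the toy predictions are all assumptions made for illustration, and only the Python standard library is used.

```python
from collections import Counter


def max_vote(per_model_predictions):
    """Return the majority label per sample.

    per_model_predictions: list of equal-length label lists, one per model.
    On a tie, the label seen first among the tied ones wins (an assumption
    of this sketch); an odd number of binary models avoids ties.
    """
    voted = []
    for votes in zip(*per_model_predictions):  # votes for one sample
        voted.append(Counter(votes).most_common(1)[0][0])
    return voted


def precision_recall(y_true, y_pred, positive=1):
    """Compute (precision, recall) for the given positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    # Hypothetical predictions from three models on four samples.
    predictions = [
        [1, 0, 1, 0],
        [1, 1, 0, 0],
        [0, 1, 1, 0],
    ]
    final = max_vote(predictions)
    print(final)
    print(precision_recall([1, 0, 1, 0], final))
```

In the setting described above, each inner list would hold the predictions of one of the 1000 models, and the voted labels would then be scored against held-out labels.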
A Study of the Number of Dissections of a Polygon
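According to reference [1], the number of ways to dissect a convex $(n+2)$-gon into $n$ triangles using $n-1$ non-crossing diagonals is the Catalan number $C_n$. In this study I use non-crossing diagonals to dissect an $(n+2)$-gon into combinations of larger polygons and triangles and, from the relationship between such dissections and triangulations, derive general formulas for their number from the closed form of the Catalan numbers. The number of dissections of an $(n+2)$-gon into one $(k+2)$-gon and triangles is $\binom{2n-k+1}{n+1}$. The number of dissections into one $(k+2)$-gon, one $(m+2)$-gon, and triangles is $(n+2)\binom{2n-k-m+2}{n+2}$ when $m \neq k$ and $\frac{n+2}{2}\binom{2n-2k+2}{n+2}$ when $m = k$. The number of dissections into one $(k_1+2)$-gon, one $(k_2+2)$-gon, one $(k_3+2)$-gon, and $n-k_1-k_2-k_3$ triangles is $(n+2)(n+3)\binom{2n-k_1-k_2-k_3+3}{n+3}$ when $k_1, k_2, k_3$ are pairwise distinct. The number of dissections into one $(k_1+2)$-gon, one $(k_2+2)$-gon, one $(k_3+2)$-gon, one $(k_4+2)$-gon, and $n-k_1-k_2-k_3-k_4$ triangles is $(n+2)(n+3)(n+4)\binom{2n-k_1-k_2-k_3-k_4+4}{n+4}$ when $k_1, k_2, k_3, k_4$ are pairwise distinct. I conjecture that, when $k_1, k_2, \dots, k_i$ are pairwise distinct, the number of dissections of an $(n+2)$-gon into one $(k_1+2)$-gon, one $(k_2+2)$-gon, ..., one $(k_i+2)$-gon, and $n-\sum_{j=1}^{i} k_j$ triangles is $\frac{(n+i)!}{(n+1)!}\binom{2n-\sum_{j=1}^{i} k_j+i}{n+i}$.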
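The conjectured general formula can be spot-checked for small polygons by brute-force enumeration. The sketch below is an illustration, not the method used in the study: it counts every dissection of a convex polygon by recursing on the face that contains a fixed root side, and compares the count with $\frac{(n+i)!}{(n+1)!}\binom{2n-\sum k_j+i}{n+i}$. The names `fan`, `rooted`, `formula`, and `brute_force` are hypothetical, and the sketch assumes every $k_j \geq 2$, the $k_j$ pairwise distinct, and $n \geq \sum k_j$.

```python
from collections import Counter
from functools import lru_cache
from math import comb, factorial


def _merge(*parts):
    """Combine face-size tuples into one sorted tuple (a multiset key)."""
    return tuple(sorted(sum(parts, ())))


@lru_cache(maxsize=None)
def fan(length):
    """Fill `length` consecutive boundary sides lying below one face.

    Returns Counter {(edges contributed to that face, face sizes): count}.
    """
    if length == 0:
        return Counter({(0, ()): 1})
    out = Counter()
    for g in range(1, length + 1):  # the next edge of the face spans g sides
        for faces_a, ca in rooted(g + 1).items():
            for (j, faces_b), cb in fan(length - g).items():
                out[(j + 1, _merge(faces_a, faces_b))] += ca * cb
    return out


@lru_cache(maxsize=None)
def rooted(m):
    """Counter {sorted face sizes: count} over all dissections of a convex m-gon."""
    if m == 2:  # a bare edge encloses no face
        return Counter({(): 1})
    out = Counter()
    for g in range(1, m - 1):  # the root face needs at least two more edges
        for faces_a, ca in rooted(g + 1).items():
            for (j, faces_b), cb in fan(m - 1 - g).items():
                out[_merge(faces_a, faces_b, (j + 2,))] += ca * cb
    return out


def formula(n, ks):
    """Conjectured count for pairwise distinct ks (each assumed >= 2)."""
    i = len(ks)
    return factorial(n + i) // factorial(n + 1) * comb(2 * n - sum(ks) + i, n + i)


def brute_force(n, ks):
    """Count dissections of an (n+2)-gon with faces (k+2)-gons plus triangles."""
    target = _merge(tuple(k + 2 for k in ks), (3,) * (n - sum(ks)))
    return rooted(n + 2)[target]


if __name__ == "__main__":
    for n, ks in [(4, (2,)), (6, (2, 3)), (9, (2, 3, 4))]:
        print(n, ks, brute_force(n, ks), formula(n, ks))
```

Agreement on a few small cases does not prove the conjecture; it only guards against transcription errors in the formulas. When some $k_j$ coincide the count needs a symmetry correction (the $m = k$ case carries a factor of $\frac{1}{2}$), which `formula` does not apply.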