Recent studies have shown that the probability of Taiwanese women developing breast cancer has risen dramatically over the past 30 years. Taiwan now faces more breast cancer patients, at younger ages. Making matters worse, patients who take cancer-treating medicine suffer from its serious side effects, and some may even lose the ability to reproduce. We hope to develop a new system that can help doctors and researchers develop new medicines for treating breast cancer. Such medicines treat tumors by attaching to receptors on the cancer cells. We collected MACCS data (converted from SMILES) and used it as the dataset for training the machine learning program. Because the training data was insufficient, we used an ensemble method to generate our machine learning model. Among the three basic ensemble techniques (max voting, averaging, and weighted averaging), we selected max voting to perform the prediction in this research. We created two separate datasets, positive and negative, which together served as training data for the program. Because we did not know the best ratio of positive to negative examples in the training data, we compared 40 different ratios and evaluated the results. Comparing the resulting models, we found that the machine learning program had the highest precision when the ratio of positive to negative data was 1:3000. After creating the final model through voting among the 1000 models generated, we evaluated it using three metrics: AUC, precision, and recall. The ultimate goal of this research is to help doctors and researchers shorten the process of developing and testing new medicines.
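As an illustration of the max voting step and of the precision and recall metrics named above, here is a minimal sketch in plain Python. The three toy classifiers, the four-bit fingerprints, and the labels are hypothetical stand-ins chosen for readability; they are not the study's models or data (real MACCS fingerprints are much longer, and the study voted among 1000 models).

```python
from collections import Counter


def max_vote(predictions):
    """Return the label predicted by the most models for one sample."""
    return Counter(predictions).most_common(1)[0][0]


def ensemble_predict(models, sample):
    """Collect one prediction per model and take the majority label."""
    return max_vote([model(sample) for model in models])


def precision_recall(y_true, y_pred):
    """Compute precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Hypothetical toy classifiers over a bit-vector fingerprint.
# An odd number of models avoids tied votes on binary labels.
models = [
    lambda bits: int(sum(bits) >= 2),
    lambda bits: int(bits[0] == 1),
    lambda bits: int(bits[-1] == 1),
]

# Hypothetical fingerprints and ground-truth labels.
samples = [(1, 1, 0, 1), (0, 0, 0, 1), (1, 0, 0, 0), (0, 0, 0, 0)]
y_true = [1, 0, 1, 0]

y_pred = [ensemble_predict(models, s) for s in samples]
precision, recall = precision_recall(y_true, y_pred)
print(y_pred, precision, recall)
```

In this toy run the ensemble misses one true positive, so precision stays high while recall drops. With heavily imbalanced data such as a 1:3000 ratio, reporting both metrics alongside AUC matters for this reason.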