National Primary and High School Science Fair

Computer Science and Information Engineering

The Relationship Between Summarization Algorithms and Sentence Analysis

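In this information-rich era, the internet is flooded with all kinds of content, and we waste a great deal of time sifting through cluttered, unfiltered material when looking things up. The most rampant example is clickbait: news items with an attractive headline whose content does not match it, so that readers often realize only after finishing the whole article that they have wasted their time on meaningless information. A common remedy is to use a summarization algorithm to give readers a rough idea of a news story first. However, although summarization algorithms are increasingly widespread, the summaries they produce still differ from human judgment, which leads to errors and bias in reading comprehension. In this study we therefore approach the problem from a new angle and examine the relationship between summarization algorithms and sentence-pattern analysis: we combine the existing vector-construction approach with analysis of sentence structure to test summary accuracy, and use the results to develop a summarization algorithm that produces a more precise main idea. In addition, we use field surveys and collected opinions to further explore how people's ways of thinking relate to the generated summaries.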

A Study on Developing a Handheld Product for Colour 2D Barcodes

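A QR Code is a two-dimensional digital barcode made up of black and white modules; scanning it reads the stored message. By design, a QR Code stores data in binary. Adding modules increases the data capacity, but if too many modules are packed into a code, the modules become too small for a scanner to read. In addition, current QR Code scanners support only single-code scanning and cannot meet the practical need to scan several codes at once. If colour recognition can be solved, a colour 2D barcode could in theory overcome these limitations of the QR Code, but no such product is available on the market for testing. This project therefore designed a 10×10, 8-colour "Colour Matrix" and used a Raspberry Pi to develop the hardware and software needed to run Colour Matrix on a handheld device for experiments. The experiments successfully performed colour recognition on the Raspberry Pi using a machine learning algorithm. In single-code scanning, the developed program performs on par with QR Code recognition using pyzbar; in multi-code scanning, QR Code recognition with pyzbar has a decoding success rate of 3.1%, whereas our method raises the success rate to 92.4%, broadening the range of uses for digital barcodes and showing commercial value.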
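
The capacity argument can be made concrete. With 8 colours, each module carries 3 bits instead of 1, so a 10×10 grid holds 300 raw bits where a black-and-white grid of the same size holds 100. The sketch below is only an illustration of that arithmetic: the row-major packing, the zero padding, and the function names are assumptions, and it ignores finder patterns and error correction, which the abstract does not describe. It is not the project's actual Colour Matrix layout.

```python
GRID = 10      # modules per side, from the abstract
COLOURS = 8    # colours per module, from the abstract


def bits_per_module(colours):
    """Bits one module can carry (assumes `colours` is a power of two)."""
    return (colours - 1).bit_length()


def raw_capacity_bits(side, colours):
    """Raw bits in a side x side grid, ignoring any structural overhead."""
    return side * side * bits_per_module(colours)


def bytes_to_modules(data, side=GRID):
    """Pack bytes into colour indices 0..7, 3 bits each, row-major.

    Hypothetical layout: unused modules are padded with colour 0.
    """
    bits = "".join(f"{byte:08b}" for byte in data)
    total_bits = side * side * 3
    if len(bits) > total_bits:
        raise ValueError("payload does not fit in the grid")
    bits = bits.ljust(total_bits, "0")
    flat = [int(bits[i:i + 3], 2) for i in range(0, total_bits, 3)]
    return [flat[r * side:(r + 1) * side] for r in range(side)]


def modules_to_bytes(grid, n_bytes):
    """Inverse of bytes_to_modules for a payload of known length."""
    bits = "".join(f"{colour:03b}" for row in grid for colour in row)
    return bytes(int(bits[i:i + 8], 2) for i in range(0, n_bytes * 8, 8))
```

For example, `bytes_to_modules(b"hello")` returns a 10×10 list of colour indices, and `modules_to_bytes(grid, 5)` recovers the original bytes.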

Limited Query Black-box Adversarial Attacks in the Real World

We study the creation of physical adversarial examples, which are robust to real-world transformations, using a limited number of queries to the target black-box neural networks. We observe that robust models tend to be especially susceptible to foreground manipulations, which motivates our novel Foreground attack. We demonstrate that gradient priors are a useful signal for black-box attacks and therefore introduce an improved version of the popular SimBA. We also propose an algorithm for transferable attacks that selects the most similar surrogates to the target model. Our black-box attacks outperform state-of-the-art approaches they are based on and support our belief that the concept of model similarity could be leveraged to build strong attacks in a limited-information setting.
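
For readers unfamiliar with SimBA, the baseline idea can be sketched in a few lines: perturb the input along one direction at a time, query the model, and keep the change only if the model's confidence in the true class drops. The sketch below is the plain baseline on a coordinate (pixel) basis, not the improved, gradient-prior version the abstract proposes. The names `simba` and `prob_true_class`, the step size, and the query budget are assumptions for illustration.

```python
import random


def simba(x, prob_true_class, epsilon=0.2, max_queries=100, seed=0):
    """Baseline SimBA-style attack on a black-box score function.

    x               : list of floats (the input)
    prob_true_class : callable returning the model's probability of the
                      true class; this is the only access to the model
    Returns (adversarial input, final probability, queries used).
    """
    rng = random.Random(seed)
    x_adv = list(x)
    order = list(range(len(x)))
    rng.shuffle(order)               # visit coordinates in random order

    best = prob_true_class(x_adv)
    queries = 1
    for i in order:
        for step in (epsilon, -epsilon):
            if queries >= max_queries:
                return x_adv, best, queries
            candidate = list(x_adv)
            candidate[i] += step
            p = prob_true_class(candidate)
            queries += 1
            if p < best:             # keep the step only if confidence drops
                x_adv, best = candidate, p
                break
    return x_adv, best, queries
```

Each coordinate is changed at most once and by at most `epsilon`, so the perturbation stays bounded while the query count stays within the stated budget.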

A Person Re-identification based Misidentification-proof Person Following Service Robot

Two years ago, I entered a robot contest in which one of the missions required the robot to follow a pedestrian. At the time, I used the provided demo program to complete the task. Not long after, I found two main issues: 1. The program follows the closest point read by the depth camera, so if I walk close to a wall, the robot is likely to "follow" the wall instead. 2. The problem is worse still if another pedestrian crosses between the robot and the target. To address these two issues, I decided to improve the program. We designed a procedure that uses YOLO object detection and person re-identification to re-identify the target so that the robot can keep following it.
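
One way to picture the re-identification step is as a nearest-match search over appearance embeddings: instead of following the closest depth point, the robot follows the detected person whose embedding best matches the stored target. The sketch below assumes each detected person has already been turned into a feature vector by some re-identification model; the function names and the 0.7 threshold are hypothetical and are not taken from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_target(target_embedding, detections, threshold=0.7):
    """Return the index of the detection that best matches the target.

    Returns None when no detection reaches the threshold, for example when
    only a wall or a different pedestrian is in view.
    """
    best_index, best_score = None, threshold
    for i, embedding in enumerate(detections):
        score = cosine_similarity(target_embedding, embedding)
        if score >= best_score:
            best_index, best_score = i, score
    return best_index
```

Returning `None` rather than the nearest object lets the robot stop or search instead of following the wrong thing.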

Development of an Android Application for Triage Prediction in Hospital Emergency Departments

Triage is the process by which nurses manage hospital emergency departments by assigning patients varying degrees of urgency. While triage algorithms such as the Emergency Severity Index (ESI) have been standardized worldwide, many of them are highly inconsistent, which could endanger the lives of thousands of patients. One way to improve on nurses' accuracy is to use machine learning (ML) models, which can learn from past data to make predictions. We tested six ML models: random forest, XGBoost, logistic regression, support vector machines, k-nearest neighbors, and multilayer perceptron. These models were tasked with predicting whether a patient would be admitted to the intensive care unit (ICU), admitted to another unit in the hospital, or discharged. After training on data from more than 30,000 patients and testing using 10-fold cross-validation, we found that all six models outperformed ESI. Of the six, the random forest model achieved the highest average accuracy in predicting both ICU admission (81% vs. 69% using ESI; p<0.001) and hospitalization (75% vs. 57%; p<0.001). These models were then added to an Android application, which accepts patient data, predicts each patient's triage outcome, and adds the patient to a priority-ordered waiting list. This approach may offer significant advantages over conventional triage: mainly, it has a higher accuracy than nurses and returns predictions instantaneously. It could also stand in for triage nurses entirely in disasters, where medical personnel must deal with a large influx of patients in a short amount of time.
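
The priority-ordered waiting list can be sketched with a binary heap. This is a minimal illustration in Python rather than the Android application's code; the three outcome labels follow the abstract, but the numeric ranks, the class name, and the first-come-first-served tie-break are assumptions.

```python
import heapq
import itertools

# Hypothetical urgency ranks: a lower number is seen sooner.
PRIORITY = {"icu": 0, "admit": 1, "discharge": 2}


class WaitingList:
    """Patients ordered by predicted outcome, then by arrival order."""

    def __init__(self):
        self._heap = []
        self._arrival = itertools.count()  # tie-breaker within one priority

    def add(self, patient_id, predicted_outcome):
        """Queue a patient using the model's predicted outcome label."""
        rank = PRIORITY[predicted_outcome]
        heapq.heappush(self._heap, (rank, next(self._arrival), patient_id))

    def next_patient(self):
        """Remove and return the most urgent patient."""
        return heapq.heappop(self._heap)[2]

    def __len__(self):
        return len(self._heap)
```

A patient predicted to need the ICU is returned before earlier arrivals with less urgent predictions, while patients with the same prediction are served in arrival order.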

Using Deep Learning and Transfer Learning to Guard Against One-Sided News Messages on Social Media

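People today increasingly get their news from online media. However, because information spreads quickly online and outlets chase exposure for financial gain, one-sided and leading news headlines and short messages are a growing problem in news distribution. Using the Stance Detection dataset provided by the Fake News Challenge, this study applies deep learning and transfer learning to train a natural language processing model that predicts the degree of relatedness between two texts, improving hyperparameter tuning and the training procedure along the way. We then apply the model to predict how closely the short text that US news outlets attach to their Facebook posts relates to the body of the linked news report, and analyze whether potentially misleading wording on social platforms actually results in one-sided reporting and thereby affects audiences' use of and trust in the media. The model can help provide real-time warnings about the type and quality of reporting on social platforms and support the media literacy users need when consuming news, thereby curbing the spread of one-sided reporting online and improving the future media ecosystem.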

Enhancement of Online Stochastic Gradient Descent using Backward Queried Images

Stochastic gradient descent (SGD) is one of the preferred online optimization algorithms. However, one of its major drawbacks is its tendency to forget previous data when optimizing over a data stream, also known as catastrophic interference. In this project, we attempt to mitigate this drawback by proposing a new low-cost approach that incorporates backward queried images into SGD during online training. Under this approach, for every new training sample in the data stream, the neural network is also optimized using the corresponding backward queried image from the initial dataset. We compared the proposed method with SGD on a data stream of 50,000 training cases with 10,000 test cases. On two MNIST-style datasets (Fashion-MNIST and Kuzushiji-MNIST), our method substantially improved the performance of the neural network, achieving high accuracy in terms of the mean, minimum, lower quartile, median, and upper quartile while maintaining a lower standard deviation. These results suggest that the proposed algorithm can be a potential alternative to online SGD.

Method of prosthetic vision

This work addresses the problem of spatial orientation for visually impaired people. In the course of the project, we invented a new way of transmitting visual information through an acoustic channel and developed a device that uses distance sensors to analyze the surroundings of the user. Using an algorithm that converts the position of an obstacle into a sound of a particular tone and intensity, the device conveys spatial information about objects to the user in real time. The current design uses a faceted locator made of 36 ultrasonic locators grouped into 12 azimuth sectors and 3 elevation cones. Each reading is converted into a note according to the following scheme: the elevation angle determines the octave, the azimuth determines the note, and the distance determines the volume. Notes are not the only possible choice. We used them because, over the centuries, musical notes have come to be well laid out across the frequency range on a logarithmic scale, so a new note appearing in the combined signal is not masked by a combination of the other notes. Consequently, a blind person moving around a room can use the tone and volume of the sound signals to assess the presence and location of all dangerous obstacles. After theoretically substantiating the hypothesis and analyzing the available information, we began producing prototype devices that implement the idea of transmitting information via the acoustic channel.
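
The mapping from sensor position to sound can be sketched as follows. The 12 azimuth sectors line up naturally with the 12 semitones of an octave, and the 3 elevation cones with 3 octaves, giving 36 distinct pitches. The sketch assumes equal temperament, a lowest note of C4 (about 261.63 Hz), a 4 m sensor range, and that closer obstacles sound louder; none of these values comes from the abstract.

```python
BASE_FREQ_HZ = 261.63   # assumed lowest note (C4)
MAX_RANGE_M = 4.0       # assumed maximum sensor range


def sector_to_tone(azimuth_sector, elevation_cone, distance_m):
    """Map one locator reading to (frequency in Hz, volume in 0..1).

    azimuth_sector : 0..11, selects the note within the octave
    elevation_cone : 0..2, selects the octave
    distance_m     : measured distance; closer is louder (assumption)
    """
    if not 0 <= azimuth_sector < 12:
        raise ValueError("azimuth_sector must be in 0..11")
    if not 0 <= elevation_cone < 3:
        raise ValueError("elevation_cone must be in 0..2")

    # Equal temperament: one semitone multiplies frequency by 2 ** (1 / 12).
    frequency = BASE_FREQ_HZ * 2 ** (elevation_cone + azimuth_sector / 12)

    clamped = min(max(distance_m, 0.0), MAX_RANGE_M)
    volume = 1.0 - clamped / MAX_RANGE_M
    return frequency, volume
```

Because the pitches are spaced evenly on a logarithmic scale, each of the 36 locators gets its own frequency, which matches the abstract's reason for choosing notes.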

A New Similarity Measure for Polyphonic Music Fragments

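When listening to music, we often get a sense of déjà vu. To describe this feeling, we studied similarity measures for polyphonic music fragments. Having previously tried an implementation based on the longest common subsequence with disappointing results, we normalize each music fragment and treat it as a set of (time, pitch) points in the coordinate plane. Using point correspondence and bipartite graph matching, we define the similarity of two polyphonic music fragments as the average edge weight of the maximum-weight matching. We computed the similarity of identical and different music fragments in the JKUPDD dataset, tuned the parameters of the algorithm to find the best combination, and used the weights between notes to plot self-similarity matrices that reveal repeated passages within a piece.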
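
The definition above can be sketched in code. The abstract does not give the normalization or the edge-weight function, so this sketch assumes that normalization shifts each fragment to start at time 0 and lowest pitch 0, and that the weight between two notes is 1 / (1 + Euclidean distance). The matching is found by brute force, which is only practical for very short, non-empty fragments; a real implementation would more likely use the Hungarian algorithm.

```python
import math
from itertools import permutations


def normalise(fragment):
    """Shift a list of (time, pitch) points so both coordinates start at 0."""
    t0 = min(t for t, _ in fragment)
    p0 = min(p for _, p in fragment)
    return [(t - t0, p - p0) for t, p in fragment]


def edge_weight(a, b):
    """Assumed weight between two notes: 1 for identical points, less otherwise."""
    return 1.0 / (1.0 + math.dist(a, b))


def similarity(frag_a, frag_b):
    """Average edge weight of the maximum-weight bipartite matching.

    Every note of the shorter fragment is matched to a distinct note of
    the longer one, and all such matchings are tried.
    """
    a, b = normalise(frag_a), normalise(frag_b)
    if len(a) > len(b):
        a, b = b, a
    best = max(
        sum(edge_weight(x, y) for x, y in zip(a, perm))
        for perm in permutations(b, len(a))
    )
    return best / len(a)
```

Under these assumptions, a fragment and a transposed, time-shifted copy of it score exactly 1.0, and any difference in rhythm or interval lowers the score.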