Science Fair Project Search
Unfathomed "Depths": 3D Coordinate Reconstruction and a Smart Management System for Cardinal Tetras
Because fish farmers cannot monitor their fish around the clock, this study uses the YOLOv5 visual recognition model and the Deep SORT multi-object tracking algorithm to locate and track individual fish, building a "smart fish tank management system." We used 600 and 150 fish photos as the training and test sets for recognition, reaching an average accuracy of 99.99%. From the disparity between two camera views, together with a refraction formula derived from Snell's law, we reconstruct the 3D coordinates of every individual in the school; from changes in those coordinates we compute each fish's activity level, infer its likely behavior, and render a real-time 3D animation of the school inside the tank. We also use OpenCV's contour detection functions to compute each individual's side-view area and thereby track its growth. Finally, all of these data are written to a MySQL database for statistical analysis, and when specific events occur the system sends a message through Line Notify so the operator can respond promptly.
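The abstract does not publish the project's code, but the detection-plus-tracking loop it describes can be sketched as follows. This is a minimal sketch assuming the public ultralytics/yolov5 hub model and the deep_sort_realtime package; the project's own trained weights, camera setup, and tracker settings are not given.

```python
import cv2
import torch
from deep_sort_realtime.deepsort_tracker import DeepSort

# Placeholder weights: the project fine-tuned YOLOv5 on its own 600 fish
# photos, which are not available, so the stock checkpoint stands in.
model = torch.hub.load("ultralytics/yolov5", "yolov5s")
tracker = DeepSort(max_age=30)  # assumed setting, not from the abstract

cap = cv2.VideoCapture(0)  # one of the two tank cameras
while cap.isOpened():
    ok, frame = cap.read()
    if not ok:
        break
    # YOLOv5 hub models expect RGB; OpenCV delivers BGR.
    rgb = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
    det = model(rgb).xyxy[0].cpu().numpy()  # rows: x1, y1, x2, y2, conf, cls
    detections = [([x1, y1, x2 - x1, y2 - y1], conf, int(cls))
                  for x1, y1, x2, y2, conf, cls in det]
    for track in tracker.update_tracks(detections, frame=frame):
        if track.is_confirmed():
            print(track.track_id, track.to_ltrb())  # stable per-fish ID
cap.release()
```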
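The 3D reconstruction combines two ideas: pinhole stereo triangulation, Z = f·B/d (focal length f, baseline B, disparity d), and a depth correction for light refracting at the water surface, from Snell's law n1·sin θ1 = n2·sin θ2. The abstract does not state its derived formula, so the sketch below uses the common paraxial (small-angle) result that apparent depth equals real depth scaled by n_air/n_water; the calibration numbers are invented.

```python
# Minimal sketch: depth from two-camera disparity, then a Snell's-law
# correction, assuming the cameras sit at the water surface so the whole
# optical path is underwater. The project's actual derivation is not given.
N_AIR, N_WATER = 1.000, 1.333  # refractive indices

def stereo_depth(disparity_px: float, focal_px: float = 800.0,
                 baseline_m: float = 0.06) -> float:
    """Apparent depth from pinhole stereo: Z = f * B / d."""
    return focal_px * baseline_m / disparity_px

def snell_corrected_depth(apparent_m: float) -> float:
    """Paraxial Snell's-law result: an underwater point appears at
    d_apparent = d_real * (n_air / n_water), so invert the scaling."""
    return apparent_m * (N_WATER / N_AIR)

z_apparent = stereo_depth(disparity_px=120.0)  # -> 0.40 m apparent depth
print(snell_corrected_depth(z_apparent))       # -> ~0.53 m real depth
```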
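The abstract computes each fish's activity level from changes in its reconstructed coordinates. The exact metric is not stated; one plausible reading, used below purely as an illustration, is mean 3D speed over a window of frames.

```python
import numpy as np

def activity_level(track_xyz: np.ndarray, fps: float = 30.0) -> float:
    """Mean speed (m/s) over a window of per-frame 3D positions.

    track_xyz has shape (T, 3): one reconstructed coordinate per frame.
    The metric itself is an assumption; the abstract says only that
    activity is computed from changes in individual coordinates.
    """
    if len(track_xyz) < 2:
        return 0.0
    steps = np.diff(track_xyz, axis=0)               # (T-1, 3) displacements
    path_len = np.linalg.norm(steps, axis=1).sum()   # total distance swum
    return path_len / ((len(track_xyz) - 1) / fps)
```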
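For the growth measurement, OpenCV's contour functions return an area in pixels that can be converted to physical units. The Otsu threshold and the pixels-per-millimetre scale below are placeholder assumptions; the abstract says only that contour detection is used to compute the side-view area.

```python
import cv2

def side_view_area_mm2(gray_fish_crop, px_per_mm: float = 4.0) -> float:
    """Side-view area of a segmented fish in mm^2 (largest contour)."""
    _, mask = cv2.threshold(gray_fish_crop, 0, 255,
                            cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL,
                                   cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return 0.0
    area_px = cv2.contourArea(max(contours, key=cv2.contourArea))
    return area_px / (px_per_mm ** 2)
```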
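The tail of the pipeline, logging to MySQL and alerting through Line Notify, could look like the sketch below. The table schema, credentials, alert threshold, and token are all placeholders; only the MySQL-plus-Line-Notify combination itself comes from the abstract.

```python
import mysql.connector
import requests

LINE_TOKEN = "YOUR_LINE_NOTIFY_TOKEN"  # placeholder token
LOW_ACTIVITY = 0.01                    # m/s; assumed alert threshold

def log_and_alert(fish_id: int, speed_mps: float, area_mm2: float) -> None:
    """Store one reading, then notify the operator on a low-activity event."""
    db = mysql.connector.connect(user="tank", password="secret",
                                 database="smart_tank")  # placeholder schema
    cur = db.cursor()
    cur.execute("INSERT INTO fish_stats (fish_id, speed, area_mm2) "
                "VALUES (%s, %s, %s)", (fish_id, speed_mps, area_mm2))
    db.commit()
    db.close()
    if speed_mps < LOW_ACTIVITY:
        requests.post("https://notify-api.line.me/api/notify",
                      headers={"Authorization": f"Bearer {LINE_TOKEN}"},
                      data={"message": f"Fish {fish_id}: low activity "
                                       f"({speed_mps:.3f} m/s)"},
                      timeout=5)
```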
Rooting Out the Uninvited "Plastic" Guest: Microplastic Uptake by the Bladder Traps of Bladderwort (Utricularia)
This study examines the specialized bladder traps on the leaves of bladderwort (Utricularia). Using staining and destaining methods, we tested whether taking up polyvinyl chloride (PVC) changes the traps' firing rate, and found that traps containing PVC both fire and expel dye more slowly than traps without it. Traps soaked in different PVC concentrations also fire and expel at different rates: those soaked in lower-concentration solutions fire and expel dye faster than those in higher concentrations. This shows that different concentrations of physical microparticles in the water affect the firing and expulsion actions of bladderwort traps to different degrees. In addition, we assayed the activity of the antioxidant enzyme peroxidase (POD) in bladderwort soaked in PVC solution for different lengths of time to gauge the plants' condition. Every PVC-exposed group showed elevated POD levels, with the highest in the group soaked for one day, indicating that PVC does place bladderwort under appreciable oxidative stress.
Adversarial Attacks Against Detecting Bot Generated Text
With the introduction of the transformer architecture by Vaswani et al. (2017), contemporary Text Generation Models (TGMs) have shown incredible capabilities in generating neural text that, for humans, is nearly indistinguishable from human text (Radford et al., 2019; Zellers et al., 2019; Keskar et al., 2019). Although TGMs have many potential positive uses in writing, entertainment, and software development (Solaiman et al., 2019), there is also a significant threat of these models being misused by malicious actors to generate fake news (Uchendu et al., 2020; Zellers et al., 2019), fake product reviews (Adelani et al., 2020), or extremist content (McGuffie & Newhouse, 2020). TGMs like GPT-2 generate text based on a given prompt, which limits the degree of control over the topic and sentiment of the neural text (Radford et al., 2019). However, other TGMs like GROVER and CTRL allow for greater control over the content and style of generated text, which increases their potential for misuse by malicious actors (Zellers et al., 2019; Keskar et al., 2019). Additionally, many state-of-the-art pre-trained TGMs are freely available online and can be deployed by low-skilled individuals with minimal resources (Solaiman et al., 2019). There is therefore an immediate and substantial need for methods that can detect misuse of TGMs on vulnerable platforms like social media and e-commerce websites.

Several methods have been explored for detecting neural text. Gehrmann et al. (2019) developed the GLTR tool, which highlights distributional differences between GPT-2 generated text and human text and assists humans in identifying a given piece of neural text. Another approach is to formulate the problem as a classification task, training a classifier model (henceforth a 'detector') to distinguish neural text from human text. Simple linear classifiers on TF-IDF vectors or on the topology of attention maps have achieved moderate performance (Solaiman et al., 2019; Kushnareva et al., 2021). Zellers et al. (2019) propose a detector of GROVER-generated text based on a linear classifier on top of the GROVER model itself, and argue that the best TGMs are also the best detectors. However, later results by Uchendu et al. (2020) and Solaiman et al. (2019) show that this claim does not hold for all TGMs. Consistent across most research thus far is that fine-tuning the BERT or RoBERTa language model for the detection task achieves state-of-the-art performance (Radford et al., 2019; Uchendu et al., 2020; Adelani et al., 2020; Fagni et al., 2021). I will therefore focus on attacks against a fine-tuned RoBERTa model.

Although extensive research has been conducted on detecting generated text, there is a significant lack of research on adversarial attacks against such detectors (Jawahar et al., 2020). The limited research that does exist suggests, preliminarily, that neural text detectors are not robust: their output can change drastically even for small changes in the text input, and they are thus vulnerable to adversarial attacks (Wolff, 2020). In this paper, I extend Wolff's (2020) work on adversarial attacks against neural text detectors by proposing a series of attacks designed to counter detectors, as well as an algorithm that optimally selects among these attacks without compromising the fluency of the generated text.
I do this with reference to a fine-tuned RoBERTa detector on two datasets: (1) the GPT-2 WebText dataset (Radford et al., 2019) and (2) the TweepFake dataset (Fagni et al., 2021). Additionally, I experiment with possible defences against these attacks: (1) count-based features, (2) stylometric features, and (3) adversarial training.
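As an illustration of the detection setup described above, the sketch below scores a text with a RoBERTa-based detector via Hugging Face Transformers. The checkpoint is OpenAI's public GPT-2 output detector ("roberta-base-openai-detector"), standing in for the paper's own fine-tuned model, whose weights are not given here.

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-base-openai-detector"  # public stand-in checkpoint
tok = AutoTokenizer.from_pretrained(name)
detector = AutoModelForSequenceClassification.from_pretrained(name)

def machine_probability(text: str) -> float:
    """Probability mass the detector puts on the machine-generated class."""
    inputs = tok(text, return_tensors="pt", truncation=True)
    with torch.no_grad():
        probs = torch.softmax(detector(**inputs).logits, dim=-1)[0]
    # Index 0 is assumed to be the machine-generated class here; verify
    # against detector.config.id2label before relying on this.
    return probs[0].item()

print(machine_probability("Sample text to score."))
```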
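One family of attacks Wolff (2020) studies replaces Latin characters with visually identical Unicode homoglyphs, leaving the text unchanged for a human reader while shifting the detector's tokenization. A minimal sketch follows; the mapping is an illustrative subset, and the random substitution rate is an assumption.

```python
import random

# Latin -> Cyrillic lookalikes: а, е, о, р (illustrative subset only)
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e", "p": "\u0440"}

def homoglyph_attack(text: str, rate: float = 0.1) -> str:
    """Randomly swap a fraction of substitutable characters."""
    return "".join(HOMOGLYPHS[c] if c in HOMOGLYPHS and random.random() < rate
                   else c for c in text)

print(homoglyph_attack("neural text detectors are not robust", rate=1.0))
```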