Science Fair Project Search
Adversarial Attacks Against Detecting Bot Generated Text
With the introduction of the transformer architecture by Vaswani et al. (2017), contemporary Text Generation Models (TGMs) have become remarkably capable of generating neural text that, for humans, is nearly indistinguishable from human text (Radford et al., 2019; Zellers et al., 2019; Keskar et al., 2019). Although TGMs have many potential positive uses in writing, entertainment and software development (Solaiman et al., 2019), there is also a significant threat of these models being misused by malicious actors to generate fake news (Uchendu et al., 2020; Zellers et al., 2019), fake product reviews (Adelani et al., 2020), or extremist content (McGuffie & Newhouse, 2020). TGMs like GPT-2 generate text based on a given prompt, which limits the degree of control over the topic and sentiment of the neural text (Radford et al., 2019). However, other TGMs like GROVER and CTRL allow for greater control of the content and style of generated text, which increases their potential for misuse by malicious actors (Zellers et al., 2019; Keskar et al., 2019). Additionally, many state-of-the-art pre-trained TGMs are freely available online and can be deployed by low-skilled individuals with minimal resources (Solaiman et al., 2019). There is therefore an immediate and substantial need to develop methods that can detect misuse of TGMs on vulnerable platforms like social media or e-commerce websites. Several methods have been explored for detecting neural text. Gehrmann et al. (2019) developed the GLTR tool, which highlights distributional differences between GPT-2 generated text and human text and assists humans in identifying a piece of neural text. Another approach is to formulate the problem as a classification task that distinguishes neural text from human text, and to train a classifier model (henceforth a ‘detector’).
Simple linear classifiers on TF-IDF vectors or on the topology of attention maps have also achieved moderate performance (Solaiman et al., 2019; Kushnareva et al., 2021). Zellers et al. (2019) propose a detector of GROVER generated text based on a linear classifier on top of the GROVER model and argue that the best TGMs are also the best detectors. However, later results by Uchendu et al. (2020) and Solaiman et al. (2019) show that this claim does not hold true for all TGMs. Consistent through most research thus far is that fine-tuning the BERT or RoBERTa language model for the detection task achieves state-of-the-art performance (Radford et al., 2019; Uchendu et al., 2020; Adelani et al., 2020; Fagni et al., 2021). I will therefore focus on attacks against a fine-tuned RoBERTa model. Although extensive research has been conducted on detecting generated text, there is a significant lack of research on adversarial attacks against such detectors (Jawahar et al., 2020). However, the research that does exist suggests, preliminarily, that neural text detectors are not robust: the output can change drastically even for small changes in the text input, and these detectors are therefore vulnerable to adversarial attacks (Wolff, 2020). In this paper, I extend Wolff’s (2020) work on adversarial attacks on neural text detectors by proposing a series of attacks designed to counter detectors, as well as an algorithm that selects among these attacks optimally without compromising the fluency of the generated text. I do this with reference to a fine-tuned RoBERTa detector and on two datasets: (1) the GPT-2 WebText dataset (Radford et al., 2019) and (2) the Tweepfake dataset (Fagni et al., 2021). Additionally, I experiment with possible defences against these attacks, including (1) count-based features, (2) stylometric features and (3) adversarial training.
The GoClub: Analysis of Algorithm Efficiency and Applicability for Plum Blossom Chess (梅花棋)
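This study examines an original board game, "Plum Blossom Chess" (梅花棋), to find the most efficient algorithm for it and to analyze how well AI applies to it. The rules are as follows: two players take turns placing stones on a 19×19 board, the first player using black stones and the second using white, and whoever forms a plum blossom pattern wins. As the number of stones grows, judging the winner by eye becomes increasingly difficult, so we wanted a computer to do it. We wrote the program in C++ and kept refining the algorithm to improve the computer's accuracy and responsiveness. Along the way we proposed, in order, an averaging algorithm, a Pythagorean-theorem algorithm, a vector algorithm, and a mesh-encoding algorithm. The latest version uses the mesh-encoding algorithm with a spiral encoding table, which lets the computer determine the winner quickly and correctly. With the best algorithm in hand, we wrote an AI based on the Minimax algorithm and kept increasing its search depth to raise the computer's playing strength. Using the concept of "Victory notion", we analyzed the similarity between the two to judge the applicability to Plum Blossom Chess. Through repeated test games against the Minimax algorithm, we gradually narrowed the gap between the first and second players' advantages under the rules. The study currently supports both simple two-player matches and more complex human-versus-computer matches.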
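The effect of search depth on a Minimax AI can be illustrated with a minimal Python sketch. The abstract does not define the plum blossom win pattern, and the project itself was written in C++, so this example does not model the game: it uses a hypothetical toy game (players alternately take 1 or 2 stones from a pile, and whoever takes the last stone wins). The names `get_moves`, `score`, and `minimax` are illustrative assumptions, not the authors' code.

```python
from typing import Callable, List, Tuple

# Hypothetical toy state: (stones left in the pile, player to move: 0 or 1).
State = Tuple[int, int]


def get_moves(state: State) -> List[State]:
    """Successor states: the player to move takes 1 or 2 stones."""
    stones, player = state
    return [(stones - k, 1 - player) for k in (1, 2) if k <= stones]


def score(state: State) -> int:
    """Score from player 0's point of view.

    An empty pile means the previous player took the last stone and won.
    A non-empty pile means the search was cut off, so the result is unknown (0).
    """
    stones, player = state
    if stones > 0:
        return 0
    return -1 if player == 0 else 1


def minimax(
    state: State,
    depth: int,
    maximizing: bool,
    moves: Callable[[State], List[State]],
    evaluate: Callable[[State], int],
) -> int:
    """Depth-limited minimax: best score reachable within `depth` plies."""
    children = moves(state)
    if depth == 0 or not children:
        return evaluate(state)
    results = [
        minimax(child, depth - 1, not maximizing, moves, evaluate)
        for child in children
    ]
    return max(results) if maximizing else min(results)


# With a shallow search the outcome of a 4-stone pile is unknown;
# a deeper search finds that the player to move can force a win.
shallow = minimax((4, 0), 1, True, get_moves, score)
deep = minimax((4, 0), 10, True, get_moves, score)
```

In this toy game the shallow search cannot see the end of the game and returns the neutral score, while the deeper search proves a forced win. This is the general reason that increasing the depth tends to strengthen a Minimax player, at the cost of a search tree that grows with every extra ply.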
A.N.T.s: Algorithm for Navigating Traffic System in Automated Warehouses
According to CNN Indonesia (2020), demand for e-commerce in Indonesia nearly doubled during the pandemic. This surge in demand calls for a time-efficient method of warehouse order-picking. One way to achieve that is to incorporate automation into warehouse systems. Globally, the market for warehouse robotics is expected to reach 12.6 billion USD by 2027 (Data Bridge Market Research, 2020). The warehouse system studied in this research uses AMRs (Autonomous Mobile Robots) to lift movable shelf units and deliver them to the packing station where workers are stationed. This research designed a heuristic algorithm called A.N.T.s (Algorithm for Navigating Traffic System) to perform task assignment and pathfinding for AMRs in the automated warehouse. The warehouse layout is drawn as a two-dimensional grid map. When an order is placed, A.N.T.s assigns the task to the robot that would need the least time to reach the target shelf. A.N.T.s then performs pathfinding heuristically using Manhattan distance, guiding the robot to the target shelf unit, where it lifts the shelf and brings it to the designated packing station. The A.N.T.s algorithm was tested in various warehouse layouts and with varying numbers of AMRs. It was also compared against the commonly used Dijkstra’s algorithm (Shaikh and Dhale, 2013). Results show that the proposed A.N.T.s algorithm could execute 100 orders in a 27x23 layout with five robots 9.96 times faster than Dijkstra’s algorithm, with no collisions. The algorithm is also shown to help assign tasks to robots and to help them find short paths to the shelf units and packing stations. With the aid of lanes and directions, A.N.T.s could navigate traffic to avoid deadlocks and collisions in the warehouse.
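The two steps described above, assigning an order to the quickest robot and finding a path on the grid with a Manhattan-distance heuristic, can be sketched in Python as follows. This is not the A.N.T.s algorithm itself, whose details the abstract does not give. It assumes a standard A* search, uses path length as a stand-in for travel time, and ignores lanes, directions, and collisions between robots. The names `find_path` and `assign_robot`, the grid, and the robot positions are all hypothetical.

```python
import heapq
from typing import Dict, List, Optional, Tuple

Cell = Tuple[int, int]  # (row, column)


def manhattan(a: Cell, b: Cell) -> int:
    """Manhattan distance between two grid cells."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def find_path(grid: List[List[int]], start: Cell, goal: Cell) -> Optional[List[Cell]]:
    """A* on a 4-connected grid (1 = blocked, 0 = free); None if unreachable."""
    rows, cols = len(grid), len(grid[0])
    frontier = [(manhattan(start, goal), 0, start)]  # (estimate, cost so far, cell)
    came_from: Dict[Cell, Optional[Cell]] = {start: None}
    cost = {start: 0}
    while frontier:
        _, g, cell = heapq.heappop(frontier)
        if cell == goal:
            path = []
            while cell is not None:  # walk back to the start
                path.append(cell)
                cell = came_from[cell]
            return path[::-1]
        if g > cost[cell]:
            continue  # stale queue entry; a cheaper route was already found
        r, c = cell
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_g = g + 1
                if new_g < cost.get(nxt, float("inf")):
                    cost[nxt] = new_g
                    came_from[nxt] = cell
                    heapq.heappush(frontier, (new_g + manhattan(nxt, goal), new_g, nxt))
    return None


def assign_robot(grid: List[List[int]], robots: Dict[str, Cell], shelf: Cell):
    """Pick the robot with the shortest path to the shelf (proxy for least time)."""
    best = None
    for name, position in robots.items():
        path = find_path(grid, position, shelf)
        if path is not None and (best is None or len(path) < len(best[1])):
            best = (name, path)
    return best


# Hypothetical 3x4 layout: the middle row is blocked except for one aisle.
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
]
robots = {"R1": (0, 0), "R2": (2, 3)}
chosen, route = assign_robot(grid, robots, shelf=(2, 0))
```

Here `R2` is chosen because it can reach the shelf in three moves, while `R1` has to detour through the single aisle and needs six. A real multi-robot system would also have to coordinate routes in time, which is the part the abstract attributes to lanes and directions.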
Finding New Therapies Against Multidrug-Resistant Pseudomonas aeruginosa Through Virtual Screening of LpxC Inhibitors
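Multidrug-resistant (MDR) bacteria have become a major threat worldwide, and multidrug-resistant Pseudomonas aeruginosa is one such pathogen that resists most therapies. Before current treatment options become ineffective, an antibiotic with a novel mechanism needs to be developed as a countermeasure. We used computational virtual screening, taking LpxC, a key protein in the biosynthetic pathway of lipopolysaccharide lipid A (Lipid A), as the screening target. In our first round of predictions, ZINC000001587011 (brequinar) showed a relatively low binding energy and relatively high bioavailability. Because of its relatively high cLogP value, however, we modified its functional groups in the hope of improving it. Finally, among all the derivatives, we identified N11 as having the greatest potential as a drug precursor against Pseudomonas aeruginosa.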