Science Fair Project Search
A Study on Implementing an Intelligent Robot Agent System with a Distributed Edge-Computing Network Architecture
This study aims to build an AI robot agent based on edge computing and apply it to the research and development of a physical humanoid combat robot. Taking the humanoid combat robot as the target scenario, we use flexible modules together with distributed, embedded real-time networking to reduce system design complexity, integrating communication protocols with the YOLO deep-learning vision algorithm. We further apply ZMP (Zero Moment Point) motion-control theory and multi-sensor fusion, combining a gyroscope, an accelerometer, CMOS image sensors, and FSRs (force-sensing resistors) as the foundation for intelligent humanoid balance; pattern recognition then provides predictive identification, while intelligent ZMP posture control drives offense and defense strategy decisions. The overall system is integrated through an AI chip and an embedded networked system: environmental information and commands are transmitted over the network in real time, so the robot knows the intent behind high-level commands. Notably, the network is designed after the layered architecture of the mammalian nervous system, delegating reflexes and real-time control to the intelligent-agent software for on-board computation, which meets the need for both high performance and flexible development. In the future this approach can serve highly extensible humanoid robots, including combat robots, humanoid robots carrying construction materials on work sites, and self-balancing medical exoskeletons, allowing humans and robots to work side by side and improving human-robot interaction.
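The abstract does not give the fusion algorithm it uses, but a common minimal way to fuse gyroscope and accelerometer readings for balance estimation, as described above, is a complementary filter. The sketch below is illustrative only (the function name, sample rate, and blend weight are assumptions, not the authors' implementation):

```python
import math

def complementary_filter(gyro_rates, accel_samples, dt, alpha=0.98):
    """Estimate a pitch angle (radians) by fusing gyro and accelerometer.

    gyro_rates: angular rate about the pitch axis (rad/s), one per step.
    accel_samples: (ax, az) accelerometer readings (m/s^2), one per step.
    alpha: weight on the integrated gyro estimate; the remaining
           (1 - alpha) corrects gyro drift using the gravity direction
           seen by the accelerometer.
    """
    angle = 0.0
    history = []
    for rate, (ax, az) in zip(gyro_rates, accel_samples):
        accel_angle = math.atan2(ax, az)  # tilt inferred from gravity
        angle = alpha * (angle + rate * dt) + (1 - alpha) * accel_angle
        history.append(angle)
    return history
```

The gyro term tracks fast motion; the accelerometer term slowly pulls the estimate back toward the true tilt, which is why this kind of filter is a common baseline before heavier approaches such as Kalman filtering.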
Finding a New Therapy Against Multidrug-Resistant Pseudomonas aeruginosa by Virtual Screening for LpxC Inhibitors
Multidrug-resistant (MDR) bacteria have become a major threat worldwide, and MDR Pseudomonas aeruginosa is one such pathogen resistant to most available therapies. Before current treatment options fail entirely, it is necessary to develop antibiotics with a novel mechanism of action. We performed computational virtual screening against LpxC, a key enzyme in the biosynthetic pathway of lipid A, the lipid anchor of lipopolysaccharide. In our first round of prediction, ZINC000001587011 (brequinar) showed a low binding energy and high bioavailability, but because of its high cLogP we modified its functional groups in hopes of improvement. Among all derivatives, N11 ultimately showed the greatest potential as a drug lead against P. aeruginosa.
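The workflow above (dock, then reject hits with poor cLogP, then rank by binding energy) can be sketched as a simple triage step. The compound names, scores, and the drug-likeness window below are illustrative assumptions, not the study's actual screening data:

```python
# Hypothetical post-screening triage: keep docking hits whose predicted
# cLogP falls in a drug-like window, then rank by binding energy
# (more negative = stronger predicted binding).
DRUGLIKE_CLOGP = (-0.4, 5.0)  # a commonly cited drug-likeness range

def triage_hits(hits, clogp_range=DRUGLIKE_CLOGP):
    lo, hi = clogp_range
    passing = [h for h in hits if lo <= h["clogp"] <= hi]
    return sorted(passing, key=lambda h: h["binding_energy"])

hits = [
    {"name": "cand_A", "binding_energy": -9.1, "clogp": 6.2},  # fails cLogP
    {"name": "cand_B", "binding_energy": -8.4, "clogp": 3.1},
    {"name": "cand_C", "binding_energy": -8.9, "clogp": 4.5},
]
ranked = triage_hits(hits)
```

This mirrors why brequinar's high cLogP prompted functional-group modification even though its binding energy looked favorable: a strong binder outside the drug-like window is filtered out before ranking.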
This study uses semi-supervised machine learning with convolutional neural networks to train a core model: galaxy images are fed into the model so the computer can automatically determine the galaxy's type. I used a self-designed CNN architecture as well as VGG-16 as my convolutional network architectures, with datasets drawn from EFIGI and Galaxy Zoo 2. There were two tasks. The first was to distinguish elliptical (E), spiral (S), and irregular (I) galaxies; with 2,468 training images, the final accuracy reached 94%. The second task classified eight galaxy types (E, S0, Sa, Sb, Sc, SBa, SBb, SBc), using an autoencoder for pre-training and 1,923 EFIGI images plus 1,258 Galaxy Zoo 2 images as training data. Because many of the galaxy images look too similar, the highest test accuracy reached was 54.12%. Based on my research, automated galaxy classification should have considerable room for application in astronomy.
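The abstract's semi-supervised ingredient is autoencoder pre-training; a different but closely related semi-supervised technique, shown here only as a generic illustration of how unlabeled galaxy images can contribute, is pseudo-labeling: keep only the model's most confident predictions on unlabeled data and add them to the training set. The function and threshold below are assumptions, not the study's method:

```python
def select_pseudo_labels(probabilities, classes, threshold=0.95):
    """Pick confident predictions on unlabeled images as pseudo-labels.

    probabilities: one list of per-class probabilities per unlabeled image
                   (e.g. a classifier's softmax output).
    Returns (image_index, class_name) pairs whose top probability
    clears the confidence threshold.
    """
    selected = []
    for i, probs in enumerate(probabilities):
        best = max(range(len(probs)), key=lambda k: probs[k])
        if probs[best] >= threshold:
            selected.append((i, classes[best]))
    return selected
```

A high threshold matters here precisely because of the problem the abstract notes: when classes look alike (e.g. Sb vs. SBb), low-confidence pseudo-labels would mostly inject noise.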
An Investigation and Experiment on Missing-Value Handling, Using Colorectal Cancer Prediction as a Case Study
Machine learning and precision medicine are hot topics in medicine today. Machine-learning applications in healthcare are increasingly widespread and can help clinicians diagnose diseases faster and more precisely while providing personalized treatment plans. For example, deep-learning models trained on large volumes of medical imaging data can automatically detect and classify tumors, and big-data analysis of medical records can provide timely disease prediction and prevention advice. How to combine clinical data with machine learning to build predictive models is therefore an important question. This study uses three years of clinical data on colorectal cancer and colitis patients from the Ministry of Health and Welfare's Shuang Ho Hospital, collected through the Taipei Medical University data office, to build and evaluate machine-learning models. After handling missing values, ranking and selecting features, and applying forward feature selection to train and validate the models, we identify the best-performing combination of laboratory test items for distinguishing colorectal cancer from colitis, in order to predict colorectal cancer.
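Forward feature selection, mentioned above, is a greedy loop: start from an empty set, repeatedly add the single feature that most improves a validation score, and stop when no addition helps. A minimal sketch follows; the feature names, scoring function, and stopping tolerance are illustrative placeholders for retraining and re-validating the clinical model:

```python
def forward_feature_selection(features, score_fn, min_gain=1e-4):
    """Greedy forward selection over feature names.

    score_fn maps a tuple of feature names to a validation metric
    (e.g. the AUC of a model trained on only those features); here it
    stands in for retraining on each candidate subset.
    """
    selected, best = [], float("-inf")
    remaining = list(features)
    while remaining:
        # Score every one-feature extension of the current subset.
        gains = [(score_fn(tuple(selected + [f])), f) for f in remaining]
        score, f = max(gains)
        if score < best + min_gain:  # no candidate improves enough
            break
        selected.append(f)
        remaining.remove(f)
        best = score
    return selected, best
```

The early stop is what yields a compact "best combination of test items" rather than simply using every available laboratory feature.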
Adversarial Attacks Against Detecting Bot Generated Text
With the introduction of the transformer architecture by Vaswani et al. (2017), contemporary Text Generation Models (TGMs) have shown incredible capabilities in generating neural text that, for humans, is nearly indistinguishable from human text (Radford et al., 2019; Zellers et al., 2019; Keskar et al., 2019). Although TGMs have many potential positive uses in writing, entertainment and software development (Solaiman et al., 2019), there is also a significant threat of these models being misused by malicious actors to generate fake news (Uchendu et al., 2020; Zellers et al., 2019), fake product reviews (Adelani et al., 2020), or extremist content (McGuffie & Newhouse, 2020). TGMs like GPT-2 generate text based on a given prompt, which limits the degree of control over the topic and sentiment of the neural text (Radford et al., 2019). However, other TGMs like GROVER and CTRL allow for greater control of the content and style of generated text, which increases their potential for misuse by malicious actors (Zellers et al., 2019; Keskar et al., 2019). Additionally, many state-of-the-art pre-trained TGMs are freely available online and can be deployed by low-skilled individuals with minimal resources (Solaiman et al., 2019). There is therefore an immediate and substantial need to develop methods that can detect misuse of TGMs on vulnerable platforms like social media or e-commerce websites.

Several methods have been explored for detecting neural text. Gehrmann et al. (2019) developed the GLTR tool, which highlights distributional differences between GPT-2 generated text and human text, assisting humans in identifying a piece of neural text. Another approach formulates the problem as a classification task, distinguishing neural text from human text by training a classifier model (henceforth a 'detector'). Simple linear classifiers on TF-IDF vectors or on the topology of attention maps have also achieved moderate performance (Solaiman et al., 2019; Kushnareva et al., 2021). Zellers et al. (2019) propose a detector of GROVER generated text based on a linear classifier on top of the GROVER model and argue that the best TGMs are also the best detectors; however, later results by Uchendu et al. (2020) and Solaiman et al. (2019) show that this claim does not hold for all TGMs. Consistent through most research thus far is that fine-tuning the BERT or RoBERTa language model for the detection task achieves state-of-the-art performance (Radford et al., 2019; Uchendu et al., 2020; Adelani et al., 2020; Fagni et al., 2021). I will therefore be focusing on attacks against a fine-tuned RoBERTa model.

Although extensive research has been conducted on detecting generated text, there is a significant lack of research on adversarial attacks against such detectors (Jawahar et al., 2020). The limited research that does exist suggests that neural text detectors are not robust: their output can change drastically even for small changes in the text input, leaving these detectors vulnerable to adversarial attacks (Wolff, 2020). In this paper, I extend Wolff's (2020) work on adversarial attacks against neural text detectors by proposing a series of attacks designed to counter detectors, as well as an algorithm to optimally select among these attacks without compromising the fluency of generated text. I do this with reference to a fine-tuned RoBERTa detector and on two datasets: (1) the GPT-2 WebText dataset (Radford et al., 2019) and (2) the TweepFake dataset (Fagni et al., 2021). Additionally, I experiment with possible defences against these attacks, including (1) count-based features, (2) stylometric features and (3) adversarial training.
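One attack family Wolff (2020) studied is the homoglyph attack: replacing Latin characters with visually identical Unicode look-alikes, which leaves the text unchanged for human readers but alters the token sequence a subword-based detector such as RoBERTa sees. The sketch below is a minimal illustration under assumed names and a fixed substitution budget, not the paper's attack-selection algorithm:

```python
# Cyrillic characters that render like their Latin counterparts.
HOMOGLYPHS = {"a": "\u0430", "e": "\u0435", "o": "\u043e"}

def homoglyph_attack(text, budget=0.05):
    """Swap at most a `budget` fraction of characters for homoglyphs.

    The visible string is unchanged to a human reader, but each swap
    can fragment a subword token and shift the detector's input.
    """
    max_swaps = max(1, int(len(text) * budget))
    out, swaps = [], 0
    for ch in text:
        if swaps < max_swaps and ch in HOMOGLYPHS:
            out.append(HOMOGLYPHS[ch])
            swaps += 1
        else:
            out.append(ch)
    return "".join(out)

adv = homoglyph_attack("machine generated text")
```

The budget parameter reflects the tension the paper targets: larger perturbations flip the detector more reliably but risk being caught by normalization-based or count-based defences.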