Security Papers

在安全领域应用GPT/AIGC/LLM的论文以及博文

行业新闻

  1. 微软宣布推出网络安全产品——Microsoft Security Copilot Security Copilot将目前最强大语言模型GPT-4内置在产品中,并与微软拥有65万亿个网络安全威胁的安全模型库相结合使用,为企业、个人用户提供网络安全、恶意代码防护、隐私合规监控等生成式自动化AI服务 https://thehackernews.com/2023/03/microsoft-introduces-gpt-4-ai-powered.html

  2. 行业分析:云之后,大模型是网络安全的新机会吗? https://mp.weixin.qq.com/s/nmeDrQX5dTRUT23-2sGI-g

  3. VirusTotal推出Code Insight,用生成式人工智能为威胁分析赋能 https://blog.virustotal.com/2023/04/introducing-virustotal-code-insight.html

  4. 安全大模型进入爆发期!谷歌云已接入全线安全产品|RSAC 2023 https://mp.weixin.qq.com/s/5Aywrqk7B6YCiLRbojNCuQ

  5. Facebook季度安全报告:假冒ChatGPT的恶意软件激增 https://about.fb.com/news/2023/05/metas-q1-2023-security-reports/

  6. Tenable的报告展示了生成式人工智能正在如何改变安全研究 https://venturebeat.com/security/tenable-report-shows-how-generative-ai-is-changing-security-research/

  7. 挖掘AIGC军火产业链,颠覆巨头的机会和风险 https://mp.weixin.qq.com/s/bVQYT0QqueGyLwPAppDRtg

  8. 增强ChatGPT等安全性,大语言模型界的“安全管家”开源了! https://mp.weixin.qq.com/s/PeuNht95WVJbJ8hiOOrSoA

  9. ChatGPT暗黑版——FraudGPT https://hackernoon.com/what-is-fraudgpt

实践指南

软件供应链安全

利用GPT/AIGC/LLM来进行漏洞挖掘和修复、代码质量评测、程序理解

论文

  1. Trust in Software Supply Chains: Blockchain-Enabled SBOM and the AIBOM Future https://arxiv.org/pdf/2307.02088.pdf

  2. An Empirical Study on Using Large Language Models to Analyze Software Supply Chain Security Failures https://arxiv.org/pdf/2308.04898.pdf

  3. SecureFalcon: The Next Cyber Reasoning System for Cyber Security https://arxiv.org/pdf/2307.06616.pdf

  4. Using ChatGPT as a Static Application Security Testing Tool https://arxiv.org/pdf/2308.14434.pdf

  5. A Preliminary Study on Using Large Language Models in Software Pentesting https://arxiv.org/pdf/2401.17459.pdf

  6. LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs’ Vulnerability Reasoning https://arxiv.org/pdf/2401.16185.pdf

  7. Finetuning Large Language Models for Vulnerability Detection https://arxiv.org/pdf/2401.17010.pdf

  8. Large Language Model for Vulnerability Detection: Emerging Results and Future Directions https://arxiv.org/pdf/2401.15468.pdf

  9. How Far Have We Gone in Vulnerability Detection Using Large Language Models https://arxiv.org/pdf/2311.12420.pdf

  10. LLbezpeky: Leveraging Large Language Models for Vulnerability Detection https://arxiv.org/pdf/2401.01269.pdf

  11. Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities https://arxiv.org/pdf/2311.16169.pdf

  12. Detecting software vulnerabilities using Language Models https://arxiv.org/ftp/arxiv/papers/2302/2302.11773.pdf

  13. Prompt-Enhanced Software Vulnerability Detection Using ChatGPT https://arxiv.org/pdf/2308.12697.pdf

  14. Evaluation of ChatGPT Model for Vulnerability Detection https://arxiv.org/pdf/2304.07232.pdf

  15. LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluation https://arxiv.org/pdf/2303.09384.pdf

  16. DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection https://arxiv.org/pdf/2304.00409.pdf

  17. FLAG: Finding Line Anomalies (in code) with Generative AI https://arxiv.org/pdf/2306.12643.pdf

  18. Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT https://arxiv.org/pdf/2304.02014.pdf

  19. Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models https://arxiv.org/pdf/2212.14834.pdf

  20. CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation https://arxiv.org/pdf/2402.12222.pdf

  21. When GPT Meets Program Analysis: Towards Intelligent Detection of Smart Contract Logic Vulnerabilities in GPTScan https://arxiv.org/pdf/2308.03314.pdf

  22. Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding https://arxiv.org/pdf/2309.09826.pdf

  23. Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives https://arxiv.org/pdf/2310.01152.pdf

  24. VulLibGen: Identifying Vulnerable Third-Party Libraries via Generative Pre-Trained Model https://arxiv.org/pdf/2308.04662.pdf

24.Leveraging AI Planning For Detecting Cloud Security Vulnerabilities https://arxiv.org/pdf/2402.10985.pdf

  1. InferFix: End-to-End Program Repair with LLMs https://arxiv.org/pdf/2303.07263.pdf

  2. HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion https://arxiv.org/pdf/2312.13530.pdf

  3. Fixing Hardware Security Bugs with Large Language Models https://arxiv.org/pdf/2302.01215.pdf

  4. Unlocking Hardware Security Assurancee: The Potential of LLMs https://arxiv.org/pdf/2308.11042.pdf

  5. Generating Secure Hardware using ChatGPT Resistant to CWEs 围绕硬件设计实施常见的10个CWE,分别展示了生成带缺陷代码和安全代码的提示词场景 https://eprint.iacr.org/2023/212.pdf

  6. DIVAS: An LLM-based End-to-End Framework for SoC Security Analysis and Policy-based Protection https://arxiv.org/pdf/2308.06932.pdf

  7. Examining Zero-Shot Vulnerability Repair with Large Language Models https://arxiv.org/pdf/2112.02125.pdf

  8. Practical Program Repair in the Era of Large Pre-trained Language Models https://arxiv.org/pdf/2210.14179.pdf

  9. An Analysis of the Automatic Bug Fixing Performance of ChatGPT https://arxiv.org/pdf/2301.08653.pdf

  10. Automatic Program Repair with OpenAI's Codex: Evaluating QuixBugs https://arxiv.org/pdf/2111.03922.pdf

  11. How Effective Are Neural Networks for Fixing Security Vulnerabilities https://arxiv.org/pdf/2305.18607.pdf

  12. STEAM: Simulating the InTeractive BEhavior of ProgrAMmers for Automatic Bug Fixing https://arxiv.org/pdf/2308.14460.pdf

  13. ZeroLeak: Using LLMs for Scalable and Cost Effective Side-Channel Patching https://arxiv.org/pdf/2308.13062.pdf

  14. Can LLMs Patch Security Issues? https://arxiv.org/pdf/2312.00024.pdf

  15. Better patching using LLM prompting, via Self-Consistency https://arxiv.org/pdf/2306.00108.pdf

  16. Identifying Vulnerability Patches by Comprehending Code Commits with Comprehensive Change Contexts https://arxiv.org/pdf/2310.02530.pdf

  17. Just-in-Time Security Patch Detection -- LLM At the Rescue for Data Augmentation https://arxiv.org/pdf/2312.01241.pdf

  18. Towards JavaScript program repair with generative pre-trained transformer (GPT-2) https://dl.acm.org/doi/abs/10.1145/3524459.3527350

  19. Code Security Vulnerability Repair Using Reinforcement https://arxiv.org/pdf/2401.07031.pdf

  20. Enhanced Automated Code Vulnerability Repair using Large Language Models https://arxiv.org/ftp/arxiv/papers/2401/2401.03741.pdf

  21. Repair Is Nearly Generation: Multilingual Program Repair with LLMs https://arxiv.org/pdf/2208.11640.pdf

  22. Cupid: Leveraging ChatGPT for More Accurate Duplicate Bug Report Detection https://arxiv.org/pdf/2308.10022v2.pdf

  23. Asleep at the Keyboard? Assessing the Security of GitHub Copilot's Code Contributions https://arxiv.org/pdf/2108.09293.pdf

  24. Do Users Write More Insecure Code with AI Assistants? https://arxiv.org/pdf/2211.03622.pdf

  25. How Secure is Code Generated by ChatGPT? https://arxiv.org/pdf/2304.09655.pdf

  26. Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants https://arxiv.org/pdf/2208.09727.pdf

  27. Evaluating Large Language Models Trained on Code https://arxiv.org/pdf/2107.03374.pdf

  28. No Need to Lift a Finger Anymore? Assessing the Quality of Code Generation by ChatGPT https://arxiv.org/pdf/2308.04838.pdf

  29. Assessing the Quality of GitHub Copilot's Code Generation https://dl.acm.org/doi/abs/10.1145/3558489.3559072

  30. Is GitHub's Copilot as Bad as Humans at Introducing Vulnerabilities in Code? https://arxiv.org/pdf/2204.04741.pdf

  31. How ChatGPT is Solving Vulnerability Management Problem https://arxiv.org/pdf/2311.06530.pdf

  32. Neural Code Completion Tools Can Memorize Hard-coded Credentials https://arxiv.org/pdf/2309.07639.pdf

  33. BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT https://www.ndss-symposium.org/wp-content/uploads/2023/02/NDSS2023Poster_paper_7966.pdf

  34. Teaching Large Language Models to Self-Debug https://arxiv.org/pdf/2304.05128.pdf

  35. LLM4SecHW: Leveraging Domain-Specific Large Language Model for Hardware Debugging https://browse.arxiv.org/pdf/2401.16448.pdf

  36. Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT https://arxiv.org/pdf/2304.10778.pdf

  37. Fault-Aware Neural Code Rankers https://arxiv.org/pdf/2206.03865.pdf

  38. Using Large Language Models to Enhance Programming Error Messages https://arxiv.org/pdf/2210.11630.pdf

  39. Controlling Large Language Models to Generate Secure and Vulnerable Code 引用了Asleep at the keyboard? 使用了预训练模型,对LM的输出进行pre-train以控制输出的代码是安全的还是存在漏洞的 https://arxiv.org/pdf/2302.05319.pdf

  40. Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models 针对“prompt中的微小变化可能导致漏洞”的问题,在"Asleep at the keyboard?"手动操作在基础上实现自动化发现 https://arxiv.org/pdf/2302.04012.pdf

  41. SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques 在Systematically Finding Security Vulnerabilities in Black-Box Code Generation的论文中,把这篇论文看的很重,解决模型评估的数据集的问题 https://dl.acm.org/doi/abs/10.1145/3549035.3561184

  42. Security Code Review by LLMs: A Deep Dive into Responses https://browse.arxiv.org/pdf/2401.16310.pdf

  43. Exploring the Limits of ChatGPT in Software Security Applications https://arxiv.org/pdf/2312.05275.pdf

  44. An Empirical Evaluation of LLMs for Solving Offensive Security Challenges https://arxiv.org/pdf/2402.11814.pdf

  45. Purple Llama CYBERSECEVAL: A Secure Coding Benchmark for Language Models https://arxiv.org/pdf/2312.04724.pdf

  46. Can Large Language Models Identify And Reason About Security Vulnerabilities? Not Yet https://arxiv.org/pdf/2312.12575.pdf

  47. Pop Quiz! Can a Large Language Model Help With Reverse Engineering https://arxiv.org/pdf/2202.01142.pdf

  48. CODAMOSA: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models 微软在ICSE2023上发布的论文,旨在利用LLM来缓解传统fuzz中的陷入“Coverage Plateaus”的问题 https://www.carolemieux.com/codamosa_icse23.pdf

  49. Understanding Large Language Model Based Fuzz Driver Generation https://arxiv.org/abs/2307.12469

  50. ChatGPT for Software Security: Exploring the Strengths and Limitations of ChatGPT in the Security Applications https://arxiv.org/abs/2307.12488

  51. How well does LLM generate security tests? https://arxiv.org/pdf/2310.00710.pdf

博客

  1. ChatGPT的软件包推荐是可信赖的吗? https://vulcan.io/blog/ai-hallucinations-package-risk

  2. 人工智能驱动的模糊测试:打破漏洞狩猎的障碍 https://security.googleblog.com/2023/08/ai-powered-fuzzing-breaking-bug-hunting.html

  3. 用GPT做静态代码扫描实现自动化漏洞挖掘思路分享 https://mp.weixin.qq.com/s/Masyfq12cjaM4Zn6qxvGoA

  4. 用 LLM 降低白盒误报及自动修复漏洞代码 https://mp.weixin.qq.com/s/leLFECUaNOGbjsN_8mcXrQ

  5. ChatGPT在源代码分析中可靠吗? https://mp.weixin.qq.com/s/Ix2lArBzaCAJr5nyGolCwQ

  6. ChatGPT在代码中定位缺陷足够好吗? https://pvs-studio.com/en/blog/posts/1035/

  7. ChatGPT+RASP,实现CodeQL漏洞挖掘高效自动化 https://mp.weixin.qq.com/s/xlUWn2oWU51NVkgB157pRw

  8. ChatGPTScan:使用ChatGPTScan批量进行代码审计 https://mp.weixin.qq.com/s/QIKvRzNlAKiqh_UMOMfDdg

  9. Kondukto产品利用OpenAI来修复代码漏洞 https://kondukto.io/blog/kondukto-openai-chatgpt

  10. 利用GPT-4进行调试和漏洞修复 https://www.sitepoint.com/gpt-4-for-debugging/

  11. 开源项目的神奇助理:让AI扮演代码检察员,加速高质量PR review https://mp.weixin.qq.com/s/7WeMbWDUghyS5kSBiZhYYA

  12. GPT-4 Jailbreak and Hacking via Rabbithole Attack, Prompt Injection, Content Modderation Bypass and Weaponizing AI https://adversa.ai/blog/gpt-4-hacking-and-jailbreaking-via-rabbithole-attack-plus-prompt-injection-content-moderation-bypass-weaponizing-ai/

  13. 我是如何用GPT自动化生成Nuclei的POC https://mp.weixin.qq.com/s/Z8cTUItmbwuWbRTAU_Y3pg

威胁检测

利用GPT/AIGC/LLM来完成恶意软件、网络攻击等威胁检测

论文

  1. Static Malware Detection Using Stacked BiLSTM and GPT-2 https://ieeexplore.ieee.org/document/9785789

  2. FlowTransformer: A Transformer Framework for Flow-based Network Intrusion Detection Systems https://arxiv.org/pdf/2304.14746.pdf

  3. ChatGPT for Digital Forensic Investigation: The Good, The Bad, and The Unknown https://arxiv.org/pdf/2307.10195.pdf

  4. Devising and Detecting Phishing: large language models vs. Smaller Human Models https://arxiv.org/pdf/2308.12287.pdf

  5. Anatomy of an AI-powered malicious social botnet https://arxiv.org/pdf/2307.16336.pdf

  6. Revolutionizing Cyber Threat Detection with Large Language Models https://arxiv.org/pdf/2306.14263.pdf

  7. LLM in the Shell: Generative Honeypots 借助LLM打造智能的高交互蜜罐 https://arxiv.org/pdf/2309.00155.pdf

  8. What Does an LLM-Powered Threat Intelligence Program Look Like? http://i.blackhat.com/BH-US-23/Presentations/US-23-Grof-Miller-LLM-Powered-TI-Program.pdf

博客

  1. ChatGPT赋能的威胁分析——使用ChatGPT为每个npm, PyPI包检查安全问题,包括信息渗透、SQL注入漏洞、凭证泄露、提权、后门、恶意安装、预设指令污染等威胁 https://socket.dev/blog/introducing-socket-ai-chatgpt-powered-threat-analysis

安全运营

利用GPT/AIGC/LLM来辅助安全运营/SOAR/SIEM

论文

  1. GPT-2C: A GPT-2 Parser for Cowrie Honeypot Logs https://arxiv.org/pdf/2109.06595.pdf

  2. On the Uses of Large Language Models to Interpret Ambiguous Cyberattack Descriptions https://arxiv.org/pdf/2306.14062.pdf

  3. From Text to MITRE Techniques: Exploring the Malicious Use of Large Language Models for Generating Cyber Attack Payloads https://arxiv.org/pdf/2305.15336.pdf

  4. Automated CVE Analysis for Threat Prioritization and Impact Prediction https://arxiv.org/pdf/2309.03040.pdf

  5. LogGPT: Log Anomaly Detection via GPT https://browse.arxiv.org/pdf/2309.14482.pdf

  6. Cyber Sentinel: Exploring Conversational Agents’ Role in Streamlining Security Tasks with GPT-4 https://browse.arxiv.org/pdf/2309.16422.pdf

  7. AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation https://arxiv.org/pdf/2310.02655.pdf

  8. ChatGPT, Llama, can you write my report? An experiment on assisted digital forensics reports written using (Local) Large Language Models https://arxiv.org/pdf/2312.14607.pdf

  9. HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs) https://arxiv.org/pdf/2309.16021.pdf

  10. Benchmarking Large Language Models for Log Analysis, Security, and Interpretation https://arxiv.org/pdf/2311.14519.pdf

博客

  1. Elastic公司发布的"与ChatGPT探索安全的未来" 提出了6个构想:(1) 聊天机器人协助事件响应 (2) 威胁报告生成 (3) 自然语言检索 (4) 异常检测 (5) 安全策略问答机器人 (6) 告警排序。 https://www.elastic.co/cn/security-labs/exploring-applications-of-chatgpt-to-improve-detection-response-and-understanding

  2. ChatGPT在安全运营中的应用初探 结论——ChatGPT可以赋能包括事件分析与响应在内的多种安全运营过程,降低对本就不足的预置安全知识的依赖,并能促进有价值的安全知识的产生和积累,从而帮助安全运营团队更准确地做出决策、实施响应、积累经验,尤其对初级安全工程师有辅导作用。当然,现阶段以及未来一段时间,ChatGPT等高级AI驱动的聊天机器人还无法完全取代人类分析师,更多是提供辅助决策与操作支持。相信随着持续高强度的人机会话互动,再借助更大规模更专业(网络安全运营领域)的语料库训练,ChatGPT会不断强化自己的能力,不断减轻人类安全分析师的工作负担。 https://www.secrss.com/articles/51775

  3. 利用Chat GPT和D3的AI辅助事件响应 探讨将ChatGPT与Smart SOAR整合的好处;提供样例分析——使用MITRE TTPs和微软端点防御系统警报中发现的恶意软件家族以收集事件的背景信息,之后问ChatGPT,让它根据对TTP和恶意软件的了解,描述攻击者接下来可能会采取什么措施、恶意软件可能利用什么漏洞。 https://www.163.com/dy/article/I48DBHHG055633FJ.html

  4. Blink公司推出有史以来第一个用于自动化安全和IT运营工作流程的生成式人工智能 https://www.blinkops.com/blog/introducing-blink-copilot-the-first-generative-ai-for-security-workflows

GPT自身安全

关于GPT/AIGC/LLM等大模型技术自身的各类安全风险以及漏洞,大模型技术滥用与误用的可能性

论文

  1. GPT-4 Technical Report OpenAI对模型自身安全评估和缓解 https://arxiv.org/abs/2303.08774

  2. Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models 提出一个未来的研究方向:在特定的使用情境下,保障大语言模型的安全可靠性需要发展什么类型的测试? https://arxiv.org/pdf/2102.02503.pdf

  3. Taxonomy of Risks posed by Language Models https://dl.acm.org/doi/10.1145/3531146.3533088

  4. ChatGPT Security Risks: A Guide for Cyber Security Professionals https://www.cybertalk.org/wp-content/uploads/2023/03/ChatGPT_eBook_CP_CT_2023.pdf

  5. A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation https://arxiv.org/pdf/2305.11391.pdf

  6. The (ab)use of Open Source Code to Train Large Language Models https://arxiv.org/pdf/2302.13681.pdf

  7. GPT in Sheep’s Clothing: The Risk of Customized GPTs https://arxiv.org/pdf/2401.09075.pdf

  8. In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT https://arxiv.org/pdf/2304.08979.pdf

  9. DECODINGTRUST: A Comprehensive Assessment of Trustworthiness in GPT Models https://arxiv.org/pdf/2306.11698.pdf

  10. On the Trustworthiness Landscape of State-of-the-art Generative Models: A Comprehensive Survey https://arxiv.org/pdf/2307.16680.pdf

  11. LLM Censorship: A Machine Learning Challenge or a Computer Security Problem? https://arxiv.org/pdf/2307.10719.pdf

  12. Ignore Previous Prompt: Attack Techniques For Language Models ML Safety Workshop NeurIPS 2022,以及提示注入的开山之作 https://arxiv.org/pdf/2211.09527.pdf

  13. Boosting Big Brother: Attacking Search Engines with Encodings 攻击测试了集成OpenAI GPT-4模型的必应搜索引擎 https://arxiv.org/pdf/2304.14031.pdf

  14. Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System https://arxiv.org/pdf/2309.04858.pdf

  15. More than you've asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models 间接提示注入的开山之作,里面很多场景都已成为现实 https://arxiv.org/pdf/2302.12173.pdf

  16. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models https://arxiv.org/pdf/2009.11462.pdf

  17. Jailbroken: How Does LLM Safety Training Fail? https://arxiv.org/pdf/2307.02483.pdf

  18. FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models https://arxiv.org/pdf/2309.05274.pdf

  19. GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts https://arxiv.org/pdf/2309.10253.pdf

  20. Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks https://arxiv.org/pdf/2302.05733.pdf

  21. Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection https://arxiv.org/pdf/2302.12173.pdf

  22. Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications https://arxiv.org/pdf/2311.16153.pdf

  23. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned https://arxiv.org/pdf/2209.07858.pdf

  24. Explore, Establish, Exploit: Red Teaming Language Models from Scratch https://arxiv.org/pdf/2306.09442.pdf

  25. Beyond the Safeguards: Exploring the Security Risks of ChatGPT https://arxiv.org/pdf/2305.08005.pdf

  26. ProPILE: Probing Privacy Leakage in Large Language Models https://arxiv.org/pdf/2307.01881.pdf

  27. Analyzing Leakage of Personally Identifiable Information in Language Models https://ieeexplore.ieee.org/document/10179300

  28. Can We Generate Shellcodes via Natural Language? An Empirical Study https://link.springer.com/article/10.1007/s10515-022-00331-3

  29. RatGPT: Turning online LLMs into Proxies for Malware Attacks https://arxiv.org/pdf/2308.09183.pdf

  30. Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations https://arxiv.org/pdf/2306.09255.pdf

  31. BadPrompt: Backdoor Attacks on Continuous Prompts 南开大学在NeurIPS 2022上发表的论文,小样本场景下通过连续prompt对大模型后门攻击 https://arxiv.org/pdf/2211.14719.pdf

  32. LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors https://arxiv.org/pdf/2308.13904.pdf

  33. A Comprehensive Overview of Backdoor Attacks in Large Language Models within Communication Networks https://arxiv.org/pdf/2308.14367.pdf

  34. BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models https://arxiv.org/pdf/2401.12242.pdf

  35. Universal and Transferable Adversarial Attacks on Aligned Language Models https://arxiv.org/pdf/2307.15043.pdf https://llm-attacks.org

  36. Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment https://arxiv.org/pdf/2308.09662.pdf

  37. Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models https://arxiv.org/pdf/2310.00322.pdf

  38. A LLM Assisted Exploitation of AI-Guardian https://arxiv.org/pdf/2307.15008.pdf

  39. The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness https://arxiv.org/pdf/2401.00287.pdf

  40. GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher 用摩斯、凯撒等加密密码,可向ChatGPT询问非法内容 https://arxiv.org/pdf/2308.06463.pdf

  41. Model Leeching: An Extraction Attack Targeting LLMs https://arxiv.org/pdf/2309.10544.pdf

  42. "Do Anything Now": Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models https://arxiv.org/pdf/2308.03825.pdf

  43. Tree of Attacks: Jailbreaking Black-Box LLMs Automatically https://arxiv.org/pdf/2312.02119.pdf

  44. LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI's ChatGPT Plugins 主页:https://llm-platform-security.github.io/chatgpt-plugin-eval/ 开源项目:chatgpt-plugin-eval https://arxiv.org/pdf/2309.10254v1.pdf

  45. Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition [https://aclanthology.org/2023.emnlp-main.302.pdf]

  46. ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger [https://arxiv.org/pdf/2304.14475.pdf]

  47. JADE: A Linguistics-based Safety Evaluation Platform for Large Language Models [https://arxiv.org/pdf/2311.00286.pdf]

  48. Robust Safety Classifier for Large Language Models: Adversarial Prompt Shield [https://arxiv.org/pdf/2311.00172.pdf]

  49. BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B [https://arxiv.org/pdf/2311.00117.pdf]

  50. Constitutional AI: Harmlessness from AI Feedback [https://arxiv.org/abs/2212.08073.pdf]

  51. Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models [https://arxiv.org/abs/2312.09669.pdf]

  52. Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs [https://arxiv.org/abs/2308.13387.pdf]

  53. Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models [https://arxiv.org/abs/2305.01219.pdf]

  54. Detecting Language Model Attacks with Perplexity [https://arxiv.org/abs/2308.14132v3.pdf]

博客

  1. OWASP发布了Top 10 for Large Language Model Applications项目,旨在教育开发人员、设计师、架构师、经理和组织了解部署和管理大语言模型LLM时的潜在安全风险,该项目提供了一个常见于LLM应用中的十个最关键漏洞的列表。 https://owasp.org/www-project-top-10-for-large-language-model-applications/

  2. 干货分享!Langchain框架Prompt Injection在野0day漏洞分析 https://mp.weixin.qq.com/s/wFJ8TPBiS74RzjeNk7lRsw

  3. 通过提示注入在 MathGPT 中实现代码执行 公开可用的 MathGPT 借助底层的GPT-3模型来回答用户生成的数学问题。最近的研究和实验表明,GPT-3 等 大模型在直接进行数学计算的任务上表现不佳,然而能够更准确地生成问题解决方案的可执行代码。 因此 MathGPT 将用户的自然语言问题转换为 Python 代码,执行计算后的代码和答案会显示给用户。某些 LLM 可能容易受到提示词注入攻击,恶意用户输入会导致模型执行意外行为[3][4]。 在此事件中,攻击者探索了几种提示词覆盖途径,生成的代码最终导致攻击者获得应用程序主机系统的环境变量和应用程序的 GPT-3 API 密钥的访问权限,并执行拒绝服务攻击。 因此,攻击者可能会耗尽应用程序的 API 查询预算或关闭应用程序。 https://atlas.mitre.org/studies/AML.CS0016/

  4. 用ChatGPT来生成编码器与配套WebShell antsword官方出品 https://mp.weixin.qq.com/s/I9IhkZZ3YrxblWIxWMXAWA

  5. 使用ChatGPT来生成钓鱼邮件和钓鱼网站 相比其他仅生成钓鱼邮件,这里把钓鱼网站也生成了 https://www.richardosgood.com/posts/using-openai-chat-for-phishing/

  6. Using GPT-Eliezer against ChatGPT Jailbreaking 检测对抗性提示词 https://www.alignmentforum.org/posts/pNcFYZnPdXyL2RfgA/using-gpt-eliezer-against-chatgpt-jailbreaking

  7. LLM中的安全隐患-以VirusTotal Code Insight中的提示注入为例 https://mp.weixin.qq.com/s/U2yPGOmzlvlF6WeNd7B7ww

  8. ChatGPT羊驼家族全沦陷!CMU博士击破LLM护栏,人类毁灭计划脱口而出 CMU和人工智能安全中心的研究人员发现,只要通过附加一系列特定的无意义token,就能生成一个神秘的prompt后缀。由此,任何人都可以轻松破解LLM的安全措施,生成无限量的有害内容。 论文地址:https://arxiv.org/abs/2307.15043 代码地址:https://github.com/llm-attacks/llm-attacks https://mp.weixin.qq.com/s/298nwP98UdRNybV2Fuo6Wg

大模型隐私保护

以隐私计算等技术保护GPT/AIGC/LLM等大模型

论文

  1. Challenges and Remedies to Privacy and Security in AIGC: Exploring the Potential of Privacy Computing, Blockchain, and Beyond https://arxiv.org/pdf/2306.00419.pdf

  2. GenAIPABench: A Benchmark for Generative AI-based Privacy Assistants https://arxiv.org/pdf/2309.05138.pdf

  3. Recovering from Privacy-Preserving Masking with Large Language Models https://arxiv.org/pdf/2309.08628.pdf

  4. Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection https://arxiv.org/pdf/2309.03057.pdf

  5. Knowledge Sanitization of Large Language Models https://arxiv.org/pdf/2309.11852.pdf

  6. PrivateLoRA For Efficient Privacy Preserving LLM https://arxiv.org/pdf/2311.14030.pdf

  7. Don’t forget private retrieval: distributed private similarity search for large language models https://arxiv.org/pdf/2311.12955.pdf

以安全数据训练GPT

收集安全数据训练或增强GPT/AIGC/LLM等大模型技术

论文

  1. Web Content Filtering through knowledge distillation of Large Language Models https://arxiv.org/pdf/2305.05027.pdf

  2. HackMentor: Fine-Tuning Large Language Models for Cybersecurity 论文发布:https://github.com/tmylla/HackMentor/blob/main/HackMentor.pdf 项目地址:https://github.com/tmylla/HackMentor

  3. LLMs Perform Poorly at Concept Extraction in Cyber-security Research Literature https://arxiv.org/pdf/2312.07110.pdf

  4. netFound: Foundation Model for Network Security 网络安全基础大模型 https://arxiv.org/pdf/2310.17025.pdf

博客

  1. ChatGPT:和黑客知识库聊天 (1) 从prompt到自训练数据原文反向索引的准确性;(2) openai提供模型的微调服务的尝试;(3) 其他可替代性模型总结;(4) 围绕markdown格式的数据集解析和分块索引的脚本示例; (5) 相似索引向量引擎推荐。 https://mp.weixin.qq.com/s/dteH4oP24qGY-4l3xSl7Vg

  2. 如何训练自己的“安全大模型” https://mp.weixin.qq.com/s/801sV5a7-wOh_1EN3U64-Q

Last updated