Security Papers

"在安全领域应用GPT/AIGC/LLM的论文以及博文"

Posted by Clouditera on December 29, 2023

行业新闻

  1. 微软宣布推出网络安全产品——Microsoft Security Copilot
    Security Copilot将目前最强大语言模型GPT-4内置在产品中,并与微软拥有65万亿个网络安全威胁的安全模型库相结合使用,为企业、个人用户提供网络安全、恶意代码防护、隐私合规监控等生成式自动化AI服务
    https://thehackernews.com/2023/03/microsoft-introduces-gpt-4-ai-powered.html
  2. 在网络安全最佳实践中集成ChatGPT和生成式AI
    https://www.sentinelone.com/blog/integrating-chatgpt-generative-ai-within-cybersecurity-best-practices/
  3. 行业分析:云之后,大模型是网络安全的新机会吗?
    https://mp.weixin.qq.com/s/nmeDrQX5dTRUT23-2sGI-g
  4. VirusTotal推出Code Insight,用生成式人工智能为威胁分析赋能
    https://blog.virustotal.com/2023/04/introducing-virustotal-code-insight.html
  5. 安全大模型进入爆发期!谷歌云已接入全线安全产品|RSAC 2023
    https://mp.weixin.qq.com/s/5Aywrqk7B6YCiLRbojNCuQ
  6. Facebook季度安全报告:假冒ChatGPT的恶意软件激增
    https://about.fb.com/news/2023/05/metas-q1-2023-security-reports/
  7. Tenable的报告展示了生成式人工智能正在如何改变安全研究
    https://venturebeat.com/security/tenable-report-shows-how-generative-ai-is-changing-security-research/
  8. 挖掘AIGC军火产业链,颠覆巨头的机会和风险
    https://mp.weixin.qq.com/s/bVQYT0QqueGyLwPAppDRtg
  9. 增强ChatGPT等安全性,大语言模型界的“安全管家”开源了!
    https://mp.weixin.qq.com/s/PeuNht95WVJbJ8hiOOrSoA
  10. ChatGPT暗黑版——FraudGPT
    https://hackernoon.com/what-is-fraudgpt

实践指南

  1. 安全人工智能系统开发指南
  2. Guidelines for secure AI system development

软件供应链安全

利用GPT/AIGC/LLM来进行漏洞挖掘和修复、代码质量评测、程序理解

论文

  1. Trust in Software Supply Chains: Blockchain-Enabled SBOM and the AIBOM Future
    https://arxiv.org/pdf/2307.02088.pdf
  2. An Empirical Study on Using Large Language Models to Analyze Software Supply Chain Security Failures
    https://arxiv.org/pdf/2308.04898.pdf
  3. SecureFalcon: The Next Cyber Reasoning System for Cyber Security
    https://arxiv.org/pdf/2307.06616.pdf
  4. Using ChatGPT as a Static Application Security Testing Tool
    https://arxiv.org/pdf/2308.14434.pdf
  5. A Preliminary Study on Using Large Language Models in Software Pentesting
    https://arxiv.org/pdf/2401.17459.pdf
  6. LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs’ Vulnerability Reasoning
    https://arxiv.org/pdf/2401.16185.pdf
  7. Finetuning Large Language Models for Vulnerability Detection
    https://arxiv.org/pdf/2401.17010.pdf
  8. Large Language Model for Vulnerability Detection: Emerging Results and Future Directions
    https://arxiv.org/pdf/2401.15468.pdf
  9. How Far Have We Gone in Vulnerability Detection Using Large Language Models
    https://arxiv.org/pdf/2311.12420.pdf
  10. LLbezpeky: Leveraging Large Language Models for Vulnerability Detection
    https://arxiv.org/pdf/2401.01269.pdf
  11. Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities
    https://arxiv.org/pdf/2311.16169.pdf
  12. Detecting software vulnerabilities using Language Models
    https://arxiv.org/ftp/arxiv/papers/2302/2302.11773.pdf
  13. Prompt-Enhanced Software Vulnerability Detection Using ChatGPT
    https://arxiv.org/pdf/2308.12697.pdf
  14. Evaluation of ChatGPT Model for Vulnerability Detection
    https://arxiv.org/pdf/2304.07232.pdf
  15. LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluation
    https://arxiv.org/pdf/2303.09384.pdf
  16. DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection
    https://arxiv.org/pdf/2304.00409.pdf
  17. FLAG: Finding Line Anomalies (in code) with Generative AI
    https://arxiv.org/pdf/2306.12643.pdf
  18. Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT
    https://arxiv.org/pdf/2304.02014.pdf
  19. Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models
    https://arxiv.org/pdf/2212.14834.pdf
  20. CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation
    https://arxiv.org/pdf/2402.12222.pdf
  21. When GPT Meets Program Analysis: Towards Intelligent Detection of Smart Contract Logic Vulnerabilities in GPTScan
    https://arxiv.org/pdf/2308.03314.pdf
  22. Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding
    https://arxiv.org/pdf/2309.09826.pdf
  23. Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives
    https://arxiv.org/pdf/2310.01152.pdf
  24. VulLibGen: Identifying Vulnerable Third-Party Libraries via Generative Pre-Trained Model
    https://arxiv.org/pdf/2308.04662.pdf
  25. Leveraging AI Planning For Detecting Cloud Security Vulnerabilities
    https://arxiv.org/pdf/2402.10985.pdf
  26. InferFix: End-to-End Program Repair with LLMs
    https://arxiv.org/pdf/2303.07263.pdf
  27. HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion
    https://arxiv.org/pdf/2312.13530.pdf
  28. Fixing Hardware Security Bugs with Large Language Models
    https://arxiv.org/pdf/2302.01215.pdf
  29. Unlocking Hardware Security Assurancee: The Potential of LLMs
    https://arxiv.org/pdf/2308.11042.pdf
  30. Generating Secure Hardware using ChatGPT Resistant to CWEs
    围绕硬件设计实施常见的10个CWE,分别展示了生成带缺陷代码和安全代码的提示词场景
    https://eprint.iacr.org/2023/212.pdf
  31. DIVAS: An LLM-based End-to-End Framework for SoC Security Analysis and Policy-based Protection
    https://arxiv.org/pdf/2308.06932.pdf
  32. Examining Zero-Shot Vulnerability Repair with Large Language Models
    https://arxiv.org/pdf/2112.02125.pdf
  33. Practical Program Repair in the Era of Large Pre-trained Language Models
    https://arxiv.org/pdf/2210.14179.pdf
  34. An Analysis of the Automatic Bug Fixing Performance of ChatGPT
    https://arxiv.org/pdf/2301.08653.pdf
  35. Automatic Program Repair with OpenAI’s Codex: Evaluating QuixBugs
    https://arxiv.org/pdf/2111.03922.pdf
  36. How Effective Are Neural Networks for Fixing Security Vulnerabilities
    https://arxiv.org/pdf/2305.18607.pdf
  37. STEAM: Simulating the InTeractive BEhavior of ProgrAMmers for Automatic Bug Fixing
    https://arxiv.org/pdf/2308.14460.pdf
  38. ZeroLeak: Using LLMs for Scalable and Cost Effective Side-Channel Patching
    https://arxiv.org/pdf/2308.13062.pdf
  39. Can LLMs Patch Security Issues?
    https://arxiv.org/pdf/2312.00024.pdf
  40. Better patching using LLM prompting, via Self-Consistency
    https://arxiv.org/pdf/2306.00108.pdf
  41. Identifying Vulnerability Patches by Comprehending Code Commits with Comprehensive Change Contexts
    https://arxiv.org/pdf/2310.02530.pdf
  42. Just-in-Time Security Patch Detection – LLM At the Rescue for Data Augmentation
    https://arxiv.org/pdf/2312.01241.pdf
  43. Towards JavaScript program repair with generative pre-trained transformer (GPT-2)
    https://dl.acm.org/doi/abs/10.1145/3524459.3527350
  44. Code Security Vulnerability Repair Using Reinforcement
    https://arxiv.org/pdf/2401.07031.pdf
  45. Enhanced Automated Code Vulnerability Repair using Large Language Models
    https://arxiv.org/ftp/arxiv/papers/2401/2401.03741.pdf
  46. Repair Is Nearly Generation: Multilingual Program Repair with LLMs
    https://arxiv.org/pdf/2208.11640.pdf
  47. Cupid: Leveraging ChatGPT for More Accurate Duplicate Bug Report Detection
    https://arxiv.org/pdf/2308.10022v2.pdf
  48. Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions
    https://arxiv.org/pdf/2108.09293.pdf
  49. Do Users Write More Insecure Code with AI Assistants?
    https://arxiv.org/pdf/2211.03622.pdf
  50. How Secure is Code Generated by ChatGPT?
    https://arxiv.org/pdf/2304.09655.pdf
  51. Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants
    https://arxiv.org/pdf/2208.09727.pdf
  52. Evaluating Large Language Models Trained on Code
    https://arxiv.org/pdf/2107.03374.pdf
  53. No Need to Lift a Finger Anymore? Assessing the Quality of Code Generation by ChatGPT
    https://arxiv.org/pdf/2308.04838.pdf
  54. Assessing the Quality of GitHub Copilot’s Code Generation
    https://dl.acm.org/doi/abs/10.1145/3558489.3559072
  55. Is GitHub’s Copilot as Bad as Humans at Introducing Vulnerabilities in Code?
    https://arxiv.org/pdf/2204.04741.pdf
  56. How ChatGPT is Solving Vulnerability Management Problem
    https://arxiv.org/pdf/2311.06530.pdf
  57. Neural Code Completion Tools Can Memorize Hard-coded Credentials
    https://arxiv.org/pdf/2309.07639.pdf
  58. BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT
    https://www.ndss-symposium.org/wp-content/uploads/2023/02/NDSS2023Poster_paper_7966.pdf
  59. Teaching Large Language Models to Self-Debug
    https://arxiv.org/pdf/2304.05128.pdf
  60. LLM4SecHW: Leveraging Domain-Specific Large Language Model for Hardware Debugging
    https://browse.arxiv.org/pdf/2401.16448.pdf
  61. Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT
    https://arxiv.org/pdf/2304.10778.pdf
  62. Fault-Aware Neural Code Rankers
    https://arxiv.org/pdf/2206.03865.pdf
  63. Using Large Language Models to Enhance Programming Error Messages
    https://arxiv.org/pdf/2210.11630.pdf
  64. Controlling Large Language Models to Generate Secure and Vulnerable Code
    引用了Asleep at the keyboard? 使用了预训练模型,对LM的输出进行pre-train以控制输出的代码是安全的还是存在漏洞的
    https://arxiv.org/pdf/2302.05319.pdf
  65. Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models
    针对“prompt中的微小变化可能导致漏洞”的问题,在”Asleep at the keyboard?”手动操作在基础上实现自动化发现
    https://arxiv.org/pdf/2302.04012.pdf
  66. SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques
    在Systematically Finding Security Vulnerabilities in Black-Box Code Generation的论文中,把这篇论文看的很重,解决模型评估的数据集的问题
    https://dl.acm.org/doi/abs/10.1145/3549035.3561184
  67. Security Code Review by LLMs: A Deep Dive into Responses
    https://browse.arxiv.org/pdf/2401.16310.pdf
  68. Exploring the Limits of ChatGPT in Software Security Applications
    https://arxiv.org/pdf/2312.05275.pdf
  69. An Empirical Evaluation of LLMs for Solving Offensive Security Challenges
    https://arxiv.org/pdf/2402.11814.pdf
  70. Purple Llama CYBERSECEVAL: A Secure Coding Benchmark for Language Models
    https://arxiv.org/pdf/2312.04724.pdf
  71. Can Large Language Models Identify And Reason About Security Vulnerabilities? Not Yet
    https://arxiv.org/pdf/2312.12575.pdf
  72. Pop Quiz! Can a Large Language Model Help With Reverse Engineering
    https://arxiv.org/pdf/2202.01142.pdf
  73. CODAMOSA: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models
    微软在ICSE2023上发布的论文,旨在利用LLM来缓解传统fuzz中的陷入“Coverage Plateaus”的问题
    https://www.carolemieux.com/codamosa_icse23.pdf
  74. Understanding Large Language Model Based Fuzz Driver Generation
    https://arxiv.org/abs/2307.12469
  75. ChatGPT for Software Security: Exploring the Strengths and Limitations of ChatGPT in the Security Applications
    https://arxiv.org/abs/2307.12488
  76. How well does LLM generate security tests?
    https://arxiv.org/pdf/2310.00710.pdf

博客

  1. ChatGPT的软件包推荐是可信赖的吗?
    https://vulcan.io/blog/ai-hallucinations-package-risk
  2. 人工智能驱动的模糊测试:打破漏洞狩猎的障碍
    https://security.googleblog.com/2023/08/ai-powered-fuzzing-breaking-bug-hunting.html
  3. 用GPT做静态代码扫描实现自动化漏洞挖掘思路分享
    https://mp.weixin.qq.com/s/Masyfq12cjaM4Zn6qxvGoA
  4. 用 LLM 降低白盒误报及自动修复漏洞代码
    https://mp.weixin.qq.com/s/leLFECUaNOGbjsN_8mcXrQ
  5. ChatGPT在源代码分析中可靠吗?
    https://mp.weixin.qq.com/s/Ix2lArBzaCAJr5nyGolCwQ
  6. ChatGPT在代码中定位缺陷足够好吗?
    https://pvs-studio.com/en/blog/posts/1035/
  7. ChatGPT+RASP,实现CodeQL漏洞挖掘高效自动化
    https://mp.weixin.qq.com/s/xlUWn2oWU51NVkgB157pRw
  8. ChatGPTScan:使用ChatGPTScan批量进行代码审计
    https://mp.weixin.qq.com/s/QIKvRzNlAKiqh_UMOMfDdg
  9. 应用GPT-4 于 Semgrep 中以指出误报和修复代码
    https://semgrep.dev/blog/2023/gpt4-and-semgrep-detailed?utm_source=twitter\&utm_medium=social\&utm_campaign=brand
  10. Kondukto产品利用OpenAI来修复代码漏洞
    https://kondukto.io/blog/kondukto-openai-chatgpt
  11. 利用ChatGPT来进行代码审计
    https://research.nccgroup.com/2023/02/09/security-code-review-with-chatgpt/
  12. 利用GPT-3在单个代码仓库中找到213个安全漏洞
    https://betterprogramming.pub/i-used-gpt-3-to-find-213-security-vulnerabilities-in-a-single-codebase-cc3870ba9411
  13. 利用GPT-4进行调试和漏洞修复
    https://www.sitepoint.com/gpt-4-for-debugging/
  14. 开源项目的神奇助理:让AI扮演代码检察员,加速高质量PR review
    https://mp.weixin.qq.com/s/7WeMbWDUghyS5kSBiZhYYA
  15. 用ChatGPT生成测试数据
    https://mp.weixin.qq.com/s/tE09X5Fce-PQs1urpJGWAg
  16. 黑客可能利用ChatGPT的方式
    https://cybernews.com/security/hackers-exploit-chatgpt/
  17. GPT-4 Jailbreak and Hacking via Rabbithole Attack, Prompt Injection, Content Modderation Bypass and Weaponizing AI
    https://adversa.ai/blog/gpt-4-hacking-and-jailbreaking-via-rabbithole-attack-plus-prompt-injection-content-moderation-bypass-weaponizing-ai/
  18. 我是如何用GPT自动化生成Nuclei的POC
    https://mp.weixin.qq.com/s/Z8cTUItmbwuWbRTAU_Y3pg
  19. ChatGPT 生成代码的安全漏洞
    https://www.trendmicro.com/ja_jp/devops/23/e/chatgpt-security-vulnerabilities.html

威胁检测

利用GPT/AIGC/LLM来完成恶意软件、网络攻击等威胁检测

论文

  1. Static Malware Detection Using Stacked BiLSTM and GPT-2
    https://ieeexplore.ieee.org/document/9785789
  2. FlowTransformer: A Transformer Framework for Flow-based Network Intrusion Detection Systems
    https://arxiv.org/pdf/2304.14746.pdf
  3. ChatGPT for Digital Forensic Investigation: The Good, The Bad, and The Unknown
    https://arxiv.org/pdf/2307.10195.pdf
  4. Devising and Detecting Phishing: large language models vs. Smaller Human Models
    https://arxiv.org/pdf/2308.12287.pdf
  5. Anatomy of an AI-powered malicious social botnet
    https://arxiv.org/pdf/2307.16336.pdf
  6. Revolutionizing Cyber Threat Detection with Large Language Models
    https://arxiv.org/pdf/2306.14263.pdf
  7. LLM in the Shell: Generative Honeypots
    借助LLM打造智能的高交互蜜罐
    https://arxiv.org/pdf/2309.00155.pdf
  8. What Does an LLM-Powered Threat Intelligence Program Look Like?
    http://i.blackhat.com/BH-US-23/Presentations/US-23-Grof-Miller-LLM-Powered-TI-Program.pdf

博客

  1. IoC detection experiments with ChatGPT
    https://securelist.com/ioc-detection-experiments-with-chatgpt/108756/
  2. ChatGPT赋能的威胁分析——使用ChatGPT为每个npm, PyPI包检查安全问题,包括信息渗透、SQL注入漏洞、凭证泄露、提权、后门、恶意安装、预设指令污染等威胁
    https://socket.dev/blog/introducing-socket-ai-chatgpt-powered-threat-analysis

安全运营

利用GPT/AIGC/LLM来辅助安全运营/SOAR/SIEM

论文

  1. GPT-2C: A GPT-2 Parser for Cowrie Honeypot Logs
    https://arxiv.org/pdf/2109.06595.pdf
  2. On the Uses of Large Language Models to Interpret Ambiguous Cyberattack Descriptions
    https://arxiv.org/pdf/2306.14062.pdf
  3. From Text to MITRE Techniques: Exploring the Malicious Use of Large Language Models for Generating Cyber Attack Payloads
    https://arxiv.org/pdf/2305.15336.pdf
  4. Automated CVE Analysis for Threat Prioritization and Impact Prediction
    https://arxiv.org/pdf/2309.03040.pdf
  5. LogGPT: Log Anomaly Detection via GPT
    https://browse.arxiv.org/pdf/2309.14482.pdf
  6. Cyber Sentinel: Exploring Conversational Agents’ Role in Streamlining Security Tasks with GPT-4
    https://browse.arxiv.org/pdf/2309.16422.pdf
  7. AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation
    https://arxiv.org/pdf/2310.02655.pdf
  8. ChatGPT, Llama, can you write my report? An experiment on assisted digital forensics reports written using (Local) Large Language Models
    https://arxiv.org/pdf/2312.14607.pdf
  9. HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs)
    https://arxiv.org/pdf/2309.16021.pdf
  10. Benchmarking Large Language Models for Log Analysis, Security, and Interpretation
    https://arxiv.org/pdf/2311.14519.pdf

博客

  1. Elastic公司发布的”与ChatGPT探索安全的未来”
    提出了6个构想:(1) 聊天机器人协助事件响应 (2) 威胁报告生成 (3) 自然语言检索 (4) 异常检测 (5) 安全策略问答机器人 (6) 告警排序。
    https://www.elastic.co/cn/security-labs/exploring-applications-of-chatgpt-to-improve-detection-response-and-understanding
  2. ChatGPT在安全运营中的应用初探
    结论——ChatGPT可以赋能包括事件分析与响应在内的多种安全运营过程,降低对本就不足的预置安全知识的依赖,并能促进有价值的安全知识的产生和积累,从而帮助安全运营团队更准确地做出决策、实施响应、积累经验,尤其对初级安全工程师有辅导作用。当然,现阶段以及未来一段时间,ChatGPT等高级AI驱动的聊天机器人还无法完全取代人类分析师,更多是提供辅助决策与操作支持。相信随着持续高强度的人机会话互动,再借助更大规模更专业(网络安全运营领域)的语料库训练,ChatGPT会不断强化自己的能力,不断减轻人类安全分析师的工作负担。
    https://www.secrss.com/articles/51775
  3. 利用Chat GPT和D3的AI辅助事件响应
    探讨将ChatGPT与Smart SOAR整合的好处;提供样例分析——使用MITRE TTPs和微软端点防御系统警报中发现的恶意软件家族以收集事件的背景信息,之后问ChatGPT,让它根据对TTP和恶意软件的了解,描述攻击者接下来可能会采取什么措施、恶意软件可能利用什么漏洞。
    https://www.163.com/dy/article/I48DBHHG055633FJ.html
  4. Blink公司推出有史以来第一个用于自动化安全和IT运营工作流程的生成式人工智能
    https://www.blinkops.com/blog/introducing-blink-copilot-the-first-generative-ai-for-security-workflows

GPT自身安全

关于GPT/AIGC/LLM等大模型技术自身的各类安全风险以及漏洞,大模型技术滥用与误用的可能性

论文

  1. GPT-4 Technical Report
    OpenAI对模型自身安全评估和缓解
    https://arxiv.org/abs/2303.08774

  2. Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models
    提出一个未来的研究方向:在特定的使用情境下,保障大语言模型的安全可靠性需要发展什么类型的测试?
    https://arxiv.org/pdf/2102.02503.pdf

  3. Taxonomy of Risks posed by Language Models
    https://dl.acm.org/doi/10.1145/3531146.3533088

  4. ChatGPT Security Risks: A Guide for Cyber Security Professionals
    https://www.cybertalk.org/wp-content/uploads/2023/03/ChatGPT_eBook_CP_CT_2023.pdf

  5. A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation
    https://arxiv.org/pdf/2305.11391.pdf

  6. The (ab)use of Open Source Code to Train Large Language Models
    https://arxiv.org/pdf/2302.13681.pdf

  7. GPT in Sheep’s Clothing: The Risk of Customized GPTs
    https://arxiv.org/pdf/2401.09075.pdf

  8. In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT
    https://arxiv.org/pdf/2304.08979.pdf

  9. DECODINGTRUST: A Comprehensive Assessment of Trustworthiness in GPT Models
    https://arxiv.org/pdf/2306.11698.pdf

  10. On the Trustworthiness Landscape of State-of-the-art Generative Models: A Comprehensive Survey
    https://arxiv.org/pdf/2307.16680.pdf

  11. LLM Censorship: A Machine Learning Challenge or a Computer Security Problem?
    https://arxiv.org/pdf/2307.10719.pdf

  12. Ignore Previous Prompt: Attack Techniques For Language Models
    ML Safety Workshop NeurIPS 2022,以及提示注入的开山之作
    https://arxiv.org/pdf/2211.09527.pdf

  13. Boosting Big Brother: Attacking Search Engines with Encodings
    攻击测试了集成OpenAI GPT-4模型的必应搜索引擎
    https://arxiv.org/pdf/2304.14031.pdf

  14. Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System
    https://arxiv.org/pdf/2309.04858.pdf

  15. More than you’ve asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models
    间接提示注入的开山之作,里面很多场景都已成为现实
    https://arxiv.org/pdf/2302.12173.pdf

  16. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models
    https://arxiv.org/pdf/2009.11462.pdf

  17. Jailbroken: How Does LLM Safety Training Fail?
    https://arxiv.org/pdf/2307.02483.pdf

  18. FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models
    https://arxiv.org/pdf/2309.05274.pdf

  19. GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts
    https://arxiv.org/pdf/2309.10253.pdf

  20. Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks
    https://arxiv.org/pdf/2302.05733.pdf

  21. Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection
    https://arxiv.org/pdf/2302.12173.pdf

  22. Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications
    https://arxiv.org/pdf/2311.16153.pdf

  23. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned
    https://arxiv.org/pdf/2209.07858.pdf

  24. Explore, Establish, Exploit: Red Teaming Language Models from Scratch
    https://arxiv.org/pdf/2306.09442.pdf

  25. Beyond the Safeguards: Exploring the Security Risks of ChatGPT
    https://arxiv.org/pdf/2305.08005.pdf

  26. ProPILE: Probing Privacy Leakage in Large Language Models
    https://arxiv.org/pdf/2307.01881.pdf

  27. Analyzing Leakage of Personally Identifiable Information in Language Models
    https://ieeexplore.ieee.org/document/10179300

  28. Can We Generate Shellcodes via Natural Language? An Empirical Study
    https://link.springer.com/article/10.1007/s10515-022-00331-3

  29. RatGPT: Turning online LLMs into Proxies for Malware Attacks
    https://arxiv.org/pdf/2308.09183.pdf

  30. Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations
    https://arxiv.org/pdf/2306.09255.pdf

  31. BadPrompt: Backdoor Attacks on Continuous Prompts
    南开大学在NeurIPS 2022上发表的论文,小样本场景下通过连续prompt对大模型后门攻击
    https://arxiv.org/pdf/2211.14719.pdf

  32. LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors
    https://arxiv.org/pdf/2308.13904.pdf

  33. A Comprehensive Overview of Backdoor Attacks in Large Language Models within Communication Networks
    https://arxiv.org/pdf/2308.14367.pdf

  34. BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models
    https://arxiv.org/pdf/2401.12242.pdf

  35. Universal and Transferable Adversarial Attacks on Aligned Language Models
    https://arxiv.org/pdf/2307.15043.pdf
    https://llm-attacks.org

  36. Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment
    https://arxiv.org/pdf/2308.09662.pdf

  37. Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models
    https://arxiv.org/pdf/2310.00322.pdf

  38. A LLM Assisted Exploitation of AI-Guardian
    https://arxiv.org/pdf/2307.15008.pdf

  39. The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness
    https://arxiv.org/pdf/2401.00287.pdf

  40. GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher
    用摩斯、凯撒等加密密码,可向ChatGPT询问非法内容
    https://arxiv.org/pdf/2308.06463.pdf

  41. Model Leeching: An Extraction Attack Targeting LLMs
    https://arxiv.org/pdf/2309.10544.pdf

  42. “Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models
    https://arxiv.org/pdf/2308.03825.pdf

  43. Tree of Attacks: Jailbreaking Black-Box LLMs Automatically
    https://arxiv.org/pdf/2312.02119.pdf

  44. LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI’s ChatGPT Plugins
    主页:https://llm-platform-security.github.io/chatgpt-plugin-eval/
    开源项目:chatgpt-plugin-eval https://arxiv.org/pdf/2309.10254v1.pdf
  45. Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition
    [https://aclanthology.org/2023.emnlp-main.302.pdf]
  46. ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
    https://arxiv.org/pdf/2304.14475.pdf
  47. JADE: A Linguistics-based Safety Evaluation Platform for Large Language Models
    https://arxiv.org/pdf/2311.00286.pdf
  48. Robust Safety Classifier for Large Language Models: Adversarial Prompt Shield
    https://arxiv.org/pdf/2311.00172.pdf
  49. BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B
    https://arxiv.org/pdf/2311.00117.pdf
  50. Constitutional AI: Harmlessness from AI Feedback
    https://arxiv.org/abs/2212.08073.pdf
  51. Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models
    https://arxiv.org/abs/2312.09669.pdf
  52. Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
    https://arxiv.org/abs/2308.13387.pdf
  53. Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models
    https://arxiv.org/abs/2305.01219.pdf
  54. Detecting Language Model Attacks with Perplexity
    https://arxiv.org/abs/2308.14132v3.pdf

博客

  1. OWASP发布了Top 10 for Large Language Model Applications项目,旨在教育开发人员、设计师、架构师、经理和组织了解部署和管理大语言模型LLM时的潜在安全风险,该项目提供了一个常见于LLM应用中的十个最关键漏洞的列表。
    https://owasp.org/www-project-top-10-for-large-language-model-applications/
  2. 干货分享!Langchain框架Prompt Injection在野0day漏洞分析
    https://mp.weixin.qq.com/s/wFJ8TPBiS74RzjeNk7lRsw
  3. 通过提示注入在 MathGPT 中实现代码执行
    公开可用的 MathGPT 借助底层的GPT-3模型来回答用户生成的数学问题。最近的研究和实验表明,GPT-3 等 大模型在直接进行数学计算的任务上表现不佳,然而能够更准确地生成问题解决方案的可执行代码。 因此 MathGPT 将用户的自然语言问题转换为 Python 代码,执行计算后的代码和答案会显示给用户。某些 LLM 可能容易受到提示词注入攻击,恶意用户输入会导致模型执行意外行为[3][4]。 在此事件中,攻击者探索了几种提示词覆盖途径,生成的代码最终导致攻击者获得应用程序主机系统的环境变量和应用程序的 GPT-3 API 密钥的访问权限,并执行拒绝服务攻击。 因此,攻击者可能会耗尽应用程序的 API 查询预算或关闭应用程序。
    https://atlas.mitre.org/studies/AML.CS0016/
  4. 用ChatGPT来生成编码器与配套WebShell
    antsword官方出品
    https://mp.weixin.qq.com/s/I9IhkZZ3YrxblWIxWMXAWA
  5. 使用ChatGPT来生成钓鱼邮件和钓鱼网站
    相比其他仅生成钓鱼邮件,这里把钓鱼网站也生成了
    https://www.richardosgood.com/posts/using-openai-chat-for-phishing/
  6. Chatting Our Way Into Creating a Polymorphic Malware
    https://www.cyberark.com/resources/threat-research-blog/chatting-our-way-into-creating-a-polymorphic-malware
  7. Hacking Humans with AI as a Service
    https://media.defcon.org/DEF%20CON%2029/DEF%20CON%2029%20presentations/Eugene%20Lim%20Glenice%20Tan%20Tan%20Kee%20Hock%20-%20Hacking%20Humans%20with%20AI%20as%20a%20Service.pdf
  8. 内建虚拟机实现ChatGPT的越狱
    https://www.engraved.blog/building-a-virtual-machine-inside/
  9. ChatGPT can boost your Threat Modeling skills
    https://infosecwriteups.com/chatgpt-can-boost-your-threat-modeling-skills-ab82149d0140
  10. Using GPT-Eliezer against ChatGPT Jailbreaking
    检测对抗性提示词
    https://www.alignmentforum.org/posts/pNcFYZnPdXyL2RfgA/using-gpt-eliezer-against-chatgpt-jailbreaking
  11. LLM中的安全隐患-以VirusTotal Code Insight中的提示注入为例
    https://mp.weixin.qq.com/s/U2yPGOmzlvlF6WeNd7B7ww
  12. 安全测试中的 ChatGPT AI:机遇与挑战
    https://www.cyfirma.com/outofband/chatgpt-ai-in-security-testing-opportunities-and-challenges/
  13. ChatGPT羊驼家族全沦陷!CMU博士击破LLM护栏,人类毁灭计划脱口而出
    CMU和人工智能安全中心的研究人员发现,只要通过附加一系列特定的无意义token,就能生成一个神秘的prompt后缀。由此,任何人都可以轻松破解LLM的安全措施,生成无限量的有害内容。
    论文地址:https://arxiv.org/abs/2307.15043
    代码地址:https://github.com/llm-attacks/llm-attacks
    https://mp.weixin.qq.com/s/298nwP98UdRNybV2Fuo6Wg
  14. Challenges in evaluating AI systems
  15. Core Views on AI Safety: When, Why, What, and How

大模型隐私保护

以隐私计算等技术保护GPT/AIGC/LLM等大模型

论文

  1. Challenges and Remedies to Privacy and Security in AIGC: Exploring the Potential of Privacy Computing, Blockchain, and Beyond
    https://arxiv.org/pdf/2306.00419.pdf

  2. GenAIPABench: A Benchmark for Generative AI-based Privacy Assistants
    https://arxiv.org/pdf/2309.05138.pdf

  3. Recovering from Privacy-Preserving Masking with Large Language Models
    https://arxiv.org/pdf/2309.08628.pdf

  4. Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection
    https://arxiv.org/pdf/2309.03057.pdf

  5. Knowledge Sanitization of Large Language Models
    https://arxiv.org/pdf/2309.11852.pdf

  6. PrivateLoRA For Efficient Privacy Preserving LLM
    https://arxiv.org/pdf/2311.14030.pdf

  7. Don’t forget private retrieval: distributed private similarity search for large language models
    https://arxiv.org/pdf/2311.12955.pdf

以安全数据训练GPT

收集安全数据训练或增强GPT/AIGC/LLM等大模型技术

论文

  1. Web Content Filtering through knowledge distillation of Large Language Models
    https://arxiv.org/pdf/2305.05027.pdf

  2. HackMentor: Fine-Tuning Large Language Models for Cybersecurity
    论文发布:https://github.com/tmylla/HackMentor/blob/main/HackMentor.pdf
    项目地址:https://github.com/tmylla/HackMentor

  3. LLMs Perform Poorly at Concept Extraction in Cyber-security Research Literature
    https://arxiv.org/pdf/2312.07110.pdf

  4. netFound: Foundation Model for Network Security
    网络安全基础大模型
    https://arxiv.org/pdf/2310.17025.pdf

博客

  1. ChatGPT:和黑客知识库聊天
    (1) 从prompt到自训练数据原文反向索引的准确性;(2) openai提供模型的微调服务的尝试;(3) 其他可替代性模型总结;(4) 围绕markdown格式的数据集解析和分块索引的脚本示例; (5) 相似索引向量引擎推荐。
    https://mp.weixin.qq.com/s/dteH4oP24qGY-4l3xSl7Vg
  2. 如何训练自己的“安全大模型”
    https://mp.weixin.qq.com/s/801sV5a7-wOh_1EN3U64-Q
  3. 基于LangChain框架建立和查询ATT&CK知识库
    https://otrf.github.io/GPT-Security-Adventures/experiments/ATTCK-GPT/notebook.html