Security Papers


Posted by Clouditera on December 29, 2023


  1. Microsoft announces Microsoft Security Copilot, a new cybersecurity product
    Security Copilot embeds GPT-4, currently the most capable large language model, and pairs it with Microsoft's security model library of 65 trillion cybersecurity threat signals, providing enterprises and individual users with generative, automated AI services for network security, malware protection, and privacy-compliance monitoring
  2. Integrating ChatGPT and generative AI into cybersecurity best practices
  3. Industry analysis: after cloud, are large models the next opportunity in cybersecurity?
  4. VirusTotal launches Code Insight, empowering threat analysis with generative AI
  5. Security LLMs hit an inflection point: Google Cloud integrates generative AI across its full security product line | RSAC 2023
  6. Facebook quarterly security report: surge in malware impersonating ChatGPT
  7. Tenable report shows how generative AI is transforming security research
  8. Mining the AIGC "arms" industry chain: the opportunities and risks of disrupting the giants
  9. Hardening ChatGPT and other LLMs: an open-source "safety steward" for large language models
  10. FraudGPT, a dark version of ChatGPT


  1. Guidelines for Secure AI System Development (Chinese-language version)
  2. Guidelines for secure AI system development




  1. Trust in Software Supply Chains: Blockchain-Enabled SBOM and the AIBOM Future
  2. An Empirical Study on Using Large Language Models to Analyze Software Supply Chain Security Failures
  3. SecureFalcon: The Next Cyber Reasoning System for Cyber Security
  4. Using ChatGPT as a Static Application Security Testing Tool
  5. A Preliminary Study on Using Large Language Models in Software Pentesting
  6. LLM4Vuln: A Unified Evaluation Framework for Decoupling and Enhancing LLMs’ Vulnerability Reasoning
  7. Finetuning Large Language Models for Vulnerability Detection
  8. Large Language Model for Vulnerability Detection: Emerging Results and Future Directions
  9. How Far Have We Gone in Vulnerability Detection Using Large Language Models
  10. LLbezpeky: Leveraging Large Language Models for Vulnerability Detection
  11. Understanding the Effectiveness of Large Language Models in Detecting Security Vulnerabilities
  12. Detecting software vulnerabilities using Language Models
  13. Prompt-Enhanced Software Vulnerability Detection Using ChatGPT
  14. Evaluation of ChatGPT Model for Vulnerability Detection
  15. LLMSecEval: A Dataset of Natural Language Prompts for Security Evaluation
  16. DiverseVul: A New Vulnerable Source Code Dataset for Deep Learning Based Vulnerability Detection
  17. FLAG: Finding Line Anomalies (in code) with Generative AI
  18. Large Language Models are Edge-Case Fuzzers: Testing Deep Learning Libraries via FuzzGPT
  19. Large Language Models are Zero-Shot Fuzzers: Fuzzing Deep-Learning Libraries via Large Language Models
  20. CovRL: Fuzzing JavaScript Engines with Coverage-Guided Reinforcement Learning for LLM-based Mutation
  21. When GPT Meets Program Analysis: Towards Intelligent Detection of Smart Contract Logic Vulnerabilities in GPTScan
  22. Efficient Avoidance of Vulnerabilities in Auto-completed Smart Contract Code Using Vulnerability-constrained Decoding
  23. Large Language Model-Powered Smart Contract Vulnerability Detection: New Perspectives
  24. VulLibGen: Identifying Vulnerable Third-Party Libraries via Generative Pre-Trained Model
  25. Leveraging AI Planning For Detecting Cloud Security Vulnerabilities
  26. InferFix: End-to-End Program Repair with LLMs
  27. HW-V2W-Map: Hardware Vulnerability to Weakness Mapping Framework for Root Cause Analysis with GPT-assisted Mitigation Suggestion
  28. Fixing Hardware Security Bugs with Large Language Models
  29. Unlocking Hardware Security Assurance: The Potential of LLMs
  30. Generating Secure Hardware using ChatGPT Resistant to CWEs
  31. DIVAS: An LLM-based End-to-End Framework for SoC Security Analysis and Policy-based Protection
  32. Examining Zero-Shot Vulnerability Repair with Large Language Models
  33. Practical Program Repair in the Era of Large Pre-trained Language Models
  34. An Analysis of the Automatic Bug Fixing Performance of ChatGPT
  35. Automatic Program Repair with OpenAI’s Codex: Evaluating QuixBugs
  36. How Effective Are Neural Networks for Fixing Security Vulnerabilities
  37. STEAM: Simulating the InTeractive BEhavior of ProgrAMmers for Automatic Bug Fixing
  38. ZeroLeak: Using LLMs for Scalable and Cost Effective Side-Channel Patching
  39. Can LLMs Patch Security Issues?
  40. Better patching using LLM prompting, via Self-Consistency
  41. Identifying Vulnerability Patches by Comprehending Code Commits with Comprehensive Change Contexts
  42. Just-in-Time Security Patch Detection – LLM At the Rescue for Data Augmentation
  43. Towards JavaScript program repair with generative pre-trained transformer (GPT-2)
  44. Code Security Vulnerability Repair Using Reinforcement
  45. Enhanced Automated Code Vulnerability Repair using Large Language Models
  46. Repair Is Nearly Generation: Multilingual Program Repair with LLMs
  47. Cupid: Leveraging ChatGPT for More Accurate Duplicate Bug Report Detection
  48. Asleep at the Keyboard? Assessing the Security of GitHub Copilot’s Code Contributions
  49. Do Users Write More Insecure Code with AI Assistants?
  50. How Secure is Code Generated by ChatGPT?
  51. Lost at C: A User Study on the Security Implications of Large Language Model Code Assistants
  52. Evaluating Large Language Models Trained on Code
  53. No Need to Lift a Finger Anymore? Assessing the Quality of Code Generation by ChatGPT
  54. Assessing the Quality of GitHub Copilot’s Code Generation
  55. Is GitHub’s Copilot as Bad as Humans at Introducing Vulnerabilities in Code?
  56. How ChatGPT is Solving Vulnerability Management Problem
  57. Neural Code Completion Tools Can Memorize Hard-coded Credentials
  58. BadGPT: Exploring Security Vulnerabilities of ChatGPT via Backdoor Attacks to InstructGPT
  59. Teaching Large Language Models to Self-Debug
  60. LLM4SecHW: Leveraging Domain-Specific Large Language Model for Hardware Debugging
  61. Evaluating the Code Quality of AI-Assisted Code Generation Tools: An Empirical Study on GitHub Copilot, Amazon CodeWhisperer, and ChatGPT
  62. Fault-Aware Neural Code Rankers
  63. Using Large Language Models to Enhance Programming Error Messages
  64. Controlling Large Language Models to Generate Secure and Vulnerable Code
    Cites "Asleep at the Keyboard?"; applies a pretrained model to condition the LM's outputs, controlling whether the generated code is secure or vulnerable
  65. Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models
    Targets the problem that small variations in a prompt can introduce vulnerabilities; automates the discovery process that "Asleep at the Keyboard?" carried out manually
  66. SecurityEval Dataset: Mining Vulnerability Examples to Evaluate Machine Learning-Based Code Generation Techniques
    The paper "Systematically Finding Security Vulnerabilities in Black-Box Code Generation Models" relies heavily on this work, which addresses the problem of datasets for model evaluation
  67. Security Code Review by LLMs: A Deep Dive into Responses
  68. Exploring the Limits of ChatGPT in Software Security Applications
  69. An Empirical Evaluation of LLMs for Solving Offensive Security Challenges
  70. Purple Llama CYBERSECEVAL: A Secure Coding Benchmark for Language Models
  71. Can Large Language Models Identify And Reason About Security Vulnerabilities? Not Yet
  72. Pop Quiz! Can a Large Language Model Help With Reverse Engineering
  73. CODAMOSA: Escaping Coverage Plateaus in Test Generation with Pre-trained Large Language Models
    Published by Microsoft at ICSE 2023; aims to use LLMs to mitigate the "coverage plateau" problem that traditional fuzzing runs into
  74. Understanding Large Language Model Based Fuzz Driver Generation
  75. ChatGPT for Software Security: Exploring the Strengths and Limitations of ChatGPT in the Security Applications
  76. How well does LLM generate security tests?


  1. Are ChatGPT's package recommendations trustworthy?
  2. AI-powered fuzzing: breaking the bug hunting barrier
  3. Ideas for automated vulnerability discovery using GPT for static code scanning
  4. Using LLMs to reduce white-box false positives and automatically patch vulnerable code
  5. Is ChatGPT reliable for source code analysis?
  6. Is ChatGPT good enough at locating defects in code?
  7. ChatGPT + RASP: efficient, automated vulnerability hunting with CodeQL
  8. ChatGPTScan: batch code auditing with ChatGPTScan
  9. Applying GPT-4 in Semgrep to flag false positives and fix code
  10. Kondukto uses OpenAI in its product to fix code vulnerabilities
  11. Using ChatGPT for code auditing
  12. Using GPT-3 to find 213 security vulnerabilities in a single code repository
  13. Using GPT-4 for debugging and vulnerability fixing
  14. A magical assistant for open-source projects: letting AI act as a code inspector to speed up high-quality PR review
  15. Generating test data with ChatGPT
  16. Ways hackers might abuse ChatGPT
  17. GPT-4 Jailbreak and Hacking via Rabbithole Attack, Prompt Injection, Content Moderation Bypass and Weaponizing AI
  18. How I used GPT to automatically generate Nuclei POCs
  19. Security vulnerabilities in ChatGPT-generated code




  1. Static Malware Detection Using Stacked BiLSTM and GPT-2
  2. FlowTransformer: A Transformer Framework for Flow-based Network Intrusion Detection Systems
  3. ChatGPT for Digital Forensic Investigation: The Good, The Bad, and The Unknown
  4. Devising and Detecting Phishing: large language models vs. Smaller Human Models
  5. Anatomy of an AI-powered malicious social botnet
  6. Revolutionizing Cyber Threat Detection with Large Language Models
  7. LLM in the Shell: Generative Honeypots
  8. What Does an LLM-Powered Threat Intelligence Program Look Like?


  1. IoC detection experiments with ChatGPT
  2. ChatGPT-powered threat analysis: using ChatGPT to check every npm and PyPI package for security issues, including data exfiltration, SQL injection vulnerabilities, credential leakage, privilege escalation, backdoors, malicious installs, and poisoned preset instructions
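A per-package pipeline like the one described above usually needs a cheap pre-filter so that only suspicious snippets are escalated to the model. The sketch below is illustrative only: the regex patterns and threat names are assumptions, not from the article, and the LLM call itself is omitted.

```python
import re

# Crude heuristics mapping a threat label to a pattern worth escalating.
# These are illustrative examples, not a vetted ruleset.
SUSPICIOUS = {
    "credential leakage": re.compile(r"(?i)(api[_-]?key|password|token)\s*="),
    "data exfiltration": re.compile(r"(?i)requests\.post|urlopen|curl\s"),
    "backdoor": re.compile(r"(?i)eval\(|exec\(|base64\.b64decode"),
}

def triage(files: dict[str, str]) -> list[tuple[str, str, int]]:
    """Return (threat, filename, line_no) hits worth sending to the LLM."""
    hits = []
    for name, text in files.items():
        for lineno, line in enumerate(text.splitlines(), 1):
            for threat, pattern in SUSPICIOUS.items():
                if pattern.search(line):
                    hits.append((threat, name, lineno))
    return hits
```

Only the flagged lines (with some surrounding context) would then be packed into the review prompt, keeping token cost bounded per package.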




  1. GPT-2C: A GPT-2 Parser for Cowrie Honeypot Logs
  2. On the Uses of Large Language Models to Interpret Ambiguous Cyberattack Descriptions
  3. From Text to MITRE Techniques: Exploring the Malicious Use of Large Language Models for Generating Cyber Attack Payloads
  4. Automated CVE Analysis for Threat Prioritization and Impact Prediction
  5. LogGPT: Log Anomaly Detection via GPT
  6. Cyber Sentinel: Exploring Conversational Agents’ Role in Streamlining Security Tasks with GPT-4
  7. AGIR: Automating Cyber Threat Intelligence Reporting with Natural Language Generation
  8. ChatGPT, Llama, can you write my report? An experiment on assisted digital forensics reports written using (Local) Large Language Models
  9. HuntGPT: Integrating Machine Learning-Based Anomaly Detection and Explainable AI with Large Language Models (LLMs)
  10. Benchmarking Large Language Models for Log Analysis, Security, and Interpretation


  1. Elastic's "Exploring the future of security with ChatGPT"
    Proposes six ideas: (1) a chatbot to assist incident response; (2) threat-report generation; (3) natural-language retrieval; (4) anomaly detection; (5) a security-policy Q&A bot; (6) alert triage.
  2. A first look at applying ChatGPT in security operations
  3. AI-assisted incident response with ChatGPT and D3
    Discusses the benefits of integrating ChatGPT with Smart SOAR, with a worked example: gather incident context from the MITRE TTPs and the malware family identified in Microsoft endpoint-defense alerts, then ask ChatGPT to describe, from its knowledge of those TTPs and that malware, what the attacker is likely to do next and which vulnerabilities the malware might exploit.
  4. Blink launches the first-ever generative AI for automating security and IT operations workflows
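The SOAR enrichment flow described in item 3 reduces to assembling incident context into a question for the model. A hypothetical helper (the function name, fields, and wording are assumptions for illustration; the actual model call is omitted):

```python
def build_enrichment_question(ttps: list[str], malware_family: str) -> str:
    """Turn observed MITRE ATT&CK technique IDs and a malware family name
    into the kind of next-steps question posed to the LLM."""
    ttp_list = ", ".join(ttps)
    return (
        f"An endpoint alert attributes activity to the {malware_family} "
        f"malware family, with observed MITRE ATT&CK techniques: {ttp_list}. "
        "Based on what is known about these TTPs and this malware family, "
        "what is the attacker likely to do next, and which vulnerabilities "
        "might the malware exploit?"
    )
```

In a real playbook the TTP IDs and family name would come from the alert payload rather than being hand-typed.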




  1. GPT-4 Technical Report

  2. Understanding the Capabilities, Limitations, and Societal Impact of Large Language Models

  3. Taxonomy of Risks posed by Language Models

  4. ChatGPT Security Risks: A Guide for Cyber Security Professionals

  5. A Survey of Safety and Trustworthiness of Large Language Models through the Lens of Verification and Validation

  6. The (ab)use of Open Source Code to Train Large Language Models

  7. GPT in Sheep’s Clothing: The Risk of Customized GPTs

  8. In ChatGPT We Trust? Measuring and Characterizing the Reliability of ChatGPT

  9. DECODINGTRUST: A Comprehensive Assessment of Trustworthiness in GPT Models

  10. On the Trustworthiness Landscape of State-of-the-art Generative Models: A Comprehensive Survey

  11. LLM Censorship: A Machine Learning Challenge or a Computer Security Problem?

  12. Ignore Previous Prompt: Attack Techniques For Language Models
    ML Safety Workshop at NeurIPS 2022; the seminal work on prompt injection

  13. Boosting Big Brother: Attacking Search Engines with Encodings
    The attack was tested against the Bing search engine with its integrated OpenAI GPT-4 model

  14. Reverse-Engineering Decoding Strategies Given Blackbox Access to a Language Generation System

  15. More than you’ve asked for: A Comprehensive Analysis of Novel Prompt Injection Threats to Application-Integrated Large Language Models

  16. RealToxicityPrompts: Evaluating Neural Toxic Degeneration in Language Models

  17. Jailbroken: How Does LLM Safety Training Fail?

  18. FuzzLLM: A Novel and Universal Fuzzing Framework for Proactively Discovering Jailbreak Vulnerabilities in Large Language Models

  19. GPTFUZZER: Red Teaming Large Language Models with Auto-Generated Jailbreak Prompts

  20. Exploiting Programmatic Behavior of LLMs: Dual-Use Through Standard Security Attacks

  21. Not what you’ve signed up for: Compromising Real-World LLM-Integrated Applications with Indirect Prompt Injection

  22. Identifying and Mitigating Vulnerabilities in LLM-Integrated Applications

  23. Red Teaming Language Models to Reduce Harms: Methods, Scaling Behaviors, and Lessons Learned

  24. Explore, Establish, Exploit: Red Teaming Language Models from Scratch

  25. Beyond the Safeguards: Exploring the Security Risks of ChatGPT

  26. ProPILE: Probing Privacy Leakage in Large Language Models

  27. Analyzing Leakage of Personally Identifiable Information in Language Models

  28. Can We Generate Shellcodes via Natural Language? An Empirical Study

  29. RatGPT: Turning online LLMs into Proxies for Malware Attacks

  30. Chatbots to ChatGPT in a Cybersecurity Space: Evolution, Vulnerabilities, Attacks, Challenges, and Future Recommendations

  31. BadPrompt: Backdoor Attacks on Continuous Prompts
    From Nankai University at NeurIPS 2022; backdoors large models via continuous prompts in few-shot settings

  32. LMSanitator: Defending Prompt-Tuning Against Task-Agnostic Backdoors

  33. A Comprehensive Overview of Backdoor Attacks in Large Language Models within Communication Networks

  34. BadChain: Backdoor Chain-of-Thought Prompting for Large Language Models

  35. Universal and Transferable Adversarial Attacks on Aligned Language Models

  36. Red-Teaming Large Language Models using Chain of Utterances for Safety-Alignment

  37. Red Teaming Game: A Game-Theoretic Framework for Red Teaming Language Models

  38. A LLM Assisted Exploitation of AI-Guardian

  39. The Art of Defending: A Systematic Evaluation and Analysis of LLM Defense Strategies on Safety and Over-Defensiveness

  40. GPT-4 Is Too Smart To Be Safe: Stealthy Chat with LLMs via Cipher

  41. Model Leeching: An Extraction Attack Targeting LLMs

  42. “Do Anything Now”: Characterizing and Evaluating In-The-Wild Jailbreak Prompts on Large Language Models

  43. Tree of Attacks: Jailbreaking Black-Box LLMs Automatically

  44. LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI’s ChatGPT Plugins
  45. Ignore This Title and HackAPrompt: Exposing Systemic Vulnerabilities of LLMs through a Global Scale Prompt Hacking Competition
  46. ChatGPT as an Attack Tool: Stealthy Textual Backdoor Attack via Blackbox Generative Model Trigger
  47. JADE: A Linguistics-based Safety Evaluation Platform for Large Language Models
  48. Robust Safety Classifier for Large Language Models: Adversarial Prompt Shield
  49. BadLlama: cheaply removing safety fine-tuning from Llama 2-Chat 13B
  50. Constitutional AI: Harmlessness from AI Feedback
  51. Silent Guardian: Protecting Text from Malicious Exploitation by Large Language Models
  52. Do-Not-Answer: A Dataset for Evaluating Safeguards in LLMs
  53. Prompt as Triggers for Backdoor Attack: Examining the Vulnerability in Language Models
  54. Detecting Language Model Attacks with Perplexity


  1. OWASP released the Top 10 for Large Language Model Applications project, which aims to educate developers, designers, architects, managers, and organizations about the potential security risks of deploying and managing large language models (LLMs); the project provides a list of the ten most critical vulnerabilities commonly found in LLM applications.
  2. Analysis of an in-the-wild 0-day prompt injection vulnerability in the LangChain framework
  3. Achieving code execution in MathGPT via prompt injection
    The publicly available MathGPT uses an underlying GPT-3 model to answer user-submitted math questions. Recent research and experiments show that large models such as GPT-3 perform poorly at direct mathematical computation, but can more accurately generate executable code that solves a problem. MathGPT therefore converts the user's natural-language question into Python code, executes it, and shows the resulting code and answer to the user. Some LLMs are vulnerable to prompt-injection attacks, where malicious user input causes the model to perform unintended actions [3][4]. In this incident, the attacker explored several prompt-override avenues; the generated code ultimately gave the attacker access to the application host's environment variables and the application's GPT-3 API key, and enabled a denial-of-service attack. An attacker could thus exhaust the application's API query budget or take the application down.
  4. Using ChatGPT to generate an encoder and a matching WebShell
  5. Using ChatGPT to generate phishing emails and phishing sites
  6. Chatting Our Way Into Creating a Polymorphic Malware
  7. Hacking Humans with AI as a Service
  8. Jailbreaking ChatGPT by having it simulate a built-in virtual machine
  9. ChatGPT can boost your Threat Modeling skills
  10. Using GPT-Eliezer against ChatGPT Jailbreaking
  11. Security pitfalls in LLMs: prompt injection in VirusTotal Code Insight as a case study
  12. ChatGPT AI in security testing: opportunities and challenges
  13. ChatGPT and the whole Llama family compromised: a CMU PhD student breaks LLM guardrails, eliciting a "plan to destroy humanity"
  14. Challenges in evaluating AI systems
  15. Core Views on AI Safety: When, Why, What, and How
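The MathGPT incident in item 3 above illustrates the danger of executing model-generated code directly. A crude allow-list check in that spirit is sketched below: it rejects generated "solutions" that import modules, access attributes, or call `__import__`. This is illustrative only, not a real sandbox, and is easy to bypass; real deployments need process isolation.

```python
import ast

def is_safe_math(code: str) -> bool:
    """Reject generated 'solutions' that import modules, access attributes,
    or call __import__ (crude allow-list; illustrative only)."""
    try:
        tree = ast.parse(code)
    except SyntaxError:
        return False
    for node in ast.walk(tree):
        if isinstance(node, (ast.Import, ast.ImportFrom, ast.Attribute)):
            return False
        if isinstance(node, ast.Name) and node.id == "__import__":
            return False
    return True
```

A payload like the one in the incident (`import os` followed by reading `os.environ`) fails the check twice over, via both the import and the attribute access; arithmetic-only code passes.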




  1. Challenges and Remedies to Privacy and Security in AIGC: Exploring the Potential of Privacy Computing, Blockchain, and Beyond

  2. GenAIPABench: A Benchmark for Generative AI-based Privacy Assistants

  3. Recovering from Privacy-Preserving Masking with Large Language Models

  4. Hide and Seek (HaS): A Lightweight Framework for Prompt Privacy Protection

  5. Knowledge Sanitization of Large Language Models

  6. PrivateLoRA For Efficient Privacy Preserving LLM

  7. Don’t forget private retrieval: distributed private similarity search for large language models




  1. Web Content Filtering through knowledge distillation of Large Language Models

  2. HackMentor: Fine-Tuning Large Language Models for Cybersecurity

  3. LLMs Perform Poorly at Concept Extraction in Cyber-security Research Literature

  4. netFound: Foundation Model for Network Security


  1. ChatGPT: chatting with a hacker knowledge base
    Covers: (1) the accuracy of reverse-indexing from a prompt back to the original self-training data; (2) experiments with OpenAI's model fine-tuning service; (3) a summary of alternative models; (4) example scripts for parsing markdown-formatted datasets and building chunked indexes; (5) recommended vector similarity-search engines.
  2. How to train your own security LLM
  3. Building and querying an ATT&CK knowledge base with the LangChain framework
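The markdown "parsing and chunked indexing" step mentioned in item 1's notes can be sketched minimally as follows, assuming a heading-delimited markdown knowledge base; the embedding and vector-store calls that would follow each chunk are omitted, and the size cap is an arbitrary illustrative choice.

```python
import re

def chunk_markdown(text: str, max_chars: int = 800) -> list[str]:
    """Split a markdown knowledge base on headings, then cap chunk size,
    so each chunk can be embedded and indexed for retrieval."""
    sections = re.split(r"(?m)^(?=#{1,6} )", text)  # split before each heading
    chunks = []
    for sec in sections:
        sec = sec.strip()
        if not sec:
            continue
        while len(sec) > max_chars:          # hard-wrap oversized sections
            chunks.append(sec[:max_chars])
            sec = sec[max_chars:]
        chunks.append(sec)
    return chunks
```

Heading-aligned chunks keep each embedded vector semantically coherent, which is what makes the similarity search in step (5) useful.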