LLM-BASED FRAUD DETECTION IN FINANCIAL TRANSACTIONS: A DEFENSE FRAMEWORK AGAINST ADVERSARIAL ATTACKS
Keywords:
Bag of Words, IND-FinAdversary, Large Language Models, TF-IDF
Abstract
Large Language Models (LLMs) have significantly transformed the financial sector, enabling advances in areas such as fraud detection, asset management, wealth advisory, and automated financial analysis. These benefits, however, are accompanied by notable security vulnerabilities, particularly in highly regulated sectors like finance. The rise of sophisticated models such as GPT-4 has made adversarial attacks, including prompt injection, a pressing concern. This paper examines the security challenges these attacks pose in the financial domain and introduces a defense framework to address them. A comprehensive risk classification system is developed, detailing eight distinct input-side attack strategies and five categories of output vulnerabilities. To evaluate these threats, the study uses IND-FinAdversary, a domain-specific adversarial dataset constructed through human-machine interaction. An end-to-end security defense framework is also proposed, integrating preprocessing filters, legal compliance checks, and automated response refinement mechanisms. Empirical evaluations on widely deployed LLMs show substantial improvements: the framework mitigates adversarial risks with 96.8% accuracy, surpassing the target threshold of 95.37%. The results confirm the framework's ability to reduce inappropriate content generation and strengthen resilience against adversarial prompts. This work provides foundational tools, datasets, and evaluation metrics for improving the security and compliance of LLM applications in the financial sector, particularly in fraud detection.