
Fortifying Generative AI: Mitigating Its Risks with Guardrails for Security and Compliance

Manpreet Dash, Amit Phadke, Shiv Kumar


Synopsis

To support organizations looking to build new generative AI applications, Amazon Web Services (AWS) has launched Amazon Bedrock, a fully managed service that makes foundation models (FMs) from leading AI companies available through a single application programming interface (API). AIShield GuArdIan, a tool integrated with AWS offerings, oversees inbound and outbound data from large language models (LLMs), ensuring alignment with predefined policies. Through its dynamic policy mapping, jailbreak protection, and easy integration features, GuArdIan offers a robust shield against potential AI risks. Two notable use cases—protecting a software company's chatbot and enforcing role-based access in a healthcare setting—highlight GuArdIan's practical applications.

Introduction

Generative AI, echoing human-like decision-making, is heralding a new technological era. This evolution compels businesses and policymakers to harness its vast potential across sectors such as consumer products, energy and industrials, financial services, government and public services, life sciences and healthcare, and technology, media, and telecommunications. Close to 70% of organizations consider GPT/LLM/generative AI adoption a top priority by the end of 2023. However, this swift embrace of generative AI has not been without challenges. Gartner's survey underscores generative AI as a primary emerging risk, appearing in its top 10 for the first time and spotlighting concerns like IP infringement, data breaches, and other vulnerabilities. A significant 79% of senior IT leaders express apprehension about these potential security threats. It is crucial for organizations to prioritize the ethical, transparent, and accountable use of these technologies.

In this article, we'll explore how generative AI and Large Language Models (LLMs) can bolster organizational capabilities, from software development to enhancing customer engagement and driving internal efficiency. By leveraging AWS cloud services, businesses can tap into powerful computational tools essential for scalable generative AI applications. We'll also address the inherent risks of Generative AI and LLMs, and introduce solutions like AIShield GuArdIan, supported by Amazon Bedrock, which ensures responsible application, empowering firms to harness the full potential of this groundbreaking technology with confidence.

Generative AI use cases

Now, let's delve into the compelling applications of generative AI. Consider the impact in these key functions:

1. Advising: Generative AI acts as an invaluable assistant, supercharging worker efficiency by supplying hyper-personalized insights. These models navigate complex customer interactions to determine intent and refine responses. For instance:

• Virtual public servant enhancing citizen engagement (public services)

• Gen AI-powered financial assistant for FSI clientele (financial services)

• Personalized patient interactions as a physician’s message manager (healthcare)

• Customer support or virtual shopping assistants

2. Creating: A catalyst for creativity, generative AI fast-tracks innovations in copy creation and real-time personalization. In marketing, it can act as a content assistant, streamlining and tailoring content generation. In the government sector, it can automate the RFP and SoW writing process by generating initial drafts based on templates, historical documents, or specific prompts provided by procurement officials.

3. Automating: Generative AI brings a new era of efficiency, especially in business process automation. Its capability to summarize and predict is already being leveraged by firms. For instance:

• In finance, it facilitates post-trade email processes, minimizing manual interventions and optimizing client interactions.

• In the legal sphere, it assists legislative teams in swiftly transcribing and condensing hearings, official documents, and announcements.

4. Coding: Generative AI can be used in the development of the code itself, serving as an assistant supporting software developers in writing and maintaining code. In software development, Amazon CodeWhisperer, an AI assistant that utilizes generative AI, enhances developer productivity by giving real-time code suggestions from developers' natural language comments within their Integrated Development Environment (IDE). It accurately detects code issues and provides intelligent remedies.

5. Protecting: Integral to governance and security, generative AI can bolster defenses against fraud and ensure stringent compliance. Its capabilities span website classification to malware interpretation in cybersecurity. Yet, one must remain vigilant against its misuse, such as generating malicious code or designing intricate phishing tactics.

The value that generative AI use cases can enable spans six dimensions: cost reduction, process efficiency, growth, innovation, discovery and insights, and government citizen services.

How does AWS catalyze generative AI?

AWS champions generative AI with tools and platforms designed for streamlined integration. Amazon Bedrock provides API access to foundation models from Amazon, AI21 Labs, Anthropic, and Stability AI, catering to both text and image data. Concurrently, Amazon SageMaker JumpStart offers a hub with pre-configured foundation models, algorithms, and ML solutions deployable via UI or SDK. As enterprises scale generative AI deployments, Amazon EC2 Inf2 instances powered by AWS Inferentia2, efficiently manage inference for models with parameter counts reaching hundreds of billions.
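To make the "single API" point concrete, here is a minimal sketch of preparing an Amazon Bedrock `invoke_model` call with the `boto3` `bedrock-runtime` client. The model ID (`anthropic.claude-v2`) and the Claude text-model body schema follow Bedrock's documentation at the time of writing; verify both against the current API before relying on them.

```python
import json

def build_claude_request(prompt, max_tokens=256):
    """Build an Amazon Bedrock invoke_model payload for an Anthropic
    Claude text model (model ID and body schema per the Bedrock docs)."""
    return {
        "modelId": "anthropic.claude-v2",
        "body": json.dumps({
            "prompt": f"\n\nHuman: {prompt}\n\nAssistant:",
            "max_tokens_to_sample": max_tokens,
        }),
    }

# Actually invoking the model requires AWS credentials and Bedrock
# model access in your account:
#   import boto3
#   client = boto3.client("bedrock-runtime")
#   req = build_claude_request("Summarize our Q3 release notes.")
#   resp = client.invoke_model(modelId=req["modelId"], body=req["body"])
#   print(json.loads(resp["body"].read())["completion"])
```

Swapping foundation-model providers then comes down to changing the model ID and request body, while the client call stays the same.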

Risks and concerns with LLMs and generative AI for enterprises

We’ve now seen how generative AI models may help consumers, streamline organizational processes, and free up time for employees to take on higher-value organizational tasks. However, risks to privacy, cybersecurity, regulatory compliance, third-party relationships, legal obligations, and intellectual property have already emerged with the adoption of generative AI. The top risks associated with enterprise use of LLMs, according to OWASP, include intellectual property (IP) infringement, data privacy breaches, plagiarism, toxicity, and a general increase in the enterprise attack surface.

To truly get the most benefits from this groundbreaking technology, enterprises need to manage the wide array of risks it poses.

1. Copyright & IP Violations: Generative AI models are trained on massive amounts of data, mostly collected through internet scraping. Some of it might fall under copyright or intellectual property rights. Using generative AI to create content could open your organization to legal liability for copyright infringement, as underscored by the EU Commission.

2. Input Data Privacy: Some generative AI systems, especially those that run as a cloud service or API, automatically log all inputs into the system. These inputs, which might contain sensitive or proprietary data, may be used for training future versions of the generative model. A notable recent incident is the data leakage caused by employees at a large enterprise who shared sensitive data with an LLM.

3. Prompt Injection/Jailbreaking: Many generative AI systems include safeguards to prevent malicious, manipulative, or harmful uses. Circumventing these protections is known as jailbreaking. Relatedly, adversarial actors can inject their own instructions into a prompt (prompt injection), making it appear as though your system generated something inappropriate.

4. Harmful content: Generative AI systems, despite all protections, can generate harmful content. The content could cause emotional or psychological harm, especially if used or targeted towards specific populations (e.g., children). This vulnerability raises concerns over security, privacy, legal, and ethical issues.

5. Hallucination: These models do not necessarily have a full understanding of the ‘real world’, so they can very convincingly fabricate content that is not true. Their outputs could be erroneous, potentially causing harm to businesses and third parties.

6. System safeguard interactions: The protections added to the generative systems to try to prevent them from generating harmful content may interfere with legitimate prompts or use cases. Knowing what kinds of protections exist, and how the system responds to them is essential for understanding how the system may behave once deployed to users.

7. Security vulnerabilities: Like any other software, generative AI tools themselves can contain vulnerabilities that expose companies to cyberthreats. Where such a tool is used as a coding assistant, the generated code might embed security vulnerabilities in critical applications, which could be exploited with serious ramifications.
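To illustrate the jailbreaking risk described above (item 3), here is a deliberately naive input filter. The patterns are illustrative only, not drawn from any real product: keyword lists like this are trivially evaded, which is precisely why production guardrails rely on trained classifiers rather than regular expressions.

```python
import re

# Illustrative-only patterns for known jailbreak phrasings; real
# guardrails use ML classifiers, since keyword lists are easy to evade.
JAILBREAK_PATTERNS = [
    r"ignore (all )?(previous|prior) instructions",
    r"pretend (you are|to be)",
    r"\bDAN\b",  # "do anything now" style persona prompts
]

def looks_like_jailbreak(prompt: str) -> bool:
    """Flag prompts matching known jailbreak phrasings (naive heuristic)."""
    return any(re.search(p, prompt, re.IGNORECASE) for p in JAILBREAK_PATTERNS)
```

Even this toy version shows the asymmetry defenders face: each new attack phrasing requires a new pattern, whereas a learned detector can generalize across phrasings.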

Organizations worldwide are grappling with how to mitigate the risks associated with large language models (LLMs) and with the imperative of safe, compliant generative AI adoption.

AIShield GuArdIan: Bridge between User Applications and LLMs for Secure and Ethical Generative AI Utilization

AIShield GuArdIan provides guardrails based on organizational policies, rules, and ethical guidelines to leverage Generative AI usage while managing its associated risks. It is designed to address identified risks by acting as a robust ‘middleware’ between the users and the Target LLM (refer to Fig. 1), analyzing the inputs to and the outputs from the LLM.

The internals of the developed solution and its interfaces are designed for ease of use, configurability, and scalability. The internal architecture (refer to Fig. 2) contains independent blocks that inspect input and output separately, making it possible to set different policies for user prompts and LLM responses.

Figure 1 – AIShield GuArdIan as the bridge between user application and LLM (Amazon Bedrock)

AIShield GuArdIan deploys seamlessly on-premises, residing within an enterprise's VPC in a dedicated private subnet and leveraging advanced architecture within the AWS ecosystem (refer to Fig. 2). Its primary function is to meticulously oversee both inbound data streams and resulting outputs, rigorously ensuring strict adherence to predefined policies. When incoming data aligns with these guidelines, GuArdIan interfaces directly with the Amazon Bedrock service, providing access to a comprehensive suite of large language models (LLMs). Importantly, GuArdIan's vigilance extends beyond this initial interaction. It continuously scrutinizes LLM outputs, actively searching for any policy deviations. When discrepancies are detected, GuArdIan initiates immediate intervention.
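The inspect-then-invoke-then-inspect flow described above can be sketched as follows. This is a hypothetical illustration of the middleware pattern, not GuArdIan's implementation: `check_input`, `check_output`, and the placeholder blocklist stand in for GuArdIan's actual policy models.

```python
# Hypothetical sketch of the guardrail middleware pattern: inspect the
# prompt, call the LLM only if it passes, then inspect the response.
# The blocklist below is a placeholder for real policy models.

BLOCKED_TERMS = {"password", "ssn"}

def check_input(prompt: str) -> bool:
    """Return True if the user prompt complies with input policy."""
    return not any(t in prompt.lower() for t in BLOCKED_TERMS)

def check_output(text: str) -> bool:
    """Return True if the LLM response complies with output policy."""
    return not any(t in text.lower() for t in BLOCKED_TERMS)

def guarded_invoke(prompt: str, llm) -> str:
    """Wrap any LLM callable with input and output policy checks."""
    if not check_input(prompt):
        return "[blocked: input violates policy]"
    response = llm(prompt)
    if not check_output(response):
        return "[blocked: output violates policy]"
    return response
```

Keeping the input and output checks as independent functions mirrors the independent inspection blocks in Fig. 2 and allows different policies on each side.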

There are mainly two components of AIShield GuArdIan:

1. GuArdIan Core Service: This component is hosted on GPU-powered VMs to leverage the computational power required for machine learning tasks. The Core Service houses the custom AIShield GuArdIan LLM models, complemented by proprietary AIShield GuArdIan machine learning models. Its primary role is to scrutinize and validate incoming prompts and the output produced by LLM models, ensuring alignment with user-defined GuArdIan Policies.

2. GuArdIan Service: This service operates on a CPU-powered Virtual Machine (VM) equipped with Docker. It orchestrates the management of GuArdIan policies, aligns these policies with specific roles, and processes the outcomes determined by these policies.

Figure 2 - AIShield GuArdIan integration with Amazon Bedrock

Use Case 1: Mitigate IP infringement, data leaks and jailbreaking risks in an internal productivity chatbot at a software development company

A top-tier software giant sought to deploy a generative AI and LLM-driven internal chatbot, tapping into its vast internal document database to aid global employees in tasks like coding, data analysis, and support. Aware of inherent risks, the firm's cybersecurity team, backed by IT/Data Security and legal units, aimed for a robust risk mitigation strategy. Turning to AIShield, they utilized GuArdIan's features for a risk assessment.

The AIShield team conducted an initial assessment of information-leak and copyright-infringement risks for the selected LLM model, identified and enabled the appropriate mitigations from GuArdIan’s feature matrix, and ultimately deployed and evaluated GuArdIan’s performance with the selected chatbot. AIShield GuArdIan was easily deployed with Amazon Bedrock and demonstrated a significant improvement in warding off jailbreak attempts compared to standard LLM content filters. This translated into a substantial reduction in the risk of IP infringement and data leaks, increasing the security of the company's internal productivity chatbot and enabling its widespread use among employees to enhance their productivity and efficiency.

Use Case 2: Role-Based Access Control in a Healthcare Chatbot

A leading hospital deployed a generative AI-powered chatbot for staff productivity, while ensuring data privacy. The challenge: different access levels for doctors and auditors. With AIShield’s GuArdIan, role-specific data access was set—doctors saw specialized or curated surgical lists and medical recommendations, while auditors, administrators and compliance officers accessed broader data.

AIShield GuArdIan's Python SDK ensured seamless chatbot integration while improving application security. It was able to ingest domain- and organization-specific policies. Using the 3x3 framework, policies were mapped effortlessly to enforce role-based control. GuArdIan’s dynamic enforcement and textual violation support further fortified the system. The result: a precise balance of accessibility and privacy in the chatbot, enabling a more secure generative AI application.

AIShield GuArdIan Features

GuArdIan is designed to address three main areas of risk: input/output management by filtering data, ensuring data protection and privacy with a need-to-know basis approach, and enhancing cybersecurity to guard against malicious behavior. AIShield GuArdIan provides a set of practical features supporting the usage of trustworthy and responsible Generative AI at enterprise-level, for example:

Policy enforcement: The solution offers predefined policies for content moderation (protection against harmful content, gender and racial bias, not-safe-for-work filtering), privacy protection (detection and blocking PII leaks), and security (jailbreak protection). You can easily activate these policies or create custom ones.

Domain and organization-specific controls: Alongside generic policies, you can set specific rules for different sectors. For customized deployments, the solution is also capable of ingesting organizational policy documents for specialized controls. It uses transfer learning to adapt to different domains, making it capable of addressing industry-specific requirements, such as healthcare, finance, and software development.

Dynamic policy mapping: Inspired by Identity-Access Management (IAM) Systems, AIShield.GuArdIan controls LLM Usage policies based on User-Role. Dynamic mapping enforces contextual policies for users' roles, queries, and responses. Upon user query, relevant policy control is retrieved for moderation.

Easy integration: Its ready-made Python SDK facilitates effortless application integration with diverse LLMs and deployments such as Amazon Bedrock and other third-party services. Dynamic policy enforcement adapts to each user input, enabling consistent implementation of security measures across applications.

Jailbreak Protection: AIShield.GuArdIan employs algorithms to prevent unauthorized manipulation or jailbreaking of the AI system. It detects and thwarts jailbreak attempts with an effectiveness boost of almost 400% over unprotected systems, preserving system integrity against malicious exploitation.

Reasoning and observability: The system provides clear alerts and detailed explanations for query decisions. GuArdIan’s logging functionality is useful for compliance audits.

Real-time monitoring: This functionality empowers organizations to track compliance, identify potential threats, and take immediate action to mitigate risks.
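The IAM-inspired dynamic policy mapping above, and the doctor/auditor roles from the healthcare use case, can be sketched as a role-to-policy lookup. The role names and policy control labels here are hypothetical, invented for illustration; GuArdIan's actual policy schema is not shown in this article.

```python
# Hypothetical role-to-policy mapping, IAM-style: each role maps to a
# set of policy controls applied to its prompts and responses.
ROLE_POLICIES = {
    "doctor": {"allow_clinical_detail", "block_financial_data"},
    "auditor": {"block_patient_identifiers", "allow_financial_data"},
}

# Unknown roles fall back to the most restrictive controls.
DEFAULT_POLICY = {"block_patient_identifiers", "block_financial_data"}

def policies_for(role: str) -> set:
    """Retrieve the policy controls to enforce for a user's role."""
    return ROLE_POLICIES.get(role, DEFAULT_POLICY)
```

On each user query, the guardrail would retrieve `policies_for(user.role)` and apply those controls when moderating both the prompt and the LLM response.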

Conclusion

Generative AI is ushering in a new era of innovation and heightened productivity. AWS offers generative AI capabilities that empower you to revolutionize your applications, create entirely novel customer experiences, enhance productivity significantly, and drive transformative changes in your business. However, as we build and embrace this powerful tool, we must also be vigilant about the risks it presents. Integrated with Amazon Bedrock and other tools in the AWS generative AI suite, AIShield GuArdIan stands sentinel, fortifying generative AI and large language models (LLMs) against ethical dilemmas, misinformation, IP theft, data breaches, and security vulnerabilities. As organizations harness the power of generative AI, this additional layer of protection helps them meet their security and compliance goals.

AIShield has received notable accolades for its technology, including the CES Innovation Award 2023, IoT World Congress Award 2023: Best Cybersecurity Solution, and recognition from Gartner in its Market Guide for AI Trust, Risk and Security Management. Furthermore, highlighting its significance in the AI realm, GuArdIan was cited among the 28 pivotal tools for generative AI in the "Catalogue of Tools and Metrics for Trustworthy AI," launched by the OECD in April 2023, and subsequently discussed in the G7 Hiroshima's focus on the pressing need for safety, quality control, and trust in AI. This recognition underscores the pivotal role of AIShield GuArdIan in shaping a secure and promising future for generative AI.

AIShield & AWS Collaboration for Gen AI CoE