Aug 1, 2023

Source code is the most common sensitive data shared to ChatGPT

Source code accounted for the largest share of sensitive data exposed to ChatGPT, according to the latest Netskope Threat Labs report. To address the risk, the vendor argues that a generative AI (genAI) ban is only a short-term solution; instead, organizations should set up controls and policies to safeguard their data and secure the use of artificial intelligence (AI) tools.

The research found that the use of genAI in enterprises is surging, with organizations of 10,000 users or more accessing at least five genAI apps daily. ChatGPT is leading the pack, with more than eight times as many daily active users as any other genAI app.

These findings come from Netskope’s Cloud & Threat Report: AI Apps in the Enterprise. The data is derived from anonymized telemetry, collected with prior authorization from a subset of the secure access service edge (SASE) vendor’s customers, which includes thousands of organizations and millions of users across a wide array of industries globally, Netskope Threat Labs director Ray Canzanese told SDxCentral in an email.

Netskope’s report also revealed that sensitive data is shared with genAI applications at a troubling frequency within large enterprises: approximately 183 incidents of sensitive data are posted to these apps per 10,000 enterprise users each month.

Keep tabs on your source code

Source code is the primary data type posted to ChatGPT, with 22 out of every 10,000 enterprise users posting source code to ChatGPT each month. Those users are responsible for an average of 158 posts containing source code per month.

“Source code appearing on the list was not surprising to our team since inspecting source code for security flaws or coding mistakes is a well-publicized feature of ChatGPT,” Canzanese noted. “However, the magnitude of the problem was surprising — that there were so many users doing it, and frequently.”

GenAI tools can be used to review and explain code and to find bugs and security vulnerabilities in source code. However, researchers warned in the report, sharing confidential source code with ChatGPT can introduce security risks, including potential data breaches, accidental data disclosure, and legal and regulatory exposure. For example, passwords and keys embedded in source code are sometimes exposed to the genAI tools.
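To illustrate that exposure path, here is a minimal sketch in Python of the kind of pre-submission secret scan an organization might run on code before it is pasted into a chatbot. The rule set and names are hypothetical and purely illustrative; production secret scanners ship hundreds of patterns plus entropy checks.

    import re

    # Hypothetical rule set for credentials commonly embedded in source code.
    SECRET_PATTERNS = {
        "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
        "private_key": re.compile(r"-----BEGIN (RSA |EC )?PRIVATE KEY-----"),
        "hardcoded_password": re.compile(
            r"password\s*=\s*['\"][^'\"]+['\"]", re.IGNORECASE
        ),
    }

    def find_secrets(text):
        """Return (rule_name, match) pairs for anything resembling a credential."""
        hits = []
        for name, pattern in SECRET_PATTERNS.items():
            for match in pattern.finditer(text):
                hits.append((name, match.group(0)))
        return hits

    snippet = 'db_password = "hunter2"  # TODO: move to a vault'
    for rule, match in find_secrets(snippet):
        print("Would block post:", rule, "->", match)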

“One important part of addressing this is training and coaching users to avoid posting any internal or proprietary code to ChatGPT,” Canzanese said. “On the technology side, modern DLP [data loss prevention] solutions are very effective at detecting source code. So, setting a policy to detect and block users from uploading source code to ChatGPT can be very effective.”
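As a rough illustration of the detection side Canzanese describes, the following sketch shows the kind of heuristic a DLP rule might apply to flag source code in a prompt before it leaves the network. The signals and threshold here are assumptions for demonstration only; commercial DLP engines use trained classifiers rather than a handful of regexes.

    import re

    # Hypothetical signals that a block of text is source code.
    CODE_SIGNALS = [
        re.compile(r"\bdef \w+\s*\("),                       # Python function defs
        re.compile(r"\b(public|private)\s+\w+\s+\w+\s*\("),  # Java/C# methods
        re.compile(r"#include\s*<\w+"),                      # C/C++ includes
        re.compile(r"^\s*import\s+[\w.]+", re.MULTILINE),    # import statements
    ]

    def looks_like_source_code(text, threshold=2):
        """Treat text as probable source code when enough signals match."""
        score = sum(1 for pattern in CODE_SIGNALS if pattern.search(text))
        return score >= threshold

    prompt = "import os\ndef cleanup(path):\n    os.remove(path)"
    if looks_like_source_code(prompt):
        print("Policy action: block the upload and coach the user.")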

Following source code, regulated data, such as financial data, healthcare information and personally identifiable information, is the next most common form of sensitive data shared with genAI tools, at 18 incidents per 10,000 enterprise users per month.

Take these steps instead of blocking access to ChatGPT

Netskope’s report showed that nearly 20% of organizations in financial services and healthcare — both highly regulated industries — have implemented a blanket ban on employee use of ChatGPT, but only about 5% of technology companies have done so.

The vendor argues that blocking access is only a short-term solution. “While a common practice when ChatGPT was first gaining popularity was for enterprises to block the chatbot altogether, organizations have since come to terms with the fact that ChatGPT and other AI apps can offer advantages to the organization, including enhancing operations, improving customer experiences, and facilitating data-driven decision-making,” researchers wrote in the report.

“Organizations should focus on evolving their workforce awareness and data policies to meet the needs of employees using AI products productively. There is a good path to safe enablement of generative AI with the right tools and the right mindset,” Netskope deputy CISO James Robinson said in a statement.

The report’s recommended measures include domain filtering, URL filtering and content inspection:

Regularly review genAI app activity, trends and behaviors to identify risks.

Start with only allowing reputable apps currently in use while blocking all others.

Use data loss prevention (DLP) technologies to detect posts containing potentially sensitive information, including the data types cited in the report: source code, regulated data, passwords and keys, and intellectual property.

Employ user coaching to remind users of company policy surrounding the use of genAI apps.

Block opportunistic attackers by blocking known malicious domains and URLs and by inspecting all HTTP and HTTPS content (a minimal policy sketch follows this list).
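The sketch below illustrates the allow-list-plus-blocklist decision the domain- and URL-filtering recommendations describe. The domain lists are hypothetical placeholders; a real SASE gateway would populate them from threat intelligence feeds and its app catalog.

    from urllib.parse import urlparse

    # Hypothetical policy lists for illustration only.
    ALLOWED_GENAI_DOMAINS = {"chat.openai.com"}   # vetted apps already in use
    BLOCKED_DOMAINS = {"genai-phish.example"}     # known malicious lookalikes

    def policy_decision(url):
        """Decide what to do with an outbound request to a genAI app."""
        host = urlparse(url).hostname or ""
        if host in BLOCKED_DOMAINS:
            return "block"            # known bad: drop and alert
        if host in ALLOWED_GENAI_DOMAINS:
            return "allow+inspect"    # permit, but run DLP content inspection
        return "block+coach"          # unknown genAI app: block, show coaching page

    for url in ("https://chat.openai.com/c/abc",
                "https://genai-phish.example/login",
                "https://new-genai-app.example/chat"):
        print(url, "->", policy_decision(url))

Ordering the checks blocklist-first ensures a known-bad domain is never accidentally allowed, while defaulting unknown apps to block-and-coach matches the report’s advice to start by allowing only reputable apps currently in use.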

“ChatGPT and AI apps pose a security risk to organizations, but the risk is manageable. While blocking the apps mitigates much of the risk, there are fortunately more nuanced technology solutions that include DLP and user coaching that can help organizations reap the benefits of the AI apps while managing the risks,” said Canzanese.


 

