7 Comments
Logan Thorneloe:

Reading back through this again and have some thoughts:

Can you elaborate more on this: "just don’t give it the capability to do that to begin with"? I'm guessing this is a simplified way of saying: if you don't want it to be able to expose a certain piece of data, don't train it on that piece of data. Just making sure I understand.

I find alignment super interesting because I don't think anyone (at the commercial scale at least) has done it well yet. You have examples like this where privacy is clearly an issue. There are also examples like Bard where it doesn't seem to leak private info, but it also hinders the core functionality because it constantly tells me it doesn't have access to things of mine it should have access to. As you kind of touched on toward the end, it makes me wonder how alignment-as-a-service will play out. Will it be helpful? Will it work? Will it scale?

The memorization definition makes me realize just how much we don't understand about evaluating LLMs. I feel like there should be a better method of quantifying how much training data can be extracted, but I can't think of one myself.
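
One naive baseline would be to count what fraction of model generations reproduce a verbatim n-gram from the training corpus. The following is a toy Python sketch, not the paper's actual pipeline: the function name memorization_rate, the word-level tokenization, and the sample data are all made up for illustration.

```python
# Toy sketch: fraction of generations containing a verbatim n-gram
# that also appears somewhere in the training corpus. Word-level
# tokenization and tiny in-memory data keep the example runnable;
# a real pipeline would work on tokens with a suffix array instead.

def ngrams(tokens, n):
    """Yield every contiguous window of n tokens."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])

def memorization_rate(corpus_docs, generations, n=50):
    """Fraction of generations with at least one verbatim n-gram hit."""
    corpus_ngrams = set()
    for doc in corpus_docs:
        corpus_ngrams.update(ngrams(doc.split(), n))
    hits = sum(
        any(g in corpus_ngrams for g in ngrams(gen.split(), n))
        for gen in generations
    )
    return hits / len(generations) if generations else 0.0

# With n=3, the second generation reproduces a corpus span verbatim.
corpus = ["the quick brown fox jumps over the lazy dog"]
gens = ["a cat sat on the mat", "she saw the quick brown fox run"]
print(memorization_rate(corpus, gens, n=3))  # 0.5
```

If I remember the paper correctly, the real attack checks for verbatim 50-token matches at the token level using a suffix array over terabytes of text, since an in-memory set of n-grams obviously doesn't scale to that.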

Milan Reichl:

The DeepMind research on extracting training data from ChatGPT is a real eye-opener. It challenges our assumptions about AI security and the effectiveness of alignment in preventing data leaks. This discovery is a reminder that in AI development, sometimes the simplest solution—limiting certain capabilities from the start—is more effective than trying to fine-tune our way out of potential problems. The nuances in AI behavior, especially in large language models, underscore the importance of a cautious and critical approach in AI research and application.

miro:

ChatGPT reply if I have ever seen one

Devansh:

Lolz. I get comments like this from time to time. I wonder why this happens.

Devansh:

Yep

Vaibhav:

Doing lots of trial and error to break into a piece of software. Why does this sound familiar?

That’s traditional hacking.

The problem with this piece of software is we do not know how it works.

Patch it too much and it might lose some of its previous magical abilities.

[Comment deleted, Apr 2, 2024]

Devansh:

That's an interesting viewpoint. While I do agree with some of your sentiments, I would urge you to look at the Open Source Movement as a counterpoint to "We should really regulate the amount of material that gets published and channel funds to fewer, larger entities. In return, these entities would hire more researchers and scientists to do guided work." OSS has been a key driver of growth and momentum within both AI and Tech, since it enables everyone to contribute in more diverse ways. This is not something that centralized organizations can replicate.

Two reads that I would love to get your thoughts on:

https://artificialintelligencemadesimple.substack.com/p/unpacking-the-financial-incentives

https://codinginterviewsmadesimple.substack.com/p/what-googles-leaked-letter-tells

I do think we need to get better at designing incentives for publications. The focus has to be on quality, not quantity. You're very correct that "publish or perish" is a huge problem. Will likely do a post on fixing these broken incentives and the replication crisis.
