7 Comments
Logan Thorneloe:

Reading back through this again and have some thoughts:

Can you elaborate more on this: "just don’t give it the capability to do that to begin with"? I'm guessing this is a simplified way of saying: if you don't want it to be able to expose a certain piece of data, don't train it on that piece of data. Just making sure I understand.

I find alignment super interesting because I don't think anyone (at the commercial scale at least) has done it well yet. You have examples like this where privacy is clearly an issue. There are also examples like Bard where it doesn't seem to leak private info, but it also hinders the core functionality because it constantly tells me it doesn't have access to things of mine it should have access to. As you kind of touched on toward the end, it makes me wonder how alignment-as-a-service will play out. Will it be helpful? Will it work? Will it scale?

The memorization definition makes me realize just how much we don't understand about evaluating LLMs. I feel like there should be a better method of quantifying how much training data can be extracted, but I can't think of one myself.
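
One naive baseline would be to count what fraction of model generations reproduce a verbatim n-gram from the training corpus. The following is a toy Python sketch, not the paper's actual pipeline: the function name memorization_rate, the word-level tokenization, and the sample data are all made up for illustration.

```python
# Toy sketch: fraction of generations containing a verbatim n-gram
# that also appears somewhere in the training corpus. Word-level
# tokenization and tiny in-memory data keep the example runnable;
# a real pipeline would work on tokens with a suffix array instead.

def ngrams(tokens, n):
    """Yield every contiguous window of n tokens."""
    for i in range(len(tokens) - n + 1):
        yield tuple(tokens[i:i + n])

def memorization_rate(corpus_docs, generations, n=50):
    """Fraction of generations with at least one verbatim n-gram hit."""
    corpus_ngrams = set()
    for doc in corpus_docs:
        corpus_ngrams.update(ngrams(doc.split(), n))
    hits = sum(
        any(g in corpus_ngrams for g in ngrams(gen.split(), n))
        for gen in generations
    )
    return hits / len(generations) if generations else 0.0

# With n=3, the second generation reproduces a corpus span verbatim.
corpus = ["the quick brown fox jumps over the lazy dog"]
gens = ["a cat sat on the mat", "she saw the quick brown fox run"]
print(memorization_rate(corpus, gens, n=3))  # 0.5
```

If I remember the paper correctly, the real attack checks for verbatim 50-token matches at the token level using a suffix array over terabytes of text, since an in-memory set of n-grams obviously doesn't scale to that.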

Milan Reichl:

The DeepMind research on extracting training data from ChatGPT is a real eye-opener. It challenges our assumptions about AI security and the effectiveness of alignment in preventing data leaks. This discovery is a reminder that in AI development, sometimes the simplest solution—limiting certain capabilities from the start—is more effective than trying to fine-tune our way out of potential problems. The nuances in AI behavior, especially in large language models, underscore the importance of a cautious and critical approach in AI research and application.

miro:

ChatGPT reply if I have ever seen one

Devansh:

Lolz. I get comments like this from time to time. I wonder why this happens.

Devansh:

Yep

Vaibhav:

Doing lots of trial and error to break into a piece of software. Why does this sound familiar?

That’s traditional hacking.

The problem with this piece of software is we do not know how it works.

Patch it too much and it might lose some of its previous magical abilities.

[Comment deleted, Apr 2, 2024]

Devansh:

That's an interesting viewpoint. While I do agree with some of your sentiments, I would urge you to look at the Open Source Movement as a counterpoint to "We should really regulate the amount of material that gets published and channel funds to fewer, larger entities. In return, these entities would hire more researchers and scientists to do guided work." OSS has been a key driver of growth and momentum within both AI and Tech, since it enables everyone to contribute in more diverse ways. This is not something that centralized organizations can replicate.

Two reads that I would love to get your thoughts on:

https://artificialintelligencemadesimple.substack.com/p/unpacking-the-financial-incentives

https://codinginterviewsmadesimple.substack.com/p/what-googles-leaked-letter-tells

I do think we need to get better at designing incentives for publications. The focus has to be on quality, not quantity. You're very correct that "publish or perish" is a huge problem. Will likely do a post on fixing these broken incentives and the replication crisis.
