I would like to say that moral alignment is crucial, yet the 11% of participants who did not feel the moral graph was fair is a huge number! Think about it: that 11% alone could be "all Indian men" or "all Latino men" for any set of questions (because morality is so complex that it will never be "just one issue or question"). We know better than that, however, and the likelihood of such a situation arising (i.e., one in which a single population is targeted solely on the basis of its ethnicity) is extremely low. Still, it does point to the fact that it is difficult to reach moral consensus across populations.
I think we should still make efforts to elicit moral graphs. On the other hand, I do not think it is moral or ethical to appease end users who hold unscientific and potentially immoral or unethical beliefs. This means that AGI companies will have to "steer" certain users toward less extreme views, which can itself be considered unethical (but not immoral), and I am willing to accept them doing this with the most extreme users.
In summary, moral graphs seem to be an intuitive and worthwhile area of research in AI safety: they offer a path toward a cohesive and potentially coherent superset of values grounded in morality and ethics, which is very important for future machine learning models. On the other hand, there are still issues of representativeness, such as the 11% of users who felt that the generated moral graph was not fair. We need to ensure that this 11% does not correspond to a single group along dimensions such as ethnicity, socioeconomic status, or religious belief.
Conversely, we need to recognize that there is a large group of users with unethical and immoral beliefs, and AGI companies cannot "compromise" with or "accommodate" these types of users; whether the end result is to alienate them or steer them toward a more reasoned view, the companies should be OK with that.
Is my view reasonable, or is it itself unethical or immoral? I am willing to change my mind and views as I learn and improve as a human.
Thanks for your comment. Your example isn't a disagreement with mine, since it speaks to the need for nuanced evaluations and inclusive design (both parts of the good principles I talk about).
Notice that my argument is that the moral alignment of LLMs is useless, not that we should not have morally aligned systems. If that 11% is one demographic, and you still push out AI that alienates them, then it's an issue with your evaluation and safety procedures (i.e., it's bad design). Fixing it requires changing your inputs, adding more representation... all components of good system design. That was exactly my point: any problem that moral alignment claims to solve is better solved with better design.
"Conversely, we need to recognize that there is a large group of users with unethical and immoral beliefs, and AGI companies cannot "compromise" or " accommodate" these types of users, and whether the end result is to alienate them or steer them toward a more reasoned view, they should be OK with that."- The so-called AGI companies build platforms. Safety and security should be left to developers. If these AGI companies start meddling here- you only create more errors and lesss powerful solutions. You can ofcourse build some checks in place, but my other argument was that any real harms caused by AGI were caused only b/c we had deeper problems that we needed to fix (AI Misnformation is a problem b/c we don't have the critical thinking to examine inputs). To me those are the issues worth dedicating putting our attention to.
I have mentioned this excellent write-up on moral alignment in my newsletter 'AI For Real' this week. Here's the link: https://aiforreal.substack.com/p/are-you-ai-positive-or-ai-negative
Thanks for sharing
This is pretty thorough Devansh, well done.