With the introduction of AI, scientists and users began questioning its ethics. AI systems, for example, turned out to be biased because they cannot be fully neutral. When processing big data, AI-powered search engines prioritize information based on click counts, user preferences, and location. And if you ask where the underlying information came from, the answer is: humans generated it. As a result, AI-based decisions can be inaccurate, discriminatory, and embedded with bias.
AI tools also lack transparency. Artificial intelligence is not always intelligible to humans, and it does not think the way humans do. These facts raise concerns about fairness and risks to fundamental human values.
In addition, AI has safety issues. Regulations such as the GDPR require that users' data remain private. Yet people may feel that LLMs are unreliable and could disclose sensitive information and spread it to the masses.
Therefore, let's explore whether AI can meet three basic principles: being Helpful, Honest, and Harmless. And is Artificial General Intelligence (AGI) possible? Sencury is up for this challenge!
LLM Toxicity through Bias
Artificial intelligence is full of ethical dilemmas. Large language models are trained on human-generated data, and that data can be rather toxic.
Researchers at Princeton University define language models as text-powered tools that capture statistical patterns remarkably well. These models affect people only when applied to downstream tasks, and that is where the broader social context comes in. Let's talk more about it.
What is bias?
Bias can occur in the form of performance disparities: the system's accuracy may differ across demographic groups. When a system's predictions encode associations between target concepts and demographic groups, the result is bias against those groups. If such biased data is fed into a large language model as training data, the model amplifies it with powerful new capabilities. And as adoption of the model's output grows, so does the harm.
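A performance disparity of this kind can be made concrete with a few lines of code. The sketch below computes accuracy separately per demographic group; the group names, predictions, and labels are invented purely for illustration.

```python
# Minimal sketch of measuring a performance disparity across demographic
# groups. All records below are made up for illustration only.

def group_accuracy(records):
    """Return accuracy per group from (group, prediction, label) triples."""
    correct, total = {}, {}
    for group, pred, label in records:
        total[group] = total.get(group, 0) + 1
        correct[group] = correct.get(group, 0) + (pred == label)
    return {g: correct[g] / total[g] for g in total}

records = [
    ("group_a", 1, 1), ("group_a", 0, 0), ("group_a", 1, 1), ("group_a", 1, 1),
    ("group_b", 1, 0), ("group_b", 0, 1), ("group_b", 1, 1), ("group_b", 0, 0),
]

acc = group_accuracy(records)
print(acc)  # group_a scores 1.0, group_b only 0.5 -- a clear disparity
```

If an audit like this shows systematically lower accuracy for one group, the model's predictions are not equally reliable for everyone, which is exactly the kind of harm discussed above.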
To learn more about LLMs and how they work, read our latest article - "Does AI Think?"
What is Toxicity?
The toxicity of a large language model shows in generated text that is rude, disrespectful, or unreasonable. In neural LLMs, this phenomenon is known as neural toxic degeneration.
In a toxic conversation, one party may disengage because of the offensive content, especially when the audience is young and vulnerable, or when the output was unintended but has led to obvious harm.
Toxic generated content may not be intended as disinformation, but it often has that effect. The harm is either unintentional, misleading people as a side effect, or intentional, designed to mislead the crowd.
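One common mitigation is to screen generated text before showing it to users. The sketch below uses a naive blocklist score; this is only an illustration with a made-up word list, since production systems rely on trained classifiers (such as Google's Perspective API) rather than keyword matching.

```python
# Naive sketch of a toxicity screen for generated text. Real systems use
# trained classifiers; this blocklist and threshold are placeholders.

BLOCKLIST = {"idiot", "stupid", "hate"}  # made-up placeholder terms

def toxicity_score(text):
    """Fraction of tokens that appear in the blocklist."""
    tokens = [t.strip(".,!?").lower() for t in text.split()]
    if not tokens:
        return 0.0
    flagged = sum(t in BLOCKLIST for t in tokens)
    return flagged / len(tokens)

def moderate(text, threshold=0.1):
    """Block the text if its toxicity score exceeds the threshold."""
    return "blocked" if toxicity_score(text) > threshold else "allowed"

print(moderate("You are a helpful assistant."))      # allowed
print(moderate("You stupid idiot, I hate this!"))    # blocked
```

The design choice here is a post-generation filter: the model generates freely, and a separate check decides what reaches the user. Fine-tuning, discussed below, instead reduces toxicity at the source.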
What is Helpfulness, Honesty and Harmlessness of AI?
Helpful AI refers to a system's ability to assist users in achieving their goals and solving problems. AI is helpful through its subfields, such as natural language processing (NLP), computer vision, and machine learning (ML).
To be helpful, an AI system should be personalized, i.e., adapt its behavior to each user's individual needs and preferences. It also has to provide contextually relevant information and suggestions to support better human decision-making.
Honesty in AI lies in accurately representing the system's capabilities, limitations, and biases. The system has to be truthful so that users can trust it and not expect harm from it, especially when decision-making is at stake.
Promoting honesty requires transparency, for example through explanations of how AI systems work, how they make decisions and answer questions, and what algorithms they use.
An AI system must, by all means, do no harm to humans or the environment. Harmlessness therefore presupposes no discriminatory decisions against certain groups of people, no physical harm, and no environmentally destructive decisions. AI systems have to be continuously tested and evaluated to exclude bias and any impact on marginalized groups. What's more, AI systems must comply with relevant laws and regulations, and their operators must answer for the consequences of the systems' actions.
So, both bias and toxicity contradict the three Hs of AI: Helpful, Honest, and Harmless. The situation worsens as LLMs grow in size, because the bigger the LLM, the more data it operates on.
To avoid non-compliance with these three principles, LLMs are fine-tuned via a reward model.
Fine-tuning allows organizations to adapt existing AI models to their unique use cases. This way, LLMs produce better results and make decisions faster.
According to OpenAI, the creators of ChatGPT, fine-tuning through the API offers:
Higher quality results than prompting
Ability to train on more examples than a prompt can fit
Token savings via shorter prompts
Lower latency requests
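For chat-model fine-tuning, OpenAI's API expects training examples as JSON Lines: one JSON object per line, each containing a short chat transcript under a "messages" key. The sketch below builds such a file in memory; the dialogue content is invented for illustration.

```python
# Sketch of preparing a JSONL training file for fine-tuning through an API
# such as OpenAI's. The example dialogue is made up for illustration.
import json

examples = [
    {"messages": [
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "How do I reset my password?"},
        {"role": "assistant", "content": "Open Settings > Account > Reset password."},
    ]},
    {"messages": [
        {"role": "system", "content": "You are a concise support assistant."},
        {"role": "user", "content": "Where can I download my invoice?"},
        {"role": "assistant", "content": "Go to Billing > Invoices and click Download."},
    ]},
]

# One JSON object per line ("JSONL"), the format fine-tuning endpoints expect.
jsonl = "\n".join(json.dumps(ex) for ex in examples)
print(jsonl)
```

Because each example is a complete prompt-plus-ideal-answer pair, the fine-tuned model needs much shorter prompts at inference time, which is where the token savings and lower latency in the list above come from.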
How can an AI model be fine-tuned?
One option for fine-tuning an AI model is reinforcement learning from human feedback (RLHF). Reinforcement learning involves finding the optimal action to maximize accumulated rewards given the current state of the language model.
When training a language model this way, the model is the agent, the input prompt is the current state, and human feedback provides the reward. Because evaluating every newly generated text directly through users is too costly, the best solution is a reward model that has learned human preferences and streamlines the process.
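The core of such a reward model can be shown with a toy example. Below, each response is reduced to a single hand-made numeric feature, and a one-parameter reward function is fitted to pairwise human preferences using the Bradley-Terry formulation common in RLHF literature. Real reward models score full texts with a neural network; all numbers here are invented.

```python
# Toy sketch of learning a reward model from pairwise human preferences,
# the core idea behind RLHF. Each response is a single made-up feature
# (e.g., a helpfulness count); real reward models use neural networks.
import math

# Pairs of (feature of preferred response, feature of rejected response),
# as a human labeler might have ranked them. Invented data.
preferences = [(3.0, 1.0), (2.5, 0.5), (4.0, 2.0), (3.5, 1.5)]

w = 0.0    # single reward weight: reward(x) = w * x
lr = 0.1
for _ in range(200):
    for chosen, rejected in preferences:
        # Bradley-Terry: P(chosen is preferred) = sigmoid(r_chosen - r_rejected)
        p = 1.0 / (1.0 + math.exp(-(w * chosen - w * rejected)))
        # Gradient ascent on the log-likelihood of the human preference
        w += lr * (1.0 - p) * (chosen - rejected)

def reward(x):
    return w * x

# The learned reward now ranks human-preferred responses higher.
print(reward(3.0) > reward(1.0))
```

Once trained, this reward function stands in for the human labeler: the language model (the agent) is then optimized to produce outputs that score highly, without asking users to rate every generation.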
3-Hs-Based AI: For Humans or Instead of Humans (AGI)?
Artificial general intelligence (AGI) is a hypothetical type of intelligence close to the one every human possesses. AGI is thought to be able to accomplish any intellectual task that humans (or animals) can perform. It would be autonomous and could surpass human capabilities in economically valuable tasks. Creating AGI is a primary goal of some artificial intelligence research companies.
To achieve AGI, humanity should comply with the three Hs of AI and create systems that are harmless, honest, and helpful. In general, people find AI such as ChatGPT helpful. Concerning honesty, LLMs have difficulty understanding the concept of truth. And to be harmless, AI has to be safe and unbiased. The introduction of GPT-4 raised safety issues, and bias is still present, as it is hard to eliminate at once.
Therefore, AGI may potentially be achieved. According to the Thompson scale, progress toward AGI currently stands at about 55%. How quickly it will be achieved is an open question, but you can always view this scale live here and monitor the overall situation.
With AGI on its way, other questions arise. Will there be a post-intellectual-labor utopia? Will people still need to work, intellectually or physically?
Artificial intelligence aims to automate the tasks humans carry out. Moreover, with AGI comes the idea that human well-being should be decoupled from economic productivity. This tendency is called Post-Labor Economics.
According to Deloitte, post-labor economics claims that human identity and worth should not depend on jobs. To make this possible, there is a need to:
Provide universal basic goods and services
Redistribute technological wealth
Put a price on externalities
Adopt holistic metrics
In addition, society, government, and businesses should be prepared to take this leap. Some believe that AGI will cause people to lose their jobs; others say that AI might create new jobs with new responsibilities and requirements. All in all, this will require rebuilding and redefining traditional sectors.