# Start by installing the semanticscholar package
pip install -qU semanticscholar
from langchain_classic import hub
from langchain_classic.agents import AgentExecutor, create_openai_functions_agent
from langchain_openai import ChatOpenAI
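# System instructions for the agent
instructions = """You are an expert researcher."""
# Pull the shared functions-agent prompt template from the LangChain Hub
base_prompt = hub.pull("langchain-ai/openai-functions-template")
# Pre-fill the template's `instructions` variable
prompt = base_prompt.partial(instructions=instructions)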
# temperature=0 keeps sampling as deterministic as possible; OPENAI_API_KEY must be set
llm = ChatOpenAI(temperature=0)
from langchain_community.tools.semanticscholar.tool import SemanticScholarQueryRun
# The agent calls this search tool under the name `semanticscholar`
tools = [SemanticScholarQueryRun()]
agent = create_openai_functions_agent(llm, tools, prompt)
agent_executor = AgentExecutor(
    agent=agent,
    tools=tools,
    verbose=True,  # print each tool call and its result while the agent runs
)
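The executor alternates between asking the model what to do next and running the tool it picked, until the model returns a final answer. The following is a minimal, dependency-free sketch of that loop, not LangChain's implementation. Every name in it (`ToolCall`, `FinalAnswer`, `run_agent_loop`, `scripted_planner`, `fake_search`) is hypothetical, and the "model" is a scripted stand-in so that the example runs without an API key or network access.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple, Union


@dataclass
class ToolCall:
    """Hypothetical stand-in: the model asks for a tool to be run."""
    tool: str
    tool_input: str


@dataclass
class FinalAnswer:
    """Hypothetical stand-in: the model is done and returns text."""
    output: str


# One finished step: the tool call plus the observation the tool returned.
Step = Tuple[ToolCall, str]
Planner = Callable[[str, List[Step]], Union[ToolCall, FinalAnswer]]


def run_agent_loop(
    planner: Planner,
    tools: Dict[str, Callable[[str], str]],
    user_input: str,
    max_iterations: int = 5,
) -> Dict[str, str]:
    """Call the planner, run the requested tool, and repeat until it answers."""
    steps: List[Step] = []
    for _ in range(max_iterations):
        decision = planner(user_input, steps)
        if isinstance(decision, FinalAnswer):
            return {"input": user_input, "output": decision.output}
        observation = tools[decision.tool](decision.tool_input)
        steps.append((decision, observation))
    return {"input": user_input, "output": "Stopped: iteration limit reached."}


def scripted_planner(user_input: str, steps: List[Step]) -> Union[ToolCall, FinalAnswer]:
    """Fake model (assumption): issue two fixed searches, then answer."""
    queries = [
        "biases in large language models",
        "mitigating biases in large language models",
    ]
    if len(steps) < len(queries):
        return ToolCall("semanticscholar", queries[len(steps)])
    return FinalAnswer(f"Answer based on {len(steps)} searches.")


def fake_search(query: str) -> str:
    """Stub tool (assumption): returns a placeholder instead of real papers."""
    return f"(stub results for: {query})"


result = run_agent_loop(
    scripted_planner, {"semanticscholar": fake_search}, "What are some biases?"
)
print(result["output"])
```

The returned dictionary mirrors the `input`/`output` shape that `agent_executor.invoke` returns in this guide; the iteration cap is there so that a planner that never answers cannot loop forever.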
agent_executor.invoke(
    {
        "input": "What are some biases in the large language models? How have people tried to mitigate them? "
        "show me a list of papers and techniques. Based on your findings write new research questions "
        "to work on. Break down the task into subtasks for search. Use the search tool"
    }
)
> Entering new AgentExecutor chain...
Invoking: `semanticscholar` with `{'query': 'biases in large language models'}`
Published year: 2023
Title: Biases in Large Language Models: Origins, Inventory, and Discussion
Authors: Roberto Navigli, Simone Conia, Björn Ross
Astract: In this article, we introduce and discuss the pervasive issue of bias in the large language models that are currently at the core of mainstream approaches to Natural Language Processing (NLP). We first introduce data selection bias, that is, the bias caused by the choice of texts that make up a training corpus. Then, we survey the different types of social bias evidenced in the text generated by language models trained on such corpora, ranging from gender to age, from sexual orientation to ethnicity, and from religion to culture. We conclude with directions focused on measuring, reducing, and tackling the aforementioned types of bias.
Published year: 2023
Title: Surfacing Biases in Large Language Models using Contrastive Input Decoding
Authors: G. Yona, Or Honovich, Itay Laish, Roee Aharoni
Astract: Ensuring that large language models (LMs) are fair, robust and useful requires an understanding of how different modifications to their inputs impact the model's behaviour. In the context of open-text generation tasks, however, such an evaluation is not trivial. For example, when introducing a model with an input text and a perturbed,"contrastive"version of it, meaningful differences in the next-token predictions may not be revealed with standard decoding strategies. With this motivation in mind, we propose Contrastive Input Decoding (CID): a decoding algorithm to generate text given two inputs, where the generated text is likely given one input but unlikely given the other. In this way, the contrastive generations can highlight potentially subtle differences in how the LM output differs for the two inputs in a simple and interpretable manner. We use CID to highlight context-specific biases that are hard to detect with standard decoding strategies and quantify the effect of different input perturbations.
Published year: 2023
Title: Benchmarking Cognitive Biases in Large Language Models as Evaluators
Authors: Ryan Koo, Minhwa Lee, Vipul Raheja, Jong Inn Park, Zae Myung Kim, Dongyeop Kang
Astract: Large Language Models (LLMs) have recently been shown to be effective as automatic evaluators with simple prompting and in-context learning. In this work, we assemble 15 LLMs of four different size ranges and evaluate their output responses by preference ranking from the other LLMs as evaluators, such as System Star is better than System Square. We then evaluate the quality of ranking outputs introducing the Cognitive Bias Benchmark for LLMs as Evaluators (CoBBLEr), a benchmark to measure six different cognitive biases in LLM evaluation outputs, such as the Egocentric bias where a model prefers to rank its own outputs highly in evaluation. We find that LLMs are biased text quality evaluators, exhibiting strong indications on our bias benchmark (average of 40% of comparisons across all models) within each of their evaluations that question their robustness as evaluators. Furthermore, we examine the correlation between human and machine preferences and calculate the average Rank-Biased Overlap (RBO) score to be 49.6%, indicating that machine preferences are misaligned with humans. According to our findings, LLMs may still be unable to be utilized for automatic annotation aligned with human preferences. Our project page is at: https://minnesotanlp.github.io/cobbler.
Published year: 2023
Title: Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models
Authors: Emilio Ferrara
Astract: As generative language models, exemplified by ChatGPT, continue to advance in their capabilities, the spotlight on biases inherent in these models intensifies. This paper delves into the distinctive challenges and risks associated with biases specifically in large-scale language models. We explore the origins of biases, stemming from factors such as training data, model specifications
Invoking: `semanticscholar` with `{'query': 'mitigating biases in large language models'}`
Published year: 2023
Title: Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models
Authors: Emilio Ferrara
Astract: As generative language models, exemplified by ChatGPT, continue to advance in their capabilities, the spotlight on biases inherent in these models intensifies. This paper delves into the distinctive challenges and risks associated with biases specifically in large-scale language models. We explore the origins of biases, stemming from factors such as training data, model specifications, algorithmic constraints, product design, and policy decisions. Our examination extends to the ethical implications arising from the unintended consequences of biased model outputs. In addition, we analyze the intricacies of mitigating biases, acknowledging the inevitable persistence of some biases, and consider the consequences of deploying these models across diverse applications, including virtual assistants, content generation, and chatbots. Finally, we provide an overview of current approaches for identifying, quantifying, and mitigating biases in language models, underscoring the need for a collaborative, multidisciplinary effort to craft AI systems that embody equity, transparency, and responsibility. This article aims to catalyze a thoughtful discourse within the AI community, prompting researchers and developers to consider the unique role of biases in the domain of generative language models and the ongoing quest for ethical AI.
Published year: 2021
Title: Towards Understanding and Mitigating Social Biases in Language Models
Authors: P. Liang, Chiyu Wu, Louis-Philippe Morency, R. Salakhutdinov
Astract: As machine learning methods are deployed in real-world settings such as healthcare, legal systems, and social science, it is crucial to recognize how they shape social biases and stereotypes in these sensitive decision-making processes. Among such real-world deployments are large-scale pretrained language models (LMs) that can be potentially dangerous in manifesting undesirable representational biases - harmful biases resulting from stereotyping that propagate negative generalizations involving gender, race, religion, and other social constructs. As a step towards improving the fairness of LMs, we carefully define several sources of representational biases before proposing new benchmarks and metrics to measure them. With these tools, we propose steps towards mitigating social biases during text generation. Our empirical results and human evaluation demonstrate effectiveness in mitigating bias while retaining crucial contextual information for high-fidelity text generation, thereby pushing forward the performance-fairness Pareto frontier.
Published year: 2023
Title: In-Contextual Bias Suppression for Large Language Models
Authors: Daisuke Oba, Masahiro Kaneko, D. Bollegala
Astract: Despite their impressive performance in a wide range of NLP tasks, Large Language Models (LLMs) have been reported to encode worrying-levels of gender bias. Prior work has proposed debiasing methods that require human labelled examples, data augmentation and fine-tuning of the LLMs, which are computationally costly. Moreover, one might not even have access to the internal parameters for performing debiasing such as in the case of commercially available LLMs such as GPT-4. To address this challenge we propose bias suppression, a novel alternative to debiasing that does not require access to model parameters. We show that text-based preambles, generated from manually designed templates covering counterfactual statements, can accurately suppress gender biases in LLMs. Moreover, we find that descriptive sentences for occupations can further suppress gender biases. Interestingly, we find that bias suppression has a minimal adverse effect on downstream task performance, while effectively mitigating the gender biases.
Published year: 2023
Title: The Knowledge Alignment Problem: Bridging Human and External Knowledge for Large
Based on my findings, here are some papers and techniques related to mitigating biases in large language models:
1. "Biases in Large Language Models: Origins, Inventory, and Discussion" by Roberto Navigli, Simone Conia, Björn Ross: This paper discusses the issue of bias in large language models, including data selection bias and various types of social bias. It explores directions for measuring, reducing, and tackling bias in language models.
2. "Surfacing Biases in Large Language Models using Contrastive Input Decoding" by G. Yona, Or Honovich, Itay Laish, Roee Aharoni: This paper proposes a decoding algorithm called Contrastive Input Decoding (CID) to highlight context-specific biases in language models. It aims to reveal meaningful differences in model behavior when given different inputs.
3. "Benchmarking Cognitive Biases in Large Language Models as Evaluators" by Ryan Koo, Minhwa Lee, Vipul Raheja, Jong Inn Park, Zae Myung Kim, Dongyeop Kang: This work evaluates the biases in large language models used as automatic evaluators. It introduces the Cognitive Bias Benchmark for LLMs as Evaluators (CoBBLEr) to measure different cognitive biases in model evaluation outputs.
4. "Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models" by Emilio Ferrara: This paper explores the challenges and risks associated with biases in large-scale language models. It discusses the origins of biases and the ethical implications of biased model outputs. It also provides an overview of current approaches for identifying, quantifying, and mitigating biases in language models.
5. "Towards Understanding and Mitigating Social Biases in Language Models" by P. Liang, Chiyu Wu, Louis-Philippe Morency, R. Salakhutdinov: This work focuses on mitigating social biases in language models. It proposes new benchmarks and metrics to measure representational biases and suggests steps towards mitigating biases during text generation.
6. "In-Contextual Bias Suppression for Large Language Models" by Daisuke Oba, Masahiro Kaneko, D. Bollegala: This paper presents a novel approach called bias suppression to mitigate gender biases in language models. It uses text-based preambles and descriptive sentences to suppress biases without requiring access to model parameters.
Based on these papers, here are some research questions to work on:
1. How can we further improve the effectiveness of bias suppression techniques in large language models?
2. What are the long-term effects of biases in language models on downstream applications and user experiences?
3. How can we develop more comprehensive benchmarks and metrics to measure and evaluate biases in language models?
4. What are the ethical considerations and trade-offs involved in mitigating biases in language models?
5. How can we ensure transparency and accountability in the deployment of language models to minimize biases?
6. What are the potential biases introduced by fine-tuning language models on specific domains or datasets, and how can we address them?
To break down the task into subtasks for further search, you can focus on the following topics:
1. Techniques for measuring and quantifying biases in large language models.
2. Approaches for mitigating biases during text generation in language models.
3. Evaluation methods and benchmarks for assessing biases in language models.
4. Ethical considerations and implications of biases in language models.
5. Impact of biases in language models on downstream applications and user experiences.
6. Bias suppression techniques that do not require access to model parameters.
Using the search tool, you can explore these subtopics and find more specific papers and techniques related to each subtask.
> Finished chain.
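The tool results in the log above are plain text with one field per line (`Published year:`, `Title:`, `Authors:`, and the abstract, which the tool prints with the label `Astract:`). If you want structured records instead, a small parser is enough. The sketch below assumes exactly that line layout as seen in the log; `FIELD_PREFIXES` and `parse_observation` are hypothetical helpers, not part of LangChain, and the sample text is shortened from the output above.

```python
from typing import Dict, List, Optional

# Line prefixes as they appear in the log; "Astract" is the tool's own spelling.
FIELD_PREFIXES = {
    "Published year: ": "year",
    "Title: ": "title",
    "Authors: ": "authors",
    "Astract: ": "abstract",
}


def parse_observation(text: str) -> List[Dict[str, str]]:
    """Group consecutive field lines into one dict per paper.

    A "Published year:" line starts a new record. Lines without a known
    prefix are skipped, so a truncated record simply has fewer keys.
    """
    papers: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for line in text.splitlines():
        for prefix, key in FIELD_PREFIXES.items():
            if line.startswith(prefix):
                if key == "year":
                    current = {}
                    papers.append(current)
                if current is not None:
                    current[key] = line[len(prefix):].strip()
                break
    return papers


sample = """Published year: 2023
Title: Biases in Large Language Models: Origins, Inventory, and Discussion
Authors: Roberto Navigli, Simone Conia, Björn Ross
Astract: In this article, we introduce and discuss the pervasive issue of bias in the large language models that are currently at the core of mainstream approaches to Natural Language Processing (NLP).
Published year: 2021
Title: Towards Understanding and Mitigating Social Biases in Language Models
Authors: P. Liang, Chiyu Wu, Louis-Philippe Morency, R. Salakhutdinov
"""

papers = parse_observation(sample)
for paper in papers:
    print(paper["year"], "|", paper["title"])
```

Because the tool output can be cut off mid-record, as it is twice in the log above, the parser never assumes that all four fields are present.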
{'input': 'What are some biases in the large language models? How have people tried to mitigate them? show me a list of papers and techniques. Based on your findings write new research questions to work on. Break down the task into subtasks for search. Use the search tool',
 'output': 'Based on my findings, here are some papers and techniques related to mitigating biases in large language models:\n\n1. "Biases in Large Language Models: Origins, Inventory, and Discussion" by Roberto Navigli, Simone Conia, Björn Ross: This paper discusses the issue of bias in large language models, including data selection bias and various types of social bias. It explores directions for measuring, reducing, and tackling bias in language models.\n\n2. "Surfacing Biases in Large Language Models using Contrastive Input Decoding" by G. Yona, Or Honovich, Itay Laish, Roee Aharoni: This paper proposes a decoding algorithm called Contrastive Input Decoding (CID) to highlight context-specific biases in language models. It aims to reveal meaningful differences in model behavior when given different inputs.\n\n3. "Benchmarking Cognitive Biases in Large Language Models as Evaluators" by Ryan Koo, Minhwa Lee, Vipul Raheja, Jong Inn Park, Zae Myung Kim, Dongyeop Kang: This work evaluates the biases in large language models used as automatic evaluators. It introduces the Cognitive Bias Benchmark for LLMs as Evaluators (CoBBLEr) to measure different cognitive biases in model evaluation outputs.\n\n4. "Should ChatGPT be Biased? Challenges and Risks of Bias in Large Language Models" by Emilio Ferrara: This paper explores the challenges and risks associated with biases in large-scale language models. It discusses the origins of biases and the ethical implications of biased model outputs. It also provides an overview of current approaches for identifying, quantifying, and mitigating biases in language models.\n\n5. "Towards Understanding and Mitigating Social Biases in Language Models" by P. Liang, Chiyu Wu, Louis-Philippe Morency, R. Salakhutdinov: This work focuses on mitigating social biases in language models. It proposes new benchmarks and metrics to measure representational biases and suggests steps towards mitigating biases during text generation.\n\n6. "In-Contextual Bias Suppression for Large Language Models" by Daisuke Oba, Masahiro Kaneko, D. Bollegala: This paper presents a novel approach called bias suppression to mitigate gender biases in language models. It uses text-based preambles and descriptive sentences to suppress biases without requiring access to model parameters.\n\nBased on these papers, here are some research questions to work on:\n\n1. How can we further improve the effectiveness of bias suppression techniques in large language models?\n2. What are the long-term effects of biases in language models on downstream applications and user experiences?\n3. How can we develop more comprehensive benchmarks and metrics to measure and evaluate biases in language models?\n4. What are the ethical considerations and trade-offs involved in mitigating biases in language models?\n5. How can we ensure transparency and accountability in the deployment of language models to minimize biases?\n6. What are the potential biases introduced by fine-tuning language models on specific domains or datasets, and how can we address them?\n\nTo break down the task into subtasks for further search, you can focus on the following topics:\n\n1. Techniques for measuring and quantifying biases in large language models.\n2. Approaches for mitigating biases during text generation in language models.\n3. Evaluation methods and benchmarks for assessing biases in language models.\n4. Ethical considerations and implications of biases in language models.\n5. Impact of biases in language models on downstream applications and user experiences.\n6. Bias suppression techniques that do not require access to model parameters.\n\nUsing the search tool, you can explore these subtopics and find more specific papers and techniques related to each subtask.'}
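The returned dictionary holds the whole answer as one string under `output`, with each numbered list introduced by a line that ends in a colon. To reuse the research questions or search subtasks programmatically, you can split that string into its lists. This is a sketch under the assumption that the answer keeps the layout shown above; `split_numbered_sections` is a hypothetical helper, and a model is free to format its answer differently on another run. The sample string is shortened from the output above.

```python
import re
from typing import Dict, List


def split_numbered_sections(output: str) -> Dict[str, List[str]]:
    """Map each heading line (ending in ':') to the numbered items below it."""
    sections: Dict[str, List[str]] = {}
    heading = None
    for raw_line in output.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        item = re.match(r"^\d+\.\s+(.*)$", line)
        if item:
            # Numbered items that appear before any heading are ignored.
            if heading is not None:
                sections[heading].append(item.group(1))
        elif line.endswith(":"):
            heading = line
            sections[heading] = []
    return sections


sample_output = (
    "Based on these papers, here are some research questions to work on:\n\n"
    "1. How can we further improve the effectiveness of bias suppression "
    "techniques in large language models?\n"
    "2. What are the long-term effects of biases in language models on "
    "downstream applications and user experiences?\n\n"
    "To break down the task into subtasks for further search, you can focus "
    "on the following topics:\n\n"
    "1. Techniques for measuring and quantifying biases in large language models.\n"
    "2. Approaches for mitigating biases during text generation in language models.\n\n"
    "Using the search tool, you can explore these subtopics and find more "
    "specific papers and techniques related to each subtask."
)

sections = split_numbered_sections(sample_output)
for heading, items in sections.items():
    print(heading)
    for entry in items:
        print("  -", entry)
```

With the real result you would pass the `output` value of the dictionary returned by `agent_executor.invoke(...)` instead of `sample_output`; each subtask string could then be sent back to the agent as a follow-up query.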