Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits WH Deng, M Nagireddy, MSA Lee, J Singh, ZS Wu, K Holstein, H Zhu 2022 ACM Conference on Fairness, Accountability, and Transparency, 473-484, 2022 | 52 | 2022 |
A Sandbox Tool to Bias(Stress)-Test Fairness Algorithms NJ Akpinar, M Nagireddy, L Stapleton, HF Cheng, H Zhu, S Wu, H Heidari EAAMO 2022 Poster, 2022 | 8 | 2022 |
SocialStigmaQA: A Benchmark to Uncover Stigma Amplification in Generative Language Models M Nagireddy, L Chiazor, M Singh, I Baldini Proceedings of the 2024 AAAI Conference on Artificial Intelligence, 2023 | 6 | 2023 |
Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations S Achintalwar, AA Garcia, A Anaby-Tavor, I Baldini, SE Berger, ... arXiv preprint arXiv:2403.06009, 2024 | 1 | 2024 |
Influence Based Approaches to Algorithmic Fairness: A Closer Look S Ghosh, P Sattigeri, I Padhi, M Nagireddy, J Chen NeurIPS 2023 Workshop on XAI in Action: Past, Present, and Future Applications, 2023 | 1 | 2023 |
The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers H Mozannar, V Chen, M Alsobay, S Das, S Zhao, D Wei, M Nagireddy, ... arXiv preprint arXiv:2404.02806, 2024 | | 2024 |
Language Models in Dialogue: Conversational Maxims for Human-AI Interactions E Miehling, M Nagireddy, P Sattigeri, EM Daly, D Piorkowski, JT Richards arXiv preprint arXiv:2403.15115, 2024 | | 2024 |
Multi-Level Explanations for Generative Language Models LM Paes, D Wei, HJ Do, H Strobelt, R Luss, A Dhurandhar, M Nagireddy, ... arXiv preprint arXiv:2403.14459, 2024 | | 2024 |
Contextual Moral Value Alignment Through Context-Based Aggregation P Dognin, J Rios, R Luss, I Padhi, MD Riemer, M Liu, P Sattigeri, ... arXiv preprint arXiv:2403.12805, 2024 | | 2024 |
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations S Achintalwar, I Baldini, D Bouneffouf, J Byamugisha, M Chang, P Dognin, ... arXiv preprint arXiv:2403.09704, 2024 | | 2024 |
Simulating Iterative Human-AI Interaction in Programming with LLMs H Mozannar, V Chen, D Wei, P Sattigeri, M Nagireddy, S Das, A Talwalkar, ... NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023 | | 2023 |
Function Composition in Trustworthy Machine Learning: Implementation Choices, Insights, and Questions M Nagireddy, M Singh, SC Hoffman, E Ju, KN Ramamurthy, KR Varshney arXiv preprint arXiv:2302.09190, 2023 | | 2023 |
Prompt Templates: A Methodology for Improving Manual Red Teaming Performance B Dominique, D Piorkowski, M Nagireddy, I Baldini CHI 2024 Workshop, HEAL: Human-centered Evaluation and Auditing of Language … | | |