Manish Nagireddy
IBM Research AI, MIT-IBM Watson AI Lab
Verified email at ibm.com
Title
Cited by
Year
Exploring How Machine Learning Practitioners (Try To) Use Fairness Toolkits
WH Deng, M Nagireddy, MSA Lee, J Singh, ZS Wu, K Holstein, H Zhu
2022 ACM Conference on Fairness, Accountability, and Transparency, 473-484, 2022
Cited by 52, 2022
A sandbox tool to bias (Stress)-test fairness algorithms
NJ Akpinar, M Nagireddy, L Stapleton, HF Cheng, H Zhu, S Wu, H Heidari
EAAMO 2022 Poster, 2022
Cited by 8, 2022
SocialStigmaQA: A Benchmark to Uncover Stigma Amplification in Generative Language Models
M Nagireddy, L Chiazor, M Singh, I Baldini
Proceedings of the 2024 AAAI Conference on Artificial Intelligence, 2023
Cited by 6, 2023
Detectors for Safe and Reliable LLMs: Implementations, Uses, and Limitations
S Achintalwar, AA Garcia, A Anaby-Tavor, I Baldini, SE Berger, ...
arXiv preprint arXiv:2403.06009, 2024
Cited by 1, 2024
Influence Based Approaches to Algorithmic Fairness: A Closer Look
S Ghosh, P Sattigeri, I Padhi, M Nagireddy, J Chen
NeurIPS 2023 Workshop on XAI in Action: Past, Present, and Future Applications, 2023
Cited by 1, 2023
The RealHumanEval: Evaluating Large Language Models' Abilities to Support Programmers
H Mozannar, V Chen, M Alsobay, S Das, S Zhao, D Wei, M Nagireddy, ...
arXiv preprint arXiv:2404.02806, 2024
2024
Language Models in Dialogue: Conversational Maxims for Human-AI Interactions
E Miehling, M Nagireddy, P Sattigeri, EM Daly, D Piorkowski, JT Richards
arXiv preprint arXiv:2403.15115, 2024
2024
Multi-Level Explanations for Generative Language Models
LM Paes, D Wei, HJ Do, H Strobelt, R Luss, A Dhurandhar, M Nagireddy, ...
arXiv preprint arXiv:2403.14459, 2024
2024
Contextual Moral Value Alignment Through Context-Based Aggregation
P Dognin, J Rios, R Luss, I Padhi, MD Riemer, M Liu, P Sattigeri, ...
arXiv preprint arXiv:2403.12805, 2024
2024
Alignment Studio: Aligning Large Language Models to Particular Contextual Regulations
S Achintalwar, I Baldini, D Bouneffouf, J Byamugisha, M Chang, P Dognin, ...
arXiv preprint arXiv:2403.09704, 2024
2024
Simulating Iterative Human-AI Interaction in Programming with LLMs
H Mozannar, V Chen, D Wei, P Sattigeri, M Nagireddy, S Das, A Talwalkar, ...
NeurIPS 2023 Workshop on Instruction Tuning and Instruction Following, 2023
2023
Function Composition in Trustworthy Machine Learning: Implementation Choices, Insights, and Questions
M Nagireddy, M Singh, SC Hoffman, E Ju, KN Ramamurthy, KR Varshney
arXiv preprint arXiv:2302.09190, 2023
2023
Prompt Templates: A Methodology for Improving Manual Red Teaming Performance
B Dominique, D Piorkowski, M Nagireddy, I Baldini
CHI 2024 Workshop, HEAL: Human-centered Evaluation and Auditing of Language …