Staff AI Engineer at LinkedIn leading three cross-cutting GenAI reliability initiatives spanning pre-deployment evaluation, post-deployment monitoring, and AI-driven observability across the platform’s GenAI product portfolio.
Conference speaker, peer reviewer (NeurIPS, IEEE Computer), and co-author of published AI research. MBA with Distinction from London Business School · B.Tech Computer Science from IIT Delhi · Fellow of BCS · IEEE Senior Member.
Technical lead for three cross-cutting GenAI reliability initiatives spanning pre-deployment evaluation, post-deployment monitoring, and AI-driven observability across LinkedIn’s GenAI product portfolio.
ML Tech Lead for Dash.ai, Dropbox’s AI-powered universal search product.
Worked on personalization, user representation models, and fairness in recommender systems, delivering measurable impact on systems affecting billions of users.
Worked on search and NLP systems for educational content platforms, building foundational experience in information retrieval and natural language processing at scale.
Delivered a presentation on the challenges of quality assurance and performance evaluation for production LLMs, covering robustness engineering practices for real-world deployments.
Led a hands-on workshop session demystifying the core agentic loop by building a tool-calling agent from scratch, as part of the Agentic AI Summit.
Explained model decisions both globally and locally using techniques such as SHAP values, and discussed building trust in generative AI by focusing on logprobs and models' chain-of-thought reasoning traces.
Participated in a group discussion on AI compliance and ethics, addressing questions on balancing the need for explanation and transparency against model performance.
Presented a framework for objective-driven evaluation pipelines, mixing golden sets and automated scoring to accelerate model iteration.
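The golden-set half of such a pipeline can be sketched in a few lines; this is a minimal illustration, not the presented framework itself, and the function name, example items, and model stub are invented (real pipelines add rubric-based and LLM-judge scoring on top of exact matching):

```python
def golden_set_accuracy(examples, model_fn):
    """Score a model against a curated golden set: the fraction of
    inputs whose output exactly matches the expected answer."""
    hits = sum(model_fn(inp) == expected for inp, expected in examples)
    return hits / len(examples)

# Toy golden set and a stub "model" standing in for a real LLM call.
golden = [("2+2", "4"), ("capital of France", "Paris")]
stub_model = lambda q: "4" if q == "2+2" else "Paris"
acc = golden_set_accuracy(golden, stub_model)  # 1.0 on this toy set
```

Automated scoring against a fixed golden set gives a fast, repeatable signal between model iterations, which is the acceleration the talk describes.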
Shared strategies to detect prompt and embedding drift, and to design feedback loops that surface real-world hallucinations before they reach end users.
Student-led discussion on responsible data practices for GenAI.
Wide-ranging Q&A on large-scale ML, LLM safety, and career advice for AI practitioners.
Finance and healthcare leaders discussed the technical and regulatory hurdles of launching GenAI while meeting strict data-control and audit standards.
Explored transparency, fairness and accountability trade-offs when shipping LLM products under emerging AI-governance regimes.
Introduces matched-pair calibration, a test that measures subgroup-level exposure bias in score-based ranking systems by comparing outcomes of near-identical item pairs.
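A heavily simplified toy of the matched-pair idea, not the paper's actual test: take pairs of near-identical items from different subgroups and average the score gap (the helper name, pair scores, and interpretation below are invented for illustration):

```python
def matched_pair_gap(pairs):
    """Toy illustration: each pair holds ranking scores for two
    near-identical items from different subgroups. The mean gap
    estimates subgroup-level scoring bias; zero suggests calibration."""
    gaps = [a - b for a, b in pairs]
    return sum(gaps) / len(gaps)

# Scores for matched items: (subgroup A, subgroup B).
pairs = [(0.82, 0.75), (0.64, 0.60), (0.91, 0.88)]
gap = matched_pair_gap(pairs)  # positive: group A scored consistently higher
```

Because the paired items are near-identical, a persistent nonzero gap is harder to explain away by item-quality differences, which is what makes the matched-pair construction useful for exposure-bias testing.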
Proposes a benchmark for assessing attribution quality in AI-generated content by evaluating style embeddings alongside large language model judges to determine how well generated output can be traced back to its original sources. Accepted for presentation at the ICDM RARA Workshop (Grounding Documents with Reasoning, Agents, Retrieval, and Attribution) on 11/12/2025. https://raraworkshop.github.io/
A structured evaluation framework for LLM systems that emphasizes defining clear objectives, using curated datasets pre-deployment, and ongoing quality assessment post-deployment via feedback and monitoring.
Explains how the temperature parameter in large language models affects creativity, determinism, and output diversity, illustrating its real-world implications for reliability and user experience.
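The mechanism is simple to sketch: temperature rescales the logits before the softmax, so low values sharpen the next-token distribution (more deterministic) and high values flatten it (more diverse). A minimal illustration, with made-up logit values:

```python
import math

def softmax_with_temperature(logits, temperature):
    """Divide logits by the temperature before the softmax.
    T < 1 sharpens the distribution; T > 1 flattens it."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.5]
cold = softmax_with_temperature(logits, 0.2)  # near one-hot: top token dominates
hot = softmax_with_temperature(logits, 2.0)   # closer to uniform: more diversity
```

This is why temperature 0 is favored for reliability-sensitive tasks and higher temperatures for creative generation.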
In this *Meet the Writer* interview, LinkedIn Staff AI Engineer Misam Abbas shares his journey from Meta and Dropbox to building trustworthy AI systems that balance ethics, diversity, and innovation. He discusses his writing process, fascination with AI temperature tuning, and the importance of explainability in large language models. Beyond code, Abbas champions mentorship, storytelling, and widening community participation in the evolution of AI.
Describes a platform that lets retail investors pool commitments and due diligence costs to acquire private-equity stakes in startups.
An introduction to sentiment analysis arguing that computers don't actually understand emotions but can make educated guesses about sentiment by analyzing text computationally. While these tools can process vast amounts of text quickly and identify broadly positive or negative sentiment, they still fall short of human-level performance and struggle with nuances like sarcasm.
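The "educated guess" framing can be shown with a toy lexicon-based scorer; the word lists and function name are invented for illustration, and real systems use learned models rather than hand-built lexicons:

```python
POSITIVE = {"great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "hate", "terrible", "sad"}

def sentiment_score(text):
    """Count positive vs. negative lexicon hits; no understanding involved."""
    words = [w.strip(".,!?") for w in text.lower().split()]
    pos = sum(w in POSITIVE for w in words)
    neg = sum(w in NEGATIVE for w in words)
    if pos > neg:
        return "positive"
    if neg > pos:
        return "negative"
    return "neutral"

# The sarcasm blind spot: word counting reads this as "positive".
sentiment_score("Oh great, another outage")  # -> "positive"
```

The last line illustrates the limitation the article highlights: counting words captures surface polarity but misses sarcasm entirely.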
Served as an official ethics reviewer, assessing compliance of research submissions for both the Regular and the Datasets and Benchmarks tracks.
Served as a peer reviewer for the IEEE Computer journal, completing two review assignments in early 2026 via Publons/Clarivate.
Served as a peer reviewer for a paper on Responsible AI.
Judge for the student-run AI hackathon, scoring projects on impact, creativity and technical execution.
Judged participants in the Agent Hackathon organized by MindStudio.ai.
Judge for the Globee Artificial Intelligence Awards, reviewing cutting edge AI products and solutions.
Judge for the Webby Awards in the AI General and Features category.