As the technical lead for GenAI safety, I ensure that generative AI is deployed safely across LinkedIn, leading the architecture, development, and deployment of comprehensive frameworks for LLM monitoring.
As the ML tech lead for Dash.ai, Dropbox's AI-powered universal search and answers product, I led the retrieval effort, building the search capability from the ground up and architecting a hybrid system that combines lexical and vector search. I also established the ML flywheel for search, spanning retrieval, ranking, evaluation, metrics, and experimentation.
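Dash.ai's internals aren't public, so as an illustration, here is a minimal sketch of one common way to fuse lexical and vector retrieval: reciprocal rank fusion (RRF). The `bm25_results` and `vector_results` lists are hypothetical retriever outputs, not Dash.ai's actual components.

```python
from collections import defaultdict

def reciprocal_rank_fusion(ranked_lists, k=60):
    """Fuse several ranked lists of doc IDs into a single ranking.

    Each document accumulates 1 / (k + rank) from every list it
    appears in; k = 60 is the constant from the original RRF paper.
    """
    scores = defaultdict(float)
    for ranking in ranked_lists:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical outputs of a lexical (BM25) retriever and a vector (ANN) retriever.
bm25_results = ["doc_a", "doc_b", "doc_c"]
vector_results = ["doc_c", "doc_a", "doc_d"]
print(reciprocal_rank_fusion([bm25_results, vector_results]))
# -> ['doc_a', 'doc_c', 'doc_b', 'doc_d']
```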
At Meta, I worked on personalization and user representation models for recommender systems, delivering significant, measurable impact on systems affecting billions of users. I led a user interest modeling project that enabled a key partner team to achieve a 20% improvement on their primary metric by adapting user representation technology for interest modeling, helping Facebook show users better interest-based recommendations.
Explaining model decisions both globally and locally with techniques like SHAP values, and building trust in generative AI by focusing on logprobs and the model's chain-of-thought and reasoning traces.
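As one concrete illustration of the logprobs angle, here is a minimal sketch that turns per-token log-probabilities (which many LLM APIs can return) into simple trust signals; the `token_logprobs` values are hypothetical.

```python
import math

def confidence_from_logprobs(token_logprobs):
    """Summarize per-token log-probabilities as simple trust signals.

    Returns the mean logprob, its perplexity, and the index of the
    least-confident token; a low mean / high perplexity suggests the
    model was uncertain and the answer deserves extra scrutiny.
    """
    mean_lp = sum(token_logprobs) / len(token_logprobs)
    perplexity = math.exp(-mean_lp)
    weakest = min(range(len(token_logprobs)), key=token_logprobs.__getitem__)
    return mean_lp, perplexity, weakest

# Hypothetical logprobs for a short generated answer.
mean_lp, ppl, weakest = confidence_from_logprobs([-0.05, -0.10, -2.30, -0.20])
print(f"mean logprob={mean_lp:.2f}, perplexity={ppl:.2f}, weakest token index={weakest}")
```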
Participated in a group discussion on AI compliance and ethics, addressing questions on how to balance the need for explanation and transparency against the performance of AI models.
Presented a framework for objective-driven evaluation pipelines, mixing golden sets and automated scoring to accelerate model iteration.
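The talk's actual pipeline isn't reproduced here; this is a minimal sketch of the general pattern it describes: run a model over a curated golden set and aggregate an automated score. The `stub_model` and exact-match scorer are hypothetical placeholders (in practice the scorer might be an LLM judge or a task-specific metric).

```python
def evaluate_on_golden_set(model, golden_set, scorer):
    """Score a model against curated (input, expected) pairs.

    `model` is any callable prompt -> answer; `scorer` maps a
    (predicted, expected) pair to a float in [0, 1].
    """
    scores = [scorer(model(ex["input"]), ex["expected"]) for ex in golden_set]
    return sum(scores) / len(scores)

# Hypothetical golden set, stub model, and exact-match scorer.
golden_set = [
    {"input": "2 + 2 = ?", "expected": "4"},
    {"input": "Capital of France?", "expected": "Paris"},
]
exact_match = lambda pred, gold: float(pred.strip() == gold)
stub_model = lambda prompt: {"2 + 2 = ?": "4", "Capital of France?": "Paris"}[prompt]
print(evaluate_on_golden_set(stub_model, golden_set, exact_match))  # 1.0
```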
Shared strategies to detect prompt and embedding drift, and to design feedback loops that surface real-world hallucinations before they reach end users.
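One simple embedding-drift signal, sketched below under the assumption that prompt embeddings from a baseline window and a recent window are available: the cosine distance between the two windows' centroids. The embeddings here are synthetic stand-ins.

```python
import numpy as np

def centroid_drift(baseline_embs, recent_embs):
    """Cosine distance between the mean embeddings of two traffic windows.

    Near 0 means the centroids align; larger values suggest the input
    distribution has shifted and monitors or evals may be stale.
    """
    a = np.mean(baseline_embs, axis=0)
    b = np.mean(recent_embs, axis=0)
    cosine = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - cosine

# Synthetic prompt embeddings: recent traffic shifted along one axis.
rng = np.random.default_rng(0)
baseline = rng.normal(0.0, 1.0, size=(500, 128))
recent = rng.normal(0.0, 1.0, size=(500, 128))
recent[:, 0] += 2.0  # simulated distribution shift
print(f"drift score: {centroid_drift(baseline, recent):.3f}")
```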
Student-led discussion on responsible data practices for GenAI.
Wide-ranging Q&A on large-scale ML, LLM safety and career advice for AI practitioners.
Finance and healthcare leaders discussed the technical and regulatory hurdles of launching GenAI while meeting strict data-control and audit standards.
Explored transparency, fairness and accountability trade-offs when shipping LLM products under emerging AI-governance regimes.
Introduces matched-pair calibration, a test that measures subgroup-level exposure bias in score-based ranking systems by comparing outcomes of near-identical item pairs.
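The paper's full statistical test isn't reproduced here; as a rough sketch of the core idea under simplifying assumptions, suppose we have ranking scores for near-identical item pairs that differ only in subgroup: gaps that consistently favor one side indicate exposure bias. The pair scores below are hypothetical.

```python
import statistics

def matched_pair_gap(pairs):
    """Mean and spread of score gaps over matched item pairs.

    Each pair is (score_group_a, score_group_b) for two near-identical
    items differing only in subgroup. Under a calibrated ranker the
    gaps should center on zero; a consistently positive mean indicates
    group A is systematically scored, and thus exposed, more.
    """
    gaps = [a - b for a, b in pairs]
    return statistics.mean(gaps), statistics.stdev(gaps)

# Hypothetical matched pairs where group A consistently out-scores group B.
pairs = [(0.82, 0.74), (0.65, 0.61), (0.90, 0.83), (0.55, 0.50)]
mean_gap, sd = matched_pair_gap(pairs)
print(f"mean gap={mean_gap:.3f} (sd={sd:.3f})")
```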
Proposes a benchmark for assessing attribution quality in AI-generated content by evaluating style embeddings alongside large-language-model judges to determine how well generated output can be traced back to original sources. Accepted for presentation at the ICDM RARA Workshop (Grounding Documents with Reasoning, Agents, Retrieval, and Attribution) on 11/12/2025. https://raraworkshop.github.io/
A structured evaluation framework for LLM systems that emphasizes defining clear objectives, curating datasets for pre-deployment testing, and assessing quality post-deployment through feedback and monitoring.
Explains how the temperature parameter in large language models affects creativity, determinism, and output diversity, illustrating its real-world implications for reliability and user experience.
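Mechanically, temperature divides the logits before the softmax, which is easy to see in a small sketch; the logits below are hypothetical next-token scores.

```python
import numpy as np

def softmax_with_temperature(logits, temperature):
    """Convert logits to token probabilities at a given temperature.

    T < 1 sharpens the distribution toward the top token
    (near-deterministic); T > 1 flattens it, spreading probability
    over more tokens and increasing output diversity.
    """
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max()  # numerical stability
    p = np.exp(z)
    return p / p.sum()

logits = [2.0, 1.0, 0.5, 0.1]  # hypothetical next-token logits
for t in (0.2, 1.0, 2.0):
    print(t, np.round(softmax_with_temperature(logits, t), 3))
# At T=0.2 nearly all probability mass lands on the top token;
# at T=2.0 it spreads much more evenly across the candidates.
```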
In this *Meet the Writer* interview, LinkedIn Staff AI Engineer Misam Abbas shares his journey from Meta and Dropbox to building trustworthy AI systems that balance ethics, diversity, and innovation. He discusses his writing process, fascination with AI temperature tuning, and the importance of explainability in large language models. Beyond code, Abbas champions mentorship, storytelling, and widening community participation in the evolution of AI.
Describes a platform that lets retail investors pool commitments and due diligence costs to acquire private-equity stakes in startups.
An introduction to sentiment analysis making the case that computers don't actually understand emotions, but can make educated guesses about sentiment by analyzing text computationally. While these tools can process vast amounts of text quickly and identify broadly positive or negative sentiment, they still fall short of human-level performance and struggle with nuances like sarcasm.
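A minimal lexicon-based sketch of that "educated guess": count polarity words and compare. The tiny lexicon is purely illustrative (real lexicons contain thousands of scored entries), and the last example shows exactly the kind of sarcasm this naive approach misreads.

```python
import re

# Tiny illustrative polarity lexicon; real ones are far larger.
POSITIVE = {"good", "great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "terrible", "hate", "awful", "sad"}

def guess_sentiment(text):
    """Naive word-counting sentiment: an educated guess, not understanding."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)
    return "positive" if score > 0 else "negative" if score < 0 else "neutral"

print(guess_sentiment("I love this great product"))   # positive
print(guess_sentiment("What a terrible, awful day"))  # negative
print(guess_sentiment("Oh great, it broke again"))    # 'positive' -- sarcasm fools it
```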
Served as an official ethics reviewer, assessing compliance of research submissions for both the Regular and the Datasets and Benchmarks tracks.
Served as a reviewer for a paper on Responsible AI.
Judge for the student-run AI hackathon, scoring projects on impact, creativity and technical execution.
Judged participants in the Agent Hackathon organized by MindStudio.ai.
Judge for the Globee Artificial Intelligence Awards, reviewing cutting-edge AI products and solutions.
Judge for the Webby Awards in the AI General and Features category.