Vision + Language

  • VisionArena: 230K Real World User-VLM Conversations with Preference Labels

    Christopher Chou*, Lisa Dunlap*, Koki Mashita, Krishna Mandal, Trevor Darrell, Ion Stoica, Joseph E. Gonzalez, Wei-Lin Chiang

    [Arxiv] Paper Code Dataset

    TL;DR the dataset release of user-VLM conversations from Chatbot Arena, a platform for crowdsourcing preference votes

  • VibeCheck: Discover and Quantify Qualitative Differences in Large Language Models

    Lisa Dunlap, Krishna Mandal, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez

    [Arxiv] Paper Code Dataset

    TL;DR we identify qualitative properties (vibes) of LLMs and measure how well they distinguish models and predict user preference

  • From Crowdsourced Data to High-Quality Benchmarks: Arena-Hard and BenchBuilder Pipeline

    Tianle Li*, Wei-Lin Chiang*, Evan Frick, Lisa Dunlap, Tianhao Wu, Banghua Zhu, Joseph E. Gonzalez, Ion Stoica

    [Arxiv] Paper Code Blog Dataset Leaderboard

    TL;DR Filter large, messy NLP datasets into a smaller set of high-quality prompts using LLMs

  • Describing Differences in Image Sets with Natural Language

    Lisa Dunlap*, Yuhui Zhang*, Xiaohan Wang, R. Zhong, Trevor Darrell, Jacob Steinhardt, Joseph E. Gonzalez, Serena Yeung-Levy

    [CVPR 2024 (oral)] Paper Code Website

    TL;DR Set Difference Captioning - describing differences in two large sets of images with language - has many impactful ML & data science applications

  • See, Say, and Segment: Teaching LMMs to Overcome False Premises

    Tsung-Han Wu, Giscard Biamby, David Chan, Lisa Dunlap, Ritwik Gupta, Xudong Wang, Joseph E. Gonzalez, Trevor Darrell

    [CVPR 2024] Paper Website Code

    TL;DR we train segmentation-VQA models to see if an object is present, say if the object isn't present (and suggest alternatives), and segment the object when it is

  • Diversify Your Vision Datasets with Automatic Diffusion-Based Augmentation (ALIA)

    L. Dunlap, A. Umino, P. Zhang, J. Yang, J. E. Gonzalez, T. Darrell

    [NeurIPS 2023] Paper Code Website

    TL;DR V&L models can summarize high-level spurious features in your data with language, which can then be used to augment your data with diffusion models

  • Diffusion Hyperfeatures: Searching Through Time and Space for Semantic Correspondence

    G. Luo, L. Dunlap, D. Huk Park, A. Holynski, T. Darrell

    [NeurIPS 2023] Paper Website Code

    TL;DR aggregating diffusion features over different layers and timesteps leads to fantastic features for semantic correspondence

  • Using Language to Extend to Unseen Domains (LADS)

    L. Dunlap, C. Mohri, D. Guillory, H. Zhang, T. Darrell, J. E. Gonzalez, A. Raghunathan, A. Rohrbach

    [ICLR 2023 (spotlight)] Website Paper Code Blog Slides

    TL;DR it's UDA, but instead of unlabeled target data it's language, and you want to maximize target accuracy while maintaining source accuracy

  • On Guiding Attention with Language Specification (GALS)

    S. Petryk*, L. Dunlap*, K. Nasseri, J. E. Gonzalez, T. Darrell, A. Rohrbach.

    [CVPR 2022] Paper Code

    TL;DR saliency of V+L models can be used to guide CNNs trained on biased data, using a language description of what to focus on

  • NBDT: Neural-Backed Decision Trees

    A. Wan, L. Dunlap*, D. Ho*, J. Yin, S. Lee, H. Jin, S. Petryk, S. A. Bargal, and J. E. Gonzalez.

    [ICLR 2021] Paper Website Blog Talk

    TL;DR training a CNN to have the class hierarchy of a decision tree increases accuracy and interpretability

  • Deep Mixture of Experts Via Shallow Embedding

    X. Wang, F. Yu, L. Dunlap, R. Wang, Y. A. Ma, A. Mirhoseini, T. Darrell, and J. E. Gonzalez.

    [UAI 2019] Paper

    TL;DR lots of MoEs + a sparse gating network = better accuracy and less computation

ML Systems

  • Improve Model Inference Cost with Image Gridding

    S. Krishnaswamy, L. Dunlap, L. Chen, M. Zaharia, J. Zou, J. Gonzalez

    [ICML 2023 DMLR workshop] Paper

    TL;DR reduce vision model API costs by gridding your images together

  • Hyperparameter Tuning with Elastic Resources

    L. Dunlap, K. Kandasamy, U. Mishra, R. Liaw, J. Gonzalez, I. Stoica, M. Jordan.

    [SoCC 2021] Paper Talk

    TL;DR given a deadline and a cloud budget, produce an optimal HP tuning experiment

  • RubberBand: Cloud Based Hyperparameter Tuning

    R. Liaw*, U. Mishra*, L. Dunlap, R. Bhardwaj, A. Tumanov, J. Gonzalez, I. Stoica.

    [EuroSys 2021] Paper Talk

    TL;DR given a HP tuning experiment and time deadline, minimize cost on the cloud

  • Hypersched: Dynamic resource allocation for model development on a deadline

    R. Liaw, R. Bhardwaj, L. Dunlap, A. Tumanov, J. E. Gonzalez, I. Stoica

    [SoCC 2019] Paper

    TL;DR when HP tuning on a time deadline, dynamically allocate resources to jobs

Misc

  • [Joke] MICKIE: The Magically Interpretable Cloud Komputing Inference Engine

    Everyone on the bus to the 2022 Sky Disneyland Retreat (but mostly Conor Power)

    [Under arxiv ethics review] Paper Code

    TL;DR hear me out - large language models WILL replace the cloud

  • Machine Log Parsing with Named Entity Recognition

    L. Dunlap, A. Starosta, K. Curtis, Z. Wang, C. Sarkar, R. Sriharsha.

    [Nvidia GTC 2021] Blog

    TL;DR NER models work surprisingly well for log parsing

  • Habitat-dependent search behavior in the Colorado Checkered Whiptail (Aspidoscelis neotesselata)

    K. Utsumi, C. Kusaks, R. Pedersen, C. Staley, L. Dunlap, S. G. Smith, M. A. Eifler, D. A. Eifler.

    [Western North America Naturalist 2019] Paper

    TL;DR whiptails behave differently in shrub grassland vs. pine-juniper woodland