Researchers at the University of Pennsylvania have introduced Observer, a first-of-its-kind multimodal medical dataset designed to capture anonymised, real-time interactions between patients and clinicians. Comparable in atmosphere to medical dramas that depict the intensity of life in clinical settings, Observer offers a genuine view inside primary care consultations. Unlike fictional portrayals, however, the recorded encounters are real, securely anonymised, and created specifically to support research and responsible AI development.
Historically, health care research has relied on information left behind after a clinical visit, such as clinician notes, laboratory values, and vital signs. While valuable, these records omit much of what actually shapes a medical encounter. Elements like body language, tone of voice, eye contact, room layout, and the presence of computers or digital tools are rarely documented, yet they strongly influence communication, trust, and outcomes. Observer addresses this gap by combining video, audio, transcripts, and clinical data, allowing researchers to examine what happens during care rather than inferring it after the fact.
Kevin B. Johnson, the project’s lead investigator, explains that much of medical care has remained invisible to researchers. By using technology that automatically anonymises recordings and ensures compliance with US privacy regulations, Observer enables observation of care as it unfolds. This evidence, he argues, is not only essential for improving clinical practice but also critical for developing AI systems that support care in ethical and meaningful ways.
The dataset is already being used. Pilot grants have been awarded to research teams exploring new questions using Observer, with the longer-term aim of expanding it into a national resource. Johnson describes this as a “flywheel” effect: as more researchers contribute insights and recordings, the dataset grows, enabling increasingly ambitious investigations into how care is delivered and experienced.
Clinical data have long been central to improving health care. Large datasets of medical records have underpinned decades of research and, more recently, have played a key role in training AI models to identify patterns linking diagnoses, treatments, and outcomes. Yet, as Johnson notes, understanding the whole experience of care requires data that capture what happens in the room. With Observer, researchers can now study questions such as how humour affects visits, how often clinicians engage with screens instead of patients, and how patients respond to explanations of diagnoses.
Protecting patient privacy has been a central challenge. To make Observer possible, the research team developed MedVidDeID, an automated system that de-identifies video and audio recordings. The tool removes identifying text, alters voices, and detects and blurs faces and other visual identifiers, with human reviewers providing final quality checks. This approach has dramatically reduced the time and effort required to safely prepare recordings for research use.
With initial data collection complete and early studies underway, the team plans to expand Observer and open access to qualified researchers through an application process. The goal is systemic change. As Johnson emphasises, improving care and building meaningful clinical AI depend on understanding the clinical encounter itself. When researchers can observe hundreds or thousands of real visits, genuine transformation becomes possible.
More information: Kevin B. Johnson et al., Observer: creation of a novel multimodal dataset for outpatient care research, Journal of the American Medical Informatics Association. DOI: 10.1093/jamia/ocaf182
Journal information: Journal of the American Medical Informatics Association
Provided by University of Pennsylvania School of Engineering and Applied Science
