Abstract: Information search has changed the way we manage knowledge, and the ubiquity of information access has made search a frequent activity, whether via Internet search engines or, increasingly, via mobile devices. Medical information search is no different in this respect, and much research has been devoted to analysing how physicians try to access information. Medical image search is a much smaller domain but has gained considerable attention because its characteristics differ from those of searching text documents. While web search log files have been analysed many times to better understand user behaviour, the log files of hospital-internal systems for search in a PACS/RIS (Picture Archiving and Communication System, Radiology Information System) have rarely been analysed. The goal of this paper is to compare search in a hospital PACS/RIS with search in a web system for images from the biomedical literature. The objectives are to identify similarities and differences in search behaviour between the two systems, which could then be used to optimize existing systems and build new search engines. Log files of the ARRS GoldMiner medical image search engine (freely accessible on the Internet) containing 222,005 queries, and log files of Stanford's internal PACS/RIS search, called radTF, containing 18,068 queries were analysed. Each query was preprocessed and all query terms were mapped to the RadLex (Radiology Lexicon) terminology, a comprehensive lexicon of radiology terms created and maintained by the Radiological Society of North America, so that the semantic content of the queries and the links between terms could be analysed and synonyms for the same concept could be detected. RadLex was mainly created for use in radiology reports, to aid structured reporting and the preparation of educational material (Langlotz, 2006) [1]. In standard medical vocabularies such as MeSH (Medical Subject Headings) and UMLS (Unified Medical Language System), radiology-specific terms are often underrepresented; RadLex was therefore considered the best option for this task. The results show a surprising similarity between the usage behaviour in the two systems, but several subtle differences can also be noted. The average number of terms per query is 2.21 for GoldMiner and 2.07 for radTF. The RadLex axes used (anatomy, pathology, findings, …) have almost the same distribution in both systems, with clinical findings the most frequent and anatomical entities the second most frequent; combinations of RadLex axes are also extremely similar between the two systems. Differences include longer sessions in radTF than in GoldMiner (3.4 versus 1.9 queries per session on average). Several frequent search terms overlap, but some strong differences exist in the details. In radTF the term "normal" is frequent, whereas in GoldMiner it is not. This makes intuitive sense, as normal cases are rarely described in the literature, whereas in clinical work the comparison with normal cases is often a first step. The general similarity on many points is likely due to the fact that users of the two systems are influenced by their daily use of standard web search engines and carry this behaviour over into their professional search. This means that many results and insights gained from standard web search can likely be transferred to more specialized search systems. Still, specialized log files can be used to learn more about query reformulations and the detailed strategies users follow to find the right content.
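The statistics reported above (terms per query, queries per session, and the distribution of lexicon axes) can be illustrated with a minimal sketch. It is not the authors' pipeline: the log format of `(session_id, query)` pairs, the whitespace tokenisation, and the tiny `TOY_LEXICON` are all assumptions made for illustration, and the lexicon entries are invented stand-ins rather than real RadLex data.

```python
from collections import Counter, defaultdict

# Hypothetical toy lexicon: term -> (preferred concept, axis).
# These entries are illustrative only and are NOT taken from RadLex.
TOY_LEXICON = {
    "pneumonia": ("pneumonia", "clinical finding"),
    "lung": ("lung", "anatomical entity"),
    "pulmonary": ("lung", "anatomical entity"),  # synonym of the same concept
    "fracture": ("fracture", "clinical finding"),
    "femur": ("femur", "anatomical entity"),
}


def preprocess(query):
    """Assumed preprocessing: lowercase and split on whitespace."""
    return query.lower().split()


def summarise(log):
    """Summarise an iterable of (session_id, query_string) pairs."""
    term_counts = []                # number of terms in each query
    per_session = defaultdict(int)  # number of queries in each session
    axes = Counter()                # how often each lexicon axis is hit

    for session_id, query in log:
        terms = preprocess(query)
        term_counts.append(len(terms))
        per_session[session_id] += 1
        for term in terms:
            if term in TOY_LEXICON:
                axes[TOY_LEXICON[term][1]] += 1

    if not term_counts:
        raise ValueError("log contains no queries")

    return {
        "mean_terms_per_query": sum(term_counts) / len(term_counts),
        "mean_queries_per_session": len(term_counts) / len(per_session),
        "axis_counts": dict(axes),
    }


# Invented example log with two sessions and three queries.
example_log = [
    ("s1", "lung pneumonia"),
    ("s1", "pulmonary nodule"),
    ("s2", "femur fracture"),
]
summary = summarise(example_log)
```

In this toy log, "pulmonary" and "lung" resolve to the same concept, which is the kind of synonym detection that mapping to a controlled lexicon makes possible; terms missing from the lexicon (here "nodule") still count towards query length but not towards any axis.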

Enhancing Clinical Information Retrieval through Context-Aware Queries and Indices

Clinical Information Retrieval: A Literature Review

An Act Indexing Information Model for Clinical Data Integration

A Modern Non-SQL Approach to Radiology-Centric Search Engine Design with Clinical Validation

ESR studies on membrane fluidity of Chinese hamster ovary cells grown on microcarriers and in suspension.

Inferring Conceptual Relationships When Ranking Patients

Discovering Related Clinical Concepts Using Large Amounts of Clinical Notes

ELII: A novel inverted index for fast temporal query, with application to a large Covid-19 EHR dataset

Uncovering Medical Insights from Vast Amounts of Biomedical Data in Clinical Case Reports

Conceptualizing Machine Learning for Dynamic Information Retrieval of Electronic Health Record Notes

A Patient-Screening Tool for Clinical Research Based on Electronic Health Records Using OpenEHR: Development Study

A query interface for clinical research with Chinese electronic health record using Natural Language Processing

Semantic-Enhanced Query Expansion System for Retrieving Medical Image Notes

Clinical Evaluation of Using Semantic Searching Engine for Radiological Imaging Services in RIS-Integrated PACS

Comparing image search behaviour in the ARRS GoldMiner search engine and a clinical PACS/RIS

Focused Clinical Query Understanding and Retrieval of Medical Snippets powered through a Healthcare Knowledge Graph

An Intelligent Search & Retrieval System (IRIS) and Clinical and Research Repository for Decision Support Based on Machine Learning and Joint Kernel-based Supervised Hashing

TELII: Temporal Event Level Inverted Indexing for Cohort Discovery on a Large Covid-19 EHR Dataset

On the Combined Use of Extrinsic Semantic Resources for Medical Information Search

Sputum for cytologic diagnosis of lung cancer.

Improving Chinese Electronic Medical Record Retrieval by Field Weight Assignment, Negation Detection, and Re-ranking