Abstract: The scientific and academic community has mobilized to better understand COVID-19 and its impact on humanity. Most COVID-19 research must analyze large amounts of data in very little time. This urgency has made Big Data analysis, and the related questions of data privacy and security, an extremely important part of research in the COVID-19 era. The White House OSTP has, for example, released a large dataset of papers related to COVID research from which the research community can extract knowledge and information. We show an example system with a machine learning-based knowledge extractor that draws out key medical information from COVID-19 related academic research papers. We represent this knowledge in a Knowledge Graph that uses the Unified Medical Language System (UMLS). However, publicly available studies rely on datasets that might contain sensitive data. Extracting information from academic papers can therefore leak sensitive data, so protecting the security and privacy of this data is equally important. In this paper, we address the key challenges around the privacy and security of such information extraction and analysis systems. Regulations like HIPAA have updated their guidelines for securely accessing data, specifically data related to COVID-19. In the US, healthcare providers must also comply with the Office for Civil Rights (OCR) rules to protect data integrity in matters like plasma donation, media access to health care data, and telehealth communications. Privacy policies are typically short, unstructured HTML or PDF documents. We have created a framework to extract relevant knowledge from health centers’ policy documents and represent it as a knowledge graph as well.
Our framework helps to understand the extent to which individual provider policies comply with regulations, and to define access control policies that enforce the regulation rules on data in the knowledge graph extracted from COVID-related papers. Besides being compliant, privacy policies must also be transparent and easily understood by clients. We analyze the relative readability of healthcare privacy policies and discuss its impact. We also develop a framework for access control decisions that uses policy compliance information to securely retrieve COVID data, and we show how this information can be used to restrict access to COVID-19 data and information extracted from research papers.
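
As a rough illustration of the idea of using policy compliance information to restrict access to knowledge graph data, the following minimal sketch gates retrieval on a compliance check. It is not the framework described above: the rule identifiers, the `sensitive` flag, and the names `Triple`, `Provider`, `is_compliant`, and `retrieve` are all hypothetical, chosen only to make the decision logic concrete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triple:
    """One knowledge graph fact extracted from a paper."""
    subject: str
    predicate: str
    obj: str
    sensitive: bool  # hypothetical flag: True if a regulation restricts this fact


@dataclass(frozen=True)
class Provider:
    """A healthcare provider and the regulation rules its policy was found to cover."""
    name: str
    satisfied_rules: frozenset


# Hypothetical rule identifiers; real regulation rules would come from
# the knowledge graph built from the regulation and policy documents.
REQUIRED_RULES = frozenset({"hipaa_minimum_necessary", "ocr_telehealth"})


def is_compliant(provider: Provider) -> bool:
    """A provider is compliant when its policy covers every required rule."""
    return REQUIRED_RULES <= provider.satisfied_rules


def retrieve(provider: Provider, triples: list) -> list:
    """Return all triples to compliant providers, only non-sensitive ones otherwise."""
    if is_compliant(provider):
        return list(triples)
    return [t for t in triples if not t.sensitive]


graph = [
    Triple("COVID-19", "has_symptom", "fever", sensitive=False),
    Triple("cohort_17", "treated_with", "convalescent plasma", sensitive=True),
]
clinic_a = Provider("Clinic A", frozenset({"hipaa_minimum_necessary", "ocr_telehealth"}))
clinic_b = Provider("Clinic B", frozenset({"hipaa_minimum_necessary"}))

for clinic in (clinic_a, clinic_b):
    visible = retrieve(clinic, graph)
    print(clinic.name, "can read", len(visible), "of", len(graph), "triples")
```

In this sketch the compliance check is all-or-nothing; a finer-grained variant could instead attach the required rules to each triple, so that a partially compliant provider still sees the facts whose rules it satisfies.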
