Increasing protected data accessibility for age-related cataract research using a semi-automated honest broker

How to Cite

Valluripally S, Raju M, Calyam P, Lemus M, Purohit S, Mosa A, Joshi T. Increasing protected data accessibility for age-related cataract research using a semi-automated honest broker. MAIO [Internet]. 2019 Jul. 25 [cited 2021 Dec. 9];2(3):115-32. Available from:

Copyright notice

Authors who publish with this journal agree to the following terms:

  1. Authors retain copyright and grant the journal right of first publication, with the work, twelve (12) months after publication, simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work’s authorship and initial publication in this journal.

  2. After 12 months from the date of publication, authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.


Keywords: common data model; honest broker; precision medicine; protected data access; semi-automated compliance


Ophthalmology researchers increasingly rely on protected data sets to find new trends and enhance patient care. However, there is an inherent lack of trust in the current healthcare community ecosystem between the data custodians (i.e., health care organizations and hospitals) and data consumers (i.e., researchers and clinicians). This typically results in a manual governance approach that slows data access for researchers, owing to concerns such as ensuring that every authorization of a data consumer is auditable and that access complies with health data security standards. In this paper, we address this issue of protracted data access by proposing a semi-automated “honest broker” framework that can be implemented in an online health application. The framework establishes trust between the data consumers and the custodians by:

1. improving the efficiency of compliance checking for data consumer requests using a risk assessment technique;

2. making consumer access to protected data auditable, with a custodian in the loop only when essential; and

3. increasing the speed of large-volume data actions (such as view, copy, modify, and delete) using a popular common data model.

Through an ophthalmology case study of age-related cataract research in a community cloud testbed, we demonstrate how our approach can be implemented in practice to provide timely data access and secure computation on protected data, with the ultimate aim of data-driven eye health insights.
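To make the first two mechanisms in the list above concrete, the following is a minimal illustrative sketch of risk-based request routing with an audit trail. It is not the framework's actual implementation: the class names (`AccessRequest`, `HonestBroker`), the 1 to 5 likelihood and impact ratings, the likelihood × impact score (a common qualitative risk-matrix convention), and both thresholds are assumptions introduced here for illustration only.

```python
from dataclasses import dataclass, field
from enum import Enum


class Decision(Enum):
    AUTO_APPROVE = "auto_approve"          # low risk: no custodian needed
    CUSTODIAN_REVIEW = "custodian_review"  # medium risk: custodian in the loop
    DENY = "deny"                          # high risk: request rejected


@dataclass
class AccessRequest:
    consumer: str    # hypothetical identifier of the data consumer
    action: str      # one of the data actions: view, copy, modify, delete
    likelihood: int  # assumed 1-5 rating of how likely misuse is
    impact: int      # assumed 1-5 rating of the harm if misuse occurs


@dataclass
class HonestBroker:
    # Assumed cutoffs chosen for illustration; they are not from the paper.
    low_threshold: int = 6
    high_threshold: int = 15
    audit_log: list = field(default_factory=list)

    def risk_score(self, request: AccessRequest) -> int:
        # Qualitative risk-matrix score: likelihood multiplied by impact.
        return request.likelihood * request.impact

    def decide(self, request: AccessRequest) -> Decision:
        score = self.risk_score(request)
        if score <= self.low_threshold:
            decision = Decision.AUTO_APPROVE
        elif score < self.high_threshold:
            decision = Decision.CUSTODIAN_REVIEW
        else:
            decision = Decision.DENY
        # Every decision is recorded so that authorizations stay auditable.
        self.audit_log.append(
            (request.consumer, request.action, score, decision.value)
        )
        return decision


broker = HonestBroker()
requests = [
    AccessRequest("researcher_a", "view", likelihood=1, impact=3),
    AccessRequest("researcher_b", "modify", likelihood=3, impact=4),
    AccessRequest("researcher_c", "delete", likelihood=5, impact=4),
]
for req in requests:
    print(req.consumer, req.action, broker.decide(req).value)
```

In this sketch only the middle band of scores reaches a human custodian, which reflects the idea of involving the custodian only when essential, while the audit log records every outcome regardless of how it was decided.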


