

Evaluating Health-Care Disparity Employing Linked Data and Data-driven Discovery
Amrapali Zaveri
AKSW, Institut für Informatik

Outline
• Motivation
• Methodology
  o Datasets
  o CSV to RDF Conversion
  o Interlinking using SILK
  o Validation by Linked Data Querying
• Conclusions
• Limitations
• Future Work

Motivation
• According to the World Health Organization (WHO), more than one billion people (i.e., one sixth of the world's population) suffer from one or more neglected tropical diseases.
• This reflects a significant imbalance between the prevalence of certain diseases and the research effort invested in investigating them.
• Reason: the current absence of accurate, interlinked data and information.

Methodology

Datasets

DATASET                                  LINKED DATA VERSION    NUMBER OF TRIPLES
ClinicalTrials.gov                       LinkedCT               9.8 million
PubMed                                   Bio2RDF's PubMed       797 million
WHO's Global Health Observatory (GHO)    Not yet available      -

CSV to RDF Conversion
• WHO's GHO dataset
  o Published as Excel sheets
  o Advantage: readable by humans
  o Disadvantages: cannot be queried efficiently; difficult to integrate with other data (in different formats)
• Our approach
  o Converting the data into a single data model, RDF
  o Using SCOVO (Statistical Core Vocabulary)*, designed particularly to represent multidimensional statistical data in RDF
* Michael Hausenblas et al. SCOVO: Using statistics on the web of data. In ESWC, 2009.

What is SCOVO?

Semi-automated approach
• Transforming CSV to RDF in a fully automated way is not feasible.
  o Dimensions are often encoded in the heading or label of a sheet.
• Our semi-automatic approach:
  o Implemented as a plug-in for OntoWiki#, a semantic collaboration platform developed by the AKSW research group.
  o A CSV file is converted into RDF using SCOVO.
# Sören Auer et al.: OntoWiki: A Tool for Social Semantic Collaboration. In: Proceedings of the Workshop on Social and Collaborative Construction of Structured Knowledge (CKC 2007) at the 16th International WWW 2007, Banff, Canada, 2007.
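The semi-automatic step can be sketched in a few lines of Python. This is not the OntoWiki plug-in; it is a minimal stand-in, assuming a sheet whose first column holds one dimension (here, countries) and whose remaining column headers hold another (here, diseases), and assuming all labels are already valid as RDF local names. The function name `csv_to_scovo`, the `ex:` prefix and the `c<column>-r<row>` item naming are illustrative assumptions. The manual part of the process is telling the converter what the rows and the column headers represent, since the sheet itself does not say.

```python
import csv
import io


def csv_to_scovo(csv_text, row_dimension, column_dimension):
    """Turn a two-dimensional sheet into SCOVO-style Turtle statements.

    row_dimension / column_dimension are supplied by the user, because the
    sheet itself does not say what its rows and column headers mean.
    """
    rows = list(csv.reader(io.StringIO(csv_text)))
    header, body = rows[0], rows[1:]
    lines = []

    # Declare both dimensions as subclasses of scv:Dimension.
    for dim in (row_dimension, column_dimension):
        lines.append(f'ex:{dim} rdfs:subClassOf scv:Dimension ; dc:title "{dim}" .')

    # Each column header (after the first) is a value of the column dimension.
    for label in header[1:]:
        lines.append(f'ex:{label} rdf:type ex:{column_dimension} ; dc:title "{label}" .')

    for r, row in enumerate(body, start=2):  # row 1 is the header
        row_label = row[0]
        lines.append(f'ex:{row_label} rdf:type ex:{row_dimension} ; dc:title "{row_label}" .')
        for c, cell in enumerate(row[1:], start=2):  # column 1 holds the row labels
            if cell.strip() == "":
                continue  # skip empty cells instead of emitting a blank value
            lines.append(
                f"ex:c{c}-r{r} rdf:type scv:Item ; rdf:value {cell} ; "
                f"scv:dimension ex:{row_label} ; scv:dimension ex:{header[c - 1]} ."
            )
    return lines


# Hypothetical one-cell sheet, reusing the value from the slide's example.
sheet = "Country,Tuberculosis\nAfghanistan,127\n"
for statement in csv_to_scovo(sheet, "Country", "Disease"):
    print(statement)
```

Each non-empty cell becomes one `scv:Item` that points at its row and column dimension values, which is the shape of the SCOVO output shown on the next slide.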

SCOVOfied GHO Data

    @prefix ex:  <http://example.org/who-data#> .
    @prefix scv: <http://purl.org/NET/scovo#> .

    ex:Country      rdfs:subClassOf scv:Dimension ; rdf:type rdfs:Class ; dc:title "Country" .
    ex:Disease      rdfs:subClassOf scv:Dimension ; rdf:type rdfs:Class ; dc:title "Disease" .
    ex:CountryCode  rdfs:subClassOf scv:Dimension ; rdf:type rdfs:Class ; dc:title "CountryCode" .

    ex:Afghanistan  rdf:type ex:Country ;     dc:title "Afghanistan" .
    ex:Tuberculosis rdf:type ex:Disease ;     dc:title "Tuberculosis" .
    ex:3010         rdf:type ex:CountryCode ; dc:title "3010" .

    ex:c1-r6 rdf:type scv:Item ;
        rdf:value 127 ;
        scv:dimension ex:Afghanistan ;
        scv:dimension ex:Tuberculosis ;
        scv:dimension ex:3010 .

Result: 3 million triples

Interlinking Datasets using SILK

Interlinking Results
Interlinks for:
• Publications - already present
• Disease - used SILK$
• Country - used SILK$
Number of interlinks obtained between datasets
$ Julius Volz, Christian Bizer, Martin Gaedke, Georgi Kobilarov: Discovering and Maintaining Links on the Web of Data. International Semantic Web Conference (ISWC 2009), Westfields, USA, October 2009.
http://www4.wiwiss.fu-berlin.de/bizer/silk/spec/
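The idea behind this kind of link discovery can be illustrated with a small Python sketch. SILK itself is driven by a declarative link specification (see the URL above), which is not reproduced here. The sketch below only shows the general principle of comparing labels from two datasets and keeping the pairs whose similarity reaches a threshold. The similarity measure (`difflib.SequenceMatcher` from the Python standard library), the threshold of 0.9, the function names and the sample labels are all assumptions made for illustration, not the configuration used in this work.

```python
from difflib import SequenceMatcher


def normalise(label):
    """Lower-case and collapse whitespace so trivial differences do not matter."""
    return " ".join(label.lower().split())


def discover_links(source_labels, target_labels, threshold=0.9):
    """Return (source, target, score) for every pair scoring at or above threshold."""
    links = []
    for s in source_labels:
        for t in target_labels:
            score = SequenceMatcher(None, normalise(s), normalise(t)).ratio()
            if score >= threshold:
                links.append((s, t, round(score, 2)))
    return links


# Hypothetical labels, not taken from the actual datasets.
who_diseases = ["Tuberculosis", "Malaria"]
trial_conditions = ["tuberculosis", "Malaria ", "Diabetes"]
for link in discover_links(who_diseases, trial_conditions):
    print(link)
```

Comparing every source label with every target label is quadratic, which is acceptable for short lists of countries and diseases but would need blocking or indexing for larger datasets.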

Validation by Linked Data Querying

    PREFIX who: <http://ghodata.org/>
    PREFIX ct: <http://data.
    PREFIX pubmed:

Query 1: diseases in India with an incidence above 70

    SELECT DISTINCT ?disease ?incidence ?country
    WHERE {
      ?x who:country "India" .
      ?x who:incidence ?incidence .
      ?x who:disease ?disease .
      FILTER(?incidence > 70)
    }

Query 2: number of clinical trials on tuberculosis per country

    SELECT DISTINCT ?disease ?country ?noOfTrials
    WHERE {
      ?disease who:disease "Tuberculosis" .
      ?y ct:disease ?disease .
      ?y ct:noOfTrials ?noOfTrials .
      ?y ct:country ?country .
    }

Query 3: number of publication references on tuberculosis per country

    SELECT ?country (COUNT(?reference) AS ?references)
    WHERE {
      ?disease who:disease "Tuberculosis" .
      ?z ct:disease ?disease .
      ?z ct:country ?country .
      ?z pubmed:reference ?reference .
    }
    GROUP BY ?country
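To make the join in the last query concrete, the following Python sketch evaluates the same pattern over a handful of in-memory triples. It is only an illustration of the logic: the triples are placeholders (`CountryA`, `ref1` and so on are invented, not real data), the three datasets are flattened into two lists for brevity, and the helper names are assumptions. A real evaluation would run the SPARQL query against the interlinked datasets.

```python
from collections import Counter

# Toy triples with placeholder values; none of this is real data.
who = [("d1", "disease", "Tuberculosis")]
trials = [
    ("t1", "disease", "d1"), ("t1", "country", "CountryA"),
    ("t1", "reference", "ref1"), ("t1", "reference", "ref2"),
    ("t2", "disease", "d1"), ("t2", "country", "CountryB"),
    ("t2", "reference", "ref3"),
]


def objects(triples, subject, predicate):
    """All objects of the given subject/predicate pair."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def references_per_country(disease_name):
    """Count references per country for trials on the named disease."""
    # ?disease who:disease "<disease_name>"
    diseases = {s for s, p, o in who if p == "disease" and o == disease_name}
    counts = Counter()
    # ?z ct:disease ?disease
    for z in {s for s, p, o in trials if p == "disease" and o in diseases}:
        refs = objects(trials, z, "reference")  # ?z pubmed:reference ?reference
        if not refs:
            continue  # a trial without references contributes no rows
        for country in objects(trials, z, "country"):  # ?z ct:country ?country
            counts[country] += len(refs)
    return dict(counts)


print(references_per_country("Tuberculosis"))
```

The interlinks matter here because the disease resource matched in the WHO data must be the same resource the trial data points to; without that shared identifier the join returns nothing.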

Conclusions
• Which disease has the highest percentage of health-care disparity with respect to the burden of disease and the clinical trials conducted in a particular country?
• As a research policy maker, to which research area would it be most beneficial to allocate funds?
• Who are the key people doing the most research on a particular disease?
• What has been the trend, over time, in health-care disparity for a particular region?

Limitations
• Information Quality
• Coverage
• Interlinking Quality
• Propagation of Errors

Future Work
• Improve Interlinking
• Interlinking with other relevant datasets
• Updating the knowledge base as new data is published
• Creating a user interface

Acknowledgements
• Research group Agile Knowledge Engineering & Semantic Web (AKSW): http://aksw.org
• Research on Research Group: http://researchonresearch.duhs.duke.edu/site