2be3ce6e20ff9288208540193d94094e.ppt
- Number of slides: 34
Measuring Coding Accuracy Artificial Intelligence in Medicine National Cancer Institute
Project: This project was funded in part by Contract Number 263-MQ-514922 from the National Cancer Institute. Participating registries: Kentucky Cancer Registry, Los Angeles Cancer Registry, Atlanta Cancer Registry, New Jersey Cancer Registry.
Objective: Develop a software tool that measures the accuracy of an automated coding system against a reference data set. Sub-tasks: (1) Define a coding accuracy model. (2) Create a software tool that accepts input from any automated coding system to produce accuracy data.
Automated coding
CLINICAL HISTORY/MACROSCOPY: Right mastectomy and axillary tissue. A right mastectomy specimen with overlying skin measuring 220 mm x 85 mm and underlying breast tissue measuring 220 mm x 100 mm x 70 mm. The axillary tail measures 125 x 60 mm. The nipple is slightly retracted and located centrally. The superior margin is painted red, the inferior margin painted green and the deep cut margin is painted blue. Cut sections of the underlying breast tissue shows an ill-defined grey white yellow lesion with patchy areas of haemorrhage measuring 35 x 35 mm located immediately below the nipple, 20 mm from the inferior margin, 45 mm from the deep cut margin, 50 mm from the superior margin, 85 mm from the medial margin and 100 mm from the lateral cut margin. A1 - nipple, B1 - upper outer quadrant, C1 - upper inner quadrant, D1 - lower outer quadrant, E1 - lower inner quadrant, F1, G1 - tumour composite blocks, H1, I1 - tumour composite blocks, J1 - deep cut margin, K1 - superior margin, L1 - inferior margin, M4 - lymph nodes, N4 - lymph nodes, O - 3 serial slices, lymph node, P - 3 lymph nodes.
MICROSCOPY: This right mastectomy specimen demonstrates an invasive ductal carcinoma with the following pathological features:
TUMOUR HISTOLOGY & GRADE: The tumour is of an infiltrating poorly differentiated ductal carcinoma of non-otherwise specified type. The tumour is poorly defined and extremely infiltrative, comprising poorly-formed tubules, nests or strands of cuboidal tumour cells displaying high grade nuclei. The tumour cells are set within fibrotic desmoplastic stroma. Many lactiferous ducts are entrapped within the tumour. Frequent tumour mitoses are seen. Microcalcification is seen in some neoplastic tubules.
Codes: M-80103, M-85003, M-80003, C50.9, C77.9
What is measured? Sensitivity, specificity, reducibility and confidence: of a single code, either topography or morphology; or of a pair of codes (topography, morphology).
Notation: Adenocarcinoma "M-81403" as the subject code (6 possibilities)
The report was coded by the:
Notation   Reference method            Automatic method
X:X        As X (M-81403)              As X (M-81403)
X:O        As X (M-81403)              As other than X (M-82113)
O:X        As other than X (M-82003)   As X (M-81403)
O:O        As other than X (M-82003)   As other than X (M-82003)
X:X+O      As X (M-81403)              As X plus other codes (M-81403, M-80103)
O:X+O      As other than X (M-82003)   As X plus other codes (M-81403, M-82003)
Definitions
The report was coded by the:
Notation   Reference method    Automatic method
X:X        As X                As X only
X:O        As X                As other than X
O:X        As other than X     As X only
O:O        As other than X     As other than X
X:X+O      As X                As X plus other codes
O:X+O      As other than X     As X plus other codes
Venn diagram regions: A = X:X, B = X:O, C = X:X+O, Q = O:O, R = O:X, S = O:X+O.
Note: The labeled areas include only the portions that are bounded by the arcs, not the entire circle.
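The six notation cases can be sketched as a small classifier. This is a minimal illustration, not the CODAC implementation: the function name `classify_report` is hypothetical, codes are assumed to be plain strings, and a reference set that contains the subject code is assumed to count as "coded as X".

```python
def classify_report(subject, reference_codes, automatic_codes):
    """Return the Venn-diagram region (A, B, C, Q, R or S) for one report.

    Assumption: the reference method coded the report "as X" whenever the
    subject code appears in reference_codes.
    """
    reference_has_x = subject in reference_codes
    auto_has_x = subject in automatic_codes
    auto_has_other = any(code != subject for code in automatic_codes)

    if not auto_has_x:
        # Automatic method returned something other than X (or nothing).
        return "B" if reference_has_x else "Q"   # X:O or O:O
    if auto_has_other:
        # Automatic method returned X plus other codes.
        return "C" if reference_has_x else "S"   # X:X+O or O:X+O
    # Automatic method returned X only.
    return "A" if reference_has_x else "R"       # X:X or O:X


# Cases taken from the adenocarcinoma notation slide (subject code M-81403).
x = "M-81403"
print(classify_report(x, {"M-81403"}, {"M-82113"}))             # X:O
print(classify_report(x, {"M-81403"}, {"M-81403", "M-80103"}))  # X:X+O
print(classify_report(x, {"M-82003"}, {"M-81403", "M-82003"}))  # O:X+O
```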
Definitions: Coding accuracy measures
Venn diagram regions: A = X:X, B = X:O, C = X:X+O, Q = O:O, R = O:X, S = O:X+O.
Note: The labeled areas include only the portions that are bounded by the arcs, not the entire circle.
Sensitivity = (A+C) / (A+B+C). How often the subject code is returned in those reports where it is the reference code.
Specificity = Q / (Q+R+S). How often the subject code is not returned in those reports where it is not the reference code.
Reducibility = (A+R) / (A+R+C+S). How often the subject code is the only code identified when the code is identified.
Confidence = A / (A+R). How much confidence can be placed in the result: how often the reference code is the subject code when the subject code is the only code returned.
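The four formulas translate directly into code. The sketch below assumes the six region counts are already known; the function name `accuracy_measures` and the example counts are hypothetical and are not results from the project.

```python
def accuracy_measures(a, b, c, q, r, s):
    """Compute the four measures from the Venn-region counts A, B, C, Q, R, S.

    A measure is None when its denominator is zero (an assumption of this
    sketch; the slides do not say how that case is handled).
    """
    def ratio(numerator, denominator):
        return numerator / denominator if denominator else None

    return {
        "sensitivity": ratio(a + c, a + b + c),
        "specificity": ratio(q, q + r + s),
        "reducibility": ratio(a + r, a + r + c + s),
        "confidence": ratio(a, a + r),
    }


# Hypothetical counts, for illustration only.
measures = accuracy_measures(a=80, b=10, c=10, q=880, r=10, s=10)
for name, value in measures.items():
    print(f"{name}: {value:.2f}")
```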
Data Flow (diagram). Source data: coded pathology reports, reference codes. Automated coding system: machine-generated codes. Input data. CODAC: comparison, matching and analysis; accuracy calculations; display of accuracy data. Output: SQL database with query assist; spreadsheets and graphs.
Software inputs
CODAC Front End
Software description: Written in C#, using the latest .NET technology. Runs on a standard Pentium workstation. Imports and exports CSV files, which can be edited with a text editor or Excel. Optional links to a SQL database engine. The performance of any automated coding system can be measured by using the specified data format.
Software operation: We ran 17,128 pathology reports through the software. The software automatically calculates accuracy parameters by comparing reference data to test data.
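The comparison of reference data with test data can be sketched as a tally over reports. The CSV layout used here (columns `report_id` and `codes`, with codes separated by spaces) is an assumption made for illustration, not the data format the software specifies; `load_codes` and `tally_regions` are hypothetical names.

```python
import csv
import io
from collections import Counter


def load_codes(csv_text):
    """Map report_id -> set of codes (assumed columns: report_id, codes)."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return {row["report_id"]: set(row["codes"].split()) for row in reader}


def tally_regions(subject, reference, automatic):
    """Count reports per Venn region (A, B, C, Q, R, S) for one subject code."""
    counts = Counter()
    for report_id, ref_codes in reference.items():
        auto_codes = automatic.get(report_id, set())
        in_ref = subject in ref_codes
        in_auto = subject in auto_codes
        has_other = bool(auto_codes - {subject})
        if not in_auto:
            region = "B" if in_ref else "Q"
        elif has_other:
            region = "C" if in_ref else "S"
        else:
            region = "A" if in_ref else "R"
        counts[region] += 1
    return counts


# Hypothetical four-report data set, for illustration only.
REFERENCE_CSV = """report_id,codes
r1,M-81403
r2,M-81403
r3,M-82003
r4,M-82003
"""
AUTOMATIC_CSV = """report_id,codes
r1,M-81403
r2,M-81403 M-80103
r3,M-81403
r4,M-82003
"""

counts = tally_regions(
    "M-81403", load_codes(REFERENCE_CSV), load_codes(AUTOMATIC_CSV)
)
print(dict(counts))
```

The resulting counts are the inputs to the sensitivity, specificity, reducibility and confidence formulas.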
Example of high confidence: M-81403 (Adenocarcinoma). Sensitivity 0.82; Specificity 0.90; Reducibility 0.08; Confidence 0.87; Reference count 2647 (15%).
Example of high confidence: M-81403, C61.9 (Adenocarcinoma, Prostate). Sensitivity 0.87; Specificity 0.99; Reducibility 0.06; Confidence 1.00; Reference count 1008 (6%).
Example of low confidence: C44.9 (Skin). Sensitivity 0.57; Specificity 0.76; Reducibility 0.05; Confidence 0.03; Reference count 67 (0.4%).
Morphology Accuracy Plots
Code Pairs
An Experiment: Modify AutoCode to produce output as follows: take the largest morphology value and the smallest topography value. Example: reduce M-82403, M-80001, C17.0, C16.9, C17.9 to M-82403, C16.9.
Morphology – MinMax rule
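The rule used in the experiment can be sketched in a few lines. This is an illustration rather than AutoCode's actual logic: the function name `min_max_reduce` is hypothetical, and codes are assumed to be normalized strings such as `M-82403` and `C16.9`.

```python
def min_max_reduce(codes):
    """Keep the largest morphology code and the smallest topography code.

    Assumption: morphology codes look like "M-82403" and topography codes
    like "C16.9", so the numeric part can be compared directly.
    """
    morphology = [code for code in codes if code.startswith("M-")]
    topography = [code for code in codes if code.startswith("C")]

    reduced = []
    if morphology:
        reduced.append(max(morphology, key=lambda code: int(code[2:])))
    if topography:
        reduced.append(min(topography, key=lambda code: float(code[1:])))
    return reduced


# The example from the slide.
print(min_max_reduce(["M-82403", "M-80001", "C17.0", "C16.9", "C17.9"]))
```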
Morphology – Before & After
Code Pairs – Before & After
Wrap Up: Created a coding accuracy measurement system. Applied it to AIM's AutoCode, but it can be used to measure any coding system. The software is available in the public domain.
Topography
Code Pairs – Min Max Rule