PseudoVet - Create Randomizer + Aging Algorithm Proof of Concept Challenge #2

Key Information

The challenge is finished.

Challenge Overview

Welcome to the “PseudoVet - Create Randomizer + Aging Algorithm Proof of Concept Challenge #2”.

 

Overview

 

PseudoVet is an automated patient data fabrication engine that provides a set of active synthetic patients and clinical data for use in healthcare software development. Development against real patient data unnecessarily exposes protected health information (PHI) and personally identifiable information (PII), and real data cannot be used by developers outside the VA network. Fully functional, realistic synthetic data sets, however, can be used safely in development, testing, training and other non-production environments in compliance with the Health Information Technology for Economic and Clinical Health Act (HITECH Act) and other regulations. Existing fabricated data sets are not a useful alternative: they are either outdated, which forces development teams to spend time building data sets instead of writing code, or they require licenses and cannot be shared.

 

Challenge Requirements

 

This is a follow-up challenge. We previously ran a challenge that generated a CCDA template based on certain patient data. We would now like to modify the winning submission from that earlier challenge (provided in the forums) to use an updated dataset.

 

The existing algorithm uses the following data format (CCDA template):

 

PATIENT SEED DATA:

  • Names: Last, First, Middle, Suffixes (last names should be real)

  • Phone Number

  • Email

  • Address

  • Social Security Numbers (must be valid but taken from deceased persons or otherwise safe to assign for development and testing purposes; numbers beginning with 000 or 666 can be used)

  • Occupation and Incomes

  • Gender

  • Race

  • Height

  • Weight

  • Religion

  • Language

  • Next of Kin

  • Emergency Contact

  • Smoking and Alcohol History

  • ICN: Local and National

  • Allergies / Drug Sensitivities

  • Medications

  • Vital Signs

  • Military Service

  • Service Connected and Non-Service Connected Disabilities

  • Locations (Address data)

  • War Eras

    • WWII, Korea, Vietnam, Gulf War

  • ICD-10, SNOMED Diagnosis, DSM and Procedure Codes

  • TIU Note Templates

  • Lab Value Mapping

  • Diagnosis Mapping

  • Family History Data

  • Immunizations

  • Genetic Diagnosis Information

  • Hospitalizations

  • Insurance Data

  • Consent

  • Social History

  • Problem

  • Advance Directives

  • Encounter Data (Outpatient/Inpatient)

  • Appointments

  • Procedures (Clinical, Surgical, Physical)

  • Facility Name

  • Facility Address

  • Primary Care Provider Assignment

  • Providers (Other) Assignment

  • Clinics

  • Consults

  • Referrals

  • Clinical Instructions

  • Medications Administered

  • Location of Admission and Discharge

  • Health Plan Authorization Act

  • Mapping between Lab Values and Diagnosis

  • Mapping between Diagnosis and Procedure Codes
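
The Social Security Number item in the list above could be handled along the following lines. This is only a minimal sketch, not the winning submission's code: the function name `make_test_ssn` and the constant `TEST_AREAS` are hypothetical, and it assumes that "000 or 666" refers to the first three digits of the number.

```python
import random

# Assumption: "000 or 666" in the seed-data list means the leading
# three digits (area number) of the generated SSN.
TEST_AREAS = ("000", "666")


def make_test_ssn(rng: random.Random) -> str:
    """Return an SSN-shaped string (AAA-GG-SSSS) with a 000 or 666 prefix."""
    area = rng.choice(TEST_AREAS)
    group = rng.randint(1, 99)     # two-digit group, never 00
    serial = rng.randint(1, 9999)  # four-digit serial, never 0000
    return f"{area}-{group:02d}-{serial:04d}"


# A seeded generator keeps the fabricated data set reproducible.
rng = random.Random(42)
print(make_test_ssn(rng))
```

Passing in a seeded `random.Random` instead of using the module-level functions lets the same 1000 records be regenerated on demand.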

 

HEALTHCARE PROVIDER SEED DATA:

(i.e. Clerical, Nursing, Physician, Radiologist, Social Work, Laboratory, et al)

  • Address

  • Room Numbers

  • Clinic Locations

  • Clinical Hours

  • Clinic Types (Initial)

    • Internal Medicine

    • Primary Care Outpatient

    • Audiology

    • Mental Health

    • ENT

    • Optometry

    • Dental

    • Cardiology

    • Emergency Room

 

We now want to change the algorithm to use the following data, which we collected in the last couple of challenges (provided in the forums):

  • Date of Birth

  • Date of Death (if exists)

  • Gender (M/F/U)

  • Height

  • Weight

  • Language Codes

  • Language Preference Indicator

  • Military Branch (if exists)

  • Military start and end dates (if exists)

  • Military era (if exists, or can be found, else null)

  • Behaviors (Smoking, Drinking etc.)

  • And others as may be determined

Morbidity data for the following wars (provided in the forums):

  • World War II

  • Korean Conflict

  • Vietnam War

  • Persian Gulf War
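
The "Military era (if exists, or can be found, else null)" field can be derived from the military start and end dates when it is not supplied. The sketch below shows one way to do that. The function name `military_era` is hypothetical, and the date ranges in `ERAS` are assumptions made for illustration; they should be checked against the data provided in the forums.

```python
from datetime import date
from typing import Optional

# Assumed era boundaries, for illustration only; verify against the forum data.
ERAS = [
    ("World War II", date(1941, 12, 7), date(1946, 12, 31)),
    ("Korean Conflict", date(1950, 6, 27), date(1955, 1, 31)),
    ("Vietnam War", date(1964, 8, 5), date(1975, 5, 7)),
    ("Persian Gulf War", date(1990, 8, 2), date.max),  # treated as open-ended
]


def military_era(start: Optional[date], end: Optional[date]) -> Optional[str]:
    """Return the earliest era that overlaps the service period, else None."""
    if start is None:
        return None  # no military service recorded
    end = end or date.max  # a missing end date is treated as still serving
    for name, era_start, era_end in ERAS:
        # Two date ranges overlap when each one starts before the other ends.
        if start <= era_end and end >= era_start:
            return name
    return None


print(military_era(date(1966, 1, 1), date(1968, 1, 1)))
print(military_era(date(1980, 1, 1), date(1985, 1, 1)))
```

Returning `None` when no era matches maps directly onto the "else null" wording of the requirement.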

 

The produced dataset MUST:

 

  • Incorporate ICD-10 Codes & Sub-classifications

  • Incorporate database inputs from SNOMED, procedure and diagnostic codes, lab values & billing codes

  • Be a data collection with medically relevant identified connection points

 

Algorithm Requirements

 

  • Must use the CCDA template defined above and populate 1000 CCDA records with random but accurate data based on the new veteran and morbidity data

  • Once the 1000 CCDA records are created, the algorithm must be able to create versions of each record aged by 5 years, 10 years and 15 years

  • The output of your algorithm must be populated CCDA records with random but accurate data, and then copies of those records at 5 years, 10 years, and 15 years
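
The aging step described in the requirements above can be pictured as producing dated copies of a base record. The following is a minimal sketch under stated assumptions, not a definitive implementation: `PatientSnapshot`, `shift_years` and `aged_versions` are hypothetical names, only the date fields are modeled, and a real submission would also have to evolve clinical content (vitals, problems, encounters) using the morbidity data.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import Dict, Optional, Tuple


@dataclass(frozen=True)
class PatientSnapshot:
    patient_id: str
    date_of_birth: date
    date_of_death: Optional[date]
    as_of: date  # the date this version of the record describes

    @property
    def age(self) -> int:
        # Age stops advancing at the date of death, if one is recorded.
        ref = self.as_of
        if self.date_of_death is not None and self.date_of_death < ref:
            ref = self.date_of_death
        dob = self.date_of_birth
        had_birthday = (ref.month, ref.day) >= (dob.month, dob.day)
        return ref.year - dob.year - (0 if had_birthday else 1)


def shift_years(d: date, years: int) -> date:
    """Move a date forward by whole years, mapping Feb 29 to Feb 28 if needed."""
    try:
        return d.replace(year=d.year + years)
    except ValueError:
        return d.replace(year=d.year + years, day=28)


def aged_versions(
    base: PatientSnapshot, offsets: Tuple[int, ...] = (5, 10, 15)
) -> Dict[int, PatientSnapshot]:
    """Return copies of the base record with as_of advanced by each offset."""
    return {n: replace(base, as_of=shift_years(base.as_of, n)) for n in offsets}


base = PatientSnapshot("p1", date(1950, 3, 15), None, date(2018, 1, 1))
for years, snap in aged_versions(base).items():
    print(years, snap.as_of.isoformat(), snap.age)
```

Keeping the snapshot immutable and deriving age from the dates means the base record and its 5, 10 and 15 year copies cannot drift out of sync with the date of birth.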


Submission Evaluation

Submissions will be evaluated by the copilot and PM on a scale of 1-10 (5 being the passing score) based on the accuracy of the data (including the copies for 5, 10 and 15 years) and compliance with the CCDA template format. There will be no appeals or appeals responses. Example criteria:

  • How well does the CCDA model match the template?

  • Does the submission generate data copies for the various year durations as required by the spec?

  • There should be no junk data (e.g. asdfasdfad) in any of the fields.
     



Final Submission Guidelines

 
  • Algorithm code with instructions on how to run the algorithm, input data and receive results

  • Zipped dataset

  • Demo video of how to configure and run your algorithm

ELIGIBLE EVENTS:

2018 Topcoder(R) Open

REVIEW STYLE:

Final Review:

Community Review Board

Approval:

User Sign-Off


ID: 30061829