Early Detection of Lung Cancer

2017-11-16T19:34:54+00:00

Project Description

Project Brief

The goal was to develop models that predict the malignancy risk of abnormalities (nodules) found in patients’ lungs. The predictive models are based on a diverse set of machine learning algorithms whose inputs are measured characteristics of the nodules, together with patient demographics, such as:

  • Size
  • Shape
  • Density
  • Volume
  • Patient demographics such as years smoking, age, gender, race, etc.
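As a rough illustration of how inputs like those listed above can feed a risk model, here is a minimal sketch of a logistic scoring function. The field names, the `NoduleCase` class, and every weight are hypothetical placeholders invented for this example; they are not the project's models and are not fitted to any data.

```python
import math
from dataclasses import dataclass


@dataclass
class NoduleCase:
    """One nodule plus patient context (field names are illustrative)."""
    diameter_mm: float     # size
    spiculated: bool       # shape
    solid: bool            # density
    volume_mm3: float      # volume
    age_years: float       # patient demographics
    years_smoking: float   # patient demographics


# Placeholder weights for illustration only; NOT fitted to any data.
PLACEHOLDER_WEIGHTS = {
    "intercept": -6.0,
    "diameter_mm": 0.10,
    "spiculated": 0.8,
    "solid": 0.4,
    "volume_mm3": 0.0002,
    "age_years": 0.03,
    "years_smoking": 0.02,
}


def malignancy_risk(case: NoduleCase, weights=PLACEHOLDER_WEIGHTS) -> float:
    """Return a probability in (0, 1) from a logistic model."""
    z = (
        weights["intercept"]
        + weights["diameter_mm"] * case.diameter_mm
        + weights["spiculated"] * float(case.spiculated)
        + weights["solid"] * float(case.solid)
        + weights["volume_mm3"] * case.volume_mm3
        + weights["age_years"] * case.age_years
        + weights["years_smoking"] * case.years_smoking
    )
    # The logistic function maps the linear score z to a probability.
    return 1.0 / (1.0 + math.exp(-z))
```

A logistic model is only the simplest member of the family; it is used here because it makes the mapping from features to a risk score easy to read.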

In addition, computer vision algorithms were developed to extract these characteristics automatically from the images, so as to automate the detection process. The core technologies were then deployed to cloud-based infrastructure as a software-as-a-service offering (behind web APIs), which physicians could use as an aid in detecting cancer and quantifying the malignancy risk of a given nodule.
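To make the idea of extracting a characteristic from image data concrete, the sketch below computes a nodule's volume and equivalent spherical diameter from a binary segmentation mask. It assumes the segmentation has already been done and that voxel spacing is known; the function names and the nested-list mask format are assumptions made for this example, not the project's actual pipeline.

```python
import math


def nodule_volume_mm3(mask, spacing_mm):
    """Volume of a segmented nodule.

    mask: nested lists indexed [slice][row][col], 1 inside the nodule, 0 outside.
    spacing_mm: (dz, dy, dx) voxel spacing in millimetres.
    """
    dz, dy, dx = spacing_mm
    voxel_count = sum(v for plane in mask for row in plane for v in row)
    return voxel_count * dz * dy * dx


def equivalent_diameter_mm(volume_mm3):
    """Diameter of a sphere with the same volume: d = (6V / pi) ** (1/3)."""
    return (6.0 * volume_mm3 / math.pi) ** (1.0 / 3.0)


# Usage: a 2 x 2 x 2 block of nodule voxels with 1.0 x 0.5 x 0.5 mm spacing.
mask = [[[1, 1], [1, 1]], [[1, 1], [1, 1]]]
volume = nodule_volume_mm3(mask, (1.0, 0.5, 0.5))
diameter = equivalent_diameter_mm(volume)
```

Multiplying by the voxel spacing matters because scans are often anisotropic, so a raw voxel count alone is not comparable across scans.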

Finally, plugins for multiple DICOM viewers were developed to facilitate physician interaction with said services.

Skills Needed

A diverse set of skills was needed for this project, as it spanned multiple technologies (Java, Python, JavaScript, cloud services, desktop application development, etc.) and disciplines (computer vision, web API development, project coordination), and required coordinating a team of more than ten individuals.

Project Planning 98%
Data Science 92%
Computer Vision 98%
Web Development 96%


Initial Concept Planning

The goals were ambitious. The key was isolating the essentials and estimating what could be done given the client's risk appetite. It was agreed that five distinct classes of machine learning models would be applied to the cancer risk prediction problem in order to account for the distinct classes of nonlinearities that could be hiding in the data. It was also agreed that four distinct classes of features would be auto-extracted directly from the images using a set of computer vision algorithms at our disposal.

Drafts & Revisions

A prototype of the system was rapidly developed as a proof of concept of the automated feature extraction. The most accessible machine learning models were implemented and compared to benchmarks in the existing literature. As progress looked favorable, further work was approved.
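Comparing a model against published benchmarks needs a shared metric. The page does not say which metric was used; the sketch below assumes area under the ROC curve (AUC), a common choice for risk models, and computes it directly from its pairwise definition. The function name and the toy labels and scores are illustrative only.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison.

    Equals the probability that a randomly chosen malignant case (label 1)
    receives a higher score than a randomly chosen benign case (label 0).
    Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Usage with toy data: three of the four malignant/benign pairs are ranked correctly.
auc = roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

The pairwise form is quadratic in the number of cases, which is fine for a sketch; a rank-based computation would be preferable on large datasets.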

Final Delivery

The system was delivered on cloud-based infrastructure, which allowed it to be used as software as a service. Due to the ambitious and experimental nature of the project, it was agreed that there would be no predetermined end, but rather a virtuous cycle of iterative improvements.
