Thanks for stopping by and checking out my personal page. I am a clinical fellow in hematology & oncology at Weill Cornell Medicine, NewYork-Presbyterian Hospital. Prior to joining Weill Cornell for fellowship, I completed an MD/PhD at Johns Hopkins University within the School of Medicine and Whiting School of Engineering, followed by my internal medicine residency at the Mount Sinai Hospital in the Icahn School of Medicine.
My doctoral work (Thesis Advisor: Drew M. Pardoll, MD, PhD; Co-Mentor: Alexander S. Baras, MD, PhD) was at the intersection of artificial intelligence, machine learning, and cancer immunogenomics, where I worked on several projects applying and/or developing novel algorithmic approaches to yield insights into the interaction between cancer and the immune system.
Prior to matriculating into the MD/PhD program at Johns Hopkins, I completed a Bachelor of Science in Engineering (BSE) in Biomedical Engineering with a minor in mathematics and a certificate in entrepreneurship at the University of Michigan in 2011. Following my undergraduate degree, I went on to complete a Master’s of Science & Engineering in Biomedical Engineering within the Center for Bioengineering Innovation and Design at Johns Hopkins. During this time (read about my experience in CBID here), I developed an interest in medical device design, applying methods in computational mechanics to invent devices such as the CryoPop, a low-cost cryotherapy unit that utilizes readily available carbon dioxide tanks for the treatment of cervical cancer. The CryoPop has been issued a patent by the USPTO, has completed its first clinical trial (NCT02367625), is currently enrolling in its second (NCT04154644) to demonstrate efficacy, and is commercially available for clinical use through Pregna International Ltd.
I also serve on the board of the student chapter of the Coptic Medical Association of North America (CMANA) as its current treasurer. As part of the student chapter, I have helped organize our annual conferences and local events that allow medical trainees at all stages to experience medical missions through the mother organization. Our student chapter also organizes opportunities for networking, including online talks/forums, an annual gala, and a quarterly newsletter.
When I am not coding away or in the hospital, I enjoy staying physically active (CrossFit, Olympic weightlifting, skiing, and most recently, triathlon), playing and producing music (singing, piano, guitar), and amateur photography. Otherwise, I enjoy visiting new cities, eating good food, and having a fun time with friends, family, & my dog Leo.
I hope this page provides insight into the work I have done and areas of research I hope to continue to pursue in my future career as well as my own personal interests! Feel free to contact me and reach out with any questions or research collaboration opportunities!
Doctorate of Medicine, 2021
Johns Hopkins University School of Medicine
Doctorate of Philosophy in Biomedical Engineering, 2021
Johns Hopkins University Whiting School of Engineering
Master's of Science & Engineering in Biomedical Engineering, 2012
Johns Hopkins University Whiting School of Engineering
Bachelor of Science in Engineering in Biomedical Engineering, 2011
University of Michigan
Deep Learning Nanodegree
Udacity School of AI
… until now.
— 𝐉𝐨𝐡𝐧-𝐖𝐢𝐥𝐥𝐢𝐚𝐦 𝐒𝐢𝐝𝐡𝐨𝐦, 𝐌𝐃, 𝐏𝐡𝐃 (@John_Will_I_Am) November 29, 2023
I could not be more thrilled to have matched at @WeilCornell this year for heme/onc fellowship and fulfill a dream that has been over a decade in the making.
Excited to join @WCM_MeyerCancer being led by the amazing @wolchokj and @merghout
See you in July!!! https://t.co/YfQzbVGk29 pic.twitter.com/gmH8VHdvcW
During my PhD, I was a visiting scholar at @MSFTResearch
— 𝐉𝐨𝐡𝐧-𝐖𝐢𝐥𝐥𝐢𝐚𝐦 𝐒𝐢𝐝𝐡𝐨𝐦, 𝐌𝐃, 𝐏𝐡𝐃 (@John_Will_I_Am) November 3, 2023
We extended the multiple-instance learning framework developed in #DeepTCR to somatic variants in cancer.
Excited to share Aggregation Tool for Genomic Concepts (#ATGC), now out in @natBME https://t.co/BCymfrXAp7
One of my greatest fears upon entering residency was the potential toll it could take on the foundational habits of diet and exercise that I had diligently maintained for many years. I was aware that within a few years, if I didn't intentionally prioritize my physical health, it… pic.twitter.com/qw991kIEQR
— 𝐉𝐨𝐡𝐧-𝐖𝐢𝐥𝐥𝐢𝐚𝐦 𝐒𝐢𝐝𝐡𝐨𝐦, 𝐌𝐃, 𝐏𝐡𝐃 (@John_Will_I_Am) October 2, 2023
First open water 5k swim in the books (1:52), and I could not be more proud to have done it for such a personally meaningful cause (#SAA), being part of an effort that raised 1.8 million dollars today to continue supporting groundbreaking cancer research.
— 𝐉𝐨𝐡𝐧-𝐖𝐢𝐥𝐥𝐢𝐚𝐦 𝐒𝐢𝐝𝐡𝐨𝐦, 𝐌𝐃, 𝐏𝐡𝐃 (@John_Will_I_Am) July 29, 2023
8 weeks ago, when I… pic.twitter.com/eUipCHdOGE
5 years ago, I proposed using #DeepLearning to predict response to immunotherapy in TCR-Seq.
— 𝐉𝐨𝐡𝐧-𝐖𝐢𝐥𝐥𝐢𝐚𝐦 𝐒𝐢𝐝𝐡𝐨𝐦, 𝐌𝐃, 𝐏𝐡𝐃 (@John_Will_I_Am) September 16, 2022
🧬🧬🧬🧬
Last year, we published #DeepTCR. Today, I am proud to share its application in cancer immunology, now out in @ScienceAdvances #ScienceResearch
🔶🔷🔶🔷https://t.co/VRxPqwZP5F
It’s not lost on me how special it is to be given this award on 3 separate occasions. Thanks @sitcancer for the continuous and repeated support of my research in the field of cancer immunology. #sitc2021 #sitc21 pic.twitter.com/OtjnYUWdic
— 𝐉𝐨𝐡𝐧-𝐖𝐢𝐥𝐥𝐢𝐚𝐦 𝐒𝐢𝐝𝐡𝐨𝐦, 𝐌𝐃, 𝐏𝐡𝐃 (@John_Will_I_Am) November 12, 2021
In 2017, I attended a talk by @Google at @AACR on #DeepLearning. I realized then the potential for deep learning for analyzing TCR-Seq data & thus, the idea for #DeepTCR was born. 4 years later, our manuscript is now available at @NatureComms https://t.co/Gx6ujCt9ux
— 𝐉𝐨𝐡𝐧-𝐖𝐢𝐥𝐥𝐢𝐚𝐦 𝐒𝐢𝐝𝐡𝐨𝐦, 𝐌𝐒𝐄 (@John_Will_I_Am) March 11, 2021
Deep learning for diagnosis of Acute Promyelocytic Leukemia via recognition of genomically imprinted morphologic features
A deep learning framework for revealing sequence concepts within T-cell repertoires
Synthesizing the worlds of engineering and clinical medicine
A Graphical User Interface for Streamlining Analysis of High-Dimensional Cytometry Data
My first foray into deep learning
A Structural Bioinformatics Tool for T-cell Repertoire Analysis
Enabling Affordable Prevention of Cervical Cancer in the Developing World
Filtering immunogenic wear debris to extend the lifetime of artificial joint replacement
Digital manometry for diagnosis of dyssynergic defecation
Thesis from course in nonlinear dynamics and chaos theory
A dynamic unloading technology to extend the lifetime of artificial joint replacement
Large-scale genomic data are well suited to analysis by deep learning algorithms. However, for many genomic datasets, labels are at the level of the sample rather than for individual genomic measures. Machine learning models leveraging these datasets generate predictions by using statically encoded measures that are then aggregated at the sample level. Here we show that a single weakly supervised end-to-end multiple-instance-learning model with multi-headed attention can be trained to encode and aggregate the local sequence context or genomic position of somatic mutations, hence allowing for the modelling of the importance of individual measures for sample-level classification and thus providing enhanced explainability. The model solves synthetic tasks that conventional models fail at, and achieves best-in-class performance for the classification of tumour type and for predicting microsatellite status. By improving the performance of tasks that require aggregate information from genomic datasets, multiple-instance deep learning may generate biological insight.
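As a rough illustration of the aggregation idea in the abstract above, the following is a minimal sketch of multi-headed attention pooling over a "bag" of instances (for example, encoded somatic mutations from one sample). It is not the published ATGC implementation: the function names, array shapes, and random weights are assumptions made for illustration, and in a real model the weights would be learned end to end together with the instance encoder.

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def attention_mil_pool(instances, w_score, w_out):
    """Aggregate one bag of instances into sample-level class probabilities.

    instances: (n_instances, n_features) encoded measures for one sample
    w_score:   (n_features, n_heads) attention scoring weights (hypothetical)
    w_out:     (n_heads * n_features, n_classes) classifier weights (hypothetical)
    """
    scores = instances @ w_score          # (n_instances, n_heads)
    attn = softmax(scores, axis=0)        # each head's weights sum to 1 over instances
    pooled = attn.T @ instances           # (n_heads, n_features) weighted averages
    logits = pooled.reshape(-1) @ w_out   # concatenate heads, then classify
    return softmax(logits), attn

# Toy bag: 5 instances with 4 features, 2 attention heads, 3 classes.
rng = np.random.default_rng(0)
bag = rng.normal(size=(5, 4))
w_score = rng.normal(size=(4, 2))
w_out = rng.normal(size=(8, 3))
probs, attn = attention_mil_pool(bag, w_score, w_out)
```

The per-instance attention weights in `attn` are what make this kind of model inspectable: a high weight marks an instance that contributed strongly to the sample-level prediction.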
T cell receptor (TCR) sequencing has been used to characterize the immune response to cancer. However, most analyses have been restricted to quantitative measures such as clonality that do not leverage the complementarity-determining region 3 (CDR3) sequence. We use DeepTCR, a framework of deep learning algorithms, to reveal sequence concepts that are predictive of response to immunotherapy. We demonstrate that DeepTCR can predict response and use the model to infer the antigenic specificities of the predictive signature and their unique dynamics during therapy. The predictive signature of nonresponse is associated with high frequencies of TCRs predicted to recognize tumor-specific antigens, and these tumor-specific TCRs undergo a higher degree of dynamic changes on therapy in nonresponders versus responders. These results are consistent with a biological model where the hallmark of nonresponders is an accumulation of tumor-specific T cells that undergo turnover on therapy, possibly because of the dysfunctional state of these T cells in nonresponders.
SARS-CoV-2 infection is characterized by a highly variable clinical course with patients experiencing asymptomatic infection all the way to requiring critical care support. This variation in clinical course has led physicians and scientists to study factors that may predispose certain individuals to more severe clinical presentations in hopes of either identifying these individuals early in their illness or improving their medical management. We sought to understand immunogenomic differences that may result in varied clinical outcomes through analysis of T-cell receptor sequencing (TCR-Seq) data in the open access ImmuneCODE database. We identified two cohorts within the database that had clinical outcomes data reflecting severity of illness and utilized DeepTCR, a multiple-instance deep learning repertoire classifier, to predict patients with severe SARS-CoV-2 infection from their repertoire sequencing. We demonstrate that patients with severe infection have repertoires with higher T-cell responses associated with SARS-CoV-2 epitopes and identify the epitopes that result in these responses. Our results provide evidence that the highly variable clinical course seen in SARS-CoV-2 infection is associated with certain antigen-specific responses.
Acute promyelocytic leukemia (APL) is a subtype of acute myeloid leukemia (AML), classified by a translocation between chromosomes 15 and 17 [t(15;17)], that is considered a true oncologic emergency though appropriate therapy is considered curative. Therapy is often initiated on clinical suspicion, informed by both clinical presentation as well as direct visualization of the peripheral smear. We hypothesized that genomic imprinting of morphologic features learned by deep learning pattern recognition would have greater discriminatory power and consistency compared to humans, thereby facilitating identification of t(15;17) positive APL. By applying both cell-level and patient-level classification linked to t(15;17) PML/RARA ground-truth, we demonstrate that deep learning is capable of distinguishing APL in both a discovery cohort and a prospective independent cohort of patients. Furthermore, we extract learned information from the trained network to identify previously undescribed morphological features of APL. The deep learning method we describe herein potentially allows a rapid, explainable, and accurate physician-aid for diagnosing APL at the time of presentation in any resource-poor or -rich medical setting given the universally available peripheral smear.
Deep learning algorithms have been utilized to achieve enhanced performance in pattern-recognition tasks. The ability to learn complex patterns in data has tremendous implications in immunogenomics. T-cell receptor (TCR) sequencing assesses the diversity of the adaptive immune system and allows for modeling its sequence determinants of antigenicity. We present DeepTCR, a suite of unsupervised and supervised deep learning methods able to model highly complex TCR sequencing data by learning a joint representation of a TCR by its CDR3 sequences and V/D/J gene usage. We demonstrate the utility of deep learning to provide an improved ‘featurization’ of the TCR across multiple human and murine datasets, including improved classification of antigen-specific TCRs and extraction of antigen-specific TCRs from noisy single-cell RNA-Seq and T-cell culture-based assays. Our results highlight the flexibility and capacity for deep neural networks to extract meaningful information from complex immunogenomic data for both descriptive and predictive purposes.
With the advent of flow cytometers capable of measuring an increasing number of parameters, scientists continue to develop larger panels to phenotypically explore characteristics of their cellular samples. However, these technological advancements yield high-dimensional data sets that have become increasingly difficult to analyze objectively within traditional manual-based gating programs. In order to better analyze and present data, scientists partner with bioinformaticians with expertise in analyzing high-dimensional data to parse their flow cytometry data. While these methods have been shown to be highly valuable in studying flow cytometry, they have yet to be incorporated in a straightforward and easy-to-use package for scientists who lack computational or programming expertise. To address this need, we have developed ExCYT, a MATLAB-based Graphical User Interface (GUI) that streamlines the analysis of high-dimensional flow cytometry data by implementing commonly employed analytical techniques for high-dimensional data including dimensionality reduction by t-SNE, a variety of automated and manual clustering methods, heatmaps, and novel high-dimensional flow plots. Additionally, ExCYT provides traditional gating options of select populations of interest for further t-SNE and clustering analysis as well as the ability to apply gates directly on t-SNE plots. The software provides the additional advantage of working with either compensated or uncompensated FCS files. In the event that post-acquisition compensation is required, the user can choose to provide the program a directory of single stains and an unstained sample. The program detects positive events in all channels and uses this select data to more objectively calculate the compensation matrix. 
In summary, ExCYT provides a comprehensive analysis pipeline to take flow cytometry data in the form of FCS files and allow any individual, regardless of computational training, to use the latest algorithmic approaches in understanding their data.
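The post-acquisition compensation step mentioned above can be sketched in a few lines. This is a generic illustration of spillover compensation from single-stain controls, not ExCYT's actual MATLAB code: the function names, the 99th-percentile rule for calling events "positive", and the synthetic data are all assumptions made for the example.

```python
import numpy as np

def spillover_from_single_stains(single_stains, unstained):
    """Estimate a spillover matrix from single-stain controls.

    single_stains: list of (n_events, n_channels) arrays; control i is
                   stained only for channel i
    unstained:     (n_events, n_channels) array of unstained events
    Returns S where row i holds the fraction of stain i seen in each channel.
    """
    background = np.median(unstained, axis=0)
    n = len(single_stains)
    S = np.zeros((n, n))
    for i, events in enumerate(single_stains):
        # Hypothetical positivity rule: brighter than 99% of unstained events
        threshold = np.percentile(unstained[:, i], 99)
        positive = events[events[:, i] > threshold]
        signal = np.median(positive, axis=0) - background
        S[i] = signal / signal[i]   # normalize so the primary channel is 1
    return S

def compensate(events, S):
    """Undo spillover: observed = true @ S, so true = observed @ inv(S)."""
    return events @ np.linalg.inv(S)

# Synthetic two-channel example with a known spillover matrix.
rng = np.random.default_rng(0)
true_S = np.array([[1.0, 0.2], [0.1, 1.0]])
unstained = rng.normal(0.0, 1.0, size=(500, 2))
single_stains = [1000.0 * true_S[i] + rng.normal(0.0, 1.0, size=(500, 2))
                 for i in range(2)]
S_est = spillover_from_single_stains(single_stains, unstained)

true_events = rng.uniform(100.0, 1000.0, size=(10, 2))
recovered = compensate(true_events @ true_S, true_S)
```

Using the median of only the positive events keeps dim or unstained cells in a control tube from diluting the estimated spillover fractions.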
Despite a dramatic increase in T-cell receptor (TCR) sequencing, few approaches biologically parse the data in a fashion that both helps yield new information about immune responses and may guide immunotherapeutic interventions. To address this issue, we developed a method, ImmunoMap, that utilizes a sequence analysis approach inspired by phylogenetics to examine TCR repertoire relatedness. ImmunoMap analysis of the CD8 T-cell response to self-antigen (Kb-TRP2) or to a model foreign antigen (Kb-SIY) in naïve and tumor-bearing B6 mice showed differences in the T-cell repertoire of self- versus foreign antigen-specific responses, potentially reflecting immune pressure by the tumor, and also detected lymphoid organ–specific differences in TCR repertoires. When ImmunoMap was used to analyze clinical trial data of tumor-infiltrating lymphocytes from patients being treated with anti–PD-1, ImmunoMap, but not standard TCR sequence analyses, revealed a clinically predictive signature in pre- and posttherapy samples. Cancer Immunol Res; 6(2); 151–62. ©2017 AACR.
A device for providing a cryotherapy ablation treatment includes a piping assembly and a snow horn adapted to create a spray of snow from a pressurized source of a low-temperature liquid, a tubular applicator for collecting a mass of snow at a prescribed density that is sufficient to allow the mass to serve as the needed, low temperature, thermal reservoir for the device after the applicator’s distal end has been disconnected from the snow horn end so that it can be used during the treatment process, and an applicator tip adapted to allow it to connect to the applicator’s distal end and be used to treat those specific locations which are to receive this treatment.
Be Still, My Soul