A data scientist in cancer proteogenomics

My passion is cancer research


Hello, and welcome to my online portfolio!

My passion is cancer research. Specifically, I want to transform clinical practice through a better understanding of this disease, which I believe will increasingly lead to personalized treatments. To support this vision, I have recently started an independent fellowship in cancer vaccine science, held between the University of Gdansk and the University of Edinburgh, at the International Centre for Cancer Vaccine Science (ICCVS) in Gdansk, Poland.

I am a data scientist specializing in cancer genomics and proteomics (proteogenomics). I develop algorithms to identify antigens presented on the cancer cell surface, and I help prioritize which of these antigens would make good cancer vaccines. I analyze very large datasets to create clinically useful diagnostics that will help personalize therapy. I collaborate with biophysicists and technologists working on cutting-edge technologies in genomics and proteomics to support their translation to clinical research. While I investigate all types of cancer, my current focus is on oral adenocarcinomas, sarcomas and ovarian cancers.

To summarize, I support cancer vaccine research at the ICCVS through these three research arms:

Algorithm and Pipeline Development

Building and benchmarking pipelines for the identification of neoantigens presented on the surface of cancer cells

Clinical Proteogenomics

Neoantigen prioritization based on proteogenomic datasets derived from patient cohorts

Applying Emerging Technologies

Identifying emerging technologies in genomics and proteomics and supporting their initial application to cancer vaccine science

Algorithm and Pipeline Development

At the intersection of proteomics and genomics, proteogenomics represents one of the last frontiers of integrative -omics. There is an abundance of fresh questions in this field, all relating to how technological improvements in genomics and transcriptomics can improve our understanding of the expressed human proteome. Within this field, I focus on the detection of aberrant gene products at the peptide level. Borrowing from lessons learned using proteomic data to refine genome models, I benchmark existing strategies and apply these methods to the analysis of patient tumours.

I am developing and improving pipelines for the proteogenomic identification of protein variants within mass-spectrometry datasets, using a mixed supervised and semi-supervised approach to probe these datasets. Standard mass-spectrometry search algorithms rely on matching MS2 spectra to peptides within a database of proteins expected in the sample. Identifying variant peptides therefore involves finding the best ways to generate such databases by leveraging matched genomics and transcriptomics datasets. Alternatively, variant peptides harboring SNVs and short indels can be identified by a so-called 'open-search' approach, which searches for small modifications to existing peptides.
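The database-generation idea can be illustrated with a minimal sketch. It is not the pipeline used at ICCVS: it assumes variants have already been translated to single amino-acid substitutions, uses a simplified trypsin rule (cleave after K or R unless followed by P, no missed cleavages), and all function names are hypothetical. It returns the peptides that exist only in the mutated protein, which are the entries one would append to a search database.

```python
import re

def tryptic_digest(protein, min_len=6):
    """Simplified in-silico trypsin digest: cleave after K/R unless followed by P.
    No missed cleavages; peptides shorter than min_len are discarded."""
    peptides = re.split(r"(?<=[KR])(?!P)", protein)
    return [p for p in peptides if len(p) >= min_len]

def apply_substitution(protein, position, ref, alt):
    """Apply one amino-acid substitution; position is 1-based."""
    if protein[position - 1] != ref:
        raise ValueError(f"expected {ref} at position {position}")
    return protein[:position - 1] + alt + protein[position:]

def variant_peptides(protein, substitutions, min_len=6):
    """Return peptides produced by the mutated protein but absent from the
    reference digest. substitutions is a list of (position, ref, alt)."""
    reference = set(tryptic_digest(protein, min_len))
    found = set()
    for position, ref, alt in substitutions:
        mutant = apply_substitution(protein, position, ref, alt)
        found.update(set(tryptic_digest(mutant, min_len)) - reference)
    return sorted(found)

# Toy protein sequence (made up for illustration only).
toy = "AAAGGGKCCCDDDEEERFFFHHHLLLK"
print(variant_peptides(toy, [(11, "D", "V")]))
```

A real workflow would also need to handle missed cleavages, indels, and substitutions that create or destroy cleavage sites across peptide boundaries.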

These variant peptides, if presented on the cell surface, could make excellent neoantigen targets in cancer vaccine research. I am actively seeking students interested in developing novel algorithms for the detection of protein variants within mass-spectrometry datasets. Students should have an algorithmic or computer science focus.

Clinical Proteogenomics

The clinical relevance of immune cells in the control of human cancers is now well established. However, the identification of tumour-specific antigens that allow the immune system to differentiate cancer cells from normal cells remains a challenge. To be immunogenic, somatic mutations must give rise to peptides that are processed and bind to any of the major histocompatibility complex (MHC) class I or class II allelic products in the patient. Breakthroughs in genomics and proteomics have made it possible to discover recurring and patient-specific neoantigens arising as a consequence of tumour-specific mutations. However, the fraction of somatic mutations yielding an epitope in any patient is low, as is the fraction of the population expected to present a recurring mutation. Hence, prioritizing which neoantigens to characterize is essential for the success of cancer vaccine science and relies on large clinical cohorts and the development of bioinformatics pipelines.
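As a small illustration of the first step in such a prioritization, the sketch below enumerates every peptide of a fixed length that spans a mutated residue. It assumes a single amino-acid substitution and a window of 9 residues, a length commonly considered for MHC class I; the function name and toy sequence are hypothetical. In practice each candidate would then be scored with an MHC binding predictor and checked against proteomic evidence, which is not shown here.

```python
def mutant_kmers(protein, position, alt, k=9):
    """Return all length-k windows of the mutated protein that contain the
    substituted residue. position is 1-based; alt is the new amino acid."""
    mutant = protein[:position - 1] + alt + protein[position:]
    idx = position - 1                      # 0-based index of the mutation
    first_start = max(0, idx - k + 1)       # leftmost window still covering idx
    last_start = min(idx, len(mutant) - k)  # rightmost window that fits
    return [mutant[s:s + k] for s in range(first_start, last_start + 1)]

# Toy sequence (made up): substitute residue 10 (L) with A.
candidates = mutant_kmers("ACDEFGHIKLMNPQRSTVWY", 10, "A")
for peptide in candidates:
    print(peptide)
```

A mutation in the interior of a protein yields up to k candidate peptides per window length, which is one reason the candidate list grows quickly and needs prioritizing.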

At the International Centre for Cancer Vaccine Science, I am conducting proteogenomic screens in oral adenocarcinoma, sarcoma, ovarian cancer and renal cell carcinoma. My goal is to understand how the process of antigen presentation is perturbed in cancer and to find new ways to prioritize neoantigen discovery.

Emerging Technologies in Proteomics

Two things are clear after initial attempts to identify variant proteins. First, as proteomic technologies improve, integrative -omic studies will undoubtedly improve molecular diagnostics. Second, complete sequence coverage of the human proteome remains elusive: global proteomics typically achieves only 15-30% sequence coverage per identified protein. This low sequence coverage impedes the characterization of cancer-relevant proteoforms from global proteomes. We therefore seek new, more sensitive technologies in proteomics. These technologies require computational innovation as they are first applied in a clinical setting, and I aim to forge this link.
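For readers unfamiliar with the metric, sequence coverage is the fraction of a protein's residues that fall inside at least one identified peptide. The sketch below is one straightforward way to compute it, assuming exact string matches between peptides and the protein sequence; the function name and toy data are hypothetical.

```python
def sequence_coverage(protein, peptides):
    """Fraction of residues in protein covered by at least one peptide.
    Peptides are matched by exact substring search; every occurrence counts."""
    if not protein:
        return 0.0
    covered = [False] * len(protein)
    for pep in peptides:
        if not pep:
            continue  # ignore empty strings
        start = protein.find(pep)
        while start != -1:
            for i in range(start, start + len(pep)):
                covered[i] = True
            start = protein.find(pep, start + 1)
    return sum(covered) / len(protein)

# Toy example (made up): one 7-residue peptide from a 27-residue protein.
toy = "AAAGGGKCCCDDDEEERFFFHHHLLLK"
print(f"{sequence_coverage(toy, ['AAAGGGK']):.1%}")
```

A variant located in the uncovered portion of a protein cannot be observed at the peptide level, which is why low coverage limits proteoform characterization.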

Dr. Javier Alfaro

Dr. Javier Alfaro uses a variety of molecular profiling strategies to interrogate cancers. Javier believes that while cancer is a disease of the genome, the selection pressures that support the development of cancer hallmarks act on the expressed phenotypic traits of the cancer cell. This means that a comprehensive molecular portrait of the tumour must include the proteome and the metabolome, which are significant endpoints of gene processing. Javier’s most recent work has focused on the interrogation of proteomic data to identify mutations within important cancer genes. He uses predictions from genomics about proteome content alongside algorithmic strategies for mass-spectrometry-based mutation identification to guide his analysis. He has a PhD in Medical Biophysics (University of Toronto), specializing in computational proteomics, a Master's in Biochemistry specializing in bioinformatics (Dalhousie University) and a double major in Biochemistry and Computer Science (University of Victoria).

Publications and CV
  1. Alfaro JA, Ignatchenko A, Ignatchenko V, Sinha A, Boutros PC and Kislinger T. Detecting protein variants by mass-spectrometry: A comprehensive study in cancer cell-lines. 2017. Genome Medicine. 9:62. (DOI: 10.1186/s13073-017-0454-9)
  2. Sinha A, Alfaro JA and Kislinger T. Characterization of protein content present in exosomes isolated from conditioned media and urine. 2017. Current Protocols in Protein Science. 87:24.9.1-24.9.12.
  3. Alfaro JA, Sinha A, Kislinger T and Boutros PC. Onco-proteogenomics: cancer proteomics joins forces with genomics. 2014. Nature Methods. 11(11):1107-13.
  4. Weinreb I, Piscuoglio S, Martelotto LG, Waggott D, Ng CK, Perez-Ordonez B, Harding NJ, Alfaro JA, et al. Hotspot activating PRKD1 somatic mutations in polymorphous low-grade adenocarcinomas of the salivary glands. 2014. Nature Genetics. 46(11):1166-9.
  5. Planello AC, Ji J, Sharma V, Singhania R, Mbabaali F, Müller F, Alfaro JA, Bock C, De Carvalho DD and Batada NN. Aberrant DNA methylation reprogramming during induced pluripotent stem cell generation is dependent on the choice of reprogramming factors. 2014. Cell Regeneration. 3(1):1.
  6. Johal AR, Blackler RJ, Alfaro JA, Schuman B, Borisova S and Evans SV. pH-induced conformational changes in human ABO(H) blood group glycosyltransferases confirm the importance of electrostatic interactions in the formation of the semi-closed state. 2014. Glycobiology. 24(3):237-46.
  7. Johal AR, Schuman B, Alfaro JA, Borisova S, Seto NOL and Evans SV. Sequence-dependent effects of cryoprotectants on the active sites of the human ABO(H) blood group A and B glycosyltransferases. 2012. Acta Cryst. D68, 268–276.
  8. Alfaro JA, Zheng RB, Persson M, Letts JA, Polakowski R, Bai Y, Borisova SN, Seto NO, Lowary TL, Palcic MM and Evans SV. ABO(H) blood group A and B glycosyltransferases recognize substrate via specific conformational changes. 2008. J Biol Chem. 283(15):10097-10108.

Join ICCVS Bioinformatics

I hope to help the next generation of researchers use computational techniques with the ultimate goal of curing cancer. Students joining ICCVS will have the opportunity to complete joint PhD degrees between the University of Edinburgh (Scotland) and the University of Gdansk (Poland). Students joining ICCVS Bioinformatics will be co-supervised by Dr. Ted Hupp (University of Edinburgh) and Dr. Javier Alfaro (Gdansk) and can work on any of the projects suggested above. Students with their own project ideas are encouraged to propose them. Students will also have the opportunity to spend a portion of their PhD abroad to network and advance their research projects.

I encourage all interested students to contact me at the e-mail address below.

Get In Touch

E-mail: Javier.Alfaro AT ug.edu.pl