Rayan Chikhi
rayan.chikhi@univ-lille1.fr
@RayanChikhi

I am a CNRS researcher in bioinformatics at University of Lille 1, France. My current work is mainly related to DNA sequencing. Recently, I contributed to the assembly of the giraffe genome and the gorilla Y-chromosome.

Short bio: I studied Computer Science at ENS Rennes and obtained a PhD under the supervision of D. Lavenier. After a postdoc at Penn State in P. Medvedev's lab, CNRS hired me as a junior researcher in 2014. I am currently part of the Bonsai bioinformatics team.

Research interests
- Genome analysis
- Algorithms and data structures
- De novo assembly

Software

Minia assembler
Whole genome de novo assembler with very low memory usage, described in [11].

Kmergenie
Automatic detection of the k-mer size for de novo assembly, described in [14].

DSK
K-mer counting software, low-memory, low disk usage, supports large values of k, described in [13].

BCALM 2
Very scalable de Bruijn graph compaction, described in [24].

GATB Library
C++ library for the development of reference-free Illumina data analysis software, described in [17].

Publications

[24] R. Chikhi, A. Limasset, P. Medvedev, Compacting de Bruijn graphs from sequencing data quickly and in low memory, ISMB (2016) [PDF]
[23] M. Agaba et al., Giraffe genome sequence reveals clues to its unique morphology and physiology, Nature Communications (2016) [PDF]
[22] M. Tomaszkiewicz et al., A time- and cost-effective strategy to sequence mammalian Y Chromosomes: an application to the de novo assembly of gorilla Y, Genome Research (2016) [PDF]
[21] K. Sahlin, R. Chikhi, L. Arvestad, Genome scaffolding with PE-contaminated mate-pair libraries, WABI (2015) [Open-access]
[20] R. Chikhi, P. Medvedev, M. Milanic, S. Raskhodnikova, On the readability of overlap digraphs, CPM (2015) [Open-access]
[19] R. Uricaru et al., Reference-free detection of isolated SNPs, Nucleic Acids Research (2014) [Open-access] [Webpage]
[18] G. Rizk, A. Gouin, R. Chikhi, C. Lemaitre, MindTheGap: integrated detection and assembly of short and long insertions, Bioinformatics (2014) [Open-access] [Webpage]
[17] E. Drezen et al., GATB: Genome Assembly & Analysis Tool Box, Bioinformatics (2014) [Open-access] [Webpage]
[16] R. Chikhi, A. Limasset, S. Jackman, J. Simpson, P. Medvedev, On the representation of de Bruijn graphs, RECOMB (2014) [PDF]
[15] K. R. Bradnam et al., Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species, GigaScience (2013) [PDF]
[14] R. Chikhi, P. Medvedev, Informed and Automated k-Mer Size Selection for Genome Assembly, Bioinformatics (2013), HiTSeq (2013) Best Paper Award [PDF] [Webpage]
[13] G. Rizk, D. Lavenier, R. Chikhi, DSK: k-mer counting with very low memory usage, Bioinformatics (2013) [PDF] [Webpage]
[12] N. Maillet, C. Lemaitre, R. Chikhi, D. Lavenier, P. Peterlongo, Compareads: comparing huge metagenomic experiments, RECOMB Comparative Genomics (2012) [PDF] [Webpage]
[11] R. Chikhi, G. Rizk, Space-efficient and exact de Bruijn graph representation based on a Bloom filter, WABI (2012) [PDF] [Webpage]
[10] P. Peterlongo, R. Chikhi, Mapsembler, targeted and micro assembly of large NGS datasets on a desktop computer, BMC Bioinformatics (2012) [PDF] [Webpage]
[9] G. Sacomoto et al., KisSplice: de novo calling alternative splicing events from RNA-seq data, RECOMB-seq, BMC Bioinformatics (2012) [PDF] [Webpage]
[8] D. A. Earl et al., Assemblathon 1: A competitive assessment of de novo short read assembly methods, Genome Research (2011) [PDF]
[7] G. Chapuis, R. Chikhi, D. Lavenier, Parallel and memory-efficient reads indexing for genome assembly, PPAM Parallel Bio-Computing Workshop (2011) [PDF]
[6] R. Chikhi, D. Lavenier, Localized genome assembly from reads to scaffolds: practical traversal of the paired string graph, WABI (2011) [PDF]
[5] R. Chikhi, L. Sael, D. Kihara, Protein binding ligand prediction using moment-based methods, Protein function prediction for omics era, D. Kihara ed., Springer (2011) [PDF]
[4] D. Kihara, L. Sael, R. Chikhi, J. Esquivel-Rodriguez, Molecular surface representation using 3D Zernike descriptors for protein shape comparison and docking, Curr. Protein and Peptide Science (2010) [PDF]
[3] R. Chikhi, L. Sael, D. Kihara, Real-time ligand binding pocket database search using local surface descriptors, Proteins: Structure, Function, and Bioinformatics (2010) [PDF]
[2] R. Chikhi, D. Lavenier, Paired-end read length lower bounds for genome re-sequencing (Meeting Abstract), BMC Bioinformatics (2009) [PDF]
[1] R. Chikhi, S. Derrien, A. Noumsi, P. Quinton, Combining flash memory and FPGAs to efficiently implement a massively parallel algorithm for content-based image retrieval, International Journal of Electronics (2008) [PDF]

Talks

ISMB, 2016, Compacting de Bruijn graphs from sequencing data quickly and in low memory [PDF]
ALEA, 2016, On the representation of de Bruijn graphs (focusing on navigational data structures) [PDF]
SMPGD keynote, 2016, de Bruijn graphs of sequencing data [PDF]
Evomics Workshop on Genomics, 2016, de novo assembly [PDF] [Lab]
RECOMB, 2014, On the representation of de Bruijn graphs [PDF]
Evomics Workshop on Genomics, 2014, de novo assembly [PDF] [Blog post] [Lab]
ISMB/HiTSeq, 2013, Informed and Automated k-Mer Size Selection for Genome Assembly [PDF]
Evomics Workshop on Genomics, 2013, de novo assembly (introduction) [PDF]
WABI, 2012, Space-efficient and exact de Bruijn graph representation based on a Bloom filter [PDF]
Thesis slides, 2012, Computational methods for de novo assembly of NGS data [PDF]
WABI, 2011, Localized genome assembly from reads to scaffolds: practical traversal of the paired string graph [PDF]
IBL, 2011, de novo assembly tools, Monument, Mapsembler [PDF]
ISCBSC, 2009, Paired-end read length lower bounds for genome re-sequencing [PDF]

Reports

R. Chikhi, Computational Methods for de novo Assembly of Next-Generation Genome Sequencing Data, PhD Thesis, 2008-2012 [PDF]
Summary: We discuss computational methods (theoretical models and algorithms) to perform the reconstruction (de novo assembly) of DNA sequences produced by high-throughput sequencers.
This thesis introduces the following contributions:
- quantification of the maximum theoretical genome coverage achievable by recent sequencing data (Chapter 2)
- theoretical models for paired-end assembly (Chapter 3)
- two concepts for practical assembly: localized assembly and memory-efficient paired reads indexing (Chapter 4)
- implementation details of a de novo assembly software, the Monument assembler (Chapter 5)
- an algorithm that enumerates variants in sequencing data, implemented in the Mapsembler software (Chapter 6)

R. Chikhi, Study of Unentanglement in Quantum Computing, Manuscript, research internship at MIT, Spring 2008 [PDF]
Summary: We investigate the conjecture that one cannot simulate QMA(2) protocols in QMA using a quantum operation called a disentangler. Our results show that, when exponential precision is required, this conjecture holds unless P = NP. Moreover, also in the exponential precision case, we show that one only needs a stronger hypothesis to prove the conjecture.

R. Chikhi, Protein surface descriptors for binding sites comparison and ligand prediction, Manuscript, research internship at Purdue University, Summer 2007 [PDF]
Summary: We present a model for the two-dimensional representation of ligand binding pockets and apply it to pocket-pocket matching and binding ligand prediction.

Retired software

Mapsembler
Targeted assembly on a desktop computer, see reference [10].

Paired reads repetitions
Software package for computing the ratio of single and paired (as in paired NGS reads) exact repetitions within a genome. Useful for obtaining re-sequencing lower bounds inspired by [Whiteford 05]. See [2] and the corresponding talk for sample results and details.

Monument
Whole genome de novo assembler, described in [6], [7] and [PhD Thesis]. (recommended instead: Minia)

de Bruijn graph construction
Hash table-free implementation of the de Bruijn graph for a set of reads.
Also includes a tool that computes the union of two de Bruijn graphs and the Cartesian product of abundances, useful for constructing a multi-dataset de Bruijn graph. (recommended instead: BCALM 2)

Pocket-Surfer
Protein ligand binding pocket type prediction using a database of known binding sites. See [3] for more details. (recommended instead: 3D-Surfer)