Rayan Chikhi
rayan.chikhi@univ-lille1.fr
@RayanChikhi

I am a CNRS researcher in bioinformatics at University of Lille 1, France. My current work is mainly related to DNA sequencing. Recently, I contributed to the assembly of the giraffe genome and the gorilla Y-chromosome.

Short bio: I studied Computer Science at ENS Rennes and obtained a PhD under the supervision of D. Lavenier. After a postdoc at Penn State in P. Medvedev's lab, CNRS hired me as a junior researcher in 2014. I am currently part of the Bonsai bioinformatics team.

Research interests
Genome analysis
Algorithms and data structures
De novo assembly

Software

Minia assembler
Whole genome de novo assembler with very low memory usage, described in [11].

Kmergenie
Automatic detection of the k-mer size for de novo assembly, described in [14].

DSK
K-mer counting software, low-memory, low disk usage, supports large values of k, described in [13].

BCALM 2
Very scalable de Bruijn graph compaction, described in [24].

GATB Library
C++ library for the development of reference-free Illumina data analysis software, described in [17].

Publications

[24] R. Chikhi, A. Limasset, P. Medvedev, Compacting de Bruijn graphs from sequencing data quickly and in low memory, ISMB (2016) [PDF]
[23] M. Agaba et al., Giraffe genome sequence reveals clues to its unique morphology and physiology, Nature Communications (2016) [PDF]
[22] M. Tomaszkiewicz et al., A time- and cost-effective strategy to sequence mammalian Y Chromosomes: an application to the de novo assembly of gorilla Y, Genome Research (2016) [PDF]
[21] K. Sahlin, R. Chikhi, L. Arvestad, Genome scaffolding with PE-contaminated mate-pair libraries, WABI (2015) [Open-access]
[20] R. Chikhi, P. Medvedev, M. Milanic, S. Raskhodnikova, On the readability of overlap digraphs, CPM (2015) [Open-access]
[19] R. Uricaru et al., Reference-free detection of isolated SNPs, Nucleic Acids Research (2014) [Open-access] [Webpage]
[18] G. Rizk, A. Gouin, R. Chikhi, C. Lemaitre, MindTheGap: integrated detection and assembly of short and long insertions, Bioinformatics (2014) [Open-access] [Webpage]
[17] E. Drezen et al., GATB: Genome Assembly & Analysis Tool Box, Bioinformatics (2014) [Open-access] [Webpage]
[16] R. Chikhi, A. Limasset, S. Jackman, J. Simpson, P. Medvedev, On the representation of de Bruijn graphs, RECOMB (2014) [PDF]
[15] K. R. Bradnam et al., Assemblathon 2: evaluating de novo methods of genome assembly in three vertebrate species, GigaScience (2013) [PDF]
[14] R. Chikhi, P. Medvedev, Informed and Automated k-Mer Size Selection for Genome Assembly, Bioinformatics (2013), HiTSeq (2013) Best Paper Award [PDF] [Webpage]
[13] G. Rizk, D. Lavenier, R. Chikhi, DSK: k-mer counting with very low memory usage, Bioinformatics (2013) [PDF] [Webpage]
[12] N. Maillet, C. Lemaitre, R. Chikhi, D. Lavenier, P. Peterlongo, Compareads: comparing huge metagenomic experiments, RECOMB Comparative Genomics (2012) [PDF] [Webpage]
[11] R. Chikhi, G. Rizk, Space-efficient and exact de Bruijn graph representation based on a Bloom filter, WABI (2012) [PDF] [Webpage]
[10] P. Peterlongo, R. Chikhi, Mapsembler, targeted and micro assembly of large NGS datasets on a desktop computer, BMC Bioinformatics (2012) [PDF] [Webpage]
[9] G. Sacomoto et al., KisSplice: de novo calling alternative splicing events from RNA-seq data, RECOMB-seq, BMC Bioinformatics (2012) [PDF] [Webpage]
[8] D. A. Earl et al., Assemblathon 1: A competitive assessment of de novo short read assembly methods, Genome Research (2011) [PDF]
[7] G. Chapuis, R. Chikhi, D. Lavenier, Parallel and memory-efficient reads indexing for genome assembly, PPAM Parallel Bio-Computing Workshop (2011) [PDF]
[6] R. Chikhi, D. Lavenier, Localized genome assembly from reads to scaffolds: practical traversal of the paired string graph, WABI (2011) [PDF]
[5] R. Chikhi, L. Sael, D. Kihara, Protein binding ligand prediction using moment-based methods, Protein function prediction for omics era, D. Kihara ed., Springer (2011) [PDF]
[4] D. Kihara, L. Sael, R. Chikhi, J. Esquivel-Rodriguez, Molecular surface representation using 3D Zernike descriptors for protein shape comparison and docking, Curr. Protein and Peptide Science (2010) [PDF]
[3] R. Chikhi, L. Sael, D. Kihara, Real-time ligand binding pocket database search using local surface descriptors, Proteins: Structure, Function, and Bioinformatics (2010) [PDF]
[2] R. Chikhi, D. Lavenier, Paired-end read length lower bounds for genome re-sequencing (Meeting Abstract), BMC Bioinformatics (2009) [PDF]
[1] R. Chikhi, S. Derrien, A. Noumsi, P. Quinton, Combining flash memory and FPGAs to efficiently implement a massively parallel algorithm for content-based image retrieval, International Journal of Electronics (2008) [PDF]

Talks

ISMB, 2016, Compacting de Bruijn graphs from sequencing data quickly and in low memory [PDF]
ALEA, 2016, On the representation of de Bruijn graphs (focusing on navigational data structures) [PDF]
SMPGD keynote, 2016, de Bruijn graphs of sequencing data [PDF]
Evomics Workshop on Genomics, 2016, de novo assembly [PDF] [Lab]
RECOMB, 2014, On the representation of de Bruijn graphs [PDF]
Evomics Workshop on Genomics, 2014, de novo assembly [PDF] [Blog post] [Lab]
ISMB/HiTSeq, 2013, Informed and Automated k-Mer Size Selection for Genome Assembly [PDF]
Evomics Workshop on Genomics, 2013, de novo assembly (introduction) [PDF]
WABI, 2012, Space-efficient and exact de Bruijn graph representation based on a Bloom filter [PDF]
Thesis slides, 2012, Computational methods for de novo assembly of NGS data [PDF]
WABI, 2011, Localized genome assembly from reads to scaffolds: practical traversal of the paired string graph [PDF]
IBL, 2011, de novo assembly tools, Monument, Mapsembler [PDF]
ISCBSC, 2009, Paired-end read length lower bounds for genome re-sequencing [PDF]

Reports

R. Chikhi, Computational Methods for de novo Assembly of Next-Generation Genome Sequencing Data, PhD Thesis, 2008-2012 [PDF]
Summary: We discuss computational methods (theoretical models and algorithms) to perform the reconstruction (de novo assembly) of DNA sequences produced by high-throughput sequencers. This thesis introduces the following contributions:
- quantification of the maximum theoretical genome coverage achievable by recent sequencing data (Chapter 2)
- theoretical models for paired-end assembly (Chapter 3)
- two concepts for practical assembly: localized assembly and memory-efficient paired reads indexing (Chapter 4)
- implementation details of a de novo assembly software, the Monument assembler (Chapter 5)
- an algorithm that enumerates variants in sequencing data, implemented in the Mapsembler software (Chapter 6)

R. Chikhi, Study of Unentanglement in Quantum Computing, Manuscript, research internship at MIT, Spring 2008 [PDF]
Summary: We investigate the conjecture that one cannot simulate QMA(2) protocols in QMA using a quantum operation called a disentangler. Our results show that, when exponential precision is required, this conjecture holds unless P = NP. Moreover, also in the exponential precision case, we show that one only needs a stronger hypothesis to prove the conjecture.

R. Chikhi, Protein surface descriptors for binding sites comparison and ligand prediction, Manuscript, research internship at Purdue University, Summer 2007 [PDF]
Summary: We present a model for two-dimensional ligand binding pocket representation and apply it to pocket-pocket matching and binding ligand prediction.

Retired software

Mapsembler
Targeted assembly on a desktop computer, see reference [10].

Paired reads repetitions
Software package for computing the ratio of single and paired (as in paired NGS reads) exact repetitions within a genome. Useful for obtaining re-sequencing lower bounds inspired by [Whiteford 05]. See [2] and the corresponding talk for sample results and details.

Monument
Whole genome de novo assembler, described in [6], [7] and [PhD Thesis]. (recommended instead: Minia)

de Bruijn graph construction
Hash table-free implementation of the de Bruijn graph for a set of reads. Also includes a tool that computes the union of two de Bruijn graphs and the cartesian product of abundances, useful for constructing a multi-dataset de Bruijn graph. (recommended instead: BCALM 2)

Pocket-Surfer
Protein ligand binding pocket type prediction using a database of known binding sites. See [3] for more details. (recommended instead: 3D-Surfer)