The Distant Segments Kernel: a tutorial
This post is a tutorial on how to use the distant segments kernel. String kernels were recently introduced as a more precise way to perform pairwise comparisons of strings.

First, I will define the concept of kernels. A kernel is a function that takes two objects from an input space, maps each of them to a vector in a feature space, and returns the inner product of the two resulting vectors. A kernel therefore associates a real number with each pair of instances.

String kernels are a particular case of kernels: their input space is the set of strings. Recall that strings are sequences of symbols drawn from a given alphabet. The distant segments kernel is a string kernel. For the distant segments kernel, the feature vector associated with a string is the distribution of its distant segments. See this paper for more details.

In this tutorial, command lines are shown in red. To begin, download the source code.

seb@ubuntu:~$ wget http://boisvert.info/software/PermutationDSKernel.cpp

This software performs the kernel matrix computation of a set of stri...
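To make the definition of a kernel given earlier more concrete, here is a minimal sketch of a string kernel computed as an inner product of feature vectors. Note that this is not the distant segments kernel and is not taken from PermutationDSKernel.cpp: as a stand-in, it uses the simpler k-spectrum feature map, where the feature vector of a string counts its substrings of length k. The function names spectrumFeatures and spectrumKernel are hypothetical and chosen only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Feature map (assumption: k-spectrum, NOT the distant segments feature map).
// Maps a string to a sparse vector counting each substring of length k.
std::map<std::string, double> spectrumFeatures(const std::string& s, std::size_t k) {
    assert(k > 0);
    std::map<std::string, double> phi;
    for (std::size_t i = 0; i + k <= s.size(); ++i) {
        phi[s.substr(i, k)] += 1.0;
    }
    return phi;
}

// Kernel: the inner product of the two feature vectors.
// Only features present in both strings contribute to the sum.
double spectrumKernel(const std::string& s, const std::string& t, std::size_t k) {
    const std::map<std::string, double> phiS = spectrumFeatures(s, k);
    const std::map<std::string, double> phiT = spectrumFeatures(t, k);
    double sum = 0.0;
    for (const auto& entry : phiS) {
        const auto it = phiT.find(entry.first);
        if (it != phiT.end()) {
            sum += entry.second * it->second;
        }
    }
    return sum;
}
```

For example, with k = 2 the string ABAB has features AB (count 2) and BA (count 1), while BABA has BA (count 2) and AB (count 1), so spectrumKernel("ABAB", "BABA", 2) is 2*1 + 1*2 = 4. A kernel matrix is obtained by evaluating such a function on every pair of strings in a set.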