Software similarity and classification is an emerging topic with wide
applications, including malware detection, software theft detection,
plagiarism detection, and software clone detection. Extracting program
features, processing those features into
suitable representations, and constructing distance metrics to define
similarity and dissimilarity are the key steps in identifying software
variants, clones, derivatives, and classes of software. Software
Similarity and Classification reviews the literature on those core
concepts, along with the relevant literature in each application area,
and demonstrates that treating these applied problems as similarity and
classification problems enables techniques to be shared between areas.
Additionally, the authors present in-depth case studies using the
software similarity and classification techniques developed throughout
the book.
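
As a rough illustration of the three steps named above (feature extraction, representation, and a distance metric), the following minimal Python sketch compares programs by the n-grams of their instruction sequences. It is not taken from the book: the function names, the choice of 3-grams, the use of Jaccard distance, and the toy opcode sequences are all assumptions made for illustration only.

```python
from typing import FrozenSet, Sequence, Tuple

NGram = Tuple[str, ...]


def ngram_features(tokens: Sequence[str], n: int = 3) -> FrozenSet[NGram]:
    """Represent a program as the set of n-grams over its token sequence."""
    return frozenset(
        tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


def jaccard_distance(a: FrozenSet[NGram], b: FrozenSet[NGram]) -> float:
    """Return 1 - |a & b| / |a | b|; 0.0 means identical feature sets."""
    if not a and not b:
        return 0.0  # two empty feature sets are treated as identical
    return 1.0 - len(a & b) / len(a | b)


# Hypothetical opcode sequences standing in for extracted program features.
original = ["push", "mov", "sub", "call", "add", "pop", "ret"]
variant = ["push", "mov", "sub", "call", "xor", "pop", "ret"]
unrelated = ["jmp", "cmp", "jne", "inc", "jmp", "nop", "ret"]

d_variant = jaccard_distance(ngram_features(original), ngram_features(variant))
d_unrelated = jaccard_distance(ngram_features(original), ngram_features(unrelated))
```

In this toy setup the variant, which differs from the original by one instruction, ends up closer to it than the unrelated sequence does. The same distance could in principle feed a nearest-neighbour classifier, whether the application is malware variants, clones, or plagiarized code.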