Historical linguistics (also diachronic linguistics or comparative linguistics) is primarily the study of how languages change over time. It proceeds by examining languages that are recognizably related through similarities in vocabulary, word formation, and syntax, together with the surviving records of ancient languages. Historical linguistics aims to classify the world's languages by their genetic affiliations and to trace the historical development of languages. Modern historical linguistics grew out of the earlier discipline of philology, the study of ancient texts and documents. In its early years, historical linguistics focused on the well-known Indo-European languages; since then, significant comparative work has been done on the Uralic languages, the Austronesian languages, and various families of Native American languages, among many others.
Languages change over time. What were once dialects of the same language may eventually diverge enough that they are no longer mutually intelligible. They have become separate languages.
One way to illustrate the relationship between such divergent yet related languages is to construct family trees, an idea pioneered by the 19th-century historical linguist August Schleicher. The basis for the trees is the comparative method: languages presumed to be related are compared with one another, and linguists look for regular sound correspondences, guided by what is generally known about how languages can change. They then use these correspondences to reconstruct the best hypothesis about the common ancestor language from which the attested languages are descended.
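As a toy illustration of what looking for regular sound correspondences involves, the sketch below tallies the initial consonants of a few well-known Latin and English cognates and keeps only the pairings that recur. The word list, the hand segmentation of the initial sounds, the function names, and the threshold of two occurrences are all assumptions made for this example; actual comparative work relies on much larger data sets, full alignment of every segment, and the linguist's judgment.

```python
from collections import Counter

# Hand-picked cognate pairs: (Latin word, English word, Latin initial sound,
# English initial sound). The list and the segmentation are illustrative
# assumptions, not a real dataset. Latin "c" is written here as the sound "k".
COGNATES = [
    ("pater", "father", "p", "f"),
    ("piscis", "fish", "p", "f"),
    ("pes", "foot", "p", "f"),
    ("tres", "three", "t", "th"),
    ("tenuis", "thin", "t", "th"),
    ("cornu", "horn", "k", "h"),
    ("centum", "hundred", "k", "h"),
    ("cor", "heart", "k", "h"),
]


def initial_correspondences(cognates):
    """Count how often each (Latin, English) initial-sound pair occurs."""
    return Counter((lat, eng) for _, _, lat, eng in cognates)


def recurrent(counts, min_count=2):
    """Keep only the sound pairs attested in at least min_count word pairs."""
    return {pair: n for pair, n in counts.items() if n >= min_count}


counts = initial_correspondences(COGNATES)
for (lat, eng), n in sorted(recurrent(counts).items()):
    print(f"Latin {lat} : English {eng}  ({n} word pairs)")
```

A pairing that recurs across many unrelated words, such as Latin p against English f, is the kind of regularity the method treats as evidence of common descent, whereas an isolated lookalike carries little weight.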
Use of the comparative method is validated by its application to languages whose common ancestor is known. Thus, when the method is applied to the Romance languages (which include French, Spanish, Portuguese, Italian, and Romanian), the reconstructed common ancestor language comes out rather similar to Latin: not the Classical Latin of Horace and Cicero, but Vulgar Latin, the colloquial Latin spoken in various dialects in the late Roman Empire.
The comparative method can be used to reconstruct languages for which no written records exist, either because none were preserved or because the speakers were illiterate. Thus, the Germanic languages (which include German, Dutch, English, Norwegian, Swedish, Danish, Faroese, Icelandic, Yiddish, and the extinct Gothic) can be compared to reconstruct Proto-Germanic, a language that was probably contemporaneous with Latin and for which no records are preserved.
Germanic and Latin (more precisely, Proto-Italic, the ancestor of Latin and a few of its neighbors) are themselves related, being co-descended from Proto-Indo-European, spoken perhaps 5,000 years ago. Scholars have reconstructed Proto-Indo-European on the basis of data from its nine surviving daughter branches (Germanic, Italic, Celtic, Greek, Baltic, Slavic, Albanian, Armenian, and Indo-Iranian) and from the two extinct branches, Tocharian and Anatolian.
The comparative method is used to distinguish true linguistic descent (that is, the passing of a language from parents to children, down through the generations) from resemblance due to cultural contact. For example, about 30% of the vocabulary of Persian is taken from Arabic, as a result of the Arab conquest of Iran in the 7th century and much subsequent cultural contact. Yet Persian is Indo-European, being a member of the Indo-Iranian branch that also includes Sanskrit and many of the languages of modern India. The clue that Persian is Indo-European is that its core vocabulary generally has Indo-European cognates (as in mâdar 'mother'), and its essential grammatical elements are likewise Indo-European (as in bûd 'was', which includes elements related to English "be" and the English past tense ending "-ed").
The comparative method has been successfully used to reconstruct some very large language families, notably Austronesian (which includes Hawaiian, Tagalog, Javanese, and Malagasy) and Niger-Congo (the majority of the languages of modern Africa). Once the various changes in the daughter branches have been worked out, and a fair amount of the core vocabulary and grammar of the protolanguage is understood, scholars will generally agree that a genetic relationship has been demonstrated.