How to recognize the language of a text?
If you only need to distinguish between a few languages, I expect that something based on digraph or trigraph (two- or three-letter sequence) frequency, or on frequent common words, would easily do the job.
E.g. if the text contains many occurrences of "the" and "a", it's probably English; "der", "die", "das", "ein" suggest German; "le", "la" suggest French, etc.
Hmm, that's a kind-of interesting project: find common words unique(ish) to single languages.
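A rough sketch of the common-word idea in Python, just to show the shape of it. The word lists here are hand-picked guesses of mine, not derived from any corpus, and the names `MARKERS` and `guess_language` are made up for illustration; a real version would want longer lists and some handling of ties and very short inputs.

```python
import re

# Hypothetical, hand-picked marker words (an assumption, not corpus-derived).
MARKERS = {
    "english": {"the", "a", "and", "of", "is"},
    "german": {"der", "die", "das", "ein", "und"},
    "french": {"le", "la", "les", "et", "est"},
}

def guess_language(text):
    """Return the language whose marker words occur most often, or None if none occur."""
    # Split into lowercase runs of letters (Unicode-aware, so umlauts and accents survive).
    words = re.findall(r"[^\W\d_]+", text.lower())
    # Count how many tokens of the text appear in each language's marker set.
    scores = {
        lang: sum(1 for w in words if w in markers)
        for lang, markers in MARKERS.items()
    }
    best = max(scores, key=scores.get)  # on a tie, the first language listed wins
    return best if scores[best] > 0 else None

print(guess_language("The cat is on the roof of a house"))
print(guess_language("Der Hund und die Katze sind das Problem"))
```

Counting raw hits is about the simplest scoring possible; weighting each word by how exclusive it is to one language would probably be more robust, since a word like "die" is also valid English.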
Microsoft has a C# library to do it offline. That doesn't help us much here, but it does give confidence that it is possible.