bytes are bytes; strings are bytes that have been "massaged" (interpreted through a
character set) to appear as text.
(just as an image is bytes that have been "massaged" to appear as an image. e.g., a
byte with a value of 0x02 is a perfectly valid value in an image, but it's not a
printable character in text.)
text files saved under certain operating systems (e.g. windows) store end of line as
2 bytes (CR LF, 0x0D 0x0A).
text files saved under unix-like systems store end of line as 1 byte (LF, 0x0A).
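a minimal java sketch of the line-ending difference. the class and method names are
made up for illustration, and it assumes an ascii-compatible encoding (in something
like utf16 each of these characters would take 2 bytes).

```java
import java.nio.charset.StandardCharsets;

public class LineEndings {
    // Count the bytes a string occupies when encoded as US-ASCII.
    static int byteCount(String s) {
        return s.getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        System.out.println(byteCount("\r\n")); // CR LF (windows style): prints 2
        System.out.println(byteCount("\n"));   // LF (unix-like style): prints 1
    }
}
```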
text files saved using a particular character set will look strange if handled using a
different character set.
in java, a byte is a signed entity with a range of -128 to 127, so once you go past
127 the values show up as negative. in the context of a text file, when java sees a
negative value, it knows how to deal with it when decoding (there are no negative
character values). in other languages, a byte can be unsigned and contain a value up
to 255.
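to make the signed byte behaviour concrete, here is a small sketch. `SignedBytes` and
`toUnsigned` are illustrative names, not anything from a library; `Byte.toUnsignedInt`
is the standard method (java 8 and later) that does the same thing.

```java
public class SignedBytes {
    // Interpret a Java byte as an unsigned value in the range 0..255.
    static int toUnsigned(byte b) {
        return b & 0xFF;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xE9;                      // bit pattern 1110 1001
        System.out.println(b);                     // prints -23 (signed view)
        System.out.println(toUnsigned(b));         // prints 233 (unsigned view)
        System.out.println(Byte.toUnsignedInt(b)); // prints 233 (library equivalent)
    }
}
```

the bits are the same in both cases; only the interpretation of the top bit changes.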
a text file may have been saved using ascii, utf8, utf16 (which some editors label
"unicode") or any number of other sets.
ascii itself defines only 128 characters (values 0 to 127). single-byte extensions
of it, e.g. iso-8859-1 or windows-1252, can have at most 256 characters because
each character is represented by 1 byte and a byte has only 256 possible values, 0 to 255
(unless you're programming in java, where you'll see negative values once you go past
127). as for utf8, each character is represented by 1 to 4 bytes (thus allowing for
over a million code points, covering many different scripts).
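a quick sketch of the variable byte count in utf8. the names are illustrative, and the
characters are written as unicode escapes so the example does not depend on the
encoding of the source file itself.

```java
import java.nio.charset.StandardCharsets;

public class Utf8Lengths {
    // Number of bytes the string occupies when encoded as UTF-8.
    static int utf8Length(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // U+0041, prints 1
        System.out.println(utf8Length("\u00E9"));       // U+00E9 (e with acute), prints 2
        System.out.println(utf8Length("\u20AC"));       // U+20AC (euro sign), prints 3
        System.out.println(utf8Length("\uD83D\uDE00")); // U+1F600 (emoji), prints 4
    }
}
```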
when you read a text file as a string, the runtime reads the bytes (of course) and
converts them to a string of characters based on a default character set (often taken
from your system's locale settings). if you see garbage when displaying the string,
it's because the file was saved using a different set (either deliberately or by
default). once you see the garbage, you probably should not try to convert it by
getting its bytes and using a different set. once you see the garbage, it's too late
for that: bytes that were not valid in the wrong set may already have been replaced
during decoding, so the originals are gone. you have to go back and re-read the file
as bytes and then convert to a string using a different character set until you get it
right. that's why websites indicate the character set in the html document. with a
3rd party text file, you don't know until you read it and see the garbage, if any.
then you go back and start over.
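a sketch of the "go back to the bytes" approach, under some assumptions: `CharsetRetry`,
`decode` and `readAs` are hypothetical names, and the sample bytes stand in for a file
that was saved as iso-8859-1. `Files.readAllBytes` and `new String(byte[], Charset)`
are the standard library calls.

```java
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;

public class CharsetRetry {
    // Decode raw bytes with an explicitly chosen character set.
    static String decode(byte[] raw, Charset cs) {
        return new String(raw, cs);
    }

    // Hypothetical helper: read the file as bytes first, decode afterwards.
    static String readAs(Path file, Charset cs) throws IOException {
        return decode(Files.readAllBytes(file), cs);
    }

    public static void main(String[] args) {
        // "cafe" with an accented e, as stored by a program using ISO-8859-1.
        byte[] raw = {0x63, 0x61, 0x66, (byte) 0xE9};

        // Wrong guess: a lone 0xE9 is not valid UTF-8, so the decoder
        // substitutes a replacement character and the original byte is lost.
        String garbage = decode(raw, StandardCharsets.UTF_8);
        byte[] roundTrip = garbage.getBytes(StandardCharsets.UTF_8);
        System.out.println(Arrays.equals(raw, roundTrip)); // prints false

        // Going back to the original bytes and trying another set works.
        String text = decode(raw, StandardCharsets.ISO_8859_1);
        System.out.println(text.equals("caf\u00E9")); // prints true
    }
}
```

the point of keeping `raw` around is that it is the only lossless copy: every retry
starts from the bytes, never from a previously decoded string.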