I have a huge file that contains Unicode strings at the beginning (the first ~10,000 characters or so). I don't care about the Unicode part; the parts I'm interested in aren't Unicode, but whenever I try to read those parts I get ...
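
A minimal sketch of one way to read only the part after the Unicode prefix. It rests on several assumptions the post does not confirm: that the language is Python, that the prefix can be skipped by a known byte offset (a count of ~10,000 characters is not necessarily the same number of bytes), and that the remainder is ASCII. The name `iter_lines_after` and its parameters are hypothetical.

```python
def iter_lines_after(path, skip_bytes, encoding="ascii"):
    """Yield decoded lines of `path`, starting `skip_bytes` bytes in.

    Hypothetical helper: `skip_bytes` is an assumed byte offset past the
    Unicode prefix, and `encoding` is an assumed encoding for the rest.
    """
    # Binary mode means nothing is decoded while reading, so the prefix
    # cannot trigger a decoding error.
    with open(path, "rb") as f:
        f.seek(skip_bytes)  # jump past the part we don't care about
        for raw_line in f:  # streams line by line, so a huge file is fine
            # errors="replace" substitutes U+FFFD for undecodable bytes
            # instead of raising an exception.
            yield raw_line.decode(encoding, errors="replace")
```

Opening in binary mode and decoding each line explicitly keeps the decoding step under your control. If the offset lands in the middle of a multi-byte character, the first yielded line may start with replacement characters.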