Test sample:
Java: the first line of a UTF-8 txt file reads back as garbled "?" characters, and how to fix it
Contents of test.txt:
1
00:00:06,000 --> 00:00:06,010
<b>Allerleirauh</b> (2012)
<i>dTV - Das Erste - 20. Januar 2013</i>
2
00:00:10,280 --> 00:00:12,680
Was gehört zu einer guten Suppe?
3
00:00:14,200 --> 00:00:15,839
Eine gute Suppe...
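test.txt was saved from WordPad in UTF-8 format (in this case, UTF-8 with a BOM).
After saving and closing the file, reopening the UTF-8 document in WordPad displays Chinese characters and letters correctly.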
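Whether a file really carries a BOM can be confirmed by looking at its first three bytes, which for UTF-8 are EF BB BF. The following is a minimal sketch of such a check; the class name `BomCheck` and the method `hasUtf8Bom` are hypothetical names introduced here for illustration and are not part of the original code.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BomCheck {

    // Returns true if the file starts with the UTF-8 BOM bytes EF BB BF.
    // read() returns -1 at end of file, so short files simply yield false.
    public static boolean hasUtf8Bom(Path file) throws IOException {
        try (InputStream in = Files.newInputStream(file)) {
            int b0 = in.read();
            int b1 = in.read();
            int b2 = in.read();
            return b0 == 0xEF && b1 == 0xBB && b2 == 0xBF;
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java BomCheck <file>");
            return;
        }
        System.out.println(hasUtf8Bom(Paths.get(args[0])) ? "UTF-8 BOM found" : "no UTF-8 BOM");
    }
}
```

Running it against test.txt as saved above should report a BOM, which is the extra data that later shows up as "??" on the first line.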
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.File;
import java.io.FileReader;
import java.io.FileWriter;
import java.io.IOException;

public static String srt2Txt(String filename) {
    File infile = new File(filename);
    String realfile = filename.substring(0, filename.lastIndexOf(".srt")) + ".txt";
    String tempfile = realfile.replace('/', '\\'); // Windows path format for the output file
    File outfile = new File(tempfile);
    BufferedReader bufferedReader = null;
    BufferedWriter bufferedWriter = null;
    try {
        // FileReader and FileWriter use the JVM default charset, not necessarily UTF-8
        bufferedReader = new BufferedReader(new FileReader(infile));
        bufferedWriter = new BufferedWriter(new FileWriter(outfile));
        String line; // holds the line read on each iteration
        while ((line = bufferedReader.readLine()) != null) {
            // characters that ISO-8859-1 cannot represent become '?' here
            line = new String(line.getBytes("ISO-8859-1"), "ISO-8859-1");
            bufferedWriter.write(line);
            bufferedWriter.newLine(); // write a line break
            bufferedWriter.flush();
        }
    } catch (IOException e) {
        e.printStackTrace();
    } finally {
        if (null != bufferedReader) {
            try {
                bufferedReader.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
        if (null != bufferedWriter) {
            try {
                bufferedWriter.close();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }
    }
    return realfile;
}

Test result:
??
00:00:06,000 --> 00:00:06,010
<b>Allerleirauh</b> (2012)
<i>dTV - Das Erste - 20. Januar 2013</i>
2
00:00:10,280 --> 00:00:12,680
Was geh?rt zu einer guten Suppe?
3
00:00:14,200 --> 00:00:15,839
Eine gute Suppe...
Solution:
Use UltraEdit to save the txt file above as UTF-8 without BOM; or
open the txt file above in Notepad++, choose "Format --> Encode in UTF-8 without BOM", and then save the file.
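If re-saving the file in an editor is not an option, the BOM can also be dropped in code. The sketch below is an alternative to the editor-based fix, not the method used in the original code: it reads the file explicitly as UTF-8 (Java's UTF-8 decoder keeps the BOM as the character U+FEFF) and removes that character from the first line before writing. The class name `BomSafeCopy` and its methods are hypothetical names chosen for illustration.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

public class BomSafeCopy {

    // The UTF-8 BOM (bytes EF BB BF) decodes to this single character.
    private static final char BOM = '\uFEFF';

    // Removes a leading BOM character from the line, if there is one.
    public static String stripBom(String line) {
        return (!line.isEmpty() && line.charAt(0) == BOM) ? line.substring(1) : line;
    }

    // Copies a UTF-8 text file line by line, dropping a BOM on the first line.
    public static void copyWithoutBom(Path in, Path out) throws IOException {
        try (BufferedReader reader = Files.newBufferedReader(in, StandardCharsets.UTF_8);
             BufferedWriter writer = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            String line;
            boolean first = true;
            while ((line = reader.readLine()) != null) {
                if (first) {
                    line = stripBom(line);
                    first = false;
                }
                writer.write(line);
                writer.newLine();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 2) {
            System.err.println("usage: java BomSafeCopy <input> <output>");
            return;
        }
        copyWithoutBom(Paths.get(args[0]), Paths.get(args[1]));
    }
}
```

Because both the reader and the writer name UTF-8 explicitly, the result no longer depends on the platform default charset, and the ISO-8859-1 round trip from the original method is not needed.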