Calling Google Translate from Python

Source: http://blog.csdn.net/zhaoyl03/article/details/8830806

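Recently I wanted to build an automatic document downloader, which requires simulating a browser's behavior. The idea itself did not seem difficult, but the technical details had to be worked out step by step. While searching online, I found that someone had called Google Translate from Python. I tried implementing this small tool myself to learn and practice techniques such as regular-expression matching and browser simulation. I am recording the result here to encourage myself.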

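Calling Google Translate from Python means simulating what a person does: paste the source text (English) into the left text box of Google Translate, set the translation direction to English to Simplified Chinese, click Translate, then copy the result from the right text box and save it. Compared with the article 《用Python实现调用Google翻译》 ("Calling Google Translate with Python"), my version improves on it by using a regular-expression match to extract the translated text cleanly. The code is also complete.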
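
The extraction step can be tried in isolation, without any network access. The following is a minimal sketch written for Python 3; the page fragment `SAMPLE_HTML` and the helper name `extract_translation` are made up for illustration, and the real markup returned by the service may differ.

```python
import re

# Hypothetical fragment of a result page (an assumption, not real markup).
SAMPLE_HTML = "<script>var x=1;TRANSLATED_TEXT='你好';OTHER_VAR='y';</script>"

# (?<=TRANSLATED_TEXT=) is a lookbehind: the match must start right after the marker.
# .*?; is non-greedy: it stops at the first semicolon it meets.
PATTERN = re.compile(r"(?<=TRANSLATED_TEXT=).*?;")


def extract_translation(html):
    """Return the text assigned to TRANSLATED_TEXT, or None if the marker is absent."""
    match = PATTERN.search(html)
    if match is None:
        return None
    # Drop the trailing semicolon, then the surrounding single quotes.
    return match.group(0).strip(";").strip("'")


print(extract_translation(SAMPLE_HTML))
```

One limitation of this pattern: because it stops at the first semicolon, a translation that itself contains a semicolon would be cut short at that point.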

I used Python 2.6.6; the source code is as follows:
# -*- coding: utf-8 -*-
# Python -V: Python 2.6.6
# filename: GoogleTranslation1.2.py

__author__ = "Yinlong Zhao (zhaoyl[at]sjtu[dot]edu[dot]cn)"
__date__ = "$Date: 2013/04/21 $"

import re
import urllib, urllib2

# urllib: provides urlencode() for encoding the form fields.
# urllib2: The urllib2 module defines functions and classes which help in opening
# URLs (mostly HTTP) in a complex world: basic and digest authentication,
# redirections, cookies and more.


def translate(text):
    '''Mimic a browser: post the text to the Google Translate page, then scrape the translated result.'''

    # text: the English sentence to translate
    text_1 = text
    # 'langpair': 'en'|'zh-CN' means from English to Simplified Chinese
    values = {'hl': 'zh-CN', 'ie': 'UTF-8', 'text': text_1, 'langpair': "'en'|'zh-CN'"}
    url = 'http://translate.google.cn/translate_t'
    data = urllib.urlencode(values)
    req = urllib2.Request(url, data)
    # Pretend to be a browser
    browser = 'Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)'
    req.add_header('User-Agent', browser)
    # Send the request to Google Translate
    response = urllib2.urlopen(req)
    # Read the returned page
    html = response.read()
    # Filter the translated text out of the returned page
    # with a regular-expression match.
    # The translated text is whatever follows the equals sign in 'TRANSLATED_TEXT='
    # .*? matches in non-greedy (minimal) fashion
    # (?<=...) matches if the current position in the string is preceded
    # by a match for ... that ends at the current position
    p = re.compile(r"(?<=TRANSLATED_TEXT=).*?;")
    m = p.search(html)
    # Fail with a clear message instead of an AttributeError on None
    if m is None:
        raise ValueError('TRANSLATED_TEXT not found in the response')
    text_2 = m.group(0).strip(';')
    return text_2


if __name__ == "__main__":
    # text_1: the original text
    # text_1 = open(r'c:\text.txt', 'r').read()
    text_1 = 'Hello, my name is Derek. Nice to meet you! '
    print('The input text: %s' % text_1)
    text_2 = translate(text_1).strip("'")
    print('The output text: %s' % text_2)

    # Save the result (raw string, so the backslash is never read as an escape)
    filename = r'c:\Translation.txt'
    fp = open(filename, 'w')
    fp.write(text_2)
    fp.close()

    report = 'Master, I have done the work and saved the translation at ' + filename + '.'
    print('Report: %s' % report)
Output:
>>>
The input text: Hello, my name is Derek. Nice to meet you!
The output text: 你好，我的名字是德里克。很高兴见到你！
Report: Master, I have done the work and saved the translation at c:\Translation.txt.
>>>
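
For readers on Python 3, where `urllib2` no longer exists, the request-building step might look roughly like the sketch below. It only constructs the request and inspects it; nothing is sent, since the endpoint used in this post may no longer respond. The form fields, URL, and User-Agent string are copied from the script above, while the helper name `build_request` is hypothetical.

```python
from urllib.parse import urlencode, parse_qs
from urllib.request import Request

# Endpoint and User-Agent copied from the original script; whether the
# endpoint still answers is not verified here.
URL = "http://translate.google.cn/translate_t"
USER_AGENT = "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1; .NET CLR 2.0.50727)"


def build_request(text):
    """Build (but do not send) the POST request that carries the form fields."""
    values = {"hl": "zh-CN", "ie": "UTF-8", "text": text, "langpair": "'en'|'zh-CN'"}
    # In Python 3 the request body must be bytes; urlencode() output is plain ASCII.
    data = urlencode(values).encode("ascii")
    return Request(URL, data=data, headers={"User-Agent": USER_AGENT})


req = build_request("Hello, my name is Derek.")
print(req.get_method())              # a Request with a body defaults to POST
print(req.get_header("User-agent"))  # header names are stored capitalized
print(parse_qs(req.data.decode("ascii"))["text"])
```

Sending it would be a matter of passing `req` to `urllib.request.urlopen` and decoding the bytes returned by `read()` before applying the regular expression.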
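Reflections:
1. I feel that Python's many packages are somewhat disorganized; better conventions are needed for the ecosystem to develop well.

2. When you have an idea, look to humanity's existing "technology stack" for the pieces, and then implement it.

3. Using a simulated browser to mimic how a person goes online, browses, and clicks is very important at the data-acquisition stage.

4. Mastering a language does not happen overnight; it takes a lot of code and steady accumulation. Only once you have many bricks can you build a tower with ease.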
Original URL: https://www.cnblogs.com/flyingZFX/p/5140550.html