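When you copy text out of a PDF, the page layout leaves a line break at the end of every visual line, so each paragraph arrives chopped into separate lines.
I wrote a small tool that watches the clipboard and strips these redundant line breaks automatically after each copy. A line break that follows a period is kept whether it ends a sentence or a paragraph, because I haven't yet found a way to tell the two apart. It's good enough for now.
Requires a Python environment and the pyperclip package (pip install pyperclip).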
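import time

import pyperclip


def delete_enter(text):
    """Join hard-wrapped lines; keep a line break only after a period."""
    # Normalize Windows (\r\n) and old Mac (\r) line endings to \n.
    text = text.replace('\r\n', '\n').replace('\r', '\n')
    result = ''
    start = 0
    finish = text.find('\n')
    while finish != -1:
        result += text[start:finish]
        # A line ending in '.' may close a sentence or a paragraph: keep its break.
        if finish > 0 and text[finish - 1] == '.':
            result += '\n'
        else:
            result += ' '
        start = finish + 1
        finish = text.find('\n', start)
    result += text[start:]
    return result


if __name__ == '__main__':
    content_old = ''
    while True:
        # Rewrite the clipboard only when its content has changed.
        if content_old != pyperclip.paste():
            pyperclip.copy(delete_enter(pyperclip.paste()))
            content_old = pyperclip.paste()
        time.sleep(0.5)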
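
The joining rule can be tried without touching the clipboard. Below is a minimal sketch of the same rule written as a regular expression. The name join_wrapped_lines and the sample string are made up for illustration and are not part of the script above; it assumes, like delete_enter, that only an ASCII period marks a break worth keeping.

```python
import re

# Hypothetical helper (not in the original script): a line break is replaced
# by a space unless the character right before it is a period.
_SOFT_BREAK = re.compile(r'(?<!\.)\n')


def join_wrapped_lines(text):
    # Normalize Windows line endings first so the lookbehind sees the real
    # last character of the line rather than '\r'.
    text = text.replace('\r\n', '\n')
    return _SOFT_BREAK.sub(' ', text)


sample = "PDF text is often\nwrapped mid-sentence.\nThe next sentence\nstarts here."
print(join_wrapped_lines(sample))
```

This prints two lines: "PDF text is often wrapped mid-sentence." and "The next sentence starts here." The break after each period survives, which is exactly the limitation described above: a sentence that happens to end at the edge of a PDF line still gets its own line.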