• gnucom.cc — Using the Stanford Parser with Jython.


    The following code is a Jython adaptation of the example Java code that ships with the Stanford Parser. I think it is a useful resource because I could not find a Python parser that produces the grammatical relations in a sentence, nor any example code to help developers get started.

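    import sys
    # The parser jar must be on sys.path before the edu.stanford.nlp imports below.
    sys.path.append('/path/to/jar/stanford-parser-2008-10-26.jar')

    from java.io import CharArrayReader
    from edu.stanford.nlp import *

    # Load the serialized English PCFG grammar and the Penn Treebank language pack.
    lp = parser.lexparser.LexicalizedParser('/path/to/englishPCFG.ser.gz')
    tlp = trees.PennTreebankLanguagePack()
    lp.setOptionFlags(["-maxLength", "80", "-retainTmpSubcategories"])

    # Adjacent string literals are joined, so the indentation adds no stray spaces.
    sentence = ('One of my favorite features of functional programming '
                'languages is that you can treat functions like values.')

    # Tokenize the sentence into the word list that the parser expects.
    toke = tlp.getTokenizerFactory().getTokenizer(CharArrayReader(sentence))
    wordlist = toke.tokenize()

    # Stop if no parse is found, rather than continue with an undefined tree.
    if not lp.parse(wordlist):
        raise RuntimeError('could not parse: ' + sentence)
    parse = lp.getBestParse()

    # Derive the collapsed typed dependencies from the best parse tree.
    gsf = tlp.grammaticalStructureFactory()
    gs = gsf.newGrammaticalStructure(parse)
    tdl = gs.typedDependenciesCollapsed()

    print parse.toString()
    print tdl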

    With Jython, generating both the phrase-structure tree and the grammatical relations of a sentence takes only a few lines. The code first prints the phrase-structure (constituency) tree shown below. Each node carries a Penn Treebank label followed by a bracketed score from the parser. Because the tree is plain bracketed text, it is easy to analyze and to reuse in your own application.

    (ROOT
      (S [126.504]
        (NP [70.320]
          (NP [8.540] (CD [4.252] One))
          (PP [61.414] (IN [0.666] of)
            (NP [59.002]
              (NP [26.440] 
                (PRP$ [3.699] my) 
                (JJ [8.020] favorite) 
                (NNS [8.095] features))
              (PP [32.021] (IN [0.666] of)
                (NP [30.954] (JJ [8.203] functional) 
                  (NN [8.844] programming) 
                  (NNS [9.164] languages))))))
        (VP [53.354] (VBZ [0.144] is)
          (SBAR [47.716] (IN [0.637] that)
            (S [46.752]
              (NP [4.591] (PRP [3.341] you))
              (VP [41.830] (MD [2.354] can)
                (VP [37.270] (VB [7.289] treat)
                  (NP [10.882] (NNS [8.323] functions))
                  (PP [16.113] (IN [5.239] like)
                    (NP [10.201] (NNS [7.216] values))))))))
        (. [0.002] .)))
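
    Since the tree is printed as bracketed text, it can be read back into a nested structure with a few lines of plain Python. The sketch below is only an illustration: read_tree, leaves, and phrases are hypothetical helper names, not part of the Stanford Parser API, and the code assumes that any token of the form [number] is a score rather than a word. It runs on a fragment of the output above.

```python
import re

# Tokens are "(", ")", or any run of characters without spaces or parentheses.
TOKEN = re.compile(r'\(|\)|[^\s()]+')

def read_tree(text):
    """Parse bracketed text into (label, children) tuples; leaves are strings."""
    tokens = TOKEN.findall(text)
    pos = [0]  # mutable cursor shared with the nested function

    def node():
        pos[0] += 1                      # skip "("
        label = tokens[pos[0]]
        pos[0] += 1
        children = []
        while tokens[pos[0]] != ')':
            tok = tokens[pos[0]]
            if tok == '(':
                children.append(node())
            else:
                pos[0] += 1
                # Assumption: bracketed tokens such as [8.540] are scores.
                if not (tok.startswith('[') and tok.endswith(']')):
                    children.append(tok)
        pos[0] += 1                      # skip ")"
        return (label, children)

    return node()

def leaves(tree):
    """Return the words under a node, in sentence order."""
    words = []
    for child in tree[1]:
        if isinstance(child, tuple):
            words.extend(leaves(child))
        else:
            words.append(child)
    return words

def phrases(tree, wanted):
    """Return the text of every subtree whose label equals `wanted`."""
    found = [' '.join(leaves(tree))] if tree[0] == wanted else []
    for child in tree[1]:
        if isinstance(child, tuple):
            found.extend(phrases(child, wanted))
    return found

sample = ("(S [46.752] (NP [4.591] (PRP [3.341] you)) "
          "(VP [41.830] (MD [2.354] can) "
          "(VP [37.270] (VB [7.289] treat) "
          "(NP [10.882] (NNS [8.323] functions)) "
          "(PP [16.113] (IN [5.239] like) "
          "(NP [10.201] (NNS [7.216] values))))))")

tree = read_tree(sample)
print(leaves(tree))
print(phrases(tree, 'NP'))
```

    For this fragment, leaves returns the six words of the clause in order, and phrases(tree, 'NP') returns the three noun phrases: you, functions, and values.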

    In addition to the phrase-structure tree, the code prints the grammatical relations (typed dependencies) in the sentence. Each entry has the form relation(governor-index, dependent-index), where the index is the word's position in the sentence, counting from 1. These relations make it straightforward to pick out the subjects, modifiers, and objects of a well-formed sentence. If your application needs this kind of information about sentence meaning, the typed dependencies are a good place to start.

    [nsubj(is-10, One-1), 
    poss(features-5, my-3), 
    amod(features-5, favorite-4), 
    prep_of(One-1, features-5), 
    amod(languages-9, functional-7), 
    nn(languages-9, programming-8), 
    prep_of(features-5, languages-9), 
    complm(treat-14, that-11), 
    nsubj(treat-14, you-12), 
    aux(treat-14, can-13), 
    ccomp(is-10, treat-14), 
    dobj(treat-14, functions-15), 
    prep_like(treat-14, values-17)]
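
    As a rough illustration of how the relations can be used, the sketch below reads the printed list back into tuples and picks out the subjects. It works on the printed text only; read_dependencies is a hypothetical helper and not part of the Stanford Parser API, and the pattern assumes the exact "relation(word-index, word-index)" layout shown above.

```python
import re

# Matches one printed relation, e.g. "nsubj(is-10, One-1)".
DEP_PATTERN = re.compile(r'(\w+)\((\S+)-(\d+), (\S+)-(\d+)\)')

def read_dependencies(text):
    """Return (relation, (governor, index), (dependent, index)) tuples."""
    return [(rel, (gov, int(gi)), (dep, int(di)))
            for rel, gov, gi, dep, di in DEP_PATTERN.findall(text)]

# A few relations copied from the output above.
printed = ("[nsubj(is-10, One-1), poss(features-5, my-3), "
           "nsubj(treat-14, you-12), dobj(treat-14, functions-15)]")

deps = read_dependencies(printed)

# The dependent of every nsubj relation is a grammatical subject.
subjects = [dep for rel, gov, dep in deps if rel == 'nsubj']
print(subjects)
```

    The same filter works for other relations, for example 'dobj' for direct objects or 'amod' for adjectival modifiers.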

    Looking for the Stanford Parser files? You can download them from the project's home page at http://nlp.stanford.edu/software/lex-parser.shtml#Download. I hope this code helps you get started. If you have any questions about using the Stanford Parser, I would be glad to help.

  • Original article: https://www.cnblogs.com/lexus/p/2777740.html