nala: text mining of natural-language mutation mentions

  • k0_735342
    About the author
  • 24.5MB
    File size
  • zip
    File format
  • 2022-05-27 01:46
    Upload date
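We moved. This library is no longer maintained. We moved nala to :

nala

Text mining method for the extraction of sequence variants (genes or proteins) written in standard (ST) format (e.g. "E6V") or complex natural language (NL) (e.g. "glutamic acid was substituted by valine at residue 6").

Publication:

Install

Requires Python 3.6

From source

    git clone https://github.com/Rostlab/nala.git
    cd nala
    poetry shell
    poetry install
    python3 -m nalaf.download_data

Note: if you prefer to install with pip (rather than poetry), you need pip >= 19.0, then run:

    pip install -r requirements.txt
    pip install .

Development

Test

To run the unit tests (excluding the slow tests), run:

    nosetests -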
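To make the standard (ST) format mentioned in the description concrete, here is a minimal sketch of how a mention such as "E6V" could be picked out with a regular expression. This is an illustration only and is not nala's extraction method; the names `ST_PATTERN` and `find_st_mentions` are hypothetical, and the pattern assumes one-letter amino acid codes with a 1-based position. Natural-language (NL) mentions are deliberately not matched by such a pattern.

```python
import re

# One-letter codes of the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"

# Hypothetical pattern (an assumption, not taken from nala):
# wild-type residue, 1-based position, mutant residue, e.g. "E6V".
ST_PATTERN = re.compile(
    rf"\b([{AMINO_ACIDS}])([1-9][0-9]*)([{AMINO_ACIDS}])\b"
)


def find_st_mentions(text):
    """Return (wild_type, position, mutant) tuples for ST-style mentions."""
    return [
        (m.group(1), int(m.group(2)), m.group(3))
        for m in ST_PATTERN.finditer(text)
    ]


if __name__ == "__main__":
    print(find_st_mentions("The E6V variant and the A123T variant were reported."))
```

A pattern like this also matches unrelated tokens with the same letter-number-letter shape, which is one reason a simple regular expression cannot replace a trained extraction method.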
nala-develop.zip
Content overview
<html xmlns="http://www.w3.org/1999/xhtml"><head><meta charset="utf-8"><title></title></head><body>
<p>April 28, 2007 13:18 WSPC/INSTRUCTION FILE annotation_guidelines</p>
<h1>Appendix A. Annotation Guidelines</h1>
<p>The corpus used in the development of MutationFinder<sup>a</sup> is annotated using the Knowtator plugin for the Protege Ontology Editor and Knowledge Acquisition System. (See Figure 1.) A new &#8220;mutation ontology&#8221; has been developed for this purpose, the components of which are summarized as follows:</p>
<ul>
<li>mutation event
<ul>
<li>deletion (subclass)
<ul>
<li>deleted element (slot for <em>biological sequence element</em>, <em>biological subsequence</em>, or <em>biological sequence position</em>)</li>
</ul></li>
<li>insertion (subclass)
<ul>
<li>inserted element (slot for <em>biological sequence</em> or <em>biological sequence element</em>)</li>
<li>insertion start (slot for <em>biological sequence position</em>)</li>
</ul></li>
<li>substitution (subclass)
<ul>
<li>wild type element (slot for <em>biological sequence element</em>, <em>biological sequence position</em>, or <em>biological subsequence</em>)</li>
<li>mutant element (slot for <em>biological sequence element</em> or <em>biological sequence</em>)</li>
</ul></li>
</ul></li>
<li>biological sequence
<ul>
<li>polypeptide sequence (subclass)</li>
<li>sequence (slot for string)</li>
</ul></li>
<li>biological sequence position</li>
<li>biological sequence element
<ul>
<li>amino acid (subclass)
<ul>
<li>Alanine, Ala, A</li>
<li>Glycine, Gly, G</li>
<li>etc.</li>
</ul></li>
<li>position in sequence (slot for <em>biological sequence position</em>)</li>
</ul></li>
<li>biological subsequence
<ul>
<li>polypeptide subsequence (subclass)</li>
</ul></li>
</ul>
<p><sup>a</sup> At present, MutationFinder leverages only a subset of the annotation data in the annotated corpus. (The MutationFinder distribution includes data files that have been generated from the Knowtator projects used in this annotation task, but have been preprocessed to simplify their contents.) As of this writing, the data sets provided with MutationFinder include only ordered lists of substitution events. Insertion and deletion events are not currently supported by the tool, and all information relating to how specific spans of text are associated with each substitution is absent. While the MutationFinder system itself is capable of conveying basic positional information about where substitution-event mentions begin and end in a piece of text, the system&#8217;s output cannot currently be used to automate the full annotation process described here.</p>
</body></html>
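The substitution branch of the mutation ontology in the appendix can be pictured as a small set of data classes. The sketch below is a hypothetical Python rendering, not the Knowtator/Protege schema itself: the class and field names are assumptions that mirror the slot names in the guidelines, positions are assumed to be 1-based, and only amino acid elements are modeled.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SequencePosition:
    """'biological sequence position' (index assumed to be 1-based)."""
    index: int


@dataclass(frozen=True)
class AminoAcid:
    """'amino acid', a subclass of 'biological sequence element'."""
    name: str
    three_letter: str
    one_letter: str
    # Mirrors the 'position in sequence' slot; may be left unset.
    position_in_sequence: Optional[SequencePosition] = None


@dataclass(frozen=True)
class Substitution:
    """'substitution', a subclass of 'mutation event'."""
    wild_type_element: AminoAcid
    mutant_element: AminoAcid

    def standard_format(self) -> str:
        """Render as wild type + position + mutant, e.g. 'E6V'."""
        position = self.wild_type_element.position_in_sequence
        if position is None:
            raise ValueError("wild type element has no position in sequence")
        return (
            f"{self.wild_type_element.one_letter}"
            f"{position.index}"
            f"{self.mutant_element.one_letter}"
        )


# "Glutamic acid was substituted by valine at residue 6"
e6v = Substitution(
    wild_type_element=AminoAcid("Glutamic acid", "Glu", "E", SequencePosition(6)),
    mutant_element=AminoAcid("Valine", "Val", "V"),
)
print(e6v.standard_format())
```

Deletion and insertion events would follow the same shape, with fields for the deleted element, or for the inserted element and insertion start.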