Regex to match all characters between two strings in a document

Problem description:

I have text like the following, and I want to capture all the characters between two strings in it:

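Title: CRB: Genetic Diversity of Endangered Populations of Mysticete Whales: Mitochondrial DNA and Historical Demography Type: Award NSF Org: DEB Latest Amendment Date: August 1, 1991 File: a9000006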

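Award Number: 9000006 Award Instr.: Continuing grant
Prgm Manager: Scott Collins
DEB Division of Environmental Biology
BIO Direct for Biological Sciences Start Date: June 1, 1990 Expires: November 30, 1992 (Estimated) Expected Total Amt.: $179720 (Estimated) Investigator: Stephen R. Palumbi (Principal Investigator current) Sponsor: University of Hawaii Manoa 2530 Dole Street Honolulu, HI 968222225 808/956-7800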

NSF Program: 1127 SYSTEMATIC & POPULATION BIOLO Fld Applictn: 0000099 Other Applications NEC
61 Life Science Biological Program Ref: 9285, Abstract:

  Commercial exploitation over the past two hundred years drove     
      the great Mysticete whales to near extinction. Variation in     
      the sizes of populations prior to exploitation, minimal       
      population size during exploitation and current population      
      sizes permit analyses of the effects of differing levels of      
      exploitation on species with different biogeographical       
      distributions and life-history characteristics. Dr. Stephen     
      Palumbi at the University of Hawaii will study the genetic      
      population structure of three whale species in this context,     
      the Humpback Whale, the Gray Whale and the Bowhead Whale. The     
      effect of demographic history will be determined by comparing     
      the genetic structure of the three species. Additional studies     
      will be carried out on the Humpback Whale. The humpback has a     
      world-wide distribution, but the Atlantic and Pacific       
      populations of the northern hemisphere appear to be discrete     
      populations, as is the population of the southern hemispheric     
      oceans. Each of these oceanic populations may be further      
      subdivided into smaller isolates, each with its own migratory     
      pattern and somewhat distinct gene pool. This study will      
      provide information on the level of genetic isolation among      
      populations and the levels of gene flow and genealogical      
      relationships among populations. This detailed genetic       
      information will facilitate international policy decisions      
      regarding the conservation and management of these magnificent     
      mammals. 

I want to match every character between "Title" and "Abstract". I tried (?<=Title)(.*)(?=Asbtract) and \bTitle\b(.*?)\bAbstract\b, but neither worked. I can't figure out what the correct syntax is.

Actually, you need to specify which language you are using, so that the right regex becomes clear.

This is for Java, thanks.

I have given my regex.

\\bTitle\\b([\\s\\S]*?)\\bAbstract\\b 

By default, . does not match newlines, so use the s (DOTALL) flag or [\s\S] instead.

See the demo.

https://regex101.com/r/lR1eC9/6
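
A minimal sketch of how this pattern could be used from Java, assuming the whole document has already been read into a String. The class name BetweenMarkers and the method name between are hypothetical, chosen only for illustration:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class BetweenMarkers {
    // [\s\S] matches any character, line breaks included;
    // the lazy *? stops at the first "Abstract" instead of the last one.
    private static final Pattern BETWEEN =
            Pattern.compile("\\bTitle\\b([\\s\\S]*?)\\bAbstract\\b");

    // Returns the text between the two markers, or empty if no match is found.
    static Optional<String> between(String document) {
        Matcher m = BETWEEN.matcher(document);
        return m.find() ? Optional.of(m.group(1)) : Optional.empty();
    }

    public static void main(String[] args) {
        // Shortened, illustrative excerpt of the record in the question.
        String sample = "Title: CRB: Genetic Diversity\n"
                + "Type: Award\n"
                + "Abstract: Commercial exploitation ...";
        between(sample).ifPresent(System.out::println);
    }
}
```

Note that find() is used rather than matches(), since the pattern is only expected to cover part of the document.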

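Note: each backslash needs to be escaped with another backslash, because Java handles regexes through string literals, where the backslash already has a special meaning (escaping). The regex that is actually produced is then: '\bTitle\b([\s\S]*?)\bAbstract\b'. – sp00m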
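
The escaping rule in this comment can be checked directly. The sketch below is illustrative only (EscapingDemo and its method names are hypothetical): it prints the regex the engine actually receives, and shows that a single-backslash "\b" inside a Java string literal is a backspace character rather than a word boundary:

```java
import java.util.regex.Pattern;

final class EscapingDemo {
    // The regex as the engine sees it, after the compiler has processed the literal.
    static String regexSeenByEngine() {
        // In Java source, each \\ in a string literal becomes a single \ in the String.
        String literal = "\\bTitle\\b([\\s\\S]*?)\\bAbstract\\b";
        return Pattern.compile(literal).pattern();
    }

    // "\b" in a string literal is the backspace character (code 8), one char long.
    static boolean singleBackslashBIsBackspace() {
        return "\b".length() == 1 && "\b".charAt(0) == 8;
    }

    public static void main(String[] args) {
        System.out.println(regexSeenByEngine());
        System.out.println(singleBackslashBIsBackspace());
    }
}
```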

@vks Solved, thanks.

You should use a regex like the following:

Title\s*\:(.*?)Abstract\s*\:
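
A hedged sketch of this second pattern in Java. It assumes the field labels are followed by a colon, as this regex expects, and it compiles with Pattern.DOTALL, because . alone does not cross line breaks. The class and method names are hypothetical, and the colon is left unescaped since it has no special meaning here:

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TitleToAbstract {
    // DOTALL lets . match line terminators, so (.*?) can span several lines.
    private static final Pattern FIELDS =
            Pattern.compile("Title\\s*:(.*?)Abstract\\s*:", Pattern.DOTALL);

    // Same regex without DOTALL, kept only to show the difference.
    private static final Pattern FIELDS_NO_DOTALL =
            Pattern.compile("Title\\s*:(.*?)Abstract\\s*:");

    // Returns the trimmed text between "Title :" and "Abstract :", if present.
    static Optional<String> extract(String document) {
        Matcher m = FIELDS.matcher(document);
        return m.find() ? Optional.of(m.group(1).trim()) : Optional.empty();
    }

    // True if the pattern without DOTALL finds a match in the document.
    static boolean matchesWithoutDotall(String document) {
        return FIELDS_NO_DOTALL.matcher(document).find();
    }

    public static void main(String[] args) {
        String sample = "Title  : CRB: Genetic Diversity\nType : Award\nAbstract  :\n text";
        System.out.println(extract(sample).orElse("(no match)"));
        System.out.println(matchesWithoutDotall(sample));
    }
}
```

When the two markers sit on different lines, as in the sample record, the variant without DOTALL finds nothing, which is why the flag (or [\s\S]) matters.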