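Configuring Solr to Auto-Generate IDs in Practice
Solr's indexing is built on Lucene, and the most basic unit of an index is the Document. Managing any Document in Solr (updating, deleting, querying) almost always involves its ID, which plays a role similar to a primary key in a relational table. Having Solr generate this unique ID automatically can save a fair amount of work, and it only requires some configuration in Solr.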

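The example below shows how to configure the UUID support that Solr provides. First, the table structure corresponding to the example schema.xml is shown in the figure:
Generating a unique UUID in Solr requires changes to two configuration files: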
- schema.xml
In schema.xml, add the following type definition:
    <fieldType name="uuid" class="solr.UUIDField" indexed="true" />
Then define the ID field with that type, as follows:
    <field name="id" type="uuid" indexed="true" stored="true" multiValued="false" required="true" />
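For context, a Solr schema normally also names the field that serves as the unique key. The fragment below is a minimal sketch that assumes the example schema.xml declares `id` in this role; the source does not show this element, so treat it as an assumption about the surrounding schema rather than part of the original configuration.

```xml
<!-- Assumption: the example schema uses the generated "id" field as its unique key. -->
<uniqueKey>id</uniqueKey>
```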
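This alone is not enough. You also need to specify that this update strategy is used whenever the index is updated, which means configuring a requestHandler element.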
- solrconfig.xml
In solrconfig.xml, modify the requestHandler used for index updates as follows:
    <requestHandler name="/update" class="solr.UpdateRequestHandler">
      <lst name="defaults">
        <str name="update.chain">dispup</str>
      </lst>
    </requestHandler>
The update.chain value above names the update strategy that actually performs the UUID generation. It is configured as follows:
    <updateRequestProcessorChain name="dispup">
      <processor class="solr.UUIDUpdateProcessorFactory">
        <str name="fieldName">id</str>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory" />
      <processor class="solr.DistributedUpdateProcessorFactory" />
      <processor class="solr.RunUpdateProcessorFactory" />
    </updateRequestProcessorChain>
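As a possible variation, Solr also lets a chain be marked as the default, so that update requests which do not name an `update.chain` still pass through it. The sketch below reuses the `dispup` chain from this article and adds only the `default="true"` attribute; whether this suits a given setup depends on which other chains are defined, so it is an illustration rather than a recommendation.

```xml
<!-- Variation (not from the original article): mark the chain as the default
     so it applies even when a request does not specify update.chain. -->
<updateRequestProcessorChain name="dispup" default="true">
  <processor class="solr.UUIDUpdateProcessorFactory">
    <str name="fieldName">id</str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory" />
  <processor class="solr.DistributedUpdateProcessorFactory" />
  <processor class="solr.RunUpdateProcessorFactory" />
</updateRequestProcessorChain>
```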
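After these two configuration steps, you no longer need to supply the ID that a Document requires when indexing; Solr generates the ID string entirely on its own. Documents indexed with this configuration look like the following example: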
    <response>
      <lst name="responseHeader">
        <int name="status">0</int>
        <int name="QTime">1</int>
      </lst>
      <result name="response" numFound="86773" start="0">
        <doc>
          <int name="log_id">6410</int>
          <long name="start_time">87318</long>
          <long name="end_time">88282</long>
          <int name="prov_id">1</int>
          <int name="city_id">105</int>
          <int name="area_id">0</int>
          <int name="idt_id">5100</int>
          <int name="cnt">29</int>
          <int name="net_type">5</int>
          <int name="time_type">1</int>
          <int name="time_id">20130810</int>
          <str name="id">4cb43476-eb96-498e-a3a0-8d13c0a6c8c5</str>
          <long name="_version_">1443405623457742848</long>
        </doc>
        <doc>
          <int name="log_id">6410</int>
          <long name="start_time">87318</long>
          <long name="end_time">88282</long>
          <int name="prov_id">1</int>
          <int name="city_id">105</int>
          <int name="area_id">0</int>
          <int name="idt_id">5101</int>
          <int name="cnt">29</int>
          <int name="net_type">5</int>
          <int name="time_type">1</int>
          <int name="time_id">20130810</int>
          <str name="id">faef555d-1587-489e-889a-c7c696607d3b</str>
          <long name="_version_">1443405623459840000</long>
        </doc>
      </result>
    </response>
As you can see, this is exactly what we need.
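To make the indexing side concrete, here is a sketch of an XML update message that could be posted to the `/update` handler under the configuration described in this article. The field names and values are copied from the first document in the response shown earlier; the message shape itself (`<add>`/`<doc>`/`<field>`) is Solr's standard XML update format, and the article does not show the actual request that was used, so this is an illustrative reconstruction. Note that no `id` field is supplied.

```xml
<!-- Illustrative update message: the "id" field is deliberately omitted,
     leaving it to solr.UUIDUpdateProcessorFactory to fill in. -->
<add>
  <doc>
    <field name="log_id">6410</field>
    <field name="start_time">87318</field>
    <field name="end_time">88282</field>
    <field name="prov_id">1</field>
    <field name="city_id">105</field>
    <field name="area_id">0</field>
    <field name="idt_id">5100</field>
    <field name="cnt">29</field>
    <field name="net_type">5</field>
    <field name="time_type">1</field>
    <field name="time_id">20130810</field>
  </doc>
</add>
```

Because the chain runs the UUID processor before the document reaches the index, the `required="true"` constraint on `id` is satisfied even though the client never sends that field.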