Most applications store data in relational databases such as MySQL, Oracle, or DB2, and searching over that data is a common use case. The DataImportHandler is a Solr contrib module that provides a configuration-driven way to import this data into Solr. To use it, we create a data-config.xml file. In this example, the query in data-config.xml contains a variable whose value is supplied when the import is started.
The data-config for MySQL looks like this:
<dataConfig>
  <dataSource name="ds-db" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/dbname" user="root"
              password="root"/>
  <dataSource name="ds-file" type="BinFileDataSource"/>
  <document name="documents">
    <!-- MySQL has no to_date(); str_to_date() with MySQL format codes
         parses the same DD/MM/YYYY HH24:MI:SS value. -->
    <entity name="book" dataSource="ds-db"
            query="select distinct
                     book.id as id,
                     book.title,
                     book.author,
                     book.publisher
                   from Books book
                   where book.book_added_date >= str_to_date(${dataimporter.request.lastIndexDate}, '%d/%m/%Y %H:%i:%s')"
            transformer="DateFormatTransformer">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <field column="author" name="author"/>
      <field column="publisher" name="publisher"/>
      <!-- Child entity: looks up the description for each book row -->
      <entity name="content"
              query="select description from content where content_id='${book.id}'">
        <field column="description" name="description"/>
      </entity>
    </entity>
  </document>
</dataConfig>
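For Solr to pick up this file, the DataImportHandler also has to be registered in solrconfig.xml. The fragment below is a minimal sketch, assuming the handler is exposed as /dataimport (the name used in the import URL) and that data-config.xml sits in the core's conf directory; the DataImportHandler contrib jar and the JDBC driver jar are assumed to be on Solr's classpath already.

```xml
<!-- solrconfig.xml: registers the import handler under /dataimport -->
<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <!-- File name is resolved relative to the core's conf directory (assumed location) -->
    <str name="config">data-config.xml</str>
  </lst>
</requestHandler>
```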
The value for this variable has to be passed as a request parameter in the URL.
The URL to start the data import in this case is:
http://localhost:8080/solr/admin/select/?qt=/dataimport&command=full-import&clean=false&commit=true&lastIndexDate='08/05/2011%2020:16:11'
For the first indexing run, you need to pass "lastIndexDate=null".
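One thing to watch on that first run: in SQL, comparing a column against to_date(null, ...) evaluates to NULL rather than true, so a where clause that contains only the date comparison would match no rows. The fragment below is a sketch of one way to make the query tolerate lastIndexDate=null, assuming the Oracle to_date syntax and the table and column names from this post; the "is null" branch is my addition, not part of the original query.

```xml
<entity name="book" dataSource="ds-db"
        query="select distinct
                 book.id as id,
                 book.title,
                 book.author,
                 book.publisher
               from Books book
               where ${dataimporter.request.lastIndexDate} is null
                  or book.book_added_date >= to_date(${dataimporter.request.lastIndexDate}, 'DD/MM/YYYY HH24:MI:SS')">
  <!-- Field mappings and the nested content entity stay the same as in the full config -->
</entity>
```

With lastIndexDate=null the first condition becomes "null is null", which selects every row; with a quoted date it is false and only the date filter applies.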
The data-config for Oracle looks like this:
<dataConfig>
  <dataSource name="ds-db" driver="oracle.jdbc.driver.OracleDriver"
              url="jdbc:oracle:thin:@127.0.0.1:1521:test"
              user="dev" password="dev"/>
  <dataSource name="ds-file" type="BinFileDataSource"/>
  <document name="documents">
    <entity name="book" dataSource="ds-db"
            query="select distinct
                     book.id as id,
                     book.title,
                     book.author,
                     book.publisher
                   from Books book
                   where book.book_added_date >= to_date(${dataimporter.request.lastIndexDate}, 'DD/MM/YYYY HH24:MI:SS')"
            transformer="DateFormatTransformer">
      <field column="id" name="id"/>
      <field column="title" name="title"/>
      <field column="author" name="author"/>
      <field column="publisher" name="publisher"/>
      <!-- Note the capital I in ${book.Id}; ${book.id} did not work with Oracle -->
      <entity name="content"
              query="select description from content where content_id='${book.Id}'">
        <field column="description" name="description"/>
      </entity>
    </entity>
  </document>
</dataConfig>
The change to note in the data-config.xml for Oracle is that the nested entity refers to ${book.Id} and not ${book.id}.
It took me a long time to find this out by debugging.