Fedora Generic Search Service Version 2.7

  • compatible with Fedora Version 3.7.1
  • compatible with Lucene 4.6.1, Solr 4.6.1, PDFBox 1.8.4, and Tika 1.4
  • easier configuration of GSearch for Islandora

apt-get install ant
apt-get install unzip
nano -w /usr/local/fedora/server/config/fedora-users.xml
    <user name="fgsAdmin" password="******">
      <attribute name="fedoraRole">
wget http://garr.dl.sourceforge.net/project/fedora-commons/services/3.7/fedoragsearch-2.7.zip
unzip fedoragsearch-2.7.zip
cp fedoragsearch-2.7/fedoragsearch.war /var/lib/tomcat7/webapps/
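Tomcat has to unpack the WAR before the FgsConfig directory exists, so the next copy can fail if it runs too early. A minimal sketch of a polling helper, assuming Tomcat auto-deploys WARs dropped into webapps; wait_for_dir is a hypothetical name, not part of GSearch or Tomcat:

```shell
# Sketch: poll until a directory appears, or give up.
# wait_for_dir is a hypothetical helper, not shipped with GSearch or Tomcat.
wait_for_dir() {
    dir=$1
    tries=${2:-30}          # number of one-second polls before giving up
    while [ "$tries" -gt 0 ]; do
        [ -d "$dir" ] && return 0
        sleep 1
        tries=$((tries - 1))
    done
    return 1
}
```

It could be called as: wait_for_dir /var/lib/tomcat7/webapps/fedoragsearch/FgsConfig 60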
cp -R /var/lib/tomcat7/webapps/fedoragsearch/FgsConfig ./
cd FgsConfig/

ant generateIndexingXslt

cp fgsconfig-basic-for-islandora.properties fgsconfig-basic-for-islandora.properties.ORI
nano -w fgsconfig-basic-for-islandora.properties
# file.name=fgsconfig-basic-for-islandora.properties

# This is a version of fgsconfig-basic.properties tailored for islandora

# These properties are used by running from command line:
#   >ant -f fgsconfig-basic.xml -Dlocal.FEDORA_HOME=$FEDORA_HOME -propertyfile fgsconfig-basic-for-islandora.properties
# Be sure you have permissions to write to finalConfigPath.

# You must tailor the lines between #>>>>>>>>>> and #<<<<<<<<<<

# configDisplayName is displayed on the admin pages, so you know, which set of config files is in action.
# configDisplayName is also used as directory name of the config within the FgsConfigTemplate directory.

# gsearchBase is used for SOAP deployment.

# gsearchAppName is used for SOAP deployment.

# gsearchUser is used for SOAP deployment.

# gsearchPass is used for SOAP deployment.

# finalConfigPath must be in the classpath of the web server, must be an absolute path.

# At startup, GSearch will find the file log4j.xml in tomcat classpath.
# logFilePath is where to find the log file.

# logLevel can be DEBUG, INFO, WARN, ERROR, FATAL.

# namesOfRepositories separated by space.

# namesOfIndexes separated by space.

# Assuming there is one repository:

  # fedoraBase is base url of the repository.

  # fedoraAppName is Fedora app name of this repository.

  # fedoraUser is the user name to access this repository.

  # fedoraPass is the password to access this repository.

  # fedoraVersion is the Fedora version of this repository.

  #objectStoreBase must be the location of the objects of this repository.

#Assuming there is one index:

  # indexEngine is Lucene, Solr, or Zebra.

  # FgsIndex: indexBase is the server base url, in case of Solr or Zebra.

  # FgsIndex: indexDir is the path to the index.

  # FgsIndex: indexingDocXslt is the name of the indexing stylesheet.
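
The comments above name the properties that must be tailored between the #>>>>>>>>>> and #<<<<<<<<<< markers. A minimal sketch for spotting keys that were left empty or commented out, assuming the key names in the file match the names used in those comments; missing_props is a hypothetical helper, not part of GSearch:

```shell
# Sketch: print every requested key that has no non-empty value in the file.
# missing_props is a hypothetical helper; key names come from the comments above.
missing_props() {
    file=$1
    shift
    for key in "$@"; do
        # A key counts as set when an uncommented "key=value" line
        # carries at least one non-blank character after the "=".
        grep -Eq "^[[:space:]]*${key}[[:space:]]*=[[:space:]]*[^[:space:]]" "$file" \
            || echo "$key"
    done
}
```

For example: missing_props fgsconfig-basic-for-islandora.properties gsearchUser gsearchPass finalConfigPath fedoraUser fedoraPass indexEngine indexDir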
nano -w FgsConfigReposTemplate/repositoryInfo.xml
<?xml version="1.0" encoding="UTF-8"?>
  <AdminInfo>V2P2 repository 3.7.1
  <RepositoryLongName>Repository for V2P2 project</RepositoryLongName>
  <RepositoryDeveloper>Giancarlo Birello, UIT@Ceris</RepositoryDeveloper>
nano -w FgsConfigIndexTemplate/Solr/indexInfo.xml  (nothing to do)
<?xml version="1.0" encoding="UTF-8"?>
<resultPage indexName="INDEXNAME">
  <AdminInfo>The contents of this page is just an example,
  you may edit it in indexInfo.xml,
  and it is displayed by the getIndexInfo operation
  with the adminGetIndexInfoToHtml.xslt stylesheet.</AdminInfo>
  <IndexLongName>INDEXNAME index on Solr</IndexLongName>
  <EngineLongName>Apache Lucene project</EngineLongName>
  <EngineDescription>The Apache Solr project develops open-source search software.</EngineDescription>
  <EngineTags>solr lucene apache open-source search software</EngineTags>
  <QueryLanguage>See e.g. http://lucene.apache.org/java/docs/queryparsersyntax.html</QueryLanguage>
  <SampleSearch>dc.title:fedora AND dc.creator:"thornton staples"</SampleSearch>
  <IndexFieldNameList>PID, repositoryName,<BR/>
                      property.label, property.contentModel, property.createdDate,<BR/>
                      property.lastModifiedDate, property.state, property.type,<BR/>
                      dc.creator, dc.date, dc.description, dc.format, dc.identifier,<BR/>
                      dc.publisher, dc.relations, dc.right, dc.source,<BR/>
                      dc.subject, dc.title,<BR/>
                      others depending on the indexing stylesheet.
  <EngineDeveloper>Apache Lucene Solr project</EngineDeveloper>
  <EngineAttribution>The Apache Lucene Solr project &#169; 2005, The Apache Lucene Solr project,
   All Rights Reserved</EngineAttribution>
export FEDORA_HOME=/usr/local/fedora
ant -f fgsconfig-basic.xml -Dlocal.FEDORA_HOME=$FEDORA_HOME -propertyfile fgsconfig-basic-for-islandora.properties
Buildfile: /home/giancarlo/FgsConfig/fgsconfig-basic.xml


    [mkdir] Created dir: /home/giancarlo/FgsConfig/configForIslandora/fgsconfigFinal
     [copy] Copying 22 files to /home/giancarlo/FgsConfig/configForIslandora/fgsconfigFinal
     [copy] Copying 1 file to /home/giancarlo/FgsConfig/configForIslandora

    [mkdir] Created dir: /home/giancarlo/FgsConfig/configForIslandora/fgsconfigFinal/repository/FgsRepos
     [copy] Copying 3 files to /home/giancarlo/FgsConfig/configForIslandora/fgsconfigFinal/repository/FgsRepos

    [mkdir] Created dir: /home/giancarlo/FgsConfig/configForIslandora/fgsconfigFinal/index/FgsIndex
     [copy] Copying 23 files to /home/giancarlo/FgsConfig/configForIslandora/fgsconfigFinal/index/FgsIndex
     [copy] Copying 1 file to /home/giancarlo/FgsConfig
     [copy] Copying 1 file to /home/giancarlo/FgsConfig/configForIslandora/fgsconfigFinal
    [mkdir] Created dir: /var/lib/tomcat7/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal
     [copy] Copying 49 files to /var/lib/tomcat7/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal
     [copy] Copying 1 file to /var/lib/tomcat7/webapps/fedoragsearch/WEB-INF/classes

Total time: 1 second
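
The ant output above shows the final configuration being copied under WEB-INF/classes/fgsconfigFinal. A minimal sketch for confirming that the repository and index directories arrived, assuming the FgsRepos and FgsIndex names seen in that output; check_fgs_config is a hypothetical helper:

```shell
# Sketch: verify that the deployed config contains the expected directories.
# check_fgs_config is a hypothetical helper; the sub-paths are taken from
# the ant output above.
check_fgs_config() {
    classes=$1
    status=0
    for sub in fgsconfigFinal/repository/FgsRepos fgsconfigFinal/index/FgsIndex; do
        if [ ! -d "$classes/$sub" ]; then
            echo "missing: $classes/$sub" >&2
            status=1
        fi
    done
    return "$status"
}
```

For example: check_fgs_config /var/lib/tomcat7/webapps/fedoragsearch/WEB-INF/classes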
cd /usr/local/solr/islandora/conf/
cp /var/lib/tomcat7/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/conf/schema-4.6.1-for-fgs-2.7.xml ./
mv schema.xml schema.xml.ORI
cp schema-4.6.1-for-fgs-2.7.xml schema.xml

nano -w schema.xml

	+ <dynamicField name="*_hlt" type="text_general"   indexed="true"  stored="true" termVectors="true" termPositions="true" termOffsets="true"/>
	+ <dynamicField name="dc.*"  type="text_general" indexed="true" stored="true" multiValued="true"/>
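
After editing, it is easy to check that both dynamicField entries really ended up in schema.xml. A minimal sketch using a literal grep; has_dynamic_field is a hypothetical helper and only looks for the opening of the element, not its attributes:

```shell
# Sketch: succeed when the schema declares a dynamicField with the given name.
# has_dynamic_field is a hypothetical helper.
# $1 = schema file, $2 = literal field name such as '*_hlt' or 'dc.*'
has_dynamic_field() {
    # -F treats the pattern as a fixed string, so '*' and '.' are not wildcards.
    grep -F "<dynamicField name=\"$2\"" "$1" >/dev/null
}
```

For example: has_dynamic_field schema.xml '*_hlt' && has_dynamic_field schema.xml 'dc.*'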

We need the right foxmlToSolr.xslt to index every field used by the Islandora modules.

apt-get install git
cd ~
git clone https://github.com/discoverygarden/basic-solr-config

cd /var/lib/tomcat7/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex/
cp foxmlToSolr.xslt foxmlToSolr.xslt.ORI
cp ~/basic-solr-config/foxmlToSolr.xslt ./
cp -R ~/basic-solr-config/islandora_transforms ./

nano -w foxmlToSolr.xslt
     adjust the full paths to islandora_transforms

nano -w islandora_transforms/slurp_all_MODS_to_solr.xslt
     adjust the full path to library
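
The two path adjustments can also be done non-interactively. A minimal sketch, assuming the stylesheets reference their includes through one absolute path prefix that has to be swapped for the local one; rewrite_prefix is a hypothetical helper, and the old prefix must be read from the cloned files first (for example with grep -n 'href="/' foxmlToSolr.xslt):

```shell
# Sketch: replace one path prefix with another in the given files.
# rewrite_prefix is a hypothetical helper. '|' is the sed delimiter, so
# neither prefix may contain '|' or '&'; the old prefix is used as a regex,
# so a '.' in it matches any character.
rewrite_prefix() {
    old=$1
    new=$2
    shift 2
    for f in "$@"; do
        # Write to a temporary file first so a failed sed leaves $f untouched.
        sed "s|$old|$new|g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}
```

For example, with OLD set to the prefix found by grep: rewrite_prefix "$OLD" /var/lib/tomcat7/webapps/fedoragsearch/WEB-INF/classes/fgsconfigFinal/index/FgsIndex foxmlToSolr.xslt islandora_transforms/slurp_all_MODS_to_solr.xslt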

apt-get install maven
cd ~
git clone https://github.com/discoverygarden/dgi_gsearch_extensions
cd dgi_gsearch_extensions/
mvn package
cp target/gsearch_extensions-0.1.0-jar-with-dependencies.jar /var/lib/tomcat7/webapps/fedoragsearch/WEB-INF/lib/

service tomcat7 restart

Test the installation
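
One possible smoke test is to call the getIndexInfo operation through the GSearch REST interface. A minimal sketch, assuming Tomcat answers on http://localhost:8080 and that the fgsAdmin user from fedora-users.xml is used for authentication; gsearch_url and gsearch_check are hypothetical helpers:

```shell
# Sketch: build the REST URL for a GSearch operation and probe it with curl.
# gsearch_url and gsearch_check are hypothetical helpers; the base URL and
# credentials passed to them are assumptions about the local setup.
gsearch_url() {
    # $1 = base URL of Tomcat, $2 = operation name
    printf '%s/fedoragsearch/rest?operation=%s\n' "$1" "$2"
}

gsearch_check() {
    # $1 = base URL, $2 = user, $3 = password
    # --fail makes curl return non-zero on HTTP errors such as 401 or 404.
    curl --fail --silent --show-error --user "$2:$3" \
        "$(gsearch_url "$1" getIndexInfo)" >/dev/null
}
```

For example: gsearch_check http://localhost:8080 fgsAdmin "$FGS_PASS" && echo "GSearch answers"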

