Infinit.e - troubleshooting knowledgebase

Installation issues

MongoDB doesn't install

This is usually caused by an incorrect "/opt/infinite-install/conf/infinite.configuration.properties". Note the install may take up to 30 minutes to fail out if it is waiting for other components to start.

Install problems have also been observed when:

  • Directories from old DB installations have not been cleared out ("/mnt/opt/db-home", "/opt/db-home", "/data", the latter 2 are symbolic links). These directories should be deleted.
  • Running Infinit.e components (API, harvester) were trying to connect to the DB (ie to the previous install). These should be shut down.

I want to run Infinit.e on a 32b machine but the DB won't load (or I have less than 20GB available)

The default "oplog" (20GB) is too large for 32b machines. You can set "db.oplog=256" in the "infinite.configuration.properties" file and then restart the DB (and rerun "/opt/db-home/setupAdminShards.sh").
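
A minimal sketch of making that edit from the shell. The helper name "set_property" is hypothetical (it is not part of Infinit.e), and it assumes simple one-line "key=value" entries with no spaces around the "=".

```shell
# Hypothetical helper (not part of Infinit.e): set KEY=VALUE in a Java-style
# properties file, replacing an existing entry or appending a new one.
# Assumes the value contains no "|", "&" or "\" characters.
set_property() {
    file=$1 key=$2 value=$3
    # Escape dots in the key so "db.oplog" is matched literally.
    pattern=$(printf '%s' "$key" | sed 's/\./\\./g')
    if grep -q "^${pattern}=" "$file"; then
        # Rewrite through a temp file, then copy back so the original
        # file keeps its owner and permissions.
        sed "s|^${pattern}=.*|${key}=${value}|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" && rm -f "$file.tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example, where CONF is the path to your infinite.configuration.properties:
#   cp "$CONF" "$CONF.bak"
#   set_property "$CONF" db.oplog 256
```

The DB restart and the admin shard script described above are still needed afterwards.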

I want to run Infinit.e on a machine/VM with less than 4GB of RAM and am running into memory issues

This can most likely be fixed by setting "bootstrap.mlockall" to "false" in the "infinite.configuration.properties" file, running "/opt/infinite-home/scripts/rewrite_property_files.sh" and then restarting the index engine ("service infinite-index-engine restart"). In addition, the Infinit.e-related files in "/etc/sysconfig" ("tomcat6-interface-engine", "infinite-index-engine", plus "infinite-px-engine.sh" in "/opt/infinite-home/bin/") allow the per-process memory to be controlled, though this shouldn't be necessary once the "mlockall" parameter has been changed.

I upgraded MongoDB to 2.4 and everything stopped working!

Downgrade back as described below, and then follow the instructions for upgrading.

  • yum -y downgrade mongo-10gen-2.2.0-mongodb_1; yum -y downgrade mongo-10gen-server-2.2.0-mongodb_1;
    • (If you were previously running 2.2.0 and just want to get everything back the way it was!) 
  • OR:
  • yum -y downgrade mongo-10gen-2.2.3-mongodb_1; yum -y downgrade mongo-10gen-server-2.2.3-mongodb_1;
    • (If you were previously running 2.2.3 or are feeling adventurous) 

And then restart the DB ("service mongo_infinite restart")

I upgraded MongoDB to 2.6 and everything stopped working!

Downgrade back as described above, and then follow the instructions for upgrading.

Installation via "yum" no longer works!

As of Jan 2014, the yum repository is hosted at "yum.ikanow.com", not "www.ikanow.com". Therefore the files "/etc/yum.repos.d/ikanow.repo" and "/etc/yum.repos.d/ikanow-infinite.repo" should be modified with the new URL. The new file contents can be found here.

I will be trying to keep the repository linked to "www.ikanow.com" but it will not be as reliably up since I don't control that site.
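
A sketch of updating the repo files with "sed". It assumes that only the host name changed between the old and new URLs, which is a guess: compare the result with the published file contents before relying on it. The function name "retarget_repo" is made up for this example.

```shell
# Sketch: point a yum .repo file at the new repository host.
# Assumption: only the host name changed (www.ikanow.com -> yum.ikanow.com).
retarget_repo() {
    repo_file=$1
    # Keep a backup; yum only reads files ending in ".repo", so the
    # ".bak" copy is ignored.
    cp "$repo_file" "$repo_file.bak" || return 1
    sed 's|www\.ikanow\.com|yum.ikanow.com|g' "$repo_file.bak" > "$repo_file"
}

# Example (as root):
#   for f in /etc/yum.repos.d/ikanow.repo /etc/yum.repos.d/ikanow-infinite.repo; do
#       retarget_repo "$f"
#   done
#   yum clean metadata
```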

Yum returns the error "Package does not match intended download. Suggestion: run yum --enablerepo=<reponame> clean metadata"

This means the download of the RPM was corrupted. The most likely reason for this is that your IT department has a maximum download size restriction. You can either do an offline install (having obtained the IKANOW tarballs externally) or contact your IT department to lift the restrictions for the period of download.

Components fail to start

Elasticsearch fails to start, "org.elasticsearch.bootstrap.ElasticSearch not found in gnu.gcj.runtime.system.ClassLoader"

This indicates that the wrong version of JAVA is being used. "/usr/bin/java" (normally a symbolic link) probably points to "/etc/alternatives/java" instead of "/usr/java/default/bin/java" - simply change the symbolic link. Also check JAVA_HOME isn't being set incorrectly (it should either be not set, or point to "/usr/java/default"), since elasticsearch uses that to select its JAVA binary.
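
A small sketch for checking where the "java" launcher really ends up. The function names are hypothetical. It relies on GNU "readlink -f", which follows the whole chain of symbolic links, and treats anything that resolves under "/usr/java" as acceptable (that prefix is an assumption based on the paths above).

```shell
# Print the real file behind a java launcher (default: /usr/bin/java).
resolve_java() {
    readlink -f "${1:-/usr/bin/java}"
}

# Succeed if the launcher resolves to a path under the expected root
# (default: /usr/java); fail otherwise.
java_under() {
    target=$(resolve_java "$1") || return 1
    case $target in
        "${2:-/usr/java}"/*) return 0 ;;
        *) return 1 ;;
    esac
}

# Example:
#   resolve_java
#   java_under /usr/bin/java || echo "java does not resolve under /usr/java"
#   echo "JAVA_HOME=${JAVA_HOME:-<unset>}"
```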

Elasticsearch fails to start, "The stack size specified is too small, Specify at least 160k"

This happens if Java7 is the default - eg "/usr/java/default" is a symbolic link that points (likely via /usr/java/latest) to something like "/usr/java/jre1.7.0_13". Unfortunately a number of Infinit.e components, including elasticsearch 0.19, are not currently compatible with Java 7.

To fix, just delete "/usr/java/latest" and re-create it as a link to "/usr/java/jre1.6.0_30/" (which is part of the Infinit.e install), eg "ln -s /usr/java/jre1.6.0_30/ /usr/java/latest"

Tomcat fails to start

This is normally an error in one of the main configuration files (particularly if some post-install editing has occurred there!):

  • /opt/tomcat-infinite/interface-engine/conf/context.xml
  • /opt/tomcat-infinite/interface-engine/conf/server.xml

The logs for tomcat can be found in "/opt/tomcat-infinite/interface-engine/logs". Use "ls -lrt /opt/tomcat-infinite/interface-engine/logs" to see which files have been written to last and "tail" them looking for errors.
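
The same check can be wrapped in two small helpers. This is only a sketch: the function names are invented, and the search pattern is a guess at what is worth looking for. Lines starting "log4j:" are filtered out since they are harmless.

```shell
# Print the most recently modified file in a log directory.
latest_log() {
    dir=${1:-/opt/tomcat-infinite/interface-engine/logs}
    # "ls -t" sorts newest first; keep only the first entry.
    newest=$(ls -t "$dir" | head -n 1)
    [ -n "$newest" ] && printf '%s/%s\n' "$dir" "$newest"
}

# Show the last N (default 20) suspicious lines from that file.
recent_errors() {
    log=$(latest_log "$1") || return 1
    grep -iE 'error|exception|severe' "$log" | grep -v '^log4j:' | tail -n "${2:-20}"
}

# Example:
#   recent_errors /opt/tomcat-infinite/interface-engine/logs 50
```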

Tomcat is running but "/manager/" returns 404

This happens if selinux is enabled. Disable and reboot. Note that you may well need to reinstall everything - the user who reported this issue had many other "odd" problems, which all went away after he reinstalled.

MongoDB fails to start

This is normally one of the following issues:

  • One of the configuration servers is not running
  • The IP addresses in the "db.*" fields in "/opt/infinite-install/conf/infinite.configuration.properties" are incorrect. (Eg IP addresses are used not hostnames, and one of them has changed).
  • Some MongoDB-related problem (for which google is the best cure!)

To get some idea of the likely problem, the following 2 steps are useful:

  • Run "service mongo_infinite start" and see if the warnings and errors written to the console help narrow down the problem.
  • The MongoDB logs are in "/var/log/mongo".

Note that if the MongoDB install initially failed (eg due to a problem with the db.* configuration) then after it is fixed you need to run "/opt/db-home/setupAdminShards.sh" (on at least one DB node in the cluster) in order to load the default users. Also the geo database will not have been loaded - untar "/opt/infinite-install/data/feature/geo.bson.tar.gz" and then run "mongorestore /opt/infinite-install/data/feature/geo.bson".

It might be preferable to just uninstall the DB RPM and then install it again.

Note that we don't use the "mongod" start script, we use "mongo_infinite". You should never have to run things like "service mongod start", or start the mongos/mongod processes by hand.

User interface log-in issues

When I try to connect remotely to the GUI I get the following error: "Error 103 (net::ERR_CONNECTION_ABORTED)"

There is likely a firewall running on the Infinit.e server. Get the system admin to allow access on the web port (80 or 8080).

I can't even access the GUI or JSP webapps (eg I get a 404 error)

This is almost certainly because tomcat isn't running, or because the client browser does not have connectivity through to the server cluster.

If running locally (eg on a Centos6+ VM), try "http://localhost:8080" instead of "http://localhost". 

The GUI starts to load but then hangs at a grey screen (often the Manager webapp will still work)

This is always because a REST call back to the API (keepalive) has failed. There are 2 common reasons that have been seen for this:

  • The "ui.end.point.url" is incorrect in "/opt/infinite-install/config/infinite.configuration.properties", ie points to an IP address or hostname that is not accessible from the client browser.
  • If "ui.end.point.url" is not specified then the system tries to use the URL entered into the browser to get back to the API. Earlier versions of that code would sometimes fail and return 0 for the port instead of 80 (this has since been fixed). If you are running on an earlier version of the code and this is happening then just set "ui.end.point.url" by hand (or upgrade).
  • (If the crossdomain.xml is incorrect this will also occur, though this is less common - and only applies if the GUI files have been manually edited after installation, or if the API resides on a different logical hostname than the GUI)

I can log into the GUI but not the JSP webapps

This has been caused by running the wrong version of JAVA.

"/usr/bin/java" (normally a symbolic link) probably points to "/etc/alternatives/java" instead of "/usr/java/default/bin/java" - simply change the symbolic link. Also check JAVA_HOME isn't being set incorrectly (it should either be not set, or point to "/usr/java/default"), since elasticsearch uses that to select its JAVA binary.

This could also be caused by the "AppConstants.js" file containing an address (generated from "/opt/infinite-install/conf/infinite.configuration.properties") that is addressable from client machines but not internally (eg a DNS name on a DNS domain to which the server cluster does not have access). 

This could also be caused by the IP tables configuration being wrong (eg if the IP address has changed since the interface-engine RPM was run, "/etc/sysconfig/iptables" needs to be updated manually with the new address, eg "iptables -A OUTPUT -t nat -p tcp -d `hostname` --dport 80 -j REDIRECT --to-port 8080" then "service iptables save" as root).

I can't log into the GUI (but can log into the JSP webapps)

This could be caused by the "AppConstants.js" file containing an address (generated from "/opt/infinite-install/conf/infinite.configuration.properties") that is not addressable from the client machines hosting the web browser.

I can't log into the GUI or the JSP webapps

This is likely caused by the database being down. The URL "http://HOSTNAME/api/auth/login/ping/ping" can be useful in diagnosing issues - if it returns a JSON object saying "login failed" then the DB and index are both working, if you get an HTTP error (normally with some useful text - eg see below) then either the DB or index is down.
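
A sketch of scripting that check with "curl". The function names are hypothetical, and the match on the words "login failed" is an assumption about the reply text: the exact JSON layout is not documented here, so adjust the pattern to what your server actually returns.

```shell
# Interpret a ping/ping reply. Arguments: HTTP status code, response body.
# Assumption: a healthy reply contains "login failed" somewhere in its JSON.
classify_ping() {
    status=$1 body=$2
    if printf '%s' "$body" | grep -qi 'login failed'; then
        echo "DB and index are up"
    else
        echo "DB or index problem (HTTP $status)"
    fi
}

# Fetch the URL and classify the reply. The argument replaces HOSTNAME.
check_ping() {
    url="http://$1/api/auth/login/ping/ping"
    # "-w" appends the HTTP status code on its own final line.
    reply=$(curl -s -w '\n%{http_code}' "$url") || { echo "cannot reach $url"; return 1; }
    status=$(printf '%s\n' "$reply" | tail -n 1)
    body=$(printf '%s\n' "$reply" | sed '$d')
    classify_ping "$status" "$body"
}

# Example:
#   check_ping localhost:8080
```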

It could also be caused by the user not existing, or the password being incorrect. If you have sys admin access, check the database (social.person and security.authentication) for details. If not, contact a sys admin.

Sys admins: if the database didn't start properly during install for any reason (but a subsequent "service mongo_infinite restart" brought it to life), then you need to re-run "sh /opt/db-home/setupAdminShards.sh" to create the base users and communities.

It is also possible that this could be caused by restarting the DB without restarting the interface engine (eg the error "can't call something : /127.0.0.1:27017/social" from the "ping/ping" trick) - if so simply restart all the interface engines.

When I log into the GUI, I get error popup boxes (or when performing queries if already logged-in)

This often happens for one of 2 reasons:

  • Just after the system has been installed - when the harvester runs for the first time, it initializes some parts of the data store and index without which errors like this will occur. Check the 
    • (This can also happen during system operation just after new communities are added, if those communities are enabled in the GUI. Simply wait for the next harvest cycle and disable them from the GUI source manager in the meantime.)
  • Very occasionally the GUI "loses" a user's selection of enabled communities. Simply bring up the source manager, observe no communities are selected, and re-select those desired.
  • (System administrators) If the above doesn't work then check elasticsearch is running (eg "curl localhost:9200"). If anyone has manually modified the indexes then they might have become corrupted; the "infinit.e.mongo.indexer" with the "--verify" option can be used to correct this issue (see "Infinit.e maintenance - command line utilities and other scripts and log files").
    • (Note that the "ping/ping" trick discussed above under "I can't log into the GUI or the JSP webapps" can be used to determine if the index is running if you can't easily log in to the server)

(System administrators) While debugging a GUI problem I noticed some worrying errors in the "catalina.out" log file

(In "/opt/tomcat-infinite/interface-engine/logs")

There are some scary-looking but harmless entries in the log files that should be ignored:

  • anything starting "log4j:WARN"
  • anything starting "log4j:ERROR"

I updated the interface-engine RPM but the web pages/widgets didn't update as expected

Usually Infinit.e will prevent the browser from caching Flash objects such as the main GUI, the widgets, or the source monitor. If it appears that an old version is still present then the first step is always to clear the browser cache and refresh the page with CTRL+F5.

We also recommend turning off local Flash storage, eg by visiting the Macromedia control panel, and dragging the storage slider to "None".

For the management web pages (which are JSP based), it should not normally be necessary to clear the cache. However it has occasionally been necessary to clear out the "/opt/tomcat-infinite/interface-engine/work/" directory and restart the interface engine.

Harvesting

My source won't harvest

In the Source Editor GUI, use the "test" button to see if anything is returned. (This also returns more verbose errors in some cases.)

To inspect the source object, use the source monitor (<root>/InfiniteSourceMonitor.html), the Source Editor GUI, or one of the REST calls "config/source/get", "config/source/user", "config/source/good".

Check out the harvest object of the source object:

  • If the "harvest_status" is "in_progress", then either the harvest is ongoing or (less likely) the harvester has shut down uncleanly while the harvest was ongoing. In the first case, the harvested documents should be available soon. In the latter case, you may need to wait for a day.
  • If "harvestBadSource" is set to true, then the source has been disabled due to source-level errors: it is probably not working.
  • If "isApproved" is set to false, then either the source has been automatically disabled 
  • If "harvest_status" is "error" (or even "success" but no documents have been harvested): check the "harvest_message", it is possible that some non-fatal error due to source misconfiguration is occurring.
  • If the "harvest" object does not exist

Otherwise:

  • If "searchCycle_secs" is set to "-1" (or a very big number) then this is expected behavior. Setting it to "-1" is a standard way of temporarily disabling sources.
  • If the last harvest was a long time ago, it is possible the harvester is not running: check with the system administrator.

Other common issues:

  • If using AlchemyAPI or OpenCalais, it is possible the API key is not set, or has not been propagated. Get the system admin to check:
    • Check "/opt/infinite-home/logs/infinite.service.log" for error messages
    • Check "/opt/infinite-install/config/infinite.configuration.properties" and check the API key is set
      • Don't forget to then run "sh /opt/infinite-home/scripts/rewrite_property_files.sh" to propagate the changes
      • Check "/opt/infinite-home/config/infinite.service.properties" to confirm the changes.

And some other observed issues:

  • (Administrators) If it starts and then exits immediately with no error or log message, check the MongoDB feature database for a collection called sync_lock:
bash> mongo feature
mongos> db.sync_lock.find()

If the collection is non-empty, then there may be an orphaned lock in the database. "sync_lock" is normally only called from a batch script called "sync_features.sh" that is run at 3am on Sunday (see here for more details). This takes a few hours to run. Check that "sync_features.sh" is not running anywhere on the system and then from mongo run "db.sync_lock.drop()" to remove the orphaned lock. If "sync_features.sh" has been running for a long time then the database might have hung (in which case it may be necessary to restart and repair the DB).

I tried testing source and all I got was an alert dialog that complained about "undefined" objects

This is almost always because "Test Source" is trying to pop up a new browser window to show the results, but is being blocked by the browser. Some browsers will prompt you, others will just silently fail. It is an easy browser-dependent task to allow them (eg if Chrome doesn't prompt, there is a small red "x" to the right of the URL bar, which you can right click on etc).

After I published a source, the harvester hangs for ages and then crashes without finishing

There is a parameter that controls the maximum number of documents extracted per source per harvest cycle ("harvest.maxdocs_persource"). If this is set too high then the harvester can run out of memory (it is limited to 1 or 2GB to avoid interfering with the API or Lucene index). For sources that generate documents with many entities and associations (eg large rich PDFs), this default number is occasionally too high and should be reduced. (A roadmap issue is to limit sources on the number of features rather than the number of documents, which would stop this from happening).

I published a source and it just vanished!

Earlier versions of the Source Editor had a nasty tendency to do this if the JSON was invalid, but this has been fixed as of the June 2013 release (together with the addition of a built-in JSON validator and many other goodies).

One other way this can happen is if you create a new source, then paste an old published source into the editor. The "_id" and "key" fields will make the server think that you're editing an old source, so that will get overwritten (and the new source will seem to disappear). To avoid doing this, simply use the "Scrub" button (June 2013+; or remove by hand on earlier versions). 

(Administrators) The harvester appears to be running, but the log is empty

This often occurs if the "infinite-px-engine.sh" script is run as root. The correct way to run it by hand is one of the following (the first from a root shell, the second via sudo):

runuser - tomcat -c "/opt/infinite-home/bin/infinite-px-engine"
sudo runuser - tomcat -c "/opt/infinite-home/bin/infinite-px-engine"

If it is run as root, it will then take ownership of the log file and the normal harvester (which runs as tomcat) will not be able to write to it anymore. To fix the problem simply "chown tomcat.tomcat /opt/infinite-home/logs/infinite.service.log".
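
A quick way to confirm the diagnosis before changing anything, as a sketch. "owned_by" is a made-up helper, "stat -c %U" is the GNU coreutils form, and the log path in the example assumes the harvester log is the "infinite.service.log" file mentioned elsewhere on this page.

```shell
# Succeed if FILE is owned by USER, fail otherwise.
owned_by() {
    [ "$(stat -c %U "$1")" = "$2" ]
}

# Example (as root), assuming the log path below:
#   LOG=/opt/infinite-home/logs/infinite.service.log
#   owned_by "$LOG" tomcat || chown tomcat.tomcat "$LOG"
```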

I want to migrate my documents from one community to another (needs Administrators)

The UI recommends duplicating a new source in a different community and deleting the old one. This will only re-import documents that are still visible though (eg in the RSS stream/in the import directory/etc), so it may not be viable. There is an alternative available to users with administrative access to the cluster.

Assume you are moving source with key "K" from community with _id "A" to community "B". Perform the following steps:

  • Move the source:
    • mongo ingest --eval 'db.source.update({key:"K"},{$set:{communityIds:[ObjectId("B")]}});'
  • Move the documents in the database
    • mongo doc_metadata --eval 'db.metadata.update({sourceKey:"K"},{$set:{communityId:ObjectId("B")}},false,true);'
  • Synchronize the documents:
    • sh /opt/infinite-home/bin/infinite_indexer.sh --doc --query '{"sourceKey":"K"}'

This won't populate the feature.entity and feature.association tables that are used (primarily) for the search suggest function. These tables will be rebuilt during the weekly resync (Sat night/Sun morning). If necessary, the resync can be launched manually via "sh /opt/infinite-home/bin/sync_features.sh" on the elasticsearch master node. (Warning: can take a while and will impact performance significantly)
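
To avoid typos when substituting "K" and "B", the three commands can be generated rather than hand-edited. The sketch below only prints them for review; the function name is invented, and it assumes the key and _id contain no quote characters.

```shell
# Print the three migration commands for a source key and the _id of the
# target community. Nothing is executed; review the output and run it by hand.
migration_commands() {
    key=$1 community=$2
    cat <<EOF
mongo ingest --eval 'db.source.update({key:"$key"},{\$set:{communityIds:[ObjectId("$community")]}});'
mongo doc_metadata --eval 'db.metadata.update({sourceKey:"$key"},{\$set:{communityId:ObjectId("$community")}},false,true);'
sh /opt/infinite-home/bin/infinite_indexer.sh --doc --query '{"sourceKey":"$key"}'
EOF
}

# Example:
#   migration_commands "my.source.key" "<24-character community _id>"
```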

System administration

I restarted MongoDB and the database is now empty!

One possible reason for that is if the database was restarted using "/etc/init.d/mongod start|stop|restart" instead of the correct "/etc/init.d/mongo_infinite start|stop|restart". If you did this, just stop the existing mongod and restart using mongo_infinite.

I changed the configuration in infinite.configuration.properties but nothing happened even after I restarted the service

(Almost all configuration changes require at least an interface engine restart)

In order to distribute changes from the central "/opt/infinite-install/config/infinite.configuration.properties" to the configuration files read by the individual services (in "/opt/infinite-home/config" and "/opt/tomcat-infinite/interface-engine/conf"), there are 2 scripts that must be run:

  • "/opt/infinite-home/scripts/rewrite_property_files.sh" - always
  • "/opt/tomcat-infinite/interface-engine/scripts/create_appconstants.sh" - only if changes that affect the GUI have been made ("app.saas", "ui.*", "access.timeout") 

These must be run before the service restart.
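
The order of operations can be captured in a small wrapper. This is a sketch only: "propagate_config", "GUI_CHANGED" and "RUN" are names invented for the example, and the services to restart are passed in by the caller because they depend on what was changed.

```shell
# Apply a change made in the central properties file.
#   GUI_CHANGED=1  also regenerate the GUI constants ("app.saas", "ui.*",
#                  "access.timeout" changes)
#   RUN=echo       print the commands instead of running them (dry run)
# Arguments: names of the services to restart afterwards.
propagate_config() {
    ${RUN:-} sh /opt/infinite-home/scripts/rewrite_property_files.sh || return 1
    if [ "${GUI_CHANGED:-0}" = "1" ]; then
        ${RUN:-} sh /opt/tomcat-infinite/interface-engine/scripts/create_appconstants.sh || return 1
    fi
    for svc in "$@"; do
        ${RUN:-} service "$svc" restart || return 1
    done
}

# Example (dry run first, then for real as root):
#   RUN=echo propagate_config infinite-index-engine
#   propagate_config infinite-index-engine
```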

I changed the admin password (or username) in the infinite.configuration.properties file but nothing happened, even though I restarted the service and ran all the scripts mentioned above

The "admin.email" and "admin.password" parameters in the central "/opt/infinite-install/config/infinite.configuration.properties" are only read on install. To change the email addresses and passwords of any users post install, use the People tab of the Manager webapp.

I changed the configuration in one of the properties files in /opt/infinite-home/config and later the change was overwritten

As above, configuration changes should generally only be made to the central "/opt/infinite-install/config/infinite.configuration.properties" and then distributed using the 2 scripts mentioned above. Otherwise, running one of the 2 scripts or updating the config RPM will overwrite changes.

There is an exception: some "beta" or "internal" configuration parameters are ignored by the distribution scripts (the configuration infrastructure is in need of an overhaul...). These need to be inserted into "/opt/infinite-home/config/infinite.api.properties.TEMPLATE" or "/opt/infinite-home/config/infinite.service.properties.TEMPLATE", in which case the changes are copied correctly when the service-specific configuration files are rebuilt.

I couldn't change a user's password from the Manager web app

Older versions of the manager webapp had an encoding error for passwords whose hashes generated "/" characters. This has been fixed - if you are running on an older version then just use the "Save User Account" button instead of the "Update Password" button.

I would like to export the data in Infinit.e

There are a few options:

  • The MongoDB database is backed up nightly (overwriting the previous night's, unless S3 export is enabled) - see here, under "DB backups (mongodb)"
  • The "export to HDFS" option in the plugin manager can be used (if HDFS is not installed this data dump goes to ~tomcat/completed/<communityid>_/<jobtitle>).
  • You can cycle through the data using the API, as described below:

curl -o test.json 'http://ROOTURL/api/knowledge/document/query/*?infinite_api_key=API_KEY&input.sources=SOURCE_KEY&output.docs.numReturn=10000&score.relWeight=0&score.sigWeight=0&score.scoreEnts=false&output.docs.skip=SKIP'

Where SKIP is 0, 10000, 20000, etc (it takes about 1 minute/10K docs). There are a couple of ways of deciding when to stop:
  • stop when you get data: []
  • read the stats.found param and decide how many times to call it
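
A sketch of the second approach: read "stats.found" from the first reply by hand, then loop over the offsets. The function names and the output file naming are invented for the example; ROOTURL, API_KEY and SOURCE_KEY are the same placeholders as in the command above.

```shell
# Print the SKIP values needed to cover TOTAL documents in pages of SIZE
# (default 10000), one per line.
page_offsets() {
    total=$1 size=${2:-10000} skip=0
    while [ "$skip" -lt "$total" ]; do
        printf '%s\n' "$skip"
        skip=$((skip + size))
    done
}

# Download one page per offset into export_<skip>.json.
# Arguments: ROOTURL, API_KEY, SOURCE_KEY, total document count.
export_source() {
    rooturl=$1 api_key=$2 source_key=$3 found=$4
    for skip in $(page_offsets "$found"); do
        curl -s -o "export_$skip.json" "http://$rooturl/api/knowledge/document/query/*?infinite_api_key=$api_key&input.sources=$source_key&output.docs.numReturn=10000&score.relWeight=0&score.sigWeight=0&score.scoreEnts=false&output.docs.skip=$skip" || return 1
    done
}

# Example, for a source reporting 25000 documents in stats.found:
#   export_source ROOTURL API_KEY SOURCE_KEY 25000
```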

Copyright © 2012 IKANOW, All Rights Reserved | Licensed under Creative Commons