This article will explain how to deploy a basic Endeca pipeline quickly and easily using the Endeca deployment template.
Blue Fish Development Group
Endeca pipelines can be deployed in any number of ways, though some ways are better than others. How you implement and deploy Endeca pipelines can impact the maintainability and sustainability of your solution, and a good, clean deployment process can reduce the risk of future complications early on. Thankfully, Endeca has recently introduced a generic deployment template which is based on best practices and deployment process standards.
This article will highlight the benefits of using the deployment template, address some common pitfalls of non-standardized deployments and explain the basic scripting mechanisms provided by the template. It is assumed that the reader has a basic understanding of developing Endeca pipelines and is deploying pipelines using at least Endeca Information Access Platform 5.0.
An Endeca pipeline deployment consists of configuring pipeline components and scripts in the Endeca Application Controller (or Job Control Daemon) and ensuring the pipeline can be kicked off via a script so that it can run on a schedule.
Prior to the introduction of Endeca
Because there was no mandatory way to configure an Endeca pipeline or its supporting scripts, each developer chose a preferred method, which made it difficult for another developer to understand the configuration. If the pipeline used auxiliary scripts, the developer may not have written them with the provided scripting utility, which made it harder to coordinate scripts with the Endeca Application Controller and to share scripts with other projects. Additionally, logging and index archival typically did not exist, often resulting in the loss of essential log files and indices.
With the advent of the deployment template, all the guesswork is taken out of deploying an Endeca pipeline.
The release of Endeca’s generic deployment template enables developers to standardize the deployment process. The deployment template is a set of scripts and configuration which takes care of a project’s deployment requirements. The details of deployment requirements will be explained in depth later in the article. The template provides many features to ease the deployment process. Some of the features provided are:
These features are what make the deployment process easy, repeatable and supportable. Anyone with experience using the template will easily understand any pipeline, its configuration, and its supporting scripts.
Before we get too far, I want to provide a quick overview of what EAC, or the Endeca Application Controller, is and what it does. The Endeca Application Controller, introduced in Endeca IAP 5.0, is a web service which manages the pipeline configuration, environment configuration and scripts. EAC is also used to execute pipeline components and scripts it has been configured to run.
The user interface for EAC is a web application called Web Studio which can be used to view or update configuration information and execute scripts and components. Since EAC runs as a web service, developers are not bound to using Web Studio to interact with it. The deployment template configures EAC during initialization and updates, and developers are also free to interact with it.
The deployment template may be downloaded from Endeca’s support site. The latest version at the time of this article is 2.1.
In addition to its supporting install files, the deployment template unzips to provide two scripts in the bin directory: deploy.bat for Windows and deploy.sh for Unix/Linux environments. Running the deploy script prompts the user to verify the installed version of Endeca and then asks a few questions: project name, installation location, and EAC port (the default EAC port is 8888). The script then configures the template for the user’s Endeca environment and installs it in the provided path.
There are a few files that are very relevant to using the template successfully. This list is a breakdown of the most important paths and scripts provided:
Note: In this article, it is assumed the template was installed under Windows, so all scripts have the .bat extension. Had the template been installed in a *nix environment, the scripts would instead have a .sh extension. All behavior and functionality should remain the same.
Most of the default configurations and scripts that the deployment template provides in the AppConfig are sound, but a few things are worth double-checking.
Installing the pipeline involves removing the wine pipeline and its specific configuration, and replacing it with the new pipeline. To accomplish this:
Once the engine is running, the index may be viewed with the provided reference application.
EAC scripts are written in Java (supported by the BeanShell framework) and live in the AppConfig XML file. When the deployment template initializes or updates EAC with the AppConfig, the provided scripts are included as well. Once a script is in EAC, it can be kicked off from Web Studio or by sending a command from the command prompt. A few scripts are provided ‘out of the box’ and are a great starting point for writing scripts to meet your specific needs. The baseline script, for instance, archives the logs and the index, kicks off a full crawl and then restarts the engine. Developers can also write whatever additional scripts their project requires.
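The ordering of the baseline script described above can be sketched in plain Java. This is only an illustration under stated assumptions, not the script shipped with the template and not the EAC scripting API: the class `BaselineSketch`, the method `runInOrder` and the step names are hypothetical, and stopping at the first failed step (so the engine is not restarted after a failed crawl) is a design choice made for the sketch.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;

// Hypothetical sketch: NOT the template's baseline script or the EAC API.
class BaselineSketch {

    // Runs the steps in insertion order and returns the names of the steps
    // that completed. Stops at the first step that returns false or throws.
    static List<String> runInOrder(Map<String, Callable<Boolean>> steps) {
        List<String> completed = new ArrayList<>();
        for (Map.Entry<String, Callable<Boolean>> step : steps.entrySet()) {
            boolean ok;
            try {
                ok = step.getValue().call();
            } catch (Exception e) {
                ok = false; // treat any exception as a failed step
            }
            if (!ok) {
                break; // later steps (e.g. the engine restart) are skipped
            }
            completed.add(step.getKey());
        }
        return completed;
    }

    public static void main(String[] args) {
        // Placeholder steps; a real script would call into EAC here.
        Map<String, Callable<Boolean>> steps = new LinkedHashMap<>();
        steps.put("archive-logs", () -> true);
        steps.put("archive-index", () -> true);
        steps.put("full-crawl", () -> true);
        steps.put("restart-engine", () -> true);
        System.out.println(runInOrder(steps));
    }
}
```

A `LinkedHashMap` is used so that the steps run in the order they were added, which mirrors the archive, crawl, restart sequence.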
Consider a project that has a pipeline component which depends on data retrieved from an LDAP server. The developer can write an EAC script that retrieves the data from LDAP and stores it locally. After the data is retrieved, the script can kick off the crawl which processes the LDAP data. Another scenario I’ve personally encountered is an indexing process that depends on the completion of several concurrent processes. EAC can solve this with its provided lock manager, one of the many features of the EAC scripting framework, which synchronizes the processes. Each concurrent process obtains a lock from the lock manager, and the indexing process waits for all of the obtained locks to be released before executing.
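The wait-for-all-locks pattern in the second scenario can be illustrated with standard Java concurrency classes. This is a minimal sketch of the idea only; it does not use the EAC lock manager. The class `LockGateSketch`, the method `runAfterAll` and the process names are hypothetical, and a `CountDownLatch` stands in for the set of locks.

```java
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;

// Hypothetical stand-in for the idea above; this is NOT the EAC lock manager.
class LockGateSketch {

    // Starts every prerequisite on its own thread, then records the "index"
    // step only after each prerequisite has released its lock.
    static List<String> runAfterAll(List<String> prerequisites) {
        List<String> events = new CopyOnWriteArrayList<>();
        CountDownLatch locks = new CountDownLatch(prerequisites.size());

        for (String name : prerequisites) {
            Thread worker = new Thread(() -> {
                try {
                    events.add("done:" + name); // the real work would happen here
                } finally {
                    locks.countDown();          // release this process's lock
                }
            });
            worker.start();
        }

        try {
            locks.await();                      // wait until every lock is released
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        }
        events.add("index");                    // indexing starts only now
        return events;
    }

    public static void main(String[] args) {
        System.out.println(runAfterAll(Arrays.asList("ldap-export", "feed-a", "feed-b")));
    }
}
```

The prerequisite entries may appear in any order, since the workers run concurrently, but the `index` entry is always last because it is added only after `await()` returns.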
The implementation details of EAC scripts are beyond the scope of this article, but the scripts that are provided out of the box are a great place to start. EAC scripting is the optimal solution for supporting pipelines; its flexibility allows even the most complex problems to be solved. Check back later for an article on EAC scripting.
The deployment template makes it easy to deploy Endeca pipelines quickly and provides excellent configuration management for the pipeline without additional development time. As long as the deployment template is being used, developers need not worry about a new index wiping out an old one or about log files overwriting each other, and configuring archaic control systems is no longer a concern either. The deployment template will quickly become an essential part of any Endeca toolkit after just one use.
Mail questions or comments to:
Apaar Trivedi (atrivedi@bluefishgroup.com)