Monday, December 3, 2007

Around and Beyond SC07

Before I even noticed, two whole months flew by around that emotional and exhausting conference in Reno. I was unintentionally and indirectly reminded the other day that anxious readers have been waiting for news updates, so here comes the chronicle summary, I guess.

  • CIMA High Availability prototype finished before SC07. Several application-level bugs and protocol limitations were discovered and either fixed or scheduled for improvement. After enormous physical cable pulling tests in the machine room, we were finally confident about what to expect in various failure cases, and set up the two failover servers in geographically distributed locations. Well, 50 miles apart may not be much, but it's better than 5 inches.

  • SC07 was quite eventful: attended GCE07 because my name incidentally appeared in one of the papers; gave a booth presentation on the CIMA High Availability project; served as the mobile communication center between the Reno booth and the CIMA east coast headquarters; met old friends scattered in booths around the show floor; and most important of all, witnessed the great conquest of Data Capacitor winning the bandwidth challenge, which makes us all proud.

  • Heading back and onto the portal world, we started a project to replace open direct data links in the CIMA portal with on-demand GridFTP transfer services. The idea is rather straightforward; sorting through complex code structures resulting from a blend of Windows and Linux developers over the years has not been. A prototype is now in place after all the fun with jars, and the portlet is about twice as slow as the original. Our clients cheered whole-heartedly, "Impressive! Much faster than I expected!" Though paying a performance price in exchange for security is expected, we are still seeking ways to bargain.

  • iCenter project that collaborates with the biology department to establish data management infrastructures for a new light-microscopy core facility is steadily progressing. A couple of productive meetings acquiring user requirements have been held.

  • WIYN ODI Data Pipeline and Distribution project is also marching forward, and we had a whole day planning and discussion meeting with visitors from LSST and NOAO just recently.

  • Now talking about meetings, there have been a few others too, like the final TeraGrid weekly status meeting, a status meeting with the glorious Data Capacitor, and a strategic meeting preparation meeting. Does Skype video conferencing count?

Looking forward, a couple of other portal projects are in the lineup too, and I'm hoping to work/write more about the high availability project with DRBD and heartbeat in my copious free time, before it's completely spaced.

With the holiday season right around the corner, I'm very tempted to make a resolution about more dedicated blogging in the coming new year, via a youtube-banned video. So long, 2007.

Sunday, December 2, 2007

Leopard widget repair DIY

After (thankfully) upgrading to Leopard, a couple of my favorite dashboard widgets started to have problems. Since it's obviously hard for me to live without knowing the current weather outside, or the current date in the lunar calendar, I went hunting for those tiny little lines of code that broke under the latest Safari/Leopard. Here's a summary of how:

  • Sharpen the weapon: in the Developer Tools that come with Leopard, Dashcode is a nice debugging environment in which one can view, modify and test run widgets from source. Once installed, it can be found under /Developer/Applications/

  • Know the enemy: all downloaded third-party widgets are installed in ./Library/Widgets/ under the user's home directory, and the widget source can be viewed via the "Show Package Contents" option with a right mouse click.

  • Onto the weather: after downloading the WeatherBug Local Weather version released on Nov. 7, 2007, Leopard complains that it can't be installed because it's not a widget. Dashcode revealed the truth: the file Info.plist is missing. Actually it was not missing, but named info.plist, with a lowercase "i". Renaming it is all that's needed to get radar maps back on board.

  • Finally the date: the China Calendar widget stopped reacting to mouse clicks on any buttons under Leopard, and its "confirm" button appeared misaligned at the edges. Only the file mycal.html needs to be modified for a complete repair: replacing the parameter this.tag with in all mouse movement functions fixes the former, and pointing the source of genericButton.js to the system one at file:///System/Library/WidgetResources/button/genericButton.js fixes the latter.
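
The WeatherBug fix above boils down to a file name that differs only in case. As a rough sketch (not a tool from the original repair), the rename could be scripted in Python; the function name fix_info_plist_case and the temporary file name are made up for illustration, and the widget directory is whatever bundle you point it at.

```python
import os

def fix_info_plist_case(widget_dir):
    """Rename a mis-cased 'info.plist' to 'Info.plist'.

    Returns True if a rename happened, False if nothing needed fixing.
    """
    for name in os.listdir(widget_dir):
        if name.lower() == "info.plist" and name != "Info.plist":
            src = os.path.join(widget_dir, name)
            # Hypothetical temporary name: renaming in two steps also
            # works on case-insensitive volumes, where a direct rename
            # to a name differing only in case may be a no-op.
            tmp = os.path.join(widget_dir, "Info.plist.renaming")
            os.rename(src, tmp)
            os.rename(tmp, os.path.join(widget_dir, "Info.plist"))
            return True
    return False
```

It would be called with the path of the unpacked .wdgt bundle before installing it.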

One less thing to be unhappy about Leopard.

Friday, November 30, 2007

CIMA Portal Link

My contribution to make CIMA Portal more popular for the world.

Sunday, October 21, 2007

STONITH with DRBD and Heartbeat

Node fencing in Heartbeat is implemented by STONITH. Cutting the documentation chase short: suppose an external STONITH plugin already exists; how do you configure it with Heartbeat? Here comes our special feature for the week: what can /var/log/messages + Google do for you.

Before heading down the road, make sure the standalone stonith works:

stonith -t external/myplugin -T reset -p "NODE IP_ADDR USER STONITH_PASSWD_FILE" nodename

Now configure either stonith or stonith_host directive in /etc/ha.d/ Note that the two directives are mutually exclusive. Also note the set of directives automatically implied when crm is turned on.

For version 1 Heartbeat (i.e. crm no), this should be sufficient. However, during our test, after simulating a failure with kill -9 heartbeat_master_process_id and successfully STONITH-ing the node, resources did not migrate. /var/log/messages revealed the following:

ResourceManager: info: Running /etc/ha.d/resource.d/drbddisk mysql start
ResourceManager: ERROR: Return code 20 from /etc/ha.d/resource.d/drbddisk
ResourceManager: CRIT: Giving up resources due to failure of drbddisk::mysql
ResourceManager: info: Releasing resource group:

which means that one of our drbd resources failed to start on the new node, and heartbeat hence stopped all resources defined in the same group as the drbd resource. This turns out to be related to the heartbeat deadtime and the drbd ping time. Without much of a clue on how to tune the two parameters relative to each other, I simply followed the hack mentioned in a rather informative discussion and modified /etc/ha.d/resource.d/drbddisk. Simply increasing the value of the variable try to 20 in:

case "$CMD" in
# try several times, in case heartbeat deadtime
# was smaller than drbd ping time
fixed the problem.
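
The hack amounts to giving the "become primary" step more attempts before giving up. Here is a minimal Python analogue of that retry idea, not the drbddisk script itself: the function name retry, the one-second default delay, and the action callable are all assumptions for illustration.

```python
import time

def retry(action, tries=20, delay=1.0, sleep=time.sleep):
    """Call action() until it returns True or `tries` attempts are used up.

    Returns the number of attempts made on success, or 0 on failure.
    """
    for attempt in range(1, tries + 1):
        if action():
            return attempt
        if attempt < tries:
            # Wait before the next attempt, giving the peer time to
            # notice that the other node is really gone.
            sleep(delay)
    return 0
```

Note that the worst case takes roughly tries × delay, so whoever calls the script has to be willing to wait that long.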

More tweaks are required for version 2 Heartbeat (i.e. crm yes) though. The simplest way to configure STONITH with crm is still using /usr/lib64/heartbeat/ to automatically generate /var/lib/heartbeat/crm/cib.xml from /etc/ha.d/ However, /usr/lib64/heartbeat/ has a typo on the line:

if option_details[0] == "stonith_enabled" and enable_stonith:

where "stonith_enabled" should have been "stonith-enabled" instead. This bug results in STONITH being disabled in the generated /var/lib/heartbeat/crm/cib.xml:

<nvpair id="cib-bootstrap-options-stonith-enabled" name="stonith-enabled" value="false"/>

Fix either Python or XML to enable STONITH with crm.
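
For the XML route, a small sketch of flipping that nvpair with Python's standard ElementTree follows. It only works on an XML string; the wrapper elements in SAMPLE are an assumed, simplified layout around the nvpair line quoted above, and on a running cluster the change should go through cibadmin rather than a direct file edit.

```python
import xml.etree.ElementTree as ET

def enable_stonith(cib_xml):
    """Return cib_xml with every stonith-enabled nvpair set to "true"."""
    root = ET.fromstring(cib_xml)
    for nvpair in root.iter("nvpair"):
        if nvpair.get("name") == "stonith-enabled":
            nvpair.set("value", "true")
    return ET.tostring(root, encoding="unicode")

# Assumed, simplified structure; only the nvpair line comes from the post.
SAMPLE = (
    '<cib><configuration><crm_config>'
    '<cluster_property_set id="cib-bootstrap-options"><attributes>'
    '<nvpair id="cib-bootstrap-options-stonith-enabled" '
    'name="stonith-enabled" value="false"/>'
    '</attributes></cluster_property_set>'
    '</crm_config></configuration></cib>'
)

fixed = enable_stonith(SAMPLE)
```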

Progressing further, we again hit a similar problem: resources didn't migrate after a successful STONITH. This time /var/log/messages was more descriptive:

lrmd: info: RA output: (drbddisk_2:start:stderr) ioctl(,SET_STATE,) failed:
lrmd: info: RA output: (drbddisk_2:start:stderr) Permission denied Partner is already primary
lrmd: info: RA output: (drbddisk_2:start:stderr) Command '/sbin/drbdsetup /dev/drbd0 primary' terminated with exit code 20
lrmd: info: RA output: (drbddisk_2:start:stderr) drbdadm aborting

No sweat, it's the same old deadtime vs. ping time issue. These messages repeated 5 times in the log, indicating 5 attempts by /etc/ha.d/resource.d/drbddisk. But wait, didn't we already increase "try" to 20? Why only 5? Well, notice this in the log?

lrmd: WARN: on_op_timeout_expired: TIMEOUT: operation start on heartbeat::drbddisk::drbddisk_2 for client 22836, its parameters: CRM_meta_op_target_rc=[7] 1=[mysql] CRM_meta_timeout=[5000] crm_feature_set=[1.0.7] .
crmd: ERROR: process_lrm_event: LRM operation drbddisk_2_start_0 (18) Timed Out (timeout=5000ms)

Apparently that 5000ms timeout to start this drbd resource is a bit too short, which can be changed in corresponding primitive sections in /var/lib/heartbeat/crm/cib.xml by adding something like:
<op id="drbddisk_2_start" name="start" timeout="60s"/>

Be sure to extend the start timeout for both the heartbeat resource drbddisk and its corresponding ocf resource for the file system. More details about cib.xml can be found in /usr/lib64/heartbeat/crm.dtd. Some constraint examples are particularly helpful in understanding and configuring preferred stonithd resource locations using INFINITY/-INFINITY scores.
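
To find out which operations are hitting a timeout in the first place, the crmd "Timed Out" line quoted earlier can be picked out of the log with a regular expression. This is only a sketch based on that one sample line; the pattern and the function name find_timeouts are assumptions, and other Heartbeat versions may word the message differently.

```python
import re

# Pattern derived from the single crmd line shown in the post.
TIMEOUT_RE = re.compile(
    r"LRM operation (\S+) \(\d+\) Timed Out \(timeout=(\d+)ms\)"
)

def find_timeouts(lines):
    """Yield (operation, timeout_ms) for each crmd 'Timed Out' line."""
    for line in lines:
        match = TIMEOUT_RE.search(line)
        if match:
            yield match.group(1), int(match.group(2))
```

Feeding it the lines of /var/log/messages gives a quick list of operations whose timeout may need extending.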

Sunday, October 14, 2007

Tunnel Light

It had been a reasonably quiet and productive week.

  • After enough messing around with credentials, certificates, and things like that, I witnessed some GridFTP portal displaying my home directories on both Data Capacitor and HPSS, how rewarding.

  • Another WIYN/ODI project meeting with astronomers, indicating yet another potential CIMA collaboration. The starting point of interest seems to be resuming old RoboScope work for SpectraBot.

  • Application-level failover testing for CIMA. The system now seems to be capable of surviving most failures/outages except for one, which is rooted deep down on the instrument data collection side.

  • Tracked down the annoying "There is something wrong" problem with the heartbeat configuration. Bless our poor souls that suffered this message for months.

  • Reviewed a paper for GCE07, and also got back "warm words" for the one with my name on it.

More failover testing awaits in the week ahead, mostly at the system level. If we can tune things up across distances by Friday, I'll see that light at the end of the tunnel better. I mean the tunnel of SC07, of course.

Friday, October 12, 2007

Making of Heartbeat2 Resources: Is There Something Wrong?

The Linux HA project is a rather powerful and useful high availability solution, yet it can be painful too, especially for those lost in documentation. We have many stories to tell, but this one amuses me most, by far.

In short, Heartbeat2 is capable of monitoring individual resources in addition to cluster nodes through the Cluster Resource Manager (CRM). When a failed resource is detected, it will try to restart the resource on the same node. Our code hasn't been evil enough to put the following statement from the FAQ to the test though: "The node will try to restart the resource, but if this fails, it will fail over to an other node. A feature that allows failover after N failures in a given period of time is planned."

To turn on the CRM resource monitor feature:
  1. Modify /etc/ha.d/ to include the line crm yes

  2. Configure resources to be managed in /etc/ha.d/haresources. Though no longer used in Heartbeat2, we found it easier to just go through the conversion path.

  3. Clean any old CRM configurations:
    rm -f /var/lib/heartbeat/crm/cib.xml*

  4. Generate the fresh configuration file /var/lib/heartbeat/crm/cib.xml:
    python /usr/lib64/heartbeat/

  5. Start Heartbeat daemon:
    /etc/init.d/heartbeat start

If some resource cannot be started correctly, it is likely that its corresponding init script is not LSB compliant. Just test and fix accordingly. For example, the default httpd and mysqld init scripts that come with RHEL4 distributions are not LSB compliant. A conservative approach is to make a copy from /etc/init.d/ to /etc/ha.d/resource.d/, and modify from there.

Once all heartbeat-managed resources have been started and are running correctly, failover scenarios can be simulated and tested through forced node takeover, using crm_standby.

To modify anything in /var/lib/heartbeat/crm/cib.xml, using cibadmin is required. For example:

cibadmin -U -o resources -X '<op id="test_8_mon" interval="1s" name="monitor" timeout="2s"/>'

changes the interval and timeout values for monitoring our test resource.
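
When several resources need similar tweaks, the XML fragment and the argument list can be generated instead of typed by hand. The sketch below only mirrors the cibadmin call shown above (same -U, -o resources and -X options); the helper names monitor_op_xml and cibadmin_args are hypothetical, and nothing here actually runs cibadmin.

```python
import xml.etree.ElementTree as ET

def monitor_op_xml(op_id, interval, timeout):
    """Build the <op .../> fragment for a monitor operation."""
    op = ET.Element("op", {
        "id": op_id,
        "interval": interval,
        "name": "monitor",
        "timeout": timeout,
    })
    return ET.tostring(op, encoding="unicode")

def cibadmin_args(fragment):
    """Argument list mirroring the cibadmin call shown in the post."""
    return ["cibadmin", "-U", "-o", "resources", "-X", fragment]

args = cibadmin_args(monitor_op_xml("test_8_mon", "1s", "2s"))
```

Passing such a list to subprocess.run avoids any shell quoting trouble around the XML fragment.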

Still not too bad? Well, it's not finished yet if you start to notice this message in /var/log/messages: (BTW, "ouput" is NOT my typo.:)

WARN: There is something wrong: the first line isn't read in. Maybe the heartbeat does not ouput string correctly for status operation. Or the code (myself) is wrong.

It may not seem so harmful when predefined resources and services are all up and running, and can migrate between nodes without any problem. However, we experienced mysterious heartbeat behaviors from time to time when such messages were flooding /var/log/messages. For example, when our failed test resource was detected but could not be restarted, heartbeat started to just completely ignore the resource. No further actions were taken. This may not be related, but it hasn't occurred since the above error was eliminated.

So where is "something wrong"? The Resource Agent. Examining /var/lib/heartbeat/crm/cib.xml closely, we noticed that resources like httpd, mysqld, and our test resource are all defined as Heartbeat Resource Agents, which "are basically LSB init scripts - with slightly odd status operations". Odd where?
The status operation has to really report status correctly, AND, it has to print either OK or running when the resource is active, and it CANNOT print either of those when it's inactive. For the status operation, we ignore the return code.

This sounds quite odd, but it's a historical hangover for compatibility with earlier versions of Linux which didn't reliably give proper status exit codes, but they did print OK or running reliably.

Heartbeat calls the status operation in many places. We do it before starting any resource, and also (IIRC) when releasing resources.

After repeated stop failures, we will do a status on the resource. If the status reports that the resource is still running, then we will reboot the machine to make sure things are really stopped.

I.e., a running resource should literally print OK or running with the status operation, nothing more. Both httpd and mysqld use the status function in /etc/rc.d/init.d/functions, which prints echo $"${base} (pid $pid) is running..." instead. Ding~. Changing it to echo $"running" finally eliminates the annoying message from /var/log/messages. Of course, being conservative, I copied /etc/rc.d/init.d/functions to /etc/ha.d/resource.d/functions, and modified it correspondingly.
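
A quick way to sanity-check a script's status output against this rule is sketched below. It encodes this post's strict reading (exactly OK or running when active, neither word when inactive), which is an assumption about what keeps Heartbeat quiet, not a description of Heartbeat's own parsing; the function name status_output_ok is made up.

```python
def status_output_ok(output, active):
    """Check a status operation's output against the rule quoted above.

    Strict reading: an active resource prints exactly "OK" or "running";
    an inactive one prints neither word anywhere in its output.
    """
    text = output.strip()
    if active:
        return text in ("OK", "running")
    return "OK" not in text and "running" not in text
```

Under this reading, the stock "httpd (pid 1234) is running..." message fails the check for an active resource, while a bare "running" passes.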

Monday, October 8, 2007

Across Grids: got CA certificates?

I had a problem with GridFTP between gf1 and BigRed earlier. On

$globus-url-copy -vb gsi gsi

error: globus_ftp_control: gss_init_sec_context failed
OpenSSL Error: s3_clnt.c:842: in library: SSL routines, function SSL3_GET_SERVER_CERTIFICATE: certificate verify failed
globus_gsi_callback_module: Could not verify credential
globus_gsi_callback_module: Could not verify credential: self signed certificate in certificate chain

It turns out that I was missing the gf1 server certificates in ~/.globus/certificates/. After copying them from /etc/grid-security/certificates/, all was happy.
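
The copy itself is a one-liner in the shell, but a slightly more careful version that never overwrites files already in place can be sketched in Python. The function name copy_ca_certs is hypothetical; the two directories are the ones mentioned above.

```python
import os
import shutil

def copy_ca_certs(src_dir, dest_dir):
    """Copy regular files from src_dir into dest_dir, skipping existing ones.

    Returns the sorted list of file names that were copied.
    """
    os.makedirs(dest_dir, exist_ok=True)
    copied = []
    for name in sorted(os.listdir(src_dir)):
        src = os.path.join(src_dir, name)
        dest = os.path.join(dest_dir, name)
        # Leave anything the user already has in place untouched.
        if os.path.isfile(src) and not os.path.exists(dest):
            shutil.copy2(src, dest)
            copied.append(name)
    return copied
```

Here it would be called as copy_ca_certs("/etc/grid-security/certificates", os.path.expanduser("~/.globus/certificates")).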

Around BigRed: Available File Systems

Refer to the IU Research Systems Disk Space Guide for more detailed descriptions. On BigRed in particular, general users can access several file systems as follows:

  • Home directory: /N/u/[username]/BigRed

  • Local scratch: /scratch/[username]

  • Shared scratch (GPFS): /N/gpfs/[username]

  • Data Capacitor scratch (Lustre): /N/dc/scratch/[username]

Note that Data Capacitor project space is accessible for project group members via /N/dc/projects/[projectname] .
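
For scripts that move data around, the per-user locations listed above can be kept in one place. A tiny sketch follows; the dictionary keys are labels made up here, and only the path patterns come from the list.

```python
def bigred_paths(username):
    """Map each file system listed above to its path for `username`."""
    return {
        "home": "/N/u/{0}/BigRed".format(username),
        "local_scratch": "/scratch/{0}".format(username),
        "shared_scratch_gpfs": "/N/gpfs/{0}".format(username),
        "dc_scratch_lustre": "/N/dc/scratch/{0}".format(username),
    }
```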

Friday, October 5, 2007

Down the Rabbit Hole

Sometime in late September, I transferred to this new group focusing on portal and science gateway development, where everyone blogs everything. While down the rabbit hole, one might as well keep up with the trend. And in case you wonder, yes it's from Tales of Symphonia.

Meetings, meetings, Meetings.

It had been a couple of quite active weeks, some of us met even more than once:

  • Regular Data Capacitor group meetings: SC07 is the theme. A separate GridFTP server to DC is on the way too.

  • Portal group meetings: dissertation work review, especially for CIMA, in 10 slides or less to impress general audience for possible collaborations (SNS at ORNL, etc.); setting priorities for projects.

  • WIYN/ODI project meeting with astronomers: short presentations, getting acquainted, and background stories.

  • RT-ALL strategy meeting: long

  • Regular TeraGrid meetings: recoup

  • Astronomy department brown bag seminar: using Data Capacitor for ODI project

  • Tech Tuesday talk: preserving data objects

  • CS/PTL/UITS monthly meeting: infoshare


  • Got a TeraGrid account, logged in, tried single sign-on, tested GridFTP between DC and HPSS.

  • Updated CIMA web service for Admin portlet to indicate number of movies associated with a given sample.

  • Oh yeah, of course wrote this blog, using correct HTML under right editor, you think that's easy?


  • Couldn't out-wait Apple on Leopard, hence went ahead and got the MBP. A beauty, and, (un)fortunately hot.

  • ORNL is renewing my badge, oh the number of emails and resumes it takes.

  • Made SC travel arrangement, at least that one was not bad at all.

To Do

  • More testing with GridFTP, BigRed, DC, HPSS, TeraGrid credentials. What exactly is that SC portal supposed to be or do?

  • Extensive CIMA failover testing, both at application and system levels. Targeted service-up date is Oct. 19th.

  • 10-15 minutes SC07 presentation on CIMA failover, better start thinking about slides.

  • Read up on NVO before next meeting with astronomers, Tuesday Oct. 9th.

GridFTP with BigRed/Data Capacitor and HPSS - Idiot's Guide

Finally I received a username and password to open the magic door to TeraGrid. In another moment down went Alice after it, never once considering how in the world she was to get out again.

Simplest steps to GridFTP zeros with BigRed and HPSS follow:

  1. Refer to the general TeraGrid single sign-on guide. An example session from
    $ export GLOBUS_LOCATION=/home/globus/nmi-8.0-rh9/
    $ source /home/globus/nmi-8.0-rh9/etc/
    $ export
    $ export MYPROXY_SERVER_PORT=7514
    $ mkdir -p ~/.globus/certificates
    $ cd ~/.globus/certificates
    $ wget
    $ wget
    $ wget
    $ myproxy-logon -T -l myusername
    Enter MyProxy pass phrase:
    A credential has been received for user myusername in /tmp/x509up_u1209.

  2. Login to BigRed using gsissh
    $ gsissh

  3. Start transferring zeros:
    $globus-url-copy -vb gsi gsi
    8388608000 bytes 75.40 MB/sec avg 36.36 MB/sec inst

  4. Not bad, right? ;) Just make sure the absolute file paths of both source and destination are correct.

  5. Oh, and my ~/.soft on BigRed looks like:

Kerberos, Gridsphere, and a bit more

I tried to develop a portal to enable easy access between Data Capacitor and HPSS for general users authenticating through Kerberos at the beginning of the year. Here are some old notes from back then, when I struggled to install all different pieces together.

I. Prerequisite of Software, versions as of 03/12/2007
  1. Apache Ant 1.7.0
  2. Apache Tomcat 5.5.20
  3. Sun JDK 1.6
  4. GridSphere 2.2.8
  5. Apache2 web server
  6. Secure Perl web services require the packages SOAP::Lite and Crypt-SSLeay
  7. Axis 1.4 is needed for the wsdl2java tool to convert the web services WSDL document into Java code.

II. Installation Tweaks for authentication and security

a.) SSL configuration for Tomcat (Reference Tomcat Howto)
  1. Create a certificate keystore by executing the following and specify a password:
    $JAVA_HOME/bin/keytool -genkey -alias tomcat -keyalg RSA

  2. Uncomment the "SSL HTTP/1.1 Connector" entry in $CATALINA_HOME/conf/server.xml and tweak as necessary, particularly defining the attribute "keystorePass" with the chosen password from the previous step.

    <!-- Define a SSL HTTP/1.1 Connector on port 8443;
         keystorePass below is a placeholder for the password chosen in step 1 -->
    <Connector port="8443" maxHttpHeaderSize="8192" maxThreads="150"
    minSpareThreads="25" maxSpareThreads="75" enableLookups="false"
    disableUploadTimeout="true" acceptCount="100" scheme="https"
    secure="true" clientAuth="false" sslProtocol="TLS"
    keystorePass="your_keystore_password" />

b.) Kerberos configuration for GridSphere
With the following configuration, existing GridSphere portal users can authenticate through a designated Kerberos server, assuming /etc/krb5.conf is valid.
  1. Modify the <auth-module> section for "GridSphere JAAS" in $CATALINA_HOME/webapps/gridsphere/WEB-INF/authmodules.xml, setting <active> to true. Note that the priority number in different <auth-module> sections indicates the fallback orders of multiple authentication schemes. Smaller numbers are associated with higher priorities.

  2. Create a file $CATALINA_HOME/conf/jaas.conf as following:

    Gridsphere { required;

  3. Modify $CATALINA_HOME/bin/ to include the following:

    export JAVA_OPTS="-Djava.security.auth.login.config=$CATALINA_HOME/conf/jaas.conf"

c.) HTTPS configuration for Apache2 web server on Fedora (Reference Apache Installation and Configuration Guide on Fedora Core)
  1. Create a new CA certificate

    [root@localhost root]# cd /usr/share/ssl/misc
    [root@localhost misc]# ./CA -newca
  2. Create a Certificate Signing Request (CSR)

    [root@localhost misc]# ./CA -newreq
  3. Sign the CSR

    [root@localhost misc]# ./CA -sign

  4. Store certificates in a directory

    [root@localhost var]# mkdir myCA
    [root@localhost var]# cd myCA
    [root@localhost myCA]# cp /usr/share/ssl/misc/demoCA/cacert.pem .
    [root@localhost myCA]# cp /usr/share/ssl/misc/newcert.pem ./servercert.pem
    [root@localhost myCA]# cp /usr/share/ssl/misc/newreq.pem ./serverkey.pem
    [root@localhost myCA]# ls
    cacert.pem servercert.pem serverkey.pem
    [root@localhost myCA]# cd /var/myCA
    [root@localhost myCA]# cp servercert.pem /etc/httpd/conf/ssl.crt/server.crt
    cp: overwrite `/etc/httpd/conf/ssl.crt/server.crt'? y
    [root@localhost myCA]# cp serverkey.pem /etc/httpd/conf/ssl.key/server.key
    cp: overwrite `/etc/httpd/conf/ssl.key/server.key'? y

  5. Edit ssl.conf (optional): open ssl.conf for editing, and uncomment and edit the following directives. You may want to change DocumentRoot to point to another directory, such as /var/www/ssl, and place your SSL files inside there instead.


  6. Require SSL (Data Capacitor specific): edit httpd.conf, comment the section that listens on port 80, and add SSLRequireSSL and Options ExecCGI to CGI directory configuration. e.g.

    ScriptAlias /cgi-bin/ "/var/www/cgi-bin/"
    <Directory "/var/www/cgi-bin">
    SSLRequireSSL
    Options ExecCGI
    AllowOverride None
    Order allow,deny
    Allow from all
    </Directory>

  7. Disabling the passphrase on startup (optional): to start up Apache automatically on boot without user intervention, the passphrase prompt can be disabled by simply decrypting the server key.

    # cd /etc/httpd/conf/ssl.key
    # cp server.key server.bak
    # openssl rsa -in server.bak -out server.key

d.) Java SSL configuration with self-signed certificates (Reference here)
When opening an SSL connection to a host using self-signed certificates in Java, the following exceptions may be thrown:
ValidatorException: PKIX path building failed:
SunCertPathBuilderException: unable to find valid certification path to requested target.

To add the server's certificate to the KeyStore of trusted certificates, a simple solution is to compile and run the InstallCert program:

java InstallCert hostname

It displays the complete certificate and adds it to a Java KeyStore 'jssecacerts' in the current directory. Either configure JSSE to use it as the trust store, or copy it into $JAVA_HOME/jre/lib/security directory. For all Java applications to recognize the certificate as trusted and not just JSSE, you could also overwrite the cacerts file in that directory.

e.) Secure web services configuration:
  1. Specify the https location in the <service> tag of WSDL

  2. To encode information in both the SOAP header and body, refer to the WSDL specification, or Chapter 9 of "Programming Web Services with Perl".

III. GridSphere Tips
  1. To share a variable among different portlets within an application, use setAttribute and getAttribute of PortletSession at "APPLICATION_SCOPE".

  2. For simple persistence of user information between logins, use setups for PortletPreferences.

  3. To forward username and password upon login to external secure web services, modify the login function in src/org/gridlab/gridsphere/services/core/user/impl/

  4. To change logout behavior, modify the logout function in src/org/gridlab/gridsphere/servlets/

  5. Given a WSDL document, use the wsdl2java tool in the Axis package to generate the corresponding Java code; compile it with:

    javac -d . -classpath $CP *.java

    create a jar file with:

    jar -cf mywebservices.jar MyWebServices_pkg/

    and finally put the jar file in the corresponding lib directory of the GridSphere portlet application. Note that the jar files in the lib directory of Axis need to be in the classpath.