Wednesday, July 4, 2012

MetCat Test-suite - Airavata workflow publisher client


Recently we added a new out-of-the-box module for MetCat called test-suite. It is a workflow publisher client that can emulate workflows and pre-populate data. Using test-suite, you can emulate a heavy workload for MetCat, and you can also use it to justify performance metrics.


Test type: run_from_registry


In the 0.1 release of the test-suite, we have integrated only one test type, called run_from_registry. When test-suite runs a run_from_registry test case, it takes the saved workflow-templates from the Airavata Registry, spawns workflows from those templates, and submits the workflows to an Airavata server to execute in parallel. Here a workflow is an instance of a workflow-template with a unique topic ID.

In this test type, test-suite assigns each workflow to a separate thread, and the threads are assigned to a fixed-size thread pool. Each thread is responsible for executing its assigned workflow. Test-suite uses the Airavata Client API to communicate with the Airavata Server.
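
The threading model described above can be sketched roughly as follows. This is only an illustration, not the test-suite's actual code: the WorkflowSubmitter interface is a hypothetical stand-in for the Airavata Client API call, the topic ID format is made up, and the sketch assumes Java 8 or later for the lambda syntax.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.UUID;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class WorkflowPoolSketch {

    /** Hypothetical stand-in for a call made through the Airavata Client API. */
    interface WorkflowSubmitter {
        String submit(String templateId, String topicId);
    }

    /** Runs one workflow per template on a fixed-size pool and collects the results in order. */
    static List<String> runAll(List<String> templateIds, int poolSize, WorkflowSubmitter submitter) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        try {
            List<Future<String>> futures = new ArrayList<>();
            for (String templateId : templateIds) {
                // A workflow is a template instance with its own unique topic ID (format assumed).
                String topicId = templateId + "-" + UUID.randomUUID();
                futures.add(pool.submit(() -> submitter.submit(templateId, topicId)));
            }
            List<String> results = new ArrayList<>();
            for (Future<String> future : futures) {
                results.add(future.get()); // wait for this workflow to finish
            }
            return results;
        } catch (Exception e) {
            throw new IllegalStateException("workflow execution failed", e);
        } finally {
            pool.shutdown();
        }
    }
}
```

Because the pool size is fixed, at most poolSize workflows run at the same time, and the remaining tasks wait in the pool's queue.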


Test Mode
A run_from_registry test can be run in two modes.

  • Parallel
  • Loop



    Parallel

    In Parallel mode, test-suite schedules each workflow template to execute exactly N times on separate threads in parallel, and then executes them. Suppose two workflow-templates are saved in the Registry and you have defined the repeat count (N). Test-suite will then execute 2N workflows in total: N from one template and N from the other.


    Loop

    In Loop mode, test-suite schedules each workflow template to execute N times on separate threads in parallel and keeps executing them until you terminate the test-suite. I am not saying it runs an infinite number of times, but it would take several years to finish both scheduling and executing the workflows in this mode. So when you want to terminate the test-suite, feel free to press Ctrl + C.
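
To make the difference between the two modes concrete, here is a small sketch of a schedule that hands out workflows one at a time. It is an illustration under assumptions, not the test-suite's real scheduler: the class name and the topic ID format are hypothetical, and Loop mode is modelled simply as restarting the same round of N workflows per template until the caller stops asking.

```java
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

class WorkflowSchedule implements Iterator<String> {
    private final List<String> templateIds;
    private final int repeat;    // N: runs per template in one round
    private final boolean loop;  // false = Parallel mode, true = Loop mode
    private long issued = 0;     // workflows handed out so far

    WorkflowSchedule(List<String> templateIds, int repeat, boolean loop) {
        this.templateIds = templateIds;
        this.repeat = repeat;
        this.loop = loop;
    }

    @Override
    public boolean hasNext() {
        if (templateIds.isEmpty() || repeat <= 0) {
            return false;
        }
        // Parallel mode stops after templates x N workflows; Loop mode never stops by itself.
        return loop || issued < (long) templateIds.size() * repeat;
    }

    @Override
    public String next() {
        if (!hasNext()) {
            throw new NoSuchElementException();
        }
        int perRound = templateIds.size() * repeat;
        int indexInRound = (int) (issued % perRound);
        String templateId = templateIds.get(indexInRound / repeat);
        issued++;
        return templateId + "-topic-" + issued; // assumed topic ID format
    }
}
```

With two templates and N = 3, a Parallel schedule yields six workflows and then ends, while a Loop schedule keeps yielding workflows until the process is terminated.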


    Minimum Work Load

    Executing a workflow takes time (simple workflows take a few seconds, but large workflows take minutes, hours, days, or even months). Scheduling a workflow, on the other hand, takes test-suite only a few milliseconds. So test-suite could generate a high workflow load on itself, and thread scheduling has to be handled carefully. We therefore introduced a parameter called minimum workload, which defines the minimum number of threads that should be assigned to the thread pool for execution before the scheduler goes to sleep. When the internal load becomes low, the scheduler starts its work again.
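
One way to picture the minimum workload idea is a scheduler that pauses once enough workflows are waiting or running, and resumes as they finish. The sketch below is my own interpretation and not the test-suite's implementation: it uses a java.util.concurrent.Semaphore as the throttle, and the class name, method name, and parameters are all hypothetical.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class MinWorkloadScheduler {

    /** Schedules all workflows and returns the highest number that were in flight at once. */
    static int schedule(int totalWorkflows, int poolSize, int minWorkload, Runnable workflow) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        Semaphore slots = new Semaphore(minWorkload);
        AtomicInteger inFlight = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        try {
            for (int i = 0; i < totalWorkflows; i++) {
                // The scheduler "sleeps" here while minWorkload workflows are already assigned.
                slots.acquire();
                int now = inFlight.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                pool.execute(() -> {
                    try {
                        workflow.run();
                    } finally {
                        inFlight.decrementAndGet();
                        slots.release(); // load dropped, so the scheduler may continue
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdownNow();
        }
        return peak.get();
    }
}
```

The scheduling loop itself is cheap, so without the throttle it would queue every workflow almost immediately. With it, the number of assigned but unfinished workflows never exceeds minWorkload.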

    Logs

    All activities are logged and exported to a log file. Please define what you want to log in the test-suite properties.

    Configuration

    You can configure all the properties discussed above by editing the test-suite properties in the conf directory of the test-suite binary distribution.

    We are hoping to integrate several other test types into test-suite in a future release.


    Building from source

    If you are building the whole MetCat project from the developers' trunk source, the test-suite module will be built automatically. But if you want only the test-suite, follow these steps.

      Check out from svn location http://svn.codespot.com/a/apache-extras.org/metcat/trunk/test-suite
    svn checkout http://svn.codespot.com/a/apache-extras.org/metcat/trunk/test-suite
    

      cd to the test-suite project folder and type
    mvn clean install

      The compressed test-suite binary distribution is created at
    test-suite/target

    As I mentioned earlier, this is an out-of-the-box module, so it is not included in the main binary distribution of the MetCat Server. But if you are building the whole MetCat project from the developers' trunk source, you can find the binary distribution under the test-suite/target directory.

    Running Test-Suite

    • Start Airavata and export a few workflows into the Airavata Registry. Follow the quick guide below to learn how to do that.

    A quick guide to exporting workflows into the Airavata Registry using XBaya.


    1. Set up the Registry using Registry → Setup JCR Registry... (Simply click OK if you are using a local Jackrabbit server with the default configuration. :) )

    2. Open a sample workflow file or create a workflow as described in the Airavata in 5 minutes tutorial.
    3. Go to XBaya → Export → To Registry
    4. Give it a name (this name is used as the template ID).
    5. Tick the make public option.
    6. Click OK.

    Back to the running test-suite tutorial.
    • Unzip the test-suite binary file.
    • Start MetCat (follow my previous article).
    • Configure the test-suite properties using testsuite.properties in the conf directory.
    • Set the Airavata-related properties using xbaya.properties.
    • cd to the bin folder.
    • Start MetCat Test-suite by executing metcat-testsuite.sh
    sh metcat-testsuite.sh
    You can find the test-suite logs under the log folder. Finally, verify the content captured by the MetCat server (workflow details) against the test-suite log.

    Summary

    In this article, I focused only on giving an introduction to MetCat's test-suite module. I will publish a new blog post at a later date describing how to do profiling and conduct test cases using test-suite.

    Tuesday, July 3, 2012

    Getting Started with MetCat | Provenance Aware Metadata Catalog for Apache Airavata


    In this post, I am going to describe how to get started with MetCat and show you how to perform the basic steps: building, configuring, and starting it.




    About Project


    The MetCat project will develop a metadata catalog targeted to be integrated with the Apache Airavata project. The project focuses on capturing metadata from workflows and assisting in scalable metadata management and user-defined queries.
    For more information, see the Project Site and the Project Host page.

    (New to Airavata? Learn Airavata in less than 10 minutes by following the Airavata in 5 minutes and Airavata in 10 minutes tutorials.)


    Current Release of MetCat


    Currently MetCat doesn't have a stable release, but you can build the MetCat 0.1-SNAPSHOT version from the developers' trunk. See the Building From Source section for more information.
    (Known issues in MetCat 0.1-SNAPSHOT: this is a snapshot release with limited features. This version is not recommended for production use.)

    Requirements for MetCat

    • Java >= 1.6 (tested with Oracle Java)
    • A Linux environment is preferred, since we currently don't have MetCat executable scripts for Windows. (You can still run it on Windows by executing the jars, but this is not fully tested on Windows.)
    • A running Apache Airavata server, version 0.3-incubating or higher. (If you don't have a running server, set up an Airavata server locally with the help of this Tutorial.)

    Building From Source


      • Unzip/untar the source file, or check out from the svn location http://svn.codespot.com/a/apache-extras.org/metcat/trunk/
      svn checkout http://svn.codespot.com/a/apache-extras.org/metcat/trunk/ metcat-read-only
      
      • cd to project folder and type
      mvn clean install
      • The compressed binary distributions are created at 
      <your_project_source_location>/distribution/target
      


      Let's Configure MetCat

      • Extract apache-airavata-metcat-$VERSION-bin.tar.gz or unzip apache-airavata-metcat-$VERSION-bin.zip
      • Let's call the unzip location <METCAT_HOME>
      • Configure the Apache Cassandra properties at <METCAT_HOME>/cassandra/apache-cassandra-1.1.1/conf, only if you need to.
      • Configure the MetCat properties in <METCAT_HOME>/conf
        • msgBrokerMonitor.properties : sets the MetCat server listener port and the Airavata message broker URL used to subscribe to workflow notifications.
        • cassandra.properties : sets the Cassandra server details and the keyspace for data storage.
      (Please do not change these properties if you are using a local Airavata server and a local Cassandra server with the default configuration. They are for advanced MetCat configuration.)

      Now we're ready to Start MetCat.

      Starting MetCat

      •   cd to <METCAT_HOME>/bin
      •   Start Cassandra by executing cassandra.sh 
      sh cassandra.sh
      
      •   Start MetCat-server by executing metcat-server.sh 
      sh metcat-server.sh
      

      Done... :)


      Monitoring workflow metadata

      Currently we don't support monitoring or querying results back from the Cassandra database. This feature will be added later. Until then, you can use Cassandra-GUI to monitor the Cassandra data.

      So run a workflow as described in the Airavata in 5 minutes and Airavata in 10 minutes tutorials. Then use Cassandra-GUI to see the extracted workflow metadata and how it is stored in the Cassandra cluster.



      MetCat Road Map

      You can find the MetCat road map Here.


      Got an error, need some help, or interested in this project?

      Please do not hesitate to contact MetCat developers using https://groups.google.com/d/forum/metcat-dev