Wednesday, July 4, 2012

MetCat Test-suite - Airavata workflow publisher client


Recently we added a new out-of-the-box module to MetCat called the test-suite. It is a workflow publisher client that can emulate workflows and pre-populate data. Using the test-suite, you can emulate a heavy workload for MetCat and use it to evaluate performance metrics.


Test type: run_from_registry


In the 0.1 release of the test-suite, we have integrated only one test type, called run_from_registry. When the test-suite runs a run_from_registry test case, it takes saved workflow templates from the Airavata Registry, spawns workflows using those templates and submits them to an Airavata server to execute in parallel. Here a workflow is an instance of a workflow template with a unique topic ID.

In this test type, the test-suite assigns each workflow to a separate thread and submits it to a fixed-size thread pool, so each thread is responsible for executing its assigned workflow. The test-suite uses the Airavata Client API to communicate with the Airavata server.


Test Mode
A run_from_registry test can be run in one of two modes.

  • Parallel
  • Loop



    Parallel

    In Parallel mode, the test-suite schedules each workflow template to execute exactly N times, each run on a separate thread, and executes them in parallel. For example, if two workflow templates are saved in the Registry and you have defined the number of repetitions as N, the test-suite will execute 2N workflows in total: N from one template and N from the other.


    Loop

    In Loop mode, the test-suite repeatedly schedules each workflow template to execute N times on separate threads in parallel, and keeps doing so until you terminate the test-suite. I am not saying it runs an infinite number of times, but it would take several years to finish both scheduling and executing workflows in this mode, so when you want to terminate the test-suite, feel free to press Ctrl + C.


    Minimum Workload

    Executing a workflow takes time (simple workflows take a few seconds, but large workflows take minutes to hours, sometimes days or even months...), whereas scheduling a workflow takes the test-suite only a few milliseconds. So the test-suite could easily generate a very high workflow load by itself, and thread scheduling has to be handled carefully. For this we introduced a parameter called minimum workload, which defines the minimum number of threads that should be assigned to the thread pool for execution before the scheduler goes to sleep. When the internal load becomes low, the scheduler starts its work again.

    Logs

    All activities are logged and exported to a log file. You can define what you want to log in the test-suite properties.

    Configuration

    You can configure all the properties discussed above by editing the test-suite properties file in the conf directory of the test-suite binary distribution.
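
    As an illustration, a minimal test-suite properties file could look something like the sketch below. The key names and values here are only placeholders I chose for this example; check the comments in the properties file shipped with the distribution for the actual keys.

    # placeholder keys, for illustration only
    test.type=run_from_registry
    # parallel or loop
    test.mode=parallel
    # N, the number of times each workflow template is executed
    repeat.count=10
    # size of the fixed thread pool
    thread.pool.size=20
    # minimum workload: threads handed to the pool before the scheduler sleeps
    min.workload=5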

    We hope to integrate several other test types into the test-suite in future releases.


    Building from source

    If you are building the whole MetCat project from the developers' trunk source, the test-suite module will be built automatically. But if you want only the test-suite, follow these steps.

      Check out from svn location http://svn.codespot.com/a/apache-extras.org/metcat/trunk/test-suite
    svn checkout http://svn.codespot.com/a/apache-extras.org/metcat/trunk/test-suite
    

      cd to test-suite project folder and type
    mvn clean install

      The compressed test-suite binary distribution is created at
    test-suite/target

    As I mentioned earlier, this is an out-of-the-box module; it is not included in the main binary distribution of the MetCat server. But if you are building the whole MetCat project from the developers' trunk source, you can find the binary distribution under the test-suite/target directory.

    Running Test-Suite

    • Start Airavata and export a few workflows into the Airavata Registry. Follow the quick guide below to learn how to do that.

    A quick guide to exporting workflows into the Airavata Registry using XBaya.


    1. Set up the Registry using Registry → Setup JCR Registry... (simply click OK if you are using a local Jackrabbit server with the default configuration. :) )

    2. Open a sample workflow file or create a workflow as described in the Airavata in 5 minutes tutorial. 
    3. Go to XBaya → Export → To Registry
    4. Give a name (this name is used as the template ID)
    5. Tick the make public option
    6. Click OK.

    Now, back to running the test-suite.
    • Unzip the test-suite binary file.
    • Start MetCat (follow my previous article).
    • Configure the test-suite properties using testsuite.properties in the conf directory.
    • Set the Airavata-related properties using xbaya.properties.
    • cd to the bin folder.
    • Start the MetCat test-suite by executing metcat-testsuite.sh 
    • sh metcat-testsuite.sh
    You can find the test-suite logs under the log folder. Finally, verify the content captured by the MetCat server (workflow details) against the test-suite log.
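
    For example, assuming the test-suite writes to a file such as log/testsuite.log (the actual file name depends on your log configuration), you can follow the submitted workflows like this:

    tail -f log/testsuite.log
    grep -i "workflow" log/testsuite.log | wc -l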

    Summary

    In this article, I focused only on giving an introduction to MetCat's test-suite module. I will post a new blog post describing how to do profiling and conduct test cases using the test-suite at a later date. 

    Tuesday, July 3, 2012

    Getting Started with MetCat | Provenance Aware Metadata Catalog for Apache Airavata


    In this post, I am going to describe how to get started with MetCat and show you how to perform basic operations.




    About Project


    The MetCat project is developing a metadata catalog targeted to be integrated with the Apache Airavata project. The project focuses on capturing metadata from workflows and assisting in scalable metadata management and user-defined queries.
    For more information, see the Project Site and the Project Host page.

    (New to Airavata? Learn Airavata in less than 10 minutes by following the Airavata in 5 minutes and Airavata in 10 minutes tutorials.)


    Current Release of MetCat


    Currently MetCat doesn't have a stable release, but you can build the MetCat 0.1-SNAPSHOT version from the developers' trunk. See the Building From Source section for more information.
    (Known issues in MetCat 0.1-SNAPSHOT: this is a snapshot release with a limited feature set. It is not recommended for production use.)

    Requirements for MetCat

    • Java >= 1.6 (the Oracle JDK has been tested)
    • A Linux environment is preferred, since we currently don't have MetCat executable scripts for Windows. (You can still run it on Windows by executing the jars, but this is not fully tested on Windows.)
    • A running Apache Airavata server; Airavata 0.3-incubating or higher is required. (If you don't have a running server, set up an Airavata server locally with the help of this tutorial.)

    Building From Source


      • Unzip/Untar the source file or check out from svn location http://svn.codespot.com/a/apache-extras.org/metcat/trunk/
      svn checkout http://svn.codespot.com/a/apache-extras.org/metcat/trunk/ metcat-read-only
      
      • cd to project folder and type
      mvn clean install
      • The compressed binary distributions are created at 
      <your_project_source_location>/distribution/target
      


      Let's Configure MetCat

      • Extract apache-airavata-metcat-$VERSION-bin.tar.gz or unzip  apache-airavata-metcat-$VERSION-bin.zip
      • Let's call the extracted location <METCAT_HOME>.
      • Configure the Apache Cassandra properties at <METCAT_HOME>/cassandra/apache-cassandra-1.1.1/conf, only if you need to.
      • Configure the MetCat properties under <METCAT_HOME>/conf:
        • msgBrokerMonitor.properties : sets the MetCat server listener port and the Airavata message broker URL used to subscribe to workflow notifications.
        • cassandra.properties : sets the Cassandra server details and the keyspace used for data storage.
      (Please do not change these properties if you are using a local Airavata server and a local Cassandra server with the default configuration; they are for advanced MetCat configuration.)
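
      For illustration, the kind of entries you will find in these two files looks roughly like the sketch below. The key names and values are placeholders I made up for this example; rely on the comments inside the files shipped with the distribution for the real ones.

      # msgBrokerMonitor.properties (placeholder keys)
      metcat.listener.port=8888
      broker.url=http://localhost:8080/axis2/services/EventingService

      # cassandra.properties (placeholder keys)
      cassandra.host=localhost
      cassandra.port=9160
      cassandra.keyspace=metcat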

      Now we're ready to Start MetCat.

      Starting MetCat

      •   cd to <METCAT_HOME>/bin
      •   Start Cassandra by executing cassandra.sh 
      sh cassandra.sh
      
      •   Start MetCat-server by executing metcat-server.sh 
      sh metcat-server.sh
      

      Done... :)


      Monitoring workflow metadata

      Currently we don't support monitoring or querying back results from the Cassandra database; this feature will be added later. Until then, you can use Cassandra-GUI to monitor the Cassandra data.

      So run a workflow as described in the Airavata in 5 minutes and Airavata in 10 minutes tutorials, and use Cassandra-GUI to see the extracted workflow metadata and how it is stored in the Cassandra cluster.
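
      If you prefer a command line over Cassandra-GUI, the cassandra-cli tool bundled with Cassandra 1.1 also works for a quick look. The keyspace and column family names below are placeholders; use show keyspaces; to discover the ones MetCat actually creates.

      <METCAT_HOME>/cassandra/apache-cassandra-1.1.1/bin/cassandra-cli -h localhost -p 9160
      [default@unknown] show keyspaces;
      [default@unknown] use <metcat_keyspace>;
      [default@<metcat_keyspace>] list <workflow_column_family>;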



      MetCat Road Map

      You can find the MetCat road map here.


      Got an error, need some help, or interested in this project?

      Please do not hesitate to contact MetCat developers using https://groups.google.com/d/forum/metcat-dev



      Friday, June 15, 2012

      The Big Picture - Debugging Summer...

      In my last post I mentioned the activities I carried out before coding started in GSoC 2012. Now I am going to paint a picture of my GSoC 2012 project.

      You can find the full detailed description here; I have omitted some irrelevant parts of the original proposal.

      Note: things proposed here may change, and not all of them may be implemented. Please keep track of my next blog posts to get an idea of the current progress of the project.


      Project Description:


      Background and its usefulness for Airavata community:


      Apache Airavata provides a software toolkit for building e-Science gateways in a wide spectrum of scientific disciplines. The main design goal of Airavata is to support long-running applications and workflows on distributed computational resources using a service-oriented architecture.

      The Airavata XBaya workflow system is designed to provide a convenient programming model for scientists and developers to program their experiments. It provides an interface for composing, executing and monitoring workflows, and it hides the underlying complexity of the middleware platform from users. So developers can easily construct their logic or experiments using the XBaya GUI and execute the workflow on the workflow interpreter; they can also monitor executing workflows in a synchronous or asynchronous manner. The workflow interpreter can be used as an embedded workflow enactment engine within the XBaya GUI or as an interpreter service that runs as a persistent service.

      Problem:

      Currently, Airavata does not contain a proper module for debugging workflows that are executed at the server. There is, however, a debugging facility built into the XBaya GUI for debugging workflows executed by the embedded workflow enactment engine within XBaya. This was possible because the XBaya GUI and the embedded workflow interpreter share the same JVM, enabling easy communication among the components at the Java level and making it possible to implement debugging functions without too much work.

      A simple RMI (Remote Method Invocation) mechanism could be used to extend the existing XBaya debugging facility to work with the Airavata server, but this is not feasible since it cannot achieve language independence for the debugging module. The debugging module should be able to work with different debugging clients such as the XBaya GUI, a web client, etc.

      Because of these limitations, and as it is a much-needed feature, the Airavata community has proposed implementing a debugging framework that enables debugging of workflows executed at the Airavata server. It should be language independent and support remote debugging.

      Benefits 

      The proposed workflow debugging framework is very similar to normal debugger applications. It allows its user to exercise some degree of control over workflows and to examine them when things go amiss. Some of the main functionalities of a workflow debugger are:


      • Receiving workflow execution data and current state data at a particular point during execution of the workflow.
      • Setting and removing break points in a workflow execution.
      • Sending commands to manipulate the execution life cycle (pause/resume/single step/stop etc.).
      • Modifying workflow data on the fly.
      etc. More details will be discussed under the solution section.
      The debugging framework will give developers the following benefits:
      • Easier examination of workflow behavior, especially when the workflow executes in the Airavata server.
      • Identification of errors before they manifest themselves at the end of workflow execution.
      • The ability to modify workflow execution data dynamically.

      etc.

      Finally, the debugging framework will help enhance the usability of workflow design and increase the productivity of both the developer and the system.


      Solution


      The proposed solution is to create a debugging framework at the Airavata backend. It will contain two main components: a backend debugging module and a debugging client. Once they are implemented, they should be able to do the following.

      Functions of the backend debugging module

      It will be capable of the following tasks.


      • Provide an API for communicating with debugging clients. This API should be intuitive, language independent and support remote debugging.
      • Communicate with the workflow interpreter to retrieve the status and data of the workflow corresponding to a given topic ID.
      • Notify the debugging client whenever a workflow instance reaches a break point, and pause the workflow instance's execution.
      • Control the paused workflow instance's execution according to the debugging client's commands.

      Functions of a debugging client 

      The debugging client can be implemented as a new module or integrated into XBaya. In either case, the debugging client should be able to do the following tasks.

      Basic tasks
      • Connect with the backend debugging module in the Airavata server.

      Monitoring-related tasks
      • Retrieve the current status & node data of a workflow execution for a given topic ID.

      Workflow-debugging-related tasks
      • Place, modify and remove break points on nodes for a given topic ID. 
      • Tell the Airavata interpreter to execute the next node and pause again. This is similar to the single-step command in a normal debugger.
      • Tell the Airavata interpreter to continue execution from a break point. This is similar to the resume command in a normal debugger.
      • Tell the Airavata interpreter to stop the execution of the workflow. This is similar to the stop command in a normal debugger.


      Implementation:


      This is my proposal for implementing the solution discussed above.



      Overall architecture of the debugging system

      As discussed earlier, the system consists of two components: the debugging client and the backend debugging module. The backend debugging module consists of a web service interface and the debugging API.

      When a user debugs a workflow using a debugging client (here the XBaya GUI), the client accesses the debugging API through the web service interface and issues instructions to control the workflow execution. According to the client's instructions, the debugging API communicates with the workflow interpreter to control the workflow execution or to retrieve/modify workflow execution data. The debugging module then notifies the debugging client of the requested data or status through its web service interface, and the client presents the debug results/data in its own user interface.

      Development of the debugging framework is divided into two main phases. Phase 1 is developing the backend of the debugging module and phase 2 is developing the client side.

      First phase



      When considering the requirements of the backend, the following activities can be identified in the first phase:
      • Adding the necessary functions in the workflow interpreter and GFac to support modifying/retrieving node data, controlling workflows, etc. (if necessary).
      • Designing and implementing the debugging module API and supporting functions to pause/resume/stop a workflow, notify the client, retrieve the data of a node, etc.
      • Implementing a registry to store break point details, topic IDs, client details, etc.
      • Developing a web service interface through which a debugging client (the XBaya GUI or a web portal) connects to the backend debugging module. This ensures language independence and remote debugging support for the backend of the debugging module.

      Designing a simple, robust and language-independent backend module gives a huge advantage when developing a debugging client. Security issues can also be addressed through proper design, so the API design should be handled carefully. The API should also be clearly defined, because that will help someone else implement the missing pieces in the future based on the provided API.

      Second Phase  

      The second phase will include developing the client side. Since the existing XBaya GUI already has functions to set break points on nodes, start/stop/restart workflows, and monitor them, it is straightforward to implement the new debugging client in the XBaya GUI. The main advantage this gives the user is the ability to debug workflows that execute either in the workflow enactment engine within XBaya or in a remote workflow interpreter, all from a single workbench.


      When considering the requirements of the client side, the existing XBaya GUI should be modified to support the remote debugging module:
      • Design and implement a new interface to connect with the backend debugging module.
      • Implement methods to call the remote debugging functions.
      • Make the necessary changes to the existing debugging system in the XBaya GUI to support the new debugging system, for example modifying the set/remove break point functions.
      • Implement new windows, similar to the “parameters” and “monitoring” windows, to show debugging information.
      • Design and implement GUI support for the “single step”, “resume” and “terminate” debugging functions.

      The second phase is a bit more challenging to develop since it involves a lot of UI-related implementation and testing.

       Development Plan


      This proposal covers most of the features proposed in the solution section, which are discussed in the issue. But as a GSoC project there are some constraints to be met, and the time constraint will be the main problem when it comes to implementing the above solution.

      As you can see, it will take considerable time and effort to complete all the tasks mentioned in phase 1, phase 2 and the testing plan. Therefore, following a good development plan will be helpful in completing this GSoC project within the GSoC time frame.

      During the Community Bonding Period I would like to prepare the initial requirement specification and identify the basic components of the debugging module (the core) with help from my mentor and the community. This requirement specification will contain the identified subtasks of phase 1 and phase 2 with their priorities and dependencies. Subtasks will be prioritized according to their usefulness to the core debugging system.

      In each phase, the identified basic subtasks (high-priority tasks) will be carried out using the initial requirement specification to build the core debugging system, and documentation will be maintained throughout. Low-priority tasks will be left with proper documentation; if necessary, they will be implemented as dummy implementations with proper comments/documentation.

      In the event of a low priority feature being recognized as a hindrance to the implementation of a required feature, the blocking feature will be implemented prior to the required feature. Usually a blocker can be identified during the dependency check of the subtask. 

      The initial requirement specification will be updated during each phase with help from the mentor and the community. This will help in identifying and solving new blockers so that the project requirements can be fulfilled successfully.  But any attempts at adding new features late in the development could result in core level changes and changes in the API.

      All development needs to be properly documented in such a way that in future someone else can implement the missing components/ features using the produced documentation.

      Testing Plan


      The testing plan contains unit tests and integration tests. Unit tests will be carried out during both phase 1 and phase 2 of development. It is very important to conduct unit tests in each phase, since they ensure the functionality of each component and verify that it is working properly.

      JUnit test cases will be used to automate testing (with Maven support) and to check the functionality of the new components (e.g. the debugging API, web interface, and client UI components).

      Integration tests will be carried out after the development of the second phase. They mainly focus on the use cases discussed in the 5 minute and 10 minute tutorials. They are:

      Invoke a simple Web Service by creating a workflow.

      Create a simple Web Service for a command-line Echo application. Then invoke the echo service by using a workflow.

      Tests will be conducted to cover those two use cases from start to end using the debugging features. Finally, the debugging steps of the above two use cases will be documented in the form of tutorials. Once the debugging module is added to the Airavata code base, those two tutorials can be added to the Airavata documentation as well.






      GSoC 2012 with Apache Airavata.


      Hi All,

      After a long silence, I am back to blogging after nearly 3 months.

      It is summer, and when we hear that summer is coming, the first thing that pops into our minds is the Google Summer of Code program, which has been announced early in the year every year since 2005 (see more information about GSoC 2012). In this blog post, I am going to briefly summarize the activities I was involved in during the pre-GSoC 2012 period (from the announcement of GSoC 2012 to the community bonding period).

      See the GSoC 2012 events and timeline and read the FAQ to get more information about GSoC 2012. 


      Here is the rough GSoC 2012 timeline and the activities I did during each period.  
      • Google summer of Code 2012 Program announced.
      • Google program administrators review organization applications.
      • Accepted mentoring organizations published on the Google Summer of Code 2012 site.

      • Student application period opens.
        • After discussing with the Airavata community, I wrote a project proposal and submitted it to Google. 
      • Student application deadline.
      • Mentoring organizations review and rank student proposals; where necessary, mentoring organizations may request further proposal detail from the student applicant.
        • I did some improvements to my project proposal as requested by them.
      • Accepted student proposals announced on the Google Summer of Code 2012 site.
      Yes, I am in. I am a GSoCer. 
      I will post my current progress and a detailed description of my project in my next blog post.
        


      Thursday, March 8, 2012

      [Tip] Avoiding Shift/Reduce conflict

      Consider the following grammar for C-minus:

      selection_stmt      ::= IF '(' expression ')'  statement
                              | IF '(' expression ')' statement ELSE statement ;
      


      This is a "Dangling else" problem that leads to Shift/Reduce conflict when generating a parser ( I am using CUP ).

      Warning : *** Shift/Reduce conflict found in state #72
        between selection_stmt ::= IF '(' expression ')' statement (*) 
        and     selection_stmt ::= IF '(' expression ')' statement (*) ELSE statement 
        under symbol ELSE
        Resolved in favor of shifting.
      

      You can solve this problem in two ways. (Alternatively, you can redefine the if-else grammar rule using unambiguous rules; I am still working on that and hope to update this post later. A sketch of the idea is given at the end of this post.)

      1) Adding a precedence for ELSE:

      e.g. modify the code as given below.

      precedence left  PLUS, MINUS;
      precedence left  TIMES, DIVIDE;
      precedence left  ELSE;
      


      2) Allowing the conflict to be resolved in favor of shifting:

      You can generate the parser code while telling CUP to expect 1 (or more) shift/reduce conflicts.


      e.g. add the "-expect 1" option to say that one shift/reduce conflict is expected.

      cup -expect 1 -parser CMinusParser cminus.cup
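
      As for the alternative mentioned above, the classic way to remove the ambiguity itself is to split statements into "matched" and "unmatched" forms, so that an ELSE can only attach to the nearest unmatched IF. The sketch below only illustrates the idea; the nonterminal names are mine, and the remaining statement rules would have to be reshuffled to fit.

      statement       ::= matched_stmt
                          | unmatched_stmt ;

      matched_stmt    ::= IF '(' expression ')' matched_stmt ELSE matched_stmt
                          | other_stmt ;

      unmatched_stmt  ::= IF '(' expression ')' statement
                          | IF '(' expression ')' matched_stmt ELSE unmatched_stmt ;

      Here other_stmt stands for every statement form that is not an if-statement.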
      

      Sunday, February 19, 2012

      [TIP] Use Linux Pipelines and Make your life easier

      Suppose you want to get the history of your terminal (the commands you have typed in the terminal) and find a command that you used a long time ago.

      You can simply use the history command to list the command history and find it.

      $ history

      (By default, bash stores only the last 1000 commands of your command-line history. You can find the history file and the history size by typing the following commands, and you can change these values in your shell profile.)

      $ echo $HISTFILE 
      $ echo $HISTSIZE
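
      For example, to keep a longer history you can add something like the following to your ~/.bashrc (the numbers here are just illustrative):

      # keep 5000 commands in memory and 10000 lines in the history file
      export HISTSIZE=5000
      export HISTFILESIZE=10000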
      

      But if your list is a big one, you may need to search to find your command. 
      For example, you can use the grep command to find it. See the following set of commands.

      $ history > history.txt 
      $ grep -i "dpkg" history.txt
       1761  dpkg -s jflex
       1974  dpkg -s cup
      $ rm history.txt
      

      The first command writes the history into a text file. The second line searches history.txt for "dpkg", ignoring case, and the results are shown in the next two lines. After that, the fifth line removes the history.txt file.

      But we can do this more easily using a Linux pipeline. It creates no intermediate files and gets the result in one line. Here is the command.

      $ history | grep -i "dpkg" 
       1761  dpkg -s jflex
       1974  dpkg -s cup 

      This is a simple example that uses a Linux pipeline; you can use pipelines for much more complex tasks. Read this Wikipedia page to learn more about Linux pipelines.

      Here is another example that uses a pipeline.

      $ find . -name .svn | xargs rm -fr
      

      This command removes all .svn directories under the current path. (You can also use a non-pipeline solution to do this; see the following command.)

      find . -name .svn -exec rm -rf {} \;
      

      Wednesday, February 15, 2012

      HUAWEI Mobile Broadband dongle not working on Ubuntu? Try Mobile Partner on Ubuntu

      Today I noticed that I couldn't connect to the Internet using my HUAWEI E153 dongle on Ubuntu 11.10. (In the afternoon, it had been working well on Ubuntu.) 

      In the network connection menu in the notification area, it showed my mobile connection plan (which I had configured earlier, meaning my dongle was detected by Ubuntu), but when I selected it, it didn't connect to the Internet. 
      I tried it several times and tried several things (unplugged and re-plugged it, etc.) but was unable to fix it. Then I moved to my old friend Windows and googled the problem, and observed that many people have similar questions.

      Mobile partner software for Linux

      The most important thing I found was the “Mobile Partner software for Linux”. I downloaded and installed it. (Installation is very straightforward; read readme.txt for installation instructions.)

      I unplugged the dongle and re-plugged it.
       
      Wow, the Mobile Partner application started automatically. After a few seconds it showed the signal strength in the bottom-left corner. Then I created a new profile (configured the APN and saved it) and pressed the connect button. 

      It connected to the Internet. At last... :)



      But note that when you are using Mobile Partner, the connection may be shown as disconnected even though you are connected to the Internet.

      Don't forget: you can now send and read SMS messages using Mobile Partner on Ubuntu. :) But the network statistics weren't working properly on my system.   :(


      Note: I have updated the download link.

      New Download location: http://www.mediafire.com/?7a0aa7414cud40h
      To old Download location: click here




        

      Monday, February 13, 2012

      Apache Airavata



       
       (Image Source: Apache Airavata URL:http://airavata.org/)

      Apache Airavata is a newly started project at The Apache Software Foundation, currently in the incubation phase. You can visit the official website of Apache Airavata at http://airavata.org/.

      Apache Airavata is a software framework for building and deploying e-science projects. Airavata has the capability to compose, manage, execute and monitor a variety of distributed applications and workflows that run on computational resources. Concepts of service-oriented computing, distributed messaging, and workflow composition and orchestration provide the foundation for Airavata. 

      Interested and want to get involved with Airavata? Visit http://incubator.apache.org/airavata/community/get-involved.html 

      Something you may not know about the word "Airavata"...

      In South Asian culture, the word Airavata refers to a mythological white elephant who carries the Hindu god Indra. It also means "elephant of the clouds". (Reference: Wikipedia http://en.wikipedia.org/wiki/Airavata) 


      I hope to post some more about Apache Airavata in my next blog posts.