Saturday, October 9, 2010

Building Carbon from the trunk


WSO2 Carbon is the base platform for WSO2’s enterprise-grade middleware stack. As a start-up task, I spent some time checking out Carbon from the trunk and building it. This post describes how I did it, the problems I ran into along the way, and how I solved them.

Requirements


Before building Carbon I had to install Subversion and Maven. I did this with the following commands (in a Linux environment).

$ sudo apt-get install subversion

$ sudo apt-get install maven2

Then I used the following command to check out Carbon from the trunk.



The carbon_home directory contains the POM (Project Object Model), an XML representation of a Maven project held in a file named pom.xml. I learned more about the POM from here.


The pom.xml file at the root level lists the modules in the order they must be built. For example, as per the XML snippet below, you have to start from the dependencies module and proceed with orbit, core, components, features and products.

<modules>
    <module>dependencies</module>
    <module>orbit</module>
    <module>core</module>
    <module>components</module>
    <module>features</module>
    <module>products</module>
</modules>

The dependencies directory needs to be built in the order specified in the pom.xml inside the dependencies directory. I used the following sequence of commands to accomplish this.

$ cd /home/randi/carbon/dependencies/axiom
$ mvn clean install 

Before building ~/carbon/dependencies/axis2 I first built the following two subdirectories.


$ cd /home/randi/carbon/dependencies/axis2/modules/tool/axis2-mar-maven-plugin
$ mvn clean install

$ cd /home/randi/carbon/dependencies/axis2/modules/tool/axis2-aar-maven-plugin
$ mvn clean install

Then the axis2 module was built.
$ cd /home/randi/carbon/dependencies/axis2
$ mvn clean install


Then the same process was carried out for all the other modules inside the dependencies directory.

In order to skip tests while building I used the following command.

$ mvn clean install -Dmaven.test.skip=true


 

After successfully building ~/carbon/dependencies, I could build other modules as shown below.

$ cd ~/carbon/orbit
$ mvn clean install -Dmaven.test.skip=true

$ cd ~/carbon/core
$ mvn clean install -Dmaven.test.skip=true

$ cd ~/carbon/components
$ mvn clean install -Dmaven.test.skip=true

$ cd ~/carbon/features
$ mvn clean install -Dmaven.test.skip=true

$ cd ~/carbon/products
$ mvn clean install -Dmaven.test.skip=true
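The build order above can also be written down as data. The following is only a minimal sketch, not part of the original setup: plan_build is a hypothetical helper, and it only plans the commands without running Maven. Note that in this post the dependencies module was actually built sub-module by sub-module, which the sketch does not reproduce.

```python
from pathlib import Path

# Top-level module order, as listed in the root pom.xml described in this post.
MODULES = ["dependencies", "orbit", "core", "components", "features", "products"]


def plan_build(carbon_home, modules=MODULES, skip_tests=True):
    """Return (working_directory, command) pairs in build order.

    Hypothetical helper: nothing is executed here, the commands are only planned.
    """
    command = ["mvn", "clean", "install"]
    if skip_tests:
        command.append("-Dmaven.test.skip=true")
    return [(Path(carbon_home) / module, list(command)) for module in modules]


if __name__ == "__main__":
    # Print the equivalent shell commands for a checkout under ~/carbon.
    for cwd, command in plan_build(Path.home() / "carbon"):
        print(f"cd {cwd} && {' '.join(command)}")
```

To really run the build, each pair could be passed to subprocess.run(command, cwd=cwd, check=True), so that the loop stops at the first failing module.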



Problems encountered

While trying to accomplish the above task, I encountered the problems described below.

"mvn clean install" crashes with java.lang.OutOfMemoryError: PermGen space after update to java 1.6.0_04


The system is out of resources.
Consult the following stack trace for details.
java.lang.OutOfMemoryError: PermGen space
at java.lang.ClassLoader.defineClass1(Native Method)
at java.lang.ClassLoader.defineClassCond(ClassLoader.java:632)
at java.lang.ClassLoader.defineClass(ClassLoader.java:616)
at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:141)
at java.net.URLClassLoader.defineClass(URLClassLoader.java:283)
at java.net.URLClassLoader.access$000(URLClassLoader.java:58)
at java.net.URLClassLoader$1.run(URLClassLoader.java:197)
at java.security.AccessController.doPrivileged(Native Method)
at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
at org.codehaus.plexus.compiler.javac.IsolatedClassLoader.loadClass(IsolatedClassLoader.java:56)
at com.sun.tools.javac.main.JavaCompiler.<init>(JavaCompiler.java:300)
at com.sun.tools.javac.main.JavaCompiler.instance(JavaCompiler.java:72)
at com.sun.tools.javac.main.Main.compile(Main.java:340)
at com.sun.tools.javac.main.Main.compile(Main.java:279)
at com.sun.tools.javac.main.Main.compile(Main.java:270)
at com.sun.tools.javac.Main.compile(Main.java:87)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.codehaus.plexus.compiler.javac.JavacCompiler.compileInProcess(JavacCompiler.java:549)
at org.codehaus.plexus.compiler.javac.JavacCompiler.compile(JavacCompiler.java:156)
at org.apache.maven.plugin.AbstractCompilerMojo.execute(AbstractCompilerMojo.java:605)
at org.apache.maven.plugin.CompilerMojo.execute(CompilerMojo.java:128)
at org.apache.maven.plugin.DefaultPluginManager.executeMojo(DefaultPluginManager.java:490)
at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoals(DefaultLifecycleExecutor.java:694)
at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalWithLifecycle(DefaultLifecycleExecutor.java:556)
at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoal(DefaultLifecycleExecutor.java:535)
at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeGoalAndHandleFailures(DefaultLifecycleExecutor.java:387)
at org.apache.maven.lifecycle.DefaultLifecycleExecutor.executeTaskSegments(DefaultLifecycleExecutor.java:348)
at org.apache.maven.lifecycle.DefaultLifecycleExecutor.execute(DefaultLifecycleExecutor.java:180)
at org.apache.maven.DefaultMaven.doExecute(DefaultMaven.java:328)

[INFO] 1 error
[INFO] -------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD FAILURE
[INFO] ------------------------------------------------------------------------
[INFO] Compilation failure
Failure executing javac, but could not parse the error:


I overcame this by adding a line to /usr/bin/mvn. I opened the file in gedit:

sudo gedit /usr/bin/mvn

Then I added the following line to the file.

export MAVEN_OPTS="-Xmx1024M -XX:MaxPermSize=128M"

  -------------------------------------------------------------------------------------------------------------------------------------
When building the dependencies, I got the following build error message.

[INFO] ------------------------------------------------------------------------
[INFO] Building Apache Synapse - Distribution
[INFO] task-segment: [install]
[INFO] ------------------------------------------------------------------------
[INFO] [remote-resources:process {execution: default}]
[INFO] [site:attach-descriptor {execution: default-attach-descriptor}]
[INFO] [assembly:single {execution: distribution-package}]
[INFO] Reading assembly descriptor: src/main/assembly/bin.xml
[INFO] Processing DependencySet (output=synapse-${synapse.version}/lib)
[INFO] ------------------------------------------------------------------------
[ERROR] BUILD ERROR
[INFO] ------------------------------------------------------------------------
[INFO] Failed to create assembly: Failed to retrieve OS environment variables. Reason: Cannot run program "env": java.io.IOException: error=24, Too many open files

[INFO] ------------------------------------------------------------------------
[INFO] For more information, run Maven with the -e switch
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 3 minutes 44 seconds
[INFO] Finished at: Tue Sep 28 10:07:38 IST 2010
[INFO] Final Memory: 377M/665M
[INFO] ------------------------------------------------------------------------



To overcome this I made the following modifications.

I added the following two lines to /etc/security/limits.conf. Each entry has to start with a domain field, which can be a user name or * as a wildcard.
* soft nofile 2048
* hard nofile 4096



Further, to apply the user limits from /etc/security/limits.conf, I added the following line to /etc/pam.d/login.

session required pam_limits.so

Then rebooting and building dependencies again solved my problem.
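To check whether the new limits are really in effect after logging in again, the open-file limit of a process can be read back. This is just a small sketch using Python's standard resource module (available on Unix only); the helper names nofile_limits and fmt are made up for illustration.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) open-file limits of the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def fmt(value):
    """Render a limit value, showing RLIM_INFINITY as 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    print(f"soft nofile = {fmt(soft)}, hard nofile = {fmt(hard)}")
```

The shell equivalent is ulimit -n, which prints the soft limit.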






Thursday, September 30, 2010

A month to remember...

A happening and important month in my life is coming to an end. Important things are meant to be cherished, so I thought of spending some time revisiting the exciting moments I encountered throughout the month.


The beginning of this month was the time to play the final trump after four tough years of academic progression. The final exams of the final year ended on the 3rd of September, and I must say it was the hardest exam I have ever faced at the university. However, though the four years we spent inside UOM were tough, we really enjoyed each and every second we spent there. The friendship we shared was the catalyst for that, for sure. Hence the feeling of departure we felt during those last days was heartfelt. As a cure for this, all of us tried to spend those few days in a memorable manner, and that brought us even closer, I guess. Despite the busy schedules, my bunch of friends and I always found some time to hang out and walk around the University premises. It's my pleasure to take this moment to express my gratitude to all my university buddies who contributed to making this four-year stay at uni a remarkable period.


The second week of this month was our opportunity to show off our colors as the cream of the country, since the engineering exhibition 'Extreme Odyssey' was in action. The exhibition was held on the 9th, 10th and 11th, and we were lucky enough to exhibit our final year project to the crowd. In line with the exhibition, the Cs & Es conference was under way, and we could publish the research paper of our final year project at this conference too.



The 13th of September was a special day in my life indeed, as it was a new beginning. It was the day on which I started investing the gains of my entire academic life in the industry. It was the day on which I started my career as a software engineer at WSO2 Inc, a leading middleware company and the place where I always wanted to be. The time we joined was an important period for WSO2 as well, since they were celebrating their 5 years of creating a global brand. Hence my first week at work was accompanied by WSO2-Con 2010, which was held on the 14th and 15th of this month at HNB Towers, and the 5-year celebration party, which was held on the 17th at Waters Edge.



I was lucky enough to start my career working on the first 100% open source cloud platform for enterprise applications, WSO2 Stratos, and I was assigned to work on its Billing and Metering component. The last week of this month was spent getting used to the various technologies used at WSO2 and setting up the environment.

Overall this month was happening, and there were moments when I felt like life was racing at a speed that I could hardly manage. But still it is the race for success, and in such a race, no matter how hard you burn your energy, sooner or later you will definitely reap the profits.


Saturday, April 17, 2010

Support Vector Machines for Speech Recognition

Support Vector Machines are a way of classifying data. SVMs are said to be easier to use than neural networks, but I'm yet to comment on that :).
These days I’m trying to find a way to apply an SVM to improve the accuracy of our speech recognition application. To achieve this I'm going to use LIBSVM.

The procedure for training the SVM is to
  • Transform data to the format of an SVM package
  •  Randomly try a few kernels and parameters
  •  Test

First, I thought of checking whether I could use this to classify and predict phonemes. The first step was to prepare the input files.
The format of the training and testing data files is:

<label> <index1>:<value1> <index2>:<value2> ...
.
.
.
Each line contains an instance and is ended by a '\n' character. For
classification, the labels in the testing file are only used to calculate accuracy or errors.
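As a small illustration of this format, the sketch below turns one labeled feature vector into one line of such a file. The function name to_libsvm_line and the sample values are hypothetical; the indices are written starting from 1 and in ascending order, and zero-valued features are left out because the format is sparse.

```python
def to_libsvm_line(label, features):
    """Format one instance as '<label> <index>:<value> ...'.

    Indices are 1-based and ascending; zero-valued features are skipped,
    which the sparse format allows.
    """
    pairs = [f"{index}:{value:g}"
             for index, value in enumerate(features, start=1)
             if value != 0]
    return " ".join([str(label)] + pairs)


if __name__ == "__main__":
    # Hypothetical instance: class label 1 with a 3-element feature vector.
    print(to_libsvm_line(1, [0.5, 0.0, -1.25]))
```

Writing a whole training file is then just a matter of joining such lines with '\n'.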
Hence I analyzed a voice signal using MATLAB to check whether I can represent phonemes in vector format. For this I first recorded a voice signal and saved it in .wav format. When I plotted this signal, I observed that some phonemes are intuitively separable whereas some are not. However, even these separable phonemes contain different numbers of samples, while the input file format requires each data element to be represented as a fixed-size vector. So now I’m trying to check whether I can represent every phoneme as a fixed-size vector using different transforms, filters etc.

Thursday, April 15, 2010

Thursday, April 8, 2010

Want to type e-mails in Sinhalese ??


The easiest way of typing in a native language is to use a transliteration application. For Sinhalese the most famous application was the UCSC real-time Unicode converter that can be found here.
Now this is being replaced by Google transliteration labs, which provides Unicode conversion support for a number of languages including Sinhalese.
Compared to the Unicode converter, it provides an intuitive transliteration scheme.

Transliteration Vs Translation


Translation is the process of converting the meaning of a word from one language to another, whereas transliteration is the process of converting the sound of a word into another language. Transliteration thus allows typing other languages phonetically in English letters through these converters.
To embed the feature in Gmail and make life easy, refer to this.

Thursday, April 1, 2010

Using neural networks for speech recognition


Our 'Sinhala Speech Recognition system' is now showing elementary behavior, but we need to improve it further to get satisfactory performance. To achieve that we are using optimization techniques on several fronts, such as improving and training the language model and acoustic model, noise filtering techniques and machine learning approaches.
I’m going to use a neural network based approach to overcome the uncertainty due to variations between users and environmental noise. My intention is to use this blog entry to provide a step-by-step illustration of how to use a neural network in speech recognition.

Introduction to neural networks
A neural network is a collection of processing elements named neurons, each connected to other similar elements. The first artificial neural network was designed in the late 50’s and it was much simpler than any biological neural network. Biological neural networks are far more complex, and lots of studies are under way to discover the secrets behind these biological systems. Neural networks play their role in applications where regular computer programs have limitations, such as image recognition, speech recognition and decision making.
A neural network differs from regular programming in that it must be trained before performing a task, whereas in regular programming the task is programmed.

Structure of a neural network
A neural network consists of neurons, each with a set of inputs, a weight that multiplies each input, and an output. The output is calculated by applying an activation function to the sum of all inputs multiplied by their weights.
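The calculation of a single neuron's output can be sketched in a few lines. This is only an illustration of the description above: the sigmoid is my assumed choice of activation function (the text does not name one), no bias term is included, and the names neuron_output and sigmoid are hypothetical.

```python
import math


def sigmoid(x):
    """Logistic activation, an assumed choice for this sketch."""
    return 1.0 / (1.0 + math.exp(-x))


def neuron_output(inputs, weights, activation=sigmoid):
    """Apply the activation function to the weighted sum of the inputs."""
    if len(inputs) != len(weights):
        raise ValueError("each input needs exactly one weight")
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return activation(weighted_sum)


if __name__ == "__main__":
    # Weighted sum is 1.0*0.5 + 2.0*(-0.25) = 0, so the sigmoid gives 0.5.
    print(neuron_output([1.0, 2.0], [0.5, -0.25]))
```

Training a network then means adjusting the weights until the outputs match the desired ones.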

Friday, February 5, 2010

Avatar, Na'vi and networking

Nice film with breathtaking animations. But I wonder whether the fanciness of the surroundings has degraded the quality of the effort. However, more than the animations, I find the concept of the Na’vi creatures’ networking much more interesting. Those creatures have a powerful network to which they can upload information and from which they can download it. Rather than the artificial wired / wireless network topologies existing on earth, their network runs via the fibers running through trees and other creatures. Their sacred trees are the backbone of their network, and all the creatures, including the Na’vi and the animals, are plug-and-play workstations. All the creatures have a device (similar to a USB device) which can be plugged into this network or into other creatures via a corresponding port, and which facilitates the communication among them. On one hand, to earth people these creatures could seem to be machines rather than living beings, but still those creatures live with nature, unlike humankind on earth.

I wonder whether the Na’vi kind shows the possibility of a ‘genetic globalization / genetic networking’ that may be possible through genetic evolution someday.
Selective mutation by Mother Nature ensures that the features relevant to a species get enhanced and those that are unnecessary get deprecated and vanish. Communication, on the other hand, is essential for any species for its existence, reproduction and evolution, and even to interact with other species. Hence, just as our ancestors and other living beings at early stages of evolution came up with and improved their own languages, sounds or gestures to communicate, the further evolution of these methods should also be technically possible through genetic evolution. However, humankind has deviated from the natural pattern and pace of evolution and has gone for artificial communication and network topologies. This has generated a world that could have been very different had such evolution happened naturally.

What if no one had invented electricity, vacuum tubes, resistors and finally all digital media? Would we not be linked to each other as we are today via the internet? Would we not be able to communicate globally? What I think is that if we had adhered to the pace of evolution set by Mother Nature, she would have found us a way to be networked with each other just as the Na’vi creatures are, because the nature of genetic evolution is to upgrade in the positive direction and degrade in the negative direction. Hence, had we lived with nature and let nature evolve and develop us, the entire world could have been networked and globalized genetically rather than digitally.

Friday, January 1, 2010

Encryption with Rotor machines


As a requirement of the module ‘Computer Security’ we had to engage in a forum discussion about the level of security provided by a 5-disk rotor machine used to encrypt messages in the Arabic language. I thought of blogging some interesting details we shared during the discussion.

Cryptography is the science of conducting secure communication. The rotor machine was the first electro-mechanical encryption device intended to automate cryptography. Rotor machines became the most important cryptographic devices of the Second World War and remained dominant until the nineteen fifties.

Structure of rotor machine

Each rotor in a rotor machine maps a character at its input face to one on its output face, so that it implements a fixed mono-alphabetic substitution. A generic rotor machine consists of a number of such rotors. A plaintext character, which is the input to the first rotor, generates an output that becomes the input to the second rotor, and so on. Finally, the last rotor produces the corresponding ciphertext character. This idea can be illustrated using the following diagram.


If the rotor positions are fixed, the collection of rotors implements a mono-alphabetic substitution. This is the composition of the substitutions defined by the individual rotors. Because the encipherment of each plaintext character causes various rotors to move, a poly-alphabetic substitution results.
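The chain of rotors and their movement can be sketched in code. This is a toy model under several assumptions of mine, not a description of any real machine: it uses the 26-letter Latin alphabet for readability instead of Arabic, the wirings are random permutations generated from a seed, and the rotors step like an odometer (the first rotor moves on every letter, the next one only when the previous rotor wraps around). All function names are hypothetical.

```python
import random
import string

ALPHABET = string.ascii_uppercase  # assumed alphabet for this sketch
N = len(ALPHABET)


def make_rotors(count, seed):
    """Create `count` random wirings; each one is a permutation of 0..N-1."""
    rng = random.Random(seed)
    rotors = []
    for _ in range(count):
        wiring = list(range(N))
        rng.shuffle(wiring)
        rotors.append(wiring)
    return rotors


def step(positions):
    """Advance the rotors like an odometer (assumed stepping rule)."""
    for i in range(len(positions)):
        positions[i] = (positions[i] + 1) % N
        if positions[i] != 0:
            break  # no wrap-around, so the following rotors stay put


def encipher(text, rotors, start):
    """Pass each letter through all rotors in order, then step the rotors."""
    positions = list(start)
    out = []
    for ch in text:
        c = ALPHABET.index(ch)
        for wiring, pos in zip(rotors, positions):
            c = (wiring[(c + pos) % N] - pos) % N
        out.append(ALPHABET[c])
        step(positions)
    return "".join(out)


def decipher(text, rotors, start):
    """Undo encipher: inverse wirings, applied from the last rotor to the first."""
    positions = list(start)
    inverses = [[wiring.index(i) for i in range(N)] for wiring in rotors]
    out = []
    for ch in text:
        c = ALPHABET.index(ch)
        for inverse, pos in zip(reversed(inverses), reversed(positions)):
            c = (inverse[(c + pos) % N] - pos) % N
        out.append(ALPHABET[c])
        step(positions)
    return "".join(out)


if __name__ == "__main__":
    rotors = make_rotors(5, seed=2010)
    start = [3, 1, 4, 1, 5]  # the starting positions act as part of the key
    ciphertext = encipher("HELLOWORLD", rotors, start)
    print(ciphertext, decipher(ciphertext, rotors, start))
```

With the positions frozen, encipher would apply the same substitution to every letter; it is the call to step after each letter that makes the cipher poly-alphabetic. The input must consist of uppercase letters A to Z only.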


With an alphabet of 28 letters, when two rotors are next to each other and geared together, you have to type 28x28=784 letters before the key repeats. We can keep adding rotors next to each other if the key length is not sufficient. With a 5-disk rotor machine you obtain a period of 28^5 = 17,210,368 letters. Since Arabic is contextual, that is, the way it is written depends upon the context, a certain level of security is implicit in the language's characteristics. In standard Arabic style, letters have considerably different shapes depending on whether they connect with a preceding and/or a succeeding letter. Hence all primary letters have conditional forms depending on their position, that is, on whether the letter is at the beginning, middle or end of a word. For example, for some letters the middle form starts with a short horizontal line on the right to ensure that it connects with its preceding letter and, for some letters, a loop or longer line on the left finishes the word with a subtle ornamental flourish. Hence in reality more than 28 letters may come into play, providing a period longer than 17,210,368.
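The period figures can be checked with a one-line calculation. The sketch assumes the rotors are geared like an odometer, so that every combination of rotor positions is visited once before the key repeats; rotor_period is a hypothetical helper name.

```python
def rotor_period(alphabet_size, rotor_count):
    """Number of letters typed before the rotor positions repeat.

    Assumes odometer-style gearing, where all position combinations occur.
    """
    return alphabet_size ** rotor_count


if __name__ == "__main__":
    print(rotor_period(28, 2))  # two geared rotors
    print(rotor_period(28, 5))  # the 5-disk machine discussed here
```

If the positional forms of the letters were treated as separate symbols, the same function with a larger alphabet_size would give the longer period mentioned above.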

When a rotor machine is used, the rotor machine itself is the algorithm, and the way in which it is set up is the key. So when deciphering, the recipient needs to type the ciphertext letters into his rotor machine. If that machine is set up in exactly the same way as the sender’s, the plaintext can be recovered. But as with other types of cipher systems, if you don’t know the key it is really difficult to read the message even if the system used to encipher it is known.

If the key is unknown, finding the rotor setup is really hard. Say you have 5 rotors with an alphabet of 28 letters; then there are 28x28x28x28x28 distinct ways to set the starting position. In addition, the number of possible link-up pairs is extremely high and its calculation is complex. If you are interested, the calculation can be found at http://www.codesandciphers.co.uk/enigma/steckercount.htm . This makes breaking the code even more difficult.

At the same time there is still the problem of distributing the key. Conveying a long key securely to the parties who need it takes time, and mistakes can happen in key distribution.

If a fixed setup is used this might not cause much trouble. But in a military setting, as was the case with the German army, there is an immense problem, as the changed key needs to be distributed daily. What they did was use a key sheet specifying the required setup for each month. Hence when using rotors, some such trick is required to avoid the problems of distributing a long key.

In everyday writing, accents are often omitted in Arabic; the reader recognizes the words from experience as well as the context. This may pose a threat to the security of contextual languages. If a reader can recognize words from experience and context, then simple guesswork could do a great job in decrypting. That is, by identifying a few letters an attacker can understand the meaning of the plaintext without further effort. But again, only understanding would be possible, not modifying. If the plaintext is short, so that the key does not repeat, there is a possibility that the attacker can also fill in unidentified letters based on his/her prediction. However, this is less likely with a long plaintext.

Compared with other cipher designs, one advantage of the rotor machine is that it does not require extraordinary abilities from its users. This was not the case with other methods, as they required the patience to carry out lengthy, letter-perfect evolutions and were prone to uncertainty under time pressure or battle stress in a military context.