Crossed Logic: September 2007

Friday, September 14, 2007

Free ColdFusion Server

Since I have let the cat out of the bag that I am a bit of a Java enthusiast, I may as well admit to being a fan of ColdFusion, which integrates with Java really well. You can use object-oriented constructs in ColdFusion, and you can use Java to extend functionality or to provide a framework that is easily invoked from the web interface.

Anyway, I just saw that New Atlanta released a new version of their ColdFusion server, which is free. There are some restrictions on tag support that can be overcome by buying their full-blown version with J2EE capabilities or Adobe's ColdFusion server.

Whichever you choose, one good point is that ColdFusion and Java integrate really well, letting designers who are used to the web interface niceties of ColdFusion Markup Language (CFML) leverage business logic written in Java.

For example, you can create a normal Java class as follows:


package com.mydomain;
public class MyClass {
     public int getInt(){ return 1; }
}

Then you can create and use the object in CF through a few tags:


<cfset mc = createObject("java","com.mydomain.MyClass") />
<cfset myint = mc.getInt() />

CFScript can be pretty useful for making this flow resemble Java syntax, but using the CF tags isn't a bad way to go either, especially if the point is to separate duties between developers who are used to Java code and database design and web designers who are used to HTML style.
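To make the division of labor a little more concrete, here is a slightly larger sketch of the kind of Java class a CFML page could instantiate through createObject in the same way. The class name PriceCalculator, its tax rate, and its method are all hypothetical and exist only for illustration.

```java
// Hypothetical business-logic class (not from any real library).
// When deploying for use from CFML, declare it public in its own file
// and place it on the server's classpath.
class PriceCalculator {
    private final double taxRate;

    // Zero-argument constructor so the object can be created without extra setup.
    PriceCalculator() {
        this.taxRate = 0.07; // assumed rate, for illustration only
    }

    // Works in whole cents so the caller never handles fractional currency.
    public long totalCents(long unitCents, int quantity) {
        long subtotal = unitCents * quantity;
        return Math.round(subtotal * (1.0 + taxRate));
    }

    public static void main(String[] args) {
        System.out.println(new PriceCalculator().totalCents(1000, 3));
    }
}
```

From CFML the call would follow the same createObject pattern shown above. Depending on the server, numeric arguments may need an explicit conversion (for example with javaCast) before they match the Java parameter types.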

Happy coding!

Thursday, September 06, 2007

Concurrent Threading Series (J# Revisited)

After discussing the .NET approach to threading and asynchronous processing, I figured I would circle back to my goal of showing in J# how threading would be accomplished in the same manner it was in our Java application example. Here is what the code looks like (mostly from the different Microsoft references on the MSDN web site). The interesting thing I found is that the use of IAsyncResult is slightly different in J# and requires the use of an additional object, System.Runtime.Remoting.Messaging.AsyncResult.


import System.Threading.ThreadPool;
import System.Threading.WaitHandle;
import System.Runtime.Remoting.Messaging.AsyncResult;
import System.IAsyncResult;
import System.AsyncCallback;

/** @delegate */
public delegate String taskDelegate(String x);

public class Example
{
    public static int NUM_THRDS_CONCURRENT = 60;
    /** 
     * Initializations to be done only once by application
     * Setup thread pool globally.
     */
    static
    {
        int workerThreads = 0, completionPortThreads = 0;
        ThreadPool.GetMaxThreads(workerThreads, completionPortThreads);
        //If current pool size is less than our maximum, change the pool size
        boolean changePoolSize = false;
        if (workerThreads < NUM_THRDS_CONCURRENT)
        {
            workerThreads = NUM_THRDS_CONCURRENT;
            changePoolSize = true;
        }
        if (completionPortThreads < NUM_THRDS_CONCURRENT)
        {
            completionPortThreads = NUM_THRDS_CONCURRENT;
            changePoolSize = true;
        }
        if (changePoolSize) ThreadPool.SetMaxThreads(workerThreads, completionPortThreads);
    }

    public Example()
    {
        super();
    }

    public static void main(String[] args) {
        Example instance = new Example();
        System.out.println("Processing started " + System.DateTime.get_Now().ToString());
        instance.execute();
        instance = null;
    }

    public void execute()
    {
        int NUM_THRDS = 100;
        taskDelegate tasks[] = new taskDelegate[NUM_THRDS];
        IAsyncResult ars[] = new IAsyncResult[NUM_THRDS];
        WaitHandle _handles[] = new WaitHandle[NUM_THRDS];
     
        for(int i=0; i<NUM_THRDS; i++) {
            tasks[i] = new taskDelegate(task);
            ars[i] = tasks[i].BeginInvoke("Thread #" + i,
                new AsyncCallback(aCallback), 
                tasks[i]
            );
            _handles[i] = ars[i].get_AsyncWaitHandle();
        }

        // use wait handles to ensure threads complete before continuing
        // in this example we are just printing data,
        //    but this is useful if after async calls you need to sort/manipulate results.
        for (int i = 0; i < NUM_THRDS; i++)
        {
            _handles[i].WaitOne();
        }

        try
        {
            Thread.sleep(5000L);
        }
        catch (InterruptedException e)
        {
            // exception handling
        }

        tasks = null;
        ars = null;
    }

    /* Asynchronous callback method used to call EndInvoke and process results */
    public void aCallback(IAsyncResult ar) {
        taskDelegate t = (taskDelegate)ar.get_AsyncState();
        AsyncResult aResult = (AsyncResult)ar;
        taskDelegate temp = (taskDelegate)(aResult.get_AsyncDelegate());
        System.out.println(temp.EndInvoke(ar));
        t = null; temp = null; aResult = null;
    }

    public String task(String x)
    {
        try {
            Thread.sleep(2000L);
        } catch(InterruptedException e) {
            // exception handling
        }
        return (x.trim() + ": " + System.DateTime.get_Now().ToString());
    }
}

I tried to work in other practices here, such as using a static initializer to set the size of the pool (or any other data/objects you want set up once), but the remainder of the code comes straight from the documentation on the different .NET classes mentioned above. Note the use of the AsyncWaitHandles, as I have found they are a good way to signal the calling thread when all the async activities have completed. Alternatively, you can start a counter at the total number of threads and decrement it (i--) within the AsyncCallback method to track how many threads are still running.
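The counter alternative can be sketched in plain Java, the language this series started from. This is only a sketch under assumptions: the class and method names are hypothetical, the pool size of 4 is arbitrary, and CountDownLatch stands in for the hand-rolled counter because a bare i-- shared between callbacks would not be thread-safe.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;

class CountdownExample {
    // Runs taskCount trivial tasks and returns how many had finished
    // by the time the countdown reached zero.
    static int runAll(int taskCount) {
        ExecutorService pool = Executors.newFixedThreadPool(4); // arbitrary size
        CountDownLatch remaining = new CountDownLatch(taskCount);
        AtomicInteger finished = new AtomicInteger();
        for (int i = 0; i < taskCount; i++) {
            pool.execute(() -> {
                try {
                    finished.incrementAndGet(); // stand-in for real work
                } finally {
                    remaining.countDown();      // the "i--" step, done thread-safely
                }
            });
        }
        try {
            remaining.await();                  // blocks until the count hits zero
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        } finally {
            pool.shutdown();
        }
        return finished.get();
    }

    public static void main(String[] args) {
        System.out.println(runAll(10) + " tasks completed");
    }
}
```

The design point is the same as with the wait handles: the calling thread blocks once, on a single object, rather than polling each task.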

Bringing The Background To The Foreground

One point to make on multi-threaded applications, if I have not done a good job of making it clear, is that multi-threading should be done with a good deal of responsibility with regard to synchronization (protecting against threads colliding with each other), exception handling, and resource management.

With the discussion on using the thread pool or normal threads mentioned at the end of my last post, I realized I used the term "background" threads a good deal, but did not show how to implement one or what the difference was between a thread running in the background according to my definition and a background thread as implemented in a programming language.

A thread is either a background thread or a foreground thread. Background threads are identical to foreground threads, except that background threads do not prevent a process from terminating. Once all foreground threads belonging to a process have terminated, the common language run-time ends the process. Any remaining background threads are stopped and do not complete. -MSDN

Threads that belong to the managed thread pool (that is, threads whose IsThreadPoolThread property is true) are background threads. All threads that enter the managed execution environment from unmanaged code are marked as background threads. All threads generated by creating and starting a new Thread object are by default foreground threads. -MSDN

If you follow the first link above, the MSDN library entry has the following code illustrating the difference in how background and foreground threads are treated when the application terminates.


Imports System
Imports System.Threading

Public Class Test

    <MTAThread> _
    Shared Sub Main()
        Dim shortTest As New BackgroundTest(10)
        Dim foregroundThread As New Thread(AddressOf shortTest.RunLoop)
        foregroundThread.Name = "ForegroundThread"

        Dim longTest As New BackgroundTest(50)
        Dim backgroundThread As New Thread(AddressOf longTest.RunLoop)
        backgroundThread.Name = "BackgroundThread"
        backgroundThread.IsBackground = True

        foregroundThread.Start()
        backgroundThread.Start()
    End Sub

End Class

Public Class BackgroundTest

    Dim maxIterations As Integer

    Sub New(maximumIterations As Integer)
        maxIterations = maximumIterations
    End Sub

    Sub RunLoop()
        Dim threadName As String = Thread.CurrentThread.Name

        For i As Integer = 0 To maxIterations
            Console.WriteLine("{0} count: {1}", _
                    threadName, i.ToString())
            Thread.Sleep(250)
        Next i

        Console.WriteLine("{0} finished counting.", threadName)
    End Sub

End Class
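For readers coming from Java, the same distinction exists there under the name daemon threads: the JVM exits once every non-daemon thread has finished and abandons any daemon threads still running. The following is a rough Java counterpart to the MSDN sample, not a translation of it; the class and helper names are my own.

```java
class DaemonExample {
    // Counts from 0 to max, pausing between steps, like RunLoop in the VB sample.
    static Runnable counter(int max) {
        return () -> {
            String name = Thread.currentThread().getName();
            for (int i = 0; i <= max; i++) {
                System.out.println(name + " count: " + i);
                try {
                    Thread.sleep(250);
                } catch (InterruptedException e) {
                    return; // stop counting if interrupted
                }
            }
            System.out.println(name + " finished counting.");
        };
    }

    // Marks the thread as a daemon, Java's equivalent of IsBackground = True.
    static Thread newBackgroundThread(Runnable work, String name) {
        Thread t = new Thread(work, name);
        t.setDaemon(true); // must be set before start()
        return t;
    }

    public static void main(String[] args) {
        Thread foreground = new Thread(counter(10), "ForegroundThread");
        Thread background = newBackgroundThread(counter(50), "BackgroundThread");
        foreground.start();
        background.start();
        // main returns here; the JVM exits once ForegroundThread ends,
        // abandoning BackgroundThread partway through its count.
    }
}
```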

As stated previously, I have found "background" threads really useful for statistical analysis/logging activities that are not necessary to fulfill the user's request but need data from the request to get started. So that the user doesn't wait a long time for a response, the application calculates the user's results while spawning a background thread to perform whatever business logic, file I/O, database calls, etc. the logging requires.

This works well as long as you keep in mind whether it is important to you that any statistical threads still running at application shutdown are allowed to finish. With .NET typically being hosted on a Microsoft IIS server where application pools recycle periodically, the need to use foreground threads or to handle background threads programmatically on application end (global.asax) may be more pressing.
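The idea of handling outstanding statistics work at application end can be sketched in Java, where the usual tool is an executor that is shut down and then given a bounded amount of time to drain. The StatsLogger class below is hypothetical; in an ASP.NET application the equivalent call would sit in the Application_End handler of global.asax.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

class StatsLogger {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final AtomicInteger recorded = new AtomicInteger();

    // Called on the request path; queues the entry and returns immediately.
    void record(String event) {
        worker.execute(() -> {
            // Stand-in for the file or database write that would store the event.
            if (event != null) {
                recorded.incrementAndGet();
            }
        });
    }

    // Called when the application is ending: stop accepting new work,
    // then wait a bounded time for queued entries to be written.
    boolean drain(long timeoutMillis) {
        worker.shutdown();
        try {
            return worker.awaitTermination(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    int recordedCount() {
        return recorded.get();
    }
}
```

Whether a timeout is acceptable, or the statistics must never be lost, is exactly the judgment call described above.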

I will try to be careful on how I use the term "background" as I tend to use this to refer to any thread that does not need to run while a user is waiting/watching.

Anyway, just a little tidbit of information to add to the arsenal.

Code Formatter for Blogs/Web Pages

Since I have been copying, pasting, and trying to format code pretty frequently of late, I thought I should post a really cool web tool I found that is great for formatting code snippets for placement in blog posts or HTML in general. If you have not seen the new look of my previous posts, which I have retrofitted using this tool, take a look.

http://www.manoli.net/csharpformat/

You will find the tool is good for VB and C#. Since Java is very close to C# in color coding scheme, I used the tool with the C# selection and it worked out decently.

The formatting uses CSS, so you will have to copy the following CSS code into the template page of your blog or update the linked CSS file from the site with this code and reference as indicated.


.csharpcode, .csharpcode pre
{
    font-size: small;
    color: black;
    font-family: Consolas, "Courier New", Courier, Monospace;
    background-color: #f4f4f4;
    white-space: pre-wrap;       /* css-3 */
    white-space: -moz-pre-wrap;  /* Mozilla, since 1999 */
    white-space: -pre-wrap;      /* Opera 4-6 */
    white-space: -o-pre-wrap;    /* Opera 7 */
    word-wrap: break-word;       /* Internet Explorer 5.5+ */
}

.csharpcode pre { margin: 0em; }

.csharpcode .rem { color: #008000; }

.csharpcode .kwrd { color: #0000ff; }

.csharpcode .str { color: #006080; }

.csharpcode .op { color: #0000c0; }

.csharpcode .preproc { color: #cc6633; }

.csharpcode .asp { background-color: #ffff00; }

.csharpcode .html { color: #800000; }

.csharpcode .attr { color: #ff0000; }

.csharpcode .alt
{
    background-color: #f4f4f4;
    width: 100%;
    margin: 0em;
}

.csharpcode .lnum { color: #606060; }

Update: added text-wrap rules for browsers that don't wrap automatically when the pre HTML element is used.

Calling All Threads Again (Async Revisited)

overview, calling all threads

So in the last post, we illustrated how to recreate Java code for parallel processing in Visual Basic .NET. One of the design features used to retrieve data from a task submitted to the thread pool work item queue was a wait object that ensured the thread had completed before we tried to collect the results.

Well, that is the point of asynchronous processing, right? Invoke a thread and then continue to do other work in the main thread until the work is complete. In our case, the additional processing we want to do is to start more threads, or at least submit them to the queue, and then we are content to wait for the results. Here is a look at the .NET asynchronous programming methodology for accomplishing what my implementation of ATask did with its hand-written BeginTask and EndTask routines. This should look very familiar, as asynchronous methods within .NET employ the same logic of beginning/invoking an action with one method and then ending/responding through another. The difference is that .NET creates all these methods for us, versus my manually coded ones in the previous post.

...
    Delegate Function TaskDelegate(ByVal x As String) As String

    Sub Main()
        Dim str As String = "test"
        Dim t As TaskDelegate = New TaskDelegate(AddressOf ATask)
        Dim ar As IAsyncResult = t.BeginInvoke(str, Nothing, Nothing)
        Console.WriteLine(t.EndInvoke(ar))
        ar = t.BeginInvoke(str + " second run!", New AsyncCallback(AddressOf ACallback), t)
        'Give the callback time to run before Main exits (Sleep takes an Integer)
        System.Threading.Thread.Sleep(5000)
    End Sub

    Function ATask(ByVal x As String) As String
        Return x.Trim()
    End Function

    Sub ACallback(ByVal ar As IAsyncResult)
        Dim t As TaskDelegate = DirectCast(ar.AsyncState, TaskDelegate)
        Console.WriteLine(t.EndInvoke(ar))
    End Sub

...

The first piece of this code is the creation of a worker function/class like the ATask.Task method from my last post. Above, this is represented by my highly complicated method ATask(String). Second, a delegate is declared with the same signature as the function/subroutine created to do the work.

As stated earlier, .NET takes care of creating the BeginInvoke and EndInvoke methods for the delegate. The first call to t.BeginInvoke starts processing our task with the parameter data contained in the variable str. t.EndInvoke blocks until the task has finished and returns a String, as we specified through the delegate's signature, so in our simple example the result is fed right to the console.

For more of a "fire and forget" implementation, we use the remaining two parameters of t.BeginInvoke, which we initially ignored by passing Nothing (null). The first of the two specifies a callback method with a signature like that of ACallback; this method handles calling t.EndInvoke. The second is a state object, used here to pass the delegate object itself to the callback. As you can see in the second call to t.BeginInvoke, we start the processing of work, and the results are collected by the callback method when they are ready.
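For comparison, the callback style just described has a close relative in modern Java. The sketch below uses CompletableFuture (Java 8 and later, so newer than the JDK 5 code this series started from); the class name and the result list are hypothetical and stand in for the console output.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

class CallbackExample {
    // Same trivial worker as ATask in the VB sample.
    static String aTask(String x) {
        return x.trim();
    }

    // Starts aTask asynchronously and registers a callback that receives the result.
    static CompletableFuture<Void> fireAndForget(String input, List<String> sink) {
        return CompletableFuture
                .supplyAsync(() -> aTask(input)) // roughly the role of BeginInvoke
                .thenAccept(sink::add);          // roughly ACallback plus EndInvoke
    }

    public static void main(String[] args) {
        List<String> results = new CopyOnWriteArrayList<>();
        // join() is here only so the demo prints before the JVM exits.
        fireAndForget("  test second run!  ", results).join();
        System.out.println(results);
    }
}
```

As in the VB version, the caller never asks for the result directly; it is handed to the callback when ready.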

Where's The Pool

If you have been following along and aren't an expert at this, you may be wondering what happened to the thread pool, since we have been promoting pooling because our worker thread count can go from one to 10,000. The fixed thread pool is great for ensuring that our system/application stays stable, so why eliminate it? Well, from what I have gathered, asynchronous calls like these always run in the shared thread pool provided by the Common Language Runtime (CLR). Therefore, my understanding is that we do not need to submit these tasks to the user work item queue ourselves.

With everything swimming around in a shared thread pool, there is a potential that other managed code or long-running processes may be using threads needed by this specific application, or vice versa, so one thought is to use custom thread pooling. Custom thread pool development is not covered in this post, but I have found many different references to a fully implemented C# version from Mike Woodring. This would leave adequate resources for processes using the shared thread pool that we can't control, while ensuring that our application does not create an exorbitant number of threads.

One interesting thought I had on this was to see whether, using standard threads, I could code a mechanism that launches a finite number of threads (e.g. 15) and then monitors for the completion of any of them, so that another iteration of the task can be spawned in that thread until all the tasks are complete. This is the concept behind a thread pool; I am just wondering if it can be done without all the code associated with a full-blown thread pool. This intrigues me, as I can think of applications where it may be useful to spawn a set of processes within a thread pool and then, to make them run faster, spawn numerous non-pooled background threads. This eliminates some possibilities of deadlocks or other issues with using the shared pool, as our threads would not need to stay in the pool long; they would return rapidly. The pool would just be used for queueing the iterations of the main working process, while the background threads would work on finite tasks within a working process instance.
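A minimal sketch of that idea, written in Java rather than .NET and under my own assumptions: a fixed number of plain threads share one queue, and each thread picks up the next task as soon as it finishes the previous one. The class name FixedWorkers is hypothetical, and this is not how a full thread pool is implemented (there is no thread reuse across batches, no error handling, and no late task submission).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;

class FixedWorkers {
    // Runs every task using exactly workerCount plain threads and
    // returns once all of them have finished.
    static void runAll(List<Runnable> tasks, int workerCount) {
        Queue<Runnable> pending = new ConcurrentLinkedQueue<>(tasks);
        List<Thread> workers = new ArrayList<>();
        for (int i = 0; i < workerCount; i++) {
            Thread t = new Thread(() -> {
                Runnable next;
                // poll() returns null once the queue has been drained
                while ((next = pending.poll()) != null) {
                    next.run();
                }
            }, "worker-" + i);
            workers.add(t);
            t.start();
        }
        for (Thread t : workers) {
            try {
                t.join(); // wait for each worker to run out of tasks
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
        }
    }
}
```

Calling runAll(tasks, 15) with 100 tasks would never have more than 15 of them in flight at once, which is the stability property the fixed pool was providing.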

Why not do this all with background threads if we can control their execution to be efficient?

We will see how my crazy mind develops this thought. It may be easier just to grab the custom thread pool from Mike Woodring's site and reference a compiled DLL that already has this work done. Why re-invent the wheel, right? That is probably why I found other sites referencing DevelopMentor.ThreadPool: they just decided to copy his too.

The adventure continues...

Wednesday, September 05, 2007

Calling All Threads

example source code, overview

There is always room for improvement, so I won't say we are at journey's end; however, this is at least in the right vicinity. The original intent of this series was to show concurrent programming in .NET as it was done in a real life Java application. There are components of this transition that are fairly straightforward as some objects conceptually map directly to that of original implementation, despite syntactical differences. My hope is that the information presented below will be useful in bringing the business and programming concepts together as well as point out places where the implementation is not as obvious.

Firstly, let's begin with the creation of the "thread" class. Microsoft .NET does not allow inheritance from the Thread class, nor does it have Runnable or Callable interfaces to implement. Instead, using an additional class or a member function within the main object, threads are created through the use of a delegate function. Threads are instantiated by calling the constructor of System.Threading.Thread and passing it a reference to the delegate function.


Dim Thread1 as New System.Threading.Thread(AddressOf delegateFunction)
Thread1.Start()
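For Java developers mapping this back to familiar ground, the closest equivalent is handing a Runnable to the Thread constructor, and a method reference plays much the same role as AddressOf. A small sketch, with hypothetical class and method names:

```java
class ThreadStartExample {
    private volatile String result = "not run";

    // Plays the role of delegateFunction in the VB snippet.
    void work() {
        result = "done on " + Thread.currentThread().getName();
    }

    // Starts the thread, waits for it, and reports what the worker recorded.
    String runAndWait() {
        Thread thread1 = new Thread(this::work, "Thread1"); // method reference, much like AddressOf
        thread1.start();
        try {
            thread1.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(new ThreadStartExample().runAndWait());
    }
}
```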

However, as we discussed in previous posts, we want to use pooled threads versus multiple threads called in succession. With that said, the format to spawn a thread in .NET into a pool requires the submission of a WaitCallback delegate, which has a signature of delegateFunction(Arg As Object), to the shared System.Threading.ThreadPool object that can be likened (hopefully without ridicule) to the fixed thread pool ExecutorService we utilized in Java. The object argument passed to the delegate function holds both the input parameters and variable(s) for output after thread processing has completed. See the AStateObject class in the linked source files as an example.

The general setup of a delegate function and use of thread pooling is shown below:

...
Sub BeginTask(ByVal thrdNum As Integer, ByVal isVarCalcTime As Boolean)
   state = New AStateObject(thrdNum, isVarCalcTime)
   'ThreadPool is a shared/static system object
   'that can be likened to the Java ExecutorService object
   ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf Task), state)
End Sub

Function EndTask() As Object
   semaphore.WaitOne()
   Return state.Status
End Function

Sub Task(ByVal StateObj As Object)
   Dim varCalcTime As Long = AStateObject.DEFAULT_VAR_CALC_TIME
   Dim thrdnum As Integer = AStateObject.DEFAULT_THRD_NUM
   Dim isVarCalcTime As Boolean = AStateObject.DEFAULT_IS_VAR_CALC_TIME
   ' Use the state object fields as arguments.
   Dim StObj As AStateObject = CType(StateObj, AStateObject)
   isVarCalcTime = StObj.IsVarCalcTime
   thrdnum = StObj.ThrdNum
   If isVarCalcTime Then varCalcTime = AStateObject.getRandomCalcTime()

   'Can use SyncLock or other synchronization techniques
   'for real life calc that may hit backend systems
   'Data that may be touched by multiple tasks simultaneously
   Thread.Sleep(CInt(2000L + varCalcTime)) 'Sleep takes an Integer, so convert the Long
   'SyncLock StObj
   StObj.Status = Date.Now() 'Return value
   'End SyncLock
   semaphore.Set()   ' Signal that the thread is done.
End Sub

...

The Task subroutine is the delegate that does the processing we want. It can be coded to perform business logic directly or to call a web service or other API that does the processing/calculations for us. The object passed to the delegate is effectively passed by reference despite the ByVal in the method signature, because what is passed by value is the address of the object itself; therefore, interactions with this object, like setting Status (our return data), set values in the original instance of AStateObject. The thread pool is handed a user work item consisting of the WaitCallback delegate and the associated AStateObject.
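Java behaves the same way, which may help make the point about object references concrete. In this sketch (class names are hypothetical), changing a field through the parameter is visible to the caller, while reassigning the parameter itself is not:

```java
class StateObject {
    String status = "pending";
}

class PassByValueDemo {
    // Mutating through the reference changes the caller's object.
    static void complete(StateObject s) {
        s.status = "done";
    }

    // Reassigning the parameter only changes the local copy of the reference.
    static void replace(StateObject s) {
        s = new StateObject();
        s.status = "replaced";
    }

    public static void main(String[] args) {
        StateObject state = new StateObject();
        complete(state);
        replace(state);
        System.out.println(state.status);
    }
}
```

This is why a single state object can carry both the inputs and the output of a pooled task.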

As with the Java implementation, it is most efficient to submit all the threads to the pool for processing and then circle back and retrieve each result once its thread is ready to yield it. If we wait for a thread to finish and return its result at the same time we add it to the user work item queue, then we may as well run the tasks sequentially, or let the main process iterate over the tasks without the overhead of creating more threads. Assuming we want the efficiency, what you will see in the class ATask is that the invocation/queuing of the task is done in BeginTask, and the response is then retrieved by calling EndTask.

The calls from the main process occur in this phased fashion as well, very similar to how it was implemented in Java as we also used three sections there: create threads; read results/wait; display results.
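As a reminder of what that phased shape looks like on the Java side, here is a condensed sketch using a fixed thread pool. It is not the original application code; the class name, the task body, and the pool size are placeholders.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

class SubmitThenCollect {
    static List<String> run(int taskCount, int poolSize) {
        ExecutorService pool = Executors.newFixedThreadPool(poolSize);
        List<Future<String>> futures = new ArrayList<>();
        // Phase 1: queue every task without waiting on any of them.
        for (int i = 0; i < taskCount; i++) {
            final int n = i;
            futures.add(pool.submit(() -> "Thread #" + n));
        }
        // Phase 2: circle back; get() blocks only if that task is still running.
        List<String> results = new ArrayList<>();
        for (Future<String> f : futures) {
            try {
                results.add(f.get());
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                results.add("Empty");
            } catch (ExecutionException e) {
                results.add("Empty"); // the task itself threw
            }
        }
        pool.shutdown();
        return results;
    }

    public static void main(String[] args) {
        // Phase 3: display the results.
        for (String line : run(100, 10)) {
            System.out.println(line);
        }
    }
}
```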

...
    Dim NUM_THRDS As Integer = 100, I As Integer
    'VB array bounds are inclusive, so (NUM_THRDS - 1) yields exactly NUM_THRDS elements
    Dim tasks(NUM_THRDS - 1) As ATask
    Dim status(NUM_THRDS - 1) As Object

    'Start threads
    For I = 0 To (NUM_THRDS - 1)
        tasks(I) = New ATask()
        tasks(I).BeginTask(I, True)
    Next

    'Get the result of the threads
    For I = 0 To (NUM_THRDS - 1)
        Try
            'Not using any processing that will throw an error,
            'but just wanted to show as an example
            status(I) = tasks(I).EndTask()
        Catch e As Exception
            status(I) = Nothing
        End Try
    Next

    For I = (NUM_THRDS - 1) To 0 Step -1
        If status(I) Is Nothing Then status(I) = "Empty"
        Console.WriteLine("Thread #" + I.ToString + ": " + status(I).ToString())
    Next

    'Clean-up objects used
    status = Nothing
    tasks = Nothing

...

The significance of this approach is that once a process has been put into a thread pool, its results cannot be extracted until it has fully completed. Therefore, the EndTask function waits on the thread, making the middle block of code very much sequential; but since the threads run concurrently beforehand, this adds minimal if any processing overhead, as you should never wait longer than the longest-running thread in the bunch. Since "run time" in the context of the longest-running process includes the time a thread must sit in the queue, now may be a good time to show the syntax for increasing the thread pool size.


ThreadPool.SetMaxThreads(workerThreads, completionPortThreads)

Both parameters are integers representing the maximum quantity of each type of thread available in the pool. Since .NET employs a shared thread pool, this follows the principle of finding the right number of threads for all concurrent users of the application. While the application is mostly idle, users will get a faster response; responses then slow down as load grows, which keeps the system stable, since too many threads can be as lethal as not enough.
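On the Java side, the nearest counterpart is resizing a ThreadPoolExecutor. The sketch below is only loosely analogous: a Java pool belongs to whoever created it rather than being shared process-wide, and the helper names are my own. The order of the two setter calls matters because the core size may not exceed the maximum.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

class PoolSizing {
    // Builds a fixed-size pool, equivalent to Executors.newFixedThreadPool(size)
    // but typed so that it can be resized later.
    static ThreadPoolExecutor newPool(int size) {
        return new ThreadPoolExecutor(size, size, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
    }

    // Grows or shrinks the pool while keeping core <= maximum at every step.
    static void resize(ThreadPoolExecutor pool, int newSize) {
        if (newSize > pool.getMaximumPoolSize()) {
            pool.setMaximumPoolSize(newSize); // raise the ceiling first
            pool.setCorePoolSize(newSize);
        } else {
            pool.setCorePoolSize(newSize);    // lower the core first
            pool.setMaximumPoolSize(newSize);
        }
    }

    public static void main(String[] args) {
        ThreadPoolExecutor pool = newPool(25);
        resize(pool, 60);
        System.out.println(pool.getMaximumPoolSize());
        pool.shutdown();
    }
}
```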

Putting it all together, we have an application that emulates our Java example and yields pretty equal results (see example output). Hopefully this was successful in meeting my intent, which was to place links and thoughts around multi-threading in .NET (for Java developers) especially around concurrency. In trying to stick closely to the layout/concepts in the Java implementation, I am sure I have put together some odd .NET constructs, so all you VB/.NET purists out there must forgive me.

10,000 threads are still better than one if used tactically, especially as we move to a world of even faster response times with asynchronous interfaces. A follow-up to this exercise may be to show how to improve the user interface portion of the application to display some of the results immediately to browser/screen like how many items are in stock today as the program processes the results for items that need to be built for example in our item availability service. We'll see if I can wrap my brain around using the IAsyncResult process in-line with the other concepts here, but, until then...

keep evolving development

Saturday, September 01, 2007

Migrating to J# From Java

overview

Microsoft's J# is like cappuccino: it is pretty much Java with some extra frosting on top.

Microsoft's J#, syntactically, is very much the same as Java; therefore, when this adventure to migrate the availability code to .NET presented itself, my first thought was "convert to J#." Unfortunately, with the features we explored previously being functionality of JDK 5.0, J# could not actually accomplish this task.

According to Microsoft, J#'s run-time libraries will remain frozen at the JDK 1.1.4 level; there are no plans to emulate later versions of the JDK. If J# catches on, it's likely that users will fill in the gaps over time. Microsoft considers JDK compatibility largely irrelevant. The current level supports Visual J++ projects, and whatever is missing from the JDK compatibility libraries is covered by the .Net framework.
J#'s interface to the .Net framework is solid, but not as seamless as C#. In particular, J# code cannot define new .Net properties, events, value types, or delegates. J# can make use of these language constructs if they are defined in an assembly written in another language, but its inability to define new ones limits J#'s reach and interoperability compared to other .Net languages.
-Tom Yager, Infoworld

Presumably due to legal issues restricting Microsoft from copying Java bytecode for bytecode, coupled with a strategic desire to push developers towards using core .NET functionality to accomplish the tasks done in later Java versions, the J# support for newer JDKs will probably not be updated to cover the holes in its current capabilities.

With that said, for my case study, I have come to the realization that the code must be rewritten in .NET, and it will be in Visual Basic. C# is probably more powerful than Visual Basic .NET; however, the target business system is written in Visual Basic, which makes it the more desirable choice for this project.

Don't get me wrong, though: there is a good deal of power in being able to compile Java directly into .NET using J#, but be aware of the limitations. It took me a while to stop seeing 1.1.4 and thinking 1.4, so for the application I was converting there was too much of a gap.

For you developers who have Java applications that would be hard to rewrite, J# .NET may fit the need, especially when using tools such as IKVM or JBImp that allow conversion of Java bytecode to Microsoft intermediate language (MSIL). However, this may also be a good argument for using C#, which is syntactically as close to Java as C/C++ is. After converting Java classes or archive (JAR) files to .NET libraries, migration of the application to C# becomes easier.
