Friday, September 14, 2007

Free ColdFusion Server

Since I have let the cat out of the bag that I am a bit of a Java enthusiast, I may as well admit to being a fan of ColdFusion, which integrates with Java very well. You can use object-oriented constructs in ColdFusion as well as use Java to extend functionality or provide a framework that can be executed easily from the web interface.

Anyway, I just saw that New Atlanta released a new version of their ColdFusion server, which is free. There are some restrictions on tag support that can be overcome by buying their full-blown version with J2EE capabilities or Adobe's ColdFusion server.

Either way, the good news is that ColdFusion and Java integrate well enough to let designers use the web-interface niceties of ColdFusion Markup Language (CFML) while leveraging business logic written in Java.

For example, you can create a normal Java class as follows:


package com.mydomain;

public class MyClass {
    public int getInt() { return 1; }
}

Then you can create and use the object in CF through a few tags:


<cfset mc = createObject("java","com.mydomain.MyClass") />
<cfset myint = mc.getInt() />

CFScript can be pretty useful in making this flow resemble Java syntax itself, but using the CF tags isn't a bad way to go either, especially if the point is separating duties between developers used to Java code and database design and web designers used to HTML styling.
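
To make the division of duties concrete, here is a slightly fuller, purely hypothetical Java class in the same style as MyClass above; a designer could call its greet() method from a <cfset> exactly the way getInt() was called, passing arguments straight from CFML.

package com.mydomain;

// Hypothetical business-logic class, callable from CFML via createObject("java", ...)
public class GreetingService {

    // Arguments passed from a <cfset> arrive as ordinary Java parameters
    public String greet(String name) {
        if (name == null || name.trim().length() == 0) {
            return "Hello, guest!";
        }
        return "Hello, " + name.trim() + "!";
    }
}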

Happy coding!

Thursday, September 06, 2007

Concurrent Threading Series (J# Revisited)

After discussing the .NET approach to threading and asynchronous processing, I figured I would circle back to my goal of showing how threading would be accomplished in J# in the same manner it was in our Java application example. Here is what the code looks like (mostly pieced together from the various Microsoft references on the MSDN web site). The interesting thing I found is that the use of IAsyncResult is slightly different in J# and requires an additional object, System.Runtime.Remoting.Messaging.AsyncResult.


import System.Threading.ThreadPool;
import System.Threading.WaitHandle;
import System.Runtime.Remoting.Messaging.AsyncResult;
import System.IAsyncResult;
import System.AsyncCallback;

/** @delegate */
public delegate String taskDelegate(String x);

public class Example
{
    public static int NUM_THRDS_CONCURRENT = 60;

    /**
     * Initializations to be done only once by the application:
     * set up the thread pool globally.
     */
    static
    {
        int workerThreads = 0, completionPortThreads = 0;
        ThreadPool.GetMaxThreads(workerThreads, completionPortThreads);
        // If the current pool size is less than our maximum, change the pool size
        boolean changePoolSize = false;
        if (workerThreads < NUM_THRDS_CONCURRENT)
        {
            workerThreads = NUM_THRDS_CONCURRENT;
            changePoolSize = true;
        }
        if (completionPortThreads < NUM_THRDS_CONCURRENT)
        {
            completionPortThreads = NUM_THRDS_CONCURRENT;
            changePoolSize = true;
        }
        if (changePoolSize) ThreadPool.SetMaxThreads(workerThreads, completionPortThreads);
    }

    public Example()
    {
        super();
    }

    public static void main(String[] args) {
        Example instance = new Example();
        System.out.println("Processing started " + System.DateTime.get_Now().ToString());
        instance.execute();
        instance = null;
    }

    public void execute()
    {
        int NUM_THRDS = 100;
        taskDelegate tasks[] = new taskDelegate[NUM_THRDS];
        IAsyncResult ars[] = new IAsyncResult[NUM_THRDS];
        WaitHandle _handles[] = new WaitHandle[NUM_THRDS];

        for (int i = 0; i < NUM_THRDS; i++) {
            tasks[i] = new taskDelegate(task);
            ars[i] = tasks[i].BeginInvoke("Thread #" + i,
                new AsyncCallback(aCallback),
                tasks[i]
            );
            _handles[i] = ars[i].get_AsyncWaitHandle();
        }

        // Use wait handles to ensure the threads complete before continuing.
        // In this example we are just printing data, but this is useful if,
        // after the async calls, you need to sort/manipulate results.
        for (int i = 0; i < NUM_THRDS; i++)
        {
            _handles[i].WaitOne();
        }

        try
        {
            Thread.sleep(5000L);
        }
        catch (InterruptedException e)
        {
            // exception handling
        }

        tasks = null;
        ars = null;
    }

    /* Asynchronous callback method used to call EndInvoke and process results */
    public void aCallback(IAsyncResult ar) {
        taskDelegate t = (taskDelegate)ar.get_AsyncState();
        AsyncResult aResult = (AsyncResult)ar;
        taskDelegate temp = (taskDelegate)(aResult.get_AsyncDelegate());
        System.out.println(temp.EndInvoke(ar));
        t = null; temp = null; aResult = null;
    }

    public String task(String x)
    {
        try {
            Thread.sleep(2000L);
        } catch (InterruptedException e) {
            // exception handling
        }
        return (x.trim() + ": " + System.DateTime.get_Now().ToString());
    }
}

I tried to incorporate other examples/practices here, like the use of a static initializer to set the size of the pool (or to set up any other data/objects you want handled only once), but the remainder of the code is straight from the documentation on the different .NET classes mentioned above. Note the use of the AsyncWaitHandles: I have found they are a good way to signal the calling thread when all the async activities have completed. Otherwise, you can start a counter at the total number of threads and decrement it within the AsyncCallback method to track the number of threads still running.
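
Here is a minimal sketch of that counter-based alternative, written as additions to the Example class above (the pending/pendingLock field names and the 50 ms polling interval are my own choices, not from the original code):

// Counter-based alternative (sketch): additions to the Example class above.
// Set pending = NUM_THRDS before the BeginInvoke loop and pass
// new AsyncCallback(aCallbackCounting) instead of aCallback.
private static int pending = 0;
private static Object pendingLock = new Object();

// Callback that prints the result and then decrements the counter
public void aCallbackCounting(IAsyncResult ar) {
    AsyncResult aResult = (AsyncResult)ar;
    taskDelegate temp = (taskDelegate)(aResult.get_AsyncDelegate());
    System.out.println(temp.EndInvoke(ar));
    synchronized (pendingLock) { pending--; }
}

// Called from execute() in place of the WaitOne() loop:
// blocks (by polling) until every callback has run
private void waitForPending() throws InterruptedException {
    while (true) {
        synchronized (pendingLock) { if (pending <= 0) break; }
        Thread.sleep(50);
    }
}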

Bringing The Background To The Foreground

One point to make on multi-threading applications, if I have not done a good job of making it clear already, is that it should be approached with a good deal of responsibility with regard to synchronization (protecting against threads colliding with each other), exception handling, and resource management.

With the discussion of using the thread pool versus normal threads at the end of my last post, I realized I have used the term "background" threads a good deal, but did not show how to implement one or explain the difference between a thread running in the background according to my definition and a background thread as implemented in the programming language.

A thread is either a background thread or a foreground thread. Background threads are identical to foreground threads, except that background threads do not prevent a process from terminating. Once all foreground threads belonging to a process have terminated, the common language run-time ends the process. Any remaining background threads are stopped and do not complete. -MSDN

Threads that belong to the managed thread pool (that is, threads whose IsThreadPoolThread property is true) are background threads. All threads that enter the managed execution environment from unmanaged code are marked as background threads. All threads generated by creating and starting a new Thread object are by default foreground threads. -MSDN


If you follow the first link above, the MSDN library entry has the following code illustrating the difference in how background and foreground threads are treated at termination of the application.

Imports System
Imports System.Threading

Public Class Test

    <MTAThread> _
    Shared Sub Main()
        Dim shortTest As New BackgroundTest(10)
        Dim foregroundThread As New Thread(AddressOf shortTest.RunLoop)
        foregroundThread.Name = "ForegroundThread"

        Dim longTest As New BackgroundTest(50)
        Dim backgroundThread As New Thread(AddressOf longTest.RunLoop)
        backgroundThread.Name = "BackgroundThread"
        backgroundThread.IsBackground = True

        foregroundThread.Start()
        backgroundThread.Start()
    End Sub

End Class

Public Class BackgroundTest

    Dim maxIterations As Integer

    Sub New(maximumIterations As Integer)
        maxIterations = maximumIterations
    End Sub

    Sub RunLoop()
        Dim threadName As String = Thread.CurrentThread.Name

        For i As Integer = 0 To maxIterations
            Console.WriteLine("{0} count: {1}", _
                threadName, i.ToString())
            Thread.Sleep(250)
        Next i

        Console.WriteLine("{0} finished counting.", threadName)
    End Sub

End Class

As stated previously, I have found "background" threads really useful for statistical analysis/logging activities that are not necessary for the functionality of the user request but need data from the request to get started. So that the user doesn't wait a long time for a response, the application calculates results for the user while spawning a background thread to perform whatever business logic, file I/O, database calls, etc. are necessary.
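
For Java readers, the closest analogue to a .NET background thread is a daemon thread. Here is a minimal sketch of the spawn-and-return pattern described above (the body of the Runnable is just a placeholder):

// Java analogue (sketch): a daemon thread, like a .NET background thread,
// does not prevent the JVM from exiting once all non-daemon threads finish.
Thread statsLogger = new Thread(new Runnable() {
    public void run() {
        // ... statistics/logging work driven by the request data goes here ...
    }
});
statsLogger.setDaemon(true);  // mark it as a "background" thread
statsLogger.start();
// the main thread returns the response to the user without waiting on statsLogger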

This works well as long as you keep in mind whether it matters to you that any statistical threads still running at application shutdown be allowed to finish. With .NET applications typically being hosted on a Microsoft IIS server, where application pools recycle periodically, the need to use foreground threads or to programmatically handle background threads at application end (global.asax) may be more prevalent.

I will try to be careful about how I use the term "background," as I tend to use it to refer to any thread that does not need to run while a user is waiting/watching.

Anyway, just a little tidbit of information to add to the arsenal.

Code Formatter for Blogs/Web Pages

Since I have been copying and pasting code pretty frequently lately and trying to format it, I thought I should post a really cool web tool I found that is great for formatting code snippets for placement in blog posts or HTML in general. If you have not seen the new look of my previous posts, which I have retrofitted using this tool, take a look and see.

http://www.manoli.net/csharpformat/

You will find the tool is good for VB and C#. Since Java is very close to C# in color-coding scheme, I used the tool with the C# selection and it worked out decently.

The formatting uses CSS, so you will have to copy the following CSS code into the template page of your blog, or update the linked CSS file from the site with this code and reference it as indicated.


.csharpcode, .csharpcode pre
{
    font-size: small;
    color: black;
    font-family: Consolas, "Courier New", Courier, Monospace;
    background-color: #f4f4f4;
    white-space: pre-wrap;      /* css-3 */
    white-space: -moz-pre-wrap; /* Mozilla, since 1999 */
    white-space: -pre-wrap;     /* Opera 4-6 */
    white-space: -o-pre-wrap;   /* Opera 7 */
    word-wrap: break-word;      /* Internet Explorer 5.5+ */
}

.csharpcode pre { margin: 0em; }

.csharpcode .rem { color: #008000; }

.csharpcode .kwrd { color: #0000ff; }

.csharpcode .str { color: #006080; }

.csharpcode .op { color: #0000c0; }

.csharpcode .preproc { color: #cc6633; }

.csharpcode .asp { background-color: #ffff00; }

.csharpcode .html { color: #800000; }

.csharpcode .attr { color: #ff0000; }

.csharpcode .alt
{
    background-color: #f4f4f4;
    width: 100%;
    margin: 0em;
}

.csharpcode .lnum { color: #606060; }

Update: the CSS above now adds text wrapping for browsers that don't automatically wrap content inside the pre HTML element.

Calling All Threads Again (Async Revisited)

overview, calling all threads

So in the last post, we illustrated how to recreate Java code for parallel processing in Visual Basic .NET. One of the design features used in retrieving data from a task submitted to the thread pool work item queue was a wait object that ensures the thread has completed before we try to collect the results.


Well, that is the point of asynchronous processing, right: invoke a thread and then continue to do other work in the main thread until the work is complete. In our case, the additional processing we want to do is to start more threads, or at least submit them to the queue, and then we are satisfied to wait for the results. Here is a look at the .NET asynchronous programming methodology for accomplishing what I did in my implementation of ATask by hand-coding BeginTask and EndTask routines. This should look very similar, as asynchronous methods within .NET employ the same logic of beginning/invoking an action with one method and then ending/responding through another. The difference is that .NET will create all these methods for us, versus the manually coded ones in my previous post.

...
Delegate Function TaskDelegate(ByVal x As String) As String

Sub Main()
    Dim str As String = "test"
    Dim t As TaskDelegate = New TaskDelegate(AddressOf ATask)
    Dim ar As IAsyncResult = t.BeginInvoke(str, Nothing, Nothing)
    Console.WriteLine(t.EndInvoke(ar))
    ar = t.BeginInvoke(str + " second run!", New AsyncCallback(AddressOf ACallback), t)
    System.Threading.Thread.Sleep(5000)
End Sub

Function ATask(ByVal x As String) As String
    Return x.Trim()
End Function

Sub ACallback(ByVal ar As IAsyncResult)
    Dim t As TaskDelegate = DirectCast(ar.AsyncState, TaskDelegate)
    Console.WriteLine(t.EndInvoke(ar))
End Sub

...

The first piece of this code is the creation of a worker function/class like the ATask.Task method from my last post. Above, this is represented by my highly complicated method ATask(String). Second, a delegate is declared with the same signature as the function/subroutine created to do the work.

As stated earlier, .NET will take care of creating the BeginInvoke and EndInvoke methods appropriately. Subsequently, the first call to t.BeginInvoke starts the processing of our task with the specified parameter data contained in the variable str. The t.EndInvoke call returns a String, as we instructed through the method's signature, so in our simple example the result is fed right to the console.

For more of a "Fire and Forget" implementation, the remaining parameters, we initially ignored in t.BeginInvoke by passing "Nothing" or null, are used. The first of the two is used to specify a callback method that has signature like ACallback. This method handles calling the t.EndInvoke. The second omitted parameter is used to pass the invocation the delegate object itself. As you will see in the second call to t.BeginInvoke, we are able to start the processing of work and then the results will be collected when ready by the callback method.


Where's The Pool
If you have been following along and aren't an expert at this, you may be wondering what happened to the thread pool, since we have been promoting the use of pooling because our worker thread count can go from one to 10,000. The fixed thread pool is great for ensuring that our system/application stays stable, so why eliminate it? Well, from what I have gathered, asynchronous delegate calls are always run on the shared thread pool provided by the Common Language Runtime (CLR). Therefore, my understanding is that we do not need to submit these tasks to the user work item queue ourselves.

With everything swimming around in one shared thread pool, there is a potential that managed code or other long-running processes may be utilizing threads needed by this specific application, or vice versa, so one thought is to use custom thread pooling. Custom thread pool development is not covered in this post, but I have found many references to a fully implemented C# thread pool from Mike Woodring. This would allow processes using the shared thread pool that we can't control to have adequate resources while ensuring that our application does not create an exorbitant number of threads.

One interesting thought I had on this was to see whether, using standard threads, I could effectively code a mechanism in which I launch a finite number of threads (e.g., 15) and then monitor for the completion of any of them so that another iteration of the task can be spawned in its place until all the tasks are complete. This is the concept behind a thread pool; I am just wondering if it can be done without all the code associated with a full-blown thread pool. This intrigues me, as I can think of applications where it may be useful to spawn a set of processes within a thread pool and then, to make them run faster, spawn numerous non-pooled background threads. This eliminates some possibilities of deadlocks or issues with using the shared pool, as our threads would not need to stay in the pool long, since they would return rapidly. The pool would just be used for queueing the iterations of the main working process, while the background threads would work on finite tasks within a working process instance.
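
In Java terms, one lightweight way to approximate that idea without a full thread pool is a counting semaphore that caps how many plain threads are in flight at once. Here is a minimal, hypothetical sketch (the limit of 15 and the sleep standing in for real work are placeholders):

import java.util.concurrent.Semaphore;

public class BoundedThreadsSketch {

    public static void main(String[] args) throws InterruptedException {
        final Semaphore slots = new Semaphore(15);   // at most 15 tasks running at a time
        int totalTasks = 100;

        for (int i = 0; i < totalTasks; i++) {
            slots.acquire();                         // block until a "slot" frees up
            final int taskNum = i;
            new Thread(new Runnable() {
                public void run() {
                    try {
                        Thread.sleep(200);           // placeholder for one iteration of real work
                        System.out.println("Task #" + taskNum + " done");
                    } catch (InterruptedException e) {
                        // exception handling
                    } finally {
                        slots.release();             // hand the slot to the next task
                    }
                }
            }).start();
        }
    }
}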

Why not do this all with background threads if we can control their execution to be efficient?

We will see how my crazy mind develops this thought. It may be easier just to grab the custom thread pool from Mike Woodring's site and reference a compiled DLL that already has this work done. Why reinvent the wheel, right? That is probably why I found other sites referencing DevelopMentor.ThreadPool; they decided to just reuse his work too.

The adventure continues...

Wednesday, September 05, 2007

Calling All Threads

example source code, overview

There is always room for improvement, so I won't say we are at journey's end; however, this is at least in the right vicinity. The original intent of this series was to show concurrent programming in .NET as it was done in a real-life Java application. There are components of this transition that are fairly straightforward, as some objects conceptually map directly to those of the original implementation despite syntactical differences. My hope is that the information presented below will be useful in bringing the business and programming concepts together, as well as in pointing out places where the implementation is not as obvious.

First, let's begin with the creation of the "thread" class. Microsoft .NET does not allow inheritance from the Thread class, nor does it have Runnable or Callable interfaces to implement. Instead, using an additional class or a member function within the main object, threads are created through the use of a delegate function. Threads are instantiated by calling the constructor of System.Threading.Thread and passing it a reference to the delegate function.


Dim Thread1 as New System.Threading.Thread(AddressOf delegateFunction)
Thread1.Start()

However, as we discussed in previous posts, we want to use pooled threads rather than multiple threads created in succession. With that said, spawning a thread into a pool in .NET requires submitting a WaitCallback delegate, which has a signature of delegateFunction(Arg As Object), to the shared System.Threading.ThreadPool object, which can be likened (hopefully without ridicule) to the fixed-thread-pool ExecutorService we utilized in Java. The object argument passed to the delegate function holds both the input parameters and the variable(s) for output after thread processing has completed. See the AStateObject class in the linked source files for an example.

The general setup of a delegate function and use of thread pooling is shown below:

...
Sub BeginTask(ByVal thrdNum As Integer, ByVal isVarCalcTime As Boolean)
    state = New AStateObject(thrdNum, isVarCalcTime)
    'ThreadPool is a shared/static system object
    'that can be likened to the Java ExecutorService object
    ThreadPool.QueueUserWorkItem(New WaitCallback(AddressOf Task), state)
End Sub

Function EndTask() As Object
    semaphore.WaitOne()
    Return state.Status
End Function

Sub Task(ByVal StateObj As Object)
    Dim varCalcTime As Long = AStateObject.DEFAULT_VAR_CALC_TIME
    Dim thrdnum As Integer = AStateObject.DEFAULT_THRD_NUM
    Dim isVarCalcTime As Boolean = AStateObject.DEFAULT_IS_VAR_CALC_TIME
    ' Use the state object fields as arguments.
    Dim StObj As AStateObject = CType(StateObj, AStateObject)
    isVarCalcTime = StObj.IsVarCalcTime
    thrdnum = StObj.ThrdNum
    If isVarCalcTime Then varCalcTime = AStateObject.getRandomCalcTime()

    'Can use SyncLock or other synchronization techniques for a real-life
    'calculation that may hit back-end systems or touch data shared by
    'multiple tasks simultaneously
    Thread.Sleep(CInt(2000L + varCalcTime))
    'SyncLock StObj
    StObj.Status = Date.Now() 'Return value
    'End SyncLock
    semaphore.Set() ' Signal that the thread is done.
End Sub

...

The Task subroutine is the delegate that does the processing we want. This can be coded to perform business logic directly or to call a web service or other API that does the processing/calculations for us. The object passed to the delegate is effectively passed by reference despite the method signature, since what is passed by value is the address of the object itself; therefore, interactions with this object, like setting Status (the return data object), set values in the original instance of AStateObject in our case. The thread pool is then handed a user work item consisting of the WaitCallback delegate and the associated AStateObject.

As with the Java implementation, it is most efficient to submit all the threads to the pool for processing and then circle back and retrieve the results once each thread is ready to yield them. If we waited for each thread to finish and collected its result at the same time we added it to the user work item queue, we might as well submit the threads sequentially, or let the main process iterate the task without the overhead of creating more threads. Assuming we want the efficiency, what you will see in the class ATask is that the invocation/queuing of the task is done in BeginTask and then the response is queried by calling EndTask.

The calls from the main process occur in this phased fashion as well, very similar to how it was implemented in Java, where we also used three sections: create threads, read results/wait, and display results.

...
Dim NUM_THRDS As Integer = 100, I As Integer
Dim tasks(NUM_THRDS) As ATask
Dim status(NUM_THRDS) As Object

'Start the threads
For I = 0 To (NUM_THRDS - 1)
    tasks(I) = New ATask()
    tasks(I).BeginTask(I, True)
Next

'Get the result of each thread
For I = 0 To (NUM_THRDS - 1)
    Try
        'Not using any processing that will throw an error,
        'but just wanted to show this as an example
        status(I) = tasks(I).EndTask()
    Catch e As Exception
        status(I) = Nothing
    End Try
Next

For I = (NUM_THRDS - 1) To 0 Step -1
    If status(I) Is Nothing Then status(I) = "Empty"
    Console.WriteLine("Thread #" + I.ToString + ": " + status(I).ToString())
Next

'Clean up the objects used
status = Nothing
tasks = Nothing

...

The significance of this approach is that once the process has been put into a thread pool, the results cannot be extracted until the process has fully completed. Therefore, the EndTask function waits on the thread, making the middle block of code very much sequential; but since the threads run concurrently beforehand, it adds minimal if any processing overhead, as you should never wait longer than the longest-running thread in the bunch. Since "run time" in the context of the longest-running process includes the amount of time a thread must sit in the queue, now may be a good time to show the syntax for increasing the thread pool size.

ThreadPool.SetMaxThreads(workerThreads, completionPortThreads)

Both parameters are integers representing the quantity of each type of thread available in the pool. Since .NET employs a shared thread pool, this follows the principle of finding the right number of threads for all concurrent users of the application. While the application is mostly idle, users will get faster responses; responses will then slow down as load grows, preserving the stability of the system, since too many threads can be as lethal as not enough.

Putting it all together, we have an application that emulates our Java example and yields roughly equal results (see example output). Hopefully this was successful in meeting my intent, which was to collect links and thoughts around multi-threading in .NET (for Java developers), especially around concurrency. In trying to stick closely to the layout/concepts of the Java implementation, I am sure I have put together some odd .NET constructs, so all you VB/.NET purists out there will have to forgive me.

10,000 threads are still better than one if used tactically, especially as we move to a world of even faster response times with asynchronous interfaces. A follow-up to this exercise may be to show how to improve the user interface portion of the application to display some of the results immediately to the browser/screen, such as how many items are in stock today, while the program processes the results for items that need to be built, for example, in our item availability service. We'll see if I can wrap my brain around using the IAsyncResult process in line with the other concepts here, but, until then...

keep evolving development

Saturday, September 01, 2007

Migrating to J# From Java

overview

Microsoft's J# is like a cappuccino: it is pretty much Java with some extra frosting on top.


Microsoft's J#, syntactically, is very much the same as Java; therefore, when this adventure to migrate the availability code to .NET presented itself, my first thought was "convert to J#." Unfortunately, with the features we explored previously being functionality of JDK 5.0, J# could not actually accomplish the task.

According to Microsoft, J#'s run-time libraries will remain frozen at the JDK 1.1.4 level; there are no plans to emulate later versions of the JDK. If J# catches on, it's likely that users will fill in the gaps over time. Microsoft considers JDK compatibility largely irrelevant. The current level supports Visual J++ projects, and whatever is missing from the JDK compatibility libraries is covered by the .Net framework.

J#'s interface to the .Net framework is solid, but not as seamless as C#. In particular, J# code cannot define new .Net properties, events, value types, or delegates. J# can make use of these language constructs if they are defined in an assembly written in another language, but its inability to define new ones limits J#'s reach and interoperability compared to other .Net languages.

-Tom Yager, Infoworld

Presumably due to legal issues restricting Microsoft from copying Java bytecode for bytecode, coupled with a strategic desire to push developers towards using core .NET functionality to accomplish the tasks handled in later Java versions, J# support for newer JDKs will probably never be updated to cover the holes in its current capabilities.

With that said, for my case study I have come to the realization that the code must be rewritten in .NET, and it will be in Visual Basic. C# is probably more powerful than Visual Basic .NET; however, the target business system being written in Visual Basic makes VB the more desirable choice for this project.

Don't get me wrong, though: there is a good deal of power in being able to compile Java directly into .NET using J#, but just be aware of the limitations. It took me a while to stop seeing 1.1.4 and thinking 1.4; for the application I was converting, there was too much of a gap.

For those of you with Java applications that would be hard to rewrite, J# may fit the need, especially when using tools such as IKVM or JBImp that convert Java bytecode to Microsoft Intermediate Language (MSIL). However, this may also be a good argument for using C#, whose syntax is about as close to Java as it is to C/C++. After converting Java classes or archive (JAR) files to .NET libraries, migrating the application to C# becomes easier.


Thursday, August 30, 2007

Jump In The Pool, The Threads Are Fine

example source code, overview, java part I

10,000 soldiers may be more powerful than one, but if they march in a straight line towards enemy fire, they are no more effective.

Threads are the same. Marching them in regiments versus a line adds power and efficiency at the same time.


Thread Pooling
In Java 5.0 (JRE 1.5), the java.util.concurrent package was made available, providing an API for concurrent processing and efficient thread pooling. For good details on the API, please refer to the Java/Sun publications on JSR 166. The Distributed Computing Laboratory at Emory University has further documentation as well as a back-ported version of the package for use with older versions of the JVM.

To apply the concurrent/thread pool model to the ArrayOfThreads object we created in the previous post, the ExecutorService is utilized to invoke a Callable object (introduced in the concurrent package). AThread happens to be a Callable rather than a simple subclass of Thread, which just has a run() method without a return value. I used Callable even in the earlier examples because the end goal is to be able to gather data back from the execution of the process.
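
For anyone new to the package, here is the contrast in a nutshell (a trivial illustration, not taken from the project code; imagine it inside any method, with java.util.concurrent.Callable imported):

// Runnable: work only, no return value
Runnable r = new Runnable() {
    public void run() { /* do work */ }
};

// Callable: work that returns a result (and may throw an exception), which is
// what lets the pool hand data back to us later through a Future
Callable<String> c = new Callable<String>() {
    public String call() throws Exception { return "result"; }
};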

Consequently, after the threads are submitted for processing, the concurrency framework provides a mechanism to retrieve the returned data per thread using the Future object. The major difference this makes in the implementation is that the Future handles the waiting for and gathering of thread data, so the main application thread spawning the new processes does not have to wait for each thread to complete before moving on.


...

/* code snippet from ArrayOfThreadsPooled.java
* @see java.util.concurrent.Executors
* @see java.util.concurrent.ExecutorService
* @see java.util.concurrent.Future
*/
int NUM_THRDS = 100, NUM_THRDS_POOLED = 100;
ExecutorService tpes = Executors.newFixedThreadPool(NUM_THRDS_POOLED);
Future futures[] = new Future[NUM_THRDS];
AThread calculators[] = new AThread[NUM_THRDS];
Object status[] = new Object[NUM_THRDS];

...

Later on, using a for loop to instantiate the threads, we submit each to the pool (where i represents the integer index used in the loop).

calculators[i] = new AThread(i);
futures[i] = tpes.submit(calculators[i]);

In the first implementation of ArrayOfThreadsPooled, we see that the impracticality of running multiple threads serially has been removed, as the processing takes about a hundredth of the time since all 100 threads execute simultaneously (see output example 1).


Don't Be A Resource Hog
However, scaling is still an issue, as the business systems involved in the processing, and the application server itself, will probably not appreciate the execution of 10,000 threads. To remedy this per user, ArrayOfThreadsPooledv2 illustrates the fact that the fixed thread pool does not need to be the same size as the required number of Callables, Futures, et al. Therefore, the NUM_THRDS_POOLED value can be set lower or higher than 100.

For example, I found that my application worked well with a total of 15 processors running at a time (see output example 2). To better illustrate the benefit, the AThread class provided includes the ability to simulate variable processing times. With variance in execution times, the thread pool will stop and start threads appropriately, thus adding even more efficiency over executing x threads at a time with identical run times (see output example 3).


Back 2 The Future
Through the Future get() method, wrapped with exception handling, we can retrieve the returned object/data from the Callable later on.
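
A minimal sketch of that retrieval loop, reusing the futures and status arrays declared in the snippet above:

// collect results; get() blocks until the corresponding Callable has finished
for (int i = 0; i < NUM_THRDS; i++) {
    try {
        status[i] = futures[i].get();
    } catch (Exception e) {
        status[i] = null;   // exception handling/logging as appropriate
    }
}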

Since the Futures take care of getting the data, as a developer you get the benefit of all the threads starting at once while still having a way to identify which response goes with which request. With this information, you can then sort, analyze, manipulate, or otherwise report the data however you see fit.

A further enhancement would be to find the optimum number of threads needed to support all requests from your client base and instantiate the ExecutorService as one static instance defined globally for the application.

public static int NUM_THRDS_CONCURRENT = 60;
public static ExecutorService tpes =
Executors.newFixedThreadPool(NUM_THRDS_CONCURRENT);

Each instance of ArrayOfThreadsPooledv2 that was executed would, therefore, submit however many Callable objects it needed into the shared pool.

futures[i] = ArrayOfThreadsPooledv2.tpes.submit(calculators[i]);

If no other users were on, the response could be as quick as the first iteration of ArrayOfThreadsPooled while still providing the stability and respect for resources of the second implementation.

With the source code and detail to this point, I hope those of you on the journey towards creating a multi-threaded application in Java have found enough useful information to utilize the full power of Java's great addition. Keep in mind that further enhancements exist in JRE 1.6.

Keep evolving development!

10,000 Threads Are Better Than One

example source code, overview
In this post we will follow the business logic for using multi-threading and concurrency as we explore the expansion of the availability service from its basic form to the extended, more practical application presented in the previous posting.

[basic availability service] an application that communicated with business systems (enterprise resource planning - inventory management in particular) to read current stock information, including current demand allocations; compare to new single demand; ultimately, displaying lead time to fulfill customer requirement whether from stock, assembling from stocked components on a bill of material, or purchasing/fabricating item through the portions if not all of its cumulative lead time...

Since the business world does not exist in a parallel universe where customer demand and actual supply are always in constant harmony, the basic implementation for availability above is insufficient as a final answer to a demand requirement that has differing priorities (i.e., fulfilling two or more needs at once). To clarify, please re-read the following excerpt:

[extended availability service] As a customer, you may have an immediate need for five light bulbs or else you will be in darkness for the entire day, but you get cost efficiencies purchasing light bulbs in quantities of 1,000. The purpose of this extension to the availability service is to allow you as the customer understand that you can get 50 from stock today...

The first need of concern is to restore service: get the lights on! The second priority is to ensure that the effort to restore service is done in a fashion that positively impacts the bottom line long term as well as lessens the likelihood of a future outage. In its most basic sense, the response to the above variation in need could involve two calls to the availability code: one returning the lead time of the quantity acquirable from stock (e.g., same day); the other, the time through procurement/manufacturing processes (e.g., 60 days).

However, if you add usage and more complex items and scenarios to your thought process beyond my simple light bulb shortage, the application requirement slowly becomes (or at least it did in my case) one that asks the question, "what is the individual lead time of each quantity of the total 1,000 light bulbs requested?"


An Array of Threads
Following the thought above, to answer the question a user of the availability tool would have to run sequential requests with required quantities from 1 to 1,000. The first step is to automate the process of making the requests rather than forcing the user to make 1,000 separate ones; the user would like to query the system once and get 1,000 responses. With the potential for thousands or tens of thousands of requests, the first logical conclusion I reached was "10,000 threads are better than one."

The example thread AThread does a simple wait (x milliseconds) to simulate some processing time, but it can easily be modified to contain a synchronized (depending on locking requirements) block of code executing the logic necessary to determine availability for a specific item and required quantity. For example:


...

/* call() method for AThread.java
 * @see java.util.concurrent.Callable#call()
 */
public Object call() throws Exception {
    // Define your own result object and availability processor
    LeadTimeResultObject ltro = new LeadTimeResultObject();
    AvailabilityProcessor process = new AvailabilityProcessor();
    // Use synchronized blocks for thread safety
    synchronized (this) {
        // Insert application logic to get lead time
        ltro = process.getLeadTimeResult("some item id", "some quantity");
    }
    return ltro;
}

...

The ArrayOfThreads class illustrates how to implement calling the lead time availability code through use of an AThread array.

AThread calculators[] = new AThread[100];

Using a for loop to iterate over indexes 0 to 99 of the array, you can use the constructor of the AThread object to pass in the data needed to make each thread's calculation unique (e.g., an item id and a quantity of 1 to 100, i.e., array index + 1). Although this accomplishes the first task of automating the requests, each thread must complete serially, making the net application performance exactly that of having the main thread execute one availability process 100 times (see output example 1: processing times go in sequence, so total processing time is long).
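
Here is roughly what that serial pattern amounts to (a sketch only; I am assuming a status array like the pooled version and calling call() directly to emphasize that nothing runs in parallel):

// Naive version (sketch): each calculation runs to completion before the next
// one starts, so total time is the sum of all the individual run times.
AThread calculators[] = new AThread[100];
Object status[] = new Object[100];

for (int i = 0; i < calculators.length; i++) {
    calculators[i] = new AThread(i);       // constructor argument as in the snippets above
    try {
        status[i] = calculators[i].call(); // blocks on the main thread, serially
    } catch (Exception e) {
        status[i] = null;                  // exception handling
    }
}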

In addition, this will most likely crash your application and/or back-end systems as user quantity requirements scale upward.

I know this is not very practical for the use case we are exploring, but another component of the application I wrote was used to keep statistics on availability requests. These statistics were needed for later analysis and did not need to be sent back to the user. Applying the methodology of thread arrays to background processing that needs to be done at some point is a perfect fit: the user leaves the site, or at least is satisfied by a response, while additional business logic is applied as more data is gathered and stored in a business intelligence/reporting back end.

Thread pooling, or utilizing all the threads simultaneously, addresses both the automation of the user process and the most efficient run time. See the next post in the concurrent threading series for details on the ArrayOfThreadsPooled object included in the source code available for download above.

The Adventure Begins (Concurrent Threading Series)

Multi-threading background processes is very cool for spawning large calculations that write to statistical/business intelligence data stores while quickly responding with other data to the user from the main thread; however, when you need to run some medium-to-long-running calculations multiple times and return a sorted/combined result set to your user without feeding the attention deficit disorder inside all of us, multi-threaded background processes become even cooler.


Background
A practical or real life application for this would be an application that communicated with business systems (enterprise resource planning - inventory management in particular) to read current stock information, including current demand allocations; compare to new single demand; ultimately, displaying lead time to fulfill customer requirement whether from stock, assembling from stocked components on a bill of material, or purchasing/fabricating item through the portions if not all of its cumulative lead time.

Furthermore, this application was delivered to the client via the web, where a few seconds of processing time is a long wait to many, so response time must be considered in any alterations.

Now consider a secondary request interface for this application that needs to determine separate responses for the same demand above. This was used by Customer Service/Sales Representatives who need to know the breakdown of lead time(s) from satisfying quantity of one to the total requested. As a customer, you may have an immediate need for five light bulbs or else you will be in darkness for the entire day, but you get cost efficiencies purchasing light bulbs in quantities of 1000. The purpose of this extension to the availability service is to allow you as the customer understand that you can get 50 from stock today, 250 built within a week, and 700 available two months from now. This is much more useful to satisfying both business needs than getting one response that indicates to get a complete order 1000 would be two months.


The Problem
Several months ago, during a migration from one business enabling system to another, I was faced with the need to re-architect this concurrent programming, originally accomplished in Java, as a .NET application. My initial thought was to leverage and reuse as much of the Java code I had already written as possible, to shorten the development cycle for this change and to ensure that the logic was implemented as close to the original as possible, then improve from there once integration was established. What did this mean? Microsoft J# to the rescue! Or at least so I hoped.

Microsoft's J# does not support Java 5.0, as it is based on an earlier JRE version. This roadblocked the effort immensely, as it was not clear to me whether Microsoft .NET even had the capability to perform the same concurrency programming, or how, and so began the adventure to acquire knowledge.


The Adventure
The following set of posts tries to go through the thought process behind the multi-threading done in the original Java implementation and hopefully arrives at the most efficient means of transforming it into a .NET implementation. Whether or not my journey of converting from Java to .NET is successful, I figured this information might be helpful for new (or at least new-to-threading) developers looking for methodologies to accomplish what I have done in either language.

So buckle up and let’s explore together the migration of concurrent processing code from Java technology to J# and .NET in general.

Monday, June 18, 2007

Deleting Multiple Views/Tables by Pattern in Microsoft SQL Server

If you have ever used the Microsoft SQL Server tools to work with tables and/or views, you will know that you cannot select multiple items for deletion (or at least not that I have found) through the UI. A good feature for ensuring an unsuspecting user doesn't accidentally delete all the data from your production server, right?

Well, suppose you needed to delete 4000+ views from an application database that didn't belong there (long story).

The views all start with the same text, so they sit right next to each other in the SQL UI. Now the great feature is the bane of your existence as you hit delete, then confirm, delete, then confirm...

Here is one solution you can use with some simple VB.NET code.


...

Dim PATTERNS() As String = {"prefix1%", "prefix2%", "prefix3%"}
Dim strDatasource As String = _
    "Data Source=dbsvr;Initial Catalog=db;User ID=usr;Password=pwd"
Dim strSqlStatement As String = _
    "SELECT name FROM sys.sysobjects WHERE (name LIKE '" & PATTERNS(0) & _
    "' OR name LIKE '" & PATTERNS(1) & "' OR name LIKE '" & PATTERNS(2) & _
    "') AND (xtype = 'V')"
'To make this flexible, you can loop over the PATTERNS array
'instead of the hard-coded indexes used above
Dim sqlConnection As New SqlClient.SqlConnection(strDatasource)
Dim sqlCommand As New SqlClient.SqlCommand(strSqlStatement, sqlConnection)

sqlConnection.Open()
Dim rs As SqlClient.SqlDataReader = sqlCommand.ExecuteReader()
Dim namelist As ArrayList = New ArrayList()

While rs.Read()
    namelist.Add(rs("name").ToString)
End While

rs.Close()

Dim i As Integer = 0
Dim count As Integer = 0
Dim names() As String = _
    CType(namelist.ToArray(GetType(String)), String())

For i = 0 To (names.Length - 1)
    If names(i).ToString <> "" Then
        sqlCommand.CommandText = _
            "DROP VIEW [dbo].[" & names(i) & "]"
        count += sqlCommand.ExecuteNonQuery()
    End If
Next i

sqlConnection.Close()
sqlCommand = Nothing
rs = Nothing
sqlConnection = Nothing
System.Console.WriteLine(count & " affected!")
System.Threading.Thread.Sleep(5000)
Return

...

This was done quick and dirty as a command-line application, so I used the console to write out the number of objects actually affected by the script. The Thread.Sleep(5000) was used to keep the console window on screen for 5 seconds before it went away so I could see the result.

Again, this was quick and dirty, so for a production utility you may want to add other features like exception handling, input validation, and backout capabilities. Input validation would be key if the program is modified to take a user-supplied pattern versus the hard-coded patterns above; the worry is that someone will send '%' and wipe the entire set of views/tables by accident.

Anyway, the script uses the .NET SQL connection objects to execute a bit of T-SQL that searches the sysobjects table for all the objects that have a name like 'pattern' and xtype = 'V', restricting the search to just views.

Since the total number of matches is not known in advance, I opted for an ArrayList (think Vector) approach so I can add zero to many results. Later, the ArrayList is converted to an array of strings, which is in turn used in a loop to create dynamic DROP statements that are executed against the same SQL Server that was queried for the object list in the first place.

If this helps even one poor DBA who is stuck hitting the delete button 100 times, I will be happy.